m365-copilot-bun-proxy

TypeScript/Bun port of the original .NET M365 Copilot proxy + CLI.

Stack

  • Bun runtime
  • Hono for HTTP routing / reverse-proxy behavior
  • Zod for configuration validation
  • OpenTUI for interactive CLI chat UI

Install

bun install

Harness config examples

Example harness configuration files live in harness-config-examples/.

  • harness-config-examples/opencode.jsonc provides a ready-to-use OpenCode config wired to this proxy on http://localhost:4000/v1/.
  • Copy and adapt these examples for your local harness setup.

Run proxy

bun run start:proxy

To enable debug markdown logs (requires debugPath in config):

bun run start:proxy -- --debug

You can also pass an explicit value:

bun run start:proxy -- --debug=false

Logging level is configured with logLevel in config.json (default info):

  • Each proxy startup with --debug writes logs into a timestamp-named session subfolder under debugPath (for example logs/2026-02-25T16-58-11-123Z/).
  • Request logs (incoming-request, request, substrate-request) are always written when debug logging is enabled.
  • Response log filtering by level:
      • trace: includes substrate-delta
      • debug: includes substrate-response, response, response-headers, and outgoing-response
      • info: includes outgoing-response
      • warning: includes outgoing-response only for HTTP 4xx statuses
      • error: includes outgoing-response only for HTTP 5xx statuses
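
Putting the pieces together, a config.json fragment for verbose session logging might look like this (the debugPath value is illustrative):

```json
{
  "logLevel": "debug",
  "debugPath": "logs"
}
```

With this in place, bun run start:proxy -- --debug writes markdown logs into a timestamped subfolder under logs/.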

To capture outbound SSE payloads (for text/event-stream responses), set:

{
  "logStreamingResponseBody": true
}

When this is enabled and debug logging is active (--debug) with logLevel set to debug or trace, the proxy writes outgoing-stream-body markdown logs containing the streamed SSE body.

Default listen URL is http://localhost:4000.

Configuration is loaded from config.json (and config.{env}.json when NODE_ENV is set).

Substrate settings are grouped under the substrate object in config (for example substrate.hubPath).

ignoreIncomingAuthorizationHeader controls whether inbound Authorization headers are used by the proxy. Default is true, which makes the proxy ignore incoming auth and use cached/auto-fetched tokens instead.

playwrightBrowser controls which Playwright browser is used when the proxy auto-acquires a token. Supported values: edge (default), chrome, chromium, firefox, webkit (msedge is also accepted as an alias for edge).

temporaryChat (default true) enables temporary-chat mode for Substrate by appending disableMemory=1 to the websocket hub URL query string. This prevents Copilot from showing the conversation in its sidebar history.
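
The query-string tweak amounts to something like this sketch (the hub URL below is illustrative; this is not the proxy's actual code):

```typescript
// Sketch: append disableMemory=1 to a websocket hub URL for temporary-chat mode.
function withTemporaryChat(hubUrl: string): string {
  const url = new URL(hubUrl);
  url.searchParams.set("disableMemory", "1");
  return url.toString();
}

console.log(withTemporaryChat("wss://substrate.office.com/m365Copilot/Chathub"));
// → wss://substrate.office.com/m365Copilot/Chathub?disableMemory=1
```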

openAiTransformMode controls how requests are translated for M365 Copilot:

  • simulated (default): sends the full incoming OpenAI JSON payload as a markdown JSON block and asks Copilot to respond in the same endpoint format; proxy extracts JSON from the response block and returns it.
  • mapped: uses the legacy request/response mapping logic.
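
In simulated mode, pulling the JSON payload back out of Copilot's markdown reply could look roughly like this (a simplified sketch, not the proxy's actual extractor):

```typescript
// Sketch: extract and parse the first fenced JSON block from a markdown reply.
// The fence delimiter is built at runtime so this sample nests cleanly in docs.
const fence = "`".repeat(3);

function extractSimulatedPayload(markdown: string): unknown | null {
  const match = markdown.match(new RegExp(fence + "json\\s*([\\s\\S]*?)" + fence));
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // block present but not (yet) fully parseable
  }
}
```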

substrate.earlyCompleteOnSimulatedPayload (default false) controls early websocket completion in simulated mode. It is evaluated in src/proxy/clients.ts, and only triggers once a fully parseable simulated payload is detected. By design, tool-call payloads are excluded from early completion.

substrate.incrementalSimulatedContentStreaming (default false) enables a guarded incremental extractor in the simulated SSE bridge (src/proxy/server.ts) that can emit partial choices[0].message.content before full JSON parse completes.

Simulated Streaming Flag Interaction

These are independent flags at different layers with partial overlap:

  • earlyCompleteOnSimulatedPayload is in the Substrate client loop (src/proxy/clients.ts). It stops reading websocket frames once a fully parseable simulated payload is detected (hasCompleteSimulatedPayload).
  • incrementalSimulatedContentStreaming is in the proxy SSE bridge (src/proxy/server.ts). It can emit partial message.content before full payload parse by using the incremental extractor (src/proxy/openai.ts).

How they combine:

  • earlyComplete=false, incremental=false: parse-then-emit behavior.
  • earlyComplete=true, incremental=false: still parse-then-emit, but upstream websocket may end earlier for plain text simulated payloads.
  • earlyComplete=false, incremental=true: partial content can stream early; websocket still runs normally.
  • earlyComplete=true, incremental=true: fastest plain-text path; incremental emits early text, then websocket can stop early once payload is complete.

Important caveats:

  • Incremental mode is auto-disabled for strict tool-validation flows and structured response format (src/proxy/server.ts).
  • Incremental mode suppresses itself if tool_calls is detected mid-stream (src/proxy/server.ts).
  • earlyCompleteOnSimulatedPayload does not early-complete tool-call payloads by design (src/proxy/clients.ts).
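
For the fastest plain-text path described above, both flags can be enabled together in config.json:

```json
{
  "substrate": {
    "earlyCompleteOnSimulatedPayload": true,
    "incrementalSimulatedContentStreaming": true
  }
}
```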

Use CONFIG__openAiTransformMode=mapped if you need to revert to the legacy behavior.

You can override config values via env vars with the CONFIG__ prefix, for example:

CONFIG__listenUrl=http://localhost:4010 bun run start:proxy

Example: force automatic token acquisition to use Chrome instead of Edge:

CONFIG__playwrightBrowser=chrome bun run start:proxy

To override nested values, use double underscores for each path segment, for example:

CONFIG__substrate__hubPath=wss://substrate.office.com/m365Copilot/Chathub bun run start:proxy
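
The CONFIG__ convention maps onto the config object roughly like this sketch (an illustration of the naming scheme, not the proxy's actual loader):

```typescript
// Sketch: turn CONFIG__-prefixed env vars into a nested config override object.
// Double underscores separate path segments:
//   CONFIG__substrate__hubPath=wss://x  ->  { substrate: { hubPath: "wss://x" } }
function envToConfig(env: Record<string, string>): Record<string, unknown> {
  const config: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(env)) {
    if (!key.startsWith("CONFIG__")) continue;
    const path = key.slice("CONFIG__".length).split("__");
    let node: Record<string, unknown> = config;
    for (const segment of path.slice(0, -1)) {
      node = (node[segment] ??= {}) as Record<string, unknown>;
    }
    node[path[path.length - 1]] = value;
  }
  return config;
}
```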

API endpoints

  • POST /v1/chat/completions
  • POST /openai/v1/chat/completions
  • GET /v1/models
  • GET /openai/v1/models
  • POST /v1/responses
  • POST /openai/v1/responses
  • GET /v1/responses
  • GET /openai/v1/responses
  • GET /v1/responses/{response_id}
  • GET /openai/v1/responses/{response_id}
  • DELETE /v1/responses/{response_id}
  • DELETE /openai/v1/responses/{response_id}

Available models

The proxy accepts any OpenAI-compatible model string, but for Substrate transport it maps known model IDs to a tone value in the outgoing websocket invocation payload.

tone corresponds to the model-type option in the Copilot UI, one of:

  • "Auto" => magic
  • "Quick Response" => Chat
  • "Think Deeper" => Reasoning
  • "GPT5.2 Quick" => Gpt_5_2_Chat
  • "GPT5.2 Think deeper" => Gpt_5_2_Reasoning
  • "GPT5.4 Quick" => Gpt_5_4_Chat
  • "GPT5.4 Think deeper" => Gpt_5_4_Reasoning

Model to Substrate tone mapping:

  • m365-copilot -> magic
  • m365-copilot-auto -> magic
  • m365-copilot-magic -> magic
  • m365-copilot-quick -> Chat
  • m365-copilot-reasoning -> Reasoning
  • m365-copilot-gpt5.2-quick -> Gpt_5_2_Chat
  • m365-copilot-gpt5.2-reasoning -> Gpt_5_2_Reasoning
  • m365-copilot-gpt5.4-quick -> Gpt_5_4_Chat
  • m365-copilot-gpt5.4-reasoning -> Gpt_5_4_Reasoning
  • Any unknown model value -> magic

Notes:

  • If model is omitted, the proxy uses defaultModel from config (defaults to m365-copilot).
  • GET /v1/models (and GET /openai/v1/models) returns the full supported model list above.
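
The mapping above amounts to a simple lookup with a magic fallback (a sketch derived from the table above, not the proxy's source):

```typescript
// Sketch: map an OpenAI-style model id to a Substrate tone, defaulting to "magic".
const MODEL_TONES: Record<string, string> = {
  "m365-copilot": "magic",
  "m365-copilot-auto": "magic",
  "m365-copilot-magic": "magic",
  "m365-copilot-quick": "Chat",
  "m365-copilot-reasoning": "Reasoning",
  "m365-copilot-gpt5.2-quick": "Gpt_5_2_Chat",
  "m365-copilot-gpt5.2-reasoning": "Gpt_5_2_Reasoning",
  "m365-copilot-gpt5.4-quick": "Gpt_5_4_Chat",
  "m365-copilot-gpt5.4-reasoning": "Gpt_5_4_Reasoning",
};

function modelToTone(model: string): string {
  return MODEL_TONES[model] ?? "magic"; // unknown models fall back to magic
}
```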

Chat Completions Tool Calling

The proxy supports OpenAI-style tools and tool_choice for POST /v1/chat/completions.

Example request:

curl -s http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "messages": [
      { "role": "user", "content": "What is the weather in London?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Lookup weather by city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Example tool-call response shape:

{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "created": 1739986369,
  "model": "m365-copilot",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_...",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"London\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

Strictness behavior:

  • If tool_choice is required or a specific function, the proxy returns 400 invalid_tool_output when no valid tool-call JSON can be extracted from assistant output.
  • If tool_choice is auto (or tools are not strictly required), the proxy falls back to a normal assistant text completion when tool-call JSON is not found.

Input normalization notes:

  • JSON-stringified message.content, tool payloads, and function arguments are parsed best-effort and re-serialized to canonical minified JSON when valid.
  • Assistant message content containing serialized tool_calls structures is preserved as tool-call context for downstream Copilot prompt construction.
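
The best-effort re-serialization could look like this sketch (illustrative, not the proxy's implementation):

```typescript
// Sketch: if a string holds valid JSON, re-serialize it to canonical minified
// form; otherwise return it unchanged.
function normalizeJsonString(value: string): string {
  try {
    return JSON.stringify(JSON.parse(value));
  } catch {
    return value; // not JSON: leave as-is
  }
}
```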

Responses API usage

Create response:

curl -s http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "input": "Write a TypeScript function that validates UUIDs."
  }'

Continue a conversation using previous_response_id:

curl -s http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "previous_response_id": "resp_abc123",
    "input": "Now add tests."
  }'

Streaming (stream: true) emits SSE events:

  • response.created
  • response.in_progress
  • response.output_item.added
  • response.output_text.delta
  • response.output_text.done
  • response.output_item.done
  • response.completed
  • error (SSE error event on stream failure)

By default, the proxy ignores inbound Authorization and attempts to use a cached token or auto-acquire one via Playwright for chat/responses requests.

The browser used for that auto-acquisition is controlled by playwrightBrowser in config (or CONFIG__playwrightBrowser in env).

To allow pass-through Authorization headers from clients, set:

CONFIG__ignoreIncomingAuthorizationHeader=false bun run start:proxy

Build executable

bun run build

This produces a single-file executable in dist/ and copies config.json alongside it.

Run CLI

bun run cli -- help
bun run cli -- status
bun run cli -- chat
bun run cli -- chat --api responses
bun run cli -- token set --token "<jwt>"

By default, CLI chat requests do not send an Authorization header. The proxy handles token acquisition when needed. Use --token or YARPILOT_TOKEN only when you want to force a specific token from the CLI.

In chat mode, the CLI supports these slash commands:

  • /status (token + connection status)
  • /api (show current API mode)
  • /api completions or /api responses (toggle endpoint)
  • /token (paste a new token)
  • /cleartoken (clear cached token)
  • /exit (quit)
