m365-copilot-bun-proxy

TypeScript/Bun port of the original .NET M365 Copilot proxy + CLI.

Stack

  • Bun runtime
  • Hono for HTTP routing / reverse-proxy behavior
  • Zod for configuration validation
  • OpenTUI for interactive CLI chat UI

Install

bun install

Harness config examples

Example harness configuration files live in harness-config-examples/.

  • harness-config-examples/opencode.jsonc provides a ready-to-use OpenCode config wired to this proxy on http://localhost:4000/v1/.
  • Copy and adapt these examples for your local harness setup.

Run proxy

bun run start:proxy

To enable debug markdown logs (requires debugPath in config):

bun run start:proxy -- --debug

You can also pass an explicit value:

bun run start:proxy -- --debug=false

Logging level is configured with logLevel in config.json (default info):

  • Each proxy startup with --debug writes logs into a timestamp-named session subfolder under debugPath (for example logs/2026-02-25T16-58-11-123Z/).
  • Request logs (incoming-request, request, substrate-request) are always written when debug logging is enabled.
  • Response log filtering by level:
      • trace: includes substrate-delta
      • debug: includes substrate-response, response, response-headers, and outgoing-response
      • info: includes outgoing-response
      • warning: includes outgoing-response only for HTTP 4xx statuses
      • error: includes outgoing-response only for HTTP 5xx statuses
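
Putting the pieces together, a config.json fragment for verbose session logging might look like this (the debugPath value is illustrative):

```json
{
  "logLevel": "debug",
  "debugPath": "logs"
}
```

With this in place, bun run start:proxy -- --debug writes markdown logs into a timestamped subfolder under logs/.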

To capture outbound SSE payloads (for text/event-stream responses), set:

{
  "logStreamingResponseBody": true
}

When this is enabled and debug logging is active (--debug) with logLevel set to debug or trace, the proxy writes outgoing-stream-body markdown logs containing the streamed SSE body.

Default listen URL is http://localhost:4000.

Configuration is loaded from config.json (and config.{env}.json when NODE_ENV is set).

Substrate settings are grouped under the substrate object in config (for example substrate.hubPath).

ignoreIncomingAuthorizationHeader controls whether inbound Authorization headers are used by the proxy. Default is true, which makes the proxy ignore incoming auth and use cached/auto-fetched tokens instead.

playwrightBrowser controls which Playwright browser is used when the proxy auto-acquires a token. Supported values: edge (default), chrome, chromium, firefox, webkit (msedge is also accepted as an alias for edge).

temporaryChat (default true) enables temporary-chat mode for Substrate by appending disableMemory=1 to the websocket hub URL query string. This prevents Copilot from showing the conversation in its sidebar history.
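
The query-string tweak amounts to something like this sketch (the hub URL below is illustrative; this is not the proxy's actual code):

```typescript
// Sketch: append disableMemory=1 to a websocket hub URL for temporary-chat mode.
function withTemporaryChat(hubUrl: string): string {
  const url = new URL(hubUrl);
  url.searchParams.set("disableMemory", "1");
  return url.toString();
}

console.log(withTemporaryChat("wss://substrate.office.com/m365Copilot/Chathub"));
// → wss://substrate.office.com/m365Copilot/Chathub?disableMemory=1
```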

openAiTransformMode controls how requests are translated for M365 Copilot:

  • simulated (default): sends the full incoming OpenAI JSON payload as a markdown JSON block and asks Copilot to respond in the same endpoint format; proxy extracts JSON from the response block and returns it.
  • mapped: uses the legacy request/response mapping logic.
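
In simulated mode, pulling the JSON payload back out of Copilot's markdown reply could look roughly like this (a simplified sketch, not the proxy's actual extractor):

```typescript
// Sketch: extract and parse the first fenced JSON block from a markdown reply.
// The fence delimiter is built at runtime so this sample nests cleanly in docs.
const fence = "`".repeat(3);

function extractSimulatedPayload(markdown: string): unknown | null {
  const match = markdown.match(new RegExp(fence + "json\\s*([\\s\\S]*?)" + fence));
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // block present but not (yet) fully parseable
  }
}
```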

substrate.earlyCompleteOnSimulatedPayload (default false) controls early websocket completion in simulated mode. It is evaluated in src/proxy/clients.ts, and only triggers once a fully parseable simulated payload is detected. By design, tool-call payloads are excluded from early completion.

substrate.incrementalSimulatedContentStreaming (default false) enables a guarded incremental extractor in the simulated SSE bridge (src/proxy/server.ts) that can emit partial choices[0].message.content before full JSON parse completes.

Simulated Streaming Flag Interaction

These are independent flags at different layers with partial overlap:

  • earlyCompleteOnSimulatedPayload is in the Substrate client loop (src/proxy/clients.ts). It stops reading websocket frames once a fully parseable simulated payload is detected (hasCompleteSimulatedPayload).
  • incrementalSimulatedContentStreaming is in the proxy SSE bridge (src/proxy/server.ts). It can emit partial message.content before full payload parse by using the incremental extractor (src/proxy/openai.ts).

How they combine:

  • earlyComplete=false, incremental=false: parse-then-emit behavior.
  • earlyComplete=true, incremental=false: still parse-then-emit, but upstream websocket may end earlier for plain text simulated payloads.
  • earlyComplete=false, incremental=true: partial content can stream early; websocket still runs normally.
  • earlyComplete=true, incremental=true: fastest plain-text path; incremental emits early text, then websocket can stop early once payload is complete.

Important caveats:

  • Incremental mode is auto-disabled for strict tool-validation flows and structured response format (src/proxy/server.ts).
  • Incremental mode suppresses itself if tool_calls is detected mid-stream (src/proxy/server.ts).
  • earlyCompleteOnSimulatedPayload does not early-complete tool-call payloads by design (src/proxy/clients.ts).
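
For the fastest plain-text path described above, both flags can be enabled together in config.json:

```json
{
  "substrate": {
    "earlyCompleteOnSimulatedPayload": true,
    "incrementalSimulatedContentStreaming": true
  }
}
```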

Use CONFIG__openAiTransformMode=mapped if you need to revert to the legacy behavior.

You can override config values via env vars with the CONFIG__ prefix, for example:

CONFIG__listenUrl=http://localhost:4010 bun run start:proxy

Example: force automatic token acquisition to use Chrome instead of Edge:

CONFIG__playwrightBrowser=chrome bun run start:proxy

To override nested values, use double underscores for each path segment, for example:

CONFIG__substrate__hubPath=wss://substrate.office.com/m365Copilot/Chathub bun run start:proxy
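
The CONFIG__ convention maps onto the config object roughly like this sketch (an illustration of the naming scheme, not the proxy's actual loader):

```typescript
// Sketch: turn CONFIG__-prefixed env vars into a nested config override object.
// Double underscores separate path segments:
//   CONFIG__substrate__hubPath=wss://x  ->  { substrate: { hubPath: "wss://x" } }
function envToConfig(env: Record<string, string>): Record<string, unknown> {
  const config: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(env)) {
    if (!key.startsWith("CONFIG__")) continue;
    const path = key.slice("CONFIG__".length).split("__");
    let node: Record<string, unknown> = config;
    for (const segment of path.slice(0, -1)) {
      node = (node[segment] ??= {}) as Record<string, unknown>;
    }
    node[path[path.length - 1]] = value;
  }
  return config;
}
```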

API endpoints

  • POST /v1/chat/completions
  • POST /openai/v1/chat/completions
  • GET /v1/models
  • GET /openai/v1/models
  • POST /v1/responses
  • POST /openai/v1/responses
  • GET /v1/responses
  • GET /openai/v1/responses
  • GET /v1/responses/{response_id}
  • GET /openai/v1/responses/{response_id}
  • DELETE /v1/responses/{response_id}
  • DELETE /openai/v1/responses/{response_id}

Available models

The proxy accepts any OpenAI-compatible model string, but for Substrate transport it maps known model IDs to a tone value in the outgoing websocket invocation payload.

tone corresponds to the model-type option in the Copilot UI, one of:

  • "Auto" => magic
  • "Quick Response" => Chat
  • "Think Deeper" => Reasoning
  • "GPT5.2 Quick" => Gpt_5_2_Chat
  • "GPT5.2 Think deeper" => Gpt_5_2_Reasoning
  • "GPT5.4 Quick" => Gpt_5_4_Chat
  • "GPT5.4 Think deeper" => Gpt_5_4_Reasoning

Model to Substrate tone mapping:

  • m365-copilot -> magic
  • m365-copilot-auto -> magic
  • m365-copilot-magic -> magic
  • m365-copilot-quick -> Chat
  • m365-copilot-reasoning -> Reasoning
  • m365-copilot-gpt5.2-quick -> Gpt_5_2_Chat
  • m365-copilot-gpt5.2-reasoning -> Gpt_5_2_Reasoning
  • m365-copilot-gpt5.4-quick -> Gpt_5_4_Chat
  • m365-copilot-gpt5.4-reasoning -> Gpt_5_4_Reasoning
  • Any unknown model value -> magic

Notes:

  • If model is omitted, the proxy uses defaultModel from config (defaults to m365-copilot).
  • GET /v1/models (and GET /openai/v1/models) returns the full supported model list above.
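
The mapping above amounts to a simple lookup with a magic fallback (a sketch derived from the table above, not the proxy's source):

```typescript
// Sketch: map an OpenAI-style model id to a Substrate tone, defaulting to "magic".
const MODEL_TONES: Record<string, string> = {
  "m365-copilot": "magic",
  "m365-copilot-auto": "magic",
  "m365-copilot-magic": "magic",
  "m365-copilot-quick": "Chat",
  "m365-copilot-reasoning": "Reasoning",
  "m365-copilot-gpt5.2-quick": "Gpt_5_2_Chat",
  "m365-copilot-gpt5.2-reasoning": "Gpt_5_2_Reasoning",
  "m365-copilot-gpt5.4-quick": "Gpt_5_4_Chat",
  "m365-copilot-gpt5.4-reasoning": "Gpt_5_4_Reasoning",
};

function modelToTone(model: string): string {
  return MODEL_TONES[model] ?? "magic"; // unknown models fall back to magic
}
```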

Chat Completions Tool Calling

The proxy supports OpenAI-style tools and tool_choice for POST /v1/chat/completions.

Example request:

curl -s http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "messages": [
      { "role": "user", "content": "What is the weather in London?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Lookup weather by city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Example tool-call response shape:

{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "created": 1739986369,
  "model": "m365-copilot",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_...",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"London\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

Strictness behavior:

  • If tool_choice is required or a specific function, the proxy returns 400 invalid_tool_output when no valid tool-call JSON can be extracted from assistant output.
  • If tool_choice is auto (or tools are not strictly required), the proxy falls back to a normal assistant text completion when tool-call JSON is not found.

Input normalization notes:

  • JSON-stringified message.content, tool payloads, and function arguments are parsed best-effort and re-serialized to canonical minified JSON when valid.
  • Assistant message content containing serialized tool_calls structures is preserved as tool-call context for downstream Copilot prompt construction.
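
The best-effort re-serialization could look like this sketch (illustrative, not the proxy's implementation):

```typescript
// Sketch: if a string holds valid JSON, re-serialize it to canonical minified
// form; otherwise return it unchanged.
function normalizeJsonString(value: string): string {
  try {
    return JSON.stringify(JSON.parse(value));
  } catch {
    return value; // not JSON: leave as-is
  }
}
```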

Responses API usage

Create response:

curl -s http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "input": "Write a TypeScript function that validates UUIDs."
  }'

Continue a conversation using previous_response_id:

curl -s http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "previous_response_id": "resp_abc123",
    "input": "Now add tests."
  }'

Streaming (stream: true) emits SSE events:

  • response.created
  • response.in_progress
  • response.output_item.added
  • response.output_text.delta
  • response.output_text.done
  • response.output_item.done
  • response.completed
  • error (SSE error event on stream failure)

By default, the proxy ignores inbound Authorization and attempts to use a cached token or auto-acquire one via Playwright for chat/responses requests.

The browser used for that auto-acquisition is controlled by playwrightBrowser in config (or CONFIG__playwrightBrowser in env).

To allow pass-through Authorization headers from clients, set:

CONFIG__ignoreIncomingAuthorizationHeader=false bun run start:proxy

Build executable

bun run build

This produces a single-file executable in dist/ and copies config.json alongside it.

Run CLI

bun run cli -- help
bun run cli -- status
bun run cli -- chat
bun run cli -- chat --api responses
bun run cli -- token set --token "<jwt>"

By default, CLI chat requests do not send an Authorization header. The proxy handles token acquisition when needed. Use --token or YARPILOT_TOKEN only when you want to force a specific token from the CLI.

In chat mode, the CLI supports these slash commands:

  • /status (token + connection status)
  • /api (show current API mode)
  • /api completions or /api responses (toggle endpoint)
  • /token (paste a new token)
  • /cleartoken (clear cached token)
  • /exit (quit)
