TypeScript/Bun port of the original M365 Copilot .NET proxy + CLI.
- Bun runtime
- Hono for HTTP routing / reverse-proxy behavior
- Zod for configuration validation
- OpenTUI for interactive CLI chat UI
```sh
bun install
```

Example harness configuration files live in `harness-config-examples/`.
- `harness-config-examples/opencode.jsonc` provides a ready-to-use OpenCode config wired to this proxy on `http://localhost:4000/v1/`.
- Copy and adapt these examples for your local harness setup.
```sh
bun run start:proxy
```

To enable debug markdown logs (requires `debugPath` in config):
```sh
bun run start:proxy -- --debug
```

You can also pass an explicit value:
```sh
bun run start:proxy -- --debug=false
```

Logging level is configured with `logLevel` in `config.json` (default `info`):
- Each proxy startup with `--debug` writes logs into a timestamp-named session subfolder under `debugPath` (for example `logs/2026-02-25T16-58-11-123Z/`).
- Request logs (`incoming-request`, `request`, `substrate-request`) are always written when debug logging is enabled.
- Response log filtering by level:
  - `trace`: includes `substrate-delta`
  - `debug`: includes `substrate-response`, `response`, `response-headers`, and `outgoing-response`
  - `info`: includes `outgoing-response`
  - `warning`: includes `outgoing-response` only for HTTP 4xx status
  - `error`: includes `outgoing-response` only for HTTP 5xx status
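As a hedged illustration of the filtering rules above (names and shape are hypothetical, not the proxy's internals; it also assumes levels are cumulative, which the list implies but does not state), the per-level selection of response log kinds could look like:

```typescript
// Hypothetical sketch: which response log kinds are written at each logLevel.
type LogLevel = "trace" | "debug" | "info" | "warning" | "error";
type ResponseLogKind =
  | "substrate-delta" | "substrate-response" | "response"
  | "response-headers" | "outgoing-response";

function responseLogKinds(level: LogLevel, status: number): ResponseLogKind[] {
  if (level === "trace")
    return ["substrate-delta", "substrate-response", "response",
            "response-headers", "outgoing-response"];
  if (level === "debug")
    return ["substrate-response", "response", "response-headers",
            "outgoing-response"];
  if (level === "info") return ["outgoing-response"];
  if (level === "warning")
    return status >= 400 && status < 500 ? ["outgoing-response"] : []; // 4xx only
  return status >= 500 ? ["outgoing-response"] : []; // error: 5xx only
}
```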
To capture outbound SSE payloads (for text/event-stream responses), set:
```json
{
  "logStreamingResponseBody": true
}
```

When enabled, and debug logging is active (`--debug`) with `logLevel` of `debug` or `trace`, the proxy writes `outgoing-stream-body` markdown logs containing the streamed SSE body.
Default listen URL is `http://localhost:4000`.

Configuration is loaded from `config.json` (and `config.{env}.json` when `NODE_ENV` is set).

Substrate settings are grouped under the `substrate` object in config (for example `substrate.hubPath`).
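A minimal sketch of that layered loading, assuming the env-specific file deep-merges over the base file (an illustration, not the proxy's actual loader):

```typescript
// Hypothetical sketch: merge config.json with config.{env}.json overrides.
type Json = { [key: string]: unknown };

function mergeConfig(base: Json, override: Json): Json {
  const out: Json = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const prev = out[key];
    if (prev && typeof prev === "object" && !Array.isArray(prev) &&
        value && typeof value === "object" && !Array.isArray(value)) {
      out[key] = mergeConfig(prev as Json, value as Json); // merge nested objects
    } else {
      out[key] = value; // scalars and arrays replace wholesale
    }
  }
  return out;
}
```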
`ignoreIncomingAuthorizationHeader` controls whether inbound `Authorization` headers are used by the proxy. The default is `true`, which makes the proxy ignore incoming auth and use cached/auto-fetched tokens instead.

`playwrightBrowser` controls which Playwright browser is used when the proxy auto-acquires a token. Supported values: `edge` (default), `chrome`, `chromium`, `firefox`, `webkit` (`msedge` is also accepted as an alias for `edge`).

`temporaryChat` (default `true`) enables temporary-chat mode for Substrate by appending `disableMemory=1` to the websocket hub URL query string. This prevents Copilot from showing your conversation history in the sidebar.
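The query-string tweak can be illustrated with the standard WHATWG URL API (a sketch; the proxy's actual hub-URL construction may differ):

```typescript
// Hypothetical sketch: append disableMemory=1 when temporaryChat is enabled.
function hubUrl(base: string, temporaryChat: boolean): string {
  const url = new URL(base); // URL supports ws:// and wss:// schemes
  if (temporaryChat) url.searchParams.set("disableMemory", "1");
  return url.toString();
}
```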
`openAiTransformMode` controls how requests are translated for M365 Copilot:

- `simulated` (default): sends the full incoming OpenAI JSON payload as a markdown JSON block and asks Copilot to respond in the same endpoint format; the proxy extracts JSON from the response block and returns it.
- `mapped`: uses the legacy request/response mapping logic.
`substrate.earlyCompleteOnSimulatedPayload` (default `false`) controls early websocket completion in simulated mode. It is evaluated in `src/proxy/clients.ts`, and only triggers once a fully parseable simulated payload is detected. By design, tool-call payloads are excluded from early completion.

`substrate.incrementalSimulatedContentStreaming` (default `false`) enables a guarded incremental extractor in the simulated SSE bridge (`src/proxy/server.ts`) that can emit partial `choices[0].message.content` before the full JSON parse completes.
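To make the "fully parseable simulated payload" idea concrete, here is a hedged sketch (the real detection in `src/proxy/clients.ts` may be more involved): extract a fenced JSON block from the assistant markdown and try to parse it.

```typescript
// Hypothetical sketch: a payload counts as complete once a fenced JSON
// code block is closed and its body parses as JSON.
const FENCE = "`".repeat(3); // avoids writing a literal fence inside this example

function extractSimulatedPayload(markdown: string): unknown | undefined {
  const re = new RegExp(`${FENCE}json\\s*([\\s\\S]*?)${FENCE}`);
  const match = markdown.match(re);
  if (!match) return undefined; // no closed JSON block yet
  try {
    return JSON.parse(match[1]); // fully parseable => payload is complete
  } catch {
    return undefined; // fence closed but JSON still invalid
  }
}
```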
These are independent flags at different layers with partial overlap:
- `earlyCompleteOnSimulatedPayload` is in the Substrate client loop (`src/proxy/clients.ts`). It stops reading websocket frames once a fully parseable simulated payload is detected (`hasCompleteSimulatedPayload`).
- `incrementalSimulatedContentStreaming` is in the proxy SSE bridge (`src/proxy/server.ts`). It can emit partial `message.content` before the full payload parse by using the incremental extractor (`src/proxy/openai.ts`).
How they combine:
- `earlyComplete=false`, `incremental=false`: parse-then-emit behavior.
- `earlyComplete=true`, `incremental=false`: still parse-then-emit, but the upstream websocket may end earlier for plain-text simulated payloads.
- `earlyComplete=false`, `incremental=true`: partial content can stream early; the websocket still runs normally.
- `earlyComplete=true`, `incremental=true`: fastest plain-text path; incremental emits early text, then the websocket can stop early once the payload is complete.
Important caveats:
- Incremental mode is auto-disabled for strict tool-validation flows and structured response format (`src/proxy/server.ts`).
- Incremental mode suppresses itself if `tool_calls` is detected mid-stream (`src/proxy/server.ts`).
- `earlyCompleteOnSimulatedPayload` does not early-complete tool-call payloads by design (`src/proxy/clients.ts`).
Use `CONFIG__openAiTransformMode=mapped` if you need to revert to the legacy behavior.
You can override config values via env vars with the `CONFIG__` prefix, for example:
```sh
CONFIG__listenUrl=http://localhost:4010 bun run start:proxy
```

Example: force automatic token acquisition to use Chrome instead of Edge:
```sh
CONFIG__playwrightBrowser=chrome bun run start:proxy
```

To override nested values, use double underscores for each path segment, for example:
```sh
CONFIG__substrate__hubPath=wss://substrate.office.com/m365Copilot/Chathub bun run start:proxy
```

Supported endpoints:

- `POST /v1/chat/completions` and `POST /openai/v1/chat/completions`
- `GET /v1/models` and `GET /openai/v1/models`
- `POST /v1/responses` and `POST /openai/v1/responses`
- `GET /v1/responses` and `GET /openai/v1/responses`
- `GET /v1/responses/{response_id}` and `GET /openai/v1/responses/{response_id}`
- `DELETE /v1/responses/{response_id}` and `DELETE /openai/v1/responses/{response_id}`
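The `CONFIG__` double-underscore convention described above can be sketched as a small parser (a hypothetical helper, not the proxy's actual env-override code):

```typescript
// Hypothetical sketch: fold CONFIG__-prefixed env vars into a nested object.
// CONFIG__substrate__hubPath=x  =>  { substrate: { hubPath: "x" } }
function envOverrides(env: Record<string, string | undefined>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [name, value] of Object.entries(env)) {
    if (!name.startsWith("CONFIG__") || value === undefined) continue;
    const path = name.slice("CONFIG__".length).split("__");
    let node = out;
    for (const segment of path.slice(0, -1)) {
      node = (node[segment] ??= {}) as Record<string, unknown>; // descend, creating objects
    }
    node[path[path.length - 1]] = value;
  }
  return out;
}
```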
The proxy accepts any OpenAI-compatible model string, but for Substrate transport it maps known model IDs to a `tone` value in the outgoing websocket invocation payload.
`tone` is the Copilot UI option to pick a model type, one of:

- "Auto" => `magic`
- "Quick Response" => `Chat`
- "Think Deeper" => `Reasoning`
- "GPT5.2 Quick" => `Gpt_5_2_Chat`
- "GPT5.2 Think deeper" => `Gpt_5_2_Reasoning`
- "GPT5.4 Quick" => `Gpt_5_4_Chat`
- "GPT5.4 Think deeper" => `Gpt_5_4_Reasoning`
Model to Substrate `tone` mapping:

- `m365-copilot` -> `magic`
- `m365-copilot-auto` -> `magic`
- `m365-copilot-magic` -> `magic`
- Any unknown model value -> `magic`
- `m365-copilot-quick` -> `Chat`
- `m365-copilot-reasoning` -> `Reasoning`
- `m365-copilot-gpt5.2-quick` -> `Gpt_5_2_Chat`
- `m365-copilot-gpt5.2-reasoning` -> `Gpt_5_2_Reasoning`
- `m365-copilot-gpt5.4-quick` -> `Gpt_5_4_Chat`
- `m365-copilot-gpt5.4-reasoning` -> `Gpt_5_4_Reasoning`
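Under the mapping above, a resolver could look like this hedged sketch (the actual lookup lives in the proxy source; the function name is illustrative):

```typescript
// Hypothetical sketch of the model-id -> Substrate tone lookup documented above.
const TONES: Record<string, string> = {
  "m365-copilot": "magic",
  "m365-copilot-auto": "magic",
  "m365-copilot-magic": "magic",
  "m365-copilot-quick": "Chat",
  "m365-copilot-reasoning": "Reasoning",
  "m365-copilot-gpt5.2-quick": "Gpt_5_2_Chat",
  "m365-copilot-gpt5.2-reasoning": "Gpt_5_2_Reasoning",
  "m365-copilot-gpt5.4-quick": "Gpt_5_4_Chat",
  "m365-copilot-gpt5.4-reasoning": "Gpt_5_4_Reasoning",
};

// Omitted model => defaultModel from config; unknown model => "magic".
function resolveTone(model: string | undefined, defaultModel = "m365-copilot"): string {
  return TONES[model ?? defaultModel] ?? "magic";
}
```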
Notes:
- If `model` is omitted, the proxy uses `defaultModel` from config (defaults to `m365-copilot`).
- `GET /v1/models` (and `GET /openai/v1/models`) returns the full supported model list above.
The proxy supports OpenAI-style `tools` and `tool_choice` for `POST /v1/chat/completions`.
Example request:
```sh
curl -s http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "messages": [
      { "role": "user", "content": "What is the weather in London?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Lookup weather by city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```

Example tool-call response shape:
```json
{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "created": 1739986369,
  "model": "m365-copilot",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_...",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"London\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```

Strictness behavior:
- If `tool_choice` is `required` or a specific `function`, the proxy returns `400 invalid_tool_output` when no valid tool-call JSON can be extracted from the assistant output.
- If `tool_choice` is `auto` (or tools are not strictly required), the proxy falls back to a normal assistant text completion when tool-call JSON is not found.
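Those strictness rules can be sketched as follows (hypothetical names and a simplified extraction step; the proxy's real extraction is more involved):

```typescript
// Hypothetical sketch: try to extract tool-call JSON from assistant output;
// on failure, the outcome depends on tool_choice strictness.
type Outcome =
  | { kind: "tool_calls"; calls: unknown[] }
  | { kind: "text"; content: string }
  | { kind: "error"; status: number; code: string };

function finishToolResponse(
  assistantText: string,
  toolChoice: "auto" | "required" | { type: "function" },
): Outcome {
  let calls: unknown[] | undefined;
  try {
    const parsed = JSON.parse(assistantText);
    if (Array.isArray(parsed?.tool_calls)) calls = parsed.tool_calls;
  } catch {
    // no valid tool-call JSON found
  }
  if (calls) return { kind: "tool_calls", calls };
  if (toolChoice === "auto") return { kind: "text", content: assistantText }; // lax fallback
  return { kind: "error", status: 400, code: "invalid_tool_output" }; // strict failure
}
```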
Input normalization notes:
- JSON-stringified `message.content`, tool payloads, and function arguments are parsed best-effort and re-serialized to canonical minified JSON when valid.
- Assistant message content containing serialized `tool_calls` structures is preserved as tool-call context for downstream Copilot prompt construction.
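The best-effort normalization in the first bullet amounts to something like this sketch (illustrative helper name, not the proxy's code):

```typescript
// Hypothetical sketch: if a string holds valid JSON, re-serialize it to
// canonical minified form; otherwise pass it through unchanged.
function canonicalizeIfJson(value: string): string {
  try {
    return JSON.stringify(JSON.parse(value));
  } catch {
    return value; // not JSON: leave as-is
  }
}
```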
Create response:
```sh
curl -s http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "input": "Write a TypeScript function that validates UUIDs."
  }'
```

Continue a conversation using `previous_response_id`:
```sh
curl -s http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "x-m365-transport: substrate" \
  -d '{
    "model": "m365-copilot",
    "previous_response_id": "resp_abc123",
    "input": "Now add tests."
  }'
```

Streaming (`stream: true`) emits SSE events:
- `response.created`
- `response.in_progress`
- `response.output_item.added`
- `response.output_text.delta`
- `response.output_text.done`
- `response.output_item.done`
- `response.completed`
- `error` (SSE error event on stream failure)
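On the client side, those events can be consumed by splitting the stream on blank lines, per the SSE wire format (a minimal sketch, not a full spec-compliant SSE parser):

```typescript
// Hypothetical sketch: split an SSE text buffer into (event, data) pairs.
function parseSse(buffer: string): Array<{ event: string; data: string }> {
  const out: Array<{ event: string; data: string }> = [];
  for (const chunk of buffer.split("\n\n")) { // blank line separates events
    let event = "message"; // SSE default event name
    const data: string[] = [];
    for (const line of chunk.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trimStart());
    }
    if (data.length) out.push({ event, data: data.join("\n") });
  }
  return out;
}
```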
By default, the proxy ignores inbound `Authorization` and attempts to use a cached token or auto-acquire one via Playwright for chat/responses requests.

The browser used for that auto-acquisition is controlled by `playwrightBrowser` in config (or `CONFIG__playwrightBrowser` in env).

To allow pass-through `Authorization` headers from clients, set:
```sh
CONFIG__ignoreIncomingAuthorizationHeader=false bun run start:proxy
```

```sh
bun run build
```

This produces a single-file executable in `dist/` and copies `config.json` alongside it.
```sh
bun run cli -- help
bun run cli -- status
bun run cli -- chat
bun run cli -- chat --api responses
bun run cli -- token set --token "<jwt>"
```

By default, CLI chat requests do not send an Authorization header. The proxy handles token acquisition when needed. Use `--token` or `YARPILOT_TOKEN` only when you want to force a specific token from the CLI.
In chat mode, the CLI supports these slash commands:
- `/status` (token + connection status)
- `/api` (show current API mode)
- `/api completions` or `/api responses` (toggle endpoint)
- `/token` (paste a new token)
- `/cleartoken` (clear cached token)
- `/exit` (quit)
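A dispatcher for slash commands like those might look like this (purely illustrative; the CLI's real handling lives in its own source, and the handler bodies here are placeholders):

```typescript
// Hypothetical sketch: route a chat input line to a slash-command handler.
type Handler = (args: string[]) => string;

const commands: Record<string, Handler> = {
  status: () => "token + connection status",
  api: (args) => (args[0] ? `api mode set to ${args[0]}` : "current api mode"),
  token: () => "paste a new token",
  cleartoken: () => "cached token cleared",
  exit: () => "bye",
};

function dispatch(line: string): string | undefined {
  if (!line.startsWith("/")) return undefined; // plain chat message, not a command
  const [name, ...args] = line.slice(1).split(/\s+/);
  return commands[name]?.(args);
}
```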