Regression (v13.4.0): chroma-mcp still dies instantly on Windows — #2701 cmd.exe quoting fix is incomplete #2716

@Nelie-Taylor

Bug: chroma-mcp subprocess dies instantly on Windows — the v13.4.0 cmd.exe quoting fix (#2701) is incomplete

Follow-up to #2696 (closed as "Shipped in v13.4.0 via #2701"). The quoting fix does not resolve the issue on Windows. Please reopen or track here.

Summary

On Windows, every Chroma sync fails with MCP error -32000: Connection closed. The chroma-mcp subprocess dies in about 19 ms (per the log timestamps), long before it could reach the Chroma server. This happens whether or not Chroma (Docker/remote) is up and reachable.

The root cause is how the worker spawns chroma-mcp on Windows: it wraps the command in cmd.exe /c uvx .... The dependency specs onnxruntime>=1.20 and protobuf<7 contain > and <, which cmd.exe interprets as redirection operators.

The v13.4.0 fix (PR for "chroma-mcp cmd.exe quoting") added per-arg quoting for [<>|&^()], but this does not fix the problem and arguably makes it worse — because the MCP SDK spawns via an argv array with shell: false, so child_process applies its own Windows quoting on top of the manually-injected quotes, producing a malformed command line.

Environment

  • claude-mem: 13.4.0 (worker bundle confirmed running from the updated plugin dir)
  • OS: Windows 11 Pro 26200
  • Runtime: bun (worker), Node v24 (verification)
  • uvx: 0.9.30
  • chroma-mcp: 0.2.6
  • Chroma server: chromadb/chroma:latest (1.0.0) in Docker, 127.0.0.1:8000, /api/v2/version returns 1.0.0 (verified reachable)
  • Settings: CLAUDE_MEM_CHROMA_ENABLED=true, CLAUDE_MEM_CHROMA_MODE=remote, host 127.0.0.1, port 8000, ssl false

Observed log (Docker confirmed UP at this timestamp)

[10:09:44.121] [INFO ] [CHROMA_SYNC] Syncing observation {observationId=735, documentCount=6, project=Nghi}
[10:09:44.122] [INFO ] [CHROMA_MCP] Connecting to chroma-mcp via MCP stdio
  {command=C:\Windows\system32\cmd.exe,
   args=/c uvx --python 3.13 --with "onnxruntime>=1.20" --with "protobuf<7"
        chroma-mcp==0.2.6 --client-type http --host 127.0.0.1 --port 8000 --ssl false}
[10:09:44.141] [WARN ] [CHROMA_MCP] Connection failed, killing subprocess tree to prevent zombie {error=MCP error -32000: Connection closed}
[10:09:44.141] [ERROR] [CHROMA_MCP] Connection attempt failed MCP error -32000: Connection closed
[10:09:44.141] [ERROR] [CHROMA] SDK chroma sync failed, continuing without vector search

Subprocess lifetime: .122 → .141 = 19ms. It never contacts port 8000.

Root cause

Current Windows spawn logic in the worker (minified worker-service.cjs, connectInternal):

// n = isWindows
let s = n ? (process.env.ComSpec || "cmd.exe") : "uvx";
let i = n ? ["/c", "uvx", ...e.map(Lse)] : e;
// ...
this.transport = new StdioClientTransport({ command: s, args: i, env: r, cwd: homedir(), stderr: "pipe" });

with the v13.4.0 quoting helper:

const jse = /[<>|&^()]/;
function Lse(t) { return jse.test(t) ? `"${t.replace(/"/g, '\\"')}"` : t; }

The problem: StdioClientTransport spawns with shell: false and an argv array. On Windows, child_process already does its own command-line quoting for each arg. Manually wrapping args in " via Lse means the literal quote characters get re-escaped by child_process, corrupting the final command line cmd.exe receives.
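To make the double quoting concrete, here is a rough sketch. It approximates the per-argument quoting that Node's child_process (through libuv) applies on Windows when shell is false. It then uses a deliberately naive model of cmd.exe, in which every `"` toggles quoting and a backslash is an ordinary character, to check whether a `>` or `<` still ends up outside quotes. `quoteWindowsArg` and `hasUnquotedRedirect` are hypothetical names written for this issue and are simplifications, not the real libuv or cmd.exe logic. `Lse` is copied from the bundle excerpt above.

```javascript
// Quoting helper copied from the v13.4.0 bundle excerpt.
const jse = /[<>|&^()]/;
function Lse(t) { return jse.test(t) ? `"${t.replace(/"/g, '\\"')}"` : t; }

// ASSUMPTION: simplified re-implementation of the argv quoting that
// Node/libuv applies on Windows when shell is false.
function quoteWindowsArg(arg) {
  if (arg.length === 0) return '""';
  if (!/[ \t"]/.test(arg)) return arg;        // no space, tab or quote: passed through as-is
  if (!/["\\]/.test(arg)) return `"${arg}"`;  // only whitespace: plain wrap in quotes
  // Walk backwards so backslashes that precede a quote can be doubled.
  let reversed = "";
  let quoteHit = true;
  for (let i = arg.length - 1; i >= 0; i--) {
    const ch = arg[i];
    reversed += ch;
    if (quoteHit && ch === "\\") {
      reversed += "\\";
    } else if (ch === '"') {
      quoteHit = true;
      reversed += "\\";                       // literal quote becomes \"
    } else {
      quoteHit = false;
    }
  }
  return `"${reversed.split("").reverse().join("")}"`;
}

// ASSUMPTION: naive cmd.exe model. Each double quote toggles quoting,
// backslash has no special meaning, ^ escapes are ignored.
function hasUnquotedRedirect(commandLine) {
  let inQuotes = false;
  for (const ch of commandLine) {
    if (ch === '"') inQuotes = !inQuotes;
    else if (!inQuotes && (ch === ">" || ch === "<")) return true;
  }
  return false;
}

const raw = "onnxruntime>=1.20";
const rawOnWire = quoteWindowsArg(raw);       // pre-13.4.0 behavior
const lseOnWire = quoteWindowsArg(Lse(raw));  // v13.4.0 behavior

console.log(rawOnWire, hasUnquotedRedirect(rawOnWire));
console.log(lseOnWire, hasUnquotedRedirect(lseOnWire));
```

Under these assumptions, the raw argument reaches cmd.exe unquoted, and the Lse-quoted argument arrives as `"\"onnxruntime>=1.20\""`. cmd.exe does not treat `\"` as an escaped quote, so in the second case the first two quote characters open and close an empty-looking region and the `>` is again outside quotes. That would be consistent with both variants failing.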

Reproduction (isolated, no claude-mem)

Spawning cmd.exe /c uvx ... via child_process.spawn with shell: false, mirroring the SDK:

1. Raw args (pre-13.4.0 behavior):

spawn("cmd.exe", ["/c","uvx","--python","3.13","--with","onnxruntime>=1.20","--with","protobuf<7","chroma-mcp==0.2.6","--client-type","http","--host","127.0.0.1","--port","8000","--ssl","false"], { shell:false })
// → EXIT code=1, stderr: "The system cannot find the file specified."
//   (cmd.exe treats `>` as redirect to a file named `=1.20`)

2. Lse-quoted args (v13.4.0 behavior):

spawn("cmd.exe", ["/c","uvx","--python","3.13","--with",'"onnxruntime>=1.20"',"--with",'"protobuf<7"', ...], { shell:false })
// → EXIT code=1, stderr: "The directory name is invalid."
//   (child_process re-quotes the already-quoted args → malformed command line)

3. Spawn uvx.exe DIRECTLY, no cmd.exe (proposed fix):

spawn("C:/Users/<user>/.local/bin/uvx.exe", ["--python","3.13","--with","onnxruntime>=1.20","--with","protobuf<7","chroma-mcp==0.2.6","--client-type","http","--host","127.0.0.1","--port","8000","--ssl","false"], { shell:false })
// → process stays alive, "Installed 106 packages in 2.88s", clean stdout (chroma-mcp logs to stderr).
//   MCP stream is uncorrupted. Works.
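For anyone who wants to re-run the three cases and compare exit codes and stderr side by side, a small harness along these lines may help. `probe` is a hypothetical helper, not part of claude-mem. It uses `spawnSync` with shell: false, so stdin is closed immediately. A stdio server such as chroma-mcp may therefore exit on EOF instead of running until the timeout, which makes the exit code and stderr the useful signals here rather than the lifetime.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run a command with shell: false and summarize how it ended.
function probe(command, args, timeoutMs = 5000) {
  const started = process.hrtime.bigint();
  const r = spawnSync(command, args, {
    shell: false,        // same as the SDK: argv array, no shell
    encoding: "utf8",
    timeout: timeoutMs,
    windowsHide: true,
  });
  const elapsedMs = Number(process.hrtime.bigint() - started) / 1e6;
  const timedOut = Boolean(r.error && r.error.code === "ETIMEDOUT");
  return {
    code: r.status,                                   // null if killed or never started
    timedOut,                                         // still running when the timeout hit
    spawnError: r.error && !timedOut ? r.error.code : null,
    elapsedMs,
    stderr: (r.stderr || "").trim(),
  };
}

// Self-check against the Node binary so the sketch runs on any platform.
const selfCheck = probe(process.execPath, [
  "-e",
  "process.stderr.write('boom'); process.exit(3)",
]);
console.log(selfCheck);
```

On an affected Windows machine, the same function can be called with `"cmd.exe"` and the argument arrays from cases 1 and 2, or with the resolved uvx.exe path and the raw arguments from case 3.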

Proposed fix

Do not route through cmd.exe on Windows. Spawn the uvx executable directly so shell metacharacters (>, <) are passed as literal argv values and never parsed by a shell:

// Resolve uvx.exe (PATHEXT-aware) once, then spawn directly on all platforms.
const command = uvxPath;     // e.g. resolved `uvx.exe`, or `uvx` on POSIX
const args = e;              // raw args, NO manual quoting, NO `["/c","uvx",...]`
this.transport = new StdioClientTransport({ command, args, env, cwd: homedir(), stderr: "pipe" });

Notes:

  • Drop the Lse quoting entirely — it only exists to compensate for the cmd.exe wrapper.
  • If the cmd.exe wrapper was added because uvx can be a .cmd/.bat shim (which child_process can't spawn directly with shell:false), resolve the real executable via PATHEXT (prefer uvx.exe), or set shell: true only as a fallback when the resolved target is a .cmd/.bat. Spawning the .exe directly avoids the shell entirely and is the robust path.
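One possible shape for the PATHEXT-aware lookup mentioned in the notes is sketched below. `resolveExecutable` and its options are hypothetical and exist only for illustration; they are not claude-mem or MCP SDK APIs. The sketch searches every PATH directory for a `.exe` first and only then falls back to the other PATHEXT extensions, which is a deliberate departure from cmd.exe's own directory-by-directory lookup order. It checks existence only, not whether the file is executable.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: find an executable on PATH, preferring a real .exe on Windows.
// Returns { command, needsShell } or null when nothing is found.
function resolveExecutable(name, opts = {}) {
  const {
    env = process.env,
    platform = process.platform,
    exists = fs.existsSync,   // injectable so the lookup can be exercised without real files
  } = opts;
  const isWindows = platform === "win32";
  const p = isWindows ? path.win32 : path.posix;
  const dirs = (env.PATH || env.Path || "").split(p.delimiter).filter(Boolean);

  if (!isWindows) {
    for (const dir of dirs) {
      const candidate = p.join(dir, name);
      if (exists(candidate)) return { command: candidate, needsShell: false };
    }
    return null;
  }

  const pathext = (env.PATHEXT || ".COM;.EXE;.BAT;.CMD")
    .toLowerCase()
    .split(";")
    .filter(Boolean);
  // Try .exe across all directories before any shim extension.
  const exts = [".exe", ...pathext.filter((ext) => ext !== ".exe")];
  for (const ext of exts) {
    for (const dir of dirs) {
      const candidate = p.join(dir, name + ext);
      if (exists(candidate)) {
        // .cmd/.bat shims cannot be spawned with shell: false.
        return { command: candidate, needsShell: ext === ".cmd" || ext === ".bat" };
      }
    }
  }
  return null;
}

// Demo with made-up paths: a .cmd shim earlier on PATH loses to a later uvx.exe.
const demoFiles = new Set(["C:\\tools\\uvx.cmd", "C:\\bin\\uvx.exe"]);
console.log(
  resolveExecutable("uvx", {
    env: { PATH: "C:\\tools;C:\\bin", PATHEXT: ".COM;.EXE;.BAT;.CMD" },
    platform: "win32",
    exists: (file) => demoFiles.has(file),
  })
);
```

A caller could pass `command` straight to the transport when `needsShell` is false, and treat `needsShell: true` or a `null` result as the signal to fall back or report a clear error.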

Impact

  • Vector search is completely non-functional on Windows. Every observation/summary/prompt sync logs an error and falls back to no-vector-search.
  • Affects all Windows users on both local (persistent) and remote (http) Chroma modes — the failing layer is the subprocess spawn, before any client-type-specific logic runs.

Workaround until fixed

Set CLAUDE_MEM_CHROMA_ENABLED=false. Memory capture and injection continue to work via SQLite FTS; only semantic vector search is unavailable. Re-enabling later triggers a backfill, so no data is lost.
