Background model MiniMax-M2.5 is expensive and uncontrollable

### Summary

Command Code uses background models (MiniMax-M2.5, Kimi-K2.5) for internal tasks such as title generation, tool call naming, and taste learning. These calls are **not user-selectable**, yet they appear in usage logs and **consume credits**. In my sessions they account for a significant share of total cost, sometimes roughly half the spend for a task, because tool-description calls appear to be made on roughly every other request.

### Problems

1. **MiniMax-M2.5 is not cost-effective for trivial tasks.** Title generation and tool naming are lightweight operations that do not require a model with hundreds of billions of parameters. Much smaller and cheaper alternatives (e.g. Qwen 3 Flash, DeepSeek V4 Flash, Gemma 3 4B, Phi-4-mini) would handle these tasks at a fraction of the cost.

2. **Users have no control over the background model.** The `/model` selector only affects the primary conversation model. There is no way to choose, downgrade, or disable the background model, yet the resulting token usage is billed to the user's account.

3. **Frequency of background calls inflates cost.** Usage logs show tool-description and summary calls being made repeatedly (on every other request in some sessions). Even at a low per-call cost, this adds up over a session.
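
To make the cost concern concrete, here is a minimal sketch of how the background share could be computed from exported usage rows. The record shape (`model`, `cost_usd`) and the sample numbers are assumptions for illustration only; they are not Command Code's actual export format or real billing data.

```python
from collections import defaultdict

# Models this issue identifies as background-only (not selectable via /model).
BACKGROUND_MODELS = {"MiniMax-M2.5", "Kimi-K2.5"}


def background_share(records):
    """Return (cost per model, fraction of total cost spent on background models)."""
    per_model = defaultdict(float)
    for rec in records:
        per_model[rec["model"]] += rec["cost_usd"]
    total = sum(per_model.values())
    background = sum(c for m, c in per_model.items() if m in BACKGROUND_MODELS)
    share = background / total if total else 0.0
    return dict(per_model), share


# Made-up rows: hypothetical field names and costs, not real usage data.
sample = [
    {"model": "primary-model", "cost_usd": 0.50},
    {"model": "MiniMax-M2.5", "cost_usd": 0.25},
    {"model": "primary-model", "cost_usd": 0.50},
    {"model": "MiniMax-M2.5", "cost_usd": 0.25},
    {"model": "Kimi-K2.5", "cost_usd": 0.50},
]

per_model, share = background_share(sample)
print(per_model)
print(f"background share: {share:.0%}")
```

With these made-up rows the background models account for half of the total, which is the kind of ratio described in the summary.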

### Expected Behavior

* Background helper tasks should use the cheapest viable model available, not a mid-tier general-purpose one.
* Users should be able to see which model was used for each call in the usage dashboard (currently it just shows "MiniMax-M2.5" alongside the selected model without explanation).
* Ideally, users should be able to opt out of, or at least select the model for, these background operations.

### Actual Behavior

* MiniMax-M2.5 (a large general-purpose model) is used for trivial text summarization tasks.
* These calls are invisible during the session — they only show up in the usage dashboard after the fact.
* The cumulative cost of these background calls can rival or exceed the cost of the primary model calls.

### Suggestions

* Switch background tasks to a small, fast, cheap model (Qwen 3 Flash, DeepSeek V4 Flash, or similar sub-cent-per-million-token models).
* Surface background model usage separately in the UI so users understand what they're being billed for.
* Consider making taste learning and title generation opt-in per session, not just globally via `/taste` (as suggested in #326).
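
As a rough illustration of the control being requested, the sketch below resolves a model per background task, with per-task overrides and an opt-out. Everything here is hypothetical: `BackgroundSettings`, the task names, and `"small-cheap-model"` are placeholders, not existing Command Code settings or APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class BackgroundSettings:
    """Hypothetical per-user settings for background helper tasks."""

    default_model: str = "small-cheap-model"  # placeholder name
    overrides: Dict[str, str] = field(default_factory=dict)  # task -> model
    disabled: Set[str] = field(default_factory=set)  # tasks the user opted out of

    def model_for(self, task: str) -> Optional[str]:
        """Return the model to use for a task, or None if the task is disabled."""
        if task in self.disabled:
            return None
        return self.overrides.get(task, self.default_model)


settings = BackgroundSettings(
    overrides={"taste_learning": "MiniMax-M2.5"},
    disabled={"title_generation"},
)

print(settings.model_for("title_generation"))  # disabled, so no call is made
print(settings.model_for("taste_learning"))    # explicit override
print(settings.model_for("tool_naming"))       # falls back to the cheap default
```

The point of the shape is that the default stays cheap, and any use of a larger model or any background task at all becomes an explicit user choice.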

### Related

* #326 — "Unexpected usage of multiple models during a single session" (closed as non-issue, but the underlying concern about cost transparency remains)

### Environment

* **Command Code version:** latest
* **OS:** Windows
* **Shell:** PowerShell


Background model MiniMax-M2.5 is expensive and uncontrollable #440