Design Rationale

Instead of running a single backend and restarting on venv changes, typemux-cc maintains a pool of concurrent backend processes — one per venv. This eliminates restart overhead when switching between projects in a monorepo.

Before: Single Backend

  • Switch venv → kill process → spawn new → re-index
  • 5-10s downtime per switch
  • Lost state on every transition

After: Multi-Backend Pool

  • Keep multiple backends alive
  • Instant switching (already indexed)
  • Preserve state per venv
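The switching model above can be pictured as a map keyed by venv path. A minimal Rust sketch (types simplified; `Backend` stands in for the full BackendInstance described below):

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// Minimal sketch: the pool keeps one live backend per venv, keyed by venv path.
// `Backend` is a simplified stand-in for the full BackendInstance.
struct Backend {
    session: u64,
}

struct Pool {
    backends: HashMap<PathBuf, Backend>,
}

impl Pool {
    // A hit means the backend is already running and indexed: switching is instant.
    fn lookup(&self, venv: &PathBuf) -> Option<&Backend> {
        self.backends.get(venv)
    }
}
```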

Pool Configuration

Basic Settings

| Parameter | CLI Flag | Environment Variable | Default | Description |
|---|---|---|---|---|
| Max backends | `--max-backends` | `TYPEMUX_CC_MAX_BACKENDS` | `8` | Upper limit on concurrent backend processes |
| Backend TTL | `--backend-ttl` | `TYPEMUX_CC_BACKEND_TTL` | `1800` (30 min) | Idle timeout in seconds (`0` = disabled) |
Example:

typemux-cc --max-backends 4 --backend-ttl 900

Monorepo with 2-3 projects

--max-backends 4 --backend-ttl 1800
  • Keeps 1 backend per project + 1 spare
  • 30-minute TTL cleans up unused backends

Large monorepo with frequent switches

--max-backends 8 --backend-ttl 3600
  • Higher pool size for frequent project switches
  • 1-hour TTL for long coding sessions

Single project with nested venvs

--max-backends 2 --backend-ttl 0
  • Minimal pool (1 backend + 1 spare for nested venvs)
  • Disable TTL (backends are never evicted)

Memory-constrained systems

--max-backends 2 --backend-ttl 600
  • Limit concurrent backends to 2
  • Aggressive 10-minute TTL

Backend Instance Structure

Each backend in the pool is represented by a BackendInstance (src/backend_pool.rs:44-55):
pub struct BackendInstance {
    pub writer: LspFrameWriter<ChildStdin>,  // Write JSON-RPC to backend
    pub child: Child,                        // Process handle
    pub venv_path: PathBuf,                  // Key: which venv this serves
    pub session: u64,                        // Unique session ID
    pub last_used: Instant,                  // For LRU tracking
    pub reader_task: JoinHandle<()>,         // Background reader task
    pub next_id: u64,                        // Next request ID
    pub warmup_state: WarmupState,           // Warming or Ready
    pub warmup_deadline: Instant,            // When to transition to Ready
    pub warmup_queue: Vec<RpcMessage>,       // Queued requests during warmup
}
The session field is critical for stale message detection. See Architecture: Session-Based Detection.

LRU Eviction Strategy

When the pool reaches max_backends and a new backend is needed, typemux-cc evicts the least recently used (LRU) backend.

LRU Selection Algorithm

Code reference: src/backend_pool.rs:156-174
pub fn lru_venv(&self, pending_count_fn: impl Fn(&PathBuf, u64) -> usize) -> Option<PathBuf> {
    // First try: find LRU among backends with 0 pending requests
    let no_pending_lru = self
        .backends
        .iter()
        .filter(|(venv, inst)| pending_count_fn(venv, inst.session) == 0)
        .min_by_key(|(_, inst)| inst.last_used)
        .map(|(venv, _)| venv.clone());

    if no_pending_lru.is_some() {
        return no_pending_lru;
    }

    // Fallback: LRU among all backends
    self.backends
        .iter()
        .min_by_key(|(_, inst)| inst.last_used)
        .map(|(venv, _)| venv.clone())
}
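The two-phase selection can be exercised in isolation. A self-contained sketch mirroring the algorithm above (`Inst` and `pick_eviction_victim` are simplified stand-ins; `last_used` is a plain counter rather than an Instant):

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// Simplified stand-in for BackendInstance, with `last_used` as a counter.
struct Inst {
    session: u64,
    last_used: u64,
}

// Two-phase LRU selection: prefer the least recently used backend with no
// in-flight requests, and only fall back to a busy backend if all are busy.
fn pick_eviction_victim(
    backends: &HashMap<PathBuf, Inst>,
    pending_count: impl Fn(&PathBuf, u64) -> usize,
) -> Option<PathBuf> {
    backends
        .iter()
        .filter(|(venv, inst)| pending_count(venv, inst.session) == 0)
        .min_by_key(|(_, inst)| inst.last_used)
        .map(|(venv, _)| venv.clone())
        .or_else(|| {
            // Fallback: plain LRU across all backends.
            backends
                .iter()
                .min_by_key(|(_, inst)| inst.last_used)
                .map(|(venv, _)| venv.clone())
        })
}
```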

Eviction Sequence

What Happens During Eviction

  1. Cancel pending requests: All client requests waiting for responses from this backend receive a cancellation error (-32800)
  2. Clean up backend requests: Remove any pending backend→client requests from tracking
  3. Clear diagnostics: Send empty diagnostic messages to client to clear stale errors from evicted backend
  4. Shutdown process: Send LSP shutdown + exit to backend, kill process if unresponsive
  5. Abort reader task: Stop the background task reading from backend’s stdout
If you see frequent evictions in logs (grep "Evicting LRU backend" /tmp/typemux-cc.log), increase --max-backends to reduce thrashing.
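Step 1 can be sketched as building a JSON-RPC error per cancelled request. A std-only illustration (function name and message text are hypothetical; only the -32800 code comes from the step above):

```rust
// Sketch of the cancellation error produced for each pending client request
// when its backend is evicted. -32800 is the LSP RequestCancelled code;
// the message wording here is illustrative, not typemux-cc's exact text.
fn cancellation_response(request_id: u64) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{request_id},\"error\":{{\"code\":-32800,\"message\":\"backend evicted\"}}}}"
    )
}
```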

TTL-Based Eviction

Backends idle for longer than backend_ttl are automatically evicted to free resources.

TTL Sweep Mechanism

  • Interval: 60 seconds (hardcoded in src/proxy/mod.rs event loop)
  • Check: Instant::now() - last_used >= backend_ttl
  • Safety: Skip backends with pending requests (both client→backend and backend→client)
Code reference: src/proxy/pool_management.rs:95-166
pub async fn evict_expired_backends(&mut self, ...) -> Result<(), ProxyError> {
    let expired = self.state.pool.expired_venvs();
    if expired.is_empty() {
        return Ok(());
    }

    for venv_path in expired {
        let session = ...;

        // Skip if there are pending client→backend requests
        let pending_count = self.state.pending_requests
            .values()
            .filter(|p| p.venv_path == venv_path && p.backend_session == session)
            .count();
        if pending_count > 0 {
            continue;  // Don't evict — backend is in use
        }

        // Skip if there are pending backend→client requests
        let pending_backend_count = self.state.pending_backend_requests
            .values()
            .filter(|p| p.venv_path == venv_path && p.session == session)
            .count();
        if pending_backend_count > 0 {
            continue;  // Don't evict — backend is in use
        }

        // Safe to evict — no pending requests
        ...
    }
}

TTL Behavior Examples

# Backend idle for 30 minutes → evicted
typemux-cc  # Uses default --backend-ttl 1800
Timeline:
  • 10:00 AM: User opens file in project-a → backend spawned
  • 10:30 AM: User switches to project-b → project-a backend becomes idle
  • 11:00 AM: TTL expires, project-a backend evicted (pool size: 1)
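The sweep's expiry check can be sketched in a few lines (function name hypothetical; the comparison mirrors the Instant::now() - last_used >= backend_ttl check described above):

```rust
use std::time::{Duration, Instant};

// A backend is a TTL candidate once it has been idle for at least
// backend_ttl seconds; ttl == 0 disables TTL eviction entirely.
fn is_ttl_expired(last_used: Instant, ttl_secs: u64) -> bool {
    ttl_secs != 0 && last_used.elapsed() >= Duration::from_secs(ttl_secs)
}
```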

Session Tracking

Session ID Generation

Each backend gets a unique, monotonically increasing session ID when spawned.

Code reference: src/backend_pool.rs:176-180
pub fn next_session_id(&mut self) -> u64 {
    self.next_session += 1;
    self.next_session
}
The counter starts at 0, so each new backend (including re-spawns after crashes) receives the next ID: 1, 2, 3, …

Session Validation

Every message received from a backend includes its session ID. Before processing, the proxy checks whether that session is still current.

Code reference: src/proxy/backend_dispatch.rs:23-49
let is_current = self
    .state
    .pool
    .get(&venv_path)
    .is_some_and(|inst| inst.session == session);

if !is_current {
    // Discard stale message from evicted/crashed backend
    return Ok(());
}

Why Session IDs Matter

Without Session IDs

Problem: A backend crashes and a new backend is spawned for the same venv path. Late responses from the old process are forwarded to the client, delivering wrong data.

Example:
  1. Backend session 1 serves project-a/.venv
  2. Client sends request ID 42
  3. Backend session 1 crashes
  4. New backend session 2 spawned for project-a/.venv
  5. Old response from session 1 arrives (wrong index state)
  6. ❌ Client receives stale response

With Session IDs

Solution: Responses from old sessions are discarded.

Example:
  1. Backend session 1 serves project-a/.venv
  2. Client sends request ID 42 (recorded as session 1)
  3. Backend session 1 crashes
  4. New backend session 2 spawned for project-a/.venv
  5. Old response from session 1 arrives
  6. ✅ Proxy discards (session 1 != session 2)
  7. Client receives cancellation error for request 42
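The bookkeeping behind this walkthrough can be sketched as follows (names hypothetical): each pending request records the session it was sent to, and a response is forwarded only when it arrives from the still-current session:

```rust
use std::collections::HashMap;

// Hypothetical bookkeeping entry: request id -> the session it was sent to.
struct Pending {
    backend_session: u64,
}

fn handle_backend_response(
    pending: &mut HashMap<u64, Pending>,
    current_session: u64,
    request_id: u64,
    response_session: u64,
) -> &'static str {
    if response_session != current_session {
        // Stale: the sending backend has been evicted or crashed.
        return "discard stale response";
    }
    match pending.remove(&request_id) {
        Some(p) if p.backend_session == current_session => "forward to client",
        _ => "discard unknown id",
    }
}
```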

Pool State Inspection

Current Pool Status

To see which backends are in the pool:
# Enable debug logging
RUST_LOG=debug typemux-cc

# Tail logs
tail -f /tmp/typemux-cc.log
Look for log lines like:
[INFO] Creating new backend for venv=/path/to/project-a/.venv session=1
[INFO] Backend already in pool, reusing session=1
[INFO] Evicting LRU backend venv=/path/to/project-b/.venv session=2

Pool Activity Monitoring

# Real-time pool changes
tail -f /tmp/typemux-cc.log | grep --line-buffered -E "(Creating new backend|Evicting|Backend warmup)"

# Most recent backend spawns (total spawns, not the current pool size)
grep "Creating new backend" /tmp/typemux-cc.log | tail -n 8

# Session lifecycle
grep "session=" /tmp/typemux-cc.log | grep -E "(Starting|completed|Evicting)"

Memory Considerations

Each backend process (pyright/ty/pyrefly) typically uses:
| Backend | Typical Memory | Peak Memory | Notes |
|---|---|---|---|
| pyright | 100-300 MB | 500 MB | Higher for large codebases |
| ty | 200-400 MB | 800 MB | Rust-based, aggressive caching |
| pyrefly | 150-350 MB | 600 MB | Similar to pyright |
Rule of thumb: Allow ~500 MB per backend. For --max-backends 8, reserve ~4 GB RAM for the pool.

Memory-Constrained Recommendations

If running on systems with limited RAM (e.g., 8 GB with other applications):
# Conservative pool size + aggressive TTL
typemux-cc --max-backends 2 --backend-ttl 600
Or monitor with:
# Watch memory usage
watch -n 5 'ps aux | grep -E "(pyright|ty|pyrefly)" | grep -v grep'

Performance Tuning

Monorepo with Frequent Switches

Problem: Switching between 5 projects every few minutes.

Solution:
# Large pool + long TTL
typemux-cc --max-backends 8 --backend-ttl 3600

Single Project with Nested Venvs

Problem: Main project + test venv + docs venv (3 total).

Solution:
# Small pool + no TTL
typemux-cc --max-backends 4 --backend-ttl 0

CI/CD or Short-Lived Sessions

Problem: Running typemux-cc in automated environments (tests, CI).

Solution:
# Minimal pool + disable TTL (process exits soon anyway)
typemux-cc --max-backends 2 --backend-ttl 0
Start with defaults (--max-backends 8 --backend-ttl 1800) and adjust based on grep "Evicting" /tmp/typemux-cc.log frequency.