siteboon/claudecodeui · commit 10f721cf · architecture deconstruction

How CloudCLI's streaming pipe is plumbed

CloudCLI is a single Node/Express process that serves a React SPA + a WebSocket gateway. Two completely independent pipes run side by side: a chat pipe on /ws where the server calls the Claude SDK in-process (no subprocess!) and pushes normalized JSON events down to a React card UI, and a shell pipe on /shell where the server spawns the agent CLI inside a real node-pty pseudo-terminal and tunnels raw ANSI bytes both directions into xterm.js. Same WebSocket library, same JSON envelope shape — completely different rendering paths, completely different input mechanics. This is the map.

The most surprising fact: for the chat panel, CloudCLI does not spawn Claude as a child process. The server imports query from @anthropic-ai/claude-agent-sdk and runs the agent in-process via an async generator: no PTY, no spawn(), no stdout listening in CloudCLI's own code. That's why the chat panel has structured tool cards and the shell panel has raw ANSI: they are literally not the same code path.

Chat pipe /ws · in-process SDK

Browser sends a JSON claude-command over WebSocket · server invokes Claude SDK directly · async generator yields events · normalizer types them · writer pushes JSON down WS · React dispatches on kind to render tool / message / thinking cards.

[Diagram · chat pipe: Browser (React, ChatComposer.tsx) sends {type:'claude-command', command, options} over WS /ws → handleChatConnection (server/modules/websocket) → queryClaudeSDK() → SDK.query(), in-process, for await (msg) → normalizeMessage() → NormalizedMessage {kind, ...} → WebSocketWriter, ws.send(JSON.stringify), no backpressure → JSON over /ws → React dispatch, switch(msg.kind) → ToolRenderer (Bash · Edit · TodoWrite ...)]

Walkthrough · chat pipe

1
src/components/chat/ChatComposer.tsx

The browser composes a claude-command JSON message and sends it over the /ws WebSocket. The whole prompt goes in one shot: there is no token-by-token typing on the wire, and the command field is the entire user message. options carries sessionId (for resume), cwd, model, etc.

Source · message shape
// wire shape sent over /ws
{
  type: 'claude-command',
  command: 'help me refactor this file...',
  options: {
    sessionId: 'a1b2c3...',    // optional · for resume
    cwd:       '/workspace',
    model:     'sonnet',
    permissionMode: 'bypassPermissions',
  },
}
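The wire shape above can be pinned down with a type and a defensive parser. This is a minimal sketch: the interface names, buildClaudeCommand, and parseClaudeCommand are hypothetical and inferred only from the fields shown above, not copied from the repository.

```typescript
// Hypothetical types inferred from the wire shape above; not taken from the repo.
interface ClaudeCommandOptions {
  sessionId?: string; // present only when resuming a session
  cwd: string;
  model: string;
  permissionMode: string;
}

interface ClaudeCommand {
  type: 'claude-command';
  command: string; // the entire user prompt, sent in one frame
  options: ClaudeCommandOptions;
}

// Build the single frame that carries the whole prompt.
function buildClaudeCommand(prompt: string, options: ClaudeCommandOptions): string {
  const message: ClaudeCommand = { type: 'claude-command', command: prompt, options };
  return JSON.stringify(message);
}

// Receiving-side guard: accept only well-formed claude-command frames.
function parseClaudeCommand(raw: string): ClaudeCommand | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const candidate = parsed as Partial<ClaudeCommand>;
  if (candidate.type !== 'claude-command') return null;
  if (typeof candidate.command !== 'string') return null;
  if (typeof candidate.options !== 'object' || candidate.options === null) return null;
  return candidate as ClaudeCommand;
}
```

Because the prompt travels as one frame, validation only has to happen once per user turn, before the SDK is invoked.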
2
server/claude-sdk.js :15, :614–627

The pivotal moment. The server does NOT spawn claude as a subprocess. It calls the SDK's async generator query() directly. No PTY, no spawn(), no stdio piping. The SDK itself invokes the Claude Code binary internally — but from CloudCLI's perspective it's just for await (const message of queryInstance) in the same Node process.

Why this matters for ainb: if you want ainb to drive Claude via CloudCLI's wire format, you're driving a Node process that links the SDK — not a CLI invocation. The "container holds a Claude process" mental model from claude -p doesn't apply here. The container holds a Node server that talks to Anthropic.
Source · in-process SDK call
// server/claude-sdk.js
import { query } from '@anthropic-ai/claude-agent-sdk';

const queryInstance = query({
  prompt: command,
  options: sdkOptions,   // resume, model, permissionMode, etc.
});

for await (const message of queryInstance) {
  // each iteration = one SDK message
  // message.type: 'system' | 'assistant' | 'user' | 'result' | 'stream_event'
  // ... normalize and forward each message (steps 3 and 4)
}
3
server/modules/sessions/services/sessions.service.ts · normalizeMessage()

Every raw SDK event is passed through sessionsService.normalizeMessage('claude', msg, sid) which maps it to a provider-agnostic shape with a kind discriminator: stream_delta (token text), tool_use (tool starting), tool_result (tool output), thinking (reasoning), session_created (new sid), complete (turn done), error, permission_request.

This normalizer is what allows Gemini, Cursor, and Codex (which DO spawn as subprocesses with --output-format stream-json) to feed the same React UI — every provider's events end up in the same NormalizedMessage shape.
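A minimal sketch of the normalizer idea follows. The kind names come from the list above; the payload fields, the RawEvent shapes, and the normalize function are assumptions made for illustration and do not reproduce the real SDK types or the repository's normalizeMessage().

```typescript
// Provider-agnostic shape. Kind names are from the text; payload fields are assumed.
type NormalizedMessage =
  | { kind: 'stream_delta'; sessionId: string; text: string }
  | { kind: 'tool_use'; sessionId: string; toolName: string; input: unknown }
  | { kind: 'tool_result'; sessionId: string; output: string }
  | { kind: 'complete'; sessionId: string }
  | { kind: 'error'; sessionId: string; message: string };

// Hypothetical raw event shapes for two providers (NOT the real wire formats).
type RawEvent =
  | { provider: 'claude'; type: 'text'; delta: string }
  | { provider: 'claude'; type: 'tool'; name: string; args: unknown }
  | { provider: 'codex'; event: 'output'; chunk: string }
  | { provider: 'codex'; event: 'done' };

// Map each provider's vocabulary onto the shared kind discriminator.
function normalize(raw: RawEvent, sessionId: string): NormalizedMessage {
  if (raw.provider === 'claude') {
    return raw.type === 'text'
      ? { kind: 'stream_delta', sessionId, text: raw.delta }
      : { kind: 'tool_use', sessionId, toolName: raw.name, input: raw.args };
  }
  return raw.event === 'output'
    ? { kind: 'stream_delta', sessionId, text: raw.chunk }
    : { kind: 'complete', sessionId };
}
```

The point of the design is that two differently shaped inputs collapse into the same output, so the UI never needs to know which provider produced a frame.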

4
server/modules/websocket/services/websocket-writer.service.ts :24–28

The normalized message is serialized with JSON.stringify and pushed over the WS via ws.send(). No buffering policy, no chunking, no backpressure handling. If the browser stalls, frames fill the OS socket buffer and then pile up in the Node process's own send buffer (visible as ws.bufferedAmount), because nothing slows the producer down. The writer just guards on readyState === WS_OPEN_STATE and fires.

Production-grade limitation: a stalled browser does not block the event loop, but it can grow server memory without bound. CloudCLI's design assumes the browser keeps up. An ainb integration should account for this when streaming large outputs.
Source · the entire send path
// server/modules/websocket/services/websocket-writer.service.ts
send(data: unknown): void {
  if (this.ws.readyState === WS_OPEN_STATE) {
    this.ws.send(JSON.stringify(data));
  }
}
// ↑ that's literally the whole thing. no queue. no rate limit.
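For contrast, here is a sketch of what a bounded writer could look like. It is not CloudCLI's code: BoundedWriter, SocketLike, and the maxBufferedBytes threshold are hypothetical. It relies only on readyState and bufferedAmount, which both the browser WebSocket and the ws library expose. Dropping frames is just one possible policy and would be wrong for frames such as tool_result; pausing the producer is the other common choice.

```typescript
// Minimal socket surface this sketch depends on.
interface SocketLike {
  readyState: number;
  bufferedAmount: number; // bytes queued but not yet flushed to the network
  send(data: string): void;
}

const OPEN = 1; // numeric value of WebSocket.OPEN

class BoundedWriter {
  private readonly socket: SocketLike;
  private readonly maxBufferedBytes: number; // hypothetical threshold
  private dropped = 0;

  constructor(socket: SocketLike, maxBufferedBytes: number) {
    this.socket = socket;
    this.maxBufferedBytes = maxBufferedBytes;
  }

  // Returns true if the frame was handed to the socket, false if it was skipped.
  send(data: unknown): boolean {
    if (this.socket.readyState !== OPEN) return false;
    if (this.socket.bufferedAmount > this.maxBufferedBytes) {
      this.dropped += 1; // the client is not draining; refuse to queue more
      return false;
    }
    this.socket.send(JSON.stringify(data));
    return true;
  }

  get droppedCount(): number {
    return this.dropped;
  }
}
```

A caller could use the boolean return value to pause consumption of the SDK generator until the socket drains, instead of discarding frames.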
5
src/components/chat/hooks/useChatRealtimeHandlers.ts · ToolRenderer.tsx

Browser receives the JSON, dispatches on msg.kind. stream_delta accumulates text · tool_use spawns a tool card · tool_result fills it · thinking renders a collapsible reasoning block · complete ends the turn. The chat panel is not a terminal — there's no ANSI parsing here. Each event becomes a typed React component.

ToolRenderer.tsx maps every Claude tool name to a display: Bash → collapsible command+output, Edit/Write/ApplyPatch → diff viewer, TodoWrite → todo list, Task → subagent container with nested children, AskUserQuestion → answerable card.
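The dispatch-on-kind step can be sketched as a pure reducer. ChatEvent, ChatState, ToolCard, and applyEvent are hypothetical names, and the toolId field used to pair a result with its card is an assumption; the real hook and ToolRenderer.tsx are not reproduced here.

```typescript
// Assumed event payloads; only the kind names come from the text above.
type ChatEvent =
  | { kind: 'stream_delta'; text: string }
  | { kind: 'tool_use'; toolId: string; toolName: string }
  | { kind: 'tool_result'; toolId: string; output: string }
  | { kind: 'complete' };

interface ToolCard {
  toolId: string;
  toolName: string;
  output: string | null; // null until the matching tool_result arrives
}

interface ChatState {
  text: string;
  tools: ToolCard[];
  done: boolean;
}

const initialState: ChatState = { text: '', tools: [], done: false };

// Each event kind maps to exactly one state transition; no ANSI parsing anywhere.
function applyEvent(state: ChatState, event: ChatEvent): ChatState {
  switch (event.kind) {
    case 'stream_delta':
      return { ...state, text: state.text + event.text };
    case 'tool_use':
      return {
        ...state,
        tools: [...state.tools, { toolId: event.toolId, toolName: event.toolName, output: null }],
      };
    case 'tool_result':
      return {
        ...state,
        tools: state.tools.map((card) =>
          card.toolId === event.toolId ? { ...card, output: event.output } : card,
        ),
      };
    case 'complete':
      return { ...state, done: true };
  }
}
```

A component tree can then render state.text as the message body and one card per entry in state.tools, choosing the card type from toolName.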

Shell pipe /shell · raw PTY tunnel

Browser xterm.js → onData → WS {type:'input'} → server pty.write() · agent CLI in PTY → pty.onData → WS {type:'output'} → xterm renders ANSI. Bidirectional raw bytes. Same WebSocket library as chat, completely different semantics.

[Diagram · shell pipe: Browser xterm.js (Shell.tsx, useShellTerminal; FitAddon, WebglAddon; onData → keystrokes) ⇄ WebSocket /shell ({type:'input',data} up, {type:'output',data} down; JSON over WS) ⇄ shell-websocket.service.ts (pty.write(data); pty.onData → ws.send) ⇄ node-pty (spawn('bash','-c',...), xterm-256color; spawn / signal) → claude / codex / etc, interactive (no -p), tty=true. ptySessionsMap keyed by ${projectPath}_${sid}: 30-min reconnect window, resume on reconnect.]

Walkthrough · shell pipe

1

xterm.js's onData handler captures every keystroke as raw bytes — including escape sequences for arrow keys, Ctrl-C (\x03), paste content, etc. xterm has already translated keyboard events into the proper byte sequences before they hit onData; the frontend just forwards.

Source · the entire input forwarding
// useShellTerminal.ts
const dataSubscription = nextTerminal.onData((data) => {
  sendSocketMessage(wsRef.current, {
    type: 'input',
    data,         // raw byte string · Ctrl-C, arrows, paste, all
  });
});
2

Server side is symmetrically thin: type: 'input' messages get written straight into the PTY master. node-pty handles the rest — it forwards the bytes to the slave end where claude (or codex, gemini, etc.) is running as if a real user were typing.

Resize events {type: 'resize', cols, rows} get a separate handler that calls shellProcess.resize(cols, rows). A ResizeObserver in the browser keeps xterm and the PTY in sync.

Source · the PTY write
// shell-websocket.service.ts
if (data.type === 'input') {
  if (shellProcess) {
    shellProcess.write(readString(data.data));
  }
  return;
}
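The input and resize handlers described above can be sketched together as one small router. PtyLike and handleShellMessage are hypothetical names; the sketch assumes only that the PTY object offers write(data) and resize(cols, rows), as node-pty processes do, and it adds validation that the excerpt above does not show.

```typescript
// Minimal surface of a PTY process used by this sketch.
interface PtyLike {
  write(data: string): void;
  resize(cols: number, rows: number): void;
}

function isPositiveInt(value: unknown): value is number {
  return typeof value === 'number' && Number.isInteger(value) && value > 0;
}

// Returns which handler ran, or 'ignored' for malformed frames or a missing PTY.
function handleShellMessage(raw: string, pty: PtyLike | null): 'input' | 'resize' | 'ignored' {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return 'ignored';
  }
  if (typeof parsed !== 'object' || parsed === null || pty === null) return 'ignored';
  const msg = parsed as { type?: unknown; data?: unknown; cols?: unknown; rows?: unknown };

  if (msg.type === 'input' && typeof msg.data === 'string') {
    pty.write(msg.data); // raw keystrokes go straight to the PTY
    return 'input';
  }
  if (msg.type === 'resize' && isPositiveInt(msg.cols) && isPositiveInt(msg.rows)) {
    pty.resize(msg.cols, msg.rows); // keep the PTY grid in sync with xterm
    return 'resize';
  }
  return 'ignored';
}
```

Keeping the router this thin is what makes the shell pipe feel like a local terminal: the server never interprets the bytes, it only moves them.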
3

The PTY itself is spawned with a bash -c wrapper that first tries to resume the session identified by sessionId and falls back to a fresh claude if the resume fails. name: 'xterm-256color' means the agent sees a real terminal capable of 256-color ANSI rendering.

Source · PTY spawn
// shell-websocket.service.ts · spawning the interactive shell
const shellProcess = pty.spawn('bash', ['-c',
  `claude --resume "${sessionId}" || claude`
], {
  name: 'xterm-256color',
  cols, rows,
  cwd: projectPath,
  env: { ...process.env, FORCE_COLOR: '1' },
});
4
server/modules/websocket/services/shell-websocket.service.ts · onData handler

The reverse direction: shellProcess.onData fires for every byte chunk the PTY emits. Server wraps it as {type: 'output', data} and sends down the WS. xterm.js receives it and writes directly to the terminal renderer — ANSI escape codes, cursor moves, colors, everything renders natively in the browser.

Two parsers, one transport. Same WebSocket lib. Same JSON envelope. The chat pipe parses kind and renders cards; the shell pipe ignores message structure beyond type and dumps data into xterm. The "renderer" is what makes them feel different.
5
shell-websocket.service.ts · ptySessionsMap :26–29

When the user closes the browser tab, the PTY doesn't die immediately. The handler sets a 30-minute setTimeout to kill it. If the user reconnects within that window (same projectPath + sessionId key), the existing PTY handle is reused and the xterm buffer replays.

This is how CloudCLI lets long-running interactive sessions survive a flaky network without losing the agent's state. It also has a memory cost: every disconnected PTY stays alive for 30 minutes by default.
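The reconnect window can be sketched as a map of live PTYs, each with a cancellable kill timer. PtySessionRegistry, Killable, and the injectable Schedule function are hypothetical; only the 30-minute default and the projectPath plus sessionId key come from the text above. Output replay is left out.

```typescript
interface Killable {
  kill(): void;
}

type Cancel = () => void;
// Injectable timer so the countdown can be driven manually in tests.
type Schedule = (fn: () => void, ms: number) => Cancel;

const defaultSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

interface SessionEntry {
  pty: Killable;
  cancel: Cancel | null; // non-null while a kill countdown is running
}

class PtySessionRegistry {
  private readonly sessions = new Map<string, SessionEntry>();
  private readonly graceMs: number;
  private readonly schedule: Schedule;

  constructor(graceMs: number = 30 * 60 * 1000, schedule: Schedule = defaultSchedule) {
    this.graceMs = graceMs;
    this.schedule = schedule;
  }

  static key(projectPath: string, sessionId: string): string {
    return `${projectPath}_${sessionId}`;
  }

  register(key: string, pty: Killable): void {
    this.sessions.set(key, { pty, cancel: null });
  }

  // Browser went away: start the countdown instead of killing immediately.
  detach(key: string): void {
    const entry = this.sessions.get(key);
    if (!entry) return;
    if (entry.cancel) entry.cancel(); // restart any countdown already running
    entry.cancel = this.schedule(() => {
      entry.pty.kill();
      this.sessions.delete(key);
    }, this.graceMs);
  }

  // Browser came back in time: cancel the countdown and hand back the live PTY.
  reattach(key: string): Killable | null {
    const entry = this.sessions.get(key);
    if (!entry) return null;
    if (entry.cancel) entry.cancel();
    entry.cancel = null;
    return entry.pty;
  }
}
```

The trade-off named above is visible in the structure: every detached entry holds a live process until its timer fires, so memory use scales with the number of abandoned tabs inside the window.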

Auth · just a Shell pipe in disguise

"Sign in to Claude" doesn't have its own handler. CloudCLI opens a Shell pipe, runs claude /login in a PTY, scrapes the OAuth URL from terminal output, and pops it to the browser as a clickable link.

1
src/components/provider-auth/view/ProviderLoginModal.tsx :27–30

Modal mounts <StandaloneShell command='claude --dangerously-skip-permissions /login' isPlainShell>. Same shell pipe, just with a fixed initial command.

2
server/modules/websocket/services/shell-websocket.service.ts :297–335 · emitAuthUrl()

The server watches the PTY output buffer for URLs matching https?://.... When detected, it sends an extra {type: 'auth_url', url} message to the browser. The browser sets authUrl state, which renders a clickable overlay.

The user opens the URL in a normal tab and completes Anthropic OAuth. The callback is captured by the Claude CLI inside the PTY, because claude /login started a temporary local server inside the container. The PTY then exits with code 0 and the modal closes.
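The URL scraping step can be sketched as a small stateful detector over PTY output chunks. AuthUrlDetector, the regular expressions, and the buffer cap are assumptions for illustration, not the repository's emitAuthUrl(). The sketch strips ANSI color and cursor sequences first and waits for a terminating character, so a URL split across two chunks is not emitted half-finished.

```typescript
// CSI escape sequences (colors, cursor moves) that may wrap or interrupt a URL.
const ANSI_CSI = /\x1b\[[0-9;?]*[A-Za-z]/g;
// A URL counts only once it is followed by whitespace or a delimiter.
const TERMINATED_URL = /https?:\/\/[^\s"'<>]+(?=[\s"'<>])/;
const MAX_BUFFER = 8192; // hypothetical cap so the buffer cannot grow forever

class AuthUrlDetector {
  private buffer = '';
  private emitted = false;

  // Feed one PTY output chunk; returns the URL the first time one is complete.
  feed(chunk: string): string | null {
    if (this.emitted) return null;
    this.buffer += chunk;
    if (this.buffer.length > MAX_BUFFER) {
      this.buffer = this.buffer.slice(-MAX_BUFFER);
    }
    // Strip on the whole buffer so escape sequences split across chunks are handled.
    const clean = this.buffer.replace(ANSI_CSI, '');
    const match = TERMINATED_URL.exec(clean);
    if (match === null) return null;
    this.emitted = true;
    return match[0];
  }
}
```

On a hit, a server in this design would send one extra {type: 'auth_url', url} frame alongside the normal output stream; the terminal bytes themselves pass through untouched.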