Inline control channel
DLE/STX framing multiplexes raw terminal bytes and structured control frames onto one WebSocket.
The most important low-level component in uterm is the control-channel framing in control_channel.py. Unlike systems that use a second WebSocket (or HTTP side-channel) for metadata, uterm multiplexes everything into one stream.
Wire format
DLE (0x10) is the escape character.
- Data: raw terminal bytes have their DLE bytes doubled (0x10 0x10) on the wire.
- Control frames: DLE STX [8-hex length] : [JSON].
That’s it. The same parser runs in Python (server) and TypeScript (browser).
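The framing rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the contents of control_channel.py: the names encode_data, encode_control, and Decoder are hypothetical, STX is taken to be the ASCII value 0x02, and the length field is assumed to be the UTF-8 byte count of the JSON body written as lowercase, zero-padded hex. Treating a DLE followed by any other byte as an error is also an assumption.

```python
import json

DLE = 0x10  # escape byte
STX = 0x02  # ASCII STX; after a DLE it opens a control frame


def encode_data(raw: bytes) -> bytes:
    """Escape terminal bytes by doubling every DLE."""
    return raw.replace(b"\x10", b"\x10\x10")


def encode_control(payload: dict) -> bytes:
    """Build DLE STX [8-hex length] : [JSON]."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Assumption: length is the body's byte count in lowercase hex.
    header = f"{len(body):08x}:".encode("ascii")
    return bytes([DLE, STX]) + header + body


class Decoder:
    """Incremental parser that tolerates arbitrary chunk boundaries."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Return ("data", bytes) and ("control", dict) events in wire order."""
        self._buf += chunk
        buf = self._buf
        events = []
        data = bytearray()
        i = 0
        while i < len(buf):
            if buf[i] != DLE:
                data.append(buf[i])
                i += 1
                continue
            if i + 1 >= len(buf):
                break  # lone DLE at the end: wait for the next chunk
            nxt = buf[i + 1]
            if nxt == DLE:
                data.append(DLE)  # escaped DLE is one literal data byte
                i += 2
            elif nxt == STX:
                body_start = i + 11  # DLE, STX, 8 hex digits, ':'
                if len(buf) < body_start:
                    break  # header incomplete
                if buf[i + 10] != ord(":"):
                    raise ValueError("malformed control frame header")
                length = int(buf[i + 2:i + 10], 16)
                if len(buf) < body_start + length:
                    break  # body incomplete
                if data:
                    events.append(("data", bytes(data)))
                    data = bytearray()
                body = buf[body_start:body_start + length]
                events.append(("control", json.loads(body)))
                i = body_start + length
            else:
                raise ValueError("unexpected byte after DLE")  # assumption
        if data:
            events.append(("data", bytes(data)))
        del self._buf[:i]  # keep only the unparsed tail for the next call
        return events
```

Because the control frame carries its own length, the JSON body needs no escaping, and the decoder can resume cleanly when a frame is split across WebSocket messages.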
Why a single stream
Multi-channel proxies eventually race: a resize control frame and the bytes after a resize arrive out of order, the snapshot drifts from the visible buffer, and presence updates flicker. Inlining the control frames makes ordering trivial — the parser sees them in the exact position the producer emitted them.
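A small illustration of the ordering argument, under the same framing rules. The helper names and the resize payload fields (type, cols, rows) are hypothetical; the point is only that a single writer appending to one byte stream fixes the position of the control frame relative to the data around it.

```python
import json


def data(raw: bytes) -> bytes:
    """Escape terminal bytes by doubling every DLE (0x10)."""
    return raw.replace(b"\x10", b"\x10\x10")


def control(payload: dict) -> bytes:
    """Frame a JSON payload as DLE STX [8-hex length] : [JSON]."""
    body = json.dumps(payload).encode("utf-8")
    return b"\x10\x02" + f"{len(body):08x}:".encode("ascii") + body


before = data(b"output drawn at 80 columns\r\n")
resize = control({"type": "resize", "cols": 120, "rows": 40})  # hypothetical schema
after = data(b"output drawn at 120 columns\r\n")

# One producer, one stream: the order of concatenation is the order on the wire.
stream = before + resize + after

# The resize frame starts exactly where the pre-resize output ends.
frame_at = stream.index(b"\x10\x02")
```

With two channels, before and after would travel on one socket and resize on another, and nothing would tie the frame to that offset.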
What rides on it
- Resize and heartbeat
- Hijack state, lease ownership, role announcements
- Presence (join/leave, adjective-animal identity, HSL color)
- Annotations placed by humans or AI agents
- Chat lines in the DeckMux channel
- Screen snapshots (when a participant joins mid-session)
Every one of those is just a length-prefixed JSON payload in a DLE STX frame, followed directly by the next data byte.
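Since every message kind shares one frame shape, the receiving side can route on the JSON alone. The sketch below assumes a "type" field and invents a few payload shapes for illustration; the actual uterm schema is not shown in this document, so treat every field name here as hypothetical.

```python
import json
from typing import Callable

Handler = Callable[[dict], str]

# Hypothetical message kinds and fields, for illustration only.
HANDLERS: dict[str, Handler] = {
    "resize": lambda m: f"resize to {m['cols']}x{m['rows']}",
    "presence": lambda m: f"{m['name']} {m['event']}",
    "chat": lambda m: f"chat: {m['text']}",
}


def dispatch(body: bytes) -> str:
    """Route the JSON body of one control frame to its handler."""
    payload = json.loads(body)
    kind = payload.get("type")
    handler = HANDLERS.get(kind)
    if handler is None:
        # Unknown kinds are skipped rather than treated as fatal (assumption).
        return f"ignored unknown control frame: {kind!r}"
    return handler(payload)
```

Adding a new kind of message then means adding a handler, not a new channel or a new framing rule.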