I run an AI agent (Hermes) that posts to social media, scrapes LinkedIn, downloads YouTube videos, and does web research. All of it needs a real browser — not a headless ephemeral one, but a persistent headful Chromium with cookies, sessions, and a display.
Here’s how it’s built, what’s running inside the container, and everything that broke.
The Stack
A single Docker container running on an 8GB VPS:
┌─────────────────────────────────────────────┐
│ Docker container (browser)                  │
│                                             │
│ Xvfb (virtual display :100)                 │
│   ↓                                         │
│ Chromium (headful, port 9222)               │
│   ↓                                         │
│ x11vnc (port 5900) ← websockify (port 6080) │
│   ↓                                         │
│ noVNC ← manual browser UI                   │
│                                             │
│ nginx (port 9223 → 9222)                    │
│ CDP proxy + WebSocket URL rewrite           │
│                                             │
│ Watchdog (anon RSS monitor, health checks)  │
│                                             │
│ tini (PID 1, reaps zombies)                 │
└─────────────────────────────────────────────┘
Two ports are exposed to the host (bound to 127.0.0.1 only):
| Port | What | Purpose |
|---|---|---|
| 9223 | nginx CDP proxy | Playwright, Puppeteer, MCP, raw websocket |
| 6080 | noVNC (via websockify) | Manual browser in a browser tab |
Tailscale serve exposes both to my tailnet:
| |
I can drive the browser from my laptop in a different country through <tailscale-hostname>:9223.
The Dockerfile
Not a black-box Browserless image — a custom build from debian:bookworm-slim:
| |
Key decisions:
- Debian slim instead of the Browserless image — smaller attack surface, no Node.js runtime we don’t need
- Chromium from Debian repos, not Google Chrome — no custom APT repo, stable version tracking
- tini as PID 1 — Chromium spawns child processes (renderers, GPU, network) and tini reaps them properly so we don’t accumulate zombies
- Non-root user (browser, UID 1000): Chromium refuses to run as root anyway, and it’s the right thing to do
- CJK + emoji fonts: without these, pages render with tofu boxes and bot detection triggers on font fingerprinting
docker-compose.yml — The Resource Tuning
This is the file that took the most iteration. On an 8GB host, you have to be precise:
| |
The numbers that matter:
| Setting | Value | Why |
|---|---|---|
| mem_limit | 6144m (6 GB) | Chrome + Xvfb + nginx + VNC stack needs headroom for 5-10 tabs |
| memswap_limit | 7168m (7 GB) | 1 GB of swap, enough to absorb a temporary spike without OOM kill |
| mem_reservation | 4096m (4 GB) | Guarantees Chrome has at least 4 GB before the kernel starts reclaiming |
| shm_size | 4g | /dev/shm is where Chrome stores shared memory between processes; the 64 MB default is a guaranteed crash |
| BROWSER_MEMORY_HIGH_WATERMARK_MB | 5500 | Watchdog threshold: above this, restart Chrome before the OOM killer does |
These numbers came from watching docker stats during real usage. A headful Chromium with 3-5 tabs, a logged-in LinkedIn session, and an active CDP connection settles around 3.5-5 GB of anonymous RSS. The 5500 MB threshold leaves about 640 MB of breathing room before the 6144 MB cap.
Chrome flags inside start.sh
| |
- remote-debugging-address=127.0.0.1: critical. CDP listens only on loopback inside the container, so outside clients must come through the nginx proxy on 9223
- max-old-space-size=1024: V8 heap limit per renderer, in MB. 512 MB was too small for heavy pages like LinkedIn
- renderer-process-limit=8: prevents Chrome from spawning a new renderer per tab forever
- disk-cache-size=500000000: caps the disk cache at about 500 MB; the cache persists across restarts since /data is a bind mount
- disable-features: strips out Media Router, sync, autofill, and other services that do network I/O in the background
The CDP Proxy: nginx.conf
Raw Chromium exposes CDP on port 9222. But the webSocketDebuggerUrl in /json/version looks like ws://127.0.0.1:9222/devtools/browser/... — useless for any client not on localhost. nginx fixes this:
| |
The sub_filter rewrites Chromium’s hardcoded WebSocket URLs to whatever host the client used to connect. This is what makes Tailscale work — a client connecting to <tailscale-hostname>:9223 gets WebSocket URLs pointing to the same hostname, not 127.0.0.1.
Without this proxy, external CDP clients get a webSocketDebuggerUrl they can’t reach, and you spend an hour debugging why Puppeteer connects but never sends commands.
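To make the rewrite concrete, here is a minimal Python sketch of the transformation that the proxy applies to the /json/version payload. In the real container nginx does this with sub_filter; the function name `rewrite_debugger_url` and the example hostname are hypothetical and only illustrate the effect.

```python
import json
from urllib.parse import urlsplit, urlunsplit


def rewrite_debugger_url(version_json: str, client_host: str) -> str:
    """Point webSocketDebuggerUrl at the host:port the client connected to."""
    info = json.loads(version_json)
    parts = urlsplit(info["webSocketDebuggerUrl"])
    # Keep the scheme and path, swap only the network location.
    info["webSocketDebuggerUrl"] = urlunsplit(parts._replace(netloc=client_host))
    return json.dumps(info)


# What Chromium reports versus what a remote client needs to see.
raw = '{"webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/browser/abc"}'
print(rewrite_debugger_url(raw, "browser.example:9223"))
```

The browser ID in the path is left untouched; only the host and port change, which is why a client on the tailnet can follow the URL it is handed.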
The Boot Sequence: start.sh
The heart of the container is 347 lines of bash that bring up the whole stack in order:
1. Xvfb :100 (virtual framebuffer, 1365x840)
2. Chromium (CDP on 9222, persistent profile)
3. x11vnc (attaches to Xvfb display, listens on 5900)
4. websockify (bridges VNC → WebSocket on 6080, serves noVNC HTML)
5. nginx (CDP proxy on 9223)
6. Watchdog loop (every 60 seconds)
Each step waits for the previous one to be healthy before continuing. Chromium won’t launch until Xvfb is confirmed running (xdpyinfo). The VNC stack won’t start until Chromium is responding on /json/version. The watchdog won’t begin until every component is up.
The boot takes about 15-20 seconds from docker compose up to “Manual UI: http://127.0.0.1:6080/vnc.html”.
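The "wait for the previous step to be healthy" pattern can be sketched as follows. The actual start.sh is bash; this is a Python illustration, and `wait_until`, `boot`, and the step tuples are hypothetical names rather than anything taken from the script.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout_s: float = 30.0,
               interval_s: float = 0.5) -> bool:
    """Poll check() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False


def boot(steps: list[tuple[str, Callable[[], None], Callable[[], bool]]],
         timeout_s: float = 30.0) -> None:
    """Start each component in order; abort if one never becomes healthy."""
    for name, start, healthy in steps:
        start()
        if not wait_until(healthy, timeout_s=timeout_s):
            raise RuntimeError(f"{name} did not become healthy")
```

Each step would pair a launcher with its probe, for example starting Xvfb and then checking that xdpyinfo succeeds before Chromium is launched.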
The Watchdog
The watchdog loop runs every 60 seconds and checks four things:
- Are all processes alive? Xvfb, x11vnc, websockify, nginx: if any died, exit the container so Docker restarts it
- Is Chromium running? If pgrep chromium returns nothing, launch a fresh instance
- Is CDP healthy? A curl to http://127.0.0.1:9223/json/version catches a dead or hung HTTP endpoint, though not the “WebSocket silently broken” state where HTTP works but WS doesn’t
- Is memory over threshold? If anonymous RSS exceeds BROWSER_MEMORY_HIGH_WATERMARK_MB, restart Chromium
The memory check is the most interesting part:
| |
Why anon and not memory.current? The cgroup memory.current counter includes the page cache: file-backed pages that the kernel can reclaim under memory pressure. If you track memory.current, legitimate file I/O (like Chromium writing to its disk cache) triggers false watchdog kills. Anonymous pages can’t be dropped that way; at best the kernel pushes them to the 1 GB of swap, so when they grow, it’s real memory pressure. That’s what ends in an OOM kill, so it’s what the watchdog should track.
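A minimal sketch of the anon check in Python, assuming a cgroup v2 layout where the container sees its own counters at /sys/fs/cgroup/memory.stat (that path, and the helper names, are assumptions; the real watchdog is bash):

```python
from pathlib import Path

# Assumed cgroup v2 location; cgroup v1 uses a different layout.
MEMORY_STAT = Path("/sys/fs/cgroup/memory.stat")


def anon_mb(stat_text: str) -> float:
    """Return the 'anon' counter from memory.stat contents, in MiB."""
    for line in stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "anon":  # exact match, so 'anon_thp' is ignored
            return int(value) / (1024 * 1024)
    raise ValueError("no 'anon' line in memory.stat")


def over_watermark(stat_text: str, watermark_mb: int = 5500) -> bool:
    """True when anonymous memory exceeds the watchdog threshold."""
    return anon_mb(stat_text) > watermark_mb
```

Inside the container this would be called as `over_watermark(MEMORY_STAT.read_text())`. Taking the file contents as a string keeps the parsing testable outside a cgroup.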
When RAM exceeds the threshold, the watchdog:
- Sends SIGTERM to all Chrome processes
- Waits up to 10 seconds for graceful shutdown
- Sends SIGKILL if processes are still alive
- Cleans profile lock files (SingletonLock, SingletonCookie, SingletonSocket)
- Launches a fresh Chromium with the same profile
The profile survives the restart because /data/profile is a bind mount on the host. All cookies, sessions, and local storage persist.
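The restart steps can be sketched in Python as below. This is an illustration under assumptions: it stops a single process handle, whereas the watchdog signals every Chrome process, and `stop_gracefully` and `clean_profile_locks` are hypothetical names.

```python
import subprocess
from pathlib import Path

LOCK_FILES = ("SingletonLock", "SingletonCookie", "SingletonSocket")


def stop_gracefully(proc: subprocess.Popen, grace_s: float = 10.0) -> None:
    """SIGTERM first, SIGKILL only if the process outlives the grace period."""
    proc.terminate()  # SIGTERM
    try:
        proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL
        proc.wait()


def clean_profile_locks(profile_dir: Path) -> list[str]:
    """Remove stale Singleton* files so a new Chromium can claim the profile."""
    removed = []
    for name in LOCK_FILES:
        path = profile_dir / name
        # A lock may be a dangling symlink, which exists() reports as missing.
        if path.is_symlink() or path.exists():
            path.unlink()
            removed.append(name)
    return removed
```

Cleaning the lock files matters after a SIGKILL in particular, because a killed Chromium has no chance to release the profile itself.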
noVNC — The Manual UI
The manual access chain:
Browser tab (your laptop)
│ http://<tailscale-hostname>:6080/vnc.html
│
▼
websockify (WebSocket → raw TCP bridge)
│ port 6080 → localhost:5900
│
▼
x11vnc (VNC server attached to Xvfb display :100)
│
▼
Xvfb :100 (headless X server, 1365x840, 24-bit color)
│
▼
Chromium (renders into Xvfb framebuffer)
noVNC is served by websockify itself (--web /usr/share/novnc), so there’s no separate HTTP server. The entire UI is a single HTML page that connects back over WebSocket.
This is how I log into LinkedIn, X, Instagram, and Google — open the noVNC page, click through the auth flow, and close the tab. The session stays alive in the persistent profile until the watchdog restarts Chrome (which only happens under memory pressure or CDP failure).
How the AI Agent Drives It
Hermes connects through CDP for automation:
| |
The agent uses different tools depending on the task:
| Agent Tool | CDP Method | Use Case |
|---|---|---|
| browser_navigate | Page.navigate | Open a URL |
| browser_snapshot | Accessibility.getFullAXTree | Read page content as text |
| browser_click | Runtime.evaluate → element.click() | Click buttons |
| browser_type | Input.dispatchKeyEvent | Type into fields |
| browser_console | Runtime.evaluate | Run arbitrary JS, extract data |
| browser_cdp | Any raw CDP method | Escape hatch for anything else |
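Under the hood, each of those tools comes down to a JSON message on the CDP WebSocket. A minimal sketch of the wire format, assuming nothing about how Hermes itself builds them (`cdp_command` is a hypothetical helper, and the URL is a placeholder):

```python
import itertools
import json

_ids = itertools.count(1)


def cdp_command(method: str, **params) -> str:
    """Serialize one CDP request: {"id": n, "method": ..., "params": {...}}."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


navigate = cdp_command("Page.navigate", url="https://example.com")
snapshot = cdp_command("Accessibility.getFullAXTree")
run_js = cdp_command("Runtime.evaluate", expression="document.title",
                     returnByValue=True)
```

The reply carries the same id, which is how a client matches responses to requests while unrelated events arrive on the same socket. Page-level methods like these go to a page target rather than the browser-level endpoint.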
For social posting, the agent runs cron jobs that fire at scheduled intervals, connect to the browser over CDP, compose and publish posts, and report back.
What Broke (And How We Fixed It)
Chrome 147 broke all cookie injection
Network.setCookie, Network.setCookies, Storage.setCookies — all return success but don’t set cookies. Chrome 147 tightened the cookie security model, and CDP cookie methods are now effectively useless for session injection.
Fix: Stop injecting cookies. The container keeps a persistent authenticated profile. I log in once through noVNC, and the agent attaches to the same running browser instance. This is actually more reliable — sites detect session discontinuity and flag it as bot behavior.
CDP WebSocket silently dies after days of uptime
/json/version responds fine over HTTP, but all WebSocket protocol messages time out. Playwright, Puppeteer, raw Python — all break identically. The WebSocket upgrade returns HTTP 101, but no CDP frames are ever dispatched.
Fix: The watchdog’s /json/version health check catches HTTP failures but not this. The real fix is docker restart browser. Happens roughly once every 4-7 days.
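Because this failure passes an HTTP check, a probe has to exercise the WebSocket itself. Below is a minimal sketch of such a round-trip, not the watchdog's current code. The `send` and `recv` callables are hypothetical wrappers around whatever WebSocket client is in use, connected to a page target since Runtime is a page-level domain; `recv` takes a timeout in seconds and returns None when nothing arrives.

```python
import json
import time
from typing import Callable, Optional


def cdp_roundtrip_ok(send: Callable[[str], None],
                     recv: Callable[[float], Optional[str]],
                     timeout_s: float = 5.0) -> bool:
    """Send Runtime.evaluate("1+1") and require the matching reply in time."""
    probe_id = 424242  # arbitrary id, unlikely to collide with other traffic
    send(json.dumps({"id": probe_id, "method": "Runtime.evaluate",
                     "params": {"expression": "1+1", "returnByValue": True}}))
    deadline = time.monotonic() + timeout_s
    while (remaining := deadline - time.monotonic()) > 0:
        raw = recv(remaining)  # None means nothing arrived in time
        if raw is None:
            break
        msg = json.loads(raw)
        if msg.get("id") != probe_id:  # skip unrelated events
            continue
        return msg.get("result", {}).get("result", {}).get("value") == 2
    return False
```

A False result here, while /json/version still answers, is the signature of the silent failure and could trigger the restart automatically.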
Every platform uploads files differently
- X/Twitter: Native HTMLInputElement.prototype.files setter fails for images (works for video). Fallback: simulate a DragEvent with DataTransfer onto the compose textbox. Images must be base64 ≤ 650KB, compressed through Pillow at quality 65.
- Instagram: React inputs reject innerHTML and execCommand('insertText'). Only keyboard.type(text, {delay: 50}) works, character by character with a 50ms delay.
- TikTok: The native prototype setter works perfectly. files.length is correct. The most automation-friendly of all platforms.
- Threads: Same React issue as Instagram. Drag-and-drop on the compose textbox is the only reliable path.
nginx frame size cap breaks large uploads
nginx caps WebSocket frames at roughly 186KB. Uploading a video means splitting the base64 into chunks and sending multiple Runtime.evaluate frames. The Python implementation must use sock.sendall() — sock.send() can return before the full buffer is transmitted, and partial WebSocket frames are garbage.
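A minimal sketch of the chunking side, assuming a 128 KiB chunk size (my choice to stay under the limit observed above, not a value from the original implementation). The helper names are hypothetical, and WebSocket framing itself is left out.

```python
import base64
import socket

# Assumed chunk size: comfortably under the ~186KB limit observed above.
# It is a multiple of 4, so each chunk is also valid base64 on its own.
MAX_CHUNK = 128 * 1024


def base64_chunks(payload: bytes, max_chunk: int = MAX_CHUNK) -> list[str]:
    """Base64-encode payload and split the text into pieces of <= max_chunk."""
    encoded = base64.b64encode(payload).decode("ascii")
    return [encoded[i:i + max_chunk] for i in range(0, len(encoded), max_chunk)]


def send_all(sock: socket.socket, frame: bytes) -> None:
    """sendall() loops until every byte is written; send() may stop early."""
    sock.sendall(frame)
```

Each chunk would travel in its own Runtime.evaluate call that appends to a buffer in the page, with the file reassembled once the last chunk lands.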
Tailscale serve syntax changed between versions
Tailscale 1.96.x uses --tcp flag syntax:
| |
The Full File Layout
Everything lives in ~/browser/:
~/browser/
  Dockerfile            # Build from debian:bookworm-slim
  docker-compose.yml    # Resource limits, port bindings, volumes
  nginx.conf            # CDP proxy with WebSocket URL rewrite
  start.sh              # Boot sequence + watchdog
  healthcheck.sh        # Curls CDP + noVNC, checks both respond
  data/                 # Bind-mounted persistent volume
    profile/            # Chromium user profile (cookies, sessions, cache)
    browser-logs/       # stdout/stderr from all processes
Rebuild and start:
| |
The container binds /data to ./data on the host. That means:
- ~/browser/data/profile/: Chromium profile survives container rebuilds
- ~/browser/data/browser-logs/: full logs from Chrome, Xvfb, nginx, VNC
What’s Next
The rig is stable. It posts, scrapes, downloads, and monitors. But there’s more to do:
- Multi-profile: Separate Chrome profiles per platform so a rate limit or shadow-ban on one doesn’t take down the others
- CDP WebSocket health probe: The current /json/version check doesn’t catch the silent WebSocket failure. Needs an actual Runtime.evaluate round-trip
- Residential proxy rotation: Some platforms geofence or rate-limit by IP
- CAPTCHA automation: Currently solved manually through noVNC
For now, it works — and that’s enough.
This browser runs in Docker on an 8GB VPS, connected to Hermes (my AI agent) via Chrome DevTools Protocol. Everything described here is running in production as of May 2026. If you’re building something similar, I’ve probably hit whatever bug you’re currently debugging.