I used to open LinkedIn every morning and do the same thing: search “frontend developer” in Tangerang, then Jakarta, then “Indonesia remote,” then Singapore, then Australia. Scroll. Filter. Open tabs. Close tabs. Try to remember which ones I already looked at yesterday.
I don’t do that anymore. My AI agent does it — every single morning — and drops a formatted report into a Discord channel before I’ve even finished my coffee.
Here’s everything that went into building it.
The System at a Glance
Five cron jobs, staggered across the morning, each targeting a different job market:
| Time (WIB) | Job | Market | Type |
|---|---|---|---|
| 09:00 | LinkedIn Frontend Jobs — Indonesia | Tangerang, Jakarta Utara, Jakarta (3 locations) | Onsite + PIK |
| 10:00 | LinkedIn Frontend Jobs — Indonesia Remote | All Indonesia | Remote |
| 11:00 | LinkedIn Frontend Jobs — Singapore | Singapore | Remote |
| 12:00 | LinkedIn Frontend Jobs — Australia | Australia | Remote |
| 12:00 | LinkedIn Frontend Jobs — Australia (Lever/GH/Workable) | 50+ Aussie tech companies via direct API | Remote |
Each one is a self-contained Hermes cron job. The scheduler spins up a fresh agent session, loads the browser container, and drops the agent into a detailed prompt that tells it exactly what to do.
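The timetable above maps onto ordinary five-field cron expressions. Here is a minimal sketch, assuming the scheduler evaluates expressions in WIB. The `schedule` shape and the job keys are illustrative and are not Hermes' actual config format.

```javascript
// Hypothetical schedule table; keys and shape are illustrative only.
// Cron fields: minute hour day-of-month month day-of-week.
const schedule = [
  { job: "indonesia-onsite", cron: "0 9 * * *" },
  { job: "indonesia-remote", cron: "0 10 * * *" },
  { job: "singapore-remote", cron: "0 11 * * *" },
  { job: "australia-remote", cron: "0 12 * * *" },
  { job: "australia-ats-api", cron: "0 12 * * *" },
];

// Parse a daily "M H * * *" expression into its firing time.
function dailyTime(cron) {
  const [minute, hour, ...rest] = cron.trim().split(/\s+/);
  if (rest.length !== 3 || rest.some((field) => field !== "*")) {
    throw new Error(`not a daily expression: ${cron}`);
  }
  return { hour: Number(hour), minute: Number(minute) };
}
```

Keeping the schedule as plain data makes it easy to spot collisions, such as the two Australia jobs that both fire at 12:00.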
The Stack
| Component | Choice | Why |
|---|---|---|
| Agent framework | Hermes (my own AI agent) | Built-in cron scheduler, Discord delivery, tool orchestration |
| Scheduler | Hermes cron | Cron expressions, automatic delivery, error tracking |
| Model | deepseek-v4-flash | Cheap, fast, good enough for structured web scraping |
| Browser engine | Chromium 147 (Docker) | Persistent login session, noVNC + CDP |
| Browser access | Chrome DevTools Protocol (CDP) via nginx proxy | Programmatic page navigation, DOM snapshots, scroll |
| Delivery | Discord channel | I check it on mobile, and reports thread neatly |
The browser container is a custom build — not Browserless, not Playwright’s ephemeral browser. It’s a persistent headful Chromium running inside Docker with a logged-in LinkedIn profile. The same container powers all my social media automation (X, Instagram, Threads) — LinkedIn is just another tab.
Docker container (browser)
├── Xvfb (virtual display :100)
├── Chromium 147 (headful, port 9222)
│ └── Logged into LinkedIn (ekoprstyofficial@...)
├── nginx (port 9223 → 9222, CDP proxy + WebSocket URL rewrite)
├── noVNC (port 6080, manual debugging)
└── tini (PID 1)
The agent talks to Chromium through browser_cdp — raw CDP commands sent to http://127.0.0.1:9223. No Playwright. No Puppeteer. Direct protocol access.
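To make "direct protocol access" concrete, here is a rough sketch of the discovery step and the message format. It assumes the nginx proxy passes through Chromium's standard DevTools HTTP endpoints (such as `/json/list`). The helper names are mine, and this is not how `browser_cdp` is implemented internally.

```javascript
// Assumption: the proxy exposes Chromium's standard DevTools HTTP endpoints.
const CDP_BASE = "http://127.0.0.1:9223";

// Pick the first regular tab from a /json/list response.
function pickPageTarget(targets) {
  return targets.find((t) => t.type === "page") ?? null;
}

// Serialize one CDP command; the id lets you match responses to requests.
function cdpCommand(id, method, params = {}) {
  return JSON.stringify({ id, method, params });
}

// Fetch the open targets and return the WebSocket URL of the first page tab.
// Uses the global fetch available in Node 18+.
async function findPageSocket(base = CDP_BASE) {
  const res = await fetch(`${base}/json/list`);
  if (!res.ok) throw new Error(`CDP endpoint returned HTTP ${res.status}`);
  const target = pickPageTarget(await res.json());
  if (!target) throw new Error("no page target open");
  return target.webSocketDebuggerUrl;
}
```

Sending `cdpCommand(1, "Page.navigate", { url })` over that WebSocket navigates the tab. Everything else in the crawl is built from commands of the same shape.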
How the Crawl Works
Each cron job follows the exact same 7-step workflow for every search query:
Step 1: Navigate to LinkedIn search with filters
The URL encodes three critical filters:
- f_WT=2 → Remote only (omitted for onsite searches)
- f_TPR=r604800 → Past 7 days (604,800 seconds)
- sortBy=DD → Most recent first
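The three filters can be assembled into a search URL along these lines. This is a sketch: the `/jobs/search/` path and the `keywords` parameter name are my assumptions, while `location`, `f_WT`, `f_TPR`, and `sortBy` are the parameters mentioned in this post.

```javascript
// Assumptions: the /jobs/search/ path and the "keywords" parameter name.
function buildSearchUrl({ keywords, location, remote = false }) {
  const url = new URL("https://www.linkedin.com/jobs/search/");
  url.searchParams.set("keywords", keywords);
  url.searchParams.set("location", location);
  if (remote) url.searchParams.set("f_WT", "2"); // remote only
  url.searchParams.set("f_TPR", "r604800");      // past 7 days, in seconds
  url.searchParams.set("sortBy", "DD");          // most recent first
  return url.toString();
}
```

Calling `buildSearchUrl({ keywords: "frontend developer", location: "Singapore", remote: true })` yields a URL carrying all three filters, and leaving out `remote` drops `f_WT` for onsite searches.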
Step 2: Wait for the page to render
A simple sleep 4 between navigation and first snapshot. LinkedIn’s search results are JavaScript-heavy — hitting the DOM too early gets you a spinner.
Step 3: Extract job cards from DOM
The agent takes a browser_snapshot(full=true) — a full accessibility tree dump of the page. Each job card contains:
- Title (h3.base-search-card__title)
- Company (h4.base-search-card__subtitle)
- Location (span.job-search-card__location)
- Posting age (time element with datetime attribute)
- Job URL (extracted from anchor href)
- Job ID (extracted from urn:li:jobPosting:NNNNNNNNNN)
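Once the raw fields are out of the snapshot, normalizing them is simple. A small sketch follows; the card object shape and the `toJobCard` helper are hypothetical, and only the URN format comes from the list above.

```javascript
// Pull the numeric job ID out of a urn:li:jobPosting:NNN string.
function extractJobId(text) {
  const match = /urn:li:jobPosting:(\d+)/.exec(text);
  return match ? match[1] : null;
}

// Hypothetical normalized card used by the later steps.
function toJobCard({ title, company, location, postedAt, url, urn }) {
  return {
    id: extractJobId(urn),
    title: title.trim(),
    company: company.trim(),
    location: location.trim(),
    postedAt, // value of the <time datetime="..."> attribute
    url,
  };
}
```

The ID is kept as a string so long job IDs never run into number-precision limits.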
Step 4: Scroll for more results
LinkedIn’s virtualized DOM only renders about 7 jobs at a time. The agent scrolls 3-4 times after each snapshot to reveal more listings. This is the biggest limitation — LinkedIn might report “227 results” but the DOM only holds a fraction at any moment.
Step 5: Click into promising listings
For the top 3-5 most relevant jobs, the agent navigates to the individual job posting page to verify:
- Is it truly remote? (The job description must explicitly contain “remote”, “work from home”, or “WFH”)
- Is it actually frontend? (React, Vue, Angular, TypeScript, JavaScript — not “Data Engineer” masquerading in search results)
- Are there salary ranges? (LinkedIn sometimes shows these on the detail page)
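The remote check in the list above boils down to a keyword scan over the description text. A minimal sketch with my own pattern list follows. In the real system the agent reads the page and decides, so treat this as an approximation of the rule and not the actual mechanism.

```javascript
// Phrases that count as an explicit remote statement.
const REMOTE_PATTERNS = [/\bremote(ly)?\b/i, /\bwork from home\b/i, /\bWFH\b/i];

// True when the description explicitly mentions remote work.
function mentionsRemote(description) {
  return REMOTE_PATTERNS.some((pattern) => pattern.test(description));
}
```

A plain keyword match cannot tell "fully remote" from "not a remote role", which is one reason the final call stays with the agent reading the full description.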
Step 6: Deduplicate
Same job appearing across multiple search queries? Tracked by LinkedIn job ID. Included once.
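Deduplication by job ID is a one-pass merge. A sketch, assuming each query produces an array of card objects with an `id` field (a hypothetical shape):

```javascript
// Keep the first occurrence of each job ID across all query results.
function dedupeById(resultSets) {
  const seen = new Map();
  for (const jobs of resultSets) {
    for (const job of jobs) {
      if (!seen.has(job.id)) seen.set(job.id, job);
    }
  }
  return [...seen.values()];
}
```

A `Map` preserves insertion order, so jobs stay in the order they were first encountered.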
Step 7: Repeat for all queries
Each cron job runs 4-7 search queries with different keyword variations and locations. Total crawl time per job: 5-15 minutes depending on LinkedIn’s response speed.
The PIK Problem
Pantai Indah Kapuk (PIK) — the area in North Jakarta where I’m based — does not exist as a LinkedIn location.
Search for location=Pantai%20Indah%20Kapuk and LinkedIn silently falls back to a broad Jakarta search with zero indication that your location was rejected. I learned this the hard way after two days of getting “0 results” and assuming there were genuinely no jobs.
The fix is a two-layer approach:
Layer 1: Search with valid administrative locations. PIK sits in Kecamatan Penjaringan, Jakarta Utara, and its PIK 2 extension reaches into Tangerang. LinkedIn recognizes both Tangerang and Jakarta Utara as locations, so the agent searches with:
Tangerang → 2 keyword passes
Jakarta Utara → 2 keyword passes
Jakarta → 2 keyword passes (wide net)
Layer 2: Post-filter in description text. After getting results, the agent clicks into individual job listings and scans the description HTML for PIK-specific keywords:
"PIK", "Pantai Indah Kapuk", "PIK 2", "Pantjoran PIK", "14470"
PIK matches get a special 🏙️ PIK badge in the report so I can spot them instantly.
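The Layer 2 post-filter can be approximated with a few patterns. A sketch assuming the description has already been reduced to plain text; the function and constant names are mine.

```javascript
const PIK_BADGE = "🏙️ PIK";

// "PIK" is matched case-sensitively as a whole word, which also covers
// "PIK 2" and "Pantjoran PIK"; 14470 is the postal code from the keyword list.
const PIK_PATTERNS = [/\bPIK\b/, /pantai indah kapuk/i, /\b14470\b/];

// Return the badge for descriptions that mention PIK, else an empty string.
function pikBadge(descriptionText) {
  return PIK_PATTERNS.some((pattern) => pattern.test(descriptionText)) ? PIK_BADGE : "";
}
```

Matching "PIK" case-sensitively with word boundaries avoids false hits on ordinary words such as "picking".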
This same pattern applies to any informal area name — always verify it against LinkedIn’s location autocomplete first, and search with the closest valid administrative location.
The Filtering Intelligence
Raw LinkedIn search results are noisy. A search for “frontend developer” in Jakarta returns Data Engineers, SREs, AI Red Team Testers, and “Event & Partnerships Coordinators.” The agent needs to know what to keep and what to discard.
Include (passes filter)
- Frontend, front-end, front end
- React, Vue, Angular, Svelte, Next.js, Nuxt
- JavaScript, TypeScript
- Web Developer, UI Engineer
Exclude (dropped immediately)
- Pure backend (Backend Engineer, Data Engineer, SRE, DevOps)
- Non-dev roles (Product Manager, Sales, Marketing, SEO, Event Coordinator)
- Senior Director / VP level
- Internships
- Roles requiring partial onsite presence (for remote-only jobs)
The agent verifies exclusion by: (1) checking the job title against keyword patterns, (2) reading the description snippet in the listing, and (3) for ambiguous cases, clicking into the detail page.
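The first verification step, matching the title against keyword patterns, might look like the sketch below. The pattern lists mirror the include and exclude lists above, but the three-way result and the evaluation order are my own assumptions.

```javascript
// Dropped regardless of other keywords: non-dev roles, director/VP, internships.
const DROP_ALWAYS = [
  /\b(product manager|sales|marketing|seo|coordinator)\b/i,
  /\b(director|vp)\b/i,
  /\bintern(ship)?\b/i,
];
const INCLUDE = [
  /front[\s-]?end/i,
  /\b(react|vue|angular|svelte|next\.js|nuxt)\b/i,
  /\b(javascript|typescript)\b/i,
  /\b(web developer|ui engineer)\b/i,
];
// Treated as pure backend only when no frontend keyword matched first.
const BACKEND_ONLY = [/\bback[\s-]?end\b/i, /\bdata engineer\b/i, /\b(sre|devops)\b/i];

const matches = (patterns, text) => patterns.some((p) => p.test(text));

// Returns "include", "exclude", or "ambiguous" (needs the description or detail page).
function classifyTitle(title) {
  if (matches(DROP_ALWAYS, title)) return "exclude";
  if (matches(INCLUDE, title)) return "include";
  if (matches(BACKEND_ONLY, title)) return "exclude";
  return "ambiguous";
}
```

Titles such as "Full Stack Developer" land in "ambiguous", which is where the description snippet and the detail-page visit come in.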
The Four Markets
🇮🇩 Indonesia Onsite (Tangerang, Jakarta Utara, Jakarta)
6 search queries across 3 locations with 2 keyword passes each. The hardest market to scrape because LinkedIn’s broad keyword search is notoriously bad at distinguishing job categories in the Jakarta tech market. Multiple queries with different phrasings (frontend developer, react OR vue OR angular, "frontend", front-end developer) are necessary to catch everything.
🇮🇩 Indonesia Remote
Remote frontend jobs open to Indonesian residents. Small but growing market — typically 2-5 new postings per week. Companies hiring: Deel, micro1, Crossing Hurdles, Affinity Labs. Most are global companies hiring APAC-remote, not Indonesian companies.
🇸🇬 Singapore Remote
Higher volume, higher salary — Singapore tech companies regularly hire remote across APAC. The f_WT=2 remote filter works well here because Singapore’s location taxonomy is clean.
🇦🇺 Australia Remote
Two-pronged approach:
- LinkedIn crawl: same browser CDP method, searching for remote + Australia
- Direct API crawl: hits 50+ Australian tech company job boards via their ATS APIs (Lever, Greenhouse, Workable) using Node.js fetch(). This catches jobs that never appear on LinkedIn.
The API crawl is the hidden gem. Company career pages are the most up-to-date source, and postings often hit the ATS days before they appear on aggregators. The Node.js script queries all companies in parallel.
For Greenhouse boards, ?content=true is appended to the API URL to get full job descriptions, enabling keyword scanning for visa/sponsorship language without fetching each job individually.
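A parallel ATS crawl could be sketched as follows. This is not the original script: the board list shape and the slugs are hypothetical, Workable is left out for brevity, and the endpoints shown are the public Greenhouse job board and Lever postings URLs.

```javascript
// Build the public job-board API URL for one company.
// `slug` is the company's board token; the values used here are placeholders.
function boardUrl({ ats, slug }) {
  switch (ats) {
    case "greenhouse":
      // content=true returns full descriptions in the same response
      return `https://boards-api.greenhouse.io/v1/boards/${slug}/jobs?content=true`;
    case "lever":
      return `https://api.lever.co/v0/postings/${slug}?mode=json`;
    default:
      throw new Error(`unsupported ATS: ${ats}`);
  }
}

// Fetch every board in parallel; a failing board does not sink the others.
// Uses the global fetch available in Node 18+.
async function crawlBoards(boards) {
  const settled = await Promise.allSettled(
    boards.map(async (board) => {
      const res = await fetch(boardUrl(board));
      if (!res.ok) throw new Error(`${board.slug}: HTTP ${res.status}`);
      const body = await res.json();
      // Greenhouse wraps postings in { jobs: [...] }; Lever returns a bare array.
      const jobs = board.ats === "greenhouse" ? body.jobs : body;
      return { slug: board.slug, jobs };
    })
  );
  return settled.filter((r) => r.status === "fulfilled").map((r) => r.value);
}
```

`Promise.allSettled` fits here because one company renaming its board should cost a single entry in the report and not the whole crawl.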
Technical Challenges
LinkedIn’s Virtualized DOM
LinkedIn uses virtual scrolling — the DOM only contains ~7 job cards at a time, even when the sidebar shows “227 results.” The agent has to scroll, snapshot, scroll, snapshot in a loop. Each scroll triggers a re-render that might load different cards than expected. This is the single biggest limitation — jobs that exist in the search results but aren’t in the current DOM viewport are invisible.
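The scroll-and-snapshot loop can be written so it stops as soon as a scroll reveals nothing new. In this sketch `snapshotIds` and `scroll` are hypothetical callbacks standing in for the agent's snapshot and scroll tools.

```javascript
// snapshotIds: async () => string[] of job IDs currently in the DOM (hypothetical).
// scroll: async () => void, scrolls the results list once (hypothetical).
async function collectJobIds(snapshotIds, scroll, maxScrolls = 4) {
  const seen = new Set();
  for (let round = 0; round <= maxScrolls; round++) {
    const before = seen.size;
    for (const id of await snapshotIds()) seen.add(id);
    // After the first snapshot, stop once a scroll reveals no new cards.
    if (round > 0 && seen.size === before) break;
    if (round < maxScrolls) await scroll();
  }
  return [...seen];
}
```

Accumulating IDs in a `Set` across snapshots makes it harmless when a re-render shows cards that were already captured.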
Tool Call Budget
Each cron job has a maximum tool call budget (capped by Hermes’ iteration limit). A single crawl with 4 search queries, each requiring 2-3 navigations, 2-3 snapshots, 3-4 scrolls, plus clicking into individual job detail pages can easily hit 40+ tool calls. The agent has to prioritize: wide net on search queries, deep inspection only on the most promising listings.
Some crawls time out mid-search. The agent is designed to produce a report from whatever data it has collected — partial crawl is better than no crawl.
Cookie Injection Doesn’t Work
Chrome 147 (the latest stable at time of writing) has tightened cookie security. In my setup, every CDP cookie method I tried (Network.setCookie, Network.setCookies, Storage.setCookies) was blocked or returned errors in headful mode, so I could not inject a LinkedIn session cookie programmatically.
The solution: maintain a persistent Chrome profile that stays logged in. The Docker container’s /data/profile/Default/Cookies file is mounted as a volume and survives container restarts. Log in once manually (via noVNC at :6080) and the session persists until LinkedIn expires it, at which point you log in again through the noVNC UI.
Rate Limiting
LinkedIn doesn’t explicitly block the crawls, probably because the agent uses a real browser with human-like delays (3-5 seconds between actions). But the volume is noticeable: 5 cron jobs × ~4 searches × ~5 navigations each ≈ 100 page loads every morning, most of them on LinkedIn. So far, no bans. The key is the delays: an instant crawl would likely trip rate limits immediately.
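A randomized pause in the 3-5 second range is easy to express. This is a sketch with hypothetical helper names; in the real setup the delays come from the agent's prompt and tool latency, not from this code.

```javascript
// Pick a delay between minMs and maxMs; `random` is injectable for testing.
function humanDelayMs(minMs = 3000, maxMs = 5000, random = Math.random) {
  return Math.round(minMs + random() * (maxMs - minMs));
}

// Promise-based sleep for use between browser actions.
function pause(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

Between actions this becomes `await pause(humanDelayMs())`. Varying the delay avoids the fixed cadence that a constant sleep would produce.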
What the Reports Look Like
Here’s a real output from the Indonesia Remote crawl on June 4th:
## 🇮🇩 INDONESIA REMOTE FRONTEND JOBS REPORT
**Completed:** Thu Jun 4, 2026 · 09:14 WIB
### 🟢 CONFIRMED REMOTE · NEW (< 24h)
| # | Title | Company | Salary | Age | Type |
|---|-------|---------|--------|-----|------|
| 1 | Front-end Developer (Typescript) | micro1 | — | ~19h | Contract |
| 2 | Full Stack Developer | Deel | — | ~12h | Full-time |
| 3 | Full-Stack Software Engineer | Crossing Hurdles | $30-90/hr | ~17h | Contract |
### Excluded
| Title | Company | Reason |
|-------|---------|--------|
| Data Engineer Staff | Cetta | Data, not frontend |
| Senior Backend Engineer | BTSE | Backend only |
| Swift Engineer | Crossing Hurdles | iOS, not frontend |
The badges tell me at a glance: 🟢 is new today, 🟡 is from this week. The exclusion table is just as useful as the inclusion list — it proves the agent looked at everything and made deliberate decisions, not omissions.
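The badge rule is a simple threshold on posting age. In this sketch the 24-hour cutoff comes from the "NEW (< 24h)" heading in the report, while the 7-day cutoff is my reading of "this week".

```javascript
const NEW_BADGE = "🟢";
const WEEK_BADGE = "🟡";

// Map posting age in hours to a report badge; older postings get none.
function ageBadge(ageHours) {
  if (ageHours < 24) return NEW_BADGE;
  if (ageHours < 24 * 7) return WEEK_BADGE;
  return "";
}
```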
The Real Value
1. First-Mover Advantage
Jobs posted 2 hours ago vs 2 days ago have dramatically different applicant pools. Being one of the first 10 applicants matters. The agent runs at 9 AM — most recruiters post jobs during business hours the previous day, meaning I see them before the morning refresh crowd.
2. Market Visibility
I now know exactly how many remote frontend jobs are available in Indonesia, Singapore, and Australia at any given moment — and how that changes week to week. Last week: 3 remote frontend roles across all Indonesia. The week before: 7. This is market intelligence I wouldn’t have otherwise.
3. Zero Cognitive Load
I don’t think about job hunting anymore. It happens automatically while I build products, write code, and work on my own business. The reports are there when I want them — I can scroll back through Discord history to see every single crawl for the past month.
4. The Exclusions Matter
Seeing what’s NOT available is valuable. The “Data Engineer, Data Engineer, Backend Engineer, Backend Engineer, Platform Engineer, Backend Engineer” pattern across multiple weeks confirms that Indonesia’s tech job market skews backend-heavy. Relocating (or targeting remote-only) becomes the obvious strategic move, and I have the data to prove it.
Cost
| Item | Cost |
|---|---|
| Model (deepseek-v4-flash, 5 crawls/day) | ~$0.05-0.15/day |
| VPS (browser container memory overhead) | Already running for other automation |
| LinkedIn account | Free |
| My time | Zero |
The browser container uses about 800MB RAM with Chromium running. It’s already running 24/7 for social media automation — the LinkedIn crawls are an incremental load on an existing resource.
Token usage per crawl: roughly 5,000-15,000 input tokens (the prompt + snapshot data) and 500-2,000 output tokens (the formatted report). At deepseek-v4-flash pricing, that’s fractions of a cent per crawl.
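The per-crawl arithmetic is a weighted sum of input and output tokens. The sketch below takes prices as arguments because I am not quoting the model's actual rates here, and the numbers in the usage note are placeholders.

```javascript
// Cost of one crawl in USD, given token counts and per-million-token prices.
function crawlCostUsd({ inputTokens, outputTokens }, { inputPerMillion, outputPerMillion }) {
  return (inputTokens * inputPerMillion + outputTokens * outputPerMillion) / 1_000_000;
}

// Daily total for a list of crawls priced at the same rates.
function dailyCostUsd(crawls, prices) {
  return crawls.reduce((sum, crawl) => sum + crawlCostUsd(crawl, prices), 0);
}
```

With placeholder prices of 1 USD and 2 USD per million input and output tokens, a crawl at the top of the ranges above (15,000 in, 2,000 out) costs (15,000 × 1 + 2,000 × 2) / 1,000,000 = 0.019 USD. Swap in the model's real rates to get an actual figure.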
What’s Next
Automated application. The next logical step. The agent already has a browser with a logged-in LinkedIn session. “Easy Apply” jobs could be submitted automatically if the profile is pre-filled. The hard part isn’t the automation — it’s the quality control. Auto-submitting to a job you’d reject on reading the full description is worse than not applying at all. I’m working on a two-stage process: agent flags candidates, I review and approve, agent submits.
Company list expansion. The direct API crawl needs a maintained list of company ATS boards. New companies get added as I discover them. This is a maintenance cost that compounds — the list grows, the crawl stays relevant.
Cross-referencing with Glassdoor/Levels.fyi. Salary data, interview process, company reputation. Right now the agent only reports what LinkedIn shows — but the API endpoints exist for a richer intelligence layer.
The Bigger Picture
This isn’t just about job hunting. It’s a pattern: persistent browser + cron-scheduled agent + structured prompt = any recurring web task, fully automated.
The same approach powers my tech news briefing (every 4 hours), my meme poster (daily), my product research scanner (weekly), and a dozen other automations. The only difference between each one is the prompt.
Your morning routine probably has a few LinkedIn searches in it too. It doesn’t have to.
Built with Hermes Agent, a persistent Chromium Docker container, and a bunch of carefully tuned prompts. The full browser automation rig is documented separately.