
ChatGPT doing its best Cloudflare
Full-stack Cloudflare Part 3 — Content Orchestration at the Edge
This is part 3/3 of a series of posts about building on Cloudflare. Here is part 1 and part 2.
Sports fans are impatient and sports content is unpredictable. A platform that can simultaneously monitor and produce coverage for a league needs to be fast and reactive. Traditionally, that might mean manually scaling servers up and down, constantly trying to anticipate when a major story will break. Or, if you’re feeling ambitious, it could mean managing a complicated Kafka or Kubernetes setup. At PressBox, we side-stepped both approaches by building a content pipeline on the edge with Cloudflare Workers, Queues, and Workflows.
Why?
The Old Way Wasn’t Working
Why build on Cloudflare Workers instead of just spinning up an EC2 instance? The typical setup is familiar: deploy to AWS us-east-1 (crossing your fingers), set up your database in the same region, maybe add a CDN for static assets. This works fine for steady, predictable traffic, or for specific workloads with known constraints.
But as I mentioned in the intro, sports traffic is weird. You might cruise along at baseline for weeks or months, then suddenly the playoffs (or The Decision) hit and you’re dealing with 10x normal load. With traditional infrastructure, you either overprovision (and waste money) or underprovision (and crash during the biggest moments).
Edge computing isn’t new, but it still seems underutilized and/or misunderstood by many tech companies. The gravitational pull of a traditional server farm is strong, especially if your code requires a specific environment. But if you’re comfortable with JavaScript (everyone has to include it somewhere in their stack anyway), the edge opens up a lot of opportunities. It lets us stop thinking about traditional servers entirely. Traffic spike during the World Cup final? The edge handles it. Quiet Tuesday afternoon? We’re not paying for idle servers. Each tenant on our platform has wildly different traffic patterns, and the edge just… deals with it.
Workflows
We got a lot of utility out of Cloudflare Workflows. Regular Workers are meant for short, specific tasks. That’s fine for serving content, but AI processing takes longer. Workflows can run for minutes, hours, or days, survive restarts, and coordinate between different services. Without them, we’d be building our own job queue, implementing retry logic, and handling state management across failures. Workflows handle all of that for us.
How?
We broke our system into six specialized workers. Each one does one thing well, and they communicate through queues and service bindings. Here’s the flow:
Ingestion → AI Enrichment → Curation → Scoring → Notifications → Publishing
Let’s walk through each stage.
Stage 1: Getting Content In
First challenge: monitoring content from everywhere. News sites, social media, video platforms, RSS feeds. Some sources publish constantly, others update weekly. Each has its own format and quirks.
We run a scheduler that checks each source on a custom interval. When it’s time to check a source, we drop a message in a queue:
```json
{
  "org": "premier-league",
  "sourceId": "bbc-sport",
  "type": "rss"
}
```

The ingestion worker grabs these messages in batches and processes them:
```javascript
export default {
  async queue(batch, env) {
    for (const message of batch.messages) {
      try {
        await processSource(message.body);
        message.ack(); // Done, remove from queue
      } catch (error) {
        if (isTemporary(error)) {
          // Network error? Retry later
          message.retry({ delaySeconds: 60 });
        } else {
          // Bad config? 404? Don't retry
          message.ack(); // Give up
        }
      }
    }
  },
};
```

Be pragmatic: retry the transient timeout, fail fast on the 404.
We dump raw content into R2 so downstream workers can process it without hitting the original source again.
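A minimal sketch of that write, assuming an R2 bucket bound as RAW_CONTENT and a key scheme we've made up for illustration:

```javascript
// Build a deterministic object key so downstream workers can find the raw
// payload again. The org/sourceId/date layout is illustrative, not our
// actual scheme.
function rawContentKey(org, sourceId, fetchedAt) {
  const day = fetchedAt.toISOString().slice(0, 10); // YYYY-MM-DD
  return `raw/${org}/${sourceId}/${day}/${fetchedAt.getTime()}.html`;
}

// Inside the ingestion worker, after fetching a source. RAW_CONTENT is an
// assumed R2 binding declared in wrangler.toml.
async function storeRaw(env, org, sourceId, body) {
  const key = rawContentKey(org, sourceId, new Date());
  await env.RAW_CONTENT.put(key, body, {
    httpMetadata: { contentType: "text/html" },
  });
  return key; // downstream messages carry the key, not the content
}
```

Downstream workers then read by key instead of re-fetching the source, which keeps us polite to publishers and makes reprocessing cheap.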
Stage 2: Making Content Smart with AI
Raw HTML isn’t super useful. We need structured data: headlines, summaries, entities, sentiment. This is where AI comes in and workflows become essential.
Workflows let us break AI processing into steps. If one step fails, we retry just that step, not the whole process:
```javascript
import { WorkflowEntrypoint } from "cloudflare:workers";

export class EnrichmentWorkflow extends WorkflowEntrypoint {
  async run(event, step) {
    const { contentId } = event.payload;

    // Get the content
    const content = await step.do("fetch", async () => {
      return await fetchContent(contentId);
    });

    // Check if we've seen this before
    const cached = await step.do("cache-check", async () => {
      return await checkCache(content.url);
    });
    if (cached) {
      return cached;
    }

    // Run AI enrichment
    const enriched = await step.do("ai-process", async () => {
      return await runAI(content);
    });

    // Save for next time
    await step.do("cache-save", async () => {
      await saveCache(content.url, enriched);
    });

    return enriched;
  }
}
```

Each step gets automatic retries and checkpointing. If the AI call times out, we retry just that step, and the workflow picks up where it left off.
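Retries are also configurable per step. A sketch of what a tuned AI step might look like; the limits and delays are illustrative, not our production values:

```javascript
// Stub: the real version calls the AI provider.
async function runAI(content) {}

// step.do() takes an optional config object between the name and the callback.
async function enrichWithRetries(step, content) {
  return step.do(
    "ai-process",
    {
      retries: {
        limit: 5,               // retry up to five times
        delay: "10 seconds",    // initial delay before the first retry
        backoff: "exponential", // 10s, 20s, 40s, ...
      },
      timeout: "2 minutes",     // fail an attempt if the AI call hangs
    },
    async () => runAI(content),
  );
}

// For reference, the delay schedule that policy produces:
function backoffDelays(initialSeconds, limit) {
  return Array.from({ length: limit }, (_, i) => initialSeconds * 2 ** i);
}
```

Transient provider hiccups get absorbed by the backoff; a genuinely stuck call hits the timeout and retries instead of blocking the whole workflow.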
Stage 3: Human-in-the-Loop Curation
AI is good but not perfect. We involve human editors who can review and approve content before it goes live. We built a curation interface that lets them:
Approve or reject AI-generated summaries
Fix entity extraction mistakes
Adjust relevance scores
Set publishing schedules
The curation worker handles all the CRUD operations for this interface, storing decisions that feed back into our ML models. Over time, the system learns what each organization considers publishable.
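A minimal sketch of the decision-recording end of that interface; the route shape, decision names, and saveDecision helper are illustrative, not our actual API:

```javascript
// Valid editor decisions; anything else is rejected before it reaches storage.
const DECISIONS = new Set(["approve", "reject", "edit"]);

function validateDecision(body) {
  if (!body || typeof body.contentId !== "string") {
    return { ok: false, error: "contentId required" };
  }
  if (!DECISIONS.has(body.decision)) {
    return { ok: false, error: "decision must be approve, reject, or edit" };
  }
  return { ok: true };
}

// Stub: the real version writes to the tenant's database and the ML feedback set.
async function saveDecision(env, body) {}

// In a deployed worker this object would be the file's default export.
const curationWorker = {
  async fetch(request, env) {
    if (request.method !== "POST") return new Response(null, { status: 405 });
    const body = await request.json();
    const check = validateDecision(body);
    if (!check.ok) return new Response(check.error, { status: 400 });
    await saveDecision(env, body);
    return new Response(null, { status: 204 });
  },
};
```

Validating before storage matters here because every stored decision becomes training data; a malformed decision would quietly poison the feedback loop.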
Stage 4: Scoring What Matters
A trade rumor about a star player might be huge news for one league but irrelevant to another. We run tenant-specific ML models that score content based on:
Category relevance
Source reputation
How fresh the news is
Historical engagement patterns
Scores can influence everything from homepage placement to which stories make it into email digests.
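Stripped of the ML, the scoring reduces to a weighted combination of those signals. A hand-tuned sketch with made-up weights; the real models are tenant-specific and learned, not hardcoded:

```javascript
// Illustrative weights; in production these come from each tenant's model.
const WEIGHTS = { relevance: 0.4, reputation: 0.2, freshness: 0.25, engagement: 0.15 };

// Freshness decays by half every 6 hours (the half-life is illustrative).
function freshness(ageHours, halfLifeHours = 6) {
  return 2 ** (-ageHours / halfLifeHours);
}

// All inputs normalized to [0, 1]; output is a [0, 1] score.
function scoreContent({ relevance, reputation, ageHours, engagement }) {
  return (
    WEIGHTS.relevance * relevance +
    WEIGHTS.reputation * reputation +
    WEIGHTS.freshness * freshness(ageHours) +
    WEIGHTS.engagement * engagement
  );
}
```

The exponential decay is what makes a six-hour-old trade rumor rank below a fresh one from a weaker source, which matches how editors actually triage.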
Stage 5 & 6: Getting Content Out
The last two workers handle distribution. We can send alerts to editors, generate email digests, publish audio, or trigger a push notification. The only limit is the imagination of the league or sports organization we’re working with. All customizable based on their scheduling needs.
Worker Communication
We use three communication patterns:
Queues for async work: When worker A’s output becomes worker B’s input, we use a queue. Decouples the workers, buffers traffic spikes, gives us automatic retries.
Service bindings for sync calls: When we need an immediate response (like fetching config), we use service bindings. These are basically RPC calls that stay within Cloudflare’s network.
Workflows for orchestration: Multi-step processes where each step might fail independently? That’s workflow territory.
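Side by side, the first two patterns look like this; the binding names INGEST_QUEUE and CONFIG_SERVICE are illustrative and would be declared in wrangler.toml:

```javascript
// Async: hand work to the next stage and return immediately.
async function enqueueIngest(env, org, sourceId) {
  await env.INGEST_QUEUE.send({ org, sourceId, type: "rss" });
}

// Sync: fetch tenant config over a service binding. The request never leaves
// Cloudflare's network; the hostname is only used for routing inside it.
async function getTenantConfig(env, org) {
  const res = await env.CONFIG_SERVICE.fetch(`https://config/orgs/${org}`);
  if (!res.ok) throw new Error(`config fetch failed: ${res.status}`);
  return res.json();
}
```

The rule of thumb: if the caller can proceed without the result, queue it; if it can't, use a binding.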
Making Multi-Tenancy Work
All of our customers get complete data isolation. We built tenant-scoped APIs that dynamically connect to the right database:
```javascript
function getContentAPI(env, org) {
  // Each org has its own database; "premier-league" maps to DB_PREMIER_LEAGUE
  const dbUrl = env[`DB_${org.toUpperCase().replace(/-/g, "_")}`];
  return {
    async getContent(id) {
      const db = createClient(dbUrl);
      return await db.query("SELECT * FROM content WHERE id = ?", id);
    },
  };
}
```

Same pattern for vector stores, ML models, everything. Strong isolation without complex filtering logic everywhere.
Our Learnings
Workflows and Queues were the answer to many problems we’ve faced. As we built the platform, we repeatedly ran into patterns that didn’t scale:
Synchronous worker chains: This is a very common pattern when building with workers: Worker A calls Worker B calls Worker C. Whenever we notice it happening, we eject quickly and drop a message in a queue or kick off a workflow instead.
The Super Worker: Another common pattern is to cram everything into one giant worker. This is feasible now with frameworks like Hono or React Router that run comfortably in a Worker, but we prefer purpose-built workers that can be deployed independently. Smaller workers also help with cold starts.
Observability: You can’t just SSH into a server and tail logs. Cloudflare does have an Observability dashboard, but we haven’t found it very reliable, especially for real-time debugging. So we’ve had to get a little creative with structured logging and custom metrics/events. Sentry and PostHog have been great for this.
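By "structured logging" we mean one JSON object per line with enough fields to slice by tenant and pipeline stage. A minimal sketch; the field names are our own convention, not a Sentry or PostHog requirement:

```javascript
// One JSON line per event; Workers' console output is what the log pipeline
// (tail, Logpush, or a Sentry/PostHog forwarder) actually sees.
function logEvent(stage, level, message, fields = {}) {
  const entry = {
    ts: new Date().toISOString(),
    stage,     // e.g. "ingestion", "enrichment", "scoring"
    level,     // "info" | "warn" | "error"
    message,
    ...fields, // org, sourceId, contentId, durationMs, ...
  };
  console.log(JSON.stringify(entry));
  return entry;
}
```

Because every line is parseable JSON carrying the org, we can answer "why did this tenant's digest miss a story?" by filtering one field instead of grepping free text.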
Conclusion
We know this type of infrastructure isn’t for everyone. There is comfort in the traditional approach of spinning servers up and down and fiddling with CPU and RAM. It also doesn’t work if you’re not comfortable writing almost everything in JavaScript/TypeScript. We still have specialized needs that require this traditional approach. But that has been the exception, not the rule.
But this approach has paid off for us. Our infrastructure costs are low and we’re able to focus on what really matters — building quickly and improving the experience for our customers.
Want to see this in action? Book a demo to check out how PressBox handles content for professional sports organizations.