The Solo SaaS AI Stack: A Real Cost Breakdown (Actual Receipts)

I run four SaaS products solo on under $200/month in AI costs. Here

Every founder I talk to has the same anxiety about AI costs: "I don't know what I'll spend until it's already too much." The horror stories — a $2,400 OpenAI bill from a runaway loop, a $400/day vector-DB charge that nobody noticed for a week — make people paranoid. So they either over-provision (wasting money) or under-provision (wasting velocity).

This post solves that. Below is the real, line-by-line AI cost of running the four-product Autopilot Labs ecosystem — Site Autopilot, Traffic Autopilot, Backlink Autopilot, and Your Travel Companion — solo, in February 2026. No founder folklore, no inflated estimates from $30M-funded blog posts. The actual receipts.

The headline number: ~$180/month, four products

Total AI spend across all four products: roughly $180/month, including text generation, image generation, embeddings, and the occasional Whisper transcription. That number scales linearly with paid users, not with active users — the difference matters more than people realise.

Here's the breakdown:

Service	Use case	Monthly cost
Claude (Haiku / Sonnet / Opus tiered)	Site generation, blog drafts, SEO copy	$92
Gemini Nano Banana	Hero images for blogs, OG images	$31
OpenAI GPT-4o-mini	Cheap classifications & summaries	$14
OpenAI text-embedding-3-small	Topic clustering, dedupe checks	$3
Whisper (occasional)	Voice-note transcription for Travel Companion	$2
Cloudflare Pages + R2 + Workers	Hosting all four sites + assets	$0
MongoDB Atlas (free tier)	App database	$0
Resend (free tier)	Transactional + digest email	$0
Stripe	Billing (only charges per-tx, not flat)	~$8
Domains (4 × $11/yr amortised)	—	$4
Total		~$154–180

Let me unpack the choices behind each line, because that's where the real lessons live.

Why Claude is the biggest line item (and why I'm okay with it)

$92/month is Claude. That's by far the largest cost. It's also the cost I'd defend the loudest if someone tried to cut it.

Here's the thing: Claude is not a commodity for content generation. Most AI-written content reads like AI-written content because most founders use GPT-4 (or worse, GPT-4o-mini) for the entire generation pipeline. The cost-per-token feels cheap until you realise the output is brand-damaging.

Claude — specifically Sonnet 4.5 for blog content and Opus 4.5 for the highest-stakes pieces — generates copy that doesn't read like AI. That's not a marketing claim, that's a tactile observation after running 200+ A/B-tested generations across both. The structural difference: Claude resists the "thesis → three bullet points → conclusion" trap that GPT-4o falls into reliably. You can prompt around it, but the prompts get long and brittle. Easier to just pay a few cents more per generation.

Tiering matters a lot. The Site Autopilot product uses Haiku 4.5 for Free users, Sonnet for Starter/Growth, Opus for Pro/Agency. That tiering isn't just a paywall mechanic — it's a cost-shape mechanic. Free users generate cheap content. Paying users get expensive content. The unit economics work because the people willing to pay $99/month get the model that justifies the price.

Why Nano Banana over DALL-E 3 or Imagen

$31/month is Gemini Nano Banana (3.1-flash-image-preview) for hero images, OG images, and per-blog illustrations.

I tested all three image models extensively before committing:

DALL-E 3: gorgeous, but ~3x more expensive per image, and rate-limited in ways that bottleneck batch generation
Imagen 3: also expensive, and inconsistent on text-in-image (which I need for OG images)
Gemini Nano Banana: 80% of DALL-E 3's quality at 30% of the cost, and the text rendering is shockingly reliable

The decision came down to cost-per-generation × generations-per-month. A blog post needs one hero image. We generate roughly 80 blog posts a month across all four products. At DALL-E 3 prices, that's ~$120/month just for hero images. Nano Banana brings it to $30. The visual difference is real but not commercially meaningful for SEO-driven blog content.

Choose your image model based on the worst image it will produce, not the best one. The best is for portfolio shots. The worst is what 80% of your users will see.

The line items that should be bigger (but aren't)

Embeddings: $3/month

Embedding spend is laughably small relative to its strategic value. OpenAI's text-embedding-3-small is $0.02 per 1M tokens. We use it for topic clustering (which blog ideas overlap?), dedupe checks (is this draft 80% similar to an existing post?), and basic semantic search. It runs constantly. Total cost: ~$3/month.

If you're not using embeddings, you're either (a) over-paying Claude to do work embeddings could do, or (b) shipping content that's accidentally duplicative because you couldn't compare it cheaply.

Hosting: $0/month

All four marketing sites + the Site Autopilot React app run on Cloudflare Pages. The first 500 builds/month are free. The first 100k requests/day are free. We've never come close to hitting either limit. The combined hosting bill for the entire ecosystem is genuinely $0, and will stay $0 until we're doing real volume — at which point we'll happily pay $5/month for Cloudflare Pages Pro.

If you're paying Vercel or Netlify $20/month for static-site hosting and your site is mostly HTML, you're being charged for someone else's pricing problem.

Database: $0/month

MongoDB Atlas's M0 free tier (512MB, shared cluster) holds everything for all four products. Yes, really. Each product has a few thousand documents at most. We'll need to upgrade eventually, but "eventually" is at least a year away at current growth, and the upgrade is $9/month for 5GB. Not a load-bearing cost.

The line items that could bloat (and how to prevent it)

Three places where I've seen indie founders blow $1000+ months without realising:

1. Streaming completions without max-tokens caps

If you let users stream responses with no max_tokens limit, Claude will happily generate a 30,000-token output for a question that needed 200 tokens. Always cap it. The Site Autopilot codebase caps every Claude call. Reading the prompts you sent and the prompts your users sent is the only way to find runaway loops before they show up on the invoice.

2. Re-generating instead of caching

If a user lands on your generated landing page and your stack regenerates the page content on each request, you're paying Claude tokens per pageview. Generate once, store the output, serve the stored version. Sounds obvious, but I've audited two founder codebases this month that were burning $200+/month on re-generation.

3. The "small but constant" job

A nightly background job that calls Claude 50 times for "freshness updates" feels innocent. At Sonnet 4.5 pricing, that's ~$3/night = $90/month, just for a job you forgot you wrote. Audit your cron jobs quarterly. Most of them can be cut in half by batching.

How the cost shape will change at 1000 paying users

At current numbers (~150 paying users across the ecosystem), AI is ~$180/month. Linear extrapolation says at 1000 paying users we'd be at ~$1,200/month. That's wrong — costs will be lower per-user, not the same.

Three reasons:

Prompt caching kicks in. Claude offers prompt caching where repeated prompt prefixes (e.g. the long system message you reuse for every generation) cost ~10% of normal pricing. At scale that's a 30–50% cost reduction on text generation
Volume tier discounts. Claude has enterprise pricing that triggers at high MTU. Worth negotiating around $500/month spend
Better defaults. At 1000 users you have enough data to know which features actually drive revenue. You cut the ones that don't, and the cost-per-active-user drops

The mental model: AI costs scale sublinearly with users if you let them. Linearly if you don't. Superlinearly only if you have a serious bug.

What I'd cut first if revenue dropped 50%

Useful exercise: in a forced-austerity scenario, what would I cut, in order?

Opus 4.5. Drop the Pro/Agency tier to Sonnet 4.5. Saves ~$25/month. Customers might not even notice — the quality gap is real but subtle
Nano Banana on free-tier blogs. Use a CSS-gradient hero placeholder for Free-tier generations. Saves ~$15/month
GPT-4o-mini classifications. Most can be done with cheaper or rule-based logic. Saves ~$10/month

Three cuts, $50/month saved, zero impact on paying customer experience. The discipline of asking "what would I cut?" — even when you don't need to — surfaces the lines you should already have cut.

The takeaway

The fear of unbounded AI costs is overblown for solo founders running well-instrumented apps. The real risks are mostly architectural: uncapped streams, re-generation instead of caching, and stealth cron jobs. Fix those three and you'll stay under $500/month even at hundreds of paying users.

The other piece: spend where it shows up in product quality, save where it doesn't. Claude is worth paying for. Imagen is not. Hosting is not. Embeddings are absurdly cheap relative to their leverage.

If your AI bill is over $500/month and you have fewer than 500 paying users, there's a cost-optimisation problem hiding somewhere in your codebase. Spend a Saturday afternoon auditing — it almost always finds $200/month of waste in under 4 hours of work.