
How I'd Run a $10K MRR SaaS on a $20/Month Tech Stack in 2026

A contrarian, opinionated breakdown of the leanest possible SaaS tech stack in 2026. Real services, real prices, and the architecture choices that give a bootstrapped founder infinite runway — VPS, Go, SQLite, local AI, and the deployment boringness that actually wins.

Imtiaj Sajin

Feb 25, 2026

12 min read

A founder I know got rejected from a pitch night last week. Not because his numbers were bad — he's at $11,000 MRR, growing, and profitable. The feedback was literally: "what do you even need the money for?"

He laughed about it later, but he wasn't wrong to apply. The VC world is built around the idea that software businesses have huge fixed costs, need to burn cash for years before they work, and can only scale if you set fire to a seed round first. The whole pitch-deck aesthetic assumes AWS bills in the tens of thousands, a managed Kubernetes cluster, a data warehouse, a growth team, and the kind of infrastructure that needs a dedicated engineer just to keep the lights on.

Here's the thing. None of that is actually necessary. You can run a real, profitable, growing SaaS on a tech stack that costs about the same as a Netflix subscription. I'm not talking about a toy project. I'm talking about something with paying customers, real uptime, sub-second latency, and a margin profile VCs would hate because there's nothing for them to fix.

This is the playbook I'd use tomorrow if I were starting over with my own money. Every line item is named. Every price is real. Every choice is opinionated, and I promise you won't agree with all of them. That's the point.

Last updated: April 2026

The Philosophy: Most of What You Think You Need, You Don't

The default advice in 2026 reads like a checklist from a Gartner report. Kubernetes. Managed Postgres. Redis cluster. Dedicated observability platform. CI/CD orchestration. Multi-region failover. Feature flag service. By the time you've stood up the "minimum viable platform," you've spent six weeks writing YAML, burned through your first check, and haven't shipped a single feature that a customer cares about.

The best antidote I know is Dan McKinley's classic "Choose Boring Technology" essay. The argument in one sentence: every new technology you adopt has a cost, and you only have so many "innovation tokens" to spend. Spending them on infrastructure means you have fewer to spend on the product — which is the only thing your customers ever actually care about.

DHH has been making a louder version of this argument for years, culminating in Basecamp's exit from AWS that saved the company roughly $2 million annually. Gergely Orosz's Pragmatic Engineer newsletter is another good source of this same medicine: most startups are 18 months ahead of where their infrastructure needs to be, and they pay for it in runway.

OK. Sermon over. Let's talk about the actual stack.

Use a Lean Server

The naive way to launch a web app in 2026 is to fire up AWS, provision an EKS cluster, set up an RDS instance, add a NAT Gateway, and accidentally spend $300 a month before a single user has visited your landing page. The smart way is to rent a single Virtual Private Server and call it a day.

I use Hetzner Cloud. The CX22 instance is 2 vCPU, 4GB of RAM, 40GB of NVMe, and 20TB of included bandwidth for €4.59/month — roughly $4.90 USD. DigitalOcean and Linode have comparable boxes in the $5-10 range. Vultr rounds out the big three if you want a second opinion.

Forget AWS. You aren't going to need it for a long time, and their pricing model is a labyrinth designed to extract upgrades. The goal of your infrastructure is to serve requests, not to generate invoices. When you have one server, you know exactly where the logs are, exactly why it crashed, and exactly how to restart it. That knowledge alone is worth more than whatever magic you think you're buying from the big clouds.

4GB of RAM sounds terrifying to modern web developers raised on 16GB Kubernetes nodes, but it's more than enough if you know what you're doing. If you need breathing room, add a swapfile. The Twelve-Factor App principles still apply — they just apply harder when you've got real constraints.

Use a Lean Language

Once you've committed to a single small box, your language choice matters more than it used to. You can run Python or Ruby as your main backend, but you'll spend half your memory just booting the interpreter and managing worker processes. I write my backends in Go.

Go is fast, strictly typed, and — critically for 2026 — it's incredibly easy for LLMs to reason about, which matters when you're pairing with AI every day. But the real magic is deployment. There's no pip install dependency hell, no virtualenv, no Dockerfile you have to pretend to understand. You compile your entire application into a single, statically linked binary on your laptop, scp it to your server, and run it.

Here's what a complete, production-ready web server looks like in Go. No frameworks required:

package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "Hello, your MRR is safe here.")
    })

    // This will comfortably handle tens of thousands of requests
    // per second on a $5 VPS. log.Fatal surfaces startup errors
    // (like the port already being in use) instead of exiting silently.
    log.Fatal(http.ListenAndServe(":8080", nil))
}

That's not pseudocode. That's the real thing. It's what TechEmpower's benchmarks have been quietly showing for years: Go's net/http stack outperforms most framework-heavy setups in Node, Python, and Ruby by a factor of ten or more on equivalent hardware.

If you don't want Go, Bun plus TypeScript is a reasonable second choice in 2026 — it's fast, it handles transpilation and packaging natively, and the deployment story is nearly as clean as Go's. Rust is the maximalist option if you're allergic to runtime errors and have the time. What you shouldn't do is start with a heavyweight framework that assumes infinite memory and needs a team of DevOps engineers to deploy.

Use SQLite for Everything (Yes, Really)

I always start a new venture with SQLite as the main database. Hear me out — this is not as insane as it sounds.

The enterprise mindset insists you need an out-of-process database server. The truth is, a query against a local SQLite file is an in-process function call, orders of magnitude lower latency than a TCP round-trip to a Postgres instance. SQLite's own documentation on when to use it explicitly recommends it for any site with fewer than 100K hits per day, which is approximately every startup I know. It's the most widely deployed database in the world, and it's tested to aviation-grade standards.

The concurrency objection used to be real. It's not anymore. Turn on Write-Ahead Logging and your readers stop blocking writers and your writers stop blocking readers:

PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;

That's it. You can now comfortably handle thousands of concurrent users off a single .db file on NVMe storage. Ben Johnson's post on SQLite at scale is the definitive read if you want to go deeper.

The durability question used to be the real blocker. Then Litestream came along and solved it by continuously streaming your SQLite WAL to S3-compatible storage. If your VPS catches fire, you rehydrate from object storage in under a minute. Ben Johnson — Litestream's creator — now works on LiteFS at Fly.io, which takes the same idea further and makes SQLite a genuinely distributed database.
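For reference, a minimal /etc/litestream.yml replicating to S3-compatible storage looks roughly like this. Bucket name, endpoint, and paths are placeholders, and you should check the Litestream docs for the exact replica options:

```yaml
dbs:
  - path: /opt/app/data/app.db
    replicas:
      - type: s3
        bucket: my-app-backups            # placeholder
        path: app
        endpoint: https://<account-id>.r2.cloudflarestorage.com
        access-key-id: ${R2_ACCESS_KEY_ID}
        secret-access-key: ${R2_SECRET_ACCESS_KEY}
```

Run litestream replicate as a background service, and after a disaster run litestream restore with the -o flag pointing at your database path to rehydrate from the replica.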

If you insist on Postgres, Supabase's free tier gives you 500MB of managed Postgres for $0. Neon's free tier is similar. Both are fine. Just don't pay $70/month for "enterprise" Postgres before you have product-market fit. Turso is another option if you want distributed SQLite without running it yourself.

Use Local AI for the Batch Stuff

If you already have a graphics card sitting in a box under your desk, you have essentially unlimited AI credits. This changes the math on everything.

The naive approach to anything AI-flavored is to throw every request at the OpenAI API, swallow the bill, and hope you didn't write a bug in your prompt loop that triggers 10,000 retries. The smart approach is to do the heavy, non-latency-sensitive work on your own hardware.

A used RTX 3090 or 4090 on eBay or Facebook Marketplace runs $700-$1,100 in 2026. Drop it in any desktop, install Ollama or vLLM, and you can serve open models like Qwen 2.5 32B, a quantized Llama 3.3 70B, or DeepSeek's distilled variants at real production speeds. (Full-size DeepSeek V3 is far too big for a single consumer card; the distills are the point.)

Two-step upgrade path:

1. Start with Ollama. One command (ollama run qwen2.5:32b) and you have a working local model. It's the right environment for iterating on prompts and trying out different models without committing.

2. Graduate to vLLM for production. Once you have a prompt that works, Ollama's concurrency story falls over. vLLM uses PagedAttention to serve many concurrent requests efficiently on a single GPU — it locks the GPU to one model but serves it dramatically faster. For anything batch-heavy, vLLM is the correct answer.

The LocalLLaMA subreddit is the single best information source on what's actually working on consumer hardware right now. I check it every morning. Simon Willison's blog is the other indispensable source — he's been quietly benchmarking local models against hosted APIs for years.

Use OpenRouter for the Smart Stuff

You can't do everything locally. Sometimes you need frontier-level reasoning for a user-facing interaction, and a 70B open model isn't going to cut it. Instead of juggling billing accounts and rate limits across Anthropic, OpenAI, Google, and whoever else, use OpenRouter.

One API key. One OpenAI-compatible integration. Access to every major frontier model — Claude Sonnet 4.6, GPT-5, Gemini 2.5, Llama 3.3, Mistral Large, and everything in between — all behind the same interface.

The real win is fallback routing. When Anthropic's API has a bad Tuesday afternoon (which, if you haven't noticed, it periodically does), my app automatically falls back to an equivalent OpenAI model. My users never see an error. I don't have to write retry logic. This alone is worth the switching cost from direct provider SDKs.

For heavier orchestration across multiple model providers, LiteLLM is the open-source alternative if you want to self-host the gateway. Both work. OpenRouter is the lower-friction option for solo founders.

Use Copilot, Not the $500/Month Agentic IDE

Every week there's a new AI coding tool that promises to refactor your codebase autonomously for $300 a month. Cursor is great. Claude Code is great. Windsurf is great. I've used all of them, and for most jobs, none of them are worth the subscription cost compared to the one tool everyone ignores.

I'm using GitHub Copilot in standard VS Code. It's been my primary driver since 2023 and my monthly bill hasn't moved. Microsoft's pricing model for Copilot charges per request, not per token — which means a single prompt can trigger an agent that chews through your entire codebase, refactors dozens of files, and runs for half an hour, and it still costs roughly four cents. Simon Willison has written about this arbitrage and it still hasn't been closed.

The optimal strategy is dead simple: write brutally detailed prompts with strict success criteria, tell the agent to "keep going until all tests pass," hit enter, and go make coffee while Satya Nadella subsidizes your compute. Is it the best experience on the market? No. Claude Opus 4.6 in Claude Code is noticeably smarter. But "noticeably smarter" at $200/month vs "good enough" at $20/month is not a close call when you're bootstrapping.

For a deeper look at the 2026 AI coding tools landscape, Latent Space runs regular comparisons that are worth skimming.

Deployment: Boring, Reliable, Cheap

I deploy with a Bash script. I'm not kidding.

#!/bin/bash
# deploy.sh
set -e
# Cross-compile for the server (adjust GOARCH if your VPS is ARM).
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o ./bin/app ./cmd/server
rsync -avz --delete ./bin/ root@my-vps:/opt/app/bin/
ssh root@my-vps 'systemctl restart app'

That's the whole thing. systemd handles process management. Caddy terminates TLS and handles certs automatically via Let's Encrypt. A GitHub Actions workflow runs this script on every push to main. Total deploy time from commit to live: about twelve seconds. Total rollback time: one SSH session.
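For completeness, the systemd unit behind that restart command is about ten lines. A sketch, where the paths and service name are assumptions matching the script above:

```ini
# /etc/systemd/system/app.service
[Unit]
Description=app
After=network.target

[Service]
ExecStart=/opt/app/bin/app
WorkingDirectory=/opt/app
Restart=always
RestartSec=2

[Install]
WantedBy=multi-user.target
```

Restart=always means a crash is a two-second blip, not a page at 3am.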

You don't need Kubernetes. You don't need Helm charts. You don't need ArgoCD. You don't need a feature-flag SaaS. You need a script that builds your binary, copies it over, and restarts the service. If your whole infrastructure can't be understood by a competent developer in fifteen minutes, you've overcomplicated it, and the complexity will cost you every single day until you rip it out.

For monitoring, Better Stack's free tier gives you ten uptime monitors and a status page for zero dollars. Sentry's free tier handles 5,000 errors a month. Plausible self-hosted or PostHog's free tier covers analytics. None of these cost money until you have real revenue to spend.

The Objections You're About to Raise

"What about backups?" Litestream to Cloudflare R2. $1/month for storage. Point-in-time recovery from object storage. Done.

"What about CI/CD and rollbacks?" GitHub Actions runs your tests and the deploy script. Rollback is git revert and another push. For anything fancier, you can write a blue-green deploy in 50 lines of Bash. This is not a hard problem.

"What about scale?" You can't scale what you haven't built. Ship the thing. Serve the customers you have. Upgrade when it breaks. WhatsApp was serving 450 million users with a team of roughly 32 engineers when Facebook bought it, and Instagram ran on a tiny team before it was acquired for a billion dollars. You are not them, and you need less infrastructure than they did.

"What about compliance?" SOC 2, HIPAA, and friends are mostly about policies, access controls, and audit logs, not about whether you run Kubernetes. Vanta and Drata will walk you through the actual requirements. Most of them are satisfied by a VPS with proper access controls and logging.

"What about hiring engineers who want to use modern tools?" This one is real, and the honest answer is you're hiring someone whose main interest is your product, not your stack. Engineers who demand Kubernetes before you have product-market fit are engineers who will burn your runway on platform work and leave when it gets boring.

"What about uptime?" Hetzner publishes a 99.9% SLA. That's 43 minutes of downtime per month. Your customers will not notice the difference between 99.9% and 99.99% at your scale. They will notice if you run out of runway because you were paying $8,400/month for infrastructure on $6,200 MRR.

Running the Numbers

Let's add it up. A realistic lean stack in 2026 looks like this:

| Layer | Service | Monthly |
| --- | --- | --- |
| Server | Hetzner CX22 | ~$5 |
| Object storage | Cloudflare R2 | ~$1 |
| Domain | Cloudflare Registrar | ~$1 |
| TLS | Let's Encrypt via Caddy | $0 |
| Database | SQLite + Litestream | $0 |
| Auth | Self-hosted library | $0 |
| Email | Resend free tier | $0 |
| Analytics | Plausible self-hosted / PostHog free | $0 |
| Errors | Sentry free tier | $0 |
| Uptime | Better Stack free tier | $0 |
| Deployment | GitHub Actions free tier | $0 |
| AI (batch) | Local GPU (amortized) | ~$0 |
| AI (live) | OpenRouter | Pay-per-use |
| Copilot | GitHub Copilot | ~$10-20 |
| **Total** | | **~$20/month** |

That's your infrastructure budget for a growing SaaS. The variable cost is OpenRouter usage, which scales with real revenue — the definition of good unit economics. At $10K MRR and reasonable usage, you're still under $300/month of total spend, and your gross margin is somewhere around 97%. That's the number VCs quietly hate because there's nothing to optimize and nothing to invest in.

The Bottom Line

The tech industry has a vested interest in convincing you that building a real software business requires complex orchestration, six-figure AWS bills, and millions in venture capital. It doesn't. It never did. The people who loudly argue otherwise are usually the people selling you the complex orchestration.

By running a single VPS, a statically compiled binary, a local GPU for batch work, and the raw speed of SQLite, you can bootstrap something that costs less than a gym membership and still gives you infinite runway. You give yourself the time to actually solve your users' problems instead of sweating your burn rate. You skip the board meetings, the quarterly reviews, and the slow death of "optimizing for growth" before you've even figured out what the product is.

None of this is the right answer for every company. If you're selling to Fortune 500 procurement, they will want SOC 2 and SAML and all the enterprise boxes ticked, and you will end up adding some of them. If your workload genuinely needs a thirty-node GPU cluster, you will have to build one. But 95% of SaaS startups are not in that situation — they just think they are because the marketing is loud.

Build the boring stack. Ship the thing. Serve the customers. Upgrade when it actually breaks, not a moment before.


If you're building anything in the AI / conversational space specifically, Chatsby is what I'd wire into the "live AI" slot of this stack so I didn't have to build it myself. Everything else above still applies.
