Guardrails for AI coding.
All green. Login’s broken. Nobody told you.
Ship code you can trust, even the parts you didn’t write yourself.
A guardrail is a rule in plain English. Guardline shows which ones your tests actually cover, and which ones are Gaps.
Built in the open by an indie running several live AI-built apps.
Correct credentials → land on the dashboard.
auth.login.test.ts:21 · expect(redirect).toBe('/app')
Wrong password → generic “Username or password is incorrect.”
auth.login.test.ts:39 · toMatch(/incorrect/) · no field leak
Unknown email → the same generic message. Never reveal the account exists.
auth.login.test.ts:54 · same copy as wrong-password case
Empty fields → inline validation, no request sent.
auth.login.test.ts:67 · expect(fetch).not.toHaveBeenCalled()
Lockout after 5 attempts — declined → the Boundary (post-MVP, on purpose)
login → 5 proposed → decline lockout → 4/4 covered.
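To make the four covered guardrails concrete, here is a minimal sketch of a login flow that would satisfy them. Everything in it is an assumption for illustration: `submitLogin`, `fakeServer`, the `Outcome` type, and the validation copy are hypothetical names, not Guardline's API and not the code behind auth.login.test.ts.

```typescript
// Hypothetical sketch: these names are illustrative, not Guardline's API or a real auth library.
type Credentials = { email: string; password: string };
type ServerReply = { ok: boolean };
type Outcome =
  | { kind: "redirect"; to: string }
  | { kind: "error"; message: string }
  | { kind: "invalid"; message: string };

const GENERIC_ERROR = "Username or password is incorrect.";

// Stand-in for the real request. It answers the same way for an unknown
// email and a wrong password, so the caller cannot tell them apart.
function fakeServer(creds: Credentials): ServerReply {
  const known = new Map<string, string>([["ada@example.com", "correct-horse"]]);
  return { ok: known.get(creds.email) === creds.password };
}

function submitLogin(
  creds: Credentials,
  send: (c: Credentials) => ServerReply
): Outcome {
  // Empty fields: validate inline and return before `send` is ever called.
  if (creds.email.trim() === "" || creds.password === "") {
    return { kind: "invalid", message: "Enter your email and password." };
  }
  const reply = send(creds);
  if (!reply.ok) {
    // One message for every failed attempt, so the copy never leaks
    // which field was wrong or whether the account exists.
    return { kind: "error", message: GENERIC_ERROR };
  }
  return { kind: "redirect", to: "/app" };
}
```

Passing `send` in as a parameter is a design choice for the sketch: it lets a test count calls and confirm that empty fields never reach the network, which is the same idea as the `expect(fetch).not.toHaveBeenCalled()` assertion cited above.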
You asked AI to tweak the signup form. It also “tidied” the auth helper. Login broke. Every test stayed green. You heard it from a user.
Guardline catches it before you ship. The card turns red.
Green is a vibe.
- Tests pass, but nothing says what they protect.
- You skim the diff and hope.
- Silent breakage ships.
Green is a contract.
- Every guardrail names the test behind it.
- You check what it should do, in plain English.
- A broken guardrail goes red before you ship.
The features that actually scare you.
Login was the first guardrail. Payments and uploads are where a silent edit does the most damage, so they get the same plain-English rules and the same cited proof.
Successful charge → subscription goes active.
checkout.test.ts:18
Card declined → clear error, nothing charged.
checkout.test.ts:33
Stripe retries the webhook → don’t charge twice.
Gap — no test
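One way to close the webhook Gap is to make the handler idempotent: remember each event ID and apply its effect only once. The sketch below assumes an in-memory set and a hypothetical `PaymentLedger` class. It is not Stripe's SDK, and a real app would more likely store processed event IDs in a database with a unique constraint.

```typescript
// Hypothetical shapes: the minimum needed for the sketch, not Stripe's SDK types.
type WebhookEvent = { id: string; type: string; amountCents: number };

class PaymentLedger {
  // IDs of events that have already been applied.
  private readonly seen = new Set<string>();
  private creditedCents = 0;

  // Returns true if the event was applied, false if it was a retry
  // of an event this ledger has already handled.
  handle(event: WebhookEvent): boolean {
    if (this.seen.has(event.id)) {
      return false;
    }
    this.seen.add(event.id);
    this.creditedCents += event.amountCents;
    return true;
  }

  total(): number {
    return this.creditedCents;
  }
}
```

A test that delivers the same event twice and checks that `total()` only moves once would turn this guardrail from a Gap into a covered behavior.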
Image under 5 MB → uploaded and shown.
upload.test.ts:12
A 20 MB file or an .exe → rejected with a reason.
upload.test.ts:27
Filename with ../ → can’t escape the folder.
Gap — no test
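The ../ Gap can be closed the same way: pin down the behavior in a small function, then test it. This is a minimal sketch under stated assumptions. `resolveInside` is a hypothetical helper, the upload root is made up, and it works on path strings only. In a Node app you would more likely resolve the path with the built-in path module and check that the result still starts with the upload root.

```typescript
// Hypothetical helper: resolves a user-supplied filename against an upload
// root and returns null if the name would climb out of that root.
function resolveInside(root: string, filename: string): string | null {
  const parts: string[] = [];
  // Split on both separators so "..\" is caught as well as "../".
  for (const segment of filename.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") {
      continue; // leading slashes and "." do not change the location
    }
    if (segment === "..") {
      if (parts.length === 0) {
        return null; // would step above the upload root
      }
      parts.pop();
      continue;
    }
    parts.push(segment);
  }
  if (parts.length === 0) {
    return null; // nothing left to name a file
  }
  return `${root}/${parts.join("/")}`;
}
```

With this in place, the guardrail reads as a one-line assertion: a name such as `../../etc/passwd` resolves to `null`, and `cat.png` resolves to a path under the root.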
Author the guardrails. The Reviewer cites the proof.
Point it at a repo you already have. It finds the test behind each guardrail and flags the ones that have none. Then guardline verify keeps your CI honest.
1. Describe it in plain English
   The AI suggests the guardrails, including the ones you’d forget.
2. The Reviewer maps each to a test
   It finds the test that backs each one and cites the assertion.
3. Covered, or a Gap
   Backed behaviors go green. The rest are Gaps you can close. Anything out of scope goes to the Boundary.
$ guardline verify auth/login
● Correct credentials → dashboard    login.test.ts:21
● Wrong password → generic error     login.test.ts:39
▲ Unknown email → same message       Gap — no test
● Empty fields → no request          login.test.ts:67
3/4 covered · 1 Gap · 1 in Boundary
✗ verify failed — close the Gap to pass CI
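The summary line in that output follows from a simple model: every guardrail is covered, a Gap, or in the Boundary, and verify fails while any Gap remains. Guardline isn't live yet, so the sketch below is a guess at the shape, not its implementation. The `Guardrail` type and `summarize` function are hypothetical.

```typescript
// Hypothetical data model: each guardrail has exactly one of three states.
type Guardrail =
  | { rule: string; status: "covered"; citation: string }
  | { rule: string; status: "gap" }
  | { rule: string; status: "boundary" };

type Summary = { line: string; passed: boolean };

function summarize(guardrails: Guardrail[]): Summary {
  const covered = guardrails.filter((g) => g.status === "covered").length;
  const gaps = guardrails.filter((g) => g.status === "gap").length;
  const boundary = guardrails.filter((g) => g.status === "boundary").length;
  // Boundary items are out of scope on purpose, so they are not counted
  // in the denominator of the coverage fraction.
  const inScope = covered + gaps;
  const gapLabel = gaps === 1 ? "Gap" : "Gaps";
  return {
    line: `${covered}/${inScope} covered · ${gaps} ${gapLabel} · ${boundary} in Boundary`,
    passed: gaps === 0,
  };
}

// The login card from the terminal output above, as data.
const loginCard: Guardrail[] = [
  { rule: "Correct credentials → dashboard", status: "covered", citation: "login.test.ts:21" },
  { rule: "Wrong password → generic error", status: "covered", citation: "login.test.ts:39" },
  { rule: "Unknown email → same message", status: "gap" },
  { rule: "Empty fields → no request", status: "covered", citation: "login.test.ts:67" },
  { rule: "Lockout after 5 attempts", status: "boundary" },
];
```

Keeping the Boundary out of the denominator is what lets a card read 4/4 covered while still showing the lockout rule as a visible, deliberate exclusion.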
AI editors make code faster.
Guardline makes it accountable.
Review intent, not diffs
Read a handful of plain-English guardrails instead of a 300-line diff.
Test-backed, or an honest Gap
Every guardrail cites its test, or shows up as a Gap.
See the edge of the guarantee
The Boundary makes “out of scope” a visible decision.
Keep using Cursor or Claude Code. Guardline sits on top and tells you what still holds.
If you can describe it, you can ship it.
Shipping software you trust shouldn’t mean reading every diff.
Indie builders
Ship AI code fast without being its unpaid QA.
Small teams
“What we agreed” and “what shipped” become one thing.
Non-engineers
Build real, verified software in plain language.
Tests-first engineers
Keep red-green-refactor. Lose the typing tax.
I’m Dima. I run a few small SaaS apps, mostly built with AI.
One edit took down a login I’d shipped weeks earlier, and nothing flagged it. I heard about it from a user.
So I started building the tool I wish I’d had: write the guardrails down once, and trust they hold while you sleep. I’m doing it in the open.
You’re right to be skeptical.
Why trust the Reviewer? Can’t an AI hallucinate?
You don’t have to take its word for it. Every covered guardrail links to the exact test and assertion behind it, so you can check it in seconds. If nothing backs a rule, Guardline calls it a Gap instead of quietly passing it.
Isn’t this just BDD / Cucumber?
There’s no Gherkin to hand-write and no glue code to wire up. The AI suggests the behaviors and maps each one to a test you already have. Anything with no test becomes a Gap, not a silent pass. That’s the part BDD always missed.
Won’t writing behaviors be as hard as the code?
You don’t start from a blank page. The AI suggests them and you keep the ones that are right. Describing what you want is easier than building it.
Does it replace Cursor / Claude Code?
No. Keep your editor. Guardline runs on top of whatever you use, starting with JS/TS projects. Most setups make “green” mean the whole suite passed. Guardline makes green mean this exact behavior is backed by this exact test.
Performance, security, accessibility?
More of this is behavioral than you’d think: “p95 under 500 ms,” “works with just a keyboard,” “passwords never logged.” Whatever’s left, you put in the Boundary on purpose.
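As a rough illustration of how a performance rule becomes a checkable behavior, here is a sketch of the p95 guardrail. The `p95` helper and the sample latencies are assumptions for the example. It uses the nearest-rank definition of a percentile, which is one of several common ones.

```typescript
// Hypothetical helper: nearest-rank 95th percentile of latency samples, in ms.
function p95(samplesMs: number[]): number {
  if (samplesMs.length === 0) {
    throw new Error("p95 needs at least one sample");
  }
  // Sort a copy numerically so the caller's array is left untouched.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Nearest rank, 1-based. Integer arithmetic avoids floating-point
  // surprises from multiplying by 0.95.
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1];
}

// A guardrail check built on top of it.
function meetsLatencyBudget(samplesMs: number[], budgetMs: number): boolean {
  return p95(samplesMs) < budgetMs;
}
```

A test would record request timings, call `meetsLatencyBudget(timings, 500)`, and assert the result, which gives the rule the same kind of citable proof as a login redirect.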
Languages, launch, price?
JS/TS first, with more on the way. It’s not live yet, so join the waitlist. Free to start, and paid only once it’s doing real work for you.
What did an AI last break for you without telling you?
Be there when the first guardrail goes green.
Guardline isn’t live yet. Just one honest email the day you can point it at your own repo.