When to Rebuild vs. Refactor: A Decision Framework for SaaS Teams

Every SaaS team eventually hits the moment where the codebase stops feeling like a tool and starts feeling like a constraint. Features that should take a week take a month. New engineers spend their first six weeks reading code. The phrase "we'll clean this up later" appears in 40 pull requests.

The question that follows — rebuild or refactor? — gets treated as philosophical. Most content defaults to "it depends," followed by generic guidance about team size and timeline. That's not useful when you're staring at a $3M ARR product and a codebase your lead engineer wants to burn down.

Here's a framework with specific signals.

Refactor Is the Right Default — Until It Isn't

The bias toward refactoring is correct. Joel Spolsky's 2000 essay "Things You Should Never Do, Part I," which argues against rewriting working software from scratch, still applies. You lose the embedded business logic that lived in code nobody documented. You incur months of feature freeze. You build a second system that needs to match the behavior of the first while also being better, which is harder than it sounds.

Engineers who want to rewrite are often the engineers who weren't there for the original decisions. Rewrites often fail because the team discovers, six months in, why certain things were built the way they were.

The default breaks when specific, measurable signals indicate that refactoring has a structural floor.

Signals That Indicate Refactor Won't Work

Performance floor due to architecture, not code quality. If your P99 response time is 1.8 seconds and the bottleneck is a synchronous ORM call chain that can't be made async without restructuring the entire data layer — that's a structural problem, not a dirty-code problem. Refactoring individual functions won't move the needle.

The threshold: if your critical path has more than three sequential blocking I/O operations and the business logic doesn't require that sequencing, the architecture is the constraint. Adding TypeScript types and cleaning up variable names won't fix it.
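
One way to test the "business logic doesn't require that sequencing" half of the threshold is to write down which I/O calls actually consume each other's results and compare the longest dependency chain with the number of calls that run in sequence today. The sketch below assumes a hypothetical IoStep shape and hypothetical step names; it is an illustration of the check, not a profiler.

```typescript
// One blocking I/O call on a request's critical path (hypothetical shape).
interface IoStep {
  name: string;
  dependsOn: string[]; // names of earlier steps whose results this step consumes
}

// Longest dependency chain: the number of sequential round trips the
// business logic truly requires. Steps must be listed in execution order.
function requiredDepth(steps: IoStep[]): number {
  const depth = new Map<string, number>();
  let max = 0;
  for (const step of steps) {
    let d = 1;
    for (const dep of step.dependsOn) {
      d = Math.max(d, (depth.get(dep) ?? 0) + 1);
    }
    depth.set(step.name, d);
    max = Math.max(max, d);
  }
  return max;
}

// True when more than `threshold` calls run one after another today,
// but the data dependencies would allow fewer sequential stages.
function sequencingIsIncidental(steps: IoStep[], threshold = 3): boolean {
  return steps.length > threshold && requiredDepth(steps) < steps.length;
}

// Hypothetical dashboard request: five sequential calls, but only the
// tenant lookup has to finish before the other four can start.
const dashboardSteps: IoStep[] = [
  { name: "tenant", dependsOn: [] },
  { name: "user", dependsOn: ["tenant"] },
  { name: "plan", dependsOn: ["tenant"] },
  { name: "usage", dependsOn: ["tenant"] },
  { name: "invoices", dependsOn: ["tenant"] },
];

console.log(requiredDepth(dashboardSteps), sequencingIsIncidental(dashboardSteps));
```

If the required depth is much smaller than the number of sequential calls and the data layer cannot express that concurrency without being restructured, the latency floor comes from the architecture rather than from any individual function.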

Type system debt that's load-bearing. There's a specific failure mode in codebases built in JavaScript and then partially converted to TypeScript: any is used not as a temporary shortcut but as a load-bearing part of the structure. Downstream code was written against whatever shape the any value happened to have at runtime, so typing it correctly can require auditing 200 downstream call sites.

Concrete signal: run grep -r ": any" --include="*.ts" --exclude-dir=node_modules . | wc -l from the root of your codebase (add --include="*.tsx" if you have React components). If the count exceeds 5% of total TypeScript lines and is concentrated in core domain logic, not just in API boundary types, you have structural type debt, not cosmetic debt.
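
The same measurement can be sketched as a small function that takes file contents and returns the ratio directly. The function name, the report shape, and the choice to skip blank lines when counting the total are assumptions made for illustration; the 5% threshold is the one stated above. It does not tell you where the any usages are concentrated, so that part still needs a per-directory breakdown.

```typescript
interface AnyDebtReport {
  anyLines: number;   // lines containing a ": any" annotation
  totalLines: number; // non-blank lines across all sources
  ratio: number;      // anyLines / totalLines, or 0 for empty input
}

// Counts lines with an explicit ": any" annotation, mirroring the grep
// pattern. Blank lines are excluded from the total (an assumption).
function measureAnyDebt(sources: string[]): AnyDebtReport {
  let anyLines = 0;
  let totalLines = 0;
  for (const source of sources) {
    for (const line of source.split("\n")) {
      if (line.trim() === "") continue;
      totalLines += 1;
      if (/: any\b/.test(line)) anyLines += 1;
    }
  }
  const ratio = totalLines === 0 ? 0 : anyLines / totalLines;
  return { anyLines, totalLines, ratio };
}

// Applies the 5% threshold from the text.
function isStructuralTypeDebt(report: AnyDebtReport, threshold = 0.05): boolean {
  return report.ratio > threshold;
}
```

In practice you would feed it the contents of the files under your core domain directories separately from the API boundary code and compare the two ratios.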

ORM mismatch with your access patterns. If you're on Prisma or Sequelize but your access patterns require complex aggregations, window functions, or CTEs, and you're patching N+1 problems with deeply nested include options, you're working around the tool rather than with it. Each workaround adds surface area for bugs.

When more than 30% of your production queries use $queryRaw or the equivalent escape hatch, the ORM is no longer serving your access patterns. Fixing this usually means rebuilding the data layer, which typically means rebuilding the surrounding application logic too.
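
A rough way to get that percentage is to log the ORM operation name for each production query and compute the share that went through a raw method. The sketch below assumes you already have such a log as a list of operation names; the method names in RAW_METHODS are Prisma's raw query methods, and the function names are hypothetical.

```typescript
// Prisma client methods that bypass the query builder and send raw SQL.
const RAW_METHODS = new Set([
  "$queryRaw",
  "$queryRawUnsafe",
  "$executeRaw",
  "$executeRawUnsafe",
]);

// Fraction of logged operations that used a raw escape hatch.
function rawQueryShare(operations: string[]): number {
  if (operations.length === 0) return 0;
  const raw = operations.filter((op) => RAW_METHODS.has(op)).length;
  return raw / operations.length;
}

// Applies the 30% threshold from the text.
function ormIsMismatched(operations: string[], threshold = 0.3): boolean {
  return rawQueryShare(operations) > threshold;
}

// Hypothetical sample of logged operations: 2 of 5 are raw.
const sampleLog = ["findMany", "$queryRaw", "findUnique", "$queryRaw", "create"];
console.log(rawQueryShare(sampleLog), ormIsMismatched(sampleLog));
```

Weighting by query volume rather than by distinct call sites is a design choice: a single raw reporting query that runs once a night matters less than a raw query on every page load.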

No tests and the original authors have left. A zero-test codebase where the original authors are gone is a different risk profile than a messy-but-tested codebase with institutional knowledge. You can't refactor safely without a test harness, and building one requires understanding behavior that's currently undocumented. Sometimes that cost exceeds a targeted rewrite.
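
One common way to start a harness for undocumented behavior is a characterization test: record what the existing code returns for a set of inputs, then check any changed version against that recording. This technique is not described in the text above; it is offered here as one possible starting point, and the pricing functions are hypothetical stand-ins for legacy code.

```typescript
interface Mismatch<I, O> {
  input: I;
  expected: O; // what the legacy code returned
  actual: O;   // what the candidate returned
}

// Runs the existing code over sample inputs and records its outputs as-is.
function recordBehavior<I, O>(
  fn: (input: I) => O,
  inputs: I[],
): Array<{ input: I; output: O }> {
  return inputs.map((input) => ({ input, output: fn(input) }));
}

// Replays the recorded inputs against a candidate and lists every difference.
function findMismatches<I, O>(
  candidate: (input: I) => O,
  recorded: Array<{ input: I; output: O }>,
): Mismatch<I, O>[] {
  const mismatches: Mismatch<I, O>[] = [];
  for (const { input, output } of recorded) {
    const actual = candidate(input);
    if (JSON.stringify(actual) !== JSON.stringify(output)) {
      mismatches.push({ input, expected: output, actual });
    }
  }
  return mismatches;
}

// Hypothetical legacy rule nobody documented: seats above 100 cost half.
function legacySeatPrice(seats: number): number {
  return seats <= 100 ? seats * 10 : 1000 + (seats - 100) * 5;
}

// A "cleaned up" version that silently drops that rule.
function refactoredSeatPrice(seats: number): number {
  return seats * 10;
}

const baseline = recordBehavior(legacySeatPrice, [1, 100, 101, 250]);
console.log(findMismatches(refactoredSeatPrice, baseline));
```

The recording only covers the inputs you thought to try, which is exactly the cost the paragraph above describes: choosing good inputs requires some understanding of behavior nobody wrote down.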

Signals That Favor Refactor

The business logic is correct and well-understood. If your pricing logic, tenant isolation rules, and data model are sound — even if the implementation is messy — refactoring is almost always cheaper than rewriting. The business logic is the hard part. If it's already right, preserve it.

The team knows the codebase. Engineers who've been in a codebase for two years have absorbed thousands of micro-decisions that don't appear in architecture diagrams. A rewrite discards that. If your team is stable and has that context, a structured refactor plan preserves the capital.

The performance problem is query-level, not architecture-level. Slow pages caused by missing indexes, unoptimized JOINs, or queries loading 10,000 rows when they should load 20 are fixable without a rebuild. A technical debt audit distinguishes architectural debt from execution debt. P99 above 500ms is actionable — but the action depends on where in the stack the time is spent.

You have a clear modular seam. If the problem is isolated to one service, one domain, or one integration layer, extract and rebuild that component without touching the rest. The Strangler Fig pattern applies: build the new version alongside the old, route traffic to it, deprecate the old. This is a component rebuild, not a system rebuild.
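
The routing step of the Strangler Fig pattern can be sketched as a wrapper that sends a fixed percentage of tenants to the new component and everyone else to the old one. The Handler type, the hash, and the per-tenant bucketing are assumptions chosen for illustration; real deployments often do this at the load balancer or with a feature-flag service instead.

```typescript
type Handler = (tenantId: string) => string;

// Stable hash so a given tenant always lands in the same bucket (0 to 99).
function bucketOf(tenantId: string): number {
  let hash = 0;
  for (let i = 0; i < tenantId.length; i++) {
    hash = (hash * 31 + tenantId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// Routes tenants whose bucket falls below rolloutPercent to the replacement.
function stranglerRoute(
  legacy: Handler,
  replacement: Handler,
  rolloutPercent: number,
): Handler {
  return (tenantId) =>
    bucketOf(tenantId) < rolloutPercent
      ? replacement(tenantId)
      : legacy(tenantId);
}

// Hypothetical handlers for one extracted component.
const legacyBilling: Handler = (id) => `legacy:${id}`;
const newBilling: Handler = (id) => `new:${id}`;

// Start with roughly 10% of tenants on the new component.
const billing = stranglerRoute(legacyBilling, newBilling, 10);
console.log(billing("tenant-42"));
```

Raising rolloutPercent toward 100 moves traffic over gradually, and setting it back to 0 is the rollback. Once it has stayed at 100 without incident, the legacy handler can be deleted.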

The Decision Matrix

Run through these questions before committing to either path:

Can you measure the performance floor of refactoring? If P99 will still exceed your target after removing all the easy wins, the architecture is the constraint — not the implementation.
Is the business logic correct but the implementation broken? Favor refactor.
Is the implementation technically sound but the business logic wrong or incomplete? Favor rewrite — you'll be changing it anyway.
Does the codebase have type coverage above 80% and test coverage above 60% on critical paths? If yes, refactoring is tractable. Below those numbers with no institutional knowledge, recalculate.
Can you scope this to a component rebuild rather than a system rebuild? If so, prefer that over a full rewrite.
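
The questions above can be written down as a small function so the team argues about the inputs rather than the conclusion. This is a sketch only: the field names are hypothetical, and the order in which the questions are applied is an assumption, since the list does not rank them. The 80% and 60% thresholds are the ones given above.

```typescript
interface CodebaseSignals {
  p99FloorExceedsTarget: boolean;     // still over target after the easy wins
  businessLogicCorrect: boolean;
  typeCoverage: number;               // 0 to 1
  criticalPathTestCoverage: number;   // 0 to 1
  hasInstitutionalKnowledge: boolean;
  isolatedToComponent: boolean;       // a clear modular seam exists
}

type Recommendation = "refactor" | "component-rebuild" | "rewrite" | "recalculate";

function recommend(s: CodebaseSignals): Recommendation {
  // An architectural floor or wrong business logic points toward rebuilding;
  // scope it to a component whenever a seam exists.
  if (s.p99FloorExceedsTarget || !s.businessLogicCorrect) {
    return s.isolatedToComponent ? "component-rebuild" : "rewrite";
  }
  // Coverage thresholds from the list above.
  const tractable = s.typeCoverage > 0.8 && s.criticalPathTestCoverage > 0.6;
  // Low coverage with nobody who knows the code: the refactor estimate is
  // not trustworthy, so re-cost it before committing.
  if (!tractable && !s.hasInstitutionalKnowledge) return "recalculate";
  return "refactor";
}
```

Treat the output as a prompt for discussion. A "recalculate" result in particular means the inputs are too uncertain to decide, not that a rewrite is justified.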

What a Rebuild Actually Costs

The optimistic estimate for a full application rewrite at the $2M–$5M ARR stage is 4–6 months. The realistic estimate — accounting for replicating behavior you don't fully understand — is 8–14 months. During that window you're maintaining two codebases, the feature roadmap is frozen, and the team is under pressure from a project that always feels behind schedule.

If the rebuild decision is correct, those costs are worth it. If it isn't, you've spent a year and landed in a different set of problems.

The decision deserves more rigor than a whiteboard session. If you want an outside assessment of your specific codebase, the architecture diagnostic surfaces these signals with measurements rather than intuition.

For the upstream decisions that prevent this conversation from happening in the first place, see web architecture decisions that scale.