Technical Debt in Mission-Critical Health Systems: When to Pay It Down and When to Live With It

After five years, TIBU Health has significant technical debt. Our EHR has architectural decisions from 2020 that don’t scale well at our current patient volumes, and our database has grown organically in ways that made sense at the time but create friction now. And yet we’re serving 60,000+ patients reliably. Managing debt is about knowing when to pay it down and when to wait.

When to pay down technical debt immediately

1. Security vulnerabilities

Debt that creates security risks - unpatched libraries, weak encryption, exposed PII - gets fixed immediately. Healthcare data breaches are catastrophic and often irreversible.

  • Example: We paused all feature work for two weeks to upgrade an outdated auth library that had a known vulnerability. It wasn’t a popular decision at the time, but it was the right one.

2. Compliance blockers

Debt that prevents us from meeting legal requirements - like Kenya’s Data Protection Act - gets prioritised above almost everything else. Non-compliance can shut operations down, and regulators don’t care that we were busy shipping features.

3. Critical system instability

If debt is causing downtime or making core features unreliable - EHR, payments, the queue - it needs to be fixed, even at the cost of feature work.

  • Example: We spent a month rewriting our queue management system to fix a race condition that caused a weekly crash. Feature development stopped entirely. It was the right call.

When to live with technical debt (for now)

  • Non-critical UI polish: If the workflow is functional, cosmetic improvements can wait.
  • Redundant data structures: If sync logic works reliably, refactoring data models mid-scale is often riskier than the debt itself.
  • Legacy integrations: We have a stable adapter for a partner that delivers results as PDF attachments by email. It’s ugly, but it works. Touching it introduces more risk than it removes.
  • Low-usage features: Features with messy underlying code that fewer than 5% of users touch can linger. They’re not hurting anyone.

How we decide

We use four questions to prioritise:

  1. Does this create security or compliance risk? → Fix immediately
  2. Does this cause system instability? → Fix soon
  3. Does this meaningfully slow down development? → Prioritise next quarter
  4. Is this annoying but stable? → Document and live with it
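
The four questions work as an ordered checklist: the first "yes" decides the outcome. A minimal Python sketch of that ordering is below. The names `DebtItem`, `Action`, and `triage` are hypothetical and exist only for illustration, and the sketch assumes that anything answering "no" to the first three questions counts as "annoying but stable".

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FIX_IMMEDIATELY = "Fix immediately"
    FIX_SOON = "Fix soon"
    NEXT_QUARTER = "Prioritise next quarter"
    DOCUMENT = "Document and live with it"


@dataclass
class DebtItem:
    # Hypothetical record: one yes/no answer per question.
    name: str
    security_or_compliance_risk: bool = False
    causes_instability: bool = False
    slows_development: bool = False


def triage(item: DebtItem) -> Action:
    """Ask the questions in order; the first 'yes' wins."""
    if item.security_or_compliance_risk:
        return Action.FIX_IMMEDIATELY
    if item.causes_instability:
        return Action.FIX_SOON
    if item.slows_development:
        return Action.NEXT_QUARTER
    # Assumption: nothing above applies, so the item is annoying but stable.
    return Action.DOCUMENT
```

Because the checks run top to bottom, an item that is both a security risk and a drag on development is still triaged as "Fix immediately".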

Managing debt strategically

  • Maintain a register: We track every significant piece of debt with its impact, estimated fix cost, and priority. Debt that nobody is tracking accumulates faster than debt everyone can see.
  • Allocate sprint time: We protect roughly 20% of each sprint for refactoring and debt reduction. When it’s not explicitly allocated, it doesn’t happen.
  • Prevent new debt deliberately: Code reviews flag shortcuts, and those shortcuts get documented when they’re intentional - so future engineers know they’re looking at a known trade-off, not a mistake someone forgot about.
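
To make the register and the sprint allocation concrete, here is a rough Python sketch of how the two could fit together. It is not our actual tooling: `RegisterEntry`, `plan_debt_work`, the day-based cost unit, and the greedy selection are all assumptions made for illustration. Only the tracked fields (impact, estimated fix cost, priority), the intentional-shortcut flag, and the roughly 20% share come from the practices above.

```python
from dataclasses import dataclass


@dataclass
class RegisterEntry:
    # Hypothetical shape for one row of a debt register.
    title: str
    impact: str
    fix_cost_days: int        # estimated fix cost, in engineer-days (assumed unit)
    priority: int             # 1 = highest priority (assumed convention)
    intentional: bool = False  # True when the shortcut was a documented trade-off


def plan_debt_work(register, sprint_capacity_days, debt_share=0.20):
    """Pick register entries that fit inside the protected share of a sprint.

    Greedy by priority, then by cost: an entry is skipped if it no longer
    fits in the remaining budget, and cheaper lower-priority work may
    still be selected after it.
    """
    budget = sprint_capacity_days * debt_share
    selected = []
    for entry in sorted(register, key=lambda e: (e.priority, e.fix_cost_days)):
        if entry.fix_cost_days <= budget:
            selected.append(entry)
            budget -= entry.fix_cost_days
    return selected
```

With a sprint capacity of 50 engineer-days, the default share protects 10 days for debt work. A greedy pass like this is deliberately simple. A real planner would also need to handle fixes too large for a single sprint, which this sketch just skips.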