Database Design Doesn't Start With Tables. It Starts With Consequences
Here's what most backend engineers get wrong about schema design and how to fix it before you hit production. (6 min)
This post is an introduction to database design, a foundational skill for backend engineers: why it matters, and how to get it right before it's too late.
Imagine a team launches a new product. Traffic grows. Everything’s fine.
Then one day, a query takes 8 seconds instead of 80ms. Engineers start adding indexes. Caching layers. Read replicas.
None of it helps enough.
The problem wasn't performance. It was bad database design. And by the time they realized it, they had 200 million rows in production and a Slack channel dedicated to database incidents.
Database decisions are long-term bets.
Get them wrong early, and you'll spend years firefighting.
Get them right, and your system scales quietly while you build features.
Your Schema Will Outlive Your Code. Plan Accordingly.
A careless column type. A missing foreign key constraint. An unnecessary denormalization because “it might be faster”.
These choices feel small when you're building v1. But databases are permanent in ways code isn't.
You can refactor a service in a weekend.
A schema migration on a table with 500 million rows in PostgreSQL?
That's a multi-month project involving table locks, replica lag monitoring, and very careful coordination with the ops team, who definitely don't want to hear from you at 2 AM.
The teams that win long-term treat table design with the same respect they give code architecture. Maybe more, because you can't just roll back a schema change when 50 million user records depend on it.
Start With Reality, Not With Tables.
Most engineers rush straight to CREATE TABLE before understanding what they're actually modeling.
Don't do this.
Spend real time mapping the business domain.
What entities exist in the real world? What relationships matter to the business? What constraints does reality enforce that your database should mirror?
If you’re building a job board, understand that a job posting isn’t just a row. It has a lifecycle: draft, published, filled, and expired. It belongs to a company. It attracts applications. Each application connects a candidate to a posting.
The database should enforce what the domain requires.
If your schema allows states that can't exist in reality, eventually someone will put your data into one of those states.
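To make that concrete, here's a minimal sketch of such a schema in PostgreSQL. The table names, columns, and status values are illustrative, not a prescription:

```sql
CREATE TABLE companies (
    id   BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE job_postings (
    id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    company_id BIGINT NOT NULL REFERENCES companies (id),
    title      TEXT NOT NULL,
    -- The lifecycle from the domain, enforced by the database itself:
    status     TEXT NOT NULL
        CHECK (status IN ('draft', 'published', 'filled', 'expired'))
);

CREATE TABLE applications (
    id           BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    posting_id   BIGINT NOT NULL REFERENCES job_postings (id),
    candidate_id BIGINT NOT NULL, -- would reference a candidates table, omitted here
    UNIQUE (posting_id, candidate_id) -- one application per candidate per posting
);
```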
Normalization Is Your Friend (Until It Isn’t)
Normalization means organizing data to eliminate redundancy and dependency issues.
Third normal form (3NF) is the target: every non-key column depends on the key, the whole key, and nothing but the key.
In practice, this means that a user's data is in one table, addresses in another, and orders in a third. Then, we join them when needed.
This prevents update anomalies. We change a user’s email once, not in seventeen places. We can also safely delete an order without losing customer data.
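Concretely, assuming users, addresses, and orders tables with the obvious columns and foreign keys:

```sql
-- The email lives in exactly one place, so we update it once:
UPDATE users SET email = 'new@example.com' WHERE id = 42;

-- And we join the pieces back together only when a query needs them:
SELECT u.email, a.city, o.total
FROM users u
JOIN addresses a ON a.user_id = u.id
JOIN orders    o ON o.user_id = u.id
WHERE u.id = 42;
```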
But over-normalization can destroy read performance.
If your application shows “user profile with last 10 orders” on every page load, and that requires five joins across normalized tables, you’re paying a cost.
A good rule of thumb is:
Let query patterns guide your final design.
Start normalized.
Denormalize deliberately when you have real performance data.
To mitigate this, we might add a materialized view that flattens the query, and refresh it when the underlying data changes, on a schedule or after relevant writes.
Or we might reach for a different tool entirely, depending on the business case.
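A minimal sketch in PostgreSQL, assuming an orders table with user_id and created_at columns:

```sql
-- Flatten "user profile with last 10 orders" into one precomputed relation.
CREATE MATERIALIZED VIEW user_recent_orders AS
SELECT user_id, id AS order_id, total, created_at
FROM (
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY o.user_id
                              ORDER BY o.created_at DESC) AS rn
    FROM orders o
) ranked
WHERE rn <= 10;

-- A unique index lets REFRESH ... CONCURRENTLY run without blocking readers.
CREATE UNIQUE INDEX ON user_recent_orders (user_id, order_id);

-- PostgreSQL does not refresh materialized views automatically;
-- run this on a schedule or after relevant writes:
REFRESH MATERIALIZED VIEW CONCURRENTLY user_recent_orders;
```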
Denormalization is a tool. Use it when the tradeoff is clear.
Indexes: The Contract You Make With Future You
An index is a data structure that makes lookups fast by avoiding full table scans.
Without an index on users.email, a query like WHERE email = 'user@example.com' checks every row. With an index, it jumps directly to the match.
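For example, in PostgreSQL (the index name is arbitrary):

```sql
-- A B-tree index on the lookup column; making it UNIQUE also
-- enforces one account per email address.
CREATE UNIQUE INDEX idx_users_email ON users (email);

-- This lookup can now use an index scan instead of checking every row:
SELECT id, email FROM users WHERE email = 'user@example.com';
```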
But indexes aren’t free.
Every index slows down writes. The database must update the index on every INSERT, UPDATE, or DELETE. Indexes consume disk space. And poorly chosen indexes bloat over time, requiring maintenance.
The strategy I follow:
Index foreign keys and columns used in WHERE, JOIN, and ORDER BY clauses
Avoid indexing columns with low cardinality (e.g., boolean flags)
Use composite indexes when queries filter on multiple columns together
Monitor and rebuild indexes periodically to prevent fragmentation
Hint: Use pg_stat_user_indexes to monitor index usage in PostgreSQL.
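For example, a composite index for a common two-column filter, and a query against pg_stat_user_indexes to spot indexes that are never used (table and index names are illustrative):

```sql
-- Composite index for queries that filter on user_id and sort by created_at.
-- Column order matters: the leading column should match your WHERE clause.
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);

-- Indexes that have never been scanned are candidates for removal:
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;
```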
Index deliberately.
Measure the impact using EXPLAIN ANALYZE.
Remove what doesn't serve you.
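For example:

```sql
-- Executes the query and reports the actual plan, row counts, and timings.
-- A 'Seq Scan' on a large table where you expected an 'Index Scan' is the
-- usual red flag.
EXPLAIN ANALYZE
SELECT id, email FROM users WHERE email = 'user@example.com';
```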
Constraints Are The Best Documentation You’ll Ever Write
Foreign keys. NOT NULL. UNIQUE. CHECK constraints.
These aren’t optional guardrails. They’re how you encode business rules into the database itself.
A foreign key constraint ensures referential integrity. If orders.user_id references users.id, you can’t have an orphaned order pointing to a deleted user.
A NOT NULL constraint guarantees a value exists. No defensive null checks in application code.
A CHECK constraint validates data at write time. Age must be positive. Status must be one of five valid values.
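All four together, in a sketch with illustrative names and rules:

```sql
CREATE TABLE users (
    id    BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,   -- a value must exist, and no duplicates
    age   INTEGER CHECK (age > 0) -- reality: ages are positive
);

CREATE TABLE orders (
    id      BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES users (id), -- no orphaned orders
    status  TEXT NOT NULL CHECK (
        status IN ('pending', 'paid', 'shipped', 'delivered', 'cancelled')
    )                                              -- one of five valid values
);
```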
This matters because constraints catch bugs before they corrupt data.
Constraints add minor write overhead, maybe 1-2%, or sometimes less. But they give you guaranteed correctness.
If your database allows invalid data, eventually you’ll have invalid data.
Performance Is A Design Choice, Not a Tuning Exercise
I’ve seen teams spend weeks chasing query plans, adding caches, and tuning connection pools.
Most of the time, the real problem is the schema.
A single redesign could fix the issue forever: moving a frequently joined column into the main table, splitting a bloated table into hot and cold storage, adding a covering index, or partitioning by time range.
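For instance, partitioning by time range in PostgreSQL looks like this (names and ranges are illustrative):

```sql
-- Declarative range partitioning: queries that filter on created_at
-- only touch the partitions they need.
CREATE TABLE events (
    id         BIGINT GENERATED ALWAYS AS IDENTITY,
    payload    JSONB,
    created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2025_01 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
```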
Here's what makes database design genuinely hard: you don't know the full business context at the start. Requirements evolve. Query patterns change.
The answer isn’t perfection upfront. It’s about designing for change.
Keep things simple. Use clear naming conventions. Avoid premature optimization. And critically, don't be afraid to migrate the schema when you learn something new about your domain or your access patterns.
A well-designed schema can handle 10x growth with minor adjustments. A poorly designed one requires rewrites.
What I Wish Someone Had Told Me
If I could give my younger self advice, I'd say this:
Practice designing schemas for real systems. Pick LinkedIn, Twitter, or Airbnb. Map the entities. Draw the relationships. Think through the queries. Do it on paper before you touch a database.
Learn SQL deeply. Not just SELECT and JOIN. Window functions. CTEs. Query plans. Execution order. Understand what the database does with your query.
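For instance, a CTE feeding a window function (assuming the orders table from earlier):

```sql
-- The CTE scopes the data; the window function computes a
-- per-user running total across it.
WITH recent_orders AS (
    SELECT user_id, created_at, total
    FROM orders
    WHERE created_at >= now() - interval '30 days'
)
SELECT user_id,
       created_at,
       total,
       SUM(total) OVER (PARTITION BY user_id ORDER BY created_at) AS running_total
FROM recent_orders;
```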
Start simple, but extensible. Don't build for scale you don't have. But leave room to grow. Design your schema so that adding columns or splitting tables doesn't require a full rewrite.
Measure everything. You can't optimize what you don't measure. Track query times. Index usage. Table sizes. Slow query logs.
The best engineers I know treat database design like architecture. They think through the consequences of their decisions. They plan for failure modes. They build systems that last, not just systems that work.
And they're never ashamed to run a migration when they learn something new.
Because databases don't forget your mistakes, but they do forgive your improvements.
Follow me on LinkedIn | Twitter(X) | Threads
Thank you for supporting this newsletter.
Consider sharing this post with your friends and get rewards.
You are the best! 🙏
Press the Like button if you found this post helpful


