Data & Integrity

Database Design & Schema

Intermediate

The schema outlives the application. Code gets rewritten and services get replaced, but the data and its shape stay for years and gather dependencies the whole time. Design the schema as the lasting contract it is: correct, constrained, and clear about what the data means. A bad schema is the most expensive mistake to undo.

A well-designed schema makes invalid states impossible to represent. It lets the database enforce rules the application would otherwise have to remember everywhere. Constraints, keys, and types are not red tape. They are the last line of defence for data integrity, and the one that holds even when the application code has a bug.

We run Dapper over SQL Server, with no ORM doing the thinking for us. This makes careful schema design more important, not less. There is no migrations-from-models safety net, so the schema is exactly what we write. For a multi-tenant AML platform, tenancy and integrity must be designed into the tables themselves.

Model for correctness

Integrity left to the app CREATE TABLE Payments (
Id uniqueidentifier,
Amount float,
Currency varchar(50),
CustomerId uniqueidentifier
);

Float loses pennies. There are no keys or constraints, no tenant column, and the currency is free text. Every rule about this data now lives only in code, and only until someone forgets it.

The database enforces the rules CREATE TABLE Payments (
Id uniqueidentifier PRIMARY KEY,
TenantId uniqueidentifier NOT NULL,
CustomerId uniqueidentifier NOT NULL REFERENCES Customers(Id),
Amount decimal(19,4) NOT NULL CHECK (Amount >= 0),
CurrencyCode char(3) NOT NULL,
CreatedUtc datetime2 NOT NULL
);
CREATE INDEX IX_Payments_Tenant_Customer ON Payments(TenantId, CustomerId);

Money is exact, the engine guarantees relationships and ranges, and tenant scoping is present and indexed.

Random GUID as the clustered key Id uniqueidentifier DEFAULT NEWID() PRIMARY KEY -- random
// app side: new Customer { Id = Guid.NewGuid() }

Each new row's key is random, so SQL Server inserts it into the middle of the clustered index. This causes endless page splits, heavy fragmentation, and a cache full of half-empty pages. On a busy table this quietly slows insert throughput.

Time-ordered UUIDv7 Id uniqueidentifier PRIMARY KEY -- value supplied by the app
// app side: new Customer { Id = Guid.CreateVersion7() }

UUIDv7 includes a timestamp, so values are roughly sequential. Inserts append to the end of the index like an identity column, while staying globally unique and safe to generate client-side (no round trip, no central sequence).

Design for change and scale

Self-review checklist

Why it matters: Data is the most lasting and most valuable thing we own, and the schema is the only place integrity can be guaranteed rather than hoped for. A schema that enforces correctness prevents bugs the application cannot, while a careless one quietly corrupts data for years before anyone notices. In regulated finance, corrupted data is a reporting failure waiting to happen.