Data & Integrity

Data Modelling & Persistence

Intermediate

How you model the domain and how you talk to the store are two of the most important decisions in a system. A model that mirrors the real business makes correct code natural and wrong code awkward. A careful persistence layer keeps data access safe, fast, and tenant-isolated. Get these right and most data bugs are never written.

Data modelling is about capturing the domain truthfully, with the right entities, relationships, and boundaries, so the structure of the data carries meaning instead of fighting it. Persistence is the careful layer between that model and the database. It gives parameterised, tenant-scoped, efficient access that does not spread SQL concerns through the whole codebase.

We use Dapper. It gives precise control over SQL, but we must handle safety ourselves. So two things are non-negotiable: every query is parameterised, and every tenant-owned query carries its tenant predicate. The Finperiti audit is a reminder that an unparameterised query or a missing tenant filter is not a style issue. It is an injection hole or a cross-tenant data breach.

Model the domain honestly

DoModel entities, relationships, and boundaries to reflect the real domain, so the structure of the data expresses the business rules.
DoMake illegal states impossible to represent in the model where you can. Make required fields required, and model mutually-exclusive states as such.
DoKeep a clear mapping between domain types and storage. Do not let database quirks leak into the domain, or domain logic leak into SQL.
ConsiderSeparating read and write shapes where they truly differ. Use query DTOs that are distinct from the write model, instead of forcing one shape to do both.
Do notModel everything as loose sets of nullable fields and strings. A vague model makes every consumer vague too.

Persist safely and efficiently

AlwaysParameterise every query, with no exceptions. Values are parameters, never string-concatenated into SQL.
DoInclude the tenant predicate in every read, update, and delete of tenant-owned data. Enforce it server-side from the validated identity.
DoKeep data access in a dedicated layer (repositories or queries), not scattered as inline SQL through controllers and services.
DoFetch what you need and no more. Select explicit columns, page large result sets, and avoid one round trip per row (N+1) patterns.
ConsiderSet-based SQL instead of row-by-row loops for bulk work, and mapping results into typed models instead of passing raw rows around.
NeverBuild SQL by concatenating unvalidated input, or query tenant-owned data without enforcing the tenant predicate server-side.

Concatenation and no tenant scope

var sql = "SELECT * FROM Customers WHERE Email = '" + email + "'";
return conn.Query(sql);

SQL injection through the email value, and any tenant's customer is returned to any caller. Two of the most serious data-layer failures in one line.

Parameterised and tenant-scoped

const string sql = @"SELECT Id, Email, Status FROM Customers
  WHERE TenantId = @TenantId AND Email = @Email";
return conn.Query(sql, new { TenantId = ctx.TenantId, Email = email });

Input cannot alter the query, columns are explicit, and the row can only ever belong to the caller's tenant.

Self-review checklist

AskDoes the model reflect the domain, or am I bending the domain to fit a convenient table?
AskIs every query parameterised, with no value concatenated into SQL anywhere?
AskDoes every tenant-owned query carry its tenant predicate from the validated identity?
AskAm I fetching only what I need? Or am I pulling rows and columns I will throw away, or looping queries I could batch?

Why it matters: The data model shapes everything built on top of it, and the persistence layer is where the two worst data-layer failures live: injection and cross-tenant leakage. A faithful model plus careful, parameterised, tenant-scoped access prevents whole categories of bug and breach before they reach a code review.