Quality & Team

Test Data & Environments

Foundational

Good testing needs realistic data and trustworthy environments. But the easy shortcut, copying production data into dev or test, is one of the most common and serious causes of data breaches in our industry. Use synthetic or properly masked data. Keep environments separate. Make them match production in shape, not in the real customer records they hold.

Two things make tests meaningful: data that looks like the real thing (right shapes, edge cases, volumes) and environments that behave like production. The mistake newer engineers make is reaching for the most realistic data of all, a copy of production. That moves real customers' personal, KYC, and financial data into less-protected places. That is a GDPR breach and a security risk, with no exceptions.

The right approach: generate synthetic data, or mask and anonymise it when you genuinely need production-shaped data. Keep non-production environments isolated from production data and networks. Keep environments consistent, so a passing test means something.
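As an illustration of the synthetic route, here is a minimal sketch of a seeded generator using only the Python standard library. The field names, the `TEST-` prefix, and the name lists are assumptions made for the example, not a real schema. The point it shows is that generated records can carry awkward shapes (apostrophes, hyphens, accents, zero and very large balances) and stay reproducible from run to run.

```python
import random
from datetime import date, timedelta

# Hypothetical sample values, chosen to include awkward but realistic shapes.
FIRST_NAMES = ["Ana", "Bo", "Chidi", "Dmitri", "Élodie", "Fatima"]
LAST_NAMES = ["O'Neill", "Müller", "Nguyen", "Smith-Jones", "Li", "García"]


def synthetic_customers(count, seed=0):
    """Return `count` made-up customer records, reproducible for a given seed."""
    rng = random.Random(seed)  # local generator, so global randomness is untouched
    earliest = date(1930, 1, 1)
    span_days = (date(2005, 12, 31) - earliest).days
    customers = []
    for i in range(count):
        customers.append({
            # The prefix makes a synthetic record easy to recognise at a glance.
            "customer_id": f"TEST-{i:06d}",
            "first_name": rng.choice(FIRST_NAMES),
            "last_name": rng.choice(LAST_NAMES),
            "date_of_birth": earliest + timedelta(days=rng.randint(0, span_days)),
            # Mix edge-case balances (zero, tiny, huge) with ordinary ones.
            "balance_pence": rng.choice([0, 1, 99_999_999, rng.randint(0, 500_000)]),
        })
    return customers
```

Because the generator is seeded, a failing test can be rerun against exactly the same data, which supports the goal of environments where a passing test means something.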

Use safe, realistic data

Prod dump into dev

// restore last night's production backup into the dev database
// "so the bug reproduces with real data"

Every customer's personal, KYC, and financial data now sits in a low-protection dev environment that more people can access. That is a reportable breach, whatever the intent.
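One way to make this shortcut harder to take by accident is a guard at the top of whatever script performs restores. The sketch below is hypothetical: the environment names, the `masked` flag, and the `check_restore_allowed` function are assumptions for illustration, and a real setup would also rely on access controls rather than on a script check alone.

```python
class UnsafeRestoreError(Exception):
    """Raised when a restore would move raw production data out of production."""


def check_restore_allowed(source_env: str, target_env: str, masked: bool) -> None:
    """Refuse to restore an unmasked production backup into any other environment.

    `masked` is assumed to be set only by an approved masking step.
    """
    if source_env == "prod" and target_env != "prod" and not masked:
        raise UnsafeRestoreError(
            f"Refusing to restore unmasked production data into '{target_env}'. "
            "Use synthetic data or an approved masked extract instead."
        )
```

Calling `check_restore_allowed("prod", "dev", masked=False)` raises the error, while a masked extract or a restore within production passes through.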

Masked or synthetic

// seed dev with generated customers, or an approved masked extract:
// names -> fake, DOB -> shifted, doc numbers -> tokenised
// shapes and volumes preserved; no real identities

The data behaves like production for testing, but no real person's information is exposed.
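The three transformations named above (names to fake, DOB shifted, document numbers tokenised) can be sketched as follows. This is a minimal illustration under assumptions, not an approved masking tool: the record fields, the `FAKE_NAMES` list, and the `mask_customer` function are hypothetical. It uses a keyed HMAC so the same input always maps to the same masked output, which keeps joins across tables consistent.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical replacement names; none refers to a real person.
FAKE_NAMES = ["Alex Example", "Sam Sample", "Jo Placeholder", "Kim Testcase"]


def mask_customer(record: dict, secret: bytes) -> dict:
    """Return a masked copy of `record`; the input dict is left unchanged.

    Assumes the record has `name`, `date_of_birth` (a date) and `doc_number`.
    """
    digest = hmac.new(
        secret, record["doc_number"].encode("utf-8"), hashlib.sha256
    ).digest()

    masked = dict(record)  # other fields keep their shape and values
    # Name -> fake: pick a placeholder deterministically from the digest.
    masked["name"] = FAKE_NAMES[digest[0] % len(FAKE_NAMES)]
    # DOB -> shifted: move by 1 to 30 days, earlier or later, never zero.
    days = (digest[1] % 30) + 1
    sign = 1 if digest[2] & 1 else -1
    masked["date_of_birth"] = record["date_of_birth"] + timedelta(days=sign * days)
    # Doc number -> tokenised: a keyed token that cannot be reversed without the key.
    masked["doc_number"] = "tok_" + digest.hex()[:16]
    return masked
```

Keyed tokens like these are pseudonymous rather than anonymous, so the key itself has to stay out of non-production environments. Free-text fields and other identifiers would need their own treatment before an extract could count as masked.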

Keep environments trustworthy

Self-review checklist

Why it matters: Copying production data around is both the tempting shortcut and one of the most common real-world breaches. Sensitive records end up in places that were never secured for them. Synthetic or masked data, in isolated and consistent environments, gives us trustworthy testing without ever putting a real customer's data at risk.