Reporting & Data Exports
Reports and exports are how data leaves the safety of the app, into spreadsheets, downloads, dashboards, and other systems. That makes them a quiet source of two problems: leaking more personal data than intended, and hurting performance by pulling huge datasets. Export with care: the right data, to the right person, sized to handle real volumes.
An export is a deliberate copy of data into a less-controlled place, so it needs the same care as any other data handling: who is allowed to run it, what it includes, and where it ends up. Reports also tend to aggregate across many rows, which is exactly where unbounded queries and N+1 patterns cause outages. And in a multi-tenant system, a report that forgets its tenant filter is a cross-tenant breach delivered as a downloadable file.
This topic connects to Data Modelling & Persistence (efficient queries), Multi-Tenancy (scoping), Data Classification and Data Masking & Redaction (what is safe to include), and Privacy & Data Protection (lawful basis, minimisation).
Export the right data, safely
- Always: Scope every report and export to the requester's tenant and authorisation. A report must never include data the requester is not entitled to (see Multi-Tenancy, Authentication & Authorization).
- Do: Include only the fields needed for the purpose, and mask or omit sensitive ones by default (see Data Masking & Redaction, Data Classification).
- Do: Treat the export as personal-data processing. Lawful basis, minimisation, and retention apply to the generated file too (see Privacy & Data Protection).
- Do: Control and audit who can run and download which reports, especially bulk or PII-heavy ones (see Audit Trails).
- Never: Generate an export containing personal or special-category data without access control, tenant scoping, and a clear view of where the file goes next (see Handling; exports are an easy route to a breach).
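The scoping, minimisation, and masking rules above can be sketched as a single projection step. This is a minimal illustration, not a prescribed implementation: the `Customer` record, the `ExportProjection` class, the field names, and the masking rule are all hypothetical, and the in-memory filter stands in for what would normally be a tenant-filtered database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for illustration; a real schema will differ.
public record Customer(int Id, string TenantId, string Name, string Email, string NationalId);

public static class ExportProjection
{
    // Allow-list of exportable fields. Anything not listed here
    // (for example NationalId) can never appear in the export.
    private static readonly Dictionary<string, Func<Customer, string>> Exportable = new()
    {
        ["Id"] = c => c.Id.ToString(),
        ["Name"] = c => c.Name,
        ["Email"] = c => MaskEmail(c.Email), // masked by default
    };

    // Keeps the first character and the domain, hides the rest of the local part.
    public static string MaskEmail(string email)
    {
        int at = email.IndexOf('@');
        return at <= 0 ? "***" : email[0] + "***" + email.Substring(at);
    }

    // Filters to the requester's tenant first, then emits only the
    // requested fields, rejecting any field that is not on the allow-list.
    public static IEnumerable<IReadOnlyList<string>> Project(
        IEnumerable<Customer> source,
        string requesterTenantId,
        IReadOnlyList<string> requestedFields)
    {
        var unknown = requestedFields.Where(f => !Exportable.ContainsKey(f)).ToList();
        if (unknown.Count > 0)
            throw new ArgumentException("Field not exportable: " + string.Join(", ", unknown));

        return source
            .Where(c => c.TenantId == requesterTenantId)
            .Select(c => (IReadOnlyList<string>)requestedFields
                .Select(f => Exportable[f](c))
                .ToList());
    }
}
```

The allow-list makes omission the default: a new sensitive column added to the table stays out of every export until someone deliberately adds it to `Exportable`.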
Make reports scale
- Do: Run large reports and exports off the request path (a background job) and stream the results instead of loading everything into memory (see Background Jobs, Performance & Resource Use).
- Do: Use efficient, set-based, bounded queries. Paginate or stream, select only the needed columns, index for the report's access pattern, and avoid N+1 (see Data Modelling & Persistence).
- Consider: Read replicas or a separate reporting store for heavy analytical queries, so reporting load does not slow the live system.
- Consider: Delivering large exports as a generated file (for example, to secure storage with a signed link) instead of a synchronous download (see File & Blob Storage).
- Avoid: Building a report by loading a whole table or looping per row. It works on dev data but causes an outage on a large tenant.
var all = db.Query("SELECT * FROM Customers"); // no tenant, no paging
return Csv(all); // full PII, every tenant, in one download
This is a cross-tenant breach and a memory or timeout failure in one: every customer of every tenant, with all columns including PII, is loaded at once into a file that anyone who can reach the endpoint can download.
// background job, streamed, tenant-scoped, columns limited, access-checked
await foreach (var row in db.StreamReport(tenantId, fields, paged))
    writer.Write(Mask(row));
// delivered as a signed-URL file; the download is audited
Only the requester's tenant and the needed (masked) fields are included. Rows are streamed, so memory use stays bounded however large the tenant is, and the download is access-controlled and audited.
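One way the streamed, bounded read could look is keyset pagination feeding an async stream that is written out row by row. This is a sketch under stated assumptions: `OrderRow`, `FetchPage`, and `StreamedExport` are hypothetical names, the page fetcher is assumed to return rows for one tenant ordered by a positive, increasing `Id`, and the CSV escaping is deliberately minimal.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Threading.Tasks;

// Hypothetical row type for illustration.
public record OrderRow(long Id, string TenantId, string Reference, decimal Total);

public static class StreamedExport
{
    // Hypothetical page fetcher: returns up to pageSize rows for the tenant
    // with Id > afterId, ordered by Id (a bounded, index-friendly query).
    public delegate Task<IReadOnlyList<OrderRow>> FetchPage(string tenantId, long afterId, int pageSize);

    // Keyset pagination: each query is bounded and only one page is in memory at a time.
    // Assumes Ids are positive, so 0 is a safe starting cursor.
    public static async IAsyncEnumerable<OrderRow> StreamRows(
        FetchPage fetch, string tenantId, int pageSize = 1000)
    {
        long afterId = 0;
        while (true)
        {
            var page = await fetch(tenantId, afterId, pageSize);
            if (page.Count == 0) yield break;
            foreach (var row in page) yield return row;
            afterId = page[page.Count - 1].Id;
        }
    }

    // Writes rows as they arrive and returns how many were written.
    public static async Task<int> WriteCsvAsync(IAsyncEnumerable<OrderRow> rows, TextWriter writer)
    {
        await writer.WriteLineAsync("Id,Reference,Total");
        int count = 0;
        await foreach (var r in rows)
        {
            await writer.WriteLineAsync(string.Join(",",
                r.Id.ToString(CultureInfo.InvariantCulture),
                Escape(r.Reference),
                r.Total.ToString(CultureInfo.InvariantCulture)));
            count++;
        }
        return count;
    }

    // Minimal CSV escaping: quote fields containing a comma, quote, or line break.
    private static string Escape(string s) =>
        s.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0
            ? "\"" + s.Replace("\"", "\"\"") + "\""
            : s;
}
```

Keyset pagination is chosen here over offset paging because each page query stays cheap even deep into a large tenant's data; in a background job the `TextWriter` would typically wrap a stream to file or blob storage.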
Self-review checklist
- Ask: Is this report or export scoped to the requester's tenant and entitlements?
- Ask: Does it include more personal data than the purpose needs? Could fields be masked or dropped?
- Ask: Will it handle a large tenant's data, or is it unbounded or N+1?
- Ask: Is access to running and downloading it controlled and audited?
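For the last question, the signed-link delivery mentioned under "Make reports scale" can be sketched as an expiring HMAC signature bound to one export and one user. This is an illustrative assumption rather than a required design: `SignedLink`, its parameters, and the `|`-delimited payload are hypothetical, identifiers are assumed not to contain `|`, and many blob stores offer their own signed-URL feature that would normally be preferred.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class SignedLink
{
    // Signs "exportId|userId|expiresUnix" with HMAC-SHA256 and returns the MAC as hex.
    // Assumes exportId and userId never contain '|'.
    public static string Sign(byte[] key, string exportId, string userId, long expiresUnix)
    {
        string payload = string.Join("|",
            exportId, userId, expiresUnix.ToString(CultureInfo.InvariantCulture));
        using var hmac = new HMACSHA256(key);
        return Convert.ToHexString(hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
    }

    // Accepts the link only if it has not expired and the signature matches
    // for this exact export, user, and expiry.
    public static bool Verify(
        byte[] key, string exportId, string userId, long expiresUnix,
        string signatureHex, long nowUnix)
    {
        if (nowUnix > expiresUnix) return false;

        byte[] actual;
        try { actual = Convert.FromHexString(signatureHex); }
        catch (FormatException) { return false; }

        byte[] expected = Convert.FromHexString(Sign(key, exportId, userId, expiresUnix));
        // Constant-time comparison to avoid leaking how much of the signature matched.
        return CryptographicOperations.FixedTimeEquals(expected, actual);
    }
}
```

Because the user identifier is part of the signed payload, the download handler can both check entitlement and write an audit record naming who fetched which export.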