Search & Indexing
Letting users search across customers, cases, or transactions is genuinely useful and quietly risky. A search index is a second copy of your data, with the same sensitivity and tenancy rules, and naive search is a performance and injection hazard. Build search that is tenant-scoped, access-controlled, and efficient. The index inherits every obligation the source data has.
Search is usually backed by either the database (full-text indexes) or a dedicated search engine. Either way, three points matter: the index is a copy of sensitive data that must be protected and tenant-isolated exactly like the source; results must be filtered by what the user is allowed to see; and search queries built from user input are an injection surface. Add the usual performance concerns (large result sets, expensive queries), and search clearly needs careful design.
This topic connects to Data Classification (the index has the same sensitivity as the source), Multi-Tenancy (results scoped per tenant and user), Trust Boundaries (query input), and Performance.
Keep search safe and scoped
- Always: Scope every search to the user's tenant and authorisation. Results must only ever include data the searcher is entitled to see, enforced server-side (see Multi-Tenancy, Authentication & Authorization).
- Do: Treat the search index as a copy of the source data with the same classification. Protect, encrypt, access-control, and retain it the same way (see Data Classification, Data Protection & Privacy).
- Do: Build search queries safely from user input (parameterise, or use the engine's safe query API). Never concatenate raw input into a query (see Trust Boundaries).
- Do: Index only the fields needed for search and display. Avoid copying special-category data into the index unless it is essential (see Data Masking & Redaction).
- Never: Return search results across tenant boundaries, or expose an index publicly or too permissively (a classic source of data leaks).
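As a minimal sketch of the scoping and query-building rules above, assume search is backed by the database itself (here SQLite through Python's standard sqlite3 module) and a hypothetical customers table with a tenant_id column. The names search_customers, escape_like, and MAX_RESULTS are illustrative and not part of any library. The tenant filter and the search term are both bound as parameters, and LIKE wildcards in the term are escaped so that user input is matched literally.

```python
import sqlite3

MAX_RESULTS = 50  # hypothetical hard cap on rows returned per search


def escape_like(term: str) -> str:
    """Escape LIKE wildcards so user input is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_customers(conn: sqlite3.Connection, tenant_id: str, term: str,
                     limit: int = 20) -> list[tuple[int, str]]:
    """Return (id, name) rows matching `term`, restricted to one tenant.

    `tenant_id` is assumed to come from the authenticated session on the
    server, never from a request parameter the caller controls.
    """
    limit = max(1, min(limit, MAX_RESULTS))  # clamp to the cap
    pattern = f"%{escape_like(term)}%"
    cursor = conn.execute(
        # Every value is bound with a placeholder; nothing from the user
        # is concatenated into the SQL text.
        "SELECT id, name FROM customers "
        "WHERE tenant_id = ? AND name LIKE ? ESCAPE '\\' "
        "ORDER BY name LIMIT ?",
        (tenant_id, pattern, limit),
    )
    return cursor.fetchall()
```

A call such as search_customers(conn, session_tenant_id, "smith") can only return rows for the session's tenant, whatever the term contains. With a dedicated search engine the same idea would apply through that engine's own filter and query API, which this sketch does not cover.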
Keep it correct and fast
- Do: Keep the index consistent with the source. Update it in the same transaction, or through a reliable async pipeline (outbox or events), and handle eventual consistency in the UX (see Asynchronous Messaging, Distributed Systems & Consistency).
- Do: Page and bound results, and cap query cost and complexity, so a broad search cannot exhaust resources (see Performance & Resource Use).
- Do: Apply rate limiting to search endpoints. They are easy to abuse for scraping or denial of service (see Rate Limiting & Abuse Prevention).
- Consider: Whether full-text search in the database is enough before adopting a separate search engine, which is another datastore to secure, sync, and operate (see Choosing Technology).
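The consistency and paging rules above can be sketched as follows, assuming the simplest option of an index table in the same database that is written in the same transaction as the source row. The tables cases and case_search, the functions create_schema, add_case, and search_page, and the constant MAX_PAGE_SIZE are all hypothetical names chosen for illustration. An outbox or event pipeline would replace the second write with a queued message, which is not shown here.

```python
import sqlite3

MAX_PAGE_SIZE = 50  # hypothetical upper bound on rows per page


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, "
        "tenant_id TEXT NOT NULL, title TEXT, notes TEXT)"
    )
    # The index holds only the fields needed for search and display;
    # the free-text notes column is deliberately not copied.
    conn.execute(
        "CREATE TABLE case_search (case_id INTEGER PRIMARY KEY, "
        "tenant_id TEXT NOT NULL, title TEXT NOT NULL)"
    )


def add_case(conn, tenant_id, case_id, title, notes):
    """Write the source row and its index row atomically."""
    with conn:  # commits on success, rolls back both writes on error
        conn.execute(
            "INSERT INTO cases (id, tenant_id, title, notes) VALUES (?, ?, ?, ?)",
            (case_id, tenant_id, title, notes),
        )
        conn.execute(
            "INSERT INTO case_search (case_id, tenant_id, title) VALUES (?, ?, ?)",
            (case_id, tenant_id, title),
        )


def search_page(conn, tenant_id, term, after_id=0, page_size=20):
    """Return (rows, next_cursor) using keyset pagination on case_id."""
    size = max(1, min(page_size, MAX_PAGE_SIZE))  # clamp the page size
    rows = conn.execute(
        "SELECT case_id, title FROM case_search "
        "WHERE tenant_id = ? AND instr(lower(title), lower(?)) > 0 "
        "AND case_id > ? ORDER BY case_id LIMIT ?",
        (tenant_id, term, after_id, size + 1),  # one extra row detects more pages
    ).fetchall()
    page = rows[:size]
    next_cursor = page[-1][0] if len(rows) > size else None
    return page, next_cursor
```

A caller passes the returned next_cursor back as after_id to fetch the following page and stops when it is None. Keyset pagination is used here instead of OFFSET so that deep pages do not force the database to scan and discard earlier rows, and the clamp means a request for an enormous page is quietly reduced to the cap.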
Self-review checklist
- Ask: Are results scoped to the user's tenant and entitlements, server-side?
- Ask: Is the index protected like the source data, with only the needed fields?
- Ask: Are queries built safely from input, paged and bounded, and rate-limited?
- Ask: How does the index stay consistent with the source, and is staleness handled?