Event-Driven Architecture
In an event-driven architecture, components communicate by publishing and reacting to events rather than calling each other directly. A service announces that something happened; other services that care subscribe and respond, without the publisher knowing they exist. Done well it gives you loose coupling and resilience; done carelessly it gives you a distributed mess that is impossible to follow.
The shift is from "A calls B and waits" to "A emits an event; whoever cares handles it." The publisher is decoupled from its consumers: you can add a new consumer without touching the producer. Work also happens asynchronously, so a slow or unavailable consumer doesn't block the producer. This is the architecture-level expression of the messaging patterns in Asynchronous Messaging & Eventing.
That decoupling has a price: control flow becomes implicit and harder to trace, out-of-order and duplicate delivery are facts of life, and "where did this go wrong?" spans many services. Event-driven design is the right tool when components must evolve and scale independently or react to things happening elsewhere, and the wrong one when you really just need a synchronous request/response. It builds on Distributed Systems & Consistency, and the events themselves are often discovered via Event Storming and stored via Event Sourcing.
Design the events and the contracts
- Do: Publish events as immutable, past-tense facts with a stable, versioned schema. Consumers depend on that contract, so evolve it additively and never break existing fields (see Backward Compatibility).
- Do: Keep publishers ignorant of consumers. A producer emits a fact and is done; it must not know or care who reacts. The moment a publisher special-cases a consumer, the coupling is back.
- Do: Decide deliberately between thin events (just ids, consumers fetch detail) and fat events (carry the data). Thin events avoid stale payloads and over-sharing PII; fat events avoid chatty callbacks. Choose per event (see Data Protection & Privacy).
- Consider: Distinguishing internal domain events from public integration events, and only exposing a curated, stable set across service boundaries.
- Never: Put secrets or unminimised personal data into broadly-subscribed events. An event bus fans data out to many consumers and logs, so treat the payload as published, not private.
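As a rough illustration of the contract guidance above, here is a minimal Python sketch of an immutable, versioned event with a tolerant parser. Every field name, the `channel` field, and the version numbers are assumptions made for the example, not part of any real schema.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)  # frozen: an event is a fact and cannot be edited later
class PaymentCaptured:
    """Past-tense fact. All field names are illustrative assumptions."""
    event_id: str
    payment_id: str
    amount: Decimal
    captured_at: datetime
    schema_version: int = 1
    # Hypothetical version 2 addition. It is optional with a default, so
    # version 1 payloads still parse: the schema evolved additively.
    channel: Optional[str] = None


def parse_payment_captured(payload: dict) -> PaymentCaptured:
    """Tolerant reader: ignore unknown fields, default the optional ones."""
    return PaymentCaptured(
        event_id=payload["event_id"],
        payment_id=payload["payment_id"],
        amount=Decimal(payload["amount"]),
        captured_at=datetime.fromisoformat(payload["captured_at"]),
        schema_version=payload.get("schema_version", 1),
        channel=payload.get("channel"),
    )
```

A consumer written against version 1 keeps working when a producer starts sending `channel`, because nothing it relied on was renamed or removed.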
// "event-driven", but the publisher orchestrates everyone:
Publish(paymentEvent);
await emailService.Send(...); // direct call
await ledgerService.Post(...); // direct call
await screeningService.Check(...); // now coupled to all three
The producer knows and waits on every consumer. This is request/response wearing an event costume — none of the decoupling, all of the indirection.
await bus.Publish(new PaymentCaptured(paymentId, amount, at));
// email, ledger, and screening each subscribe and handle it
// on their own, idempotently; the publisher knows none of them
The producer states a fact and moves on. Consumers evolve, scale, fail, and retry independently. Adding a fourth reaction touches no existing code.
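The same idea can be sketched end to end with a toy in-memory bus. This is a Python illustration under stated assumptions: `InMemoryBus`, `capture_payment`, and the handler names are hypothetical, and delivery is synchronous here only to keep the example short, whereas a real broker would deliver asynchronously.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class PaymentCaptured:
    payment_id: str
    amount_cents: int


class InMemoryBus:
    """Toy synchronous bus (assumption); stands in for a real broker."""

    def __init__(self) -> None:
        self._handlers: Dict[type, List[Callable]] = {}

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event: object) -> None:
        # A failing handler is isolated so it cannot break the publisher
        # or the other consumers.
        for handler in self._handlers.get(type(event), []):
            try:
                handler(event)
            except Exception as exc:
                print(f"handler {handler.__name__} failed: {exc}")


def capture_payment(bus: InMemoryBus, payment_id: str, amount_cents: int) -> None:
    # The publisher depends only on the bus, never on a consumer.
    bus.publish(PaymentCaptured(payment_id, amount_cents))


sent_emails: List[str] = []
ledger: List[Tuple[str, int]] = []
screened: List[str] = []


def send_receipt(event: PaymentCaptured) -> None:
    sent_emails.append(event.payment_id)


def post_to_ledger(event: PaymentCaptured) -> None:
    ledger.append((event.payment_id, event.amount_cents))


def screen_payment(event: PaymentCaptured) -> None:
    screened.append(event.payment_id)


bus = InMemoryBus()
bus.subscribe(PaymentCaptured, send_receipt)
bus.subscribe(PaymentCaptured, post_to_ledger)
capture_payment(bus, "pay-1", 1250)

# A new reaction is added later; capture_payment is not edited at all.
bus.subscribe(PaymentCaptured, screen_payment)
capture_payment(bus, "pay-2", 300)
```

After the second call, only `pay-2` has been screened, which shows that the new subscriber was wired in without changing the publisher.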
Make consumers robust
- Always: Make every consumer idempotent. At-least-once delivery means duplicates will happen; processing the same event twice must not double-charge, double-post, or double-screen (see Data Integrity & Transactions).
- Do: Tolerate out-of-order and delayed events. Don't assume B's event arrives after A's; design handlers that cope with either order or that wait for prerequisites.
- Do: Use the transactional outbox pattern to publish events atomically with the state change that caused them, so you never commit the change but lose the event, or vice versa (see Distributed Systems & Consistency).
- Do: Provide dead-letter queues and a retry policy for poison messages, and alert on them. A handler that throws forever must not silently block the stream or vanish (see Designing for Failure).
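Two of the points above, idempotency and dead-lettering, can be sketched together. This Python example is a simplified illustration: `LedgerConsumer`, `deliver`, and the in-memory `processed_ids` set are assumptions, and a production consumer would keep the processed ids in a durable store, updated in the same transaction as the side effect.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    event_id: str      # unique per event; used as the deduplication key
    payment_id: str
    amount_cents: int


class LedgerConsumer:
    """Posts each payment once, however often the event is delivered."""

    def __init__(self) -> None:
        self.processed_ids: Set[str] = set()  # assumption: durable in production
        self.balance_cents = 0

    def handle(self, event: Event) -> None:
        if event.amount_cents < 0:
            # Stands in for a poison message that can never succeed.
            raise ValueError("negative amount")
        if event.event_id in self.processed_ids:
            return  # duplicate delivery: already applied, do nothing
        self.balance_cents += event.amount_cents
        self.processed_ids.add(event.event_id)


def deliver(
    event: Event,
    handler: Callable[[Event], None],
    dead_letters: List[Tuple[Event, str]],
    max_attempts: int = 3,
) -> bool:
    """Retry a bounded number of times, then park the event in the DLQ."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:
            last_error = exc
    # Parked rather than dropped or retried forever, so later events keep
    # flowing. A real system would also raise an alert here.
    dead_letters.append((event, repr(last_error)))
    return False
```

Delivering the same event twice leaves the balance unchanged after the first delivery, and a message that always fails ends up in `dead_letters` without blocking the events behind it.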
Keep it observable and bounded
- Do: Propagate correlation/causation ids on every event so you can trace a single business action across all the services it touched (see Observability & Logging Hygiene).
- Consider: Documenting the event flows (who publishes, who consumes) so the implicit choreography is written down somewhere a new engineer can follow.
- Avoid: Event-driving everything. Within one service or for a simple, immediate request/response, a direct call is clearer and easier to debug than a round trip through a broker.
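One common way to carry the correlation and causation ids mentioned above is an envelope around each event. The Python sketch below assumes a hypothetical `Envelope` type and helper names; real systems often put these ids in message headers instead.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


def new_id() -> str:
    return str(uuid.uuid4())


@dataclass(frozen=True)
class Envelope:
    name: str
    correlation_id: str                  # same for the whole business action
    causation_id: Optional[str] = None   # event_id of the direct cause, if any
    event_id: str = field(default_factory=new_id)


def start(name: str) -> Envelope:
    """First event of a business action: mints a fresh correlation id."""
    return Envelope(name=name, correlation_id=new_id())


def caused_by(parent: Envelope, name: str) -> Envelope:
    """Follow-up event: keeps the correlation id, points back at its parent."""
    return Envelope(
        name=name,
        correlation_id=parent.correlation_id,
        causation_id=parent.event_id,
    )


captured = start("PaymentCaptured")
posted = caused_by(captured, "LedgerPosted")
emailed = caused_by(posted, "ReceiptEmailed")
```

Filtering logs by `correlation_id` then returns every event of the action, while following `causation_id` links reconstructs which event triggered which.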
Self-review checklist
- Ask: Do these components really need to be decoupled and async, or would a direct call be simpler and clearer?
- Ask: Is every consumer idempotent and safe under duplicate and out-of-order delivery?
- Ask: Do I publish the event and persist the state change atomically (outbox), with no dual-write gap?
- Ask: Can I trace one business action across all the services its events touched?
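For the outbox question in the checklist, the following Python sketch shows one possible shape of the pattern using the standard `sqlite3` module. The table names, columns, and the `publish` callback are assumptions for illustration; a real relay would run as a separate process or be driven by change data capture.

```python
import json
import sqlite3
from typing import Callable


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payments (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " type TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def capture_payment(db: sqlite3.Connection, payment_id: str, amount_cents) -> None:
    # One transaction: the state change and the outbox row commit together,
    # or the whole block rolls back. There is no dual write to the broker.
    with db:
        db.execute(
            "INSERT INTO payments (id, status) VALUES (?, 'captured')",
            (payment_id,),
        )
        payload = json.dumps({"payment_id": payment_id, "amount_cents": amount_cents})
        db.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            ("PaymentCaptured", payload),
        )


def relay_outbox(db: sqlite3.Connection, publish: Callable[[str, dict], None]) -> int:
    """Publish pending rows in order and return how many were pending."""
    rows = db.execute(
        "SELECT id, type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        # If publish raises, the row stays unpublished and is retried later.
        # A crash between publish and the UPDATE causes a re-send, which is
        # why delivery is at-least-once and consumers must be idempotent.
        publish(event_type, json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If building the outbox row fails, the payment insert is rolled back as well, so the database never holds a state change without its event.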