24 · System & Infrastructure Architecture
The layer beneath the application: how systems scale and stay available, and the infrastructure that runs them (load balancing, caching tiers, databases, message queues, CDNs/edge, containers and orchestration, CI/CD, infrastructure-as-code, observability, and deployment strategies). Written for a frontend engineer who must design integrations, reason through system-design interviews, and ship to production reliably.
Positioning
A senior frontend engineer isn't a DevOps/SRE, but operates inside a system and must understand it: where your app is served from, how it scales, why the API is sometimes slow, what a CDN/edge does to your caching (18), how your deploy reaches users, and how to read a system-design interview. This file gives the system-design vocabulary (scaling, availability, consistency, caching, queues) and the infra literacy (containers, CI/CD, IaC, observability, deploy strategies) that senior frontend roles assume. It complements software architecture (10, 20–22) and the decision-making file (25).
Foundations: the qualities you're designing for
System architecture trades off a handful of qualities:
- Scalability: handle growth in load/data without redesign.
- Availability: stay up (measured in "nines": 99.9% ≈ 8.76 h/yr of downtime; 99.99% ≈ 52.6 min/yr).
- Reliability / Fault tolerance: keep working despite component failures.
- Performance / Latency: fast responses (15, 18).
- Consistency: all readers see the same data (vs eventual consistency, 13).
- Maintainability, Security (17), Cost.
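The "nines" listed under Availability are plain arithmetic on the fraction of a year a system is allowed to be down. A minimal sketch, assuming a 365-day year; the function name `downtimePerYear` is illustrative and not taken from any library:

```typescript
// Allowed downtime per year for a given availability percentage.
// Assumption: a 365-day year (8,760 hours); leap years are ignored.
const HOURS_PER_YEAR = 365 * 24;

function downtimePerYear(availabilityPercent: number): { hours: number; minutes: number } {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError("availability must be between 0 and 100");
  }
  const downFraction = 1 - availabilityPercent / 100;
  const hours = downFraction * HOURS_PER_YEAR;
  return { hours, minutes: hours * 60 };
}

for (const target of [99, 99.9, 99.99, 99.999]) {
  const { hours, minutes } = downtimePerYear(target);
  console.log(`${target}% -> ${hours.toFixed(2)} h/yr (${minutes.toFixed(1)} min/yr)`);
}
```

Each extra nine divides the downtime budget by ten: 99.9% works out to about 8.76 hours a year, 99.99% to about 52.6 minutes.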
Two master trade-offs frame everything:
- CAP theorem: under a network Partition you must choose Consistency or Availability. PACELC extends it: Else (no partition), trade Latency vs Consistency. Many distributed systems are eventually consistent by choice, which is why your UI must tolerate stale reads (13).
- Vertical vs horizontal scaling: scale up (bigger machine: simple, has a ceiling, single point of failure) vs scale out (more machines: near-unlimited, needs statelessness + load balancing + coordination). Modern systems scale out; the enabling requirement is statelessness (no per-user state on a given server; push it to a shared store/session service).
Deep dive: system building blocks
1. Load balancing
Distributes traffic across many server instances (round-robin, least-connections, IP-hash, latency-based). Enables horizontal scaling and availability (route around dead instances via health checks). Lives at L4 (TCP) or L7 (HTTP; can route by path/host, which is relevant to MFE/zone routing, 09/08). Because any instance may serve any request, it requires either stateless instances or sticky sessions.
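Round-robin with health checks, as described above, fits in a few lines. This is a sketch under stated assumptions: the `Backend` shape, the `RoundRobinBalancer` class, and the addresses are hypothetical, and the `healthy` flag stands in for whatever a real periodic health check would set.

```typescript
// Round-robin selection that skips instances marked unhealthy.
interface Backend {
  url: string;
  healthy: boolean; // assumed to be updated elsewhere by a health check
}

class RoundRobinBalancer {
  private cursor = 0;
  private readonly backends: Backend[];

  constructor(backends: Backend[]) {
    this.backends = backends;
  }

  // Returns the next healthy backend, or undefined if none are healthy.
  next(): Backend | undefined {
    for (let i = 0; i < this.backends.length; i++) {
      const candidate = this.backends[this.cursor];
      this.cursor = (this.cursor + 1) % this.backends.length;
      if (candidate?.healthy) return candidate;
    }
    return undefined;
  }
}

const pool: Backend[] = [
  { url: "http://10.0.0.1", healthy: true },
  { url: "http://10.0.0.2", healthy: false }, // routed around
  { url: "http://10.0.0.3", healthy: true },
];
const lb = new RoundRobinBalancer(pool);
const picks = [lb.next(), lb.next(), lb.next()].map((b) => b?.url);
console.log(picks);
```

The balancer keeps no per-user state, so it only works cleanly when the instances behind it are stateless too.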
2. Caching (the highest-leverage performance tool, at every tier)
Store computed/fetched results closer to the consumer. Tiers, outer → inner:
- Browser cache + HTTP caching (18): Cache-Control, ETag, stale-while-revalidate.
- CDN / edge cache: static assets and increasingly dynamic/edge-rendered content at PoPs near users (18).
- Application / in-memory cache: Redis/Memcached for sessions, computed results, rate-limit counters, hot data.
- Database cache: query/result caches, materialized views (CQRS read models, 13).
Core concerns: invalidation ("one of the two hard problems"), TTL, eviction (LRU/LFU), cache stampede (many concurrent misses for the same key; mitigate with request coalescing/locks), and read/write strategies (cache-aside, write-through, write-back). A BFF (12) is a common caching choke point.
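To make cache-aside, TTL, and request coalescing concrete, here is a minimal in-process sketch. `CoalescingCache` and its `Loader` type are hypothetical names; a real deployment would more likely put Redis or a similar shared store behind the same shape.

```typescript
// Cache-aside with TTL and request coalescing (single-flight):
// concurrent misses for the same key share one in-flight load.
type Loader<V> = (key: string) => Promise<V>;

class CoalescingCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly inFlight = new Map<string, Promise<V>>();
  private readonly load: Loader<V>;
  private readonly ttlMs: number;

  constructor(load: Loader<V>, ttlMs: number) {
    this.load = load;
    this.ttlMs = ttlMs;
  }

  get(key: string): Promise<V> {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > Date.now()) {
      return Promise.resolve(hit.value); // fresh hit
    }
    const pending = this.inFlight.get(key);
    if (pending !== undefined) return pending; // join the load already running

    const started = this.load(key).then(
      (value) => {
        this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
        this.inFlight.delete(key);
        return value;
      },
      (err) => {
        this.inFlight.delete(key); // do not cache failures
        throw err;
      },
    );
    this.inFlight.set(key, started);
    return started;
  }

  // Explicit invalidation, for example after a write to the source of truth.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

let loads = 0;
const userCache = new CoalescingCache<string>(async (key) => {
  loads += 1; // stands in for a slow database or API call
  return `value-for-${key}`;
}, 30000);

const first = userCache.get("user:1");
const second = userCache.get("user:1"); // joins the first load
console.log(first === second, loads);
```

Both callers receive the same promise and the loader runs once, which is the property that prevents a stampede when a hot key expires.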
3. Databases
- Relational (SQL): Postgres/MySQL. Strong consistency, ACID transactions, joins, schemas. Default for most apps; many teams reach for NoSQL too early.
- NoSQL families: document (MongoDB), key-value (Redis, DynamoDB), wide-column (Cassandra), graph (Neo4j). Chosen for scale-out, flexible schema, or specific access patterns; often eventually consistent and join-light.
- Concepts to know: ACID vs BASE, indexing (a missing index forces a full scan, O(n) in table size, where a B-tree index lookup is O(log n)), N+1 query problem (the backend twin of the frontend N+1, 12), replication (read replicas for read scaling), sharding/partitioning (horizontal data split for write scaling), and transactions vs distributed sagas (13).
- Frontend touchpoint: this is why some data is strongly consistent and some isn't; why "search" might hit a different store (Elasticsearch) than "checkout."
4. Message queues & event streaming
Kafka, RabbitMQ, SQS, NATS decouple producers from consumers for asynchronous, resilient processing (13). Enable: load leveling (absorb spikes), background jobs (emails, image processing), and event-driven architectures. Guarantees to know: at-least-once vs exactly-once delivery, ordering, idempotent consumers, dead-letter queues. Frontend touchpoint: real-time updates pushed to the browser via WebSocket/SSE (04, 18) often originate from these streams; "your order is processing" reflects async queue work.
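Under at-least-once delivery the same message can arrive twice, so the consumer has to make the repeat harmless. A minimal sketch of an idempotent consumer follows; `OrderMessage`, `IdempotentConsumer`, and the in-memory set are assumptions for illustration, not the API of any of the brokers named above.

```typescript
// Idempotent consumer: a redelivered message is recognized by its ID and
// skipped, so the side effect is applied once.
interface OrderMessage {
  messageId: string; // assumed unique per logical event
  orderId: string;
  amountCents: number;
}

class IdempotentConsumer {
  // Assumption: in production this set lives in a durable store (for example
  // a table with a unique constraint) and is written in the same transaction
  // as the side effect. Process memory is used here only to keep it short.
  private readonly processed = new Set<string>();
  readonly charged = new Map<string, number>();

  handle(msg: OrderMessage): "processed" | "duplicate" {
    if (this.processed.has(msg.messageId)) return "duplicate";
    const soFar = this.charged.get(msg.orderId) ?? 0;
    this.charged.set(msg.orderId, soFar + msg.amountCents); // the side effect
    this.processed.add(msg.messageId);
    return "processed";
  }
}

const consumer = new IdempotentConsumer();
const msg: OrderMessage = { messageId: "m-1", orderId: "o-42", amountCents: 1999 };
const outcomes = [consumer.handle(msg), consumer.handle(msg)]; // broker redelivers
console.log(outcomes, consumer.charged.get("o-42"));
```

The second delivery is acknowledged without charging again, which is what lets the system treat at-least-once delivery as effectively-once processing.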
5. API layer
- REST, GraphQL (12), gRPC (service-to-service, binary over HTTP/2), tRPC (TypeScript end-to-end). An API gateway is the single entry point (routing, auth, rate limiting, 13); a BFF is the per-experience variant (12).
- Rate limiting (token bucket/leaky bucket), API versioning, idempotency keys for safe retries.
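The token bucket named above allows short bursts up to a capacity while enforcing an average rate. A minimal sketch, assuming the caller supplies the clock in milliseconds so the behaviour is deterministic; the `TokenBucket` class is illustrative only.

```typescript
// Token bucket: tokens refill at a steady rate up to a cap; each request
// spends one token and is rejected when the bucket is empty.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request may proceed at time nowMs.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Capacity 2, refilling 1 token per second, starting at t = 0 ms.
const limiter = new TokenBucket(2, 1, 0);
const results = [
  limiter.tryConsume(0),    // burst: allowed
  limiter.tryConsume(0),    // burst: allowed
  limiter.tryConsume(0),    // bucket empty: rejected
  limiter.tryConsume(1000), // one token refilled: allowed
];
console.log(results);
```

A gateway would typically keep one bucket per API key or client IP and answer rejected requests with HTTP 429.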
6. CDN & edge compute
CDNs cache near users; edge runtimes (Cloudflare Workers, Vercel Edge, 08) run code at PoPs for SSR/personalization/auth with minimal latency. This is the infra that makes streaming SSR/RSC fast globally (07).
Deep dive: infrastructure & delivery
7. Containers & orchestration
- Docker packages an app + its dependencies into a portable image that runs the same way on any host with a compatible container runtime. It solves "works on my machine" and is the unit of modern deployment.
- Kubernetes (K8s) orchestrates containers at scale: scheduling, self-healing (restart failed pods), horizontal autoscaling, rolling updates, service discovery, secrets/config. Heavyweight; many frontend teams instead use a PaaS (Vercel/Netlify/Render/Fly) that hides the orchestration layer.
- Service mesh (13) handles service-to-service mTLS/retries/observability via sidecars.
8. CI/CD (your daily infra)
- CI: on every push, install, lint/typecheck, test (16), build (14), and produce artifacts. Fast feedback; gate merges.
- CD: automatically deploy passing builds to staging/production. Continuous delivery (one click to prod) vs continuous deployment (fully automatic).
- Pipeline shape that works (16): static checks → unit → integration → build → deploy preview → E2E on preview → promote. Tools: GitHub Actions, GitLab CI (Rian's context), CircleCI. Frontend specifics: preview deployments per PR, caching dependencies/build output, bundle-size budgets (14/15) as a gate, and watching CI memory usage when coverage providers run.
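A bundle-size budget gate, as mentioned in the frontend specifics, can be a small script that runs after the build step. The sketch below assumes the build emits a report of files with gzipped sizes; the `BundleReport` and `Budget` shapes, file names, and limits are all hypothetical.

```typescript
// Compare built assets against size budgets and collect violations.
interface BundleReport { file: string; gzipBytes: number }
interface Budget { pattern: RegExp; maxGzipBytes: number }

function checkBudgets(assets: BundleReport[], budgets: Budget[]): string[] {
  const violations: string[] = [];
  for (const asset of assets) {
    for (const budget of budgets) {
      // Patterns must not use the "g" flag, which would make test() stateful.
      if (budget.pattern.test(asset.file) && asset.gzipBytes > budget.maxGzipBytes) {
        violations.push(
          `${asset.file}: ${asset.gzipBytes} B exceeds budget of ${budget.maxGzipBytes} B`,
        );
      }
    }
  }
  return violations;
}

const violations = checkBudgets(
  [
    { file: "main.3f9a1c.js", gzipBytes: 182000 },
    { file: "vendor.77b2e0.js", gzipBytes: 95000 },
  ],
  [
    { pattern: /^main\..*\.js$/, maxGzipBytes: 170000 },
    { pattern: /^vendor\..*\.js$/, maxGzipBytes: 120000 },
  ],
);
console.log(violations);
```

In a pipeline, a non-empty list would cause the step to exit non-zero so the merge is blocked until the bundle is back under budget.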
9. Deployment strategies (how new code reaches users safely)
- Rolling: replace instances gradually; default in K8s.
- Blue-green: two identical environments; switch traffic from blue (old) to green (new) instantly; instant rollback by switching back.
- Canary: release to a small % of users, watch metrics, ramp up or roll back. Pairs with feature flags (LaunchDarkly/Unleash) for decoupling deploy from release and gradual rollout/kill-switch.
- Frontend note: immutable, content-hashed assets (18) make frontend deploys atomic; keep old chunks available so in-flight sessions don't 404 mid-deploy.
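Canary and percentage rollouts usually depend on assigning each user to a stable bucket, so that the same user keeps seeing the same variant as the percentage ramps up. A minimal sketch using a 32-bit FNV-1a hash; `inCanary` and the flag key are hypothetical, and this is not how LaunchDarkly or Unleash implement bucketing.

```typescript
// 32-bit FNV-1a over UTF-16 code units; adequate for bucketing, not for security.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Deterministic rollout: bucket 0..99, canary when bucket < rolloutPercent.
// Salting with the flag key keeps buckets independent across flags.
function inCanary(userId: string, flagKey: string, rolloutPercent: number): boolean {
  const bucket = fnv1a(`${flagKey}:${userId}`) % 100;
  return bucket < rolloutPercent;
}

const sampleUsers = ["u-1", "u-2", "u-3", "u-4", "u-5"];
for (const pct of [0, 10, 50, 100]) {
  const served = sampleUsers.filter((u) => inCanary(u, "new-checkout", pct));
  console.log(`${pct}% ->`, served);
}
```

Because a user's bucket never changes, raising the percentage only adds users to the canary; nobody who already has the new version is moved back.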
10. Infrastructure as Code (IaC)
Define infra in version-controlled code, not clicks: Terraform (declarative, multi-cloud), Pulumi (real languages), AWS CDK, CloudFormation. Benefits: reproducible, reviewable, auditable environments; no "snowflake" servers. GitOps extends this: the repo is the source of truth for infra state.
11. Observability (you can't fix what you can't see)
Three pillars: logs (events), metrics (numeric time series such as latency, error rate, throughput; the "RED"/"USE" methods), traces (a request's path across services; distributed tracing via OpenTelemetry, essential for microservices/BFF debugging, 12/13). Add alerting on SLOs and error tracking (Sentry) + RUM (15) for the frontend. OpenTelemetry is the vendor-neutral standard.
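The RED method (Rate, Errors, Duration) reduces a window of request samples to three numbers. A minimal sketch, assuming samples are already collected in memory; the `RequestSample` shape and function names are illustrative and not part of OpenTelemetry, and errors are counted as HTTP 5xx by assumption.

```typescript
interface RequestSample { durationMs: number; statusCode: number }

// Nearest-rank percentile on a sorted copy of the data (p in 0..100).
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1] ?? NaN;
}

// Rate, Errors, Duration for one observation window.
function redSummary(samples: RequestSample[], windowSeconds: number) {
  const errors = samples.filter((s) => s.statusCode >= 500).length;
  return {
    ratePerSecond: samples.length / windowSeconds,
    errorRatio: samples.length === 0 ? 0 : errors / samples.length,
    p95Ms: percentile(samples.map((s) => s.durationMs), 95),
  };
}

// Ten requests over a 5-second window: durations 10..100 ms, one server error.
const samples: RequestSample[] = [];
for (let i = 1; i <= 10; i++) {
  samples.push({ durationMs: i * 10, statusCode: i === 10 ? 500 : 200 });
}
const red = redSummary(samples, 5);
console.log(red);
```

Here the summary is 2 requests per second, a 10% error ratio, and a p95 of 100 ms. An SLO alert would compare figures like these against a target rather than alerting on individual failures.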
12. Frontend deployment infra specifically
- Static/SSG: object storage (S3) + CDN (CloudFront) or a Jamstack host (Netlify).
- SSR/RSC: Node/edge runtime (Vercel, Cloudflare, a container on K8s) (07, 08).
- MFEs: independently deployed remotes behind a CDN, discovered via a manifest (09).
- Concerns: cache-busting via hashed filenames, atomic deploys, environment config injection, and not breaking long-lived sessions on deploy.
Worked example: a scalable web system (system-design sketch)
                         ┌───── CDN / Edge (static + cache + edge SSR) ─────┐
Users ──DNS (anycast)──▶ │                                                  │
                         └────────────────────────┬─────────────────────────┘
                                                  ▼
                                  Load Balancer (L7, health checks)
                                                  │   (stateless app tier: scale out)
                  ┌───────────────────────────────┼───────────────────────────────┐
                  ▼                               ▼                               ▼
            App/SSR node                    App/SSR node                  BFF / API gateway
                  │                               │                               │
                  ├─────────────────── Redis (sessions, cache) ───────────────────┤
                  │                                                               ▼
         Primary DB (writes) ──replication──▶ Read replicas                   Services
                  │                                                               │
                  └────────── events ──▶ Kafka ──▶ async workers (email, search index)
Cross-cutting: CI/CD pipeline · IaC (Terraform) · Observability (OTel: logs/metrics/traces) · feature flags
Reading it: scale out behind a load balancer (stateless apps, sessions in Redis), cache at CDN/edge/Redis tiers, separate read replicas from the write primary, push slow work to queues, and keep the whole thing reproducible (IaC) and observable (OTel). This is the shape behind most system-design answers.
Pitfalls & gotchas
- Stateful app servers blocking horizontal scaling → externalize session/state.
- Reaching for microservices/NoSQL/K8s prematurely → huge operational cost; start simple (25).
- Cache invalidation bugs → stale data, or stampedes on expiry; plan TTL + coalescing.
- No idempotency on retried operations → duplicates (13).
- Ignoring the N+1 query on the backend feeding your UI → slow APIs no frontend trick fixes.
- Treating eventual consistency as immediate → UIs that break on stale reads (13).
- No observability → flying blind; add tracing/metrics/error-tracking before you need them.
- Deploys that 404 old chunks → keep prior hashed assets during/after deploy.
- Snowflake infra (hand-clicked) → unreproducible; use IaC.
Interview questions
- Vertical vs horizontal scaling: trade-offs and the statelessness requirement.
- State the CAP theorem (and PACELC). What does choosing AP vs CP mean for a UI?
- Where can you cache in a web stack, and what are the invalidation/stampede concerns?
- SQL vs NoSQL: when each? What are replication and sharding?
- What problem do message queues solve? At-least-once vs exactly-once?
- What do Docker and Kubernetes each do?
- Blue-green vs canary vs rolling deploys, and where feature flags fit.
- What is Infrastructure as Code and why use it?
- Name the three pillars of observability and what distributed tracing buys you.
- Sketch a scalable system for a high-traffic web app.
Recommendations
- Design app tiers to be stateless and scale out behind a load balancer; keep state in shared stores.
- Cache at every tier with deliberate TTL/invalidation; protect against stampedes.
- Default to relational storage; adopt NoSQL/sharding only for proven scale/access-pattern needs.
- Use queues for async/spiky work; make consumers idempotent.
- Containerize; reach for managed PaaS over raw K8s unless you need K8s.
- Treat CI/CD + IaC + observability as part of the product: PR previews, bundle budgets (15), tracing (OTel), error tracking (Sentry).
- Ship frontend with atomic, hash-busted deploys and feature flags to separate deploy from release.
- Match complexity to need: start simple (25); add infrastructure when load/teams justify it.
Books & references
- "Designing Data-Intensive Applications" by Martin Kleppmann (DDIA). The single best systems book: consistency, replication, partitioning, queues, streams. Essential. (Shared with 13.)
- "System Design Interview" Vol 1 & 2 by Alex Xu. The standard interview-prep books; build the vocabulary above into reusable templates. (ByteByteGo is the companion site/newsletter.)
- "Building Microservices" by Sam Newman; "Release It!" by Michael Nygard (stability/ops patterns) (12, 13).
- "The DevOps Handbook" / "Accelerate" by Kim/Forsgren et al. CI/CD, delivery performance, and the metrics that matter.
- "Site Reliability Engineering" by Google (free at sre.google). SLOs, observability, operating at scale.
- Docker docs, Kubernetes docs, Terraform docs, OpenTelemetry docs: primary infra references.
- AWS/GCP Well-Architected Framework: vendor-neutral-ish principles for reliability, performance, cost, security.
Connections
- 13-microservices-and-orchestration.md: the distributed-systems patterns (sagas, CQRS, EDA) that run on this infra.
- 25-architecture-decisions-and-tradeoffs.md: monolith vs microservices, when to add this complexity.
- 18-networking-and-protocols.md: CDNs, edge, HTTP caching, DNS, TLS at the transport layer.
- 12-bff-and-data-enrichment.md: the BFF as caching/aggregation tier and tracing node.
- 15-performance-and-core-web-vitals.md: RUM, edge/CDN, caching as performance levers; bundle budgets in CI.
- 08-nextjs-and-meta-frameworks.md: edge runtime, deployment targets, caching layers.
- 16-testing.md: where tests sit in the CI/CD pipeline.