Fundamentals of Backend engineering

Backend communication design patterns

Request - Response pattern

mermaid
sequenceDiagram
    participant Client
    participant Server
    Client->>Server: Request
    Server->>Client: Response
  • Client serializes a request and sends it to the server

  • Server parses (deserializes) the request

  • Server processes the request

  • Server sends a response

  • Client parses and processes the response

Used almost universally, for nearly every type of communication:
HTTP, DNS, SSH, RPC (remote procedure call), etc.

Basic request structure

txt
GET / HTTP/1.1
HEADERS
<CRLF>
BODY
  • Message format dictates how the message is deserialized / parsed
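
To make the serialize / parse steps concrete, here is a minimal sketch of building and parsing a request in the wire format shown above. The function names `build_request` and `parse_request` are made up for illustration, and the parser is deliberately simplified (it assumes ASCII headers written as `Name: value` and does not handle chunked bodies or repeated headers).

```python
def build_request(method: str, path: str, headers: dict[str, str], body: bytes = b"") -> bytes:
    """Serialize a request into HTTP/1.1 wire format (client side)."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines).encode("ascii")
    # An empty line (CRLF CRLF) separates the headers from the body.
    return head + b"\r\n\r\n" + body


def parse_request(raw: bytes) -> tuple[str, str, dict[str, str], bytes]:
    """Deserialize wire bytes back into (method, path, headers, body) (server side)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, path, _version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, headers, body
```

Both sides have to agree on the format: the parser only works because it knows where the request line ends, how headers are delimited, and that a blank line marks the start of the body.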

Request-response does not work well for the following cases:

  • Notification service
    (the client does not know when to expect a response)

  • Chat service

  • Very Long Running Task (VLRT) service

  • What if client disconnects?

bash
# --trace needs an output file argument; "-" writes the trace to stdout
curl --trace - google.com

Sync vs Async work loads

  • Can I do work while I wait for a response?

  • Sometimes async execution is preferred: the client can do something else while waiting for a response.

Sync I/O

  • Caller sends a request and blocks

  • Caller cannot execute any other code while waiting for a response

  • Receiver responds and caller unblocks

  • Caller and Receiver are in sync

mermaid
sequenceDiagram
    participant Client
    participant Server
    Client->>+Server: Request
    Note right of Client: Client is blocked
    Server->>-Client: Response
    Note right of Client: Client is unblocked
  • The OS takes the blocked process off the CPU
    and puts it in a wait queue until the I/O task is complete.

  • Once the I/O task is done, the OS marks the process runnable and schedules it back onto the CPU.

Async I/O

  • Caller sends a request and does not block

  • Caller can work until a response is received

  • Caller then either

    • Checks whether the response is ready (epoll on Linux reports when a file descriptor is ready)

    • Is called back by the receiver when the response is ready (io_uring on Linux posts a completion)

    • Spins up a new thread that blocks (multi-threading): the main thread is not blocked, but the new thread is.

      Node.js uses this approach for some operations: libuv runs file system calls and DNS lookups on a thread pool, while network sockets use readiness notification.

  • Caller and receiver are not necessarily in sync
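
As a rough illustration of "caller sends a request and does not block", here is a small sketch using Python's `asyncio`. The `fetch` coroutine is a stand-in for a real network call (it only sleeps), and the names `fetch` and `caller` are made up for this example.

```python
import asyncio


async def fetch(delay: float) -> str:
    # Stand-in for a network call; asyncio.sleep yields control instead of blocking.
    await asyncio.sleep(delay)
    return "response"


async def caller() -> list[str]:
    log = []
    task = asyncio.create_task(fetch(0.05))  # send the request, do not block
    log.append("request sent")
    log.append("doing other work")           # runs before the response arrives
    log.append(await task)                   # pick up the response once it is ready
    return log
```

Running `asyncio.run(caller())` shows the caller doing other work between sending the request and consuming the response, which is the difference from the sync case, where nothing else could run in between.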

Sync vs Async

  • Synchronicity is a client property.

  • Most modern clients are async.

  • Event loop in JS does this.

Example:

  • Face to face conversation is sync (you need to wait for the other person to finish speaking)

  • Email is async (you can send an email and do something else while waiting for a response)

Async BE patterns

BE returns a response immediately with a job_id and puts the job in a queue to be processed later.
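
A minimal in-memory sketch of this pattern, assuming a single process and using Python's `queue` module in place of a real message queue. The names `submit`, `run_worker_once` and `status` are hypothetical, and squaring a number stands in for the real long-running work.

```python
import queue
import uuid
from typing import Optional

jobs: "queue.Queue[tuple[str, int]]" = queue.Queue()
results: dict[str, int] = {}


def submit(n: int) -> str:
    """Accept the request and return immediately with a job_id."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, n))
    return job_id


def run_worker_once() -> None:
    """Take one job off the queue and store its result."""
    job_id, n = jobs.get()
    results[job_id] = n * n  # placeholder for the real long-running work
    jobs.task_done()


def status(job_id: str) -> Optional[int]:
    """Return the result if the job has finished, otherwise None."""
    return results.get(job_id)
```

The client keeps the job_id and asks for the status later; the request that created the job never waits for the work itself.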

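Async Postgres

  • Postgres asynchronous commit (synchronous_commit = off) returns success to the client before the commit's WAL record is flushed to disk. The WAL is flushed shortly afterwards in the background, so a crash in that window can lose the most recent commits but does not corrupt the database.

  • Dirty data pages are written to disk later by the background writer or checkpointer, and only after the WAL describing their changes has been flushed.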

Async Replication

  • At the cost of consistency, you can return a response immediately and replicate the data in the background.

Async OS fsync

  • Writes go to the OS page cache first, and the OS flushes them to disk in the background; calling fsync forces the flush immediately.

  • Batching writes this way improves throughput and also reduces wear and tear on the disk.

Push pattern (Socket)

  • If you want data as soon as it is available, you can use the push pattern.

  • Client wants real time notifications from backend. (Socket)

  • Push model is good for certain cases.

Steps

  • Client connects to the server

  • Server sends data to the client as soon as it is available

  • Client doesn't have to request anything

  • Protocol must be bidirectional (gRPC supports streaming between microservices)

Pros

  • Real time data

Cons

  • Client must be online (it needs an open connection to the server)

  • Clients must be able to handle the load (the server would need to know the client's data consumption rate)

  • Requires a bidirectional protocol (polling is preferred for lightweight clients)

Short Polling

mermaid
sequenceDiagram
    participant client
    participant server

    client->>server: Request
    server-->>client: 200 OK, request_id

    client->>server: request_id
    server-->>client: 200 OK, "Not ready yet"

    client->>server: request_id
    server-->>client: 200 OK, "Not ready yet"

    client->>server: request_id
    server-->>client: 200 OK, "Done", Response

Pros

  • Is very simple to implement

  • Good for long running tasks

  • Client can disconnect safely

Cons

  • Very chatty: many requests from many clients (network congestion)

  • Network bandwidth is wasted (you get billed for the additional bandwidth)

  • Every status check also costs server processing time, even when nothing has changed.
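
The client side of short polling is just a loop. The sketch below uses a hypothetical `FakeServer` that reports "Done" after a fixed number of status checks, so it runs without a network; `short_poll`, `interval` and `max_attempts` are illustrative names, not part of any real API.

```python
import time
from typing import Optional


class FakeServer:
    """Hypothetical server: the job is ready on the `ready_after`-th status check."""

    def __init__(self, ready_after: int) -> None:
        self.ready_after = ready_after
        self.checks = 0

    def check(self, request_id: str) -> tuple[str, Optional[str]]:
        self.checks += 1
        if self.checks < self.ready_after:
            return "Not ready yet", None
        return "Done", f"result for {request_id}"


def short_poll(server: FakeServer, request_id: str,
               interval: float = 0.01, max_attempts: int = 10) -> str:
    """Ask repeatedly until the server says the job is done."""
    for _ in range(max_attempts):
        state, response = server.check(request_id)
        if state == "Done":
            return response
        time.sleep(interval)  # every extra round trip here is the "chatty" cost
    raise TimeoutError(f"job {request_id} not ready after {max_attempts} checks")
```

Choosing `interval` is the trade-off: a short interval wastes requests, a long one delays the result.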

Long Polling

mermaid
sequenceDiagram
    participant client
    participant server

    client->>server: Request
    server-->>client: 200 OK, request_id
    client->>server: request_id
    Note right of client: Server does not respond until the data is ready
    server-->>client: 200 OK, "Done", Response
    client->>server: FIN Close connection
  • Avoids the chattiness of short polling

  • Some variations have a timeout: if the server does not respond in time, the client retries.

  • The server has a timeout as well: if the client does not respond in time, the server can close the connection.

Pros

  • Less Chatty

  • Client can disconnect safely

Cons

Even though long polling reduces the frequency of requests compared to traditional polling, there is still some latency between when the server has new data and when the client receives it. This latency exists because:

  • The server might take some time to respond if it’s waiting for new data.

  • After receiving the response, the client still has to initiate a new request, which causes a delay before the next cycle starts.

In contrast, true real-time communication such as WebSockets lets the server push an update as soon as something changes, without waiting for a new request from the client.
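
The server side of long polling can be sketched with a `threading.Event`: the poll call is held until data is published or a timeout expires. `LongPollServer`, `publish` and `poll` are hypothetical names, and a real server would track one event per request_id instead of a single shared one.

```python
import threading
from typing import Optional


class LongPollServer:
    """Holds a poll request open until data is ready or the timeout expires."""

    def __init__(self) -> None:
        self._ready = threading.Event()
        self._data: Optional[str] = None

    def publish(self, data: str) -> None:
        # Called by whatever finishes the work; wakes up any waiting poll.
        self._data = data
        self._ready.set()

    def poll(self, timeout: float) -> Optional[str]:
        # Unlike short polling, this blocks instead of answering "Not ready yet".
        if self._ready.wait(timeout):
            return self._data
        return None  # timed out; the client is expected to poll again
```

A poll that times out returns nothing and the client issues a new one, which is where the remaining latency described above comes from.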

Server Sent Events

mermaid
sequenceDiagram
    participant client
    participant server

    client->>server: Request
    server-->>client: 200 OK, Response
    Note right of server: Server keeps the connection open
    server-->>client: Response
    server-->>client: Response
    server-->>client: Response
    client->>server: FIN Close connection
  • One Request, a very very long response

  • The TCP connection is kept open by the server, FIN is not sent by the server.

  • The client can close the connection at any time, by sending a FIN.

  • The server sometimes sends a heartbeat to check whether the client is still connected and to reset network timeouts.

Pros

  • Real time

  • Compatible with HTTP

Cons

  • Client must be online

  • Clients must be able to handle the load

  • Polling is preferred for light weight clients

  • Browsers limit HTTP/1.1 to about 6 connections per domain

    • In Chrome, you can have only 6 open HTTP/1.1 connections to a domain at a time.

    • While a connection is busy with a response, the browser cannot use it to make another request.

    • If all of the connections are held by SSE streams, the browser cannot make any other requests to the same domain.

  • HTTP/2 can send multiple streams over the same connection; the maximum number of concurrent streams is negotiated between client and server (typically 100 or more).
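
The "very long response" is a text stream of events separated by blank lines. Below is a simplified sketch of writing and reading that `text/event-stream` format; it handles only the `data:` and `event:` fields and comment lines starting with `:` (often used as heartbeats), and ignores `id:`, `retry:` and CRLF line endings. The function names are made up for illustration.

```python
from typing import Optional


def format_event(data: str, event: Optional[str] = None) -> str:
    """Server side: encode one event; a blank line terminates it."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    lines += [f"data: {line}" for line in data.split("\n")]
    return "\n".join(lines) + "\n\n"


def parse_stream(chunk: str) -> list[dict[str, str]]:
    """Client side: decode a chunk that contains only complete events."""
    events = []
    for block in chunk.split("\n\n"):
        fields: dict[str, str] = {}
        data_lines = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # empty line or comment (e.g. a heartbeat)
            name, _, value = line.partition(":")
            value = value.removeprefix(" ")
            if name == "data":
                data_lines.append(value)
            else:
                fields[name] = value
        if data_lines:
            fields["data"] = "\n".join(data_lines)
        if fields:
            events.append(fields)
    return events
```

In a browser, the `EventSource` API does this parsing for you; the sketch only shows why one HTTP response can carry an open-ended series of messages.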

Pub-Sub

mermaid
graph LR
request -->
pub[Publisher] -->|channel| queue[[Queue]] -->|channel| sub[Subscriber]
pub -.-> response

Cons of the regular HTTP request-response pattern

  • Bad for multiple clients

  • Highly coupled

  • Client and server both need to be online at the same time

In the pub-sub pattern, the publisher sends a message to a queue on request and immediately returns a response to the client. The subscriber listens to the queue and processes the message.

Pros

  • Scales with multiple receivers ( great for microservices )

  • We can consume at our own pace

  • Loose coupling

Cons

  • Message delivery issues (RabbitMQ, Kafka, etc. handle this)

  • Complexity

  • Network saturation between the queue and its clients (Kafka uses long polling to avoid this, RabbitMQ pushes over sockets)
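
A toy in-memory broker shows the shape of the pattern: publishing returns as soon as the message is enqueued, and each subscriber drains its own queue at its own pace. `Broker` and its methods are hypothetical; real brokers such as RabbitMQ or Kafka add persistence, acknowledgements and delivery guarantees that this sketch leaves out.

```python
from collections import defaultdict, deque
from typing import Optional


class Broker:
    """Toy broker: one queue per (channel, subscriber) pair."""

    def __init__(self) -> None:
        self._queues: dict[str, dict[str, deque]] = defaultdict(dict)

    def subscribe(self, channel: str, subscriber: str) -> None:
        self._queues[channel][subscriber] = deque()

    def publish(self, channel: str, message: str) -> int:
        """Enqueue for every subscriber and return immediately with the fan-out count."""
        for q in self._queues[channel].values():
            q.append(message)
        return len(self._queues[channel])

    def consume(self, channel: str, subscriber: str) -> Optional[str]:
        """Subscriber pulls its next message, or None if its queue is empty."""
        q = self._queues[channel][subscriber]
        return q.popleft() if q else None
```

The publisher never learns who consumed the message or when, which is the loose coupling listed under the pros.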

Multiplexing vs Demultiplexing

mermaid
graph LR
    request_1 --> multiplexer
    request_2 --> multiplexer
    request_n --> multiplexer

    multiplexer --> multiplexed_request
  • In HTTP/1.1, a connection carries only one request at a time, so the browser opens several connections to send requests in parallel.

  • In HTTP/2, the browser opens a single connection and sends multiple requests over the same connection. (This saves resources and reduces the overhead of setting up extra connections.)

mermaid
graph LR
    multiplexed_response --> demultiplexer

    demultiplexer --> response_1
    demultiplexer --> response_2
    demultiplexer --> response_n
  • Flow control / congestion control is easier when using a single connection for each request.
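
The idea behind the two diagrams can be sketched by tagging each chunk with a stream ID, roughly the way HTTP/2 frames carry a stream identifier. This is an illustration of the concept only, not the HTTP/2 framing format; `multiplex` and `demultiplex` are made-up names.

```python
from itertools import zip_longest


def multiplex(streams: dict[int, list[bytes]]) -> list[tuple[int, bytes]]:
    """Interleave chunks from several streams into one sequence of (stream_id, chunk) frames."""
    tagged = [[(sid, chunk) for chunk in chunks] for sid, chunks in streams.items()]
    frames = []
    for row in zip_longest(*tagged):
        frames.extend(frame for frame in row if frame is not None)
    return frames


def demultiplex(frames: list[tuple[int, bytes]]) -> dict[int, list[bytes]]:
    """Split the shared sequence back into per-stream chunk lists using the stream ID."""
    streams: dict[int, list[bytes]] = {}
    for sid, chunk in frames:
        streams.setdefault(sid, []).append(chunk)
    return streams
```

Without the stream ID on every frame, the receiver would have no way to tell which response a chunk belongs to.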

Connection pooling

mermaid
graph LR
    request -->|lease| pool[Connection Pool] --> pg[(Postgres)]
    pool --> response -.-|unlease| pool
  • We open a fixed number of connections to the server and keep them hot.

  • When a request comes in, we lease a connection from the pool.

  • Once the query / response is done, we release (unlease) the connection back to the pool.
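
The lease / unlease cycle can be sketched with a blocking queue of pre-opened connections. `ConnectionPool` and the `connect` factory are hypothetical; a production pool (for example, one shipped with a database driver) would also handle broken connections and health checks.

```python
import queue
from contextlib import contextmanager
from typing import Callable, Optional


class ConnectionPool:
    """Fixed-size pool: connections are opened up front and reused."""

    def __init__(self, connect: Callable[[], object], size: int) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # keep the connections "hot"

    @contextmanager
    def lease(self, timeout: Optional[float] = None):
        # Blocks while every connection is leased; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # unlease: hand the connection back to the pool
```

Using a context manager means the connection goes back to the pool even if the query raises, so a failed request cannot leak a connection.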

Why can't we use a multiplexed connection for Pg?

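  • The Postgres wire protocol does not tag responses with a request ID. A connection handles queries one at a time and returns results in the order the queries were sent, so if several clients interleaved queries on one connection there would be no way to match a response to its query.

  • Pipelining is supported by libpq from PG 14 onwards (pipeline mode), but results still come back in order, so it is not multiplexing.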

Stateful vs Stateless

Stateful

  • If a server stores data about the client in memory and depends on that data to process the request, it is stateful.

mermaid
graph LR
    client --> |session| server[Server]
    server --> |session| client
  • Session is stored in memory.

  • For this to work, the client needs to connect to the same server every time.

  • The load balancer must be configured with sticky sessions so that the client always reaches the same server.

Stateless

mermaid
graph LR
    client --> |request| server[Server]
    server <--> |session| db[(Database)]
    server --> |session| client
  • A stateless server may store data about the client in memory, but it can also safely lose it.

  • A stateless server can still store data about the client elsewhere, like in a database.

  • Can a server be restarted without the client noticing?
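
The restart question can be turned into a small experiment. In this sketch a plain dict stands in for the database, and `StatefulServer` / `StatelessServer` are hypothetical classes; creating a second instance simulates a restart or a different server behind the load balancer.

```python
from typing import Optional


class StatefulServer:
    """Keeps sessions in its own memory; a restart or another instance loses them."""

    def __init__(self) -> None:
        self._sessions: dict[str, str] = {}

    def login(self, session_id: str, user: str) -> None:
        self._sessions[session_id] = user

    def whoami(self, session_id: str) -> Optional[str]:
        return self._sessions.get(session_id)


class StatelessServer:
    """Keeps sessions in a shared store (a dict standing in for a database)."""

    def __init__(self, store: dict[str, str]) -> None:
        self._store = store

    def login(self, session_id: str, user: str) -> None:
        self._store[session_id] = user

    def whoami(self, session_id: str) -> Optional[str]:
        return self._store.get(session_id)
```

With the stateful version, a fresh instance no longer recognizes the session; with the stateless version, any instance pointed at the same store does, so no sticky sessions are needed.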

Protocols

  • TCP is a stateful protocol: both the client and the server keep connection state (sequence numbers, IP addresses, port numbers, etc.).

  • UDP is stateless (a datagram just says "we are going to this IP:port", that's all)

    • DNS uses UDP with a query ID to match each response to its request.

    • QUIC sends a connection ID to recognize the connection.

  • You can build a stateless protocol on top of a stateful protocol (HTTP, which is stateless, is built on top of TCP)

    • If the TCP connection is lost, HTTP just opens a new connection. (It does not care)

A completely stateless server

  • JWT without any protections

  • A simple function that takes an input and returns an output, without storing any data.

Sidecar Pattern

  • Every language needs to import a library to interface with the network. (HTTP, gRPC, etc)

  • Library adheres to the protocol specifications. (HTTP on the client side written in JS can talk to a server written in Go)

  • Most of the time the app and the library need to be in the same language.

  • Changing the library is hard, we need additional testing.

  • We can delegate the communication to a sidecar, which can be written in any language.

mermaid
graph LR
    subgraph Server
        sidecar[Sidecar]
        server[Server]
    end
    app <-.->|WebSocket| sidecar
    sidecar <-.-> |localhost| server <--> db[(Database)]
  • Service mesh is a more advanced version of the sidecar pattern.

Pros

  • Language agnostic (polyglot), each microservice can be written in a different language.

  • Protocol upgrade is easy

  • Easier to enable TLS

  • Tracing and monitoring a request from start to finish is easier.

  • Service Discovery

  • Caching, servers can outsource caching to the sidecar.

Cons

  • Complexity ( more moving parts, how do you debug such a system? )

  • Latency ( one more additional hop )