
Response Time
- Response Time = Network Delay + Service Time
- Response Time: Total time it took to respond to the request
- Network Delay: Time the request spent "on the wire" in both directions
- Service Time: Time it took for the server to process the request
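A minimal sketch of the formula above: if the client measures total response time and the server reports its own service time, the remainder is the network delay. The function name and the millisecond values here are hypothetical, for illustration only.

```python
def network_delay(response_time_ms, service_time_ms):
    # Response Time = Network Delay + Service Time, so the time on the
    # wire is whatever is left after subtracting server processing time.
    return response_time_ms - service_time_ms

# e.g. a 150 ms response with 90 ms of server work means 60 ms on the wire
print(network_delay(150, 90))  # 60
```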
Latency
- Network Latency
- Server-side Latency
- Client-side Latency

How to calculate Latency?
- Average
- Sum of the latencies / Total number of requests
- This can be misleading, because a few requests that are unusually slow or fast can skew the average heavily
- Percentiles => Better way to calculate latency!
- 50th percentile(p50)
- Median. 50% of the requests are faster than this value and the other 50% are slower
- 90th percentile(p90)
- 99th percentile(p99)
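The difference between the average and the percentiles can be sketched with a toy sample. The dataset and the nearest-rank percentile helper below are assumptions for illustration; in practice you might use `statistics.quantiles` or `numpy.percentile` instead.

```python
def percentile(samples_ms, p):
    # Nearest-rank percentile: the value below which roughly p% of samples fall
    ordered = sorted(samples_ms)
    rank = max(0, int(len(ordered) * p / 100 + 0.5) - 1)
    return ordered[rank]

# Hypothetical latencies in ms; the last two requests are slow outliers
latencies = [12, 13, 13, 14, 15, 16, 18, 20, 95, 400]

avg = sum(latencies) / len(latencies)
print(avg)                        # 61.6 -> dragged up by two outliers
print(percentile(latencies, 50))  # 15  -> typical request
print(percentile(latencies, 99))  # 400 -> worst-case tail
```

The average (61.6 ms) suggests a latency that almost no real request experienced, while p50 and p99 describe the typical and tail cases directly.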
Throughput
- Rate at which something is processed
- ex) Requests per second, Database queries per minute, Network packets per hour
- How to increase throughput?
- Batch processing
- Splitting a larger task into smaller ones and running the tasks in parallel
- Message queue cases
- Scaling out message consumers
- Scaling out message queues
- ex) Stream processing
- Sharding(Partitioning)
- Split the database into multiple shards and route write requests to different shards
- Replication
- Create replicas which will handle read requests
- Batch processing
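One of the techniques above, splitting a larger task into smaller ones and running them in parallel, can be sketched like this. The chunk size, worker count, and `process_chunk` work function are all hypothetical placeholders for real workloads.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Stand-in for real work done on one partition of the larger task
    return sum(x * x for x in chunk)

data = list(range(1000))
# Split the large task into four smaller ones
chunks = [data[i:i + 250] for i in range(0, len(data), 250)]

# Run the smaller tasks in parallel, then combine the partial results
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(process_chunk, chunks))

total = sum(partial_results)
print(total)  # 332833500, same answer as processing the data serially
```

Throughput improves because the four chunks are in flight at once, while the combined result is identical to serial processing.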
Bandwidth
- Maximum rate of data transfer across a given path(Bits per second)
- => Think of bandwidth as the width of a pipe, and throughput as the water actually flowing through it
