title: “System Design Foundational Topics”
excerpt: “Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Praesent elementum facilisis leo vel fringilla est ullamcorper eget. At imperdiet dui accumsan sit amet nulla facilities morbi tempus.”
coverImage: “/assets/blog/dynamic-routing/cover.jpg”
date: “2020-03-16T05:35:07.322Z”
author:
name: JJ Kasper
picture: “/assets/blog/authors/jj.jpeg”
ogImage:
url: “/assets/blog/dynamic-routing/cover.jpg”
Database Performance Tip
Soft Delete
- Instead of issuing a DELETE, UPDATE the record to mark it as deleted, with something like is_delete = true
- Recoverability, Archival, Audit
- Easy on the database: no index tree re-balancing, since no row is physically removed
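As a rough illustration of the soft-delete idea above, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table and the helper names `soft_delete`, `restore` and `active_users` are assumptions made for this example; only the `is_delete` flag comes from the notes.

```python
import sqlite3

# In-memory database with a hypothetical users table carrying an is_delete flag.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, name TEXT, is_delete INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

def soft_delete(conn, user_id):
    # Mark the row as deleted instead of removing it.
    conn.execute("UPDATE users SET is_delete = 1 WHERE id = ?", (user_id,))

def restore(conn, user_id):
    # Recoverability: flipping the flag back brings the row back.
    conn.execute("UPDATE users SET is_delete = 0 WHERE id = ?", (user_id,))

def active_users(conn):
    # Every read has to filter out soft-deleted rows.
    rows = conn.execute("SELECT name FROM users WHERE is_delete = 0 ORDER BY id")
    return [name for (name,) in rows]

soft_delete(conn, 1)
after_delete = active_users(conn)
restore(conn, 1)
after_restore = active_users(conn)
```

The trade-off is that every query must remember the `is_delete = 0` filter, and the table keeps growing until rows are archived.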
Long text vs Short Text
- Long texts are typically stored outside the row, with the column holding only a reference, whereas short texts are stored inline in the column with the rest of the row's data
Date-Time
- datetime is convenient, but internally the database might be converting a string to a date object, which can be heavy on size and on the index
- epoch is an integer: efficient, optimal and lightweight, but not human-readable
- custom format
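To make the epoch trade-off concrete, the sketch below stores a timestamp as an integer and converts it back only when a readable form is needed. The helper names `to_epoch` and `from_epoch` are hypothetical, and the example assumes all timestamps are kept in UTC.

```python
from datetime import datetime, timezone

def to_epoch(dt: datetime) -> int:
    # Compact integer form: cheap to store, compare and index.
    return int(dt.timestamp())

def from_epoch(seconds: int) -> str:
    # Readable form, produced only at the edges (logs, API responses).
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()

created_at = datetime(2020, 3, 16, 5, 35, 7, tzinfo=timezone.utc)
stored = to_epoch(created_at)   # what would go in the column
readable = from_epoch(stored)   # what a human would see
```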
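Database Views (materialized)
- A materialized view stores the result of a query, much like a precomputed table
- Suppose a query needs 6 joins: build a table from the joined result once and read from that instead
- Refreshing it is heavy, so it should not be used for data that is frequently updated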
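The following is only an emulation of the idea: SQLite has no native materialized views, so the sketch rebuilds a plain table from a join. The table names and the `refresh_order_totals` helper are assumptions for illustration. In PostgreSQL the equivalent would be `CREATE MATERIALIZED VIEW` plus `REFRESH MATERIALIZED VIEW`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (1, 1, 30), (2, 1, 20), (3, 2, 5);
""")

def refresh_order_totals(conn):
    # Recompute the joined result and store it as a real table.
    conn.executescript("""
    DROP TABLE IF EXISTS order_totals;
    CREATE TABLE order_totals AS
        SELECT c.name AS name, SUM(o.amount) AS total
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id;
    """)

def read_totals(conn):
    # Reads hit the precomputed table and skip the join entirely.
    return dict(conn.execute("SELECT name, total FROM order_totals"))

refresh_order_totals(conn)
initial = read_totals(conn)

conn.execute("INSERT INTO orders VALUES (4, 2, 10)")
stale = read_totals(conn)   # new order is not visible until a refresh

refresh_order_totals(conn)
fresh = read_totals(conn)
```

The stale read in the middle is the reason such views suit data that changes rarely.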
Caching
Reduces response times by saving the result of any heavy operation
- Caches are not only RAM based
- Reduces disk I/O, network I/O, or compute
Caching at different levels
1. Main memory of the API server
- Limited in size
- Inconsistency in a distributed system, as it is difficult to update every server's cache when the database is updated
- If the cache crashes, the application might crash with it
2. Browser cache
- Local storage
3. CDN
- Can cache anything, from API to website
4. Disk of API Server
- Disk is often under-utilized
- Saves the network I/O
- Should hold static data, as changing data will become inconsistent. Use-case specific
5. Load Balancers
- Cache API responses
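For the first level listed above (main memory of the API server), a minimal sketch of an in-process cache with a time-to-live might look like this. The `TTLCache` class, the `user:42` key and the `load_profile` function are hypothetical names, and the expiry stands in for a real invalidation strategy.

```python
import time

class TTLCache:
    """Tiny in-memory cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (value, expires_at)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]         # cache hit: skip the heavy work
        value = compute()           # cache miss: do the heavy work once
        self._store[key] = (value, now + self.ttl)
        return value

calls = []

def load_profile():
    # Stand-in for a slow database or network call.
    calls.append(1)
    return {"id": 42}

cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("user:42", load_profile)
second = cache.get_or_compute("user:42", load_profile)  # served from memory
```

Because each API server holds its own copy, two servers can disagree until the entry expires, which is the inconsistency noted in the list.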
Scaling
Ability to handle a large number of concurrent requests
Two strategies
- Vertical Scaling
- Horizontal Scaling
Vertical Scaling
- Infrastructure bulking, adding more CPU, RAM, Disk, etc. to handle more requests or tasks
- Easy to manage
- Risk of downtime, not suitable for high availability
- Updates will have downtime
- Limited by hardware
Horizontal Scaling
- Linear Amplification - Adding more machines to handle more requests or tasks
- Must know the capacity of one unit, such as how many requests one machine can handle
- Network partitioning is needed.
- Always scale bottom-up (database first, then API servers), otherwise your API will accept requests but the database will fail
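Knowing the capacity of one unit turns horizontal scaling into simple arithmetic. The sketch below is a back-of-the-envelope estimate only; the function name `machines_needed`, the 30% headroom default and the sample numbers are all assumptions, not measurements.

```python
import math

def machines_needed(peak_rps, rps_per_machine, headroom=0.3):
    """Estimate machine count, keeping a fraction of each machine in reserve."""
    usable_rps = rps_per_machine * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)

# Hypothetical numbers: 10,000 requests/s at peak, 500 requests/s per machine.
api_servers = machines_needed(peak_rps=10_000, rps_per_machine=500)
```

With these made-up inputs each machine is planned for 350 requests/s, so the estimate rounds up to 29 machines.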
Database scaling
Vertical Scaling
Have a bigger database
Read Replicas
A DB replica that is kept in sync with the master and takes the load of read queries
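One way to picture read replicas is a small router that sends writes to the master and spreads reads across replicas. This is a sketch under simplifying assumptions: the `ReplicaRouter` class and the server names are hypothetical, and treating anything that starts with SELECT as a read is a deliberate oversimplification.

```python
import itertools

class ReplicaRouter:
    """Route reads to replicas (round-robin) and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.master

router = ReplicaRouter("master-db", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM users"),
    router.route("INSERT INTO users (name) VALUES ('carol')"),
    router.route("select name from users"),
    router.route("SELECT 1"),
]
```

If replication is asynchronous, a read that immediately follows a write may not see it yet, so reads that must be fresh are often sent to the master.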
Sharding
Dividing the data of a database into multiple parts (shards), and hence dividing the database itself across multiple machines
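A common way to decide which part a record belongs to is to hash a shard key. The sketch below assumes hash-based sharding with a fixed shard count; the `shard_for` function and the key format are hypothetical, and other schemes (range-based, directory-based) are equally valid.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a shard key to a shard index in [0, shard_count)."""
    # A stable hash is used on purpose: Python's built-in hash() for strings
    # changes between processes, which would scatter the same key.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

SHARD_COUNT = 4
placement = {user: shard_for(user, SHARD_COUNT) for user in ["user:1", "user:2", "user:3"]}
```

Note that with plain modulo hashing, changing the shard count moves most keys, which is why resharding needs planning.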
Delegation
Leveraging another worker process to do the heavy lifting for the main process asynchronously.
Brokers - Queues for tasks or messages
Two common implementations
- Message Queues - SQS, RabbitMQ - Consumers pull messages from the queue
- Message Streams -
What does not need to be done in realtime, do not do in realtime.
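The delegation pattern can be sketched in a single process with a queue and a worker thread. This only stands in for a real broker such as SQS or RabbitMQ; the `handle_signup` handler, the email task and the `None` shutdown signal are assumptions for this example.

```python
import queue
import threading

tasks = queue.Queue()   # plays the role of the broker
results = []

def worker():
    # Consumer: pulls tasks from the queue and does the heavy lifting.
    while True:
        job = tasks.get()
        if job is None:             # shutdown signal
            tasks.task_done()
            break
        results.append(f"sent email to {job}")
        tasks.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()

def handle_signup(email):
    # Main process: enqueue the slow work and respond right away.
    tasks.put(email)
    return "202 Accepted"

responses = [handle_signup(e) for e in ["a@example.com", "b@example.com"]]

tasks.put(None)   # stop the worker once the queue has drained
tasks.join()
thread.join()
```

The request handler returns before any email is sent, which is the point: the slow part happens outside the request path.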
Concurrency
To get faster execution, process in parallel, by leveraging threads and multiprocessing.
Issues
- Communication between threads
- Concurrent use of shared resources
Handling
- Locks
- Mutexes and semaphores
- Go lock-free
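As a small example of the lock-based handling listed above, the sketch below protects a shared counter with `threading.Lock`. The `Counter` class and the thread and iteration counts are arbitrary choices for illustration.

```python
import threading

class Counter:
    """Shared resource guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write.
        with self._lock:
            self.value += 1

counter = Counter()

def work():
    for _ in range(10_000):
        counter.increment()

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = counter.value
```

Without the lock, `self.value += 1` is a read followed by a write, so two threads can overwrite each other's update.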
Communication
Usual communication: HTTP/HTTPS
Short Polling
- Client checks the server repeatedly at a fixed interval
- HTTP overhead
Long Polling
- Open and hold the connection until the server responds
- Reconnects after the connection expires or a response is received
- Reduces the number of hits. Near-realtime data transfer
- Still not persistent, and still carries the overhead of HTTP
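The "hold until the server responds" behaviour of long polling can be simulated without a network. In this sketch a queue stands in for server-side events, and the `long_poll` function, the status codes chosen and the timeouts are assumptions for illustration, not a real HTTP server.

```python
import queue
import threading

events = queue.Queue()   # events the server is waiting to deliver

def long_poll(timeout_seconds):
    """Hold the request until an event arrives or the timeout expires."""
    try:
        event = events.get(timeout=timeout_seconds)
        return {"status": 200, "event": event}
    except queue.Empty:
        # Nothing happened in time: the client is expected to reconnect.
        return {"status": 204, "event": None}

# An event shows up shortly after the client starts waiting.
threading.Timer(0.05, events.put, args=["price_updated"]).start()
with_event = long_poll(timeout_seconds=2)

# No event this time, so the held request simply expires.
timed_out = long_poll(timeout_seconds=0.05)
```

Short polling would instead return immediately every time, whether or not anything had changed.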
Web-sockets
- Bidirectional channel
- Persistent connection
- Realtime data transfer
Server-sent events
- Unidirectional connection, persistent
- Server sends the data
- Leverages the HTTP, so simple to implement
- Less resource intensive than WebSockets
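Part of why server-sent events are simple to implement is that the wire format is plain text over an HTTP response served as `text/event-stream`. The sketch below formats one event; the `format_sse` helper and the `tick` event name are hypothetical, and it covers only the `event` and `data` fields.

```python
from typing import Optional

def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialize one server-sent event; a blank line ends the event."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one data field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"

simple = format_sse("hello")
named = format_sse("a\nb", event="tick")
```

A server would write such strings to the open response as events occur, and the browser's `EventSource` would parse them on the other end.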