For message to be transferred, it needs a process called ‘serialization’ and ‘deserialization’.

Data Formats
1. Textual formats
- ex) JSON, XML, CSV, …
- Pros
- Human-readable (easier to debug and test)
- Widely supported by languages and tools
- Cons
- Bigger messages (slower to transfer)
- Slower serialization and deserialization
2. Binary formats
- ex) Protobuf, Thrift, Avro, …
- Utilizes schema to serialize/deserialize
- Pros
- Smaller messages (faster to transfer and less space needed to store)
- ex) Integers are encoded into a single byte instead of several bytes
- Faster to serialize/deserialize
- Smaller messages (faster to transfer and less space needed to store)
- Cons
- Not human-readable (harder to debug and test)




Backward compatibility is important for distributed systems.


Leave a comment