Data Encoding Formats

Published by

on

For message to be transferred, it needs a process called ‘serialization’ and ‘deserialization’.


Data Formats

1. Textual formats
    • ex) JSON, XML, CSV, …
    • Pros
      • Human-readable (easier to debug and test)
      • Widely supported by languages and tools
    • Cons
      • Bigger messages (slower to transfer)
      • Slower serialization and deserialization

    2. Binary formats
    • ex) Protobuf, Thrift, Avro, …
    • Utilizes schema to serialize/deserialize
    • Pros
      • Smaller messages (faster to transfer and less space needed to store)
        • ex) Integers are encoded into a single byte instead of several bytes
      • Faster to serialize/deserialize
    • Cons
      • Not human-readable (harder to debug and test)

    Backward compatibility is important for distributed systems.


    Reference

    Leave a comment