Monday, November 7, 2016

gRPC & Protocol Buffers

What is gRPC?


gRPC is a modern, open source remote procedure call (RPC) framework that can run anywhere. It is a rework of Stubby, the single general-purpose RPC infrastructure that Google uses internally, rebuilt on top of public standards: it runs over HTTP/2, which (like SPDY and QUIC) standardizes features that Stubby had to implement itself. You can read more about gRPC principles and requirements at http://www.grpc.io/blog/principles


Why gRPC?


Pros:

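  • Static entry points: every call is dispatched on a fixed path, so the server does not need to parse parameters out of the URL, which reduces HTTP request parsing time
  • HTTP/2 multiplexing (multiple requests/responses over a single TCP connection) significantly reduces network overhead compared with HTTP/1.x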

Cons:
  • As of this writing, no browser supports gRPC natively yet

What is Protocol Buffers (PB)?


Protocol buffers is a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more. We define how we want our data to be structured once, then we can use special generated source code to easily write and read our structured data to and from a variety of data streams and using a variety of languages.

PB vs JSON:
  • PB is schema-based. In statically typed languages, the generated types let the compiler check field access at compile time; in dynamic languages (e.g. JavaScript, Python) the schema mainly reduces boilerplate in serialization and deserialization
  • PB significantly reduces data on-the-wire size compared to JSON
  • PB is harder to debug compared to JSON
  • PB is less human readable
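
To make the size difference concrete, here is a minimal sketch that hand-encodes a message in the Protocol Buffers wire format and compares it with compact JSON. The `User` message (an integer `id` in field 1 and a string `name` in field 2) and the helper names are assumptions made up for this illustration; real code would use the classes generated by `protoc` rather than encoding bytes by hand.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Field key (number << 3 | wire type 0) followed by the varint value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Field key (number << 3 | wire type 2), byte length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


def encode_user(user_id: int, name: str) -> bytes:
    """Hypothetical message: field 1 = id (integer), field 2 = name (string)."""
    return encode_varint_field(1, user_id) + encode_string_field(2, name)


pb_bytes = encode_user(150, "alice")
json_bytes = json.dumps({"id": 150, "name": "alice"}, separators=(",", ":")).encode("utf-8")

print(pb_bytes.hex(), len(pb_bytes))    # 0896011205616c696365 10
print(json_bytes.decode(), len(json_bytes))  # {"id":150,"name":"alice"} 25
```

The saving comes mostly from replacing field names with small numeric keys. The hex dump also illustrates the debugging and readability points above: without the schema, the bytes say little about what they mean.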

Why Protocol Buffers?


One of the biggest advantages of Protocol Buffers is that it has code generators for multiple languages (C++, C#, Go, Java, Python, JavaScript, PHP, Ruby, etc.). The generator creates the data structures for each language, which you then use in your own implementation.

Another big advantage is that adding new fields to a message does not break intermediate servers that do not need to inspect the data: they can simply parse the message and pass it through without needing to know about all the fields.
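
The pass-through behavior follows from the wire format: every field carries its own number and wire type, so a reader can skip over fields it does not recognize. The sketch below is a hand-written illustration, not the real protobuf runtime. It assumes a hypothetical relay that only knows field 1 (`id`) and receives a message from a newer sender that also sets field 2 (`name`). Whether an actual runtime preserves unknown fields depends on the protobuf version and language, so check that before relying on it.

```python
def decode_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def split_fields(buf: bytes):
    """Yield (field_number, raw_bytes) per field; raw_bytes includes the key."""
    pos = 0
    while pos < len(buf):
        start = pos
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:            # varint
            _, pos = decode_varint(buf, pos)
        elif wire_type == 1:          # fixed 64-bit
            pos += 8
        elif wire_type == 2:          # length-delimited (strings, bytes, nested)
            length, pos = decode_varint(buf, pos)
            pos += length
        elif wire_type == 5:          # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, buf[start:pos]


def relay(message: bytes):
    """Hypothetical relay that only understands field 1 (id).

    It decodes the field it knows and copies every field, known or not,
    into the forwarded message unchanged.
    """
    user_id = None
    forwarded = bytearray()
    for field_number, raw in split_fields(message):
        if field_number == 1:
            _, value_pos = decode_varint(raw, 0)      # skip the key
            user_id, _ = decode_varint(raw, value_pos)
        forwarded += raw
    return user_id, bytes(forwarded)


# Message from a newer sender: field 1 = 150, field 2 = "alice".
new_message = bytes.fromhex("0896011205616c696365")
user_id, forwarded = relay(new_message)
print(user_id, forwarded == new_message)
```

The relay reads the `id` it cares about and forwards the `name` field byte for byte, even though its schema has never heard of it.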


gRPC & Protocol Buffers


gRPC uses Protocol Buffers as its default mechanism for serializing structured data. HTTP/2 is a binary protocol, so a binary serialization format such as Protocol Buffers is a natural first choice over text-based formats.
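
As a rough illustration of how the two fit together: gRPC sends each serialized message in an HTTP/2 DATA frame behind a 5-byte prefix, made of a 1-byte compression flag and a 4-byte big-endian length. The sketch below only shows that prefix. The function names are made up for this example, and the payload is assumed to be an already serialized protobuf message; a real gRPC library handles all of this for you.

```python
import struct


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the compression flag and its length."""
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def unframe_message(frame: bytes):
    """Split a single length-prefixed frame back into (compressed, payload)."""
    flag, length = struct.unpack(">BI", frame[:5])
    payload = frame[5:5 + length]
    if len(payload) != length:
        raise ValueError("frame is shorter than its declared length")
    return bool(flag), payload


# Assumed payload: a 10-byte serialized protobuf message.
payload = bytes.fromhex("0896011205616c696365")
frame = frame_message(payload)
print(frame.hex())
print(unframe_message(frame) == (False, payload))
```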

Scala-specific implementation:

- Use `scalapb-runtime-grpc` as a wrapper around the `grpc-java` implementation
- Use `scalapb-compilerplugin` to compile protobuf definitions to Scala classes

Some caveats reported by others during our research:
  1. Connections can die in a somewhat unexpected way. This turned out to be caused by HTTP/2, which only allows about 1 billion streams over a single connection. Maybe not a common issue, but it hurt us because we had a few processes reaching this limit at the same time, breaking our redundancy. It's easy to work around, and I believe the grpc-java team has a plan for a fix that would make this invisible to a single channel.
  2. Mixing small/low-latency requests with large/slow requests caused very unstable latency for the low-latency requests. Our current workaround is to start two gRPC servers (still within the same Java process and sharing the same resources).
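
The "1 billion streams" figure in the first item comes from HTTP/2 stream identifiers: they are 31-bit integers, and client-initiated streams only use the odd ones. The report does not say what the workaround was, so the sketch below is just one possible approach under that assumption: a hypothetical `RotatingChannel` wrapper (not part of grpc-java or any gRPC library) that opens a fresh connection before the identifiers run out.

```python
MAX_STREAM_ID = 2**31 - 1  # HTTP/2 stream identifiers are 31 bits wide


def client_streams_per_connection() -> int:
    """Clients may only use odd stream ids: 1, 3, 5, ..., 2**31 - 1."""
    return (MAX_STREAM_ID + 1) // 2


class RotatingChannel:
    """Hypothetical wrapper that reconnects before stream ids are exhausted."""

    def __init__(self, connect, max_stream_id: int = MAX_STREAM_ID):
        self._connect = connect            # callable that opens a new connection
        self._max_stream_id = max_stream_id
        self._conn = connect()
        self._next_stream_id = 1
        self.reconnects = 0

    def next_stream(self):
        """Return (connection, stream_id), rotating the connection if needed."""
        if self._next_stream_id > self._max_stream_id:
            self._conn = self._connect()
            self._next_stream_id = 1
            self.reconnects += 1
        stream_id = self._next_stream_id
        self._next_stream_id += 2          # client-initiated ids stay odd
        return self._conn, stream_id


print(client_streams_per_connection())     # 1073741824

# Demo with a tiny limit so the rotation is visible.
channel = RotatingChannel(connect=object, max_stream_id=5)
ids = [channel.next_stream()[1] for _ in range(4)]
print(ids, channel.reconnects)             # [1, 3, 5, 1] 1
```

In a deployment with several processes, staggering the point at which each one rotates would avoid them all hitting the limit at the same time, which is the failure mode described above.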

Conclusion


While gRPC with Protocol Buffers will improve our API performance, it is still lacking in some areas. It is best used for service-to-service communication in a microservice architecture, and it is not suitable for browser usage and/or third-party communication until it gains wider support. Although there are some projects that make gRPC usable from browsers (e.g. grpc-gateway), they are still in early stages, and more research should be done before using them.

