The text discusses performance issues encountered while using gRPC for data exchange and mutual client/server state synchronization in a high-throughput system. It distinguishes unary requests from streamed requests and notes that most existing performance analysis covers only unary requests. The author describes building a new component of their message-processing pipeline using a gRPC streaming server written in Go, then recounts the performance testing and the difficulties of scaling the system to tens of thousands of messages per second.
The text also examines the sources of the high CPU usage: syscalls, goroutine scheduling, and gRPC internals. Simplifying the test case is what exposed the root cause, which lay in how the client created publishers and sent messages. The author closes with lessons learned, such as being aware of the actual cost of moving data and understanding the underlying services involved, and mentions continued exploration of gRPC/HTTP2 stack performance and of best practices for on-call processes.
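To make the point about syscall overhead and the cost of moving data concrete, here is a minimal sketch using only the Go standard library. It is not the author's code and does not use gRPC. The `countingWriter` type and the `sendAll` and `countWrites` helpers are hypothetical names introduced for illustration, and the assumption is that each `Write` on the underlying writer stands in for one syscall on a real socket. The sketch compares how many underlying writes are needed to send many small messages directly versus through a `bufio.Writer`.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter is a hypothetical stand-in for a network connection.
// Each Write call represents one trip to the kernel on a real socket.
type countingWriter struct {
	writes int
}

func (w *countingWriter) Write(p []byte) (int, error) {
	w.writes++
	return len(p), nil
}

// sendAll writes n messages of size bytes each to dst, one Write per message.
func sendAll(dst io.Writer, n, size int) error {
	msg := make([]byte, size)
	for i := 0; i < n; i++ {
		if _, err := dst.Write(msg); err != nil {
			return err
		}
	}
	return nil
}

// countWrites reports how many underlying writes it takes to send n
// messages, first unbuffered and then through a 4 KiB bufio.Writer.
func countWrites(n, size int) (unbuffered, buffered int, err error) {
	direct := &countingWriter{}
	if err = sendAll(direct, n, size); err != nil {
		return 0, 0, err
	}

	batched := &countingWriter{}
	bw := bufio.NewWriterSize(batched, 4096)
	if err = sendAll(bw, n, size); err != nil {
		return 0, 0, err
	}
	// Flush pushes any bytes still held in the buffer to the underlying writer.
	if err = bw.Flush(); err != nil {
		return 0, 0, err
	}
	return direct.writes, batched.writes, nil
}

func main() {
	unbuffered, buffered, err := countWrites(10000, 64)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("unbuffered writes: %d, buffered writes: %d\n", unbuffered, buffered)
}
```

The unbuffered path performs one underlying write per message, while the buffered path performs far fewer. Real gRPC and HTTP/2 stacks have their own buffering and flow control, so this only illustrates why per-message costs can dominate at tens of thousands of messages per second, not how gRPC itself behaves.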