"Streamed requests should have the best performance on a per-message basis."
This is actually wrong. One of the gRPC maintainers (Eric Anderson) has said:
"We don't generally recommend using streaming RPCs for higher gRPC performance. It is true that sending a message on a stream is faster than a new unary RPC, but the improvement is fixed and has higher complexity. Instead, we recommend using streaming RPCs when it would provide higher application (your code) performance or lower application complexity."
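
To make "the improvement is fixed" concrete, here is a rough toy cost model in Python. It is only a sketch: the function names and every overhead figure (50 us per unary call, 10 us per streamed message, 50 us of stream setup) are made-up assumptions for illustration, not gRPC measurements. The point it tries to show is that the time saved per message stays constant, so its share of the total shrinks as the application's own work per message grows.

```python
# Toy cost model. All overhead numbers are hypothetical assumptions,
# chosen only to illustrate the shape of the trade-off.

def unary_total_us(n_messages, work_us, per_call_overhead_us=50.0):
    """Total time if every message is sent as its own unary RPC."""
    return n_messages * (per_call_overhead_us + work_us)


def stream_total_us(n_messages, work_us,
                    per_message_overhead_us=10.0, stream_setup_us=50.0):
    """Total time if all messages are sent on one long-lived stream."""
    return stream_setup_us + n_messages * (per_message_overhead_us + work_us)


def relative_saving(n_messages, work_us):
    """Fraction of the unary total that switching to a stream would save."""
    unary = unary_total_us(n_messages, work_us)
    stream = stream_total_us(n_messages, work_us)
    return (unary - stream) / unary


# Same message count, increasing amounts of application work per message.
for work_us in (10.0, 1_000.0, 100_000.0):
    saved = unary_total_us(1000, work_us) - stream_total_us(1000, work_us)
    print(f"work={work_us:>9.0f} us  saved={saved:.0f} us  "
          f"relative={relative_saving(1000, work_us):.2%}")
```

Under these assumed numbers the absolute saving is identical in all three cases, while the relative saving falls from a large fraction to a negligible one. That matches the advice in the quote: choose streaming when it simplifies or speeds up your application logic, not as a general transport-level optimization.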