OpenCensus for Java gRPC developers

orijtech
Orijtech Developers
May 24, 2018


In this tutorial, we’ll examine how to use OpenCensus in your gRPC projects in the Java programming language to gain observability into both your server and your client! We’ll then examine how we can integrate with the OpenCensus exporters for Jaeger and for Google Stackdriver Tracing and Monitoring.

gRPC is a modern, high-performance framework for making remote procedure calls, powered by Protocol Buffers encoding. It is used in large distributed systems as well as on mobile devices. It supports bi-directional streaming and is built on HTTP/2, which allows for efficient and persistent connections. gRPC is implemented in a variety of languages, including Go, Python, C/C++, Java, Node.js, PHP and Ruby. Its homepage is https://grpc.io/

OpenCensus is a modern observability framework for distributed tracing and monitoring across microservices and monoliths alike. It is polyglot in nature, that is, it is implemented in a variety of languages: Go, Python, Java, C/C++, Node.js, PHP, Ruby, Erlang and C# (coming soon). Like gRPC, it is accessible and usable in a variety of environments, ranging from mobile devices to general purpose computers and data centres, allowing for distributed computing. Its homepage is https://opencensus.io/

Java is a modern language that is beloved across the software industry, powering all sorts of devices, from mobile devices, watches and microcontrollers to general purpose computers and data centres. It has been used in the industry since the mid-90s and is one of the most influential languages in computing history, with cross-platform support and its popular promise of “Write Once, Run Anywhere” (WORA).

With the above introductions, you might have noticed the themes here: high performance, cross-platform support, reliability, efficiency, modernity and productivity. These shared traits make the trio a great combination.

In the past, the efficiency of computing services has been measured by response time, resource usage and robustness, which are mostly extrinsic observations, observable from the outside. OpenCensus provides a new paradigm for examining a service. It provides vendor-agnostic, low-latency observability across a wide range of services, allowing you to introspect your services so that you can know when things go wrong and trace back what happened.

In this tutorial, we examine a company’s revenue-generating service that’s gRPC based. For simplicity, it capitalizes the content sent in and sends it back. The company’s intention is to generate revenue by giving API access to this magical service.

To get started, if you are not familiar with gRPC in Java, please take a look at https://grpc.io/docs/quickstart/java.html

Our service takes in a payload with bytes, then capitalizes them on the server and sends them back to the caller.
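The core of that behaviour can be sketched in plain Java, without any gRPC plumbing. This is only an illustration: the class name CapitalizeSketch and the assumption that the payload bytes are UTF-8 text are ours, not taken from the tutorial’s CapitalizeServer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: only the transformation, not the tutorial's gRPC server.
class CapitalizeSketch {
    // Assumption: the payload bytes are UTF-8 encoded text.
    static byte[] capitalize(byte[] payload) {
        String text = new String(payload, StandardCharsets.UTF_8);
        // Locale.ROOT keeps the result independent of the machine's default locale.
        return text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] out = capitalize("hello, gRPC".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

In a real server, a method like this would be called from the generated service handler, with the request bytes going in and the response bytes coming out.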

To use gRPC, we’ll generate the stubs, but first let’s set up our project structure: create the directory src/main/proto and in there save the service definition as defs.proto

For this project, we’ll be using Apache Maven to build https://maven.apache.org/

To use Apache Maven, we’ll need to define a pom.xml file such as this one in the root of the project, i.e. pom.xml should be at the same level as src

The next step will be to generate the stubs, which we shall do with

mvn install

Once that has run successfully, let’s add the actual server implementation. Please create the following directory: src/main/java/io/ocgrpc/capitalize

Thereafter, create this file

Plain server

src/main/java/io/ocgrpc/capitalize/CapitalizeServer.java


and then for the client in file

Plain client

src/main/java/io/ocgrpc/capitalize/CapitalizeClient.java


And then to run it

mvn install

Run the server

mvn exec:java -Dexec.mainClass=io.ocgrpc.capitalize.CapitalizeServer

Running server

Run the client too

mvn exec:java -Dexec.mainClass=io.ocgrpc.capitalize.CapitalizeClient

Running interactive client

And with that we are open for business!! Time for our customers to rain that money in $$$

Open for business

Aftermath

It’s been 1 hour since launch. The news of our new service is blowing up the entire internet, everyone wants a piece of the API, and the bank is breaking from people throwing money at us to access the API. Euphoria engulfs the whole team, all our hard work has paid off, we seem invincible and the satisfaction of popularity warms everyone. We then head home.

It hits 3AM and team member A, who is too excited, wakes up their entire family to give them a demo of the service. However, all of a sudden they notice that the response comes back after a couple of seconds. A couple of seconds, they wonder? In testing, we only saw a maximum of 1ms. Something must be wrong!! They keep trying for a couple of minutes and the response time seems to be getting worse. Alas, everyone gets paged. When did the service time start degrading? Why did it degrade? What exactly is wrong? How can we mitigate this problem? Throw more servers at it? How many servers? How do we tell when we’ve fixed the problem? We don’t have any observability to see what’s going wrong with our systems. Did we fly too close to the sun?

Who knows?

In comes OpenCensus: OpenCensus is a single distribution of libraries for modern distributed tracing and monitoring. OpenCensus can help answer most of those questions. By “most of those questions”, I mean the questions about observability such as: When did the latency increase? Why? How? By how much? What part of the system is the slowest? How can we optimize and assert successful changes?
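To make “by how much?” concrete, here is a minimal sketch of the idea behind a latency distribution: every observed latency is counted into a bucket, so a shift from the 1ms range into the seconds range becomes visible. The class LatencyBucketsSketch, the bucket bounds and the inclusive upper-bound convention are all assumptions for illustration. This is not the OpenCensus stats API, which handles recording and aggregation for you.

```java
import java.util.Arrays;

// Hypothetical illustration of bucketed latency counts; not an OpenCensus class.
class LatencyBucketsSketch {
    private final long[] boundsMs; // upper bounds (inclusive here, by assumption)
    private final long[] counts;   // one extra slot at the end for overflow

    LatencyBucketsSketch(long... boundsMs) {
        this.boundsMs = boundsMs.clone();
        this.counts = new long[boundsMs.length + 1];
    }

    // Count one observed latency into the first bucket whose bound it fits under.
    void record(long latencyMs) {
        int i = 0;
        while (i < boundsMs.length && latencyMs > boundsMs[i]) {
            i++;
        }
        counts[i]++;
    }

    long[] counts() {
        return counts.clone();
    }

    public static void main(String[] args) {
        // Bounds chosen only for this example: 1ms, 10ms, 100ms, 1s.
        LatencyBucketsSketch h = new LatencyBucketsSketch(1, 10, 100, 1000);
        for (long ms : new long[] {1, 1, 3, 250, 2400}) {
            h.record(ms);
        }
        System.out.println(Arrays.toString(h.counts())); // prints [2, 1, 0, 1, 1]
    }
}
```

Comparing such counts over time is what lets a team say when the latency shifted and by how much, instead of relying on a 3AM demo.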

OpenCensus is also implemented in Java, is developed entirely in the open, adds very little latency to your application, and is integrated into both HTTP and gRPC transports.

With OpenCensus, to examine what’s going wrong with our system, we’ll apply a minimal diff and then add the exporters that our different teams use to examine the traces and metrics collected from our applications. Our teams use Jaeger, Stackdriver Tracing and Stackdriver Monitoring. OpenCensus exports to a wide range of backends such as Prometheus, AWS X-Ray, Stackdriver Tracing and Monitoring, Jaeger, Zipkin, Instana, SignalFX and many more.

Observability by instrumenting the server

OpenCensus is already integrated with gRPC out of the box, so to collect statistics and traces from a gRPC server we just need to turn on the integration with a few imports

pom.xml diff
CapitalizeServer.java diff

And with the diffs above we now get

New pom.xml

And the fully instrumented server’s source code

src/main/java/io/ocgrpc/capitalize/CapitalizeServer.java

and the same for the client which becomes

src/main/java/io/ocgrpc/capitalize/CapitalizeClient.java

Examining that “observability”

With Jaeger:

If you don’t already have it, please check out how to install and use Jaeger at https://www.jaegertracing.io/docs/getting-started/#all-in-one-docker-image

With traces exported to the Jaeger collector endpoint http://localhost:14268/api/traces, on visiting the Jaeger UI (by default at http://localhost:16686)

we get back

On Jaeger
drilling down

On Stackdriver Tracing

Good times
Plausibly system overload with intense capitalization
Plausibly ingress network congestion
Plausibly system overload along with ingress network congestion

On Stackdriver Monitoring

Client latency
Server latency

Engineers can now add alerts with Stackdriver Monitoring https://cloud.google.com/monitoring/alerts/ or they can use Prometheus alerting https://prometheus.io/docs/alerting/overview/ which is also supported by OpenCensus https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/stats/prometheus or even SignalFX https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/stats/signalfx and many more.

The Java implementation is available at https://github.com/census-instrumentation/opencensus-java or for easier integration please see the Javadoc at https://www.javadoc.io/doc/io.opencensus/opencensus-api/

Various teams can simultaneously examine system behaviour, be it traces or metrics, with low latency. An exercise left to the reader is to see how you can add tracing to streaming functions with gRPC. A hint is that for a trace to be exported, it has to be ended. In order for tracing with streaming to work, you can start and end spans after sending each response or in batches, just like we’ve done here. Please try it out and share your response ;)
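As a nudge for that exercise, the hint can be modelled in plain Java. ToySpan below is a hypothetical stand-in written for this sketch, not the OpenCensus Span class: it only captures the rule that a span reaches the exporter once it has been ended, which is why each streamed response gets its own short-lived span.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StreamingSpansSketch {
    // Hypothetical stand-in for a tracing span; NOT the OpenCensus Span class.
    static final class ToySpan {
        private final String name;
        private final List<String> exporter;
        private boolean ended = false;

        ToySpan(String name, List<String> exporter) {
            this.name = name;
            this.exporter = exporter;
        }

        // Only an ended span is handed to the exporter; ending twice is a no-op.
        void end() {
            if (ended) {
                return;
            }
            ended = true;
            exporter.add(name);
        }
    }

    // Starts and ends one span per response, returning the names that got exported.
    static List<String> streamResponses(List<String> responses) {
        List<String> exported = new ArrayList<>();
        List<String> wire = new ArrayList<>(); // stands in for the stream to the caller
        for (int i = 0; i < responses.size(); i++) {
            ToySpan span = new ToySpan("response-" + i, exported);
            try {
                wire.add(responses.get(i));
            } finally {
                span.end(); // end right after sending so the span is exported promptly
            }
        }
        return exported;
    }

    public static void main(String[] args) {
        System.out.println(streamResponses(Arrays.asList("A", "B", "C")));
    }
}
```

If a single span covered the whole stream instead, nothing would be exported until the stream closed, which for a long-lived stream could be a very long time.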

With that we are off to the races

Off to the races with OpenCensus: Ready, Set and Go!!

Thank you for reading this far. We hope this tutorial was useful. You can find all the code in this tutorial at https://github.com/orijtech/opencensus-for-grpc-java-developers

Please feel free to check out the OpenCensus community at https://opencensus.io, send us feedback, instrument your backends and share with your friends and teams!

This tutorial is the first of many to come, in which we’ll use different languages and different transports, and provide more samples.

Thank you again.

Kind regards,

Emmanuel T Odeke
