Cloud Spanner, instrumented by OpenCensus and exported to Stackdriver

orijtech
Mar 8, 2018

In this post, we’ll explore the power of OpenCensus’ exporters, using the Google Cloud Spanner package for both the Go and Java programming languages.

The Cloud Spanner package is already instrumented with OpenCensus. Our mission is to add a few lines to already-running code to harness that instrumentation and extract metrics and traces. We’ll then export them to Stackdriver for recording, visualization, root cause analysis and other investigation, a typical task for Application Performance Management teams.

The example is an excerpt of a microservice from a fictional athletics startup, ZeuSports. It is their database microservice (dBmus), which is in charge of adding newly subscribed players to the app. In this demo we’ll visualize some metrics and traces to see how long Cloud Spanner calls take, harnessing the instrumentation already in the Cloud Spanner package by way of the respective exporters.

Requirements:

  • Go 1.8 or later, or Java SE 7 or later
  • A project on a cloud provider that supports Stackdriver Monitoring and Tracing; we’ll use Google Cloud Platform for this example
  • Stackdriver Monitoring and Tracing enabled on that project

Step 1: Enable the Stackdriver APIs

We’ll assume you have an account on a cloud provider that supports Stackdriver Monitoring and Tracing; for this example we’ll use Google Cloud Platform. Let’s first search for the APIs, then enable them on our project if they are not yet enabled.

[Screenshot: search for Stackdriver APIs]
[Screenshot: Enable Stackdriver Monitoring API]
[Screenshot: Stackdriver Monitoring API enabled]
[Screenshot: Enable Stackdriver Trace API]
[Screenshot: Stackdriver Trace API enabled]

Step 2a: Operational details

Our microservice receives new players’ information and adds them to a Cloud Spanner table, “players”, whose schema looks like this:

[Screenshot: Players table schema]

Step 2b: Method of attack

In order for the instrumented Cloud Spanner package to export traces and metrics to Stackdriver, we need to:

a) Create an OpenCensus Stackdriver Monitoring/Stats exporter instance

b) Create an OpenCensus Stackdriver Trace exporter instance

c) Create the Cloud Spanner client

d) Enable a trace sampler to capture a percentage of traces. Please note that for this demo we always sample, which is a very high rate. More realistically, tracing 1 in 10,000 requests might suffice

e) Subscribe to the gRPC metrics that we would like to track. In this case we’ll subscribe to: round-trip latency, per-minute error count, and number of requests
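
To make the sampling trade-off in step d) concrete, here is a small standard-library-only sketch of how a probability sampler can work. It is not the OpenCensus implementation: the names sampler and probabilitySampler are made up for illustration, and the threshold-on-trace-ID approach is an assumption about one reasonable design. In the OpenCensus Go library the corresponding samplers are, to my knowledge, trace.AlwaysSample() and trace.ProbabilitySampler(fraction).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sampler decides whether a trace with the given 16-byte ID is recorded.
// Hypothetical type for this sketch; not part of any library.
type sampler func(traceID [16]byte) bool

// probabilitySampler keeps roughly `fraction` of all traces by comparing
// the upper 63 bits of the trace ID against a threshold. Because the
// decision depends only on the trace ID, every service that sees the same
// trace makes the same decision.
func probabilitySampler(fraction float64) sampler {
	if fraction >= 1 {
		return func([16]byte) bool { return true } // "always sample", as in the demo
	}
	if fraction <= 0 {
		return func([16]byte) bool { return false }
	}
	upperBound := uint64(fraction * (1 << 63))
	return func(traceID [16]byte) bool {
		x := binary.BigEndian.Uint64(traceID[0:8]) >> 1
		return x < upperBound
	}
}

func main() {
	always := probabilitySampler(1)      // what this demo does
	oneIn10k := probabilitySampler(1e-4) // closer to a production setting

	var low, high [16]byte // low is all zeros
	for i := range high {
		high[i] = 0xFF
	}
	fmt.Println(always(high), oneIn10k(low), oneIn10k(high)) // prints: true true false
}
```

With real, randomly generated trace IDs, the 1e-4 sampler would keep about one trace in every 10,000, which keeps export volume and cost low on a busy service.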

Step 3: Show me the code

The Go code:

[Embedded code: the entirety of the source code in Go]

The equivalent Java code is:

[Embedded code: the entirety of the source code in Java]

Step 3.X: Enable Application Default Credentials

Please make sure to enable Application Default Credentials for authentication. Please visit https://developers.google.com/identity/protocols/application-default-credentials if you haven’t yet enabled them.

Step 4: Get the dependencies and run it!

Let’s go into our terminal and do something like this:

For Go:

[Embedded snippet: make the directory go gettable/buildable]

Then, we’ll paste the main.go file from above and afterwards run:

[Embedded snippet: go get dependencies, build and then run the binary]

Note: Just in case you didn’t properly set up your Application Default Credentials but have some test credentials, do this instead:

[Embedded snippet: go get dependencies, build the binary but also include the target Google Credentials source]

For Java:

Please add the pom.xml file below:

[Embedded snippet: pom.xml]
[Embedded snippet: run it]

Step 5: Harness the power and visualize!

As soon as the code ran, it started exporting to Stackdriver, so let’s switch back to our browser and open up Stackdriver Trace at https://console.cloud.google.com/traces/traces to see the output.

[Screenshot: overview of the trace list]

Notice in the span list that the latencies of create-players vary; as the number of operations increases and existing sessions are reused, calls get faster. For example, notice above that CreateSession took ~212ms, but as we kept reusing that expensively set-up session, things got a lot faster, down to ~8ms! You can learn more about the performance benefits of a session cache at https://cloud.google.com/spanner/docs/sessions#performance_benefits_of_a_session_cache
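
To see why reuse pays off, here is a toy session cache written with only the Go standard library. It is a sketch of the general idea, not how the Cloud Spanner client manages sessions internally; the session and sessionCache types are made up for illustration, and the expensive CreateSession call is reduced to a counter.

```go
package main

import (
	"fmt"
	"sync"
)

// session stands in for a Cloud Spanner session (hypothetical type).
type session struct{ id int }

// sessionCache hands out an idle session when one exists and only creates
// a new one otherwise.
type sessionCache struct {
	mu      sync.Mutex
	idle    []*session
	created int // number of (expensive) create calls so far
}

// acquire returns an idle session, or creates one if none is available.
func (c *sessionCache) acquire() *session {
	c.mu.Lock()
	defer c.mu.Unlock()
	if n := len(c.idle); n > 0 {
		s := c.idle[n-1]
		c.idle = c.idle[:n-1]
		return s
	}
	c.created++ // this is where the slow CreateSession call would happen
	return &session{id: c.created}
}

// release puts a session back so that later calls can reuse it.
func (c *sessionCache) release(s *session) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.idle = append(c.idle, s)
}

func main() {
	cache := &sessionCache{}
	for i := 0; i < 100; i++ {
		s := cache.acquire()
		// ... run a transaction using s ...
		cache.release(s)
	}
	fmt.Println("sessions created:", cache.created) // prints: sessions created: 1
}
```

One hundred sequential operations pay the creation cost once; only work that overlaps in time forces additional sessions to be created.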

[Screenshot: create-players trace]

Drilling down to a particular data point in time yields a trace with a root span and child spans, shown below:

[Screenshot: First child span, ReadWriteTransaction]

Drilling further down into the child span:

[Screenshot: Last child span, Commit]

It is worth noting that the above graphs were produced by multiple runs of the sample code. In a long-running server application, however, where the Cloud Spanner client is created just once and used for the lifetime of the server, we expect to see very few CreateSession calls after the initial warm-up. Typical usage of the Cloud Spanner client in such an application would yield a graph like this:

[Screenshot: querying]
[Screenshot: ReadWrite transaction with CreateSession dominating]

And now to view the collected metrics at https://app.google.stackdriver.com/metrics-explorer?project=census-demos:

[Screenshot: Visualizing the number of BeginTransaction operations]
[Screenshot: request latencies]

As you can see from the example above, in fewer than 42 lines ZeuSports went from operating their backend blindly (they couldn’t tell how long Cloud Spanner operations took or what the latencies were) to having insight into their database operations, with metrics and traces in near-realtime, using OpenCensus’ packages in both the Go and Java programming languages. The power of visualization helps teams perform root cause analyses and post-mortems, bisect bad changes, and figure out what needs to be optimized. The OpenCensus team has built, and continues to actively build, client libraries in other languages as well as more exporters such as Prometheus, Zipkin, SignalFx, and many others, with the purpose of democratizing good instrumentation, cheaply, scalably and nicely, without burdening your teams with maintenance.

Thank you for following this far; hopefully this helps you get started with tracing and monitoring for Cloud Spanner in your backend. As I mentioned, the OpenCensus team has been working hard to bring us even more exporters. Please check out http://opencensus.io/ and get involved: let the team know what you need, share examples of your instrumented backends, and contribute to the code, which is developed entirely in the open on GitHub at https://github.com/census-instrumentation

Big thanks to Jaana Burcu Dogan for snippets of the original Go code sample and for putting up instructions on the wiki for step-by-step usage;
Ramon Nogueira, Pritam Shah, Gopi Palaniappan, Jonathan Amsterdam, Vikas Kedia, Damian Reeves, Di Xiao and Bogdan Drutu for the reviews and also to everyone on the OpenCensus team for all their hard work!

The code in this tutorial/example is heavily borrowed from https://github.com/GoogleCloudPlatform/golang-samples/blob/master/opencensus/opencensus_spanner_quickstart/main.go and the wiki https://github.com/GoogleCloudPlatform/google-cloud-go/wiki/OpenCensus as well as from https://github.com/GoogleCloudPlatform/java-docs-samples/tree/master/spanner/cloud-client

By Emmanuel T Odeke, Orijtech, Inc.


Written by orijtech

Observability and infrastructure for high performance systems and the cloud.