
Prometheus: Monitoring at SoundCloud

developers.soundcloud.com

In previous blog posts, we discussed how SoundCloud has been moving towards a microservice architecture. Soon we had hundreds of services, with many thousand instances running and changing at the same time. With our existing monitoring set-up, mostly based on StatsD and Graphite, we ran into a number of serious limitations. What we really needed was a system with the following features:

- A multi-dimensional data model, so that data can be sliced and diced at will, along dimensions like instance, service, endpoint, and method.
- Operational simplicity, so that you can spin up a monitoring server where and when you want, even on your local workstation, without setting up a distributed storage backend or reconfiguring the world.
- Scalable data collection and decentralized architecture, so that you can reliably monitor the many instances of your services, and independent teams can set up independent monitoring servers.
- Finally, a powerful query language that leverages the data model for meaningful alerting (including easy silencing) and graphing (for dashboards and for ad-hoc exploration).

All of these features existed in various systems. However, we could not identify a system that combined them all until a colleague started an ambitious pet project in 2012 that aimed to do so. Shortly thereafter, we decided to develop it into SoundCloud’s monitoring system: Prometheus was born.
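
To make the idea of a multi-dimensional data model more concrete, here is a minimal sketch in plain Python. It is not Prometheus code and does not use its query language. The sample values and the helper names `select` and `sum_by` are hypothetical, chosen only to show what slicing along dimensions such as service, instance, endpoint, and method might look like.

```python
from collections import defaultdict

# Hypothetical samples: each one is a (labels, value) pair. The label names
# mirror the dimensions mentioned in the text; the numbers are made up.
SAMPLES = [
    ({"service": "api", "instance": "a1", "endpoint": "/tracks", "method": "GET"}, 120),
    ({"service": "api", "instance": "a2", "endpoint": "/tracks", "method": "GET"}, 80),
    ({"service": "api", "instance": "a1", "endpoint": "/tracks", "method": "POST"}, 15),
    ({"service": "search", "instance": "s1", "endpoint": "/query", "method": "GET"}, 200),
]


def select(samples, **matchers):
    """Keep only the samples whose labels equal every given matcher."""
    return [
        (labels, value)
        for labels, value in samples
        if all(labels.get(name) == want for name, want in matchers.items())
    ]


def sum_by(samples, *dims):
    """Sum sample values, grouped by the chosen label dimensions."""
    totals = defaultdict(int)
    for labels, value in samples:
        key = tuple(labels.get(dim) for dim in dims)
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    # Total per service, ignoring every other dimension.
    print(sum_by(SAMPLES, "service"))
    # Narrow to one service first, then break the total down by method.
    print(sum_by(select(SAMPLES, service="api"), "method"))
```

Because every sample carries its labels, the same data can be grouped by service, by method, or by any other combination without deciding on a naming hierarchy in advance.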

