I recently rebuilt the monitoring environment for the servers I manage. In this post I describe which technologies I chose and how they fit together to form a monitoring stack that, once set up, gives you quick access to relevant metrics and logs and doesn’t burden you much with operations.

A couple of requirements were important to me:

- Everything should be self-hosted, and I wanted a central monitoring server that collects data from all monitored hosts.
- Since most servers I manage are internal and not reachable from the outside, pushing metrics and logs is more practical than pulling (partially allowed outbound access is a given). Pushing also means I don’t have to maintain a list of servers on the monitoring host.
- Distributing collectors to the servers to be monitored should be easy.
- Servers should authenticate to push logs and metrics, and it should be easy for me to onboard new servers or revoke access. Managing credentials/PKI should cost me only a limited amount of overhead.
- Monitoring dashboards and exploratory tools for logs need only be accessible to me, without much compartmentalization, but should obviously also be properly protected.
- When certain metrics go out of their normal range or relevant errors pop up, I should get alerts. The notifications should arrive as emails in my mailbox and as notifications on my phone (iOS).

I operate at a relatively small scale, so keep this in mind :-)