Observability for applications and infrastructure


Observability frameworks today can capture metrics from a wide range of sources. They are often used only by data centre infrastructure teams, because their agents integrate with common network monitoring protocols such as SNMP, or actively gather data using network tools.

ServerSage builds on Prometheus, perhaps the most powerful open source observability framework, and extends its scope and reach to application health. ServerSage is a comprehensive framework for monitoring application health and progress. It supports the philosophy that observability is an integral part of an application’s design, and must be built in while the source code is being written. The architecture of the metrics used to monitor an application's health under full load is as important as the database schema of its data store, and deserves the same attention at the design stage. The second philosophy which ServerSage implements is that there is no separation between application and infrastructure: it is all one system. Splitting the two layers into silos dramatically weakens both.

ServerSage therefore uses the tools and strategies perfected in infrastructure monitoring to monitor application health in real time. With ServerSage embedded in your application stack, you have a light, fast stethoscope for examining that stack in real time.

Special reports and dashboards compare recent metrics with long-term averages and help identify trends without the need to gather baseline data separately. The usual real-time alerts based on low and high watermarks are all supported.

Technical Details

Built by extending the Prometheus framework, without any change to the core Prometheus code.

Combines metrics of business operations, application software layers, underlying components like databases and application servers, network, operating systems, and hardware all on a common timeline and with a common toolset to give you unprecedented insights into overall system health.

You can correlate holiday sales rush metrics with database and network choke metrics to pinpoint exactly which resources are over-stretched due to which external factors.

Agents (called “exporters”) embedded in application code capture metrics in real time with extremely low overhead, and yet give you the richness of multi-dimensional labels on each data point.
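To make the idea of multi-dimensional labels concrete, here is a minimal sketch of a labelled counter that renders itself in the Prometheus text exposition format. It is written in Python only for brevity and is not the ServerSage API: the class `LabelledCounter`, the metric name `orders_processed_total`, and the labels `region` and `status` are all hypothetical.

```python
import threading
from collections import defaultdict


class LabelledCounter:
    """A counter that keeps one running total per combination of label values.

    Hypothetical illustration only; this is not a ServerSage or Prometheus class.
    """

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)
        self._lock = threading.Lock()  # counters may be updated from many threads

    def inc(self, amount=1.0, **labels):
        """Add `amount` to the series identified by the given label values."""
        if set(labels) != set(self.label_names):
            raise ValueError(
                f"expected labels {self.label_names}, got {tuple(labels)}"
            )
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(labels[n] for n in self.label_names)
        with self._lock:
            self._values[key] += amount

    def render(self):
        """Return all series in the Prometheus text exposition format.

        Label values are written as-is; escaping of quotes, backslashes and
        newlines is omitted to keep the sketch short.
        """
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            items = sorted(self._values.items())
        for key, value in items:
            pairs = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{pairs}}} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    orders = LabelledCounter(
        "orders_processed_total", "Orders processed.", ["region", "status"]
    )
    orders.inc(region="eu", status="ok")
    orders.inc(region="us", status="failed")
    print(orders.render(), end="")
```

Each distinct combination of label values becomes its own time series, which is what allows one metric to be sliced by region, status, or both at query time.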

A rich reporting interface allows easy creation of custom dashboards.

Alerts when current readings breach watermarks.

Alerts when short-term metrics fall outside long-term averages by pre-defined limits.
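The two kinds of alert above can be sketched as a single check. This is an illustrative Python sketch under stated assumptions, not ServerSage code: the class `DeviationMonitor`, its parameters, and the alert names are hypothetical, and "pre-defined limits" is interpreted here as a maximum relative difference between the short-term and long-term averages. In a Prometheus-based deployment, checks like these would normally be written as alerting rules evaluated by the server rather than as application code.

```python
from collections import deque
from statistics import fmean


class DeviationMonitor:
    """Checks each reading against watermarks and against a long-term average.

    Hypothetical illustration only; names and thresholds are assumptions.
    """

    def __init__(self, low, high, short_window, long_window, max_ratio):
        if not 0 < short_window <= long_window:
            raise ValueError("need 0 < short_window <= long_window")
        self.low = low
        self.high = high
        self.short_window = short_window
        self.max_ratio = max_ratio
        # Only the most recent `long_window` readings are kept.
        self.samples = deque(maxlen=long_window)

    def observe(self, value):
        """Record a reading and return the list of alerts it triggers."""
        self.samples.append(value)
        alerts = []

        # Watermark alerts look at the current reading only.
        if value < self.low:
            alerts.append("below_low_watermark")
        elif value > self.high:
            alerts.append("above_high_watermark")

        # Deviation alerts need a full long-term window to compare against.
        if len(self.samples) == self.samples.maxlen:
            long_avg = fmean(self.samples)
            short_avg = fmean(list(self.samples)[-self.short_window:])
            if long_avg != 0:
                deviation = abs(short_avg - long_avg) / abs(long_avg)
                if deviation > self.max_ratio:
                    alerts.append("short_term_deviation")
        return alerts


if __name__ == "__main__":
    monitor = DeviationMonitor(
        low=10, high=100, short_window=2, long_window=6, max_ratio=0.25
    )
    for reading in [50, 50, 50, 50, 50, 120]:
        print(reading, monitor.observe(reading))
```

Because the long-term average is computed from the same stream of readings, no separate baseline has to be collected before the deviation check becomes useful.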

Client libraries in Java and Go, which are linked with application code to capture metrics.

A code generator that takes metric specs and generates source code for the functions called from application code, with stronger error checking to catch instrumentation mistakes early.
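The idea of generating recording functions from a metric spec can be sketched as follows. This is a hypothetical Python illustration, not the ServerSage generator (whose spec format and Java/Go output are not described here): the spec layout, the function `generate_recorder`, and the generated `record_...` function are all assumptions. It shows one way generated code can give stronger error checking: every label becomes a required keyword argument, so a misspelt or missing label fails immediately instead of silently creating a new series.

```python
import keyword

# Hypothetical metric spec; the real spec format is not given in this document.
SPEC = {
    "name": "orders_processed_total",
    "help": "Orders processed.",
    "labels": ["region", "status"],
}


def generate_recorder(spec):
    """Return Python source for a function that records one metric.

    The generated function takes a dict-like `store` plus one required
    keyword argument per label, and adds `amount` to the matching series.
    """
    name = spec["name"]
    labels = list(spec["labels"])

    # Reject anything that could not be used safely as a Python identifier.
    for ident in [name, *labels]:
        if not ident.isidentifier() or keyword.iskeyword(ident):
            raise ValueError(f"invalid identifier in spec: {ident!r}")

    params = ", ".join([f"{label}: str" for label in labels] + ["amount: float = 1.0"])
    key = "(" + ", ".join([repr(name)] + labels) + ",)"
    docstring = repr(spec["help"])  # repr() takes care of quoting and escaping

    return (
        f"def record_{name}(store, *, {params}) -> None:\n"
        f"    {docstring}\n"
        f"    if amount < 0:\n"
        f"        raise ValueError('amount must be non-negative')\n"
        f"    key = {key}\n"
        f"    store[key] = store.get(key, 0.0) + amount\n"
    )


if __name__ == "__main__":
    source = generate_recorder(SPEC)
    print(source)

    # Load the generated function and call it as application code would.
    namespace = {}
    exec(source, namespace)
    store = {}
    namespace["record_orders_processed_total"](store, region="eu", status="ok")
    print(store)
```

A generator targeting a statically typed language could move the same checks to compile time, since a call with a wrong or missing label would then not compile at all.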