Metrics

GoServe provides production-grade monitoring via Prometheus. All metrics are exposed in the standard Prometheus text format at the /metrics endpoint.

Configuration

The metrics endpoint is enabled by default and listens on the main server address (default :8080).
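To confirm the endpoint is live (assuming the default :8080 address), you can fetch it directly:

```shell
# Fetch the exposition output and show only the GoServe-specific series.
curl -s http://localhost:8080/metrics | grep '^goserve_'
```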

To scrape GoServe with Prometheus, add the following to your prometheus.yml:

scrape_configs:
  - job_name: 'goserve'
    static_configs:
      - targets: ['localhost:8080']

Available Metrics

HTTP Metrics

These metrics track the health and performance of the GoServe REST API.

| Metric Name | Type | Labels | Description |
| --- | --- | --- | --- |
| `goserve_http_requests_total` | Counter | `method`, `status`, `path` | Total number of HTTP requests processed. |
| `goserve_http_request_duration_seconds` | Histogram | `method`, `path` | Latency distribution of HTTP requests. |

Inference Metrics

These metrics provide detailed visibility into the performance of your machine learning models.

| Metric Name | Type | Labels | Description |
| --- | --- | --- | --- |
| `goserve_inference_duration_seconds` | Histogram | `model_name` | Time spent executing the ONNX model (excluding HTTP overhead). |
| `goserve_inference_errors_total` | Counter | `model_name`, `error_type` | Count of failed inference attempts. |

Querying Examples (PromQL)

Average Inference Latency (Last 5 min)

rate(goserve_inference_duration_seconds_sum[5m]) / rate(goserve_inference_duration_seconds_count[5m])
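The ratio above gives the mean, which can hide tail latency. Because Prometheus histograms also export per-bucket series (`_bucket` with an `le` label), you can estimate a percentile instead; for example, 95th-percentile inference latency per model:

```
histogram_quantile(0.95, sum(rate(goserve_inference_duration_seconds_bucket[5m])) by (le, model_name))
```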

Request Volume by Status Code

sum(rate(goserve_http_requests_total[5m])) by (status)
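The `status` label also lets you compute a server-error ratio. A sketch, assuming `status` carries the three-digit HTTP code:

```
sum(rate(goserve_http_requests_total{status=~"5.."}[5m]))
  /
sum(rate(goserve_http_requests_total[5m]))
```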

Error Rate per Model

sum(rate(goserve_inference_errors_total[1m])) by (model_name)
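Queries like this can be used directly in Prometheus alerting rules. A sketch of a rule file; the `HighInferenceErrorRate` name and the 0.1 errors/second threshold are illustrative, not part of GoServe:

```yaml
groups:
  - name: goserve-alerts
    rules:
      - alert: HighInferenceErrorRate
        expr: sum(rate(goserve_inference_errors_total[5m])) by (model_name) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Model {{ $labels.model_name }} is failing inference requests"
```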

Runtime Metrics

GoServe also exposes the standard Go runtime and process metrics, including:

- go_goroutines: Number of active goroutines.
- go_memstats_alloc_bytes: Current heap memory usage.
- process_cpu_seconds_total: Total CPU time consumed by the GoServe process.