# Welcome to GoServe
GoServe (goserve.run) is a lightweight, high-performance machine learning inference server built in Go with ONNX Runtime.
It is designed for production environments where efficiency, reliability, and ease of use are paramount. By leveraging Go's native concurrency model and the performance of the ONNX Runtime C library, GoServe provides a robust alternative to Python-based inference servers.
## Key Goals
- Simplicity: Drop-in binary with minimal configuration.
- Performance: Optimized for low latency and high throughput.
- Efficiency: Low CPU and memory overhead.
- Reliability: Built-in health checks and structured logging.
## Why GoServe?
While Python is excellent for model development and training, Go is often a better choice for high-scale production serving. GoServe bridges this gap by letting you take your ONNX models (exported from PyTorch, TensorFlow, scikit-learn, etc.) and serve them with a single high-performance Go binary.
## Next Steps
- Check out the Quick Start to get running in minutes.
- Explore the API Reference to learn how to interact with the server.
- See the Configuration Guide for advanced setup.