# Welcome to GoServe
GoServe (goserve.run) is a lightweight, high-performance machine learning inference server built in Go with ONNX Runtime.
It is designed for production environments where efficiency, reliability, and ease of use are paramount. By leveraging Go's native concurrency model and the performance of the ONNX Runtime C library, GoServe provides a robust alternative to Python-based inference servers.
## Key Goals
- Simplicity: Drop-in binary with minimal configuration.
- Performance: Optimized for low latency and high throughput.
- Efficiency: Low CPU and memory overhead.
- Reliability: Built-in health checks and structured logging.
## Why GoServe?
While Python is excellent for model development and training, Go is often a better choice for high-scale production serving. GoServe bridges this gap by letting you take your ONNX models (exported from PyTorch, TensorFlow, scikit-learn, etc.) and serve them with a single high-performance Go binary.
## Next Steps
- Check out the Quick Start to get running in minutes.
- Explore the API Reference to learn how to interact with the server.
- See the Configuration Guide for advanced setup.