Frequently Asked Questions

Common questions about GPUX.

General

GPUX is a Docker-like GPU runtime for ML inference that provides universal GPU compatibility.

GPUX focuses on simplicity and universal GPU compatibility, while TorchServe and Triton are more complex and NVIDIA-focused.

GPUX is ready for production use. It is built on mature technologies (ONNX Runtime).

GPUX requires Python 3.11 or higher.

No GPU is required. GPUX works on CPU-only machines, falling back to the CPU automatically.

Performance depends on your hardware and model. Use gpux run --benchmark to measure.

TensorRT is supported. GPUX uses it automatically when it is available on NVIDIA GPUs.
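To illustrate how "use TensorRT when available, otherwise fall back to CPU" can work on top of ONNX Runtime, here is a minimal sketch. It is not GPUX's actual implementation: the function `select_providers` and the constant `PREFERRED_ORDER` are hypothetical names chosen for this example, and the preference order is an assumption. Only the execution provider names are the real ONNX Runtime identifiers.

```python
# Hypothetical sketch of provider selection; GPUX's real logic may differ.
# The strings below are the execution provider names used by ONNX Runtime.
PREFERRED_ORDER = [
    "TensorrtExecutionProvider",  # NVIDIA GPUs with TensorRT installed
    "CUDAExecutionProvider",      # NVIDIA GPUs without TensorRT
    "CPUExecutionProvider",       # always-available fallback
]


def select_providers(available, preferred=PREFERRED_ORDER):
    """Return the preferred providers that are available, in preference order.

    The CPU provider is appended as a last resort so that a session can
    still be created on machines without a usable GPU.
    """
    chosen = [name for name in preferred if name in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# In a real program the list of available providers would come from
# onnxruntime.get_available_providers(); it is hard-coded here so the
# sketch runs without ONNX Runtime installed.
print(select_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(select_providers(["CPUExecutionProvider"]))
```

The resulting list is in the form ONNX Runtime accepts for the `providers` argument of `onnxruntime.InferenceSession`, which tries the providers in the order given.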