GPUX¶
Docker-like GPU Runtime for ML Inference¶
GPUX provides universal GPU compatibility for ML inference workloads: run the same model on any GPU without per-vendor setup or compatibility headaches.
⚡ Why GPUX?¶
🌍 Universal GPU Support¶
Works on NVIDIA, AMD, Apple Silicon, and Intel GPUs, plus DirectML-capable GPUs on Windows. No more "works on my GPU" problems.
🐳 Docker-like UX¶
Familiar commands and configuration. If you know Docker, you know GPUX.
⚙️ Zero Configuration¶
Automatically selects the best GPU provider. Works out of the box.
🚀 High Performance¶
Leverages optimized ONNX Runtime backends with TensorRT, CUDA, CoreML, and more.
🔧 Production Ready¶
Built on mature, battle-tested technologies. Ready for production workloads.
🐍 Python First¶
Simple Python API for seamless integration into your ML pipelines.
🎯 Quick Example¶
Pull models from Hugging Face and run inference - no configuration needed!
```bash
# Pull a modern sentiment analysis model
gpux pull cardiffnlp/twitter-roberta-base-sentiment-latest

# Run inference
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "I love this product!"}'

# Start HTTP server
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest --port 8080
```
Zero Configuration
GPUX automatically:

- Downloads and converts models to ONNX
- Generates configuration
- Selects the best GPU provider
- Handles input preprocessing
Advanced Configuration
For custom models or advanced settings, see Configuration Guide.
🖥️ Supported Platforms¶
| Platform | Compute API | Execution Provider | Status |
|---|---|---|---|
| NVIDIA | CUDA | TensorRT, CUDA | ✅ Supported |
| AMD | ROCm | ROCm | ✅ Supported |
| Apple | Metal | CoreML | ✅ Supported |
| Intel | OpenVINO | OpenVINO | ✅ Supported |
| Windows | DirectML | DirectML | ✅ Supported |
| Universal | CPU | CPU | ✅ Supported |
📦 Installation¶
Install GPUX using uv (recommended) or pip:
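A minimal sketch of the install commands; the PyPI package name `gpux` is an assumption here, so see the Installation Guide for the published name:

```bash
# Add GPUX to a uv-managed project (assumes the package is published as "gpux")
uv add gpux

# Or install with pip
pip install gpux
```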
Why uv?
We recommend using uv for faster, more reliable dependency management.
🚀 Key Features¶
Automatic Provider Selection¶
GPUX automatically selects the best execution provider for your hardware:
```python
from gpux import GPUXRuntime

runtime = GPUXRuntime(model_path="model.onnx")

# Automatically uses:
# - TensorRT/CUDA on NVIDIA GPUs
# - CoreML on Apple Silicon
# - ROCm on AMD GPUs
# - CPU as fallback
```
Benchmarking Built-in¶
Measure performance with ease:
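A run from the CLI might look like the following sketch; the `benchmark` subcommand name is an assumption rather than a confirmed CLI option:

```bash
# Hypothetical benchmark invocation (subcommand name is an assumption)
gpux benchmark cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "I love this product!"}'
```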
```
╭─ Benchmark Results ────╮
│ Mean Time  │ 0.42 ms   │
│ Std Time   │ 0.05 ms   │
│ Min Time   │ 0.38 ms   │
│ Max Time   │ 0.55 ms   │
│ Throughput │ 2,380 fps │
╰────────────────────────╯
```
HTTP Server¶
Serve models with a single command:
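The `gpux serve` command from the quick example starts the server; in the request below, the `/predict` path and payload shape are assumptions, so check the generated docs at `/docs` for the actual routes:

```bash
# Start the HTTP server (same command as in the quick example)
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest --port 8080

# Send a request (the /predict path and payload shape are assumptions)
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"inputs": "I love this product!"}'
```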
Automatic OpenAPI/Swagger documentation at /docs.
📚 Learn More¶
User Guide¶
In-depth documentation of core concepts and features.
API Reference¶
Complete CLI, configuration, and Python API reference.
Deployment¶
Deploy to Docker, Kubernetes, AWS, GCP, Azure, and edge devices.
🌟 Show Your Support¶
If you find GPUX useful, please consider:
- ⭐ Star us on GitHub
- 🐛 Report bugs or request features
- 💬 Join our Discord community
- 📢 Share on Twitter
🤝 Contributing¶
We welcome contributions! See our Contributing Guide to get started.
📄 License¶
GPUX is licensed under the MIT License.
Ready to get started? Check out our Installation Guide!