# Tutorial - Introduction
Welcome to the GPUX tutorial! This guide will take you from zero to deploying production-ready ML inference workloads.
## 🎯 What You'll Learn
By the end of this tutorial, you'll be able to:
- ✅ Install and configure GPUX
- ✅ Pull models from Hugging Face with `gpux pull`
- ✅ Run inference on models (text, audio, image)
- ✅ Benchmark model performance
- ✅ Deploy models with HTTP servers
- ✅ (Advanced) Create and understand `gpux.yml` configuration files
## 📋 Prerequisites
Before starting, you should have:
- Python 3.11+ installed
- Basic command-line knowledge
- A machine learning model (or use our examples)
- (Optional) A GPU (NVIDIA, AMD, Apple Silicon, Intel, or DirectML on Windows)
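You can verify the basics from a terminal before moving on (`uv` appears in the Quick Start below, so the second check only matters if you plan to install that way):

```bash
# Confirm Python 3.11+ is on your PATH
python --version

# Optional: confirm uv is installed (used in the Quick Start install step)
uv --version
```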
**Don't have a GPU?**

GPUX works great on CPU too! It will automatically detect available hardware and select the best provider.
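If you're curious what hardware is visible on your machine, and assuming GPUX builds on ONNX Runtime (the "provider" terminology comes from there; GPUX's actual detection logic may differ), you can list the execution providers yourself:

```bash
# List the ONNX Runtime execution providers available on this machine,
# e.g. ['CUDAExecutionProvider', 'CoreMLExecutionProvider', 'CPUExecutionProvider']
python -c "import onnxruntime; print(onnxruntime.get_available_providers())"
```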
## 🗺️ Tutorial Structure
This tutorial is organized into progressive steps:
### 1. Installation

Install GPUX and verify your setup.

Time: 5 minutes

### 2. First Steps

Pull your first model from Hugging Face and run inference.

Time: 2 minutes

### 3. Pulling Models

Master pulling models from Hugging Face Hub.

Time: 10 minutes

### 4. Running Inference

Master the `gpux run` command with different input formats.

Time: 10 minutes

### 5. Serving Models

Deploy models with HTTP APIs for production use.

Time: 15 minutes

### 6. Benchmarking

Measure and optimize model performance.

Time: 10 minutes

### 7. Configuration ⚙️ (Advanced)

Learn about `gpux.yml` for custom models and advanced settings.

Time: 15 minutes
## 💡 Learning Path
```mermaid
graph LR
    A[Installation] --> B[First Steps]
    B --> C[Pulling Models]
    C --> D[Running Inference]
    D --> E[Serving]
    E --> F[Benchmarking]
    F --> G[Configuration]
    G --> H[Production Ready!]

    style A fill:#6366f1,stroke:#4f46e5,color:#fff
    style B fill:#6366f1,stroke:#4f46e5,color:#fff
    style C fill:#6366f1,stroke:#4f46e5,color:#fff
    style D fill:#6366f1,stroke:#4f46e5,color:#fff
    style E fill:#6366f1,stroke:#4f46e5,color:#fff
    style F fill:#6366f1,stroke:#4f46e5,color:#fff
    style G fill:#f59e0b,stroke:#d97706,color:#fff
    style H fill:#10b981,stroke:#059669,color:#fff
```
## 🚀 Quick Start
If you're already familiar with Docker and ML inference, here's a quick overview:
```bash
# Install GPUX
uv add gpux

# Pull a model from Hugging Face
gpux pull cardiffnlp/twitter-roberta-base-sentiment-latest

# Run inference
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "I love this product!"}'

# Start HTTP server
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest --port 8080
```
**No Configuration Needed**

GPUX automatically handles model download, conversion, and configuration. For custom models, see Configuration.
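Once the server from the Quick Start is running, you can sanity-check it from another terminal. The route below is an assumption for illustration; the Serving Models step documents the real endpoints:

```bash
# POST the same JSON payload used with `gpux run` above
# (/predict is a placeholder route, not confirmed by this page)
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"inputs": "I love this product!"}'
```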
## 🔀 Alternative Paths
Depending on your experience level, you can choose different paths:
**New to ML inference?**

Follow the tutorial in order, starting with Installation. We'll explain every concept and provide detailed examples.
**Familiar with ML deployment?**

Skim through Installation and First Steps, then focus on Serving Models, Benchmarking, and Configuration.
## 📚 After the Tutorial
Once you complete this tutorial, explore:
- User Guide - Deep dive into GPUX concepts
- Examples - Real-world use cases
- Deployment - Production deployment strategies
- API Reference - Complete API documentation
## 💬 Get Help
Stuck? We're here to help!
- 📖 Check the FAQ
- 🐛 Open an issue
- 💬 Join our Discord
- 📧 Email support
Ready to begin? Let's start with Installation →