Tutorial - Introduction

Welcome to the GPUX tutorial! This guide will take you from zero to deploying production-ready ML inference workloads.

🎯 What You'll Learn

By the end of this tutorial, you'll be able to:

  • ✅ Install and configure GPUX
  • ✅ Pull models from Hugging Face with gpux pull
  • ✅ Run inference on models (text, audio, image)
  • ✅ Benchmark model performance
  • ✅ Deploy models with HTTP servers
  • ✅ (Advanced) Create and understand gpux.yml configuration files

📋 Prerequisites

Before starting, you should have:

  • Python 3.11+ installed (see the quick check below)
  • Basic command-line knowledge
  • A machine learning model (or use our examples)
  • (Optional) A GPU: NVIDIA, AMD, Apple Silicon, or Intel (on Linux, macOS, or Windows)
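
To confirm the Python prerequisite, run the standard interpreter version check from a terminal (nothing GPUX-specific here):

# should print Python 3.11 or newer
python3 --version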

Don't have a GPU?

GPUX works great on CPU too! It will automatically detect available hardware and select the best provider.

🗺️ Tutorial Structure

This tutorial is organized into progressive steps:

1. Installation

Install GPUX and verify your setup.

Time: 5 minutes

2. First Steps

Pull your first model from Hugging Face and run inference.

Time: 2 minutes

3. Pulling Models

Master pulling models from Hugging Face Hub.

Time: 10 minutes

4. Running Inference

Master the gpux run command with different input formats.

Time: 10 minutes

5. Serving Models

Deploy models with HTTP APIs for production use.

Time: 15 minutes

6. Benchmarking

Measure and optimize model performance.

Time: 10 minutes

7. Configuration ⚙️ Advanced

Learn about gpux.yml for custom models and advanced settings.

Time: 15 minutes


💡 Learning Path

Installation → First Steps → Pulling Models → Running Inference → Serving → Benchmarking → Configuration → Production Ready!

🚀 Quick Start

If you're already familiar with Docker and ML inference, here's a quick overview:

# Install GPUX
uv add gpux

# Pull a model from Hugging Face
gpux pull cardiffnlp/twitter-roberta-base-sentiment-latest

# Run inference
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "I love this product!"}'

# Start HTTP server
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest --port 8080
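
With the server running, you can exercise it from another terminal. Here is a minimal sketch using curl; note that the /predict path is an assumption for illustration, not a documented route (the Serving Models step covers the actual API):

# NOTE: the endpoint path below is a placeholder, not the documented route
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"inputs": "I love this product!"}'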

No Configuration Needed

GPUX automatically handles model download, conversion, and configuration. For custom models, see Configuration.
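
As a preview of that advanced step, a gpux.yml file might look roughly like the sketch below. Every field name here is a placeholder invented for illustration; the Configuration step documents the real schema:

# gpux.yml (illustrative sketch; placeholder field names)
name: my-custom-model   # hypothetical: a label for the model
model: ./model.onnx     # hypothetical: path to the model file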


📖 Alternative Paths

Depending on your experience level, you can choose different paths:

New to ML inference?

Follow the tutorial in order, starting with Installation.

We'll explain every concept and provide detailed examples.

Familiar with ML deployment?

Skim through Installation and First Steps, then focus on:

  • Pulling Models
  • Running Inference
  • Serving Models
  • Benchmarking

ML ops expert?

Jump directly to:

  • Serving Models
  • Benchmarking
  • Configuration


🎓 After the Tutorial

Once you complete this tutorial, explore the rest of the GPUX documentation.


💬 Get Help

Stuck? We're here to help!


Ready to begin? Let's start with Installation →