Tutorial - Introduction

Welcome to the GPUX tutorial! This guide will take you from zero to deploying production-ready ML inference workloads.

🎯 What You'll Learn

By the end of this tutorial, you'll be able to:

  • ✅ Install and configure GPUX
  • ✅ Pull models from Hugging Face with gpux pull
  • ✅ Run inference on models (text, audio, image)
  • ✅ Benchmark model performance
  • ✅ Deploy models with HTTP servers
  • ✅ (Advanced) Create and understand gpux.yml configuration files

📋 Prerequisites

Before starting, you should have:

  • Python 3.11+ installed (see the quick check below)
  • Basic command-line knowledge
  • A machine learning model (or use our examples)
  • (Optional) A GPU: NVIDIA, AMD, Apple Silicon, or Intel (on Linux, macOS, or Windows)
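
To confirm the Python prerequisite, run the standard interpreter version check from a terminal (nothing GPUX-specific here):

# should print Python 3.11 or newer
python3 --version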

Don't have a GPU?

GPUX works great on CPU too! It will automatically detect available hardware and select the best provider.

🗺️ Tutorial Structure

This tutorial is organized into progressive steps:

1. Installation

Install GPUX and verify your setup.

Time: 5 minutes

2. First Steps

Pull your first model from Hugging Face and run inference.

Time: 2 minutes

3. Pulling Models

Master pulling models from Hugging Face Hub.

Time: 10 minutes

4. Running Inference

Master the gpux run command with different input formats.

Time: 10 minutes

5. Serving Models

Deploy models with HTTP APIs for production use.

Time: 15 minutes

6. Benchmarking

Measure and optimize model performance.

Time: 10 minutes

7. Configuration ⚙️ Advanced

Learn about gpux.yml for custom models and advanced settings.

Time: 15 minutes


💡 Learning Path

Installation → First Steps → Pulling Models → Running Inference → Serving → Benchmarking → Configuration → Production Ready!

🚀 Quick Start

If you're already familiar with Docker and ML inference, here's a quick overview:

# Install GPUX
uv add gpux

# Pull a model from Hugging Face
gpux pull cardiffnlp/twitter-roberta-base-sentiment-latest

# Run inference
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "I love this product!"}'

# Start HTTP server
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest --port 8080
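
With the server running, you can exercise it from another terminal. Here is a minimal sketch using curl; note that the /predict path is an assumption for illustration, not a documented route (the Serving Models step covers the actual API):

# NOTE: the endpoint path below is a placeholder, not the documented route
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"inputs": "I love this product!"}'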

No Configuration Needed

GPUX automatically handles model download, conversion, and configuration. For custom models, see Configuration.
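
As a preview of that advanced step, a gpux.yml file might look roughly like the sketch below. Every field name here is a placeholder invented for illustration; the Configuration step documents the real schema:

# gpux.yml (illustrative sketch; placeholder field names)
name: my-custom-model   # hypothetical: a label for the model
model: ./model.onnx     # hypothetical: path to the model file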


📖 Alternative Paths

Depending on your experience level, you can choose different paths:

New to ML inference?

Follow the tutorial in order, starting with Installation.

We'll explain every concept and provide detailed examples.

Familiar with ML deployment?

Skim through Installation and First Steps, then focus on:

  • Pulling Models
  • Running Inference
  • Serving Models
  • Benchmarking

ML ops expert?

Jump directly to:

  • Serving Models
  • Benchmarking
  • Configuration


🎓 After the Tutorial

Once you complete this tutorial, explore the rest of the GPUX documentation.


💬 Get Help

Stuck? We're here to help!


Ready to begin? Let's start with Installation →