
First Steps

Get started with GPUX in under 2 minutes by pulling a model from Hugging Face!


🎯 What You'll Build

By the end of this guide, you'll have:

  • ✅ Pulled a model from the Hugging Face registry
  • ✅ Run inference on a real model
  • ✅ Served your model via HTTP API

🚀 Quick Start: Pull from Hugging Face

The fastest way to get started is to pull a pre-trained model from Hugging Face:

# Pull a modern sentiment analysis model (RoBERTa-based)
gpux pull cardiffnlp/twitter-roberta-base-sentiment-latest

Expected output:

╭─ Pulling Model ─────────────────────────────────────────────────╮
│ Registry: huggingface                                           │
│ Model: cardiffnlp/twitter-roberta-base-sentiment-latest         │
│ Size: ~500 MB                                                   │
╰─────────────────────────────────────────────────────────────────╯

📥 Downloading model files...
✅ Model downloaded successfully!

🔄 Converting to ONNX...
✅ Conversion completed!

📝 Generating configuration...
✅ Configuration saved to: ~/.gpux/models/cardiffnlp/twitter-roberta-base-sentiment-latest/gpux.yml

Run Inference

# Run sentiment analysis
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "I love this product!"}'

Expected output:

{
  "label": "POSITIVE",
  "score": 0.95
}

Serve Your Model

# Start HTTP server
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest --port 8080

Test the API:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"inputs": "This is amazing!"}'

Congratulations! 🎉

You just pulled, ran, and served a real ML model in under 2 minutes!


๐Ÿ› ๏ธ Advanced: Create Your Own Model

Advanced Feature

This section is for users who want to create their own ONNX models and configure them manually. Most users will use gpux pull instead. Skip this section if you're just getting started.

Option 1: Create a Model with PyTorch

For this tutorial, we'll create a simple linear regression model. Don't worry if you're not familiar with machine learning - this is just for demonstration!

Create a file named create_model.py:

"""Create a simple ONNX model for GPUX tutorial."""
import torch
import torch.nn as nn

# Define a simple linear model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 2)  # 10 inputs, 2 outputs

    def forward(self, x):
        return self.linear(x)

# Create model instance
model = SimpleModel()
model.eval()

# Create dummy input
dummy_input = torch.randn(1, 10)

# Export to ONNX
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={
        "input": {0: "batch_size"},
        "output": {0: "batch_size"}
    }
)

print("✅ Model exported to model.onnx")

Run the script:

# Install PyTorch if needed
uv add torch

# Create the model
python create_model.py
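Before wiring the model into GPUX, you can sanity-check the exported file directly with ONNX Runtime (a quick sketch; assumes onnxruntime is installed, e.g. via uv add onnxruntime):

import numpy as np
import onnxruntime as ort

# Load the exported model and run one inference on random data
session = ort.InferenceSession("model.onnx")
x = np.random.randn(1, 10).astype(np.float32)
(output,) = session.run(None, {"input": x})
print(output.shape)  # (1, 2): one batch item, two outputs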

Option 2: Download an Example Model

Alternatively, download a pre-made example model:

# Download an example model (MobileNetV2, image classification)
curl -L -o model.onnx https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx

Note that MobileNetV2 expects image inputs, so the gpux.yml example below (written for the 10-input linear model from Option 1) won't match it without adjusting the input and output specifications.

Using Your Own Model

If you already have an ONNX model, just copy it to this directory and rename it to model.onnx.


๐Ÿ“ Create Configuration File

Now create a gpux.yml file to configure your model:

# gpux.yml - Configuration for GPUX
name: my-first-model
version: 1.0.0
description: "My first GPUX model"

model:
  source: ./model.onnx
  format: onnx

inputs:
  input:
    type: float32
    shape: [1, 10]
    required: true
    description: "10-dimensional input vector"

outputs:
  output:
    type: float32
    shape: [1, 2]
    description: "2-dimensional output vector"

runtime:
  gpu:
    memory: 2GB
    backend: auto  # Automatically select best GPU
  batch_size: 1
  timeout: 30

Configuration Explained

  • name: Your model's name (used in CLI commands)
  • model.source: Path to your ONNX model file
  • inputs: Define input tensor specifications
  • outputs: Define output tensor specifications
  • runtime: GPU and performance settings
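The inputs and outputs sections must match the tensor names and shapes actually stored in the ONNX file. You can read them straight from the model with the onnx package (a minimal sketch; assumes onnx is installed, e.g. via uv add onnx):

import onnx

# Print each graph input's name and shape as recorded in the model
model = onnx.load("model.onnx")
for tensor in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in tensor.type.tensor_type.shape.dim]
    print(tensor.name, dims)  # e.g. input ['batch_size', 10]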

๐Ÿ—๏ธ Build Your Model

Validate and build your GPUX project:

gpux build .

Expected output:

╭─ Model Information ───────────────────────────────────╮
│ Name      │ my-first-model                            │
│ Version   │ 1.0.0                                     │
│ Format    │ onnx                                      │
│ Size      │ 0.1 MB                                    │
│ Inputs    │ 1                                         │
│ Outputs   │ 1                                         │
╰───────────────────────────────────────────────────────╯

╭─ Execution Provider ──────────────────────────────────╮
│ Provider    │ CoreMLExecutionProvider                 │
│ Platform    │ Apple Silicon                           │
│ Available   │ ✅ Yes                                  │
│ Description │ Optimized for Apple devices             │
╰───────────────────────────────────────────────────────╯

✅ Build completed successfully!
Build artifacts saved to: .gpux

What Just Happened?

GPUX:

1. ✅ Validated your gpux.yml configuration
2. ✅ Inspected your ONNX model
3. ✅ Detected the best GPU provider (or CPU)
4. ✅ Saved build artifacts to the .gpux/ directory


🚀 Run Your First Inference

Now let's run inference on your model!

Create Input Data

Create a file named input.json:

{
  "input": [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]]
}
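If you'd rather generate this file programmatically, for example to try different values that match the [1, 10] input shape, here's a short sketch:

import json

import numpy as np

# Write a random vector matching the model's [1, 10] input shape
data = {"input": np.random.randn(1, 10).round(3).tolist()}
with open("input.json", "w") as f:
    json.dump(data, f, indent=2)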

Run Inference

gpux run my-first-model --file input.json

Expected output (your exact numbers will differ, since the model's weights are randomly initialized):

{
  "output": [
    [0.123, -0.456]
  ]
}

Congratulations! 🎉

You just ran your first inference with GPUX!

Alternative: Inline Input

You can also provide input directly via the command line:

gpux run my-first-model --input '{"input": [[1,2,3,4,5,6,7,8,9,10]]}'

๐Ÿ” Inspect Your Model

Get detailed information about your model:

gpux inspect my-first-model

Expected output:

╭─ Model Information ───────────────────────────────────╮
│ Name      │ my-first-model                            │
│ Version   │ 1.0.0                                     │
│ Path      │ ./model.onnx                              │
│ Size      │ 0.1 MB                                    │
╰───────────────────────────────────────────────────────╯

╭─ Input Specifications ────────────────────────────────╮
│ Name  │ Type    │ Shape   │ Required │
│ input │ float32 │ [1, 10] │ ✅       │
╰───────────────────────────────────────────────────────╯

╭─ Output Specifications ───────────────────────────────╮
│ Name   │ Type    │ Shape  │
│ output │ float32 │ [1, 2] │
╰───────────────────────────────────────────────────────╯

╭─ Runtime Information ─────────────────────────────────╮
│ Provider   │ CoreMLExecutionProvider                  │
│ Backend    │ auto                                     │
│ GPU Memory │ 2GB                                      │
╰───────────────────────────────────────────────────────╯

📂 Your Project Structure

After completing these steps, your project should look like this:

my-first-model/
├── model.onnx           # Your ONNX model
├── gpux.yml             # GPUX configuration
├── input.json           # Sample input data
├── create_model.py      # Model creation script (optional)
└── .gpux/               # Build artifacts (auto-generated)
    ├── model_info.json
    └── provider_info.json

🎓 Understanding the Workflow

Here's what happens when you use GPUX with registry models:

graph LR
    A[gpux pull model-id] --> B[Download Model]
    B --> C[Convert to ONNX]
    C --> D[Generate Config]
    D --> E[Cache Model]
    E --> F[gpux run model-id]
    G[input.json] --> F
    F --> H[Load from Cache]
    H --> I[Run Inference]
    I --> J[Return Results]

    E --> K[gpux serve model-id]
    K --> L[Start HTTP Server]
    L --> M[Handle API Requests]

    style A fill:#6366f1,stroke:#4f46e5,color:#fff
    style G fill:#6366f1,stroke:#4f46e5,color:#fff
    style J fill:#10b981,stroke:#059669,color:#fff
    style M fill:#10b981,stroke:#059669,color:#fff

Local Project Workflow

For local projects with gpux.yml:

graph LR
    A[gpux.yml] --> B[gpux build]
    C[model.onnx] --> B
    B --> D[Validate Config]
    D --> E[Inspect Model]
    E --> F[Select Provider]
    F --> G[Save Build Info]
    G --> H[gpux run]
    I[input.json] --> H
    H --> J[Load Model]
    J --> K[Run Inference]
    K --> L[Return Results]

    style A fill:#6366f1,stroke:#4f46e5,color:#fff
    style C fill:#6366f1,stroke:#4f46e5,color:#fff
    style I fill:#6366f1,stroke:#4f46e5,color:#fff
    style L fill:#10b981,stroke:#059669,color:#fff

✨ Try Different Model Types

Now that you have a working GPUX setup, try these different model types:

๐Ÿ“ Text Models

Emotion Analysis

# 7 emotion categories (anger, disgust, fear, joy, neutral, sadness, surprise)
gpux pull j-hartmann/emotion-english-distilroberta-base
gpux run j-hartmann/emotion-english-distilroberta-base \
  --input '{"inputs": "I feel great today!"}'

Text Generation

# Small, efficient language model
gpux pull microsoft/phi-2
gpux run microsoft/phi-2 \
  --input '{"inputs": "The future of AI is"}'

Embeddings

# Modern embedding model
gpux pull BAAI/bge-small-en-v1.5
gpux run BAAI/bge-small-en-v1.5 \
  --input '{"inputs": "Hello world"}'

🎤 Audio Models

Speech Recognition

# Whisper for speech-to-text transcription
gpux pull openai/whisper-base
gpux run openai/whisper-base \
  --input '{"audio": "path/to/audio.wav"}'

Audio Classification

# Emotion recognition in audio
gpux pull superb/hubert-base-superb-er
gpux run superb/hubert-base-superb-er \
  --input '{"audio": "path/to/audio.wav"}'

๐Ÿ–ผ๏ธ Image Models

Image Classification

# Vision Transformer for image classification
gpux pull google/vit-base-patch16-224
gpux run google/vit-base-patch16-224 \
  --input '{"image": "path/to/image.jpg"}'

Object Detection

# DETR for object detection
gpux pull facebook/detr-resnet-50
gpux run facebook/detr-resnet-50 \
  --input '{"image": "path/to/image.jpg"}'

๐Ÿ” Inspect Models

Get detailed information about any model:

# Get detailed model information
gpux inspect cardiffnlp/twitter-roberta-base-sentiment-latest

โš™๏ธ Use Different Providers

# Force CPU provider
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "test"}' \
  --provider cpu

# Force specific GPU provider
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
  --input '{"inputs": "test"}' \
  --provider cuda
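GPUX runs models through ONNX (as the "Converting to ONNX" step above shows), so provider names correspond to ONNX Runtime execution providers. If you're unsure which providers your machine supports, you can check from Python (a sketch; assumes the onnxruntime package is installed in your environment):

import onnxruntime as ort

# e.g. ['CoreMLExecutionProvider', 'CPUExecutionProvider'] on Apple Silicon
print(ort.get_available_providers())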

๐Ÿ› ๏ธ Advanced: Using Your Own Models

If you have your own ONNX model, you can create a gpux.yml configuration file. This is an advanced feature - most users will use gpux pull instead.

See Configuration Guide for details.


๐Ÿ› Troubleshooting

Model file not found

Error: Model file not found: ./model.onnx

Solution: Make sure model.onnx exists in your project directory:

ls -lh model.onnx

Input validation failed

Error: Input mismatch. Missing: {'input'}

Solution: Check your input data matches the expected format:

# Verify input specification
gpux inspect my-first-model

# Ensure input.json has the correct key names
cat input.json
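To compare the keys in input.json against what the model expects, a small Python check (the expected set below comes from the gpux inspect output for this tutorial's model):

import json

expected = {"input"}  # from `gpux inspect my-first-model`
with open("input.json") as f:
    provided = set(json.load(f))
print("missing:", expected - provided)
print("unexpected:", provided - expected)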

Invalid YAML

Error: Invalid YAML in configuration file

Solution: Validate your gpux.yml syntax:

# Check YAML syntax
python -c "import yaml; yaml.safe_load(open('gpux.yml'))"

📚 What's Next?

Great job! You've successfully pulled and run your first GPUX model. ๐ŸŽ‰

Continue learning with the next guide, Pulling Models, linked at the bottom of this page.


💡 Key Takeaways

What You Learned

✅ How to pull models from Hugging Face with gpux pull
✅ How to run inference with gpux run
✅ How to serve models with gpux serve
✅ How to inspect model information with gpux inspect


Previous: Installation | Next: Pulling Models