First Steps¶
Get started with GPUX in under 2 minutes by pulling a model from Hugging Face!
🎯 What You'll Build¶
By the end of this guide, you'll have:
- ✅ Pulled a model from the Hugging Face registry
- ✅ Run inference on a real model
- ✅ Served your model via HTTP API
🚀 Quick Start: Pull from Hugging Face¶
The fastest way to get started is to pull a pre-trained model from Hugging Face:
# Pull a modern sentiment analysis model (RoBERTa-based)
gpux pull cardiffnlp/twitter-roberta-base-sentiment-latest
Expected output:
╭─ Pulling Model ─────────────────────────────────────────╮
│ Registry: huggingface                                   │
│ Model: cardiffnlp/twitter-roberta-base-sentiment-latest │
│ Size: ~500 MB                                           │
╰─────────────────────────────────────────────────────────╯
📥 Downloading model files...
✅ Model downloaded successfully!
🔄 Converting to ONNX...
✅ Conversion completed!
📝 Generating configuration...
✅ Configuration saved to: ~/.gpux/models/cardiffnlp/twitter-roberta-base-sentiment-latest/gpux.yml
Run Inference¶
# Run sentiment analysis
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
--input '{"inputs": "I love this product!"}'
Expected output: the predicted sentiment label (negative, neutral, or positive) together with a confidence score.
Serve Your Model¶
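Start an HTTP server for the model with gpux serve. A minimal invocation is sketched below; it assumes the server listens on port 8080 by default, matching the curl example that follows:
gpux serve cardiffnlp/twitter-roberta-base-sentiment-latest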
Test the API:
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"inputs": "This is amazing!"}'
Congratulations! 🎉
You just pulled, ran, and served a real ML model in under 2 minutes!
🛠️ Advanced: Create Your Own Model¶
Advanced Feature
This section is for users who want to create their own ONNX models and configure them manually.
Most users will use gpux pull instead. Skip this section if you're just getting started.
For this tutorial, we'll create a simple linear regression model. Don't worry if you're not familiar with machine learning - this is just for demonstration!
Option 1: Using PyTorch (Recommended)¶
Create a file named create_model.py:
"""Create a simple ONNX model for GPUX tutorial."""
import torch
import torch.nn as nn
# Define a simple linear model
class SimpleModel(nn.Module):
def __init__(self):
super().__init__()
self.linear = nn.Linear(10, 2) # 10 inputs, 2 outputs
def forward(self, x):
return self.linear(x)
# Create model instance
model = SimpleModel()
model.eval()
# Create dummy input
dummy_input = torch.randn(1, 10)
# Export to ONNX
torch.onnx.export(
model,
dummy_input,
"model.onnx",
input_names=["input"],
output_names=["output"],
dynamic_axes={
"input": {0: "batch_size"},
"output": {0: "batch_size"}
}
)
print("โ
Model exported to model.onnx")
Run the script:
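python create_model.py
(This assumes PyTorch is installed, e.g. via pip install torch.)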
Option 2: Download Example Model¶
Alternatively, download a pre-made example model:
# Download an example model (MobileNetV2 image classification)
curl -o model.onnx https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx
Using Your Own Model
If you already have an ONNX model, just copy it to this directory and rename it to model.onnx.
📋 Create Configuration File¶
Now create a gpux.yml file to configure your model:
# gpux.yml - Configuration for GPUX
name: my-first-model
version: 1.0.0
description: "My first GPUX model"

model:
  source: ./model.onnx
  format: onnx

inputs:
  input:
    type: float32
    shape: [1, 10]
    required: true
    description: "10-dimensional input vector"

outputs:
  output:
    type: float32
    shape: [1, 2]
    description: "2-dimensional output vector"

runtime:
  gpu:
    memory: 2GB
    backend: auto  # Automatically select best GPU
  batch_size: 1
  timeout: 30
Configuration Explained
- name: Your model's name (used in CLI commands)
- model.source: Path to your ONNX model file
- inputs: Define input tensor specifications
- outputs: Define output tensor specifications
- runtime: GPU and performance settings
🏗️ Build Your Model¶
Validate and build your GPUX project:
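From the directory that contains gpux.yml and model.onnx, run:
gpux build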
Expected output:
╭─ Model Information ────────────╮
│ Name    │ my-first-model       │
│ Version │ 1.0.0                │
│ Format  │ onnx                 │
│ Size    │ 0.1 MB               │
│ Inputs  │ 1                    │
│ Outputs │ 1                    │
╰────────────────────────────────╯
╭─ Execution Provider ─────────────────────────╮
│ Provider    │ CoreMLExecutionProvider        │
│ Platform    │ Apple Silicon                  │
│ Available   │ ✅ Yes                         │
│ Description │ Optimized for Apple devices    │
╰──────────────────────────────────────────────╯
✅ Build completed successfully!
Build artifacts saved to: .gpux
What Just Happened?
GPUX:
1. ✅ Validated your gpux.yml configuration
2. ✅ Inspected your ONNX model
3. ✅ Detected the best GPU provider (or CPU)
4. ✅ Saved build artifacts to .gpux/ directory
🚀 Run Your First Inference¶
Now let's run inference on your model!
Create Input Data¶
Create a file named input.json:
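The gpux.yml above declares a single float32 input named input with shape [1, 10], so a minimal input file can look like this (the values are arbitrary placeholders):
{
  "input": [[0.5, 1.2, -0.3, 0.8, 1.5, -0.7, 0.2, 0.9, -1.1, 0.4]]
}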
Run Inference¶
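Pass the model name and the input file to gpux run. This sketch assumes the --input flag also accepts a path to a JSON file; check gpux run --help if your version uses a different flag:
gpux run my-first-model --input input.json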
Expected output: a 2-element output vector (the exact values depend on the randomly initialized weights).
Congratulations! 🎉
You just ran your first inference with GPUX!
Alternative: Inline Input¶
You can also provide input directly via the command line:
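Using the same placeholder values as input.json:
gpux run my-first-model --input '{"input": [[0.5, 1.2, -0.3, 0.8, 1.5, -0.7, 0.2, 0.9, -1.1, 0.4]]}'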
🔍 Inspect Your Model¶
Get detailed information about your model:
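gpux inspect my-first-model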
Expected output:
╭─ Model Information ────────────╮
│ Name    │ my-first-model       │
│ Version │ 1.0.0                │
│ Path    │ ./model.onnx         │
│ Size    │ 0.1 MB               │
╰────────────────────────────────╯
╭─ Input Specifications ───────────────╮
│ Name  │ Type    │ Shape   │ Required │
│ input │ float32 │ [1, 10] │ ✅       │
╰──────────────────────────────────────╯
╭─ Output Specifications ─────────╮
│ Name   │ Type    │ Shape       │
│ output │ float32 │ [1, 2]      │
╰─────────────────────────────────╯
╭─ Runtime Information ────────────────╮
│ Provider   │ CoreMLExecutionProvider │
│ Backend    │ auto                    │
│ GPU Memory │ 2GB                     │
╰──────────────────────────────────────╯
📁 Your Project Structure¶
After completing these steps, your project should look like this:
my-first-model/
├── model.onnx          # Your ONNX model
├── gpux.yml            # GPUX configuration
├── input.json          # Sample input data
├── create_model.py     # Model creation script (optional)
└── .gpux/              # Build artifacts (auto-generated)
    ├── model_info.json
    └── provider_info.json
🔄 Understanding the Workflow¶
Here's what happens when you use GPUX with registry models:
graph LR
A[gpux pull model-id] --> B[Download Model]
B --> C[Convert to ONNX]
C --> D[Generate Config]
D --> E[Cache Model]
E --> F[gpux run model-id]
G[input.json] --> F
F --> H[Load from Cache]
H --> I[Run Inference]
I --> J[Return Results]
E --> K[gpux serve model-id]
K --> L[Start HTTP Server]
L --> M[Handle API Requests]
style A fill:#6366f1,stroke:#4f46e5,color:#fff
style G fill:#6366f1,stroke:#4f46e5,color:#fff
style J fill:#10b981,stroke:#059669,color:#fff
style M fill:#10b981,stroke:#059669,color:#fff
Local Project Workflow¶
For local projects with gpux.yml:
graph LR
A[gpux.yml] --> B[gpux build]
C[model.onnx] --> B
B --> D[Validate Config]
D --> E[Inspect Model]
E --> F[Select Provider]
F --> G[Save Build Info]
G --> H[gpux run]
I[input.json] --> H
H --> J[Load Model]
J --> K[Run Inference]
K --> L[Return Results]
style A fill:#6366f1,stroke:#4f46e5,color:#fff
style C fill:#6366f1,stroke:#4f46e5,color:#fff
style I fill:#6366f1,stroke:#4f46e5,color:#fff
style L fill:#10b981,stroke:#059669,color:#fff
✨ Try Different Model Types¶
Now that you have a working GPUX setup, try these different model types:
📝 Text Models¶
Emotion Analysis¶
# 6 emotion categories (joy, sadness, anger, fear, surprise, neutral)
gpux pull j-hartmann/emotion-english-distilroberta-base
gpux run j-hartmann/emotion-english-distilroberta-base \
--input '{"inputs": "I feel great today!"}'
Text Generation¶
# Small, efficient language model
gpux pull microsoft/phi-2
gpux run microsoft/phi-2 \
--input '{"inputs": "The future of AI is"}'
Embeddings¶
# Modern embedding model
gpux pull BAAI/bge-small-en-v1.5
gpux run BAAI/bge-small-en-v1.5 \
--input '{"inputs": "Hello world"}'
🎤 Audio Models¶
Speech Recognition¶
# Whisper for speech-to-text transcription
gpux pull openai/whisper-base
gpux run openai/whisper-base \
--input '{"audio": "path/to/audio.wav"}'
Audio Classification¶
# Emotion recognition in audio
gpux pull superb/hubert-base-superb-er
gpux run superb/hubert-base-superb-er \
--input '{"audio": "path/to/audio.wav"}'
🖼️ Image Models¶
Image Classification¶
# Vision Transformer for image classification
gpux pull google/vit-base-patch16-224
gpux run google/vit-base-patch16-224 \
--input '{"image": "path/to/image.jpg"}'
Object Detection¶
# DETR for object detection
gpux pull facebook/detr-resnet-50
gpux run facebook/detr-resnet-50 \
--input '{"image": "path/to/image.jpg"}'
🔍 Inspect Models¶
Get detailed information about any model:
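gpux inspect cardiffnlp/twitter-roberta-base-sentiment-latest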
⚙️ Use Different Providers¶
# Force CPU provider
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
--input '{"inputs": "test"}' \
--provider cpu
# Force specific GPU provider
gpux run cardiffnlp/twitter-roberta-base-sentiment-latest \
--input '{"inputs": "test"}' \
--provider cuda
🛠️ Advanced: Using Your Own Models¶
If you have your own ONNX model, you can create a gpux.yml configuration file.
This is an advanced feature - most users will use gpux pull instead.
See Configuration Guide for details.
🐛 Troubleshooting¶
Model file not found¶
Error: Model file not found: ./model.onnx
Solution: Make sure model.onnx exists in your project directory:
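# Check that the model file is present
ls -la model.onnx
# If it is missing, re-run the export script from the section above
python create_model.py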
Input validation failed¶
Error: Input mismatch. Missing: {'input'}
Solution: Check your input data matches the expected format:
# Verify input specification
gpux inspect my-first-model
# Ensure input.json has the correct key names
cat input.json
Invalid YAML¶
Error: Invalid YAML in configuration file
Solution: Validate your gpux.yml syntax:
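One quick check is to load the file with Python (this assumes PyYAML is installed):
python -c "import yaml; yaml.safe_load(open('gpux.yml')); print('YAML OK')"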
📚 What's Next?¶
Great job! You've successfully pulled and run your first GPUX model. 🎉
Continue learning:
- Pulling Models → Learn more about Hugging Face integration
- Running Inference → Advanced inference techniques
- Serving Models → Production deployment
- Benchmarking → Measure model performance
- Configuration → Advanced: Create custom gpux.yml files
💡 Key Takeaways¶
What You Learned
✅ How to pull models from Hugging Face with gpux pull
✅ How to run inference with gpux run
✅ How to serve models with gpux serve
✅ How to inspect model information with gpux inspect
Previous: Installation | Next: Pulling Models