# GPU Providers
Understanding execution providers and how GPUX selects the best backend for your hardware.
## 🎯 What You'll Learn
- ✅ What execution providers are
- ✅ Available providers and platforms
- ✅ Provider selection logic
- ✅ Platform-specific optimization
- ✅ Troubleshooting provider issues
## 🧠 What are Execution Providers?

Execution providers are backends that execute ONNX models on specific hardware:

```mermaid
graph LR
    A[ONNX Model] --> B[ONNX Runtime]
    B --> C{Provider}
    C -->|NVIDIA| D[TensorRT/CUDA]
    C -->|Apple| E[CoreML]
    C -->|AMD| F[ROCm]
    C -->|Intel| G[OpenVINO]
    C -->|Windows| H[DirectML]
    C -->|Fallback| I[CPU]
```
## 📋 Available Providers

### Priority Order

GPUX selects providers in this order:

1. `TensorrtExecutionProvider` - NVIDIA TensorRT (best performance)
2. `CUDAExecutionProvider` - NVIDIA CUDA
3. `ROCmExecutionProvider` - AMD ROCm
4. `CoreMLExecutionProvider` - Apple Silicon
5. `DmlExecutionProvider` - Windows DirectML
6. `OpenVINOExecutionProvider` - Intel OpenVINO
7. `CPUExecutionProvider` - CPU fallback
### Provider Details
| Provider | Hardware | OS | Performance |
|---|---|---|---|
| TensorRT | NVIDIA GPU | Linux, Windows | ⭐⭐⭐⭐⭐ |
| CUDA | NVIDIA GPU | Linux, Windows | ⭐⭐⭐⭐ |
| ROCm | AMD GPU | Linux | ⭐⭐⭐⭐ |
| CoreML | Apple Silicon | macOS | ⭐⭐⭐⭐ |
| DirectML | Any GPU | Windows | ⭐⭐⭐ |
| OpenVINO | Intel GPU/CPU | All | ⭐⭐⭐ |
| CPU | Any | All | ⭐⭐ |
## 🔍 Provider Selection

### Automatic Selection

GPUX automatically selects the best available provider by walking the priority list above; no configuration is required.
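A minimal sketch of querying that choice from Python, assuming GPUX's `ProviderManager` (used in the configuration examples below) exposes a method for the resolved provider; the `get_best_provider()` name here is hypothetical:

```python
from gpux.core.providers import ProviderManager

# Hypothetical method name: return the highest-priority provider
# available on this machine, following the order listed above.
manager = ProviderManager()
print(manager.get_best_provider())  # e.g. "CUDAExecutionProvider"
```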
### Manual Selection

Force a specific provider:

```bash
# Build with a specific provider
gpux build . --provider cuda
```

Or in `gpux.yml`:

```yaml
runtime:
  gpu:
    backend: cuda
```
### Check Selected Provider
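One way to verify the active provider is through the underlying ONNX Runtime session; `InferenceSession.get_providers()` is a standard ONNX Runtime call, and the model path below is a placeholder:

```python
import onnxruntime as ort

# Request everything available; ONNX Runtime applies them in priority order.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=ort.get_available_providers(),
)
print(session.get_providers())  # first entry is the provider doing the work
```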
## 🖥️ Platform-Specific Guides

### NVIDIA GPUs

Requirements:

- CUDA 11.8+ or 12.x
- cuDNN 8.x
- NVIDIA drivers 520+
Install CUDA Runtime:
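The CUDA provider ships in the GPU build of ONNX Runtime (the same package the troubleshooting section below installs):

```bash
pip install onnxruntime-gpu
```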
Configuration:
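A `gpux.yml` sketch following the same format as the manual-selection example above:

```yaml
runtime:
  gpu:
    backend: cuda
```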
TensorRT Optimization:
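Assuming the same `gpux.yml` schema, switching the backend to TensorRT should be a one-line change (session-level options such as FP16 and engine caching are covered under Provider Configuration below):

```yaml
runtime:
  gpu:
    backend: tensorrt
```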
### Apple Silicon (M1/M2/M3)

Requirements:

- macOS 12.0+
- Apple Silicon Mac
CoreML is built-in:
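No separate GPU package is needed; the stock `onnxruntime` wheel for macOS includes the CoreML provider:

```bash
pip install onnxruntime
```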
Performance:

- ✅ Excellent for small-to-medium models
- ✅ Low power consumption
- ✅ Unified memory architecture
### AMD GPUs

Requirements:

- ROCm 5.4+
- Supported AMD GPU
Install ROCm Runtime:
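The ROCm provider ships in its own ONNX Runtime build (the same package the troubleshooting section below references):

```bash
pip install onnxruntime-rocm
```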
Configuration:
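By analogy with the CUDA example (the `rocm` value is assumed from the provider name):

```yaml
runtime:
  gpu:
    backend: rocm
```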
### Intel GPUs

Requirements:

- Intel GPU (integrated or Arc)
- OpenVINO toolkit
Install OpenVINO:
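The OpenVINO provider ships in its own ONNX Runtime package:

```bash
pip install onnxruntime-openvino
```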
Configuration:
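Again assuming the same `gpux.yml` schema:

```yaml
runtime:
  gpu:
    backend: openvino
```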
### Windows (DirectML)

Requirements:

- Windows 10/11
- DirectX 12 compatible GPU

DirectML works with:

- NVIDIA GPUs
- AMD GPUs
- Intel GPUs
Configuration:
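A sketch under the same assumption (`directml` mirrors the lowercase backend names above):

```yaml
runtime:
  gpu:
    backend: directml
```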
## ⚙️ Provider Configuration

### CUDA Configuration

```python
from gpux.core.providers import ProviderManager

manager = ProviderManager()

cuda_config = {
    'device_id': 0,                          # GPU device ID
    'cudnn_conv_algo_search': 'EXHAUSTIVE',  # try all cuDNN convolution algorithms
    'do_copy_in_default_stream': True,       # copy tensors in the default CUDA stream
}
```
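These keys correspond to ONNX Runtime's CUDA provider options; independent of GPUX, they can be applied to a raw session like this (placeholder model path):

```python
import onnxruntime as ort

# Pass (provider, options) tuples; CPU stays last as the fallback.
session = ort.InferenceSession(
    "model.onnx",
    providers=[("CUDAExecutionProvider", cuda_config), "CPUExecutionProvider"],
)
```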
### TensorRT Configuration

```python
tensorrt_config = {
    'trt_max_workspace_size': 1 << 30,  # 1GB
    'trt_fp16_enable': True,            # FP16 optimization
    'trt_engine_cache_enable': True,    # cache compiled engines
}
```
### CoreML Configuration
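By analogy with the CUDA and TensorRT dictionaries above; the key names follow ONNX Runtime's CoreML provider options, but treat them as assumptions and check the docs for your installed version:

```python
coreml_config = {
    'MLComputeUnits': 'ALL',          # CPU, GPU, and Neural Engine (assumed key)
    'RequireStaticInputShapes': '0',  # allow dynamic input shapes (assumed key)
}
```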
## 🔄 Fallback Behavior

If the preferred provider fails to initialize, GPUX falls back to the CPU provider.

Example:
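At the ONNX Runtime level, fallback is just provider ordering; a sketch with a placeholder model path:

```python
import onnxruntime as ort

# Keep only the providers this installation actually supports,
# with CPU last as the guaranteed fallback.
preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
available = [p for p in preferred if p in ort.get_available_providers()]

session = ort.InferenceSession("model.onnx", providers=available)
print(session.get_providers())  # shows what was actually applied
```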
## 🐛 Troubleshooting

### Provider Not Available
Check available providers:
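This is a standard ONNX Runtime call:

```python
import onnxruntime as ort

print(ort.get_available_providers())
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']
```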
Install the missing provider:

```bash
# NVIDIA
pip install onnxruntime-gpu

# AMD
pip install onnxruntime-rocm

# Intel
pip install onnxruntime-openvino
```
### Provider Selection Failed

Error: `No execution providers available`

Solution:

1. Verify drivers are installed
2. Check the ONNX Runtime version
3. Try the CPU fallback
### Performance Issues
Compare providers:
```bash
# Benchmark each provider
gpux build . --provider cuda
gpux run model --benchmark --runs 1000

gpux build . --provider cpu
gpux run model --benchmark --runs 1000
```
## 📊 Performance Comparison
Example inference times for ResNet-50:
| Provider | Time (ms) | Speedup |
|---|---|---|
| TensorRT | 2.1 | 47x |
| CUDA | 4.5 | 22x |
| CoreML | 8.3 | 12x |
| DirectML | 15.2 | 6.5x |
| OpenVINO | 18.7 | 5.3x |
| CPU | 98.5 | 1x |
*Results vary by hardware and model.*
## 💡 Key Takeaways

**What You Learned**

- ✅ Execution providers explained
- ✅ Provider priority and selection
- ✅ Platform-specific setup
- ✅ Configuration options
- ✅ Troubleshooting provider issues
- ✅ Performance comparison
Previous: Models | Next: Inputs & Outputs →