
Platform Guides

Platform-specific optimization guides for GPUX.


Overview

GPUX provides universal GPU compatibility across all major platforms. Each platform maps to a dedicated execution provider with its own optimizations for best performance.


Supported Platforms

🟢 NVIDIA GPUs

Execution Providers: TensorRT, CUDA

  • Best performance with TensorRT optimization
  • Wide range of GPU support (GeForce, Quadro, Tesla)
  • Excellent for production deployments

→ NVIDIA Guide


🔴 AMD GPUs

Execution Provider: ROCm

  • Native AMD GPU acceleration
  • Support for Radeon and Instinct series
  • Linux-focused deployment

→ AMD Guide


🍎 Apple Silicon

Execution Provider: CoreML

  • Optimized for M1/M2/M3/M4 chips
  • Neural Engine acceleration
  • Excellent power efficiency

→ Apple Silicon Guide


🔷 Intel GPUs

Execution Provider: OpenVINO

  • Support for Intel Iris and Arc GPUs
  • CPU + iGPU optimization
  • Cross-platform support

→ Intel Guide


🪟 Windows DirectML

Execution Provider: DirectML

  • Universal Windows GPU support
  • Works with any DirectX 12 compatible GPU
  • Fallback for non-NVIDIA Windows systems

→ Windows Guide


💻 CPU-Only

Execution Provider: CPU

  • Universal fallback
  • No GPU required
  • Multi-threaded optimization

→ CPU Guide


Platform Comparison

| Platform | Provider | Performance | Power Efficiency | Ease of Setup |
|---|---|---|---|---|
| NVIDIA (TensorRT) | TensorrtExecutionProvider | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| NVIDIA (CUDA) | CUDAExecutionProvider | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| AMD (ROCm) | ROCMExecutionProvider | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ |
| Apple Silicon | CoreMLExecutionProvider | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Intel (OpenVINO) | OpenVINOExecutionProvider | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Windows (DirectML) | DmlExecutionProvider | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| CPU | CPUExecutionProvider | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
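
To check which of these providers your installed onnxruntime build actually exposes, you can query it directly:

import onnxruntime as ort

# Providers compiled into the installed onnxruntime package,
# listed in default priority order.
print(ort.get_available_providers())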

Quick Start by Platform

NVIDIA GPU

# Install CUDA dependencies
uv add onnxruntime-gpu

# Configure for CUDA
runtime:
  gpu:
    backend: cuda
    memory: 4GB

Apple Silicon

# CoreML is built-in with onnxruntime
uv add onnxruntime

# Configure for CoreML
runtime:
  gpu:
    backend: coreml
    memory: 2GB

AMD GPU

# Install ROCm dependencies
uv add onnxruntime-rocm

# Configure for ROCm
runtime:
  gpu:
    backend: rocm
    memory: 4GB

Windows

# Install DirectML
uv add onnxruntime-directml

# Configure for DirectML
runtime:
  gpu:
    backend: directml
    memory: 2GB
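
Each backend: value above corresponds to an ONNX Runtime execution provider. To reproduce a configuration outside GPUX, you can pass an ordered provider list to onnxruntime directly; a minimal sketch ("model.onnx" is a placeholder path):

import onnxruntime as ort

# Ordered fallback list: ONNX Runtime tries TensorRT first,
# then CUDA, then the CPU provider.
sess = ort.InferenceSession(
    "model.onnx",
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)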

Platform Detection

GPUX automatically detects your platform and selects the best provider:

from gpux import GPUXRuntime

# Auto-detect best provider
runtime = GPUXRuntime(model_path="model.onnx")

# Check selected provider
info = runtime.get_provider_info()
print(f"Using: {info['name']} on {info['platform']}")

Performance Tips by Platform

NVIDIA

  • Use TensorRT for best performance (up to 4x faster than CUDA)
  • Enable FP16 precision for RTX GPUs (see the sketch below)
  • Optimize batch size based on GPU memory
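
For example, the FP16 tip maps to a TensorRT provider option in onnxruntime; a minimal sketch (the 2 GB workspace cap is an assumption to tune for your GPU memory):

import onnxruntime as ort

# TensorRT provider options: enable FP16 kernels and cap the
# builder workspace at 2 GB.
trt_options = {
    "trt_fp16_enable": True,
    "trt_max_workspace_size": 2 * 1024**3,
}

sess = ort.InferenceSession(
    "model.onnx",
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)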

Apple Silicon

  • Use CoreML for Neural Engine acceleration (see the sketch below)
  • Models under 1 GB work best on M-series chips
  • Enable ANE (Apple Neural Engine) optimizations
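
A minimal CoreML session sketch with onnxruntime; whether individual ops run on the Neural Engine, GPU, or CPU is decided by CoreML when it partitions the model:

import onnxruntime as ort

# CoreML first; ops it cannot handle fall back to the CPU provider.
sess = ort.InferenceSession(
    "model.onnx",
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)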

AMD

  • Use ROCm with latest drivers
  • Monitor GPU utilization with rocm-smi
  • Linux recommended over Windows

Intel

  • Use OpenVINO for iGPU + CPU optimization
  • Works well with INT8 quantized models (see the sketch below)
  • Good for edge deployments
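
The INT8 tip usually means quantizing the model offline; onnxruntime ships a dynamic-quantization helper (file paths are placeholders):

from onnxruntime.quantization import QuantType, quantize_dynamic

# Dynamic quantization: weights are stored as INT8, activations
# are quantized on the fly at inference time.
quantize_dynamic(
    "model.onnx",
    "model.int8.onnx",
    weight_type=QuantType.QInt8,
)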

Windows

  • DirectML works with any DirectX 12 GPU (see the sketch below)
  • Good fallback for non-NVIDIA systems
  • Supports NVIDIA, AMD, and Intel GPUs
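
In onnxruntime-directml the provider is registered as DmlExecutionProvider, and a device_id option selects the adapter on multi-GPU machines; a minimal sketch:

import onnxruntime as ort

# DirectML on adapter 0; CPU fallback for unsupported ops.
sess = ort.InferenceSession(
    "model.onnx",
    providers=[
        ("DmlExecutionProvider", {"device_id": 0}),
        "CPUExecutionProvider",
    ],
)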

Environment Setup

Linux (NVIDIA)

# Install CUDA toolkit (from NVIDIA's apt repository)
sudo apt-get install cuda-toolkit-12-0

# Install GPUX with GPU support
uv add onnxruntime-gpu

macOS (Apple Silicon)

# No additional setup needed
uv add onnxruntime

Linux (AMD)

# Install ROCm (from AMD's apt repository)
sudo apt-get install rocm-dkms

# Install GPUX with ROCm
uv add onnxruntime-rocm

Windows

# Install DirectML runtime
# (Usually included with Windows 10/11)

# Install GPUX with DirectML
uv add onnxruntime-directml

Troubleshooting

Provider Not Available

Check available providers:

gpux inspect

GPU Not Detected

Verify GPU and drivers:

# NVIDIA
nvidia-smi

# AMD
rocm-smi

# Intel
clinfo

Performance Issues

  1. Check GPU utilization
  2. Verify correct provider is selected
  3. Optimize batch size (see the sketch after this list)
  4. Consider model quantization
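
A rough way to pick a batch size is to sweep a few values and time them; a minimal sketch, assuming a model with a single dynamic-batch input named "input" of shape [N, 3, 224, 224] (adjust the name and shape to your model):

import time

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

for batch in (1, 4, 8, 16):
    x = np.random.rand(batch, 3, 224, 224).astype(np.float32)
    sess.run(None, {"input": x})  # warm-up run
    start = time.perf_counter()
    for _ in range(10):
        sess.run(None, {"input": x})
    ms = (time.perf_counter() - start) / 10 * 1000
    print(f"batch={batch}: {ms:.1f} ms/run, {ms / batch:.1f} ms/sample")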

See Also