# Python API

Using GPUX programmatically in your applications.

## 🎯 Overview

Complete guide to the GPUX Python API.

## 🚀 Quick Start
```python
import numpy as np

from gpux import GPUXRuntime

# Initialize the runtime with an ONNX model
runtime = GPUXRuntime(model_path="model.onnx")

# Run inference -- the input name and shape here are illustrative
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = runtime.infer({"input": data})

# Release resources when done
runtime.cleanup()
```
## 🏗️ GPUXRuntime

### Initialization
```python
runtime = GPUXRuntime(
    model_path="model.onnx",
    provider="auto",   # or "cuda", "coreml", etc.
    memory_limit="2GB",
    batch_size=1,
    timeout=30,
)
```
### Methods

#### infer(input_data)

Run inference:
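A minimal sketch, assuming the model exposes a single float32 input named `"input"` (the name and shape below are illustrative) and that `infer()` returns a mapping of output names to arrays:

```python
import numpy as np

# Illustrative input -- match your model's actual input name and shape
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = runtime.infer({"input": data})

# The result is assumed to map output names to NumPy arrays
for name, value in result.items():
    print(name, value.shape)
```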
#### batch_infer(batch_data)

Batch processing:
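A sketch assuming `batch_data` is a list of per-sample input dicts and that results come back in the same order:

```python
import numpy as np

# Eight illustrative samples; the input name and shape are assumptions
batch = [
    {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
    for _ in range(8)
]
results = runtime.batch_infer(batch)
print(len(results))  # assumed: one result per input sample, order preserved
```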
#### benchmark(input_data, num_runs, warmup_runs)

Performance testing:
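A sketch using the signature shown above; the structure of the returned statistics is not specified here, so treat the printed fields as version-dependent:

```python
import numpy as np

data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # illustrative input
stats = runtime.benchmark(
    {"input": data},
    num_runs=100,    # timed runs
    warmup_runs=10,  # untimed runs to let caches and the provider warm up
)
print(stats)  # e.g. latency statistics; exact keys depend on the GPUX version
```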
#### get_model_info()

Model information:
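A sketch; the returned fields are assumed to cover input/output metadata and the active execution provider:

```python
info = runtime.get_model_info()
print(info)  # assumed to include input/output names, shapes, and the active provider
```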
### Context Manager
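`GPUXRuntime` can be used as a context manager, so resources are released without an explicit `cleanup()` call. A minimal sketch, assuming cleanup runs automatically on exit and using an illustrative input name and shape:

```python
import numpy as np

from gpux import GPUXRuntime

data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # illustrative

with GPUXRuntime(model_path="model.onnx") as runtime:
    result = runtime.infer({"input": data})
# cleanup() is assumed to run automatically when the block exits
```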
## 🔧 Configuration

### From Python
```python
from gpux import GPUXRuntime
from gpux.config.parser import GPUXConfigParser

parser = GPUXConfigParser()
config = parser.parse_file("gpux.yml")

runtime = GPUXRuntime(
    model_path=parser.get_model_path("."),
    **config.runtime.dict(),
)
```
## 🧪 Testing
```python
import numpy as np

from gpux import GPUXRuntime


def test_inference():
    runtime = GPUXRuntime(model_path="model.onnx")
    test_data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # match your model's input
    result = runtime.infer({"input": test_data})
    assert result is not None
    runtime.cleanup()
```
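To avoid reloading the model for every test, a module-scoped pytest fixture can own the runtime. A sketch; the input name and shape are assumptions:

```python
import numpy as np
import pytest

from gpux import GPUXRuntime


@pytest.fixture(scope="module")
def runtime():
    rt = GPUXRuntime(model_path="model.onnx")
    yield rt
    rt.cleanup()


def test_inference_returns_output(runtime):
    data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # illustrative shape
    result = runtime.infer({"input": data})
    assert result is not None
```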
## 💡 Key Takeaways

!!! success
    - ✅ GPUXRuntime initialization
    - ✅ Inference methods
    - ✅ Batch processing
    - ✅ Benchmarking
    - ✅ Context managers
    - ✅ Configuration