Python API

Using GPUX programmatically in your applications.


🎯 Overview

Complete guide to the GPUX Python API.


🚀 Quick Start

import numpy as np

from gpux import GPUXRuntime

# Initialize the runtime with a model
runtime = GPUXRuntime(model_path="model.onnx")

# Run inference (input names and shapes depend on your model)
data = np.array([[1, 2, 3]], dtype=np.float32)
result = runtime.infer({"input": data})

# Release GPU resources when done
runtime.cleanup()

๐Ÿ—๏ธ GPUXRuntime

Initialization

runtime = GPUXRuntime(
    model_path="model.onnx",
    provider="auto",  # or "cuda", "coreml", etc.
    memory_limit="2GB",
    batch_size=1,
    timeout=30
)
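With `provider="auto"`, GPUX selects the best available execution backend for you. Conceptually, this kind of selection is a walk down a preference list; the sketch below illustrates the idea with hypothetical names and is not GPUX's actual internal logic:

```python
def pick_provider(preferred: str, available: set[str]) -> str:
    """Resolve a provider name, falling back through a preference order."""
    preference = ["cuda", "coreml", "directml", "cpu"]
    if preferred != "auto":
        # An explicit choice is honored only if that backend is present.
        if preferred not in available:
            raise ValueError(f"provider {preferred!r} not available")
        return preferred
    # "auto": take the first available backend in preference order.
    for name in preference:
        if name in available:
            return name
    raise RuntimeError("no execution provider available")

pick_provider("auto", {"cpu", "coreml"})  # → "coreml"
```

On a machine with only CoreML and CPU backends, `"auto"` resolves to `"coreml"`; an explicit `"cuda"` request would raise instead of silently falling back.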

Methods

infer(input_data)

Run inference:

result = runtime.infer({
    "input": np.array([[1, 2, 3]])
})

batch_infer(batch_data)

Batch processing:

results = runtime.batch_infer([
    {"input": data1},
    {"input": data2}
])
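`batch_infer` takes a list of per-sample input dicts. A small helper for chunking a larger array into that shape can be useful; the helper name below is ours, not part of the GPUX API:

```python
import numpy as np

def make_batches(array: np.ndarray, batch_size: int, input_name: str = "input"):
    """Split a (N, ...) array into a list of {input_name: chunk} dicts."""
    return [
        {input_name: array[i : i + batch_size]}
        for i in range(0, len(array), batch_size)
    ]

samples = np.arange(10, dtype=np.float32).reshape(10, 1)
batches = make_batches(samples, batch_size=4)
# 3 chunks with shapes (4, 1), (4, 1), (2, 1)
```

The last chunk is simply shorter when the dataset size is not a multiple of the batch size.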

benchmark(input_data, num_runs, warmup_runs)

Performance testing:

metrics = runtime.benchmark(
    {"input": data},
    num_runs=1000,
    warmup_runs=100
)
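The exact fields of the returned metrics object depend on the GPUX version, but the reduction from per-run timings to summary statistics looks like this (a standalone sketch, not GPUX code):

```python
import numpy as np

def summarize_latencies(latencies_ms: list[float]) -> dict[str, float]:
    """Reduce per-run latencies (in ms) to common benchmark summaries."""
    arr = np.asarray(latencies_ms, dtype=np.float64)
    mean_ms = float(arr.mean())
    return {
        "mean_ms": mean_ms,
        "p95_ms": float(np.percentile(arr, 95)),
        "throughput_rps": 1000.0 / mean_ms,  # runs per second
    }

stats = summarize_latencies([10.0, 10.0, 10.0, 10.0])
# mean 10 ms → 100 runs/second
```

Warmup runs are excluded from such statistics so that one-time costs (kernel compilation, memory allocation) do not skew the numbers.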

get_model_info()

Model information:

info = runtime.get_model_info()
print(info.name, info.version)

Context Manager

with GPUXRuntime("model.onnx") as runtime:
    result = runtime.infer({"input": data})
# cleanup() is called automatically on exit
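The context-manager form guarantees that cleanup runs even if inference raises. It is equivalent to a `try`/`finally` around `cleanup()`; the sketch below demonstrates the pattern with a stand-in class so it is self-contained:

```python
class FakeRuntime:
    """Stand-in with the same enter/exit shape as a GPUXRuntime."""

    def __init__(self):
        self.cleaned_up = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.cleanup()
        return False  # do not swallow exceptions

    def cleanup(self):
        self.cleaned_up = True

rt = FakeRuntime()
try:
    with rt:
        raise RuntimeError("inference failed")
except RuntimeError:
    pass

assert rt.cleaned_up  # cleanup ran despite the exception
```

This is why the context-manager form is preferred over calling `cleanup()` manually: GPU resources are released on every exit path.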

🔧 Configuration

From Python

from gpux.config.parser import GPUXConfigParser

parser = GPUXConfigParser()
config = parser.parse_file("gpux.yml")

runtime = GPUXRuntime(
    model_path=parser.get_model_path("."),
    **config.runtime.dict()
)
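For reference, a `gpux.yml` that the parser above would read might look like the fragment below. The field names mirror the `GPUXRuntime` keywords shown earlier; treat this as an illustrative sketch, not the authoritative schema:

```yaml
# gpux.yml (illustrative; check the GPUX configuration reference for the schema)
model:
  path: model.onnx
runtime:
  provider: auto
  memory_limit: 2GB
  batch_size: 1
  timeout: 30
```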

🧪 Testing

import pytest
from gpux import GPUXRuntime

def test_inference():
    runtime = GPUXRuntime("model.onnx")
    try:
        result = runtime.infer({"input": test_data})
        assert result is not None
    finally:
        # Release GPU resources even if the assertion fails
        runtime.cleanup()

💡 Key Takeaways

Success

✅ GPUXRuntime initialization
✅ Inference methods
✅ Batch processing
✅ Benchmarking
✅ Context managers
✅ Configuration


Previous: Batch Inference | Next: Error Handling →