gpux inspect¶
Inspect models from registries or local projects, or display runtime information.
Overview¶
The gpux inspect command provides detailed information about models, their inputs/outputs, metadata, and available execution providers. It supports both registry models (pulled from Hugging Face) and local models with gpux.yml configuration.
Arguments¶
MODEL_NAME¶
Name of the model to inspect (optional). Can be:
- Registry model: distilbert-base-uncased-finetuned-sst-2-english
- Local model: sentiment-analysis (requires gpux.yml)
- Model path: ./models/bert or /path/to/model
Behavior:
- If provided: Inspects the specified model
- If omitted: Shows runtime information (available providers)
Examples:
# Registry models
gpux inspect distilbert-base-uncased-finetuned-sst-2-english
gpux inspect facebook/opt-125m
gpux inspect sentence-transformers/all-MiniLM-L6-v2
# Local models
gpux inspect sentiment-analysis
gpux inspect ./models/bert
# Runtime information
gpux inspect
Options¶
--config, -c¶
Configuration file name.
- Type: string
- Default: gpux.yml
--model, -m¶
Direct path to model file (bypasses model lookup).
- Type: string
--json¶
Output in JSON format (useful for scripting).
- Type: boolean
- Default: false
--verbose¶
Enable verbose output.
- Type: boolean
- Default: false
Inspection Modes¶
1. Inspect Model by Name¶
Inspect a model using its name:
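gpux inspect sentiment-analysis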
Output:
Configuration¶
| Property | Value |
|---|---|
| Name | sentiment-analysis |
| Version | 1.0.0 |
| Model Source | ./model.onnx |
| Model Format | onnx |
| GPU Memory | 2GB |
| GPU Backend | auto |
| Batch Size | 1 |
| Timeout | 30s |
Model Information¶
| Property | Value |
|---|---|
| Name | sentiment-analysis |
| Version | 1.0.0 |
| Format | onnx |
| Size | 256.0 MB |
| Path | ./model.onnx |
Input Specifications¶
| Name | Type | Shape | Required | Description |
|---|---|---|---|---|
| input_ids | int64 | [1, 128] | ✅ | Tokenized input IDs |
| attention_mask | int64 | [1, 128] | ✅ | Attention mask |
Output Specifications¶
| Name | Type | Shape | Labels | Description |
|---|---|---|---|---|
| logits | float32 | [1, 2] | negative, positive | Sentiment logits |
2. Inspect Model File¶
Inspect a model file directly (no configuration required):
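# inspect an ONNX file directly via the --model flag (no gpux.yml needed)
gpux inspect --model ./model.onnx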
Output:
Model Information¶
| Property | Value |
|---|---|
| Name | model |
| Version | 1 |
| Format | onnx |
| Size | 256.0 MB |
| Path | ./model.onnx |
Input Specifications¶
| Name | Type | Shape | Required | Description |
|---|---|---|---|---|
| input | float32 | [1, 3, 224, 224] | ✅ | N/A |
Output Specifications¶
| Name | Type | Shape | Labels | Description |
|---|---|---|---|---|
| output | float32 | [1, 1000] | N/A | N/A |
3. Inspect Runtime¶
Show available execution providers (no model name):
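gpux inspect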
Output:
Available Execution Providers¶
| Provider | Available | Platform | Description |
|---|---|---|---|
| TensorrtExecutionProvider | ❌ | NVIDIA TensorRT | NVIDIA TensorRT optimization |
| CUDAExecutionProvider | ✅ | NVIDIA CUDA | NVIDIA CUDA GPU acceleration |
| ROCmExecutionProvider | ❌ | AMD ROCm | AMD GPU acceleration |
| CoreMLExecutionProvider | ❌ | Apple CoreML | Apple Silicon optimization |
| DmlExecutionProvider | ❌ | DirectML | Windows DirectX acceleration |
| OpenVINOExecutionProvider | ❌ | Intel OpenVINO | Intel hardware acceleration |
| CPUExecutionProvider | ✅ | CPU | Universal CPU fallback |
Provider Priority¶
| Priority | Provider | Status |
|---|---|---|
| 1 | TensorrtExecutionProvider | Not Available |
| 2 | CUDAExecutionProvider | Available |
| 3 | ROCmExecutionProvider | Not Available |
| 4 | CoreMLExecutionProvider | Not Available |
| 5 | DmlExecutionProvider | Not Available |
| 6 | OpenVINOExecutionProvider | Not Available |
| 7 | CPUExecutionProvider | Available |
JSON Output¶
Model Inspection (JSON)¶
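gpux inspect sentiment-analysis --json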
Output:
{
"config": {
"name": "sentiment-analysis",
"version": "1.0.0",
"model": {
"source": "./model.onnx",
"format": "onnx"
},
"inputs": {
"input_ids": {
"type": "int64",
"shape": [1, 128],
"required": true
}
},
"outputs": {
"logits": {
"type": "float32",
"shape": [1, 2],
"labels": ["negative", "positive"]
}
},
"runtime": {
"gpu": {
"memory": "2GB",
"backend": "auto"
},
"batch_size": 1,
"timeout": 30
}
},
"model_info": {
"name": "sentiment-analysis",
"version": "1.0.0",
"format": "onnx",
"size_mb": 256.0,
"path": "./model.onnx",
"inputs": [
{
"name": "input_ids",
"type": "int64",
"shape": [1, 128],
"required": true,
"description": "Tokenized input IDs"
}
],
"outputs": [
{
"name": "logits",
"type": "float32",
"shape": [1, 2],
"labels": ["negative", "positive"]
}
]
}
}
Runtime Inspection (JSON)¶
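gpux inspect --json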
Output:
{
"available_providers": [
"CUDAExecutionProvider",
"CPUExecutionProvider"
],
"provider_details": {
"TensorrtExecutionProvider": {
"available": false,
"platform": "NVIDIA TensorRT",
"description": "NVIDIA TensorRT optimization"
},
"CUDAExecutionProvider": {
"available": true,
"platform": "NVIDIA CUDA",
"description": "NVIDIA CUDA GPU acceleration"
},
"CPUExecutionProvider": {
"available": true,
"platform": "CPU",
"description": "Universal CPU fallback"
}
}
}
Examples¶
Inspect Sentiment Model¶
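gpux inspect sentiment-analysis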
Inspect ONNX File Directly¶
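gpux inspect --model ./model.onnx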
Check Available Providers¶
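gpux inspect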
JSON Output for Scripting¶
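gpux inspect sentiment-analysis --json | jq '.model_info'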
Save Inspection to File¶
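# example output path
gpux inspect sentiment-analysis --json > model-info.json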
Check if GPU is Available¶
gpux inspect --json | jq '.available_providers | contains(["CUDAExecutionProvider"])'
# Output: true or false
Use Cases¶
1. Verify Model Inputs/Outputs¶
Before running inference, check expected inputs:
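gpux inspect sentiment-analysis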
2. Debug Configuration Issues¶
Verify configuration is correctly parsed:
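# the Configuration table reflects the parsed gpux.yml values
gpux inspect sentiment-analysis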
3. Check Provider Availability¶
Ensure GPU providers are available:
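gpux inspect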
4. Automate Model Validation¶
Use JSON output in CI/CD:
#!/bin/bash
SIZE=$(gpux inspect sentiment --json | jq '.model_info.size_mb')
if (( $(echo "$SIZE > 500" | bc -l) )); then
echo "Error: Model too large ($SIZE MB)"
exit 1
fi
5. Generate Model Documentation¶
Extract model specs for documentation:
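# example output path
gpux inspect sentiment-analysis --json | jq '.model_info' > docs/model-spec.json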
Error Handling¶
Model Not Found¶
Solution: Ensure the model exists and gpux.yml is configured.
Model File Not Found¶
Solution: Check the model.source path in gpux.yml.
Invalid Model File¶
Solution: Verify the ONNX model is valid:
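One way to check, assuming the onnx Python package is installed:
# raises an error if the model file is not a valid ONNX graph
python -c "import onnx; onnx.checker.check_model(onnx.load('./model.onnx'))"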
Best Practices¶
Inspect Before Running
Always inspect a model before running inference to understand its inputs/outputs:
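gpux inspect sentiment-analysis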
Use JSON for Automation
Use --json flag for scripting and automation:
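gpux inspect sentiment-analysis --json | jq '.model_info.size_mb'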
Check Providers Before Deployment
Verify GPU providers are available on target platform:
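gpux inspect --json | jq '.available_providers'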
Save Model Info
Save inspection results for documentation:
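# example output path
gpux inspect sentiment-analysis --json > docs/model-info.json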
Output Fields¶
Model Information¶
- name: Model name
- version: Model version
- format: Model format (onnx)
- size_mb: Model size in megabytes
- path: Path to model file
Input Specifications¶
- name: Input name
- type: Data type (float32, int64, etc.)
- shape: Tensor shape
- required: Whether input is required
- description: Input description
Output Specifications¶
- name: Output name
- type: Data type
- shape: Tensor shape
- labels: Class labels (if applicable)
- description: Output description
Provider Information¶
- provider: Provider name
- available: Whether provider is available
- platform: Platform/hardware type
- description: Provider description
Related Commands¶
- gpux build - Build and validate models
- gpux run - Run inference
- gpux serve - Start HTTP server