Configuration Schema

Complete reference for the gpux.yml configuration file.


Overview

The gpux.yml file is the single source of truth for GPUX model configuration. It defines everything from model paths to runtime settings and serving configuration.

name: string              # Required: Model name
version: string           # Optional: Model version (default: "1.0.0")
description: string       # Optional: Model description

model:                    # Required: Model configuration
  source: string          # Required: Path to model file
  format: string          # Optional: Model format (default: "onnx")
  version: string         # Optional: Model version

inputs:                   # Required: Input specifications
  - name: string          # Required: Input name
    type: string          # Required: Data type
    shape: [int]          # Optional: Tensor shape
    required: bool        # Optional: Required field (default: true)
    max_length: int       # Optional: Maximum length
    description: string   # Optional: Input description

outputs:                  # Required: Output specifications
  - name: string          # Required: Output name
    type: string          # Required: Data type
    shape: [int]          # Optional: Tensor shape
    labels: [string]      # Optional: Class labels
    description: string   # Optional: Output description

runtime:                  # Optional: Runtime configuration
  gpu:
    memory: string        # GPU memory limit (default: "2GB")
    backend: string       # GPU backend (default: "auto")
  timeout: int            # Timeout in seconds (default: 30)
  batch_size: int         # Batch size (default: 1)
  enable_profiling: bool  # Enable profiling (default: false)

serving:                  # Optional: HTTP serving configuration
  port: int               # Server port (default: 8080)
  host: string            # Server host (default: "0.0.0.0")
  batch_size: int         # Serving batch size (default: 1)
  timeout: int            # Request timeout (default: 5)
  max_workers: int        # Max worker processes (default: 4)

preprocessing:            # Optional: Preprocessing configuration
  tokenizer: string       # Tokenizer name
  max_length: int         # Max tokenization length
  resize: [int, int]      # Image resize dimensions
  normalize: string       # Normalization method
  custom: {}              # Custom preprocessing config

metadata:                 # Optional: Custom metadata
  key: value              # Any custom key-value pairs

Minimal Example

name: sentiment-analysis
version: 1.0.0

model:
  source: ./model.onnx

inputs:
  - name: text
    type: string
    required: true

outputs:
  - name: sentiment
    type: float32
    shape: [2]
    labels: [negative, positive]

Complete Example

name: sentiment-analysis
version: 1.0.0
description: BERT-based sentiment analysis model

model:
  source: ./model.onnx
  format: onnx
  version: 1.0.0

inputs:
  - name: input_ids
    type: int64
    shape: [1, 128]
    required: true
    description: Tokenized input IDs
  - name: attention_mask
    type: int64
    shape: [1, 128]
    required: true
    description: Attention mask for input

outputs:
  - name: logits
    type: float32
    shape: [1, 2]
    labels: [negative, positive]
    description: Sentiment classification logits

runtime:
  gpu:
    memory: 2GB
    backend: auto
  timeout: 30
  batch_size: 1
  enable_profiling: false

serving:
  port: 8080
  host: 0.0.0.0
  batch_size: 1
  timeout: 5
  max_workers: 4

preprocessing:
  tokenizer: bert-base-uncased
  max_length: 128

metadata:
  author: GPUX Team
  license: MIT
  dataset: SST-2

Top-Level Fields

name (required)

Model name used for identification.

  • Type: string
  • Required: Yes
  • Example: sentiment-analysis, image-classifier

name: sentiment-analysis

version

Model version following semantic versioning.

  • Type: string
  • Required: No
  • Default: 1.0.0
  • Example: 1.0.0, 2.1.3

version: 1.0.0

description

Human-readable model description.

  • Type: string
  • Required: No
  • Example: BERT-based sentiment analysis

description: BERT-based sentiment analysis model for binary classification

Section References

Detailed documentation for each configuration section:

  • Model - Model source and format configuration
  • Inputs - Input specifications and validation
  • Outputs - Output specifications and labels
  • Runtime - GPU, timeout, and batch settings
  • Serving - HTTP server configuration
  • Preprocessing - Data preprocessing settings

Data Types

Supported Types

GPUX supports the following data types:

Type      Description              Example
float32   32-bit floating point    [0.5, 1.2, -0.3]
float64   64-bit floating point    [0.123456789]
int32     32-bit integer           [1, 2, 3]
int64     64-bit integer           [100, 200, 300]
uint8     8-bit unsigned integer   [0, 255]
bool      Boolean                  [true, false]
string    Text string              ["hello", "world"]

Type Conversion

GPUX automatically converts compatible types:

  • Python lists → NumPy arrays
  • JSON numbers → float32/int64
  • JSON strings → string tensors
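
The mapping above can be sketched in a few lines of Python. This is not GPUX's own conversion code: infer_gpux_type is a hypothetical helper, and it assumes that JSON integers map to int64 and all other JSON numbers to float32.

```python
import json


def infer_gpux_type(value):
    """Guess a schema type name for a decoded JSON value (illustrative only)."""
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "float32"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list) and value:
        # Assumption: lists are homogeneous; one float promotes a numeric list.
        kinds = {infer_gpux_type(item) for item in value}
        if kinds <= {"int64", "float32"}:
            return "float32" if "float32" in kinds else "int64"
        if len(kinds) == 1:
            return kinds.pop()
    raise TypeError(f"cannot infer a type for {value!r}")


payload = json.loads('{"input_ids": [101, 2023, 102], "score": 0.5, "text": "hello"}')
inferred = {key: infer_gpux_type(val) for key, val in payload.items()}
```

Here inferred maps input_ids to int64, score to float32, and text to string. Empty lists and mixed-type lists are rejected rather than guessed.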

Shape Specifications

Fixed Shapes

Specify exact tensor dimensions:

inputs:
  - name: image
    type: float32
    shape: [1, 3, 224, 224]  # [batch, channels, height, width]
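
A fixed shape also makes the tensor's size easy to estimate: multiply the dimensions together, then multiply by the element width. The sketch below is only a worked calculation. ELEMENT_BYTES and tensor_bytes are hypothetical names, not part of GPUX, and the one-byte width for bool is an assumption borrowed from NumPy.

```python
from math import prod

# Byte width of each fixed-size type from the Data Types table.
# "string" is omitted because its size depends on the content.
ELEMENT_BYTES = {
    "float32": 4, "float64": 8, "int32": 4, "int64": 8, "uint8": 1, "bool": 1,
}


def tensor_bytes(dtype, shape):
    """Return the byte size of a tensor whose shape is fully fixed."""
    if any(dim < 0 for dim in shape):
        raise ValueError("shape contains a dynamic dimension")
    return prod(shape) * ELEMENT_BYTES[dtype]


image_bytes = tensor_bytes("float32", [1, 3, 224, 224])
print(image_bytes)  # 602112
```

The image input above therefore occupies 1 × 3 × 224 × 224 × 4 = 602,112 bytes, a little under 0.6 MiB per batch element.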

Dynamic Shapes

Use -1 for any dimension whose size can vary between requests:

inputs:
  - name: text
    type: int64
    shape: [1, -1]  # Variable sequence length

Or omit shape entirely:

inputs:
  - name: text
    type: int64  # Fully dynamic shape
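
One way to read these rules is as a shape-matching check. The sketch below shows the idea in Python. shape_matches is a hypothetical helper, and the assumption that the rank must still agree when a shape is declared is ours, not something the schema states.

```python
def shape_matches(declared, actual):
    """Check an actual tensor shape against a declared one.

    declared: list of ints where -1 means "any size", or None if omitted.
    actual:   list of concrete dimension sizes.
    """
    if declared is None:
        return True  # shape omitted: fully dynamic
    if len(declared) != len(actual):
        return False  # assumption: a declared shape still fixes the rank
    return all(d == -1 or d == a for d, a in zip(declared, actual))


ok_variable_length = shape_matches([1, -1], [1, 57])
bad_batch_size = shape_matches([1, -1], [2, 57])
```

With shape [1, -1], a [1, 57] tensor is accepted and a [2, 57] tensor is rejected, because only the second dimension is dynamic.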

Alternative Syntax

Dict-Style Inputs/Outputs

You can also use dictionary syntax for inputs and outputs:

inputs:
  input_ids:
    type: int64
    shape: [1, 128]
    required: true
  attention_mask:
    type: int64
    shape: [1, 128]

outputs:
  logits:
    type: float32
    shape: [1, 2]
    labels: [negative, positive]

This is equivalent to the list syntax:

inputs:
  - name: input_ids
    type: int64
    shape: [1, 128]
    required: true
  - name: attention_mask
    type: int64
    shape: [1, 128]

outputs:
  - name: logits
    type: float32
    shape: [1, 2]
    labels: [negative, positive]
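
The equivalence can be made concrete by converting one form into the other. The following Python sketch operates on already-parsed configuration data. normalize_specs is a hypothetical helper, not a GPUX function.

```python
def normalize_specs(specs):
    """Return list-style specs from either list-style or dict-style input."""
    if isinstance(specs, dict):
        # Each mapping key becomes the "name" field of its entry.
        return [{"name": name, **(fields or {})} for name, fields in specs.items()]
    return list(specs)


dict_style = {
    "input_ids": {"type": "int64", "shape": [1, 128], "required": True},
    "attention_mask": {"type": "int64", "shape": [1, 128]},
}
list_style = [
    {"name": "input_ids", "type": "int64", "shape": [1, 128], "required": True},
    {"name": "attention_mask", "type": "int64", "shape": [1, 128]},
]
same = normalize_specs(dict_style) == list_style
```

Because Python dictionaries preserve insertion order, the entries come out in the order they were written in the file.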

Validation Rules

Required Fields

The following fields are required:

  • name - Model name
  • model.source - Model file path
  • inputs - At least one input
  • outputs - At least one output

Input Validation

  • At least one input must be specified
  • Each input must have name and type
  • shape is optional but recommended
  • required defaults to true

Output Validation

  • At least one output must be specified
  • Each output must have name and type
  • The number of labels should match the size of the output's class dimension (for example, two labels for shape [1, 2])
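
The required-field rules above can be sketched as a small checker over an already-parsed configuration. validate_config is a hypothetical function, not GPUX's validator. It assumes list-style inputs and outputs, and every message other than "At least one input must be specified" is invented for illustration.

```python
def validate_config(config):
    """Return a list of error messages for a parsed gpux.yml dictionary."""
    errors = []
    if not config.get("name"):
        errors.append("name is required")
    if not (config.get("model") or {}).get("source"):
        errors.append("model.source is required")
    for section, singular in (("inputs", "input"), ("outputs", "output")):
        specs = config.get(section) or []
        if not specs:
            errors.append(f"At least one {singular} must be specified")
        for index, spec in enumerate(specs):
            # Every input and output needs both a name and a type.
            for field in ("name", "type"):
                if not spec.get(field):
                    errors.append(f"{section}[{index}].{field} is required")
    return errors


incomplete = {"name": "model", "model": {"source": "./model.onnx"}}
problems = validate_config(incomplete)
```

For the incomplete configuration, problems holds one message for the missing inputs and one for the missing outputs. An empty list means every rule shown here passed.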

Memory Validation

GPU memory must be specified with units:

runtime:
  gpu:
    memory: 2GB        # ✅ Valid
    # memory: 512MB    # ✅ Valid
    # memory: 1024KB   # ✅ Valid
    # memory: 2        # ❌ Invalid (missing units)
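
A unit check of this kind can be sketched with a regular expression. parse_memory is a hypothetical helper, not GPUX's parser. Treating units as binary multiples (1 KB = 1024 bytes), accepting decimal amounts, and ignoring case are all assumptions.

```python
import re

# Assumption: units are binary multiples.
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}
_MEMORY_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(KB|MB|GB)\s*$", re.IGNORECASE)


def parse_memory(value):
    """Convert a memory string such as "2GB" to a byte count."""
    match = _MEMORY_RE.match(str(value))
    if match is None:
        raise ValueError("Memory must be specified as GB, MB, or KB")
    amount, unit = match.groups()
    return int(float(amount) * _UNITS[unit.upper()])


two_gb = parse_memory("2GB")
print(two_gb)  # 2147483648
```

A bare number such as 2 has no unit suffix, so the pattern does not match and the helper raises ValueError.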

Environment Variables

You can use environment variables in configuration:

model:
  source: ${MODEL_PATH}/model.onnx

runtime:
  gpu:
    memory: ${GPU_MEMORY:-2GB}  # Default to 2GB

serving:
  port: ${PORT:-8080}
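
The ${VAR} and ${VAR:-default} forms follow shell parameter-expansion syntax. The Python sketch below shows how such substitution might work. expand_env is a hypothetical helper, not GPUX's implementation, and raising an error for an unset variable with no default is an assumption.

```python
import os
import re

_VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text, env=None):
    """Expand ${VAR} and ${VAR:-default} references in a string."""
    env = os.environ if env is None else env

    def replace(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is not None:
            # As in the shell, ":-" applies when the variable is unset or empty.
            return value if value else default
        if value is None:
            raise KeyError(f"environment variable {name} is not set")
        return value

    return _VAR_RE.sub(replace, text)


source = expand_env("${MODEL_PATH}/model.onnx", {"MODEL_PATH": "/opt/models"})
memory = expand_env("${GPU_MEMORY:-2GB}", {})
```

Expanded values are always strings, so a numeric field such as serving.port still has to be converted to an integer after substitution.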

File Paths

Relative Paths

Paths are relative to the gpux.yml file:

model:
  source: ./model.onnx            # Same directory
  # source: ./models/model.onnx   # Subdirectory
  # source: ../model.onnx         # Parent directory

Absolute Paths

You can also use absolute paths:

model:
  source: /opt/models/sentiment.onnx
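
Both path rules can be sketched with pathlib. resolve_source is a hypothetical helper, not a GPUX function, and the example paths are invented and assume a POSIX filesystem.

```python
from pathlib import Path


def resolve_source(config_path, source):
    """Resolve model.source against the directory that contains gpux.yml."""
    source_path = Path(source)
    if source_path.is_absolute():
        return source_path  # absolute paths are used as written
    return (Path(config_path).parent / source_path).resolve()


relative = resolve_source("/srv/app/gpux.yml", "./models/model.onnx")
absolute = resolve_source("/srv/app/gpux.yml", "/opt/models/sentiment.onnx")
```

Resolving against the configuration file's directory, rather than the current working directory, keeps a project portable: the same gpux.yml works wherever the command is run from.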

Common Patterns

Image Classification

name: image-classifier
model:
  source: ./resnet50.onnx
inputs:
  - name: image
    type: float32
    shape: [1, 3, 224, 224]
outputs:
  - name: probabilities
    type: float32
    shape: [1, 1000]
preprocessing:
  resize: [224, 224]
  normalize: imagenet

Text Classification

name: sentiment-analysis
model:
  source: ./bert.onnx
inputs:
  - name: input_ids
    type: int64
    shape: [1, 128]
  - name: attention_mask
    type: int64
    shape: [1, 128]
outputs:
  - name: logits
    type: float32
    shape: [1, 2]
    labels: [negative, positive]
preprocessing:
  tokenizer: bert-base-uncased
  max_length: 128

Object Detection

name: yolo-detector
model:
  source: ./yolov8.onnx
inputs:
  - name: images
    type: float32
    shape: [1, 3, 640, 640]
outputs:
  - name: boxes
    type: float32
    shape: [-1, 4]
  - name: scores
    type: float32
    shape: [-1]
  - name: classes
    type: int64
    shape: [-1]
preprocessing:
  resize: [640, 640]

Best Practices

Always Specify Shapes

Include shape information for better validation and performance:

inputs:
  - name: input
    type: float32
    shape: [1, 10]  # ✅ Recommended

Use Descriptive Names

Choose clear, descriptive names for inputs/outputs:

inputs:
  - name: input_ids   # ✅ Clear
  # - name: x         # ❌ Unclear

Document with Descriptions

Add descriptions for complex inputs/outputs:

inputs:
  - name: attention_mask
    type: int64
    description: Binary mask indicating valid tokens

GPU Memory Limits

Set appropriate GPU memory limits based on model size:

runtime:
  gpu:
    memory: 2GB  # Adjust based on model


Troubleshooting

Validation Errors

Error: "At least one input must be specified"

# ❌ Missing inputs
name: model
model:
  source: ./model.onnx

# ✅ Fixed (outputs are also required)
name: model
model:
  source: ./model.onnx
inputs:
  - name: input
    type: float32
outputs:
  - name: output
    type: float32

Error: "Memory must be specified as GB, MB, or KB"

# ❌ Missing units
runtime:
  gpu:
    memory: 2

# ✅ Fixed
runtime:
  gpu:
    memory: 2GB

