# Sentiment Analysis
BERT-based text classification for sentiment analysis.
## 🎯 What You'll Build
A sentiment classifier that determines if text is positive or negative using BERT.
Example:

- Input: `"I love this product!"`
- Output: `{sentiment: [0.1, 0.9]}` (90% positive)
## 📦 Model Preparation
### Export BERT Model
```python
from optimum.onnxruntime import ORTModelForSequenceClassification

model_name = "distilbert-base-uncased-finetuned-sst-2-english"

# Export the Hugging Face model to ONNX
model = ORTModelForSequenceClassification.from_pretrained(
    model_name,
    export=True,
)
model.save_pretrained("./sentiment-model")
```
This creates `sentiment-model/model.onnx`.
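Before wiring the model into GPUX, you can sanity-check the export by inspecting the graph's inputs and outputs directly with onnxruntime:

```python
import onnxruntime as ort

session = ort.InferenceSession("./sentiment-model/model.onnx")

# The exported graph should expose input_ids and attention_mask
for inp in session.get_inputs():
    print(inp.name, inp.type, inp.shape)
for out in session.get_outputs():
    print(out.name, out.type, out.shape)
```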
## ⚙️ Configuration
Create `gpux.yml`:
```yaml
name: sentiment-analysis
version: 1.0.0
description: "BERT sentiment classification"

model:
  source: ./sentiment-model/model.onnx
  format: onnx

inputs:
  input_ids:
    type: int64
    shape: [1, 128]
    required: true
  attention_mask:
    type: int64
    shape: [1, 128]
    required: true

outputs:
  logits:
    type: float32
    shape: [1, 2]
    labels: [negative, positive]

runtime:
  gpu:
    backend: auto
    memory: 2GB
```
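The `shape: [1, 128]` entries must match the tokenizer's `max_length` used below, and the `labels` field documents how output indices map to class names. In client code, that same mapping looks like this (a small illustration, not part of the GPUX API):

```python
import numpy as np

labels = ["negative", "positive"]  # same order as the labels field above
logits = np.array([-1.2, 1.0])     # example logits from the model

print(labels[int(np.argmax(logits))])  # -> positive
```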
## 🚀 Running Inference
### Prepare Input
```python
from transformers import AutoTokenizer
import json

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

text = "I love this product!"
# Pad and truncate to the 128-token shape declared in gpux.yml
tokens = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="np")

# Save as JSON
input_data = {
    "input_ids": tokens["input_ids"].tolist(),
    "attention_mask": tokens["attention_mask"].tolist(),
}
with open("input.json", "w") as f:
    json.dump(input_data, f)
```
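It's worth confirming the arrays match the `[1, 128]` shapes declared in `gpux.yml` before running:

```python
assert tokens["input_ids"].shape == (1, 128)
assert tokens["attention_mask"].shape == (1, 128)
```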
### Build and Run
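A typical build-and-run flow looks like the following (hypothetical command names; check `gpux --help` for the actual CLI):

```bash
# Hypothetical invocation — verify commands and flags with `gpux --help`
gpux build .
gpux run sentiment-analysis --file input.json
```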
Output:

```
90% positive! ✅
```
## 🐍 Python API
```python
from transformers import AutoTokenizer
from gpux import GPUXRuntime
import numpy as np

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

# Initialize runtime
runtime = GPUXRuntime("sentiment-model/model.onnx")

# Tokenize
text = "This is amazing!"
tokens = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="np")

# Inference
result = runtime.infer({
    "input_ids": tokens["input_ids"],
    "attention_mask": tokens["attention_mask"],
})

# Convert logits to probabilities (numerically stable softmax)
logits = result["logits"][0]
probs = np.exp(logits - logits.max()) / np.sum(np.exp(logits - logits.max()))
print(f"Negative: {probs[0]:.2%}")
print(f"Positive: {probs[1]:.2%}")
```
## 🌐 Production Deployment
### HTTP Server
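Assuming GPUX ships an HTTP serving mode, the model can be exposed on the port the client below expects (hypothetical invocation; consult the serving docs for the real command and flags):

```bash
# Hypothetical — check the GPUX serving docs for the actual command
gpux serve sentiment-analysis --port 8080
```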
### Client
```python
import requests

response = requests.post(
    "http://localhost:8080/predict",
    json={
        "input_ids": [[101, 1045, ...]],  # Tokenized
        "attention_mask": [[1, 1, ...]]
    }
)
print(response.json())
```
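A fuller client sketch that produces real token IDs instead of the elided placeholders, assuming the same `/predict` endpoint and JSON contract:

```python
import requests
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

tokens = tokenizer("I love this product!", padding="max_length",
                   truncation=True, max_length=128, return_tensors="np")

response = requests.post(
    "http://localhost:8080/predict",
    json={
        "input_ids": tokens["input_ids"].tolist(),
        "attention_mask": tokens["attention_mask"].tolist(),
    },
)
print(response.json())
```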
## 💡 Key Takeaways
**Success**

- ✅ BERT model export to ONNX
- ✅ Text tokenization
- ✅ Multi-input models
- ✅ Probability calculation
- ✅ Production serving
Next: Image Classification →