🚀 About

High-performance inference API for transformer models trained with LLM Tool. Deploy your classification, generation, and embedding models with automatic resource management, concurrent request handling, and Ollama integration for generative AI.
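A minimal sketch of how a client might check the service and discover loaded models, assuming the API runs locally on port 8000 and exposes the `/health` and `/models` endpoints listed under the Playground section; the base URL and response shapes are assumptions, not the documented contract.

```python
import requests

# Assumed base URL; substitute the host/port of your deployment.
BASE_URL = "http://localhost:8000"

# Confirm the service is up before sending inference traffic.
health = requests.get(f"{BASE_URL}/health", timeout=5)
health.raise_for_status()
print("health:", health.json())

# List the models the server has loaded (response shape is an assumption).
models = requests.get(f"{BASE_URL}/models", timeout=5)
models.raise_for_status()
print("models:", models.json())
```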

📊 Text Classification
🤖 Ollama Integration
⚡ GPU Acceleration
🔒 API Key Auth
📦 Client Packages
🧪 Playground
Test the inference API with your models. Your API key is stored locally and never sent to third parties.
API status: Online • Version 2.1.0 • Endpoints: /health, /models, /infer
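As a rough sketch of an authenticated request against the `/infer` endpoint, the example below sends a text classification input with an API key. The header name (`X-API-Key`), the payload fields, and the model name are assumptions for illustration only; check the server's actual request schema.

```python
import os
import requests

# Assumed base URL; substitute the host/port of your deployment.
BASE_URL = "http://localhost:8000"
API_KEY = os.environ.get("LLM_TOOL_API_KEY", "")

payload = {
    "model": "my-classifier",            # hypothetical model name
    "inputs": ["This release is great!"],  # assumed field for input texts
}

resp = requests.post(
    f"{BASE_URL}/infer",
    json=payload,
    headers={"X-API-Key": API_KEY},      # header name is an assumption
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```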