🚀 About

High-performance inference API for transformer models trained with LLM Tool. Deploy your classification, generation, and embedding models with automatic resource management, concurrent request handling, and Ollama integration for generative AI.
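A minimal sketch of how a client might check the service and discover loaded models, assuming the API runs locally on port 8000 and exposes the `/health` and `/models` endpoints listed under the Playground section; the base URL and response shapes are assumptions, not the documented contract.

```python
import requests

# Assumed base URL; substitute the host/port of your deployment.
BASE_URL = "http://localhost:8000"

# Confirm the service is up before sending inference traffic.
health = requests.get(f"{BASE_URL}/health", timeout=5)
health.raise_for_status()
print("health:", health.json())

# List the models the server has loaded (response shape is an assumption).
models = requests.get(f"{BASE_URL}/models", timeout=5)
models.raise_for_status()
print("models:", models.json())
```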

📊 Text Classification
🤖 Ollama Integration
⚡ GPU Acceleration
🔒 API Key Auth
📦 Client Packages
🧪 Playground
Test the inference API with your models. Your API key is stored locally and never sent to third parties.
API status: Online • Version 2.1.0 • Endpoints: /health, /models, /infer
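As a rough sketch of an authenticated request against the `/infer` endpoint, the example below sends a text classification input with an API key. The header name (`X-API-Key`), the payload fields, and the model name are assumptions for illustration only; check the server's actual request schema.

```python
import os
import requests

# Assumed base URL; substitute the host/port of your deployment.
BASE_URL = "http://localhost:8000"
API_KEY = os.environ.get("LLM_TOOL_API_KEY", "")

payload = {
    "model": "my-classifier",            # hypothetical model name
    "inputs": ["This release is great!"],  # assumed field for input texts
}

resp = requests.post(
    f"{BASE_URL}/infer",
    json=payload,
    headers={"X-API-Key": API_KEY},      # header name is an assumption
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```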