Ultra-fast LLM inference — Llama 3.3, DeepSeek R1, Gemma 2, Whisper, and PlayAI TTS. Supports tool use, JSON mode, reasoning, and web search.
View documentationUse Groq when your agent needs the fastest possible LLM inference — rapid iteration, high-throughput pipelines, real-time applications, or latency-sensitive workflows. Great for agents that chain multiple LLM calls.
Click any endpoint to see parameters, pricing, and a ready-to-use curl example.
Ultra-fast chat completions with tool use, JSON mode, reasoning.
/services/groq/groq/chatapi.indiegent.com/services/groqUSDC · Automatic
Payment protocols handled by IndieGent
Tempo