Models

Browse 431+ AI models with one API key, across every major provider.

Showing 24 of 431

deepseek

DeepSeek: DeepSeek V4 Flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

text -> text, 1.0M ctx
Input: $0.108/M
Output: $0.216/M

xiaomi

Xiaomi: MiMo-V2.5

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

text, image, audio, video -> text, 1.0M ctx
Input: $0.126/M
Output: $0.336/M

minimax

MiniMax: MiniMax M3

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

text, image, video -> text, 1.0M ctx
Input: $0.360/M
Output: $1.44/M

tencent

Tencent: Hy3 preview

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...

text -> text, 262K ctx
Input: $0.076/M
Output: $0.252/M

anthropic

Anthropic: Claude Opus 4.7

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...

text, image, file -> text, 1.0M ctx
Input: $6.00/M
Output: $30.00/M

z-ai

Z.ai: GLM 5.2

GLM 5.2 is a large-scale reasoning model from Z.ai. It supports text input and output with a 1M-token context window, and is suited for long-horizon agent workflows, project-level software engineering,...

text -> text, 1.0M ctx
Input: $0.907/M
Output: $2.85/M

deepseek

DeepSeek: DeepSeek V4 Pro

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

text -> text, 1.0M ctx
Input: $0.522/M
Output: $1.04/M

anthropic

Anthropic: Claude Opus 4.8

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

text, image, file -> text, 1.0M ctx
Input: $6.00/M
Output: $30.00/M

anthropic

Anthropic: Claude Sonnet 4.6

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...

text, image, file -> text, 1.0M ctx
Input: $3.60/M
Output: $18.00/M

stepfun

StepFun: Step 3.7 Flash

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

text, image, video -> text, 256K ctx
Input: $0.240/M
Output: $1.38/M

openai

OpenAI: GPT-5.5

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

text, image, file -> text, 1.1M ctx
Input: $6.00/M
Output: $36.00/M

google

Google: Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool...

text, image, file, audio, 1.0M ctx
Input: $0.600/M
Output: $3.60/M

deepseek

DeepSeek: DeepSeek V3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

text -> text, 131K ctx
Input: $0.275/M
Output: $0.412/M

nvidia

NVIDIA: Nemotron 3 Ultra (free)

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

text -> text, 1.0M ctx
Input: Free
Output: Free

google

Google: Gemini 2.5 Flash Lite

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

text, image, file, audio, 1.0M ctx
Input: $0.120/M
Output: $0.480/M

nex-agi

Nex AGI: Nex-N2-Pro (free)

Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts text and image input and produces...

text, image -> text, 262K ctx
Input: Free
Output: Free

inclusionai

inclusionAI: Ring-2.6-1T (free)

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...

text -> text, 262K ctx
Input: Free
Output: Free

google

Google: Gemini 2.5 Flash

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

text, image, file, audio, 1.0M ctx
Input: $0.360/M
Output: $3.00/M

poolside

Poolside: Laguna M.1 (free)

Laguna M.1 is the flagship coding agent model from Poolside (https://poolside.ai/), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 256K...

text -> text, 262K ctx
Input: Free
Output: Free

x-ai

xAI: Grok 4.1 Fast

Grok 4.1 Fast is xAI's best agentic tool-calling model, and it shines in real-world use cases like customer support and deep research. It has a 2M context window. Reasoning can be enabled/disabled using...

text, image, file -> text, 2.0M ctx
Input: $0.240/M
Output: $0.600/M

xiaomi

Xiaomi: MiMo-V2.5-Pro

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

text -> text, 1.0M ctx
Input: $0.522/M
Output: $1.04/M

moonshotai

MoonshotAI: Kimi K2.6

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...

text, image -> text, 262K ctx
Input: $0.792/M
Output: $4.09/M

openai

OpenAI: GPT-5.4

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...

text, image, file -> text, 1.1M ctx
Input: $3.00/M
Output: $18.00/M

google

Google: Gemini 3.5 Flash

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

text, image, file, audio, 1.0M ctx
Input: $1.80/M
Output: $10.80/M
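
Prices in this listing are quoted in US dollars per million tokens ("/M"), separately for input and output. As a rough illustration of how that translates into the cost of a single request, here is a minimal Python sketch. It assumes billing is simply linear in token counts, with no caching discounts, minimum charges, or tiered rates, which the listing does not confirm. The `PRICES` table and the `estimate_cost` function are hypothetical names introduced for this example; the three price pairs are copied from the entries above.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the listing above.
# The dictionary itself is a hypothetical helper, not part of any API.
PRICES = {
    "DeepSeek V4 Flash": (Decimal("0.108"), Decimal("0.216")),
    "Claude Opus 4.7": (Decimal("6.00"), Decimal("30.00")),
    "GPT-5.5": (Decimal("6.00"), Decimal("36.00")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming purely linear pricing."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 10_000, 2_000)}")
```

Decimal is used instead of float so that small per-token amounts do not pick up binary rounding noise. Models listed as "Free" could be represented with a price of zero under the same assumption.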