{"data":[{"id":"cmoxkjdnp006k6whd5i7sfu70","openrouterId":"deepseek/deepseek-r1-0528","slug":"deepseek-deepseek-r1-0528","name":"DeepSeek: R1 0528","description":"May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1) Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...","contextLength":163840,"pricing":{"prompt":6e-7,"completion":0.0000025800000000000003,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"9899253295","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-05-28T13:59:30.833Z","avgThroughputTps":22,"avgLatencyMs":2005.875,"isActive":true},{"id":"cmoxkjbz6002y6whdup4y8d4k","openrouterId":"google/gemini-3-flash-preview","slug":"google-gemini-3-flash-preview","name":"Google: Gemini 3 Flash Preview","description":"Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool...","contextLength":1048576,"pricing":{"prompt":6e-7,"completion":0.0000036,"image":6e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"986216558595","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-12-17T10:57:58.000Z","avgThroughputTps":75.5,"avgLatencyMs":1112.5,"isActive":true},{"id":"cmoxkjdec00606whda81snl5y","openrouterId":"qwen/qwen3-235b-a22b-2507","slug":"qwen-qwen3-235b-a22b-2507","name":"Qwen: Qwen3 235B A22B Instruct 2507","description":"Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...","contextLength":262144,"pricing":{"prompt":8.52e-8,"completion":1.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"98468569681","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-07-21T13:39:15.880Z","avgThroughputTps":36.45454545454545,"avgLatencyMs":612.9090909090909,"isActive":true},{"id":"cmoxkjdi100686whdq21sthlk","openrouterId":"morph/morph-v3-large","slug":"morph-morph-v3-large","name":"Morph: Morph V3 Large","description":"Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transformations. The model requires the prompt to be in the following format: <instruction>{instruction}</instruction> <code>{initial_code}</code>...","contextLength":262144,"pricing":{"prompt":0.0000010799999999999998,"completion":0.00000228,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"965075544","provider":"morph","authorName":"morph","authorSlug":"morph","iconUrl":"https://openrouter.ai/images/icons/morph.svg","releaseDate":"2025-07-07T13:54:18.685Z","avgThroughputTps":2590,"avgLatencyMs":390.5,"isActive":true},{"id":"cmoxkjc93003j6whd1te759k9","openrouterId":"deepseek/deepseek-v3.2","slug":"deepseek-deepseek-v3.2","name":"DeepSeek: DeepSeek V3.2","description":"DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...","contextLength":131072,"pricing":{"prompt":3.024e-7,"completion":4.536e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"962920458767","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-12-01T08:10:42.818Z","avgThroughputTps":21.666666666666668,"avgLatencyMs":1835,"isActive":true},{"id":"cmoxkje74007q6whd2d1wfpoi","openrouterId":"anthropic/claude-3.7-sonnet","slug":"anthropic-claude-3.7-sonnet","name":"Anthropic: Claude 3.7 Sonnet","description":"Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and...","contextLength":200000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"9593295506","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-02-24T13:35:10.000Z","avgThroughputTps":44,"avgLatencyMs":702.5,"isActive":true},{"id":"cmoxkjall00036whdaxh88cdz","openrouterId":"openai/gpt-chat-latest","slug":"openai-gpt-chat-latest","name":"OpenAI: GPT Chat Latest","description":"GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always resolves to the latest Instant chat model used in ChatGPT. As OpenAI rolls out new Instant model updates...","contextLength":400000,"pricing":{"prompt":0.000006,"completion":0.000036,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"951288121","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-05-05T12:56:52.607Z","avgThroughputTps":74,"avgLatencyMs":914,"isActive":true},{"id":"cmoxkjcn3004d6whd32vrrv2g","openrouterId":"google/gemini-2.5-flash-image","slug":"google-gemini-2.5-flash-image","name":"Google: Nano Banana (Gemini 2.5 Flash Image)","description":"Gemini 2.5 Flash Image, a.k.a. \"Nano Banana,\" is now generally available. It is a state of the art image generation model with contextual understanding. It is capable of image generation,...","contextLength":32768,"pricing":{"prompt":3.6e-7,"completion":0.000003,"image":3.6e-7,"request":0},"modalities":["text","image->text","image"],"perWeekTokens":"9271410461","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-10-07T16:53:51.000Z","avgThroughputTps":169,"avgLatencyMs":3796,"isActive":true},{"id":"cmoxkjcgn003z6whdkjiouxf0","openrouterId":"nvidia/nemotron-nano-12b-v2-vl:free","slug":"nvidia-nemotron-nano-12b-v2-vl-free","name":"NVIDIA: Nemotron Nano 12B 2 VL (free)","description":"NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...","contextLength":128000,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"9201096125","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2025-10-28T14:19:25.723Z","avgThroughputTps":3,"avgLatencyMs":14040,"isActive":true},{"id":"cmoxkjd9b005p6whdn1o1tgtp","openrouterId":"qwen/qwen3-coder-30b-a3b-instruct","slug":"qwen-qwen3-coder-30b-a3b-instruct","name":"Qwen: Qwen3 Coder 30B A3B Instruct","description":"Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...","contextLength":160000,"pricing":{"prompt":8.4e-8,"completion":3.24e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"9190989069","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-07-31T10:32:59.359Z","avgThroughputTps":36.666666666666664,"avgLatencyMs":1914.5,"isActive":true},{"id":"cmoxkjbje00226whdhgjro666","openrouterId":"qwen/qwen3.5-flash-02-23","slug":"qwen-qwen3.5-flash-02-23","name":"Qwen: Qwen3.5-Flash","description":"The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...","contextLength":1000000,"pricing":{"prompt":7.8e-8,"completion":3.12e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"91239343171","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-25T16:09:36.000Z","avgThroughputTps":79,"avgLatencyMs":517,"isActive":true},{"id":"cmoxkjcym00526whd7cy2yvur","openrouterId":"nvidia/nemotron-nano-9b-v2","slug":"nvidia-nemotron-nano-9b-v2","name":"NVIDIA: Nemotron Nano 9B V2","description":"NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...","contextLength":131072,"pricing":{"prompt":4.8e-8,"completion":1.92e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"904786419","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2025-09-05T17:13:27.486Z","avgThroughputTps":123,"avgLatencyMs":151,"isActive":true},{"id":"cmoxkjeyz009d6whdizef2zvb","openrouterId":"meta-llama/llama-3.1-70b-instruct","slug":"meta-llama-llama-3.1-70b-instruct","name":"Meta: Llama 3.1 70B Instruct","description":"Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high quality dialogue usecases. It has demonstrated strong...","contextLength":131072,"pricing":{"prompt":4.8e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"9030858071","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-07-22T20:00:00.000Z","avgThroughputTps":23.25,"avgLatencyMs":274.25,"isActive":true},{"id":"cmoxkjc3k00376whdss2vvaj3","openrouterId":"z-ai/glm-4.6v","slug":"z-ai-glm-4.6v","name":"Z.ai: GLM 4.6V","description":"GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts...","contextLength":131072,"pricing":{"prompt":3.6e-7,"completion":0.0000010799999999999998,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"883293316","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-12-08T10:24:22.464Z","avgThroughputTps":25.5,"avgLatencyMs":2965.75,"isActive":true},{"id":"cmoxkjchy00426whd2mnenvsr","openrouterId":"ibm-granite/granite-4.0-h-micro","slug":"ibm-granite-granite-4.0-h-micro","name":"IBM: Granite 4.0 Micro","description":"Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tuned for long...","contextLength":131000,"pricing":{"prompt":2.04e-8,"completion":1.344e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"869726002","provider":"ibm-granite","authorName":"ibm-granite","authorSlug":"ibm-granite","iconUrl":"https://openrouter.ai/images/icons/ibm-granite.svg","releaseDate":"2025-10-19T22:34:55.126Z","avgThroughputTps":29,"avgLatencyMs":487,"isActive":true},{"id":"cmoxkjf4e009p6whdoq89ao2m","openrouterId":"microsoft/wizardlm-2-8x22b","slug":"microsoft-wizardlm-2-8x22b","name":"WizardLM-2 8x22B","description":"WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing state-of-the-art opensource models. It is...","contextLength":65536,"pricing":{"prompt":7.44e-7,"completion":7.44e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"856739943","provider":"microsoft","authorName":"Microsoft","authorSlug":"microsoft","iconUrl":"https://openrouter.ai/images/icons/Microsoft.svg","releaseDate":"2024-04-15T20:00:00.000Z","avgThroughputTps":10,"avgLatencyMs":922,"isActive":true},{"id":"cmoxkjbx7002u6whdh4bddo5k","openrouterId":"bytedance-seed/seed-1.6-flash","slug":"bytedance-seed-seed-1.6-flash","name":"ByteDance Seed: Seed 1.6 Flash","description":"Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of...","contextLength":262144,"pricing":{"prompt":9e-8,"completion":3.6e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"818374542","provider":"bytedance-seed","authorName":"bytedance-seed","authorSlug":"bytedance-seed","iconUrl":"https://openrouter.ai/images/icons/bytedance-seed.svg","releaseDate":"2025-12-23T10:50:11.000Z","avgThroughputTps":53.5,"avgLatencyMs":1015,"isActive":true},{"id":"cmoxkjck800476whdj5z4vx19","openrouterId":"qwen/qwen3-vl-8b-instruct","slug":"qwen-qwen3-vl-8b-instruct","name":"Qwen: Qwen3 VL 8B Instruct","description":"Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...","contextLength":256000,"pricing":{"prompt":9.6e-8,"completion":6e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"8082252716","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-10-14T13:35:08.402Z","avgThroughputTps":59.375,"avgLatencyMs":596.75,"isActive":true},{"id":"cmoxkjavu000p6whdu1vdgouu","openrouterId":"deepseek/deepseek-v4-pro","slug":"deepseek-deepseek-v4-pro","name":"DeepSeek: DeepSeek V4 Pro","description":"DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...","contextLength":1048576,"pricing":{"prompt":5.22e-7,"completion":0.000001044,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"794723733569","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2026-04-23T23:17:59.654Z","avgThroughputTps":29.833333333333332,"avgLatencyMs":1809.0416666666667,"isActive":true},{"id":"cmoxkjey0009b6whdx5j43m2c","openrouterId":"openai/gpt-4o-2024-08-06","slug":"openai-gpt-4o-2024-08-06","name":"OpenAI: GPT-4o (2024-08-06)","description":"The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the respone_format. Read more [here](https://openai.com/index/introducing-structured-outputs-in-the-api/). GPT-4o (\"o\" for \"omni\") is...","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"787150509","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-08-05T20:00:00.000Z","avgThroughputTps":38,"avgLatencyMs":542,"isActive":true},{"id":"cmoxkjbl600256whdvan7my6h","openrouterId":"openai/gpt-5.3-codex","slug":"openai-gpt-5.3-codex","name":"OpenAI: GPT-5.3-Codex","description":"GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...","contextLength":400000,"pricing":{"prompt":0.0000021,"completion":0.0000168,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"77155102214","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-02-24T13:52:44.481Z","avgThroughputTps":47,"avgLatencyMs":3657.25,"isActive":true},{"id":"cmoxkjcw9004x6whdn16n6adk","openrouterId":"qwen/qwen3-next-80b-a3b-instruct:free","slug":"qwen-qwen3-next-80b-a3b-instruct-free","name":"Qwen: Qwen3 Next 80B A3B Instruct (free)","description":"Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...","contextLength":262144,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"766423430","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-11T13:36:53.637Z","avgThroughputTps":22,"avgLatencyMs":1040,"isActive":true},{"id":"cmoxkjdx200746whdxoxvai13","openrouterId":"openai/gpt-4.1-mini","slug":"openai-gpt-4.1-mini","name":"OpenAI: GPT-4.1 Mini","description":"GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard...","contextLength":1047576,"pricing":{"prompt":4.8e-7,"completion":0.00000192,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"76259600573","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-04-14T13:23:01.000Z","avgThroughputTps":39.333333333333336,"avgLatencyMs":805,"isActive":true},{"id":"cmoxkjdp0006n6whdydvcrwkc","openrouterId":"google/gemma-3n-e4b-it","slug":"google-gemma-3n-e4b-it","name":"Google: Gemma 3n 4B","description":"Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...","contextLength":32768,"pricing":{"prompt":7.2e-8,"completion":1.44e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"761760362","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-05-20T17:33:44.157Z","avgThroughputTps":32,"avgLatencyMs":229,"isActive":true},{"id":"cmoxkjckp00486whd7e5jv2pt","openrouterId":"openai/gpt-5-image","slug":"openai-gpt-5-image","name":"OpenAI: GPT-5 Image","description":"[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state-of-the-art image generation capabilities. It offers major improvements in reasoning, code quality, and user experience while incorporating GPT Image 1's superior instruction following,...","contextLength":400000,"pricing":{"prompt":0.000012,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text","image"],"perWeekTokens":"76039898","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-10-14T09:19:46.029Z","avgThroughputTps":85,"avgLatencyMs":9925,"isActive":true},{"id":"cmoxkjekm008i6whd2z6bewtj","openrouterId":"meta-llama/llama-3.3-70b-instruct:free","slug":"meta-llama-llama-3.3-70b-instruct-free","name":"Meta: Llama 3.3 70B Instruct (free)","description":"The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"759012818","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-12-06T12:28:57.828Z","avgThroughputTps":17,"avgLatencyMs":2425,"isActive":true},{"id":"cmoxkjb1700106whdevavt1rb","openrouterId":"moonshotai/kimi-k2.6","slug":"moonshotai-kimi-k2.6","name":"MoonshotAI: Kimi K2.6","description":"Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...","contextLength":262144,"pricing":{"prompt":8.76e-7,"completion":0.0000041879999999999995,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"758269608257","provider":"moonshotai","authorName":"moonshotai","authorSlug":"moonshotai","iconUrl":"https://openrouter.ai/images/icons/moonshotai.svg","releaseDate":"2026-04-20T11:36:42.832Z","avgThroughputTps":31.941176470588236,"avgLatencyMs":1308.5,"isActive":true},{"id":"cmoxkjf4u009q6whdeg8on1mg","openrouterId":"openai/gpt-4-turbo","slug":"openai-gpt-4-turbo","name":"OpenAI: GPT-4 Turbo","description":"The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.\n\nTraining data: up to December 2023.","contextLength":128000,"pricing":{"prompt":0.000012,"completion":0.000036,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"75470234","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-04-08T20:00:00.000Z","avgThroughputTps":29,"avgLatencyMs":1152,"isActive":true},{"id":"cmoxkjd7v005m6whdljrvyisd","openrouterId":"openai/gpt-oss-20b","slug":"openai-gpt-oss-20b","name":"OpenAI: gpt-oss-20b","description":"gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for...","contextLength":131072,"pricing":{"prompt":3.6e-8,"completion":1.68e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"74166651749","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-05T13:17:09.000Z","avgThroughputTps":150.25,"avgLatencyMs":628.5833333333334,"isActive":true},{"id":"cmoxkje15007d6whdzgm5zeid","openrouterId":"mistralai/mistral-small-3.1-24b-instruct","slug":"mistralai-mistral-small-3.1-24b-instruct","name":"Mistral: Mistral Small 3.1 24B","description":"Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provides state-of-the-art performance in text-based reasoning and...","contextLength":128000,"pricing":{"prompt":4.212e-7,"completion":6.66e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"737898810","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-03-17T15:15:37.004Z","avgThroughputTps":39,"avgLatencyMs":411,"isActive":true},{"id":"cmoxkjbds001q6whd2ncbidbr","openrouterId":"bytedance-seed/seed-2.0-lite","slug":"bytedance-seed-seed-2.0-lite","name":"ByteDance Seed: Seed-2.0-Lite","description":"Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...","contextLength":262144,"pricing":{"prompt":3e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"736027766","provider":"bytedance-seed","authorName":"bytedance-seed","authorSlug":"bytedance-seed","iconUrl":"https://openrouter.ai/images/icons/bytedance-seed.svg","releaseDate":"2026-03-10T11:40:31.000Z","avgThroughputTps":19,"avgLatencyMs":1972.5,"isActive":true},{"id":"cmoxkjdxw00766whd6tykiop2","openrouterId":"alfredpros/codellama-7b-instruct-solidity","slug":"alfredpros-codellama-7b-instruct-solidity","name":"AlfredPros: CodeLLaMa 7B Instruct Solidity","description":"A finetuned 7 billion parameters Code LLaMA - Instruct model to generate Solidity smart contract using 4-bit QLoRA finetuning provided by PEFT library.","contextLength":4096,"pricing":{"prompt":9.6e-7,"completion":0.00000144,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"735821","provider":"alfredpros","authorName":"alfredpros","authorSlug":"alfredpros","iconUrl":"https://openrouter.ai/images/icons/alfredpros.svg","releaseDate":"2025-04-14T10:44:34.216Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjcdf003s6whdv5bj4kh9","openrouterId":"openai/gpt-5.1-codex","slug":"openai-gpt-5.1-codex","name":"OpenAI: GPT-5.1-Codex","description":"GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....","contextLength":400000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"7318902030","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-11-13T13:58:18.000Z","avgThroughputTps":48,"avgLatencyMs":2311.5,"isActive":true},{"id":"cmoxkje06007b6whdvuzdayi8","openrouterId":"deepseek/deepseek-chat-v3-0324","slug":"deepseek-deepseek-chat-v3-0324","name":"DeepSeek: DeepSeek V3 0324","description":"DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the [DeepSeek V3](/deepseek/deepseek-chat-v3) model and performs really well...","contextLength":163840,"pricing":{"prompt":2.4e-7,"completion":9.24e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"71875493431","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-03-24T09:59:15.252Z","avgThroughputTps":20.333333333333332,"avgLatencyMs":1656.5833333333333,"isActive":true},{"id":"de7d40a91bfe4fd7877bb892","openrouterId":"anthropic/claude-opus-4.7-fast","slug":"anthropic-claude-opus-4.7-fast","name":"Anthropic: Claude Opus 4.7 (Fast)","description":"Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing.\n\nLearn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode","contextLength":1000000,"pricing":{"prompt":0.000036,"completion":0.00017999999999999998,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"7170310242","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2026-05-12T15:10:11.422Z","avgThroughputTps":109,"avgLatencyMs":728,"isActive":true},{"id":"cmoxkjbug002o6whd783c06ol","openrouterId":"liquid/lfm-2.5-1.2b-thinking:free","slug":"liquid-lfm-2.5-1.2b-thinking-free","name":"LiquidAI: LFM2.5-1.2B-Thinking (free)","description":"LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...","contextLength":32768,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"716682624","provider":"liquid","authorName":"Liquid","authorSlug":"liquid","iconUrl":"https://openrouter.ai/images/icons/Liquid.svg","releaseDate":"2026-01-20T11:45:27.038Z","avgThroughputTps":100,"avgLatencyMs":338,"isActive":true},{"id":"cmoxkjc4200386whdqoe5acuj","openrouterId":"nex-agi/deepseek-v3.1-nex-n1","slug":"nex-agi-deepseek-v3.1-nex-n1","name":"Nex AGI: DeepSeek V3.1 Nex N1","description":"DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent autonomy, tool use, and real-world productivity. Nex-N1 demonstrates competitive performance across...","contextLength":131072,"pricing":{"prompt":1.62e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"710256685","provider":"nex-agi","authorName":"nex-agi","authorSlug":"nex-agi","iconUrl":"https://openrouter.ai/images/icons/nex-agi.svg","releaseDate":"2025-12-08T09:33:13.218Z","avgThroughputTps":35,"avgLatencyMs":1941,"isActive":true},{"id":"cmoxkjdii00696whdw02cgtw7","openrouterId":"morph/morph-v3-fast","slug":"morph-morph-v3-fast","name":"Morph: Morph V3 Fast","description":"Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The model requires the prompt to be in the following format: <instruction>{instruction}</instruction> <code>{initial_code}</code> <update>{edit_snippet}</update>...","contextLength":81920,"pricing":{"prompt":9.6e-7,"completion":0.00000144,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"70397165","provider":"morph","authorName":"morph","authorSlug":"morph","iconUrl":"https://openrouter.ai/images/icons/morph.svg","releaseDate":"2025-07-07T13:40:02.233Z","avgThroughputTps":732,"avgLatencyMs":318,"isActive":true},{"id":"cmoxkjbaw001k6whdcr09eckk","openrouterId":"openai/gpt-5.4-nano","slug":"openai-gpt-5.4-nano","name":"OpenAI: GPT-5.4 Nano","description":"GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency...","contextLength":400000,"pricing":{"prompt":2.4e-7,"completion":0.0000015,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"70292798124","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-03-17T07:49:47.000Z","avgThroughputTps":61,"avgLatencyMs":995.25,"isActive":true},{"id":"cmoxkjdaq005s6whd1nh0x2cw","openrouterId":"z-ai/glm-4.5-air:free","slug":"z-ai-glm-4.5-air-free","name":"Z.ai: GLM 4.5 Air (free)","description":"GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"70286098074","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-07-25T15:20:58.066Z","avgThroughputTps":29,"avgLatencyMs":4127,"isActive":true},{"id":"cmoxkjew900976whdnwc7wega","openrouterId":"nousresearch/hermes-3-llama-3.1-70b","slug":"nousresearch-hermes-3-llama-3.1-70b","name":"Nous: Hermes 3 70B Instruct","description":"Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...","contextLength":131072,"pricing":{"prompt":3.6e-7,"completion":3.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"700753214","provider":"nousresearch","authorName":"Nous Research","authorSlug":"nousresearch","iconUrl":"https://openrouter.ai/images/icons/Nous Research.svg","releaseDate":"2024-08-17T20:00:00.000Z","avgThroughputTps":31,"avgLatencyMs":411,"isActive":true},{"id":"cmoxkje9h007v6whdxi77cykv","openrouterId":"google/gemini-2.0-flash-001","slug":"google-gemini-2.0-flash-001","name":"Google: Gemini 2.0 Flash","description":"Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5). It...","contextLength":1048576,"pricing":{"prompt":1.2e-7,"completion":4.8e-7,"image":1.2e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"70057220172","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-02-05T10:30:13.144Z","avgThroughputTps":59,"avgLatencyMs":585.5,"isActive":true},{"id":"cmoxkjbrn002i6whdd0xgj33t","openrouterId":"stepfun/step-3.5-flash","slug":"stepfun-step-3.5-flash","name":"StepFun: Step 3.5 Flash","description":"Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....","contextLength":262144,"pricing":{"prompt":1.2e-7,"completion":3.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"698661246550","provider":"stepfun","authorName":"stepfun","authorSlug":"stepfun","iconUrl":"https://openrouter.ai/images/icons/stepfun.svg","releaseDate":"2026-01-29T18:12:17.060Z","avgThroughputTps":81.33333333333333,"avgLatencyMs":1744.8333333333333,"isActive":true},{"id":"cmoxkjeyi009c6whdbcvhxt6a","openrouterId":"meta-llama/llama-3.1-8b-instruct","slug":"meta-llama-llama-3.1-8b-instruct","name":"Meta: Llama 3.1 8B Instruct","description":"Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast and efficient. It has demonstrated strong performance compared to...","contextLength":131072,"pricing":{"prompt":2.4e-8,"completion":6e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"69673492340","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-07-22T20:00:00.000Z","avgThroughputTps":103.57142857142857,"avgLatencyMs":392.7857142857143,"isActive":true},{"id":"cmoxkjep5008s6whd1i51g015","openrouterId":"thedrummer/unslopnemo-12b","slug":"thedrummer-unslopnemo-12b","name":"TheDrummer: UnslopNemo 12B","description":"UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.","contextLength":32768,"pricing":{"prompt":4.8e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"694194844","provider":"thedrummer","authorName":"Drummer","authorSlug":"thedrummer","iconUrl":"https://openrouter.ai/images/icons/Drummer.svg","releaseDate":"2024-11-08T17:04:08.359Z","avgThroughputTps":72,"avgLatencyMs":627.5,"isActive":true},{"id":"cmoxkjf3f009n6whdjykrdw40","openrouterId":"meta-llama/llama-3-70b-instruct","slug":"meta-llama-llama-3-70b-instruct","name":"Meta: Llama 3 70B Instruct","description":"Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 70B instruct-tuned version was optimized for high quality dialogue usecases. It has demonstrated strong...","contextLength":8192,"pricing":{"prompt":6.119999999999999e-7,"completion":8.88e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"68518686","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-04-17T20:00:00.000Z","avgThroughputTps":19,"avgLatencyMs":1049,"isActive":true},{"id":"cmoxkjd2p005b6whdz4oqodvr","openrouterId":"baidu/ernie-4.5-21b-a3b","slug":"baidu-ernie-4.5-21b-a3b","name":"Baidu: ERNIE 4.5 21B A3B","description":"A sophisticated text-based Mixture-of-Experts (MoE) model featuring 21B total parameters with 3B activated per token, delivering exceptional multimodal understanding and generation through heterogeneous MoE structures and modality-isolated routing. Supporting an...","contextLength":131072,"pricing":{"prompt":8.4e-8,"completion":3.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"67143783","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2025-08-12T17:29:27.753Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjay7000u6whdi712j3wn","openrouterId":"xiaomi/mimo-v2.5","slug":"xiaomi-mimo-v2.5","name":"Xiaomi: MiMo-V2.5","description":"MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...","contextLength":1048576,"pricing":{"prompt":4.8e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","audio","video->text"],"perWeekTokens":"66851417930","provider":"xiaomi","authorName":"Xiaomi","authorSlug":"xiaomi","iconUrl":"https://openrouter.ai/images/icons/Xiaomi.svg","releaseDate":"2026-04-22T12:11:09.307Z","avgThroughputTps":25,"avgLatencyMs":1960,"isActive":true},{"id":"cmoxkjbg3001v6whdinasv4w2","openrouterId":"openai/gpt-5.3-chat","slug":"openai-gpt-5.3-chat","name":"OpenAI: GPT-5.3 Chat","description":"GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly...","contextLength":128000,"pricing":{"prompt":0.0000021,"completion":0.0000168,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"6682064998","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-03-03T13:54:21.454Z","avgThroughputTps":44,"avgLatencyMs":2323.5,"isActive":true},{"id":"cmoxkjccs003r6whdhllkcaki","openrouterId":"openai/gpt-5.1-chat","slug":"openai-gpt-5.1-chat","name":"OpenAI: GPT-5.1 Chat","description":"GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...","contextLength":128000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"66255478910","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-11-13T13:58:22.000Z","avgThroughputTps":45.5,"avgLatencyMs":2276.25,"isActive":true},{"id":"cmoxkjco1004f6whdv2ec8urq","openrouterId":"qwen/qwen3-vl-30b-a3b-instruct","slug":"qwen-qwen3-vl-30b-a3b-instruct","name":"Qwen: Qwen3 VL 30B A3B Instruct","description":"Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...","contextLength":262144,"pricing":{"prompt":1.56e-7,"completion":6.24e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"6590750951","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-10-06T19:47:56.430Z","avgThroughputTps":30,"avgLatencyMs":1552.4285714285713,"isActive":true},{"id":"cmoxkjcoz004h6whd5lrixizq","openrouterId":"z-ai/glm-4.6","slug":"z-ai-glm-4.6","name":"Z.ai: GLM 4.6","description":"Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...","contextLength":202752,"pricing":{"prompt":5.16e-7,"completion":0.000002088,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"6556839870","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-09-30T08:32:56.306Z","avgThroughputTps":31.7,"avgLatencyMs":4759.9,"isActive":true},{"id":"cmoxkjexk009a6whdicpm9lsf","openrouterId":"sao10k/l3-lunaris-8b","slug":"sao10k-l3-lunaris-8b","name":"Sao10K: Llama 3 8B Lunaris","description":"Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general knowledge....","contextLength":8192,"pricing":{"prompt":4.8e-8,"completion":6e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"6486161390","provider":"sao10k","authorName":"Sao10K","authorSlug":"sao10k","iconUrl":"https://openrouter.ai/images/icons/Sao10K.svg","releaseDate":"2024-08-12T20:00:00.000Z","avgThroughputTps":58.5,"avgLatencyMs":631,"isActive":true},{"id":"cmoxkjbkp00246whdnrj31crj","openrouterId":"google/gemini-3.1-pro-preview-customtools","slug":"google-gemini-3.1-pro-preview-customtools","name":"Google: Gemini 3.1 Pro Preview Custom Tools","description":"Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...","contextLength":1048756,"pricing":{"prompt":0.0000024,"completion":0.0000144,"image":0.0000024,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"6389956449","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-02-25T13:58:43.000Z","avgThroughputTps":95,"avgLatencyMs":3942.5,"isActive":true},{"id":"cmoxkjchk00416whduqwxjdiq","openrouterId":"qwen/qwen3-vl-32b-instruct","slug":"qwen-qwen3-vl-32b-instruct","name":"Qwen: Qwen3 VL 32B Instruct","description":"Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...","contextLength":262144,"pricing":{"prompt":1.248e-7,"completion":4.992e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"63470293104","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-10-23T10:55:32.539Z","avgThroughputTps":29,"avgLatencyMs":1489,"isActive":true},{"id":"cmoxkjbzm002z6whdec1xe6qc","openrouterId":"xiaomi/mimo-v2-flash","slug":"xiaomi-mimo-v2-flash","name":"Xiaomi: MiMo-V2-Flash","description":"MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a...","contextLength":262144,"pricing":{"prompt":1.2e-7,"completion":3.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"62903290827","provider":"xiaomi","authorName":"Xiaomi","authorSlug":"xiaomi","iconUrl":"https://openrouter.ai/images/icons/Xiaomi.svg","releaseDate":"2025-12-14T11:55:08.000Z","avgThroughputTps":46,"avgLatencyMs":1257.5,"isActive":true},{"id":"cmoxkjex200996whd7y8p8b7y","openrouterId":"nousresearch/hermes-3-llama-3.1-405b","slug":"nousresearch-hermes-3-llama-3.1-405b","name":"Nous: Hermes 3 405B Instruct","description":"Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...","contextLength":131072,"pricing":{"prompt":0.0000012,"completion":0.0000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"627135956","provider":"nousresearch","authorName":"Nous Research","authorSlug":"nousresearch","iconUrl":"https://openrouter.ai/images/icons/Nous Research.svg","releaseDate":"2024-08-15T20:00:00.000Z","avgThroughputTps":13,"avgLatencyMs":840,"isActive":true},{"id":"cmoxkjetw00926whdmih8ttjv","openrouterId":"meta-llama/llama-3.2-11b-vision-instruct","slug":"meta-llama-llama-3.2-11b-vision-instruct","name":"Meta: Llama 3.2 11B Vision Instruct","description":"Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and...","contextLength":131072,"pricing":{"prompt":2.9399999999999996e-7,"completion":2.9399999999999996e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"625007691","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-09-24T20:00:00.000Z","avgThroughputTps":39,"avgLatencyMs":287,"isActive":true},{"id":"cmoxkjbaf001j6whdsvfzz363","openrouterId":"minimax/minimax-m2.7","slug":"minimax-minimax-m2.7","name":"MiniMax: MiniMax M2.7","description":"MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...","contextLength":204800,"pricing":{"prompt":3.348e-7,"completion":0.00000144,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"611289349549","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2026-03-18T08:24:57.176Z","avgThroughputTps":63.833333333333336,"avgLatencyMs":1684.5833333333333,"isActive":true},{"id":"cmoxkjf5y009s6whdlxmyltow","openrouterId":"mistralai/mistral-large","slug":"mistralai-mistral-large","name":"Mistral Large","description":"This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....","contextLength":128000,"pricing":{"prompt":0.0000024,"completion":0.0000072,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"610524650","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2024-02-25T19:00:00.000Z","avgThroughputTps":36,"avgLatencyMs":570.5,"isActive":true},{"id":"cmoxkjblo00266whdoybadtnl","openrouterId":"aion-labs/aion-2.0","slug":"aion-labs-aion-2.0","name":"AionLabs: Aion-2.0","description":"Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....","contextLength":131072,"pricing":{"prompt":9.6e-7,"completion":0.00000192,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"6097955227","provider":"aion-labs","authorName":"Aion Labs","authorSlug":"aion-labs","iconUrl":"https://openrouter.ai/images/icons/Aion Labs.svg","releaseDate":"2026-02-23T16:15:06.199Z","avgThroughputTps":27,"avgLatencyMs":3258.5,"isActive":true},{"id":"123ba1af601f46279c0f43e6","openrouterId":"deepseek/deepseek-v4-flash:free","slug":"deepseek-deepseek-v4-flash-free","name":"DeepSeek: DeepSeek V4 Flash (free)","description":"DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...","contextLength":1048576,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"60926290951","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2026-04-23T23:17:46.710Z","avgThroughputTps":43,"avgLatencyMs":4094,"isActive":true},{"id":"cmoxkjdle006f6whdlui8z6np","openrouterId":"google/gemini-2.5-pro","slug":"google-gemini-2.5-pro","name":"Google: Gemini 2.5 Pro","description":"Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...","contextLength":1048576,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0.0000015,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"60803028087","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-06-17T10:12:24.000Z","avgThroughputTps":97.25,"avgLatencyMs":2570.5,"isActive":true},{"id":"cmoxkjdnb006j6whdmb7plhl4","openrouterId":"google/gemini-2.5-pro-preview","slug":"google-gemini-2.5-pro-preview","name":"Google: Gemini 2.5 Pro Preview 06-05","description":"Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...","contextLength":1048576,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0.0000015,"request":0},"modalities":["text","image","file","audio->text"],"perWeekTokens":"60803028087","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-06-17T10:12:24.000Z","avgThroughputTps":97.25,"avgLatencyMs":2570.5,"isActive":true},{"id":"cmoxkjdpz006p6whda3synw17","openrouterId":"google/gemini-2.5-pro-preview-05-06","slug":"google-gemini-2.5-pro-preview-05-06","name":"Google: Gemini 2.5 Pro Preview 05-06","description":"Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...","contextLength":1048576,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0.0000015,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"60803028087","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-06-17T10:12:24.000Z","avgThroughputTps":97.25,"avgLatencyMs":2570.5,"isActive":true},{"id":"cmoxkjc85003h6whd1vjkkid5","openrouterId":"arcee-ai/trinity-mini","slug":"arcee-ai-trinity-mini","name":"Arcee AI: Trinity Mini","description":"Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...","contextLength":131072,"pricing":{"prompt":5.3999999999999994e-8,"completion":1.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"606264682","provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2025-12-01T10:08:40.000Z","avgThroughputTps":161,"avgLatencyMs":350,"isActive":true},{"id":"cmoxkjbh1001x6whd8oiq9xo9","openrouterId":"bytedance-seed/seed-2.0-mini","slug":"bytedance-seed-seed-2.0-mini","name":"ByteDance Seed: Seed-2.0-Mini","description":"Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/medium/high), multimodal understanding,...","contextLength":262144,"pricing":{"prompt":1.2e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"600818901","provider":"bytedance-seed","authorName":"bytedance-seed","authorSlug":"bytedance-seed","iconUrl":"https://openrouter.ai/images/icons/bytedance-seed.svg","releaseDate":"2026-02-26T13:38:27.000Z","avgThroughputTps":20,"avgLatencyMs":451,"isActive":true},{"id":"cmoxkjfca00a66whdvai29c3l","openrouterId":"openai/gpt-3.5-turbo","slug":"openai-gpt-3.5-turbo","name":"OpenAI: GPT-3.5 Turbo","description":"GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.\n\nTraining data up to Sep 2021.","contextLength":16385,"pricing":{"prompt":6e-7,"completion":0.0000018,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"599192053","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2023-05-27T20:00:00.000Z","avgThroughputTps":84,"avgLatencyMs":383.5,"isActive":true},{"id":"cmoxkjbvc002q6whdk480zfw9","openrouterId":"openai/gpt-audio","slug":"openai-gpt-audio","name":"OpenAI: GPT Audio","description":"The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced...","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text","audio->text","audio"],"perWeekTokens":"5942815","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-01-19T17:42:49.327Z","avgThroughputTps":1,"avgLatencyMs":5785.5,"isActive":true},{"id":"cmoxkjd1h00586whdhndgzu8d","openrouterId":"deepseek/deepseek-chat-v3.1","slug":"deepseek-deepseek-chat-v3.1","name":"DeepSeek: DeepSeek V3.1","description":"DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...","contextLength":163840,"pricing":{"prompt":2.52e-7,"completion":9.48e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"59054046458","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-08-21T08:33:48.000Z","avgThroughputTps":23.571428571428573,"avgLatencyMs":1337.5,"isActive":true},{"id":"cmoxkjc9k003k6whds2mdlffw","openrouterId":"prime-intellect/intellect-3","slug":"prime-intellect-intellect-3","name":"Prime Intellect: INTELLECT-3","description":"INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math,...","contextLength":131072,"pricing":{"prompt":2.4e-7,"completion":0.00000132,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"58559214","provider":"prime-intellect","authorName":"prime-intellect","authorSlug":"prime-intellect","iconUrl":"https://openrouter.ai/images/icons/prime-intellect.svg","releaseDate":"2025-11-26T22:02:14.494Z","avgThroughputTps":49,"avgLatencyMs":344,"isActive":true},{"id":"cmoxkjbux002p6whd8y5zwrtq","openrouterId":"liquid/lfm-2.5-1.2b-instruct:free","slug":"liquid-lfm-2.5-1.2b-instruct-free","name":"LiquidAI: LFM2.5-1.2B-Instruct (free)","description":"LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.","contextLength":32768,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"583134438","provider":"liquid","authorName":"Liquid","authorSlug":"liquid","iconUrl":"https://openrouter.ai/images/icons/Liquid.svg","releaseDate":"2026-01-20T11:45:21.850Z","avgThroughputTps":104,"avgLatencyMs":269,"isActive":true},{"id":"cmoxkjcbu003p6whdxxuh4hnu","openrouterId":"deepcogito/cogito-v2.1-671b","slug":"deepcogito-cogito-v2.1-671b","name":"Deep Cogito: Cogito v2.1 671B","description":"Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using self play with reinforcement learning...","contextLength":128000,"pricing":{"prompt":0.0000015,"completion":0.0000015,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"57778208","provider":"deepcogito","authorName":"deepcogito","authorSlug":"deepcogito","iconUrl":"https://openrouter.ai/images/icons/deepcogito.svg","releaseDate":"2025-11-13T17:00:33.034Z","avgThroughputTps":32,"avgLatencyMs":285,"isActive":true},{"id":"cmoxkjd67005i6whd39vr2dpd","openrouterId":"openai/gpt-5-nano","slug":"openai-gpt-5-nano","name":"OpenAI: GPT-5 Nano","description":"GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...","contextLength":400000,"pricing":{"prompt":6e-8,"completion":4.8e-7,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"57621368031","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-07T13:23:22.000Z","avgThroughputTps":109,"avgLatencyMs":3391.75,"isActive":true},{"id":"cmoxkjeob008q6whd1bmue9to","openrouterId":"mistralai/pixtral-large-2411","slug":"mistralai-pixtral-large-2411","name":"Mistral: Pixtral Large 2411","description":"Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-large-2411). The model is able to understand documents, charts and natural images. The model is...","contextLength":131072,"pricing":{"prompt":0.0000024,"completion":0.0000072,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"56965542","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2024-11-18T19:49:48.873Z","avgThroughputTps":52.5,"avgLatencyMs":504,"isActive":true},{"id":"cmoxkje2v007h6whdaqhd42db","openrouterId":"openai/gpt-4o-mini-search-preview","slug":"openai-gpt-4o-mini-search-preview","name":"OpenAI: GPT-4o-mini Search Preview","description":"GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.","contextLength":128000,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"56841144","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-03-12T18:22:02.718Z","avgThroughputTps":23,"avgLatencyMs":2040,"isActive":true},{"id":"cmoxkjdcv005x6whdmpw9hhnt","openrouterId":"qwen/qwen3-coder","slug":"qwen-qwen3-coder","name":"Qwen: Qwen3 Coder 480B A35B","description":"Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...","contextLength":1048576,"pricing":{"prompt":2.64e-7,"completion":0.0000021599999999999996,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"56276963688","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-07-22T20:29:06.000Z","avgThroughputTps":43.75,"avgLatencyMs":1442.9375,"isActive":true},{"id":"cmoxkjdbo005u6whd454rv2qa","openrouterId":"qwen/qwen3-235b-a22b-thinking-2507","slug":"qwen-qwen3-235b-a22b-thinking-2507","name":"Qwen: Qwen3 235B A22B Thinking 2507","description":"Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...","contextLength":262144,"pricing":{"prompt":1.794e-7,"completion":0.000001794,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"5466608268","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-07-25T09:19:17.179Z","avgThroughputTps":33.875,"avgLatencyMs":1179.25,"isActive":true},{"id":"cmoxkjdt9006w6whdzyxylt0v","openrouterId":"qwen/qwen3-8b","slug":"qwen-qwen3-8b","name":"Qwen: Qwen3 8B","description":"Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between \"thinking\" mode for math,...","contextLength":131072,"pricing":{"prompt":6e-8,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"5454904786","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-04-28T17:43:52.421Z","avgThroughputTps":76.5,"avgLatencyMs":616.25,"isActive":true},{"id":"cmoxkjbcu001o6whdzy1oo8hh","openrouterId":"nvidia/nemotron-3-super-120b-a12b:free","slug":"nvidia-nemotron-3-super-120b-a12b-free","name":"NVIDIA: Nemotron 3 Super (free)","description":"NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...","contextLength":1000000,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"531115178816","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2026-03-11T12:07:19.000Z","avgThroughputTps":24,"avgLatencyMs":15340,"isActive":true},{"id":"cmoxkjdlx006g6whd75t9dzsl","openrouterId":"openai/o3-pro","slug":"openai-o3-pro","name":"OpenAI: o3 Pro","description":"The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently...","contextLength":200000,"pricing":{"prompt":0.000024,"completion":0.000096,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"5302184","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-06-10T19:32:32.266Z","avgThroughputTps":10,"avgLatencyMs":6498,"isActive":true},{"id":"cmoxkjbhz001z6whdd97w7sne","openrouterId":"qwen/qwen3.5-35b-a3b","slug":"qwen-qwen3.5-35b-a3b","name":"Qwen: Qwen3.5-35B-A3B","description":"The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...","contextLength":262144,"pricing":{"prompt":1.6679999999999998e-7,"completion":0.0000012,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"52814807650","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-25T16:10:22.000Z","avgThroughputTps":82.875,"avgLatencyMs":766.75,"isActive":true},{"id":"cmoxkjd8t005o6whdqqq7m4gu","openrouterId":"mistralai/codestral-2508","slug":"mistralai-codestral-2508","name":"Mistral: Codestral 2508","description":"Mistral's cutting-edge language model for coding released end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test generation.\n\n[Blog Post](https://mistral.ai/news/codestral-25-08)","contextLength":256000,"pricing":{"prompt":3.6e-7,"completion":0.0000010799999999999998,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"524803347","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-08-01T16:20:30.639Z","avgThroughputTps":64,"avgLatencyMs":229,"isActive":true},{"id":"cmoxkjf2y009m6whd14mszzdm","openrouterId":"meta-llama/llama-3-8b-instruct","slug":"meta-llama-llama-3-8b-instruct","name":"Meta: Llama 3 8B Instruct","description":"Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high quality dialogue usecases. It has demonstrated strong...","contextLength":8192,"pricing":{"prompt":4.8e-8,"completion":4.8e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"518695449","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-04-17T20:00:00.000Z","avgThroughputTps":26,"avgLatencyMs":606.1666666666666,"isActive":true},{"id":"cmoxkjbpp002e6whd372kulgf","openrouterId":"qwen/qwen3-max-thinking","slug":"qwen-qwen3-max-thinking","name":"Qwen: Qwen3 Max Thinking","description":"Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...","contextLength":262144,"pricing":{"prompt":9.36e-7,"completion":0.00000468,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"510109763","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-09T16:18:21.683Z","avgThroughputTps":27.5,"avgLatencyMs":1112.5,"isActive":true},{"id":"cmoxkjak900006whd7lggax56","openrouterId":"inclusionai/ring-2.6-1t:free","slug":"inclusionai-ring-2.6-1t-free","name":"inclusionAI: Ring-2.6-1T (free)","description":"Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...","contextLength":262144,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"509263276752","provider":"inclusionai","authorName":"inclusionai","authorSlug":"inclusionai","iconUrl":"https://openrouter.ai/images/icons/inclusionai.svg","releaseDate":"2026-05-08T09:37:20.749Z","avgThroughputTps":66,"avgLatencyMs":2176.5,"isActive":true},{"id":"cmoxkjech00816whdo7gskt4r","openrouterId":"qwen/qwen-turbo","slug":"qwen-qwen-turbo","name":"Qwen: Qwen-Turbo","description":"Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost, suitable for simple tasks.","contextLength":131072,"pricing":{"prompt":3.9e-8,"completion":1.56e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"5085630527","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-02-01T06:56:14.820Z","avgThroughputTps":46,"avgLatencyMs":510,"isActive":true},{"id":"cmoxkjcuo004t6whd57nvq6ms","openrouterId":"x-ai/grok-4-fast","slug":"x-ai-grok-4-fast","name":"xAI: Grok 4 Fast","description":"Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. It comes in two flavors: non-reasoning and reasoning. Read more about the model...","contextLength":2000000,"pricing":{"prompt":2.4e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"50549572333","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-09-18T20:01:30.267Z","avgThroughputTps":65,"avgLatencyMs":6546.5,"isActive":true},{"id":"cmoxkjasn000i6whdxre821vj","openrouterId":"qwen/qwen3.5-plus-20260420","slug":"qwen-qwen3.5-plus-20260420","name":"Qwen: Qwen3.5 Plus 2026-04-20","description":"Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M token context window. This...","contextLength":1000000,"pricing":{"prompt":3.6e-7,"completion":0.0000021599999999999996,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"5022349693","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-04-26T23:42:48.608Z","avgThroughputTps":54,"avgLatencyMs":1168,"isActive":true},{"id":"cmoxkjeeq00856whdze2narc4","openrouterId":"openai/o3-mini","slug":"openai-o3-mini","name":"OpenAI: o3 Mini","description":"OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter, which can be set to...","contextLength":200000,"pricing":{"prompt":0.00000132,"completion":0.00000528,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"5016645928","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-01-31T14:28:41.132Z","avgThroughputTps":160,"avgLatencyMs":1952,"isActive":true},{"id":"cmoxkjef800866whd79kv4oyc","openrouterId":"mistralai/mistral-small-24b-instruct-2501","slug":"mistralai-mistral-small-24b-instruct-2501","name":"Mistral: Mistral Small 3","description":"Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed...","contextLength":32768,"pricing":{"prompt":6e-8,"completion":9.6e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4993359428","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-01-30T11:43:29.335Z","avgThroughputTps":57,"avgLatencyMs":287,"isActive":true},{"id":"cmoxkjeqm008v6whdwd4s0apt","openrouterId":"qwen/qwen-2.5-7b-instruct","slug":"qwen-qwen-2.5-7b-instruct","name":"Qwen: Qwen2.5 7B Instruct","description":"Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...","contextLength":131072,"pricing":{"prompt":4.8e-8,"completion":1.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4949818692","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2024-10-15T20:00:00.000Z","avgThroughputTps":65,"avgLatencyMs":442,"isActive":true},{"id":"cmoxkjezw009f6whdrxwutyvj","openrouterId":"openai/gpt-4o-mini-2024-07-18","slug":"openai-gpt-4o-mini-2024-07-18","name":"OpenAI: GPT-4o-mini (2024-07-18)","description":"GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...","contextLength":128000,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"4923205862","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-07-17T20:00:00.000Z","avgThroughputTps":37,"avgLatencyMs":556,"isActive":true},{"id":"cmoxkjbq6002f6whdz5tkdkht","openrouterId":"anthropic/claude-opus-4.6","slug":"anthropic-claude-opus-4.6","name":"Anthropic: Claude Opus 4.6","description":"Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective...","contextLength":1000000,"pricing":{"prompt":0.000006,"completion":0.00003,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"488633727783","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2026-02-04T10:30:50.029Z","avgThroughputTps":37.3,"avgLatencyMs":1904.4,"isActive":true},{"id":"cmoxkjbhh001y6whdpr6m7p3n","openrouterId":"google/gemini-3.1-flash-image-preview","slug":"google-gemini-3.1-flash-image-preview","name":"Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)","description":"Gemini 3.1 Flash Image Preview, a.k.a. \"Nano Banana 2,\" is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines...","contextLength":131072,"pricing":{"prompt":6e-7,"completion":0.0000036,"image":0,"request":0},"modalities":["text","image->text","image"],"perWeekTokens":"4885163529","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-02-26T10:25:58.367Z","avgThroughputTps":92.5,"avgLatencyMs":12798.5,"isActive":true},{"id":"cmoxkjdky006e6whddwmj53zx","openrouterId":"google/gemini-2.5-flash","slug":"google-gemini-2.5-flash","name":"Google: Gemini 2.5 Flash","description":"Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in \"thinking\" capabilities, enabling it to provide responses with greater...","contextLength":1048576,"pricing":{"prompt":3.6e-7,"completion":0.000003,"image":3.6e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"483617064153","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-06-17T11:01:28.103Z","avgThroughputTps":84.25,"avgLatencyMs":2956.25,"isActive":true},{"id":"cmoxkjcg6003y6whdaupfsn2e","openrouterId":"openai/gpt-oss-safeguard-20b","slug":"openai-gpt-oss-safeguard-20b","name":"OpenAI: gpt-oss-safeguard-20b","description":"gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust...","contextLength":131072,"pricing":{"prompt":9e-8,"completion":3.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4827718352","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-10-29T11:47:16.557Z","avgThroughputTps":613,"avgLatencyMs":225,"isActive":true},{"id":"cmoxkjewm00986whdjlpyo62a","openrouterId":"nousresearch/hermes-3-llama-3.1-405b:free","slug":"nousresearch-hermes-3-llama-3.1-405b-free","name":"Nous: Hermes 3 405B Instruct (free)","description":"Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"47073152","provider":"nousresearch","authorName":"Nous Research","authorSlug":"nousresearch","iconUrl":"https://openrouter.ai/images/icons/Nous Research.svg","releaseDate":"2024-08-15T20:00:00.000Z","avgThroughputTps":12,"avgLatencyMs":2561.5,"isActive":true},{"id":"cmoxkjee900846whdim24cc81","openrouterId":"qwen/qwen-max","slug":"qwen-qwen-max","name":"Qwen: Qwen-Max ","description":"Qwen-Max, based on Qwen2.5, provides the best inference performance among [Qwen models](/qwen), especially for complex multi-step tasks. It's a large-scale MoE model that has been pretrained on over 20 trillion...","contextLength":32768,"pricing":{"prompt":0.000001248,"completion":0.000004992,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"46910578","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-02-01T04:31:29.206Z","avgThroughputTps":15,"avgLatencyMs":2382,"isActive":true},{"id":"cmoxkjc3400366whd48vyjhpi","openrouterId":"relace/relace-search","slug":"relace-relace-search","name":"Relace: Relace Search","description":"The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to RAG, relace-search performs agentic...","contextLength":256000,"pricing":{"prompt":0.0000012,"completion":0.0000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4679025","provider":"relace","authorName":"relace","authorSlug":"relace","iconUrl":"https://openrouter.ai/images/icons/relace.svg","releaseDate":"2025-12-08T12:06:00.000Z","avgThroughputTps":2.5,"avgLatencyMs":1698.5,"isActive":true},{"id":"cmoxkjepz008u6whdw4ucg187","openrouterId":"anthracite-org/magnum-v4-72b","slug":"anthracite-org-magnum-v4-72b","name":"Magnum v4 72B","description":"This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet(https://openrouter.ai/anthropic/claude-3.5-sonnet) and Opus(https://openrouter.ai/anthropic/claude-3-opus).\n\nThe model is fine-tuned on top of [Qwen2.5 72B](https://openrouter.ai/qwen/qwen-2.5-72b-instruct).","contextLength":32768,"pricing":{"prompt":0.0000036,"completion":0.000006,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"46649757","provider":"anthracite-org","authorName":"anthracite-org","authorSlug":"anthracite-org","iconUrl":"https://openrouter.ai/images/icons/anthracite-org.svg","releaseDate":"2024-10-21T20:00:00.000Z","avgThroughputTps":25,"avgLatencyMs":939,"isActive":true},{"id":"cmoxkjddt005z6whdnarp35h5","openrouterId":"google/gemini-2.5-flash-lite","slug":"google-gemini-2.5-flash-lite","name":"Google: Gemini 2.5 Flash Lite","description":"Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...","contextLength":1048576,"pricing":{"prompt":1.2e-7,"completion":4.8e-7,"image":1.2e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"463709230926","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-07-22T12:04:36.283Z","avgThroughputTps":97.33333333333333,"avgLatencyMs":588.5,"isActive":true},{"id":"cmoxkjes0008y6whdwgucdcu6","openrouterId":"thedrummer/rocinante-12b","slug":"thedrummer-rocinante-12b","name":"TheDrummer: Rocinante 12B","description":"Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary with unique and expressive word choices - Enhanced creativity for vivid narratives -...","contextLength":32768,"pricing":{"prompt":2.0399999999999997e-7,"completion":5.16e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"457016148","provider":"thedrummer","authorName":"Drummer","authorSlug":"thedrummer","iconUrl":"https://openrouter.ai/images/icons/Drummer.svg","releaseDate":"2024-09-29T20:00:00.000Z","avgThroughputTps":82,"avgLatencyMs":496.5,"isActive":true},{"id":"cmoxkjesh008z6whd4rgr8m95","openrouterId":"meta-llama/llama-3.2-3b-instruct:free","slug":"meta-llama-llama-3.2-3b-instruct-free","name":"Meta: Llama 3.2 3B Instruct (free)","description":"Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"45454096","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-09-24T20:00:00.000Z","avgThroughputTps":46,"avgLatencyMs":590,"isActive":true},{"id":"cmoxkjc7n003g6whdjr43lz57","openrouterId":"mistralai/mistral-large-2512","slug":"mistralai-mistral-large-2512","name":"Mistral: Mistral Large 3 2512","description":"Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and released under the Apache 2.0 license.","contextLength":262144,"pricing":{"prompt":6e-7,"completion":0.0000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"4507593666","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-12-01T16:27:52.651Z","avgThroughputTps":38,"avgLatencyMs":504,"isActive":true},{"id":"cmoxkjfbu00a56whd5zmvfo8e","openrouterId":"openai/gpt-4","slug":"openai-gpt-4","name":"OpenAI: GPT-4","description":"OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning...","contextLength":8191,"pricing":{"prompt":0.000036,"completion":0.000072,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"45016910","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2023-05-27T20:00:00.000Z","avgThroughputTps":28,"avgLatencyMs":699.5,"isActive":true},{"id":"cmoxkjbiy00216whd4v20psks","openrouterId":"qwen/qwen3.5-122b-a10b","slug":"qwen-qwen3.5-122b-a10b","name":"Qwen: Qwen3.5-122B-A10B","description":"The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...","contextLength":262144,"pricing":{"prompt":3.12e-7,"completion":0.000002496,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"4497907469","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-25T16:09:49.000Z","avgThroughputTps":69.8,"avgLatencyMs":825.3,"isActive":true},{"id":"cmoxkjf8j009y6whd9a6tgblr","openrouterId":"openai/gpt-3.5-turbo-instruct","slug":"openai-gpt-3.5-turbo-instruct","name":"OpenAI: GPT-3.5 Turbo Instruct","description":"This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.","contextLength":4095,"pricing":{"prompt":0.0000018,"completion":0.0000024,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4495457","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2023-09-27T20:00:00.000Z","avgThroughputTps":107,"avgLatencyMs":855,"isActive":true},{"id":"cmoxkjcbg003o6whdfkrmgxup","openrouterId":"x-ai/grok-4.1-fast","slug":"x-ai-grok-4.1-fast","name":"xAI: Grok 4.1 Fast","description":"Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window. Reasoning can be enabled/disabled using...","contextLength":2000000,"pricing":{"prompt":2.4e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"447106070474","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-11-19T16:25:02.724Z","avgThroughputTps":56,"avgLatencyMs":763,"isActive":true},{"id":"cmoxkjb9h001h6whd7phqsfkr","openrouterId":"xiaomi/mimo-v2-omni","slug":"xiaomi-mimo-v2-omni","name":"Xiaomi: MiMo-V2-Omni","description":"MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...","contextLength":262144,"pricing":{"prompt":4.8e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","audio","video->text"],"perWeekTokens":"4461994440","provider":"xiaomi","authorName":"Xiaomi","authorSlug":"xiaomi","iconUrl":"https://openrouter.ai/images/icons/Xiaomi.svg","releaseDate":"2026-03-18T15:55:03.185Z","avgThroughputTps":84,"avgLatencyMs":1663,"isActive":true},{"id":"cmoxkjcxo00506whdzchbcyqw","openrouterId":"qwen/qwen-plus-2025-07-28","slug":"qwen-qwen-plus-2025-07-28","name":"Qwen: Qwen Plus 0728","description":"Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.","contextLength":1000000,"pricing":{"prompt":3.12e-7,"completion":9.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"440424510","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-08T12:06:39.935Z","avgThroughputTps":63,"avgLatencyMs":526,"isActive":true},{"id":"cmoxkjalz00046whd9knpg3cq","openrouterId":"x-ai/grok-4.3","slug":"x-ai-grok-4.3","name":"xAI: Grok 4.3","description":"Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...","contextLength":1000000,"pricing":{"prompt":0.0000015,"completion":0.000003,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"43658706990","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2026-04-30T19:30:21.141Z","avgThroughputTps":79,"avgLatencyMs":734,"isActive":true},{"id":"cmoxkjdg400646whd6r61z8hk","openrouterId":"mistralai/devstral-small","slug":"mistralai-devstral-small","name":"Mistral: Devstral Small 1.1","description":"Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Finetuned from Mistral Small 3.1 and...","contextLength":131072,"pricing":{"prompt":1.2e-7,"completion":3.6e-7,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"4359456153","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-07-10T11:19:11.726Z","avgThroughputTps":74,"avgLatencyMs":271,"isActive":true},{"id":"cmoxkjcln004a6whdv91u6ezy","openrouterId":"openai/o4-mini-deep-research","slug":"openai-o4-mini-deep-research","name":"OpenAI: o4 Mini Deep Research","description":"o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks.\n\nNote: This model always uses the 'web_search' tool which adds additional cost.","contextLength":200000,"pricing":{"prompt":0.0000024,"completion":0.0000096,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"43132243","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-10-10T16:54:02.725Z","avgThroughputTps":57.5,"avgLatencyMs":14736,"isActive":true},{"id":"cmoxkjb0s000z6whd2decmdo1","openrouterId":"baidu/qianfan-ocr-fast:free","slug":"baidu-qianfan-ocr-fast-free","name":"Baidu: Qianfan-OCR-Fast (free)","description":"Qianfan-OCR-Fast is a domain-specific multimodal large model purpose-built for OCR. By leveraging specialized OCR training data while preserving versatile multimodal intelligence, it provides a powerful performance upgrade over Qianfan-OCR.","contextLength":65536,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"428989261","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2026-04-20T13:51:12.864Z","avgThroughputTps":48,"avgLatencyMs":2094,"isActive":true},{"id":"cmoxkjbqo002g6whdlromycmi","openrouterId":"qwen/qwen3-coder-next","slug":"qwen-qwen3-coder-next","name":"Qwen: Qwen3 Coder Next","description":"Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per...","contextLength":262144,"pricing":{"prompt":1.32e-7,"completion":9.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"42847575665","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-03T19:15:01.820Z","avgThroughputTps":33.25,"avgLatencyMs":683.75,"isActive":true},{"id":"cmoxkjatm000k6whd7ik5g6bk","openrouterId":"qwen/qwen3.6-35b-a3b","slug":"qwen-qwen3.6-35b-a3b","name":"Qwen: Qwen3.6 35B A3B","description":"Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hybrid sparse mixture-of-experts architecture combining Gated...","contextLength":262144,"pricing":{"prompt":1.8e-7,"completion":0.0000012,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"42391080105","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-04-26T23:24:15.340Z","avgThroughputTps":151.25,"avgLatencyMs":632.875,"isActive":true},{"id":"cmoxkjc1b00326whd9dl0duoh","openrouterId":"openai/gpt-5.2-chat","slug":"openai-gpt-5.2-chat","name":"OpenAI: GPT-5.2 Chat","description":"GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...","contextLength":128000,"pricing":{"prompt":0.0000021,"completion":0.0000168,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"4192770908","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-12-10T13:03:03.398Z","avgThroughputTps":42.75,"avgLatencyMs":1696.5,"isActive":true},{"id":"cmoxkjely008l6whdhw0hfia9","openrouterId":"amazon/nova-micro-v1","slug":"amazon-nova-micro-v1","name":"Amazon: Nova Micro 1.0","description":"Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length...","contextLength":128000,"pricing":{"prompt":4.2e-8,"completion":1.68e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4178156119","provider":"amazon","authorName":"Amazon","authorSlug":"amazon","iconUrl":"https://openrouter.ai/images/icons/Amazon.svg","releaseDate":"2024-12-05T17:20:37.903Z","avgThroughputTps":77.5,"avgLatencyMs":306.5,"isActive":true},{"id":"cmoxkjend008o6whds8oxpmkk","openrouterId":"mistralai/mistral-large-2411","slug":"mistralai-mistral-large-2411","name":"Mistral Large 2411","description":"Mistral Large 2 2411 is an update of [Mistral Large 2](/mistralai/mistral-large) released together with [Pixtral Large 2411](/mistralai/pixtral-large-2411) It provides a significant upgrade on the previous [Mistral Large 24.07](/mistralai/mistral-large-2407), with notable...","contextLength":131072,"pricing":{"prompt":0.0000024,"completion":0.0000072,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"417094007","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2024-11-18T20:11:25.108Z","avgThroughputTps":33,"avgLatencyMs":407.5,"isActive":true},{"id":"cmoxkjavg000o6whdfjuj712l","openrouterId":"openai/gpt-5.5","slug":"openai-gpt-5.5","name":"OpenAI: GPT-5.5","description":"GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...","contextLength":1050000,"pricing":{"prompt":0.000006,"completion":0.000036,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"415453432103","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-04-24T13:31:33.253Z","avgThroughputTps":36,"avgLatencyMs":4257,"isActive":true},{"id":"cmoxkjf79009v6whdkie6gh2g","openrouterId":"alpindale/goliath-120b","slug":"alpindale-goliath-120b","name":"Goliath 120B","description":"A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines Xwin and Euryale. Credits to - [@chargoddard](https://huggingface.co/chargoddard) for developing the framework used to merge...","contextLength":6144,"pricing":{"prompt":0.0000045,"completion":0.000009,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"4135815","provider":"alpindale","authorName":"Alpindale","authorSlug":"alpindale","iconUrl":"https://openrouter.ai/images/icons/Alpindale.svg","releaseDate":"2023-11-09T19:00:00.000Z","avgThroughputTps":20,"avgLatencyMs":825,"isActive":true},{"id":"cmoxkjehz008c6whdbg8wrdjd","openrouterId":"microsoft/phi-4","slug":"microsoft-phi-4","name":"Microsoft: Phi 4","description":"[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...","contextLength":16384,"pricing":{"prompt":7.8e-8,"completion":1.68e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"413189283","provider":"microsoft","authorName":"Microsoft","authorSlug":"microsoft","iconUrl":"https://openrouter.ai/images/icons/Microsoft.svg","releaseDate":"2025-01-10T01:17:52.163Z","avgThroughputTps":59.5,"avgLatencyMs":523.25,"isActive":true},{"id":"cmoxkjbm500276whdzzytmdbm","openrouterId":"google/gemini-3.1-pro-preview","slug":"google-gemini-3.1-pro-preview","name":"Google: Gemini 3.1 Pro Preview","description":"Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...","contextLength":1048576,"pricing":{"prompt":0.0000024,"completion":0.0000144,"image":0.0000024,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"403226919519","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-02-19T09:00:27.000Z","avgThroughputTps":83.5,"avgLatencyMs":3526.5,"isActive":true},{"id":"cmoxkjby4002w6whdv4cp7670","openrouterId":"minimax/minimax-m2.1","slug":"minimax-minimax-m2.1","name":"MiniMax: MiniMax M2.1","description":"MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...","contextLength":204800,"pricing":{"prompt":3.4799999999999994e-7,"completion":0.00000114,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3993976435","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2025-12-22T20:56:37.000Z","avgThroughputTps":91.5,"avgLatencyMs":2058.25,"isActive":true},{"id":"cmoxkjb7n001d6whdiwwa7f9h","openrouterId":"google/lyria-3-pro-preview","slug":"google-lyria-3-pro-preview","name":"Google: Lyria 3 Pro Preview","description":"Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz...","contextLength":1048576,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image->text","audio"],"perWeekTokens":"3988699","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-03-30T17:48:06.546Z","avgThroughputTps":0.5,"avgLatencyMs":5943.5,"isActive":true},{"id":"cmoxkjd59005g6whdtora9fru","openrouterId":"openai/gpt-5","slug":"openai-gpt-5","name":"OpenAI: GPT-5","description":"GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy...","contextLength":400000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"39489392815","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-07T13:23:33.000Z","avgThroughputTps":40,"avgLatencyMs":2599.75,"isActive":true},{"id":"cmoxkjeh4008a6whdee70prna","openrouterId":"deepseek/deepseek-r1","slug":"deepseek-deepseek-r1","name":"DeepSeek: R1","description":"DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....","contextLength":163840,"pricing":{"prompt":8.399999999999999e-7,"completion":0.000003,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3948271282","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-01-20T08:51:35.969Z","avgThroughputTps":46.5,"avgLatencyMs":1994.75,"isActive":true},{"id":"cmoxkjd71005k6whdswqjakxq","openrouterId":"openai/gpt-oss-120b","slug":"openai-gpt-oss-120b","name":"OpenAI: gpt-oss-120b","description":"gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...","contextLength":131072,"pricing":{"prompt":4.68e-8,"completion":2.1600000000000003e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"390672199047","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-05T13:17:11.000Z","avgThroughputTps":182.55263157894737,"avgLatencyMs":633.078947368421,"isActive":true},{"id":"cmoxkjbti002m6whdl34pffs4","openrouterId":"minimax/minimax-m2-her","slug":"minimax-minimax-m2-her","name":"MiniMax: MiniMax M2-her","description":"MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...","contextLength":65536,"pricing":{"prompt":3.6e-7,"completion":0.00000144,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"388669596","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2026-01-23T09:07:19.197Z","avgThroughputTps":25,"avgLatencyMs":743.5,"isActive":true},{"id":"cmoxkjeds00836whdc52gndp3","openrouterId":"qwen/qwen-plus","slug":"qwen-qwen-plus","name":"Qwen: Qwen-Plus","description":"Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost combination.","contextLength":1000000,"pricing":{"prompt":3.12e-7,"completion":9.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3872521580","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-02-01T06:37:20.886Z","avgThroughputTps":48,"avgLatencyMs":449.5,"isActive":true},{"id":"cmoxkjdc4005v6whdr2cemj8u","openrouterId":"z-ai/glm-4-32b","slug":"z-ai-glm-4-32b","name":"Z.ai: GLM 4 32B ","description":"GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex tasks and has significantly enhanced capabilities in tool use, online search, and code-related intelligent tasks. It...","contextLength":128000,"pricing":{"prompt":1.2e-7,"completion":1.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3854736590","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-07-24T13:03:37.099Z","avgThroughputTps":4,"avgLatencyMs":682,"isActive":true},{"id":"cmoxkjcdw003t6whdlc66wx2p","openrouterId":"openai/gpt-5.1-codex-mini","slug":"openai-gpt-5.1-codex-mini","name":"OpenAI: GPT-5.1-Codex-Mini","description":"GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex","contextLength":400000,"pricing":{"prompt":3e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"3846117581","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-11-13T13:17:00.379Z","avgThroughputTps":98,"avgLatencyMs":4276.5,"isActive":true},{"id":"cmoxkjbty002n6whd8z8cla51","openrouterId":"writer/palmyra-x5","slug":"writer-palmyra-x5","name":"Writer: Palmyra X5","description":"Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million...","contextLength":1040000,"pricing":{"prompt":7.2e-7,"completion":0.0000072,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"38190463","provider":"writer","authorName":"Writer","authorSlug":"writer","iconUrl":"https://openrouter.ai/images/icons/Writer.svg","releaseDate":"2026-01-21T08:57:03.149Z","avgThroughputTps":12,"avgLatencyMs":387,"isActive":true},{"id":"cmoxkjca1003l6whdzyj7k91p","openrouterId":"anthropic/claude-opus-4.5","slug":"anthropic-claude-opus-4.5","name":"Anthropic: Claude Opus 4.5","description":"Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and...","contextLength":200000,"pricing":{"prompt":0.000006,"completion":0.00003,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"37940404654","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-11-24T13:56:20.000Z","avgThroughputTps":41.6,"avgLatencyMs":1585.7,"isActive":true},{"id":"cmoxkjcv2004u6whdegwvcsfe","openrouterId":"alibaba/tongyi-deepresearch-30b-a3b","slug":"alibaba-tongyi-deepresearch-30b-a3b","name":"Tongyi DeepResearch 30B A3B","description":"Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks...","contextLength":131072,"pricing":{"prompt":1.0800000000000001e-7,"completion":5.399999999999999e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"375394824","provider":"alibaba","authorName":"alibaba","authorSlug":"alibaba","iconUrl":"https://openrouter.ai/images/icons/alibaba.svg","releaseDate":"2025-09-18T11:53:24.337Z","avgThroughputTps":83,"avgLatencyMs":193,"isActive":true},{"id":"cmoxkjd0200556whd24c4nt2k","openrouterId":"x-ai/grok-code-fast-1","slug":"x-ai-grok-code-fast-1","name":"xAI: Grok Code Fast 1","description":"Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. With reasoning traces visible in the response, developers can steer Grok Code for high-quality...","contextLength":256000,"pricing":{"prompt":2.4e-7,"completion":0.0000018,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3686945509","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-08-26T16:08:47.000Z","avgThroughputTps":51,"avgLatencyMs":950,"isActive":true},{"id":"cmoxkjdf600626whdbcneq0lk","openrouterId":"moonshotai/kimi-k2","slug":"moonshotai-kimi-k2","name":"MoonshotAI: Kimi K2 0711","description":"Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...","contextLength":131072,"pricing":{"prompt":6.84e-7,"completion":0.00000276,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3667294919","provider":"moonshotai","authorName":"moonshotai","authorSlug":"moonshotai","iconUrl":"https://openrouter.ai/images/icons/moonshotai.svg","releaseDate":"2025-07-11T15:47:32.565Z","avgThroughputTps":11,"avgLatencyMs":1450,"isActive":true},{"id":"cmoxkjd8d005n6whd30y6bsm9","openrouterId":"anthropic/claude-opus-4.1","slug":"anthropic-claude-opus-4.1","name":"Anthropic: Claude Opus 4.1","description":"Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains...","contextLength":200000,"pricing":{"prompt":0.000018,"completion":0.00008999999999999999,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"364505476","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-08-05T12:33:11.634Z","avgThroughputTps":7,"avgLatencyMs":2222.75,"isActive":true},{"id":"cmoxkjc5v003c6whd141uxh91","openrouterId":"amazon/nova-2-lite-v1","slug":"amazon-nova-2-lite-v1","name":"Amazon: Nova 2 Lite","description":"Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing...","contextLength":1000000,"pricing":{"prompt":3.6e-7,"completion":0.000003,"image":0,"request":0},"modalities":["text","image","file","video->text"],"perWeekTokens":"3630315209","provider":"amazon","authorName":"Amazon","authorSlug":"amazon","iconUrl":"https://openrouter.ai/images/icons/Amazon.svg","releaseDate":"2025-12-02T12:31:12.000Z","avgThroughputTps":84,"avgLatencyMs":871,"isActive":true},{"id":"cmoxkjdoj006m6whdav5cf3z0","openrouterId":"anthropic/claude-sonnet-4","slug":"anthropic-claude-sonnet-4","name":"Anthropic: Claude Sonnet 4","description":"Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...","contextLength":1000000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"36278969743","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-05-22T12:12:51.381Z","avgThroughputTps":34.4,"avgLatencyMs":1301.2,"isActive":true},{"id":"cmoxkjbo8002b6whd8ry0mrs5","openrouterId":"minimax/minimax-m2.5:free","slug":"minimax-minimax-m2.5-free","name":"MiniMax: MiniMax M2.5 (free)","description":"MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...","contextLength":204800,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"36141511301","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2026-02-12T10:01:42.000Z","avgThroughputTps":19,"avgLatencyMs":950,"isActive":true},{"id":"cmoxkjb3100146whdj93j6f9o","openrouterId":"google/gemma-4-26b-a4b-it:free","slug":"google-gemma-4-26b-a4b-it-free","name":"Google: Gemma 4 26B A4B  (free)","description":"Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...","contextLength":262144,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"3613592123","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-04-03T10:53:09.504Z","avgThroughputTps":28,"avgLatencyMs":6227,"isActive":true},{"id":"cmoxkjdyc00776whde0pp7p5d","openrouterId":"x-ai/grok-3-mini-beta","slug":"x-ai-grok-3-mini-beta","name":"xAI: Grok 3 Mini Beta","description":"Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that generate answers immediately, Grok 3 Mini thinks before responding. It’s ideal for reasoning-heavy tasks that don’t demand...","contextLength":131072,"pricing":{"prompt":3.6e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3605521492","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-06-10T15:20:45.198Z","avgThroughputTps":46.75,"avgLatencyMs":4992,"isActive":true},{"id":"cmoxkjd1v00596whdw0w1ruky","openrouterId":"openai/gpt-4o-audio-preview","slug":"openai-gpt-4o-audio-preview","name":"OpenAI: GPT-4o Audio","description":"The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs...","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text","audio->text","audio"],"perWeekTokens":"35899312","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-15T00:44:21.353Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"48214ff115df4776b5ceee25","openrouterId":"arcee-ai/trinity-large-thinking:free","slug":"arcee-ai-trinity-large-thinking-free","name":"Arcee AI: Trinity Large Thinking (free)","description":"Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7...","contextLength":262144,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"35471775186","provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2026-04-01T11:45:18.036Z","avgThroughputTps":110,"avgLatencyMs":769,"isActive":true},{"id":"cmoxkjcoi004g6whdr8xm6izt","openrouterId":"openai/gpt-5-pro","slug":"openai-gpt-5-pro","name":"OpenAI: GPT-5 Pro","description":"GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and...","contextLength":400000,"pricing":{"prompt":0.000018,"completion":0.000144,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"35364924","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-10-06T14:51:03.215Z","avgThroughputTps":16,"avgLatencyMs":50969,"isActive":true},{"id":"cmoxkjc2a00346whd7v4dkgmi","openrouterId":"openai/gpt-5.2","slug":"openai-gpt-5.2","name":"OpenAI: GPT-5.2","description":"GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically, responding quickly...","contextLength":400000,"pricing":{"prompt":0.0000021,"completion":0.0000168,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"35022753366","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-12-10T13:02:55.765Z","avgThroughputTps":43,"avgLatencyMs":2324.5,"isActive":true},{"id":"cmoxkjf5c009r6whddlpc5yjn","openrouterId":"anthropic/claude-3-haiku","slug":"anthropic-claude-3-haiku","name":"Anthropic: Claude 3 Haiku","description":"Claude 3 Haiku is Anthropic's fastest and most compact model for\nnear-instant responsiveness. Quick and accurate targeted performance.\n\nSee the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku)\n\n#multimodal","contextLength":200000,"pricing":{"prompt":3e-7,"completion":0.0000015,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"3494530957","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2024-03-12T20:00:00.000Z","avgThroughputTps":64,"avgLatencyMs":687,"isActive":true},{"id":"cmoxkjd1100576whdeomlssni","openrouterId":"nousresearch/hermes-4-405b","slug":"nousresearch-hermes-4-405b","name":"Nous: Hermes 4 405B","description":"Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...","contextLength":131072,"pricing":{"prompt":0.0000012,"completion":0.0000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"347270386","provider":"nousresearch","authorName":"Nous Research","authorSlug":"nousresearch","iconUrl":"https://openrouter.ai/images/icons/Nous Research.svg","releaseDate":"2025-08-26T15:11:03.380Z","avgThroughputTps":28,"avgLatencyMs":343,"isActive":true},{"id":"cmoxkjb2k00136whdq8lx1byg","openrouterId":"z-ai/glm-5.1","slug":"z-ai-glm-5.1","name":"Z.ai: GLM 5.1","description":"GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...","contextLength":202800,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"345218600825","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2026-04-07T12:07:05.031Z","avgThroughputTps":36.22222222222222,"avgLatencyMs":2531.0833333333335,"isActive":true},{"id":"cmoxkjdwe00736whd539rrtit","openrouterId":"openai/gpt-4.1","slug":"openai-gpt-4.1","name":"OpenAI: GPT-4.1","description":"GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and...","contextLength":1047576,"pricing":{"prompt":0.0000024,"completion":0.0000096,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"34340412467","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-04-14T13:23:05.000Z","avgThroughputTps":50,"avgLatencyMs":813,"isActive":true},{"id":"cmoxkjaoc00096whdxmiswacj","openrouterId":"poolside/laguna-xs.2:free","slug":"poolside-laguna-xs.2-free","name":"Poolside: Laguna XS.2 (free)","description":"Laguna XS.2 is the second-generation model in the XS size class from [Poolside](https://poolside.ai), their efficient coding agent series. It combines tool calling and reasoning capabilities with a compact footprint, offering...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"34223431549","provider":"poolside","authorName":"poolside","authorSlug":"poolside","iconUrl":"https://openrouter.ai/images/icons/poolside.svg","releaseDate":"2026-04-28T11:20:04.226Z","avgThroughputTps":100,"avgLatencyMs":603,"isActive":true},{"id":"cmoxkjer2008w6whd353rl90i","openrouterId":"inflection/inflection-3-productivity","slug":"inflection-inflection-3-productivity","name":"Inflection: Inflection 3 Productivity","description":"Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. It has access to recent news. For emotional...","contextLength":8000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3396481","provider":"inflection","authorName":"Inflection","authorSlug":"inflection","iconUrl":"https://openrouter.ai/images/icons/Inflection.svg","releaseDate":"2024-10-10T20:00:00.000Z","avgThroughputTps":3,"avgLatencyMs":2775,"isActive":true},{"id":"cmoxkjbgk001w6whdxk301o84","openrouterId":"google/gemini-3.1-flash-lite-preview","slug":"google-gemini-3.1-flash-lite-preview","name":"Google: Gemini 3.1 Flash Lite Preview","description":"Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across...","contextLength":1048576,"pricing":{"prompt":3e-7,"completion":0.0000018,"image":3e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"334825520097","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-03-02T23:37:53.564Z","avgThroughputTps":106,"avgLatencyMs":890.75,"isActive":true},{"id":"cmoxkjand00076whdu1z2ynlm","openrouterId":"openrouter/owl-alpha","slug":"openrouter-owl-alpha","name":"Owl Alpha","description":"Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-context tasks, with strong performance in code generation, automated workflows, and complex instruction execution....","contextLength":1048756,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"332813283561","provider":"openrouter","authorName":"OpenRouter","authorSlug":"openrouter","iconUrl":"https://openrouter.ai/images/icons/OpenRouter.svg","releaseDate":"2026-04-28T13:49:49.056Z","avgThroughputTps":13,"avgLatencyMs":9590,"isActive":true},{"id":"cmoxkjdjn006b6whdg2s67911","openrouterId":"baidu/ernie-4.5-300b-a47b","slug":"baidu-ernie-4.5-300b-a47b","name":"Baidu: ERNIE 4.5 300B A47B ","description":"ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE 4.5 series. It activates 47B parameters per token and supports text generation in...","contextLength":131072,"pricing":{"prompt":3.36e-7,"completion":0.00000132,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"33042069","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2025-06-30T12:15:39.588Z","avgThroughputTps":14,"avgLatencyMs":1809,"isActive":true},{"id":"cmoxkjdmc006h6whd1qwp6jjg","openrouterId":"x-ai/grok-3-mini","slug":"x-ai-grok-3-mini","name":"xAI: Grok 3 Mini","description":"A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible.","contextLength":131072,"pricing":{"prompt":3.6e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3202603790","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-06-10T15:20:45.198Z","avgThroughputTps":54,"avgLatencyMs":859,"isActive":true},{"id":"cmoxkjeif008d6whd32ny7swf","openrouterId":"sao10k/l3.1-70b-hanami-x1","slug":"sao10k-l3.1-70b-hanami-x1","name":"Sao10K: Llama 3.1 70B Hanami x1","description":"This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).","contextLength":16000,"pricing":{"prompt":0.0000036,"completion":0.0000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3186738","provider":"sao10k","authorName":"Sao10K","authorSlug":"sao10k","iconUrl":"https://openrouter.ai/images/icons/Sao10K.svg","releaseDate":"2025-01-07T21:20:54.222Z","avgThroughputTps":10,"avgLatencyMs":618.5,"isActive":true},{"id":"cmoxkjced003u6whdbmqdgj6z","openrouterId":"moonshotai/kimi-k2-thinking","slug":"moonshotai-kimi-k2-thinking","name":"MoonshotAI: Kimi K2 Thinking","description":"Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in...","contextLength":262144,"pricing":{"prompt":7.2e-7,"completion":0.000003,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"3157229659","provider":"moonshotai","authorName":"moonshotai","authorSlug":"moonshotai","iconUrl":"https://openrouter.ai/images/icons/moonshotai.svg","releaseDate":"2025-11-06T09:50:22.752Z","avgThroughputTps":85,"avgLatencyMs":796,"isActive":true},{"id":"cmoxkje1l007e6whdo65k2wmz","openrouterId":"google/gemma-3-4b-it","slug":"google-gemma-3-4b-it","name":"Google: Gemma 3 4B","description":"Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...","contextLength":131072,"pricing":{"prompt":4.8e-8,"completion":9.6e-8,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"3139939380","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-03-13T18:38:30.653Z","avgThroughputTps":30,"avgLatencyMs":286,"isActive":true},{"id":"cmoxkjc5g003b6whdp8ouqxwd","openrouterId":"openai/gpt-5.1-codex-max","slug":"openai-gpt-5.1-codex-max","name":"OpenAI: GPT-5.1-Codex-Max","description":"GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic...","contextLength":400000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"312838882","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-12-04T15:08:54.100Z","avgThroughputTps":138,"avgLatencyMs":29473,"isActive":true},{"id":"cmoxkjej9008f6whd8o1g6on5","openrouterId":"sao10k/l3.3-euryale-70b","slug":"sao10k-l3.3-euryale-70b","name":"Sao10K: Llama 3.3 Euryale 70B","description":"Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.2](/models/sao10k/l3-euryale-70b).","contextLength":131072,"pricing":{"prompt":7.8e-7,"completion":9e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"310939164","provider":"sao10k","authorName":"Sao10K","authorSlug":"sao10k","iconUrl":"https://openrouter.ai/images/icons/Sao10K.svg","releaseDate":"2024-12-18T10:32:08.468Z","avgThroughputTps":5,"avgLatencyMs":5403,"isActive":true},{"id":"cmoxkjd3n005d6whdcc6znv6k","openrouterId":"z-ai/glm-4.5v","slug":"z-ai-glm-4.5v","name":"Z.ai: GLM 4.5V","description":"GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding,...","contextLength":65536,"pricing":{"prompt":7.2e-7,"completion":0.0000021599999999999996,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"30573065","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-08-11T10:24:48.340Z","avgThroughputTps":28.5,"avgLatencyMs":2769.5,"isActive":true},{"id":"cmoxkjcnk004e6whd80l32t9a","openrouterId":"qwen/qwen3-vl-30b-a3b-thinking","slug":"qwen-qwen3-vl-30b-a3b-thinking","name":"Qwen: Qwen3 VL 30B A3B Thinking","description":"Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels...","contextLength":131072,"pricing":{"prompt":1.56e-7,"completion":0.000001872,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"305422432","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-10-06T19:47:59.575Z","avgThroughputTps":66.33333333333333,"avgLatencyMs":1082.8333333333333,"isActive":true},{"id":"cmoxkjd35005c6whd0wx639gz","openrouterId":"baidu/ernie-4.5-vl-28b-a3b","slug":"baidu-ernie-4.5-vl-28b-a3b","name":"Baidu: ERNIE 4.5 VL 28B A3B","description":"A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activated per token, delivering exceptional text and vision understanding through its innovative heterogeneous MoE structure with modality-isolated routing....","contextLength":131072,"pricing":{"prompt":1.68e-7,"completion":6.72e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"30430717","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2025-08-12T17:07:16.565Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjf6v009u6whdkmuouo0a","openrouterId":"openai/gpt-3.5-turbo-0613","slug":"openai-gpt-3.5-turbo-0613","name":"OpenAI: GPT-3.5 Turbo (older v0613)","description":"GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.\n\nTraining data up to Sep 2021.","contextLength":4095,"pricing":{"prompt":0.0000012,"completion":0.0000024,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"30047200","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-01-24T19:00:00.000Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjemf008m6whdbsbo9o04","openrouterId":"amazon/nova-pro-v1","slug":"amazon-nova-pro-v1","name":"Amazon: Nova Pro 1.0","description":"Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December...","contextLength":300000,"pricing":{"prompt":9.6e-7,"completion":0.00000384,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"300447629","provider":"amazon","authorName":"Amazon","authorSlug":"amazon","iconUrl":"https://openrouter.ai/images/icons/Amazon.svg","releaseDate":"2024-12-05T17:05:03.587Z","avgThroughputTps":13.25,"avgLatencyMs":558.25,"isActive":true},{"id":"cmoxkjcrv004n6whd69nz8zsm","openrouterId":"qwen/qwen3-vl-235b-a22b-thinking","slug":"qwen-qwen3-vl-235b-a22b-thinking","name":"Qwen: Qwen3 VL 235B A22B Thinking","description":"Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STEM and math....","contextLength":131072,"pricing":{"prompt":3.12e-7,"completion":0.00000312,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"300187648","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-23T19:04:50.000Z","avgThroughputTps":36.5,"avgLatencyMs":1910.25,"isActive":true},{"id":"cmoxkjbbv001m6whdr7jxzaie","openrouterId":"mistralai/mistral-small-2603","slug":"mistralai-mistral-small-2603","name":"Mistral: Mistral Small 4","description":"Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It combines strong reasoning from...","contextLength":262144,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"29715998556","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2026-03-16T17:14:45.935Z","avgThroughputTps":112,"avgLatencyMs":611,"isActive":true},{"id":"cmoxkjdst006v6whd185590lb","openrouterId":"qwen/qwen3-30b-a3b","slug":"qwen-qwen3-30b-a3b","name":"Qwen: Qwen3 30B A3B","description":"Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique...","contextLength":131072,"pricing":{"prompt":1.0800000000000001e-7,"completion":5.399999999999999e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2948338250","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-04-28T18:16:44.177Z","avgThroughputTps":38.25,"avgLatencyMs":1035,"isActive":true},{"id":"cmoxkjb4z00186whdu6cxwfd2","openrouterId":"qwen/qwen3.6-plus","slug":"qwen-qwen3.6-plus","name":"Qwen: Qwen3.6 Plus","description":"Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...","contextLength":1000000,"pricing":{"prompt":3.9e-7,"completion":0.00000234,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"294728818215","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-04-02T08:39:17.573Z","avgThroughputTps":41,"avgLatencyMs":1623,"isActive":true},{"id":"cmoxkjb84001e6whd0wxj2dcb","openrouterId":"google/lyria-3-clip-preview","slug":"google-lyria-3-clip-preview","name":"Google: Lyria 3 Clip Preview","description":"30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate...","contextLength":1048576,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image->text","audio"],"perWeekTokens":"2944337","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-03-30T17:47:35.284Z","avgThroughputTps":5,"avgLatencyMs":3463,"isActive":true},{"id":"cmoxkjc77003f6whdv6yn6w23","openrouterId":"mistralai/ministral-3b-2512","slug":"mistralai-ministral-3b-2512","name":"Mistral: Ministral 3 3B 2512","description":"The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.","contextLength":131072,"pricing":{"prompt":1.2e-7,"completion":1.2e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"2944082162","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-12-02T08:19:20.726Z","avgThroughputTps":45,"avgLatencyMs":367,"isActive":true},{"id":"cmoxkjc2q00356whdmtgoznoq","openrouterId":"mistralai/devstral-2512","slug":"mistralai-devstral-2512","name":"Mistral: Devstral 2 2512","description":"Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring...","contextLength":262144,"pricing":{"prompt":4.8e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"2922378217","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-12-09T08:03:39.000Z","avgThroughputTps":77,"avgLatencyMs":1639.5,"isActive":true},{"id":"cmoxkjeuu00946whdy4r0i5x5","openrouterId":"cohere/command-r-plus-08-2024","slug":"cohere-command-r-plus-08-2024","name":"Cohere: Command R+ (08-2024)","description":"command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latencies as compared to the previous Command R+ version, while keeping the hardware footprint...","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"29209087","provider":"cohere","authorName":"Cohere","authorSlug":"cohere","iconUrl":"https://openrouter.ai/images/icons/Cohere.svg","releaseDate":"2024-08-29T20:00:00.000Z","avgThroughputTps":42,"avgLatencyMs":238,"isActive":true},{"id":"cmoxkje9y007w6whd598y6wpu","openrouterId":"qwen/qwen-vl-plus","slug":"qwen-qwen-vl-plus","name":"Qwen: Qwen VL Plus","description":"Qwen's Enhanced Large Visual Language Model. Significantly upgraded for detailed recognition capabilities and text recognition abilities, supporting ultra-high pixel resolutions up to millions of pixels and extreme aspect ratios for...","contextLength":131072,"pricing":{"prompt":1.638e-7,"completion":4.914e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"291123096","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-02-04T23:54:15.216Z","avgThroughputTps":96,"avgLatencyMs":214,"isActive":true},{"id":"cmoxkjbwq002t6whdoadxr9ak","openrouterId":"openai/gpt-5.2-codex","slug":"openai-gpt-5.2-codex","name":"OpenAI: GPT-5.2-Codex","description":"GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....","contextLength":400000,"pricing":{"prompt":0.0000021,"completion":0.0000168,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"2909306139","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-01-14T11:48:35.067Z","avgThroughputTps":39,"avgLatencyMs":1761,"isActive":true},{"id":"cmoxkjf0u009h6whdgyhzyrs7","openrouterId":"google/gemma-2-27b-it","slug":"google-gemma-2-27b-it","name":"Google: Gemma 2 27B","description":"Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of...","contextLength":8192,"pricing":{"prompt":7.8e-7,"completion":7.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"29071808","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2024-07-12T20:00:00.000Z","avgThroughputTps":8,"avgLatencyMs":592,"isActive":true},{"id":"cmoxkjcl600496whdm1q81vjm","openrouterId":"openai/o3-deep-research","slug":"openai-o3-deep-research","name":"OpenAI: o3 Deep Research","description":"o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks.\n\nNote: This model always uses the 'web_search' tool which adds additional cost.","contextLength":200000,"pricing":{"prompt":0.000012,"completion":0.000048,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"28748980","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-10-10T16:54:21.971Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjc68003d6whdi75kop49","openrouterId":"mistralai/ministral-14b-2512","slug":"mistralai-ministral-14b-2512","name":"Mistral: Ministral 3 14B 2512","description":"The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language...","contextLength":262144,"pricing":{"prompt":2.4e-7,"completion":2.4e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"2870431339","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-12-02T08:22:15.851Z","avgThroughputTps":40.25,"avgLatencyMs":527.25,"isActive":true},{"id":"cmoxkjcca003q6whd4q4izqnf","openrouterId":"openai/gpt-5.1","slug":"openai-gpt-5.1","name":"OpenAI: GPT-5.1","description":"GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning...","contextLength":400000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"28702920035","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-11-13T13:58:25.000Z","avgThroughputTps":81.83333333333333,"avgLatencyMs":1122.8333333333333,"isActive":true},{"id":"cmoxkjegm00896whd72letp1x","openrouterId":"deepseek/deepseek-r1-distill-llama-70b","slug":"deepseek-deepseek-r1-distill-llama-70b","name":"DeepSeek: R1 Distill Llama 70B","description":"DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...","contextLength":131072,"pricing":{"prompt":8.399999999999999e-7,"completion":9.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"285920860","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-01-23T15:12:49.780Z","avgThroughputTps":40.75,"avgLatencyMs":298.75,"isActive":true},{"id":"cmoxkjc0400306whdbos4okab","openrouterId":"nvidia/nemotron-3-nano-30b-a3b:free","slug":"nvidia-nemotron-3-nano-30b-a3b-free","name":"NVIDIA: Nemotron 3 Nano 30B A3B (free)","description":"NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...","contextLength":256000,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"28586817394","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2025-12-14T11:54:35.000Z","avgThroughputTps":149,"avgLatencyMs":375,"isActive":true},{"id":"cmoxkjd4l005f6whdig2v50vh","openrouterId":"openai/gpt-5-chat","slug":"openai-gpt-5-chat","name":"OpenAI: GPT-5 Chat","description":"GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.","contextLength":128000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"28359032002","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-07T13:30:37.425Z","avgThroughputTps":79,"avgLatencyMs":479,"isActive":true},{"id":"cmoxkjdxi00756whd47dgc9xs","openrouterId":"openai/gpt-4.1-nano","slug":"openai-gpt-4.1-nano","name":"OpenAI: GPT-4.1 Nano","description":"For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million...","contextLength":1047576,"pricing":{"prompt":1.2e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"28260165684","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-04-14T13:22:49.000Z","avgThroughputTps":42,"avgLatencyMs":849.5,"isActive":true},{"id":"cmoxkje6n007p6whdn4kn63pg","openrouterId":"google/gemini-2.0-flash-lite-001","slug":"google-gemini-2.0-flash-lite-001","name":"Google: Gemini 2.0 Flash Lite","description":"Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5),...","contextLength":1048576,"pricing":{"prompt":9e-8,"completion":3.6e-7,"image":9e-8,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"28246625263","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-02-25T12:56:52.206Z","avgThroughputTps":73.5,"avgLatencyMs":923,"isActive":true},{"id":"cmoxkjb5h00196whdg8q9xk4a","openrouterId":"z-ai/glm-5v-turbo","slug":"z-ai-glm-5v-turbo","name":"Z.ai: GLM 5V Turbo","description":"GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding,...","contextLength":202752,"pricing":{"prompt":0.00000144,"completion":0.0000048,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"2815199029","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2026-04-01T12:37:38.290Z","avgThroughputTps":37,"avgLatencyMs":4444,"isActive":true},{"id":"cmoxkjehl008b6whdikvmh27b","openrouterId":"minimax/minimax-01","slug":"minimax-minimax-01","name":"MiniMax: MiniMax-01","description":"MiniMax-01 is a combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion parameters activated per inference, and can handle a context...","contextLength":1000192,"pricing":{"prompt":2.4e-7,"completion":0.00000132,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"280040145","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2025-01-14T23:31:02.677Z","avgThroughputTps":31,"avgLatencyMs":883,"isActive":true},{"id":"cmoxkjcfc003w6whdc8zeln7z","openrouterId":"perplexity/sonar-pro-search","slug":"perplexity-sonar-pro-search","name":"Perplexity: Sonar Pro Search","description":"Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based...","contextLength":200000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"277882785","provider":"perplexity","authorName":"Perplexity","authorSlug":"perplexity","iconUrl":"https://openrouter.ai/images/icons/Perplexity.svg","releaseDate":"2025-10-30T15:59:26.000Z","avgThroughputTps":64,"avgLatencyMs":3969,"isActive":true},{"id":"cmoxkjb2200126whdo5gin94c","openrouterId":"anthropic/claude-opus-4.6-fast","slug":"anthropic-claude-opus-4.6-fast","name":"Anthropic: Claude Opus 4.6 (Fast)","description":"Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities with higher output speed at premium 6x pricing.\n\nLearn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode","contextLength":1000000,"pricing":{"prompt":0.000036,"completion":0.00017999999999999998,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"2761013842","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2026-04-07T16:07:52.818Z","avgThroughputTps":60.5,"avgLatencyMs":1098.5,"isActive":true},{"id":"cmoxkjcjr00466whdyfcy18iy","openrouterId":"qwen/qwen3-vl-8b-thinking","slug":"qwen-qwen3-vl-8b-thinking","name":"Qwen: Qwen3 VL 8B Thinking","description":"Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...","contextLength":256000,"pricing":{"prompt":1.404e-7,"completion":0.000001638,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"273878539","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-10-14T13:42:26.552Z","avgThroughputTps":125,"avgLatencyMs":930.5,"isActive":true},{"id":"cmoxkje3t007j6whdq37bq5f9","openrouterId":"rekaai/reka-flash-3","slug":"rekaai-reka-flash-3","name":"Reka Flash 3","description":"Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...","contextLength":65536,"pricing":{"prompt":1.2e-7,"completion":2.4e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2725869","provider":"rekaai","authorName":"rekaai","authorSlug":"rekaai","iconUrl":"https://openrouter.ai/images/icons/rekaai.svg","releaseDate":"2025-03-12T16:53:33.296Z","avgThroughputTps":46,"avgLatencyMs":42422.5,"isActive":true},{"id":"cmoxkjdtr006x6whde9f56t8k","openrouterId":"qwen/qwen3-14b","slug":"qwen-qwen3-14b","name":"Qwen: Qwen3 14B","description":"Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a \"thinking\" mode for...","contextLength":131702,"pricing":{"prompt":1.2e-7,"completion":2.88e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2725066512","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-04-28T17:41:18.320Z","avgThroughputTps":37.666666666666664,"avgLatencyMs":432,"isActive":true},{"id":"cmoxkjdz800796whd5jducggj","openrouterId":"meta-llama/llama-4-maverick","slug":"meta-llama-llama-4-maverick","name":"Meta: Llama 4 Maverick","description":"Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...","contextLength":1048576,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"27169523661","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2025-04-05T15:37:02.129Z","avgThroughputTps":27.375,"avgLatencyMs":706.75,"isActive":true},{"id":"cmoxkjeud00936whd5an37saa","openrouterId":"qwen/qwen-2.5-72b-instruct","slug":"qwen-qwen-2.5-72b-instruct","name":"Qwen2.5 72B Instruct","description":"Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...","contextLength":131072,"pricing":{"prompt":4.3199999999999995e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2660056816","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2024-09-18T20:00:00.000Z","avgThroughputTps":16,"avgLatencyMs":7920.75,"isActive":true},{"id":"cmoxkjbxo002v6whdlkymeixs","openrouterId":"bytedance-seed/seed-1.6","slug":"bytedance-seed-seed-1.6","name":"ByteDance Seed: Seed 1.6","description":"Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.","contextLength":262144,"pricing":{"prompt":3e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"263301853","provider":"bytedance-seed","authorName":"bytedance-seed","authorSlug":"bytedance-seed","iconUrl":"https://openrouter.ai/images/icons/bytedance-seed.svg","releaseDate":"2025-12-23T10:49:57.000Z","avgThroughputTps":54.5,"avgLatencyMs":2009,"isActive":true},{"id":"cmoxkjd7f005l6whdgx7717o4","openrouterId":"openai/gpt-oss-20b:free","slug":"openai-gpt-oss-20b-free","name":"OpenAI: gpt-oss-20b (free)","description":"gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"26267024943","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-05T13:17:09.000Z","avgThroughputTps":19,"avgLatencyMs":10139.5,"isActive":true},{"id":"cmoxkjcis00446whd60jobrys","openrouterId":"openai/gpt-5-image-mini","slug":"openai-gpt-5-image-mini","name":"OpenAI: GPT-5 Image Mini","description":"GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text...","contextLength":400000,"pricing":{"prompt":0.000003,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","file->text","image"],"perWeekTokens":"262417406","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-10-16T10:23:03.143Z","avgThroughputTps":112,"avgLatencyMs":5886,"isActive":true},{"id":"cmoxkjcre004m6whd4uoykjrc","openrouterId":"google/gemini-2.5-flash-lite-preview-09-2025","slug":"google-gemini-2.5-flash-lite-preview-09-2025","name":"Google: Gemini 2.5 Flash Lite Preview 09-2025","description":"Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...","contextLength":1048576,"pricing":{"prompt":1.2e-7,"completion":4.8e-7,"image":1.2e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"25974358090","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-09-25T13:01:26.198Z","avgThroughputTps":103,"avgLatencyMs":1550,"isActive":true},{"id":"cmoxkjcsc004o6whd7hi4kmue","openrouterId":"qwen/qwen3-vl-235b-a22b-instruct","slug":"qwen-qwen3-vl-235b-a22b-instruct","name":"Qwen: Qwen3 VL 235B A22B Instruct","description":"Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...","contextLength":262144,"pricing":{"prompt":2.4e-7,"completion":0.000001056,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"25923911203","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-23T19:04:47.000Z","avgThroughputTps":27.833333333333332,"avgLatencyMs":1426.3333333333333,"isActive":true},{"id":"cmoxkjcet003v6whd71b0obno","openrouterId":"amazon/nova-premier-v1","slug":"amazon-nova-premier-v1","name":"Amazon: Nova Premier 1.0","description":"Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.","contextLength":1000000,"pricing":{"prompt":0.000003,"completion":0.000015,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"258787969","provider":"amazon","authorName":"Amazon","authorSlug":"amazon","iconUrl":"https://openrouter.ai/images/icons/Amazon.svg","releaseDate":"2025-10-31T18:38:52.074Z","avgThroughputTps":25,"avgLatencyMs":470,"isActive":true},{"id":"cmoxkjefo00876whdglkp9i2q","openrouterId":"deepseek/deepseek-r1-distill-qwen-32b","slug":"deepseek-deepseek-r1-distill-qwen-32b","name":"DeepSeek: R1 Distill Qwen 32B","description":"DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new...","contextLength":128000,"pricing":{"prompt":3.4799999999999994e-7,"completion":3.4799999999999994e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"255955668","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-01-29T18:53:50.865Z","avgThroughputTps":20,"avgLatencyMs":988,"isActive":true},{"id":"cmoxkjel4008j6whdug3jksap","openrouterId":"meta-llama/llama-3.3-70b-instruct","slug":"meta-llama-llama-3.3-70b-instruct","name":"Meta: Llama 3.3 70B Instruct","description":"The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...","contextLength":131072,"pricing":{"prompt":1.2e-7,"completion":3.84e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"25324834046","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-12-06T12:28:57.828Z","avgThroughputTps":38.67857142857143,"avgLatencyMs":528.8928571428571,"isActive":true},{"id":"cmoxkjcvw004w6whdfmrtxejk","openrouterId":"qwen/qwen3-next-80b-a3b-thinking","slug":"qwen-qwen3-next-80b-a3b-thinking","name":"Qwen: Qwen3 Next 80B A3B Thinking","description":"Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems; math proofs, code synthesis/debugging, logic, and agentic...","contextLength":262144,"pricing":{"prompt":1.17e-7,"completion":9.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"252230593","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-11T13:38:04.192Z","avgThroughputTps":157.8,"avgLatencyMs":1463.1,"isActive":true},{"id":"cmoxkjbw9002s6whdd9om99mb","openrouterId":"z-ai/glm-4.7-flash","slug":"z-ai-glm-4.7-flash","name":"Z.ai: GLM 4.7 Flash","description":"As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...","contextLength":202752,"pricing":{"prompt":7.2e-8,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"25016530875","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2026-01-19T09:45:13.352Z","avgThroughputTps":43.333333333333336,"avgLatencyMs":1458.1666666666667,"isActive":true},{"id":"cmoxkjcta004q6whdumr0iw61","openrouterId":"qwen/qwen3-coder-plus","slug":"qwen-qwen3-coder-plus","name":"Qwen: Qwen3 Coder Plus","description":"Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...","contextLength":1000000,"pricing":{"prompt":7.8e-7,"completion":0.0000039,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2480705181","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-23T17:25:07.000Z","avgThroughputTps":39,"avgLatencyMs":935,"isActive":true},{"id":"cmoxkjf1a009i6whdopuvrbrn","openrouterId":"sao10k/l3-euryale-70b","slug":"sao10k-l3-euryale-70b","name":"Sao10k: Llama 3 Euryale 70B v2.1","description":"Euryale 70B v2.1 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). - Better prompt adherence. - Better anatomy / spatial awareness. - Adapts much better to unique and custom...","contextLength":8192,"pricing":{"prompt":0.000001776,"completion":0.000001776,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"24803626","provider":"sao10k","authorName":"Sao10K","authorSlug":"sao10k","iconUrl":"https://openrouter.ai/images/icons/Sao10K.svg","releaseDate":"2024-06-17T20:00:00.000Z","avgThroughputTps":28,"avgLatencyMs":745,"isActive":true},{"id":"cmoxkjbfm001u6whdgjov75wo","openrouterId":"inception/mercury-2","slug":"inception-mercury-2","name":"Inception: Mercury 2","description":"Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...","contextLength":128000,"pricing":{"prompt":3e-7,"completion":9e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2451598770","provider":"inception","authorName":"inception","authorSlug":"inception","iconUrl":"https://openrouter.ai/images/icons/inception.svg","releaseDate":"2026-03-04T09:57:55.767Z","avgThroughputTps":425,"avgLatencyMs":506,"isActive":true},{"id":"cmoxkjb4i00176whdrd6duzf3","openrouterId":"google/gemma-4-31b-it","slug":"google-gemma-4-31b-it","name":"Google: Gemma 4 31B","description":"Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...","contextLength":262144,"pricing":{"prompt":1.44e-7,"completion":4.44e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"238062047537","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-04-02T12:48:06.471Z","avgThroughputTps":16.25,"avgLatencyMs":1349.875,"isActive":true},{"id":"cmoxkjaxa000s6whdsdg5jq9l","openrouterId":"tencent/hy3-preview","slug":"tencent-hy3-preview","name":"Tencent: Hy3 preview","description":"Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...","contextLength":262144,"pricing":{"prompt":7.92e-8,"completion":3.12e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2379299237139","provider":"tencent","authorName":"tencent","authorSlug":"tencent","iconUrl":"https://openrouter.ai/images/icons/tencent.svg","releaseDate":"2026-04-22T13:15:50.112Z","avgThroughputTps":32,"avgLatencyMs":4197,"isActive":true},{"id":"cmoxkjav2000n6whdg7mb4m1i","openrouterId":"openai/gpt-5.5-pro","slug":"openai-gpt-5.5-pro","name":"OpenAI: GPT-5.5 Pro","description":"GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...","contextLength":1050000,"pricing":{"prompt":0.000036,"completion":0.000216,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"2357478690","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-04-24T13:31:36.719Z","avgThroughputTps":15,"avgLatencyMs":32435,"isActive":true},{"id":"cmoxkjb9y001i6whdrvowwrag","openrouterId":"xiaomi/mimo-v2-pro","slug":"xiaomi-mimo-v2-pro","name":"Xiaomi: MiMo-V2-Pro","description":"MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like...","contextLength":1048576,"pricing":{"prompt":0.0000012,"completion":0.0000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"23338878686","provider":"xiaomi","authorName":"Xiaomi","authorSlug":"xiaomi","iconUrl":"https://openrouter.ai/images/icons/Xiaomi.svg","releaseDate":"2026-03-18T15:54:03.793Z","avgThroughputTps":54,"avgLatencyMs":1297,"isActive":true},{"id":"cmoxkjbk200236whdklhwfrr0","openrouterId":"liquid/lfm-2-24b-a2b","slug":"liquid-lfm-2-24b-a2b","name":"LiquidAI: LFM2-24B-A2B","description":"LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...","contextLength":128000,"pricing":{"prompt":3.6e-8,"completion":1.44e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2326576719","provider":"liquid","authorName":"Liquid","authorSlug":"liquid","iconUrl":"https://openrouter.ai/images/icons/Liquid.svg","releaseDate":"2026-02-25T14:45:11.111Z","avgThroughputTps":44,"avgLatencyMs":4794.5,"isActive":true},{"id":"cmoxkjda9005r6whdz77hxjwe","openrouterId":"z-ai/glm-4.5","slug":"z-ai-glm-4.5","name":"Z.ai: GLM 4.5","description":"GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly...","contextLength":131072,"pricing":{"prompt":7.2e-7,"completion":0.00000264,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2311717447","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-07-25T15:22:27.278Z","avgThroughputTps":26.5,"avgLatencyMs":2250.75,"isActive":true},{"id":"cmoxkjdk4006c6whd94x9oc9p","openrouterId":"mistralai/mistral-small-3.2-24b-instruct","slug":"mistralai-mistral-small-3.2-24b-instruct","name":"Mistral: Mistral Small 3.2 24B","description":"Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...","contextLength":128000,"pricing":{"prompt":9e-8,"completion":2.4e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"22948696010","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-06-20T14:10:16.960Z","avgThroughputTps":45.75,"avgLatencyMs":343.875,"isActive":true},{"id":"cmoxkjc4i00396whdz8lll2yx","openrouterId":"essentialai/rnj-1-instruct","slug":"essentialai-rnj-1-instruct","name":"EssentialAI: Rnj 1 Instruct","description":"Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and trained from scratch with a focus on programming, math, and scientific reasoning. The model demonstrates strong performance...","contextLength":32768,"pricing":{"prompt":1.8e-7,"completion":1.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"229193843","provider":"essentialai","authorName":"essentialai","authorSlug":"essentialai","iconUrl":"https://openrouter.ai/images/icons/essentialai.svg","releaseDate":"2025-12-07T03:07:27.970Z","avgThroughputTps":112,"avgLatencyMs":164,"isActive":true},{"id":"cmoxkjb93001g6whd18ojaon8","openrouterId":"rekaai/reka-edge","slug":"rekaai-reka-edge","name":"Reka Edge","description":"Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...","contextLength":16384,"pricing":{"prompt":1.2e-7,"completion":1.2e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"22917090","provider":"rekaai","authorName":"rekaai","authorSlug":"rekaai","iconUrl":"https://openrouter.ai/images/icons/rekaai.svg","releaseDate":"2026-03-20T13:16:05.541Z","avgThroughputTps":22,"avgLatencyMs":976,"isActive":true},{"id":"cmoxkje2i007g6whdj2pv9u42","openrouterId":"cohere/command-a","slug":"cohere-command-a","name":"Cohere: Command A","description":"Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. Compared to other leading proprietary...","contextLength":256000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"22783012","provider":"cohere","authorName":"Cohere","authorSlug":"cohere","iconUrl":"https://openrouter.ai/images/icons/Cohere.svg","releaseDate":"2025-03-13T15:32:22.069Z","avgThroughputTps":12.5,"avgLatencyMs":326.5,"isActive":true},{"id":"cmoxkjbsj002k6whdcba0qq07","openrouterId":"moonshotai/kimi-k2.5","slug":"moonshotai-kimi-k2.5","name":"MoonshotAI: Kimi K2.5","description":"Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...","contextLength":262144,"pricing":{"prompt":4.8e-7,"completion":0.00000228,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"224625152255","provider":"moonshotai","authorName":"moonshotai","authorSlug":"moonshotai","iconUrl":"https://openrouter.ai/images/icons/moonshotai.svg","releaseDate":"2026-01-26T23:11:16.000Z","avgThroughputTps":41.23076923076923,"avgLatencyMs":1550.8461538461538,"isActive":true},{"id":"cmoxkje4r007l6whd49927ag0","openrouterId":"thedrummer/skyfall-36b-v2","slug":"thedrummer-skyfall-36b-v2","name":"TheDrummer: Skyfall 36B V2","description":"Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.","contextLength":32768,"pricing":{"prompt":6.6e-7,"completion":9.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"223983130","provider":"thedrummer","authorName":"Drummer","authorSlug":"thedrummer","iconUrl":"https://openrouter.ai/images/icons/Drummer.svg","releaseDate":"2025-03-10T15:56:06.007Z","avgThroughputTps":52,"avgLatencyMs":446.5,"isActive":true},{"id":"cmoxkjcaz003n6whdhswu6ekk","openrouterId":"google/gemini-3-pro-image-preview","slug":"google-gemini-3-pro-image-preview","name":"Google: Nano Banana Pro (Gemini 3 Pro Image Preview)","description":"Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and...","contextLength":65536,"pricing":{"prompt":0.0000024,"completion":0.0000144,"image":0.0000024,"request":0},"modalities":["text","image->text","image"],"perWeekTokens":"2229276736","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-11-20T10:49:57.064Z","avgThroughputTps":61,"avgLatencyMs":11256,"isActive":true},{"id":"cmoxkjbdb001p6whd7qfm4hia","openrouterId":"nvidia/nemotron-3-super-120b-a12b","slug":"nvidia-nemotron-3-super-120b-a12b","name":"NVIDIA: Nemotron 3 Super","description":"NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...","contextLength":1000000,"pricing":{"prompt":1.0800000000000001e-7,"completion":5.399999999999999e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"22152613577","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2026-03-11T12:07:19.000Z","avgThroughputTps":41.666666666666664,"avgLatencyMs":4278,"isActive":true},{"id":"cmoxkjf1o009j6whdehdtnuil","openrouterId":"nousresearch/hermes-2-pro-llama-3-8b","slug":"nousresearch-hermes-2-pro-llama-3-8b","name":"NousResearch: Hermes 2 Pro - Llama-3 8B","description":"Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced...","contextLength":8192,"pricing":{"prompt":1.68e-7,"completion":1.68e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"22112788","provider":"nousresearch","authorName":"Nous Research","authorSlug":"nousresearch","iconUrl":"https://openrouter.ai/images/icons/Nous Research.svg","releaseDate":"2024-05-26T20:00:00.000Z","avgThroughputTps":89.5,"avgLatencyMs":585,"isActive":true},{"id":"cmoxkjec300806whd4y79dfxq","openrouterId":"qwen/qwen-vl-max","slug":"qwen-qwen-vl-max","name":"Qwen: Qwen VL Max","description":"Qwen VL Max is a visual understanding model with 7500 tokens context length. It excels in delivering optimal performance for a broader spectrum of complex tasks.\n","contextLength":131072,"pricing":{"prompt":6.24e-7,"completion":0.000002496,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"2208350540","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-02-01T13:25:04.223Z","avgThroughputTps":33,"avgLatencyMs":648.5,"isActive":true},{"id":"cmoxkje4a007k6whdx3fwpsm8","openrouterId":"google/gemma-3-27b-it","slug":"google-gemma-3-27b-it","name":"Google: Gemma 3 27B","description":"Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...","contextLength":131072,"pricing":{"prompt":9.6e-8,"completion":1.92e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"21731896306","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-03-12T01:12:39.645Z","avgThroughputTps":18.8,"avgLatencyMs":739,"isActive":true},{"id":"cmoxkjf0d009g6whdhz6725jy","openrouterId":"openai/gpt-4o-mini","slug":"openai-gpt-4o-mini","name":"OpenAI: GPT-4o-mini","description":"GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...","contextLength":128000,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"216967156575","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-07-17T20:00:00.000Z","avgThroughputTps":33.5,"avgLatencyMs":643,"isActive":true},{"id":"cmoxkjdiz006a6whdcki4dbmx","openrouterId":"baidu/ernie-4.5-vl-424b-a47b","slug":"baidu-ernie-4.5-vl-424b-a47b","name":"Baidu: ERNIE 4.5 VL 424B A47B ","description":"ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...","contextLength":131072,"pricing":{"prompt":5.04e-7,"completion":0.0000015,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"21586584","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2025-06-30T12:28:23.022Z","avgThroughputTps":37,"avgLatencyMs":3180,"isActive":true},{"id":"cmoxkjdci005w6whdfg215xtx","openrouterId":"qwen/qwen3-coder:free","slug":"qwen-qwen3-coder-free","name":"Qwen: Qwen3 Coder 480B A35B (free)","description":"Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...","contextLength":1048576,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2144467293","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-07-22T20:29:06.000Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjawb000q6whdcwmzigt6","openrouterId":"deepseek/deepseek-v4-flash","slug":"deepseek-deepseek-v4-flash","name":"DeepSeek: DeepSeek V4 Flash","description":"DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...","contextLength":1048576,"pricing":{"prompt":1.344e-7,"completion":2.6879999999999997e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2138477930607","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2026-04-23T23:17:46.710Z","avgThroughputTps":45.166666666666664,"avgLatencyMs":1614.5416666666667,"isActive":true},{"id":"cmoxkjaul000m6whda1oeqqjb","openrouterId":"qwen/qwen3.6-27b","slug":"qwen-qwen3.6-27b","name":"Qwen: Qwen3.6 27B","description":"Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities — accepting text, image, and video inputs...","contextLength":262144,"pricing":{"prompt":3.84e-7,"completion":0.00000384,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"21060970316","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-04-26T21:57:44.058Z","avgThroughputTps":39.416666666666664,"avgLatencyMs":1622,"isActive":true},{"id":"cmoxkjbe9001r6whdtmev3bh1","openrouterId":"qwen/qwen3.5-9b","slug":"qwen-qwen3.5-9b","name":"Qwen: Qwen3.5-9B","description":"Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...","contextLength":262144,"pricing":{"prompt":4.8e-8,"completion":1.8e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"20971719763","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-03-10T10:19:56.963Z","avgThroughputTps":67.66666666666667,"avgLatencyMs":540.5,"isActive":true},{"id":"cmoxkjbf8001t6whdnmmz72id","openrouterId":"openai/gpt-5.4","slug":"openai-gpt-5.4","name":"OpenAI: GPT-5.4","description":"GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...","contextLength":1050000,"pricing":{"prompt":0.000003,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"208447012027","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-03-05T13:12:32.000Z","avgThroughputTps":53.5,"avgLatencyMs":2559.5,"isActive":true},{"id":"cmoxkjat4000j6whd79hco4ae","openrouterId":"qwen/qwen3.6-flash","slug":"qwen-qwen3.6-flash","name":"Qwen: Qwen3.6 Flash","description":"Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tiered pricing kicks in...","contextLength":1000000,"pricing":{"prompt":2.25e-7,"completion":0.00000135,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"20834488821","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-04-26T23:42:42.075Z","avgThroughputTps":156,"avgLatencyMs":807,"isActive":true},{"id":"cmoxkjcja00456whdgxomzz5c","openrouterId":"anthropic/claude-haiku-4.5","slug":"anthropic-claude-haiku-4.5","name":"Anthropic: Claude Haiku 4.5","description":"Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance...","contextLength":200000,"pricing":{"prompt":0.0000012,"completion":0.000006,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"207678481014","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-10-15T13:00:38.000Z","avgThroughputTps":76.2,"avgLatencyMs":718.8,"isActive":true},{"id":"cmoxkjaot000a6whdko3afeeb","openrouterId":"poolside/laguna-m.1:free","slug":"poolside-laguna-m.1-free","name":"Poolside: Laguna M.1 (free)","description":"Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 128K...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"207496538661","provider":"poolside","authorName":"poolside","authorSlug":"poolside","iconUrl":"https://openrouter.ai/images/icons/poolside.svg","releaseDate":"2026-04-28T11:01:44.070Z","avgThroughputTps":24,"avgLatencyMs":1720,"isActive":true},{"id":"cmoxkjdph006o6whdhg4qot9u","openrouterId":"mistralai/mistral-medium-3","slug":"mistralai-mistral-medium-3","name":"Mistral: Mistral Medium 3","description":"Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning and multimodal performance with 8× lower cost...","contextLength":131072,"pricing":{"prompt":4.8e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"205496013","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-05-07T10:15:41.980Z","avgThroughputTps":28,"avgLatencyMs":530,"isActive":true},{"id":"cmoxkjd2b005a6whd3kwhn9zj","openrouterId":"mistralai/mistral-medium-3.1","slug":"mistralai-mistral-medium-3.1","name":"Mistral: Mistral Medium 3.1","description":"Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...","contextLength":131072,"pricing":{"prompt":4.8e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"2045488049","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-08-13T10:33:59.459Z","avgThroughputTps":31,"avgLatencyMs":1028,"isActive":true},{"id":"cmoxkjdup006z6whd57h7ru7m","openrouterId":"qwen/qwen3-235b-a22b","slug":"qwen-qwen3-235b-a22b","name":"Qwen: Qwen3 235B A22B","description":"Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a \"thinking\" mode for complex reasoning, math, and...","contextLength":131072,"pricing":{"prompt":5.459999999999999e-7,"completion":0.000002184,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2042540819","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-04-28T17:29:17.256Z","avgThroughputTps":40,"avgLatencyMs":657,"isActive":true},{"id":"cmoxkjboq002c6whdzhjwqc17","openrouterId":"minimax/minimax-m2.5","slug":"minimax-minimax-m2.5","name":"MiniMax: MiniMax M2.5","description":"MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...","contextLength":204800,"pricing":{"prompt":1.8e-7,"completion":0.00000138,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"203269068920","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2026-02-12T10:01:42.000Z","avgThroughputTps":71.32352941176471,"avgLatencyMs":1619.3823529411766,"isActive":true},{"id":"cmoxkjch400406whdodjo2gcm","openrouterId":"minimax/minimax-m2","slug":"minimax-minimax-m2","name":"MiniMax: MiniMax M2","description":"MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning,...","contextLength":204800,"pricing":{"prompt":3.0599999999999996e-7,"completion":0.0000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"2007775029","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2025-10-23T16:41:33.120Z","avgThroughputTps":80.5,"avgLatencyMs":1100.875,"isActive":true},{"id":"cmoxkjevq00966whdfbm44g4v","openrouterId":"sao10k/l3.1-euryale-70b","slug":"sao10k-l3.1-euryale-70b","name":"Sao10K: Llama 3.1 Euryale 70B v2.2","description":"Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).","contextLength":131072,"pricing":{"prompt":0.00000102,"completion":0.00000102,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"199869083","provider":"sao10k","authorName":"Sao10K","authorSlug":"sao10k","iconUrl":"https://openrouter.ai/images/icons/Sao10K.svg","releaseDate":"2024-08-27T20:00:00.000Z","avgThroughputTps":34,"avgLatencyMs":579.25,"isActive":true},{"id":"cmoxkjek6008h6whdzvh1bpdg","openrouterId":"cohere/command-r7b-12-2024","slug":"cohere-command-r7b-12-2024","name":"Cohere: Command R7B (12-2024)","description":"Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning...","contextLength":128000,"pricing":{"prompt":4.5e-8,"completion":1.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"198424162","provider":"cohere","authorName":"Cohere","authorSlug":"cohere","iconUrl":"https://openrouter.ai/images/icons/Cohere.svg","releaseDate":"2024-12-14T01:35:52.905Z","avgThroughputTps":58,"avgLatencyMs":237,"isActive":true},{"id":"cmoxkje90007u6whddkr7xgld","openrouterId":"openai/o3-mini-high","slug":"openai-o3-mini-high","name":"OpenAI: o3 Mini High","description":"OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and...","contextLength":200000,"pricing":{"prompt":0.00000132,"completion":0.00000528,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"19753673","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-02-12T10:03:31.504Z","avgThroughputTps":12.5,"avgLatencyMs":263,"isActive":true},{"id":"cmoxkjdh200666whdgi7f221s","openrouterId":"x-ai/grok-4","slug":"x-ai-grok-4","name":"xAI: Grok 4","description":"Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not...","contextLength":256000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1975182151","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-07-09T15:01:29.595Z","avgThroughputTps":16,"avgLatencyMs":49070,"isActive":true},{"id":"cmoxkjc6p003e6whd4kev7a64","openrouterId":"mistralai/ministral-8b-2512","slug":"mistralai-ministral-8b-2512","name":"Mistral: Ministral 3 8B 2512","description":"A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.","contextLength":262144,"pricing":{"prompt":1.8e-7,"completion":1.8e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"19663047431","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-12-02T08:20:54.103Z","avgThroughputTps":27.5,"avgLatencyMs":446.5,"isActive":true},{"id":"cmoxkjcz300536whd8dxijg73","openrouterId":"moonshotai/kimi-k2-0905","slug":"moonshotai-kimi-k2-0905","name":"MoonshotAI: Kimi K2 0905","description":"Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32...","contextLength":262144,"pricing":{"prompt":7.2e-7,"completion":0.000003,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"19329779665","provider":"moonshotai","authorName":"moonshotai","authorSlug":"moonshotai","iconUrl":"https://openrouter.ai/images/icons/moonshotai.svg","releaseDate":"2025-09-04T17:25:47.673Z","avgThroughputTps":45,"avgLatencyMs":1350.8333333333333,"isActive":true},{"id":"cmoxkjc8l003i6whd6534ebm8","openrouterId":"deepseek/deepseek-v3.2-speciale","slug":"deepseek-deepseek-v3.2-speciale","name":"DeepSeek: DeepSeek V3.2 Speciale","description":"DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context processing, then scales post-training reinforcement learning...","contextLength":163840,"pricing":{"prompt":3.444e-7,"completion":5.172e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"192440673","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-12-01T08:13:57.971Z","avgThroughputTps":40,"avgLatencyMs":1199,"isActive":true},{"id":"cmoxkjdep00616whdjnnz4h2m","openrouterId":"switchpoint/router","slug":"switchpoint-router","name":"Switchpoint Router","description":"Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...","contextLength":131072,"pricing":{"prompt":0.00000102,"completion":0.00000408,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1911414","provider":"switchpoint","authorName":"switchpoint","authorSlug":"switchpoint","iconUrl":"https://openrouter.ai/images/icons/switchpoint.svg","releaseDate":"2025-07-11T18:28:19.000Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjal500026whdeyuampn0","openrouterId":"baidu/cobuddy:free","slug":"baidu-cobuddy-free","name":"Baidu Qianfan: CoBuddy (free)","description":"CoBuddy is a code generation model from Baidu, optimized for coding tasks and AI Agent workflows. It features high inference throughput and low end-to-end latency, with native support for tool...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"19112403782","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2026-05-05T22:44:40.376Z","avgThroughputTps":28,"avgLatencyMs":5721,"isActive":true},{"id":"cmoxkjbcc001n6whdzz001yvp","openrouterId":"z-ai/glm-5-turbo","slug":"z-ai-glm-5-turbo","name":"Z.ai: GLM 5 Turbo","description":"GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows...","contextLength":202752,"pricing":{"prompt":0.00000144,"completion":0.0000048,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"18550367878","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2026-03-15T10:06:13.880Z","avgThroughputTps":27.5,"avgLatencyMs":2449,"isActive":true},{"id":"cmoxkjctq004r6whd4ti1msnr","openrouterId":"openai/gpt-5-codex","slug":"openai-gpt-5-codex","name":"OpenAI: GPT-5 Codex","description":"GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....","contextLength":400000,"pricing":{"prompt":0.0000015,"completion":0.000012,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"1843746972","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-09-23T12:03:23.098Z","avgThroughputTps":133,"avgLatencyMs":6006,"isActive":true},{"id":"1d29569d3ce84f55a0342957","openrouterId":"perceptron/perceptron-mk1","slug":"perceptron-perceptron-mk1","name":"Perceptron: Perceptron Mk1","description":"Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning.** It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding...","contextLength":32768,"pricing":{"prompt":1.8e-7,"completion":0.0000018,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"183597152","provider":"perceptron","authorName":"perceptron","authorSlug":"perceptron","iconUrl":"https://openrouter.ai/images/icons/perceptron.svg","releaseDate":"2026-05-12T10:43:49.920Z","avgThroughputTps":46,"avgLatencyMs":583,"isActive":true},{"id":"cmoxkjayq000v6whdk03wa1wa","openrouterId":"openai/gpt-5.4-image-2","slug":"openai-gpt-5.4-image-2","name":"OpenAI: GPT-5.4 Image 2","description":"[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...","contextLength":272000,"pricing":{"prompt":0.0000096,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text","image"],"perWeekTokens":"1824503436","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-04-21T14:52:08.583Z","avgThroughputTps":41,"avgLatencyMs":505,"isActive":true},{"id":"cmoxkjeg600886whdf577rxws","openrouterId":"perplexity/sonar","slug":"perplexity-sonar","name":"Perplexity: Sonar","description":"Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-answer features...","contextLength":127072,"pricing":{"prompt":0.0000012,"completion":0.0000012,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"1816851962","provider":"perplexity","authorName":"Perplexity","authorSlug":"perplexity","iconUrl":"https://openrouter.ai/images/icons/Perplexity.svg","releaseDate":"2025-01-27T16:36:48.666Z","avgThroughputTps":99.5,"avgLatencyMs":1850.5,"isActive":true},{"id":"cmoxkjbt0002l6whda3rjmut1","openrouterId":"upstage/solar-pro-3","slug":"upstage-solar-pro-3","name":"Upstage: Solar Pro 3","description":"Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized...","contextLength":128000,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"181660124","provider":"upstage","authorName":"upstage","authorSlug":"upstage","iconUrl":"https://openrouter.ai/images/icons/upstage.svg","releaseDate":"2026-01-26T21:33:20.601Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjc1s00336whdohq6p5um","openrouterId":"openai/gpt-5.2-pro","slug":"openai-gpt-5.2-pro","name":"OpenAI: GPT-5.2 Pro","description":"GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning,...","contextLength":400000,"pricing":{"prompt":0.0000252,"completion":0.0002016,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"180666764","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-12-10T13:03:00.055Z","avgThroughputTps":2,"avgLatencyMs":3482,"isActive":true},{"id":"cmoxkjcu8004s6whdbjphv4y0","openrouterId":"deepseek/deepseek-v3.1-terminus","slug":"deepseek-deepseek-v3.1-terminus","name":"DeepSeek: DeepSeek V3.1 Terminus","description":"DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, further optimizing the model's...","contextLength":163840,"pricing":{"prompt":3.24e-7,"completion":0.00000114,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"18023552399","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-09-22T09:37:55.611Z","avgThroughputTps":18.5,"avgLatencyMs":1586.375,"isActive":true},{"id":"cmoxkjcwq004y6whdmw0t3ild","openrouterId":"qwen/qwen3-next-80b-a3b-instruct","slug":"qwen-qwen3-next-80b-a3b-instruct","name":"Qwen: Qwen3 Next 80B A3B Instruct","description":"Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...","contextLength":262144,"pricing":{"prompt":1.0800000000000001e-7,"completion":0.00000132,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"17948270898","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-11T13:36:53.637Z","avgThroughputTps":56,"avgLatencyMs":835.5833333333334,"isActive":true},{"id":"cmoxkjf9h00a06whdmns7dl0j","openrouterId":"openai/gpt-3.5-turbo-16k","slug":"openai-gpt-3.5-turbo-16k","name":"OpenAI: GPT-3.5 Turbo 16k","description":"This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up...","contextLength":16385,"pricing":{"prompt":0.0000036,"completion":0.0000048,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1781399","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2023-08-27T20:00:00.000Z","avgThroughputTps":12,"avgLatencyMs":558,"isActive":true},{"id":"cmoxkjcpv004j6whdqo5a2spg","openrouterId":"deepseek/deepseek-v3.2-exp","slug":"deepseek-deepseek-v3.2-exp","name":"DeepSeek: DeepSeek V3.2 Exp","description":"DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...","contextLength":163840,"pricing":{"prompt":3.24e-7,"completion":4.92e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"17748840477","provider":"deepseek","authorName":"DeepSeek","authorSlug":"deepseek","iconUrl":"https://openrouter.ai/images/icons/DeepSeek.svg","releaseDate":"2025-09-29T08:54:41.802Z","avgThroughputTps":21.666666666666668,"avgLatencyMs":1654.6666666666667,"isActive":true},{"id":"cmoxkjc0u00316whd2f3cukdf","openrouterId":"nvidia/nemotron-3-nano-30b-a3b","slug":"nvidia-nemotron-3-nano-30b-a3b","name":"NVIDIA: Nemotron 3 Nano 30B A3B","description":"NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...","contextLength":262144,"pricing":{"prompt":6e-8,"completion":2.4e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"17314286376","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2025-12-14T11:54:35.000Z","avgThroughputTps":60,"avgLatencyMs":1959.5,"isActive":true},{"id":"cmoxkje3c007i6whdwclrpe2m","openrouterId":"openai/gpt-4o-search-preview","slug":"openai-gpt-4o-search-preview","name":"OpenAI: GPT-4o Search Preview","description":"GPT-4o Search Previewis a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"16909386","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-03-12T18:19:09.996Z","avgThroughputTps":11,"avgLatencyMs":2015,"isActive":true},{"id":"cmoxkjejq008g6whd8smmwphq","openrouterId":"openai/o1","slug":"openai-o1","name":"OpenAI: o1","description":"The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason...","contextLength":200000,"pricing":{"prompt":0.000018,"completion":0.000072,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"16902951","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-12-17T13:26:39.576Z","avgThroughputTps":58,"avgLatencyMs":8285,"isActive":true},{"id":"cmoxkjb8m001f6whdcjronzmk","openrouterId":"kwaipilot/kat-coder-pro-v2","slug":"kwaipilot-kat-coder-pro-v2","name":"Kwaipilot: KAT-Coder-Pro V2","description":"KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...","contextLength":256000,"pricing":{"prompt":3.6e-7,"completion":0.00000144,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1689662334","provider":"kwaipilot","authorName":"kwaipilot","authorSlug":"kwaipilot","iconUrl":"https://openrouter.ai/images/icons/kwaipilot.svg","releaseDate":"2026-03-27T18:08:30.640Z","avgThroughputTps":55,"avgLatencyMs":1837.5,"isActive":true},{"id":"cmoxkjawt000r6whd1ip859xd","openrouterId":"inclusionai/ling-2.6-1t","slug":"inclusionai-ling-2.6-1t","name":"inclusionAI: Ling-2.6-1T","description":"Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale. It uses a “fast...","contextLength":262144,"pricing":{"prompt":3.6e-7,"completion":0.000003,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1682529656","provider":"inclusionai","authorName":"inclusionai","authorSlug":"inclusionai","iconUrl":"https://openrouter.ai/images/icons/inclusionai.svg","releaseDate":"2026-04-23T08:43:58.266Z","avgThroughputTps":11,"avgLatencyMs":4154,"isActive":true},{"id":"cmoxkjdgl00656whdlx9tswiv","openrouterId":"cognitivecomputations/dolphin-mistral-24b-venice-edition:free","slug":"cognitivecomputations-dolphin-mistral-24b-venice-edition-free","name":"Venice: Uncensored (free)","description":"Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving...","contextLength":32768,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"167922202","provider":"cognitivecomputations","authorName":"venice","authorSlug":"venice","iconUrl":"https://openrouter.ai/images/icons/venice.svg","releaseDate":"2025-07-09T17:02:46.328Z","avgThroughputTps":75,"avgLatencyMs":599.5,"isActive":true},{"id":"cmoxkjemw008n6whdkw9fkubz","openrouterId":"openai/gpt-4o-2024-11-20","slug":"openai-gpt-4o-2024-11-20","name":"OpenAI: GPT-4o (2024-11-20)","description":"The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded...","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1678981957","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-11-20T13:33:14.771Z","avgThroughputTps":97,"avgLatencyMs":371,"isActive":true},{"id":"cmoxkjcm5004b6whd5kn3l846","openrouterId":"nvidia/llama-3.3-nemotron-super-49b-v1.5","slug":"nvidia-llama-3.3-nemotron-super-49b-v1.5","name":"NVIDIA: Llama 3.3 Nemotron Super 49B V1.5","description":"Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...","contextLength":131072,"pricing":{"prompt":1.2e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"167289943","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2025-10-10T09:03:15.135Z","avgThroughputTps":46.5,"avgLatencyMs":208,"isActive":true},{"id":"cmoxkjfaf00a26whdrwdue9jd","openrouterId":"undi95/remm-slerp-l2-13b","slug":"undi95-remm-slerp-l2-13b","name":"ReMM SLERP 13B","description":"A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge","contextLength":6144,"pricing":{"prompt":5.399999999999999e-7,"completion":7.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"164893071","provider":"undi95","authorName":"Undi","authorSlug":"undi95","iconUrl":"https://openrouter.ai/images/icons/Undi.svg","releaseDate":"2023-07-21T20:00:00.000Z","avgThroughputTps":24.5,"avgLatencyMs":837,"isActive":true},{"id":"cmoxkjedf00826whdt5e3zyqh","openrouterId":"qwen/qwen2.5-vl-72b-instruct","slug":"qwen-qwen2.5-vl-72b-instruct","name":"Qwen: Qwen2.5 VL 72B Instruct","description":"Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.","contextLength":131072,"pricing":{"prompt":3e-7,"completion":9e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"1648648470","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-02-01T06:45:11.997Z","avgThroughputTps":17,"avgLatencyMs":1220.5,"isActive":true},{"id":"cmoxkjb3j00156whdixpjip5h","openrouterId":"google/gemma-4-26b-a4b-it","slug":"google-gemma-4-26b-a4b-it","name":"Google: Gemma 4 26B A4B ","description":"Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...","contextLength":262144,"pricing":{"prompt":7.2e-8,"completion":3.96e-7,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"163996285590","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-04-03T10:53:09.504Z","avgThroughputTps":19.9,"avgLatencyMs":699.9,"isActive":true},{"id":"cmoxkje82007s6whdy45992ts","openrouterId":"mistralai/mistral-saba","slug":"mistralai-mistral-saba","name":"Mistral: Saba","description":"Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance. Trained on curated regional...","contextLength":32768,"pricing":{"prompt":2.4e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"16058032","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-02-17T09:40:39.116Z","avgThroughputTps":9,"avgLatencyMs":413,"isActive":true},{"id":"cmoxkjb75001c6whdz05i6jw3","openrouterId":"x-ai/grok-4.20","slug":"x-ai-grok-4.20","name":"xAI: Grok 4.20","description":"Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherance, delivering...","contextLength":2000000,"pricing":{"prompt":0.0000015,"completion":0.000003,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"15779477251","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2026-03-31T13:43:39.677Z","avgThroughputTps":55,"avgLatencyMs":1145.5,"isActive":true},{"id":"cmoxkjbyi002x6whd65yuoz02","openrouterId":"z-ai/glm-4.7","slug":"z-ai-glm-4.7","name":"Z.ai: GLM 4.7","description":"GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while...","contextLength":202752,"pricing":{"prompt":4.8e-7,"completion":0.0000021,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"156608769027","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-12-21T23:33:34.884Z","avgThroughputTps":88.7,"avgLatencyMs":2370.5,"isActive":true},{"id":"cmoxkjamv00066whddhtuw5un","openrouterId":"mistralai/mistral-medium-3-5","slug":"mistralai-mistral-medium-3-5","name":"Mistral: Mistral Medium 3.5","description":"Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and is designed for agentic workflows, coding, and complex...","contextLength":262144,"pricing":{"prompt":0.0000018,"completion":0.000009,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1564728849","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2026-04-30T13:33:59.822Z","avgThroughputTps":37,"avgLatencyMs":996,"isActive":true},{"id":"cmoxkjd44005e6whdt5464mse","openrouterId":"ai21/jamba-large-1.7","slug":"ai21-jamba-large-1.7","name":"AI21: Jamba Large 1.7","description":"Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...","contextLength":256000,"pricing":{"prompt":0.0000024,"completion":0.0000096,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"155419242","provider":"ai21","authorName":"AI21","authorSlug":"ai21","iconUrl":"https://openrouter.ai/images/icons/AI21.svg","releaseDate":"2025-08-08T12:03:40.335Z","avgThroughputTps":49.5,"avgLatencyMs":757,"isActive":true},{"id":"cmoxkjetf00916whd8qt4j2px","openrouterId":"meta-llama/llama-3.2-1b-instruct","slug":"meta-llama-llama-3.2-1b-instruct","name":"Meta: Llama 3.2 1B Instruct","description":"Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...","contextLength":131072,"pricing":{"prompt":3.24e-8,"completion":2.412e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"151001164","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-09-24T20:00:00.000Z","avgThroughputTps":73,"avgLatencyMs":180,"isActive":true},{"id":"cmoxkjcvh004v6whd0ijnvs8d","openrouterId":"qwen/qwen3-coder-flash","slug":"qwen-qwen3-coder-flash","name":"Qwen: Qwen3 Coder Flash","description":"Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling...","contextLength":1000000,"pricing":{"prompt":2.34e-7,"completion":0.00000117,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1494962824","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-17T09:25:36.216Z","avgThroughputTps":54,"avgLatencyMs":935,"isActive":true},{"id":"cmoxkjddc005y6whdogeehlcl","openrouterId":"bytedance/ui-tars-1.5-7b","slug":"bytedance-ui-tars-1.5-7b","name":"ByteDance: UI-TARS 7B ","description":"UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...","contextLength":128000,"pricing":{"prompt":1.2e-7,"completion":2.4e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"148465954","provider":"bytedance","authorName":"bytedance","authorSlug":"bytedance","iconUrl":"https://openrouter.ai/images/icons/bytedance.svg","releaseDate":"2025-07-22T13:24:16.947Z","avgThroughputTps":2,"avgLatencyMs":1426,"isActive":true},{"id":"cmoxkjd0j00566whdisaulu74","openrouterId":"nousresearch/hermes-4-70b","slug":"nousresearch-hermes-4-70b","name":"Nous: Hermes 4 70B","description":"Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...","contextLength":131072,"pricing":{"prompt":1.56e-7,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1483783582","provider":"nousresearch","authorName":"Nous Research","authorSlug":"nousresearch","iconUrl":"https://openrouter.ai/images/icons/Nous Research.svg","releaseDate":"2025-08-26T15:23:02.446Z","avgThroughputTps":31,"avgLatencyMs":396,"isActive":true},{"id":"cmoxkjesy00906whdpjpqlcdk","openrouterId":"meta-llama/llama-3.2-3b-instruct","slug":"meta-llama-llama-3.2-3b-instruct","name":"Meta: Llama 3.2 3B Instruct","description":"Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...","contextLength":131072,"pricing":{"prompt":6.108e-8,"completion":4.02e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1471259520","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2024-09-24T20:00:00.000Z","avgThroughputTps":41,"avgLatencyMs":178,"isActive":true},{"id":"cmoxkjbig00206whdvmiyv5a4","openrouterId":"qwen/qwen3.5-27b","slug":"qwen-qwen3.5-27b","name":"Qwen: Qwen3.5-27B","description":"The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of...","contextLength":262144,"pricing":{"prompt":2.34e-7,"completion":0.000001872,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"14597478153","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-25T16:10:10.000Z","avgThroughputTps":58.4,"avgLatencyMs":954.2,"isActive":true},{"id":"cmoxkjeit008e6whdcuo1bv1o","openrouterId":"deepseek/deepseek-chat","slug":"deepseek-deepseek-chat","name":"DeepSeek: DeepSeek V3","description":"DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations...","contextLength":163840,"pricing":{"prompt":3.84e-7,"completion":0.000001068,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"14574024293","provider":"deepseek","authorName":"deepseek-ai","authorSlug":"deepseek-ai","iconUrl":"https://openrouter.ai/images/icons/deepseek-ai.svg","releaseDate":"2024-12-26T14:28:40.559Z","avgThroughputTps":16,"avgLatencyMs":1486.5,"isActive":true},{"id":"cmoxkjdkk006d6whdnrejfcic","openrouterId":"minimax/minimax-m1","slug":"minimax-minimax-m1","name":"MiniMax: MiniMax M1","description":"MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom \"lightning attention\" mechanism, allowing it...","contextLength":1000000,"pricing":{"prompt":4.8e-7,"completion":0.00000264,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1443021169","provider":"minimax","authorName":"MiniMax","authorSlug":"minimax","iconUrl":"https://openrouter.ai/images/icons/MiniMax.svg","releaseDate":"2025-06-17T18:46:54.257Z","avgThroughputTps":13.5,"avgLatencyMs":1320.5,"isActive":true},{"id":"cmoxkjcic00436whdxr7z8f1d","openrouterId":"microsoft/phi-4-mini-instruct","slug":"microsoft-phi-4-mini-instruct","name":"Microsoft: Phi 4 Mini Instruct","description":"Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4...","contextLength":131072,"pricing":{"prompt":9.6e-8,"completion":4.1999999999999995e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"144209704","provider":"microsoft","authorName":"Microsoft","authorSlug":"microsoft","iconUrl":"https://openrouter.ai/images/icons/Microsoft.svg","releaseDate":"2025-10-17T14:34:09.607Z","avgThroughputTps":233,"avgLatencyMs":118,"isActive":true},{"id":"cmoxkjbvv002r6whd7cs4dnl6","openrouterId":"openai/gpt-audio-mini","slug":"openai-gpt-audio-mini","name":"OpenAI: GPT Audio Mini","description":"A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...","contextLength":128000,"pricing":{"prompt":7.2e-7,"completion":0.00000288,"image":0,"request":0},"modalities":["text","audio->text","audio"],"perWeekTokens":"140828079","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-01-19T16:50:19.564Z","avgThroughputTps":66,"avgLatencyMs":2237,"isActive":true},{"id":"cmoxkjaxr000t6whdga9me2hn","openrouterId":"xiaomi/mimo-v2.5-pro","slug":"xiaomi-mimo-v2.5-pro","name":"Xiaomi: MiMo-V2.5-Pro","description":"MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....","contextLength":1048576,"pricing":{"prompt":0.0000012,"completion":0.0000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"140216121873","provider":"xiaomi","authorName":"Xiaomi","authorSlug":"xiaomi","iconUrl":"https://openrouter.ai/images/icons/Xiaomi.svg","releaseDate":"2026-04-22T12:11:13.016Z","avgThroughputTps":38,"avgLatencyMs":1311.5,"isActive":true},{"id":"cmoxkjdu8006y6whdh5gnr41u","openrouterId":"qwen/qwen3-32b","slug":"qwen-qwen3-32b","name":"Qwen: Qwen3 32B","description":"Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a \"thinking\" mode for...","contextLength":131072,"pricing":{"prompt":9.6e-8,"completion":3.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"14008709381","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-04-28T17:32:25.189Z","avgThroughputTps":70.85714285714286,"avgLatencyMs":778.8571428571429,"isActive":true},{"id":"cmoxkjanu00086whd7xgqck19","openrouterId":"nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free","slug":"nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-free","name":"NVIDIA: Nemotron 3 Nano Omni (free)","description":"NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It accepts text, image, video, and...","contextLength":256000,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image","audio","video->text"],"perWeekTokens":"13851895117","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2026-04-28T12:18:15.164Z","avgThroughputTps":82,"avgLatencyMs":899.5,"isActive":true},{"id":"cmoxkjfaw00a36whdwl1k9agv","openrouterId":"gryphe/mythomax-l2-13b","slug":"gryphe-mythomax-l2-13b","name":"MythoMax 13B","description":"One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge","contextLength":4096,"pricing":{"prompt":7.2e-8,"completion":7.2e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1372137755","provider":"gryphe","authorName":"Gryphe","authorSlug":"gryphe","iconUrl":"https://openrouter.ai/images/icons/Gryphe.svg","releaseDate":"2023-07-01T20:00:00.000Z","avgThroughputTps":42,"avgLatencyMs":544.6666666666666,"isActive":true},{"id":"cmoxkjbn900296whdq8zztqu3","openrouterId":"qwen/qwen3.5-plus-02-15","slug":"qwen-qwen3.5-plus-02-15","name":"Qwen: Qwen3.5 Plus 2026-02-15","description":"The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...","contextLength":1000000,"pricing":{"prompt":3.12e-7,"completion":0.000001872,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"13584872098","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-16T03:10:16.477Z","avgThroughputTps":40,"avgLatencyMs":1842.5,"isActive":true},{"id":"cmoxkjf21009k6whd67ci9g5n","openrouterId":"openai/gpt-4o-2024-05-13","slug":"openai-gpt-4o-2024-05-13","name":"OpenAI: GPT-4o (2024-05-13)","description":"GPT-4o (\"o\" for \"omni\") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...","contextLength":128000,"pricing":{"prompt":0.000006,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1357767905","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-05-12T20:00:00.000Z","avgThroughputTps":23,"avgLatencyMs":626,"isActive":true},{"id":"cmoxkjf90009z6whdsn6pifjz","openrouterId":"mistralai/mistral-7b-instruct-v0.1","slug":"mistralai-mistral-7b-instruct-v0.1","name":"Mistral: Mistral 7B Instruct v0.1","description":"A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for speed and context length.","contextLength":4096,"pricing":{"prompt":1.32e-7,"completion":2.28e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"133950645","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2023-09-27T20:00:00.000Z","avgThroughputTps":14,"avgLatencyMs":481.5,"isActive":true},{"id":"cmoxkje58007m6whds3cpiva3","openrouterId":"perplexity/sonar-reasoning-pro","slug":"perplexity-sonar-reasoning-pro","name":"Perplexity: Sonar Reasoning Pro","description":"Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro) Sonar Reasoning Pro is a premier reasoning model powered by DeepSeek R1 with Chain of Thought (CoT). Designed for...","contextLength":128000,"pricing":{"prompt":0.0000024,"completion":0.0000096,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"133715612","provider":"perplexity","authorName":"Perplexity","authorSlug":"perplexity","iconUrl":"https://openrouter.ai/images/icons/Perplexity.svg","releaseDate":"2025-03-06T21:08:28.125Z","avgThroughputTps":25,"avgLatencyMs":12444,"isActive":true},{"id":"cmoxkjbms00286whdju2xnle8","openrouterId":"anthropic/claude-sonnet-4.6","slug":"anthropic-claude-sonnet-4.6","name":"Anthropic: Claude Sonnet 4.6","description":"Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...","contextLength":1000000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1323135978536","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2026-02-17T10:43:10.807Z","avgThroughputTps":41.57142857142857,"avgLatencyMs":1335,"isActive":true},{"id":"cmoxkjent008p6whdvdhfr6xs","openrouterId":"mistralai/mistral-large-2407","slug":"mistralai-mistral-large-2407","name":"Mistral Large 2407","description":"This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....","contextLength":131072,"pricing":{"prompt":0.0000024,"completion":0.0000072,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"131661401","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2024-11-18T20:06:55.274Z","avgThroughputTps":11,"avgLatencyMs":416.5,"isActive":true},{"id":"cmoxkjerj008x6whdqakemv8v","openrouterId":"inflection/inflection-3-pi","slug":"inflection-inflection-3-pi","name":"Inflection: Inflection 3 Pi","description":"Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news, and excels in scenarios like customer support and roleplay. Pi...","contextLength":8000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1309234","provider":"inflection","authorName":"Inflection","authorSlug":"inflection","iconUrl":"https://openrouter.ai/images/icons/Inflection.svg","releaseDate":"2024-10-10T20:00:00.000Z","avgThroughputTps":19,"avgLatencyMs":3151,"isActive":true},{"id":"cmoxkjcfp003x6whdhcxty74n","openrouterId":"mistralai/voxtral-small-24b-2507","slug":"mistralai-voxtral-small-24b-2507","name":"Mistral: Voxtral Small 24B 2507","description":"Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio...","contextLength":32000,"pricing":{"prompt":1.2e-7,"completion":3.6e-7,"image":0,"request":0},"modalities":["text","file","audio->text"],"perWeekTokens":"130296953","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-10-30T10:39:04.000Z","avgThroughputTps":35.5,"avgLatencyMs":343,"isActive":true},{"id":"cmoxkjb6p001b6whdc0gy9z86","openrouterId":"x-ai/grok-4.20-multi-agent","slug":"x-ai-grok-4.20-multi-agent","name":"xAI: Grok 4.20 Multi-Agent","description":"Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...","contextLength":2000000,"pricing":{"prompt":0.0000024,"completion":0.0000072,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1298890725","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2026-03-31T13:45:58.343Z","avgThroughputTps":270,"avgLatencyMs":11900,"isActive":true},{"id":"3d82077735be4f6c97eadd9d","openrouterId":"inclusionai/ring-2.6-1t","slug":"inclusionai-ring-2.6-1t","name":"inclusionAI: Ring-2.6-1T","description":"Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...","contextLength":262144,"pricing":{"prompt":9e-8,"completion":7.5e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"12850348494","provider":"inclusionai","authorName":"inclusionai","authorSlug":"inclusionai","iconUrl":"https://openrouter.ai/images/icons/inclusionai.svg","releaseDate":"2026-05-08T09:37:20.749Z","avgThroughputTps":73,"avgLatencyMs":2068,"isActive":true},{"id":"cmoxkjczk00546whddnqzc1se","openrouterId":"qwen/qwen3-30b-a3b-thinking-2507","slug":"qwen-qwen3-30b-a3b-thinking-2507","name":"Qwen: Qwen3 30B A3B Thinking 2507","description":"Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking mode,” where internal reasoning traces are separated...","contextLength":131072,"pricing":{"prompt":9.6e-8,"completion":4.8e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1282682631","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-08-28T12:39:52.539Z","avgThroughputTps":116.5,"avgLatencyMs":659.25,"isActive":true},{"id":"cmoxkjbs5002j6whdx37m9txd","openrouterId":"arcee-ai/trinity-large-preview","slug":"arcee-ai-trinity-large-preview","name":"Arcee AI: Trinity Large Preview","description":"Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing. It excels in creative writing,...","contextLength":131000,"pricing":{"prompt":1.8e-7,"completion":5.399999999999999e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1281668769","provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2026-01-27T17:24:30.000Z","avgThroughputTps":78,"avgLatencyMs":346.5,"isActive":true},{"id":"cmoxkjdsb006u6whd1m8fl9h6","openrouterId":"meta-llama/llama-guard-4-12b","slug":"meta-llama-llama-guard-4-12b","name":"Meta: Llama Guard 4 12B","description":"Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...","contextLength":163840,"pricing":{"prompt":2.1600000000000003e-7,"completion":2.1600000000000003e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"1280960092","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2025-04-29T21:06:33.531Z","avgThroughputTps":17,"avgLatencyMs":162.75,"isActive":true},{"id":"cmoxkjf3w009o6whd2png2c3p","openrouterId":"mistralai/mixtral-8x22b-instruct","slug":"mistralai-mixtral-8x22b-instruct","name":"Mistral: Mixtral 8x22B Instruct","description":"Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include: - strong math, coding,...","contextLength":65536,"pricing":{"prompt":0.0000024,"completion":0.0000072,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"127063643","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2024-04-16T20:00:00.000Z","avgThroughputTps":104.5,"avgLatencyMs":390,"isActive":true},{"id":"cmoxkjdv600706whd9qxtudp4","openrouterId":"openai/o4-mini-high","slug":"openai-o4-mini-high","name":"OpenAI: o4 Mini High","description":"OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining...","contextLength":200000,"pricing":{"prompt":0.00000132,"completion":0.00000528,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"126749053","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-04-16T13:23:32.042Z","avgThroughputTps":126,"avgLatencyMs":5505,"isActive":true},{"id":"cmoxkjb1l00116whdcu8jjz7t","openrouterId":"anthropic/claude-opus-4.7","slug":"anthropic-claude-opus-4.7","name":"Anthropic: Claude Opus 4.7","description":"Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...","contextLength":1000000,"pricing":{"prompt":0.000006,"completion":0.00003,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1247574644053","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2026-04-16T10:51:40.905Z","avgThroughputTps":57.666666666666664,"avgLatencyMs":2280.4166666666665,"isActive":true},{"id":"cmoxkjdhj00676whdf5xy85e8","openrouterId":"tencent/hunyuan-a13b-instruct","slug":"tencent-hunyuan-a13b-instruct","name":"Tencent: Hunyuan A13B Instruct","description":"Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark...","contextLength":131072,"pricing":{"prompt":1.68e-7,"completion":6.84e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"12429630","provider":"tencent","authorName":"tencent","authorSlug":"tencent","iconUrl":"https://openrouter.ai/images/icons/tencent.svg","releaseDate":"2025-07-08T11:14:24.006Z","avgThroughputTps":2,"avgLatencyMs":1606,"isActive":true},{"id":"cmoxkjbep001s6whdupgsqcl2","openrouterId":"openai/gpt-5.4-pro","slug":"openai-gpt-5.4-pro","name":"OpenAI: GPT-5.4 Pro","description":"GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K...","contextLength":1050000,"pricing":{"prompt":0.000036,"completion":0.000216,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1239030609","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-03-05T13:12:46.000Z","avgThroughputTps":6.75,"avgLatencyMs":80536.5,"isActive":true},{"id":"cmoxkjaze000w6whdet8dslva","openrouterId":"inclusionai/ling-2.6-flash","slug":"inclusionai-ling-2.6-flash","name":"inclusionAI: Ling-2.6-flash","description":"Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....","contextLength":262144,"pricing":{"prompt":1.2e-8,"completion":3.6e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"12358555520","provider":"inclusionai","authorName":"inclusionai","authorSlug":"inclusionai","iconUrl":"https://openrouter.ai/images/icons/inclusionai.svg","releaseDate":"2026-04-21T14:24:46.183Z","avgThroughputTps":117,"avgLatencyMs":1094.5,"isActive":true},{"id":"cmoxkjezg009e6whdp53qgrcx","openrouterId":"mistralai/mistral-nemo","slug":"mistralai-mistral-nemo","name":"Mistral: Mistral Nemo","description":"A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese,...","contextLength":131072,"pricing":{"prompt":2.4e-8,"completion":3.6e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"123383834219","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2024-07-18T20:00:00.000Z","avgThroughputTps":43.75,"avgLatencyMs":500,"isActive":true},{"id":"cmoxkje5p007n6whd5gkj57hp","openrouterId":"perplexity/sonar-pro","slug":"perplexity-sonar-pro","name":"Perplexity: Sonar Pro","description":"Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro) For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth, multi-step queries with added extensibility, like...","contextLength":200000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"1228910471","provider":"perplexity","authorName":"Perplexity","authorSlug":"perplexity","iconUrl":"https://openrouter.ai/images/icons/Perplexity.svg","releaseDate":"2025-03-06T20:53:43.000Z","avgThroughputTps":76,"avgLatencyMs":1852,"isActive":true},{"id":"cmoxkjf2i009l6whdb0xoef7i","openrouterId":"openai/gpt-4o","slug":"openai-gpt-4o","name":"OpenAI: GPT-4o","description":"GPT-4o (\"o\" for \"omni\") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...","contextLength":128000,"pricing":{"prompt":0.000003,"completion":0.000012,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"12263032390","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-05-12T20:00:00.000Z","avgThroughputTps":40,"avgLatencyMs":752.25,"isActive":true},{"id":"cmoxkjcss004p6whd269qrs4p","openrouterId":"qwen/qwen3-max","slug":"qwen-qwen3-max","name":"Qwen: Qwen3 Max","description":"Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It...","contextLength":262144,"pricing":{"prompt":9.36e-7,"completion":0.00000468,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1225934756","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-23T17:26:48.000Z","avgThroughputTps":16,"avgLatencyMs":999,"isActive":true},{"id":"cmoxkjdvm00716whdn5snc8c8","openrouterId":"openai/o3","slug":"openai-o3","name":"OpenAI: o3","description":"o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following....","contextLength":200000,"pricing":{"prompt":0.0000024,"completion":0.0000096,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1218854044","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-04-16T13:10:57.049Z","avgThroughputTps":87.5,"avgLatencyMs":7586,"isActive":true},{"id":"cmoxkjcpg004i6whdf0vayx3h","openrouterId":"anthropic/claude-sonnet-4.5","slug":"anthropic-claude-sonnet-4.5","name":"Anthropic: Claude Sonnet 4.5","description":"Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with...","contextLength":1000000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"121341552462","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-09-29T12:01:16.552Z","avgThroughputTps":36.833333333333336,"avgLatencyMs":1646.3333333333333,"isActive":true},{"id":"cmoxkjdzp007a6whdfzx246yy","openrouterId":"meta-llama/llama-4-scout","slug":"meta-llama-llama-4-scout","name":"Meta: Llama 4 Scout","description":"Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...","contextLength":10000000,"pricing":{"prompt":9.6e-8,"completion":3.6e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"12096034553","provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2025-04-05T15:31:59.735Z","avgThroughputTps":117.25,"avgLatencyMs":321,"isActive":true},{"id":"cmoxkjdw000726whdlq1away0","openrouterId":"openai/o4-mini","slug":"openai-o4-mini","name":"OpenAI: o4 Mini","description":"OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning...","contextLength":200000,"pricing":{"prompt":0.00000132,"completion":0.00000528,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1204809410","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-04-16T12:29:02.980Z","avgThroughputTps":99,"avgLatencyMs":3551,"isActive":true},{"id":"cmoxkjcqx004l6whdpax9cny2","openrouterId":"relace/relace-apply-3","slug":"relace-relace-apply-3","name":"Relace: Relace Apply 3","description":"Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...","contextLength":256000,"pricing":{"prompt":0.00000102,"completion":0.0000015,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"120249040","provider":"relace","authorName":"relace","authorSlug":"relace","iconUrl":"https://openrouter.ai/images/icons/relace.svg","releaseDate":"2025-09-26T08:59:32.878Z","avgThroughputTps":807,"avgLatencyMs":367,"isActive":true},{"id":"cmoxkjeaf007x6whd0qldgmjp","openrouterId":"aion-labs/aion-1.0","slug":"aion-labs-aion-1.0","name":"AionLabs: Aion-1.0","description":"Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree...","contextLength":131072,"pricing":{"prompt":0.0000048,"completion":0.0000096,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"12008424","provider":"aion-labs","authorName":"Aion Labs","authorSlug":"aion-labs","iconUrl":"https://openrouter.ai/images/icons/Aion Labs.svg","releaseDate":"2025-02-04T14:32:37.000Z","avgThroughputTps":7,"avgLatencyMs":1726,"isActive":true},{"id":"cmoxkjevb00956whdang94j98","openrouterId":"cohere/command-r-08-2024","slug":"cohere-command-r-08-2024","name":"Cohere: Command R (08-2024)","description":"command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...","contextLength":128000,"pricing":{"prompt":1.8e-7,"completion":7.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"119869775","provider":"cohere","authorName":"Cohere","authorSlug":"cohere","iconUrl":"https://openrouter.ai/images/icons/Cohere.svg","releaseDate":"2024-08-29T20:00:00.000Z","avgThroughputTps":19,"avgLatencyMs":231,"isActive":true},{"id":"cmoxkjepl008t6whd2k98q47t","openrouterId":"anthropic/claude-3.5-haiku","slug":"anthropic-claude-3.5-haiku","name":"Anthropic: Claude 3.5 Haiku","description":"Claude 3.5 Haiku features offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic...","contextLength":200000,"pricing":{"prompt":9.6e-7,"completion":0.0000048,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"11963724438","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2024-11-03T19:00:00.000Z","avgThroughputTps":36.75,"avgLatencyMs":889.5,"isActive":true},{"id":"cmoxkjakp00016whdnswicgo6","openrouterId":"google/gemini-3.1-flash-lite","slug":"google-gemini-3.1-flash-lite","name":"Google: Gemini 3.1 Flash Lite","description":"Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...","contextLength":1048576,"pricing":{"prompt":3e-7,"completion":0.0000018,"image":3e-7,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":"119169648945","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-05-07T11:47:08.228Z","avgThroughputTps":58.5,"avgLatencyMs":620.5,"isActive":true},{"id":"cmoxkje0n007c6whdceclnoeb","openrouterId":"openai/o1-pro","slug":"openai-o1-pro","name":"OpenAI: o1-pro","description":"The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide...","contextLength":200000,"pricing":{"prompt":0.00017999999999999998,"completion":0.0007199999999999999,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1189820","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-03-19T18:26:51.610Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjcqa004k6whduk0b2yxp","openrouterId":"thedrummer/cydonia-24b-v4.1","slug":"thedrummer-cydonia-24b-v4.1","name":"TheDrummer: Cydonia 24B V4.1","description":"Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.","contextLength":131072,"pricing":{"prompt":3.6e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1189127527","provider":"thedrummer","authorName":"Drummer","authorSlug":"thedrummer","iconUrl":"https://openrouter.ai/images/icons/Drummer.svg","releaseDate":"2025-09-26T20:11:18.116Z","avgThroughputTps":48,"avgLatencyMs":314,"isActive":true},{"id":"cmoxkjdfn00636whdp185bfmn","openrouterId":"mistralai/devstral-medium","slug":"mistralai-devstral-medium","name":"Mistral: Devstral Medium","description":"Devstral Medium is a high-performance code generation and agentic reasoning model developed jointly by Mistral AI and All Hands AI. Positioned as a step up from Devstral Small, it achieves...","contextLength":131072,"pricing":{"prompt":4.8e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","file->text"],"perWeekTokens":"118142128","provider":"mistralai","authorName":"Mistral AI","authorSlug":"mistralai","iconUrl":"https://openrouter.ai/images/icons/Mistral AI.svg","releaseDate":"2025-07-10T11:28:41.981Z","avgThroughputTps":8,"avgLatencyMs":353,"isActive":true},{"id":"cmoxkje7l007r6whdj0w470zj","openrouterId":"anthropic/claude-3.7-sonnet:thinking","slug":"anthropic-claude-3.7-sonnet-thinking","name":"Anthropic: Claude 3.7 Sonnet (thinking)","description":"Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and...","contextLength":200000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1180124943","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-02-24T13:35:10.000Z","avgThroughputTps":48.5,"avgLatencyMs":1288.5,"isActive":true},{"id":"cmoxkjd6n005j6whdigd1e9fd","openrouterId":"openai/gpt-oss-120b:free","slug":"openai-gpt-oss-120b-free","name":"OpenAI: gpt-oss-120b (free)","description":"gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...","contextLength":131072,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"117714518871","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-05T13:17:11.000Z","avgThroughputTps":8,"avgLatencyMs":15525,"isActive":true},{"id":"cmoxkjd5q005h6whdejj6ajx0","openrouterId":"openai/gpt-5-mini","slug":"openai-gpt-5-mini","name":"OpenAI: GPT-5 Mini","description":"GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost....","contextLength":400000,"pricing":{"prompt":3e-7,"completion":0.0000024,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"117587173372","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2025-08-07T13:23:27.000Z","avgThroughputTps":64.66666666666667,"avgLatencyMs":5018.5,"isActive":true},{"id":"cmoxkjf9y00a16whdpsvacz7c","openrouterId":"mancer/weaver","slug":"mancer-weaver","name":"Mancer: Weaver (alpha)","description":"An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.","contextLength":8000,"pricing":{"prompt":9e-7,"completion":0.0000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"11485549","provider":"mancer","authorName":"Mancer","authorSlug":"mancer","iconUrl":"https://openrouter.ai/images/icons/Mancer.svg","releaseDate":"2023-08-01T20:00:00.000Z","avgThroughputTps":85,"avgLatencyMs":909,"isActive":true},{"id":"cmoxkjebm007z6whd999lmhn0","openrouterId":"aion-labs/aion-rp-llama-3.1-8b","slug":"aion-labs-aion-rp-llama-3.1-8b","name":"AionLabs: Aion-RP 1.0 (8B)","description":"Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses. It is a fine-tuned base model...","contextLength":32768,"pricing":{"prompt":9.6e-7,"completion":0.00000192,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"11409001","provider":"aion-labs","authorName":"Aion Labs","authorSlug":"aion-labs","iconUrl":"https://openrouter.ai/images/icons/Aion Labs.svg","releaseDate":"2025-02-04T14:18:38.521Z","avgThroughputTps":31,"avgLatencyMs":835,"isActive":true},{"id":"cmoxkjcy500516whdd6cbvpjy","openrouterId":"nvidia/nemotron-nano-9b-v2:free","slug":"nvidia-nemotron-nano-9b-v2-free","name":"NVIDIA: Nemotron Nano 9B V2 (free)","description":"NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...","contextLength":128000,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"11353475012","provider":"nvidia","authorName":"Nvidia","authorSlug":"nvidia","iconUrl":"https://openrouter.ai/images/icons/Nvidia.svg","releaseDate":"2025-09-05T17:13:27.486Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjbnr002a6whdy3y8zvqx","openrouterId":"qwen/qwen3.5-397b-a17b","slug":"qwen-qwen3.5-397b-a17b","name":"Qwen: Qwen3.5 397B A17B","description":"The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...","contextLength":262144,"pricing":{"prompt":4.68e-7,"completion":0.000002808,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"113002891661","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-02-16T01:23:38.867Z","avgThroughputTps":6.416666666666667,"avgLatencyMs":1295.125,"isActive":true},{"id":"cmoxkjelh008k6whd9dw03ci9","openrouterId":"amazon/nova-lite-v1","slug":"amazon-nova-lite-v1","name":"Amazon: Nova Lite 1.0","description":"Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focused on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite...","contextLength":300000,"pricing":{"prompt":7.2e-8,"completion":2.88e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"1127603991","provider":"amazon","authorName":"Amazon","authorSlug":"amazon","iconUrl":"https://openrouter.ai/images/icons/Amazon.svg","releaseDate":"2024-12-05T17:22:43.403Z","avgThroughputTps":94.75,"avgLatencyMs":451,"isActive":true},{"id":"376466c7754d4f6382a2d28c","openrouterId":"baidu/qianfan-ocr-fast","slug":"baidu-qianfan-ocr-fast","name":"Baidu: Qianfan-OCR-Fast","description":"Qianfan-OCR-Fast is a domain-specific multimodal large model purpose-built for OCR. By leveraging specialized OCR training data while preserving versatile multimodal intelligence, it provides a powerful performance upgrade over Qianfan-OCR.","contextLength":65536,"pricing":{"prompt":8.159999999999999e-7,"completion":0.000003372,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"11239282","provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2026-04-20T13:51:12.864Z","avgThroughputTps":7,"avgLatencyMs":2282,"isActive":true},{"id":"cmoxkjamf00056whd6zwykp4m","openrouterId":"ibm-granite/granite-4.1-8b","slug":"ibm-granite-granite-4.1-8b","name":"IBM: Granite 4.1 8B","description":"Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...","contextLength":131072,"pricing":{"prompt":6e-8,"completion":1.2e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1121404944","provider":"ibm-granite","authorName":"ibm-granite","authorSlug":"ibm-granite","iconUrl":"https://openrouter.ai/images/icons/ibm-granite.svg","releaseDate":"2026-04-30T15:24:31.868Z","avgThroughputTps":123,"avgLatencyMs":229,"isActive":true},{"id":"cmoxkjbp7002d6whdzlqne8x8","openrouterId":"z-ai/glm-5","slug":"z-ai-glm-5","name":"Z.ai: GLM 5","description":"GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading...","contextLength":202752,"pricing":{"prompt":7.2e-7,"completion":0.000002304,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"111203618774","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2026-02-11T11:59:42.000Z","avgThroughputTps":44.166666666666664,"avgLatencyMs":2365.133333333333,"isActive":true},{"id":"cmoxkjcx7004z6whdg1ug764h","openrouterId":"qwen/qwen-plus-2025-07-28:thinking","slug":"qwen-qwen-plus-2025-07-28-thinking","name":"Qwen: Qwen Plus 0728 (thinking)","description":"Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.","contextLength":1000000,"pricing":{"prompt":3.12e-7,"completion":9.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"109403969","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-09-08T12:06:39.935Z","avgThroughputTps":46,"avgLatencyMs":520,"isActive":true},{"id":"cmoxkjeor008r6whdxlkku2ue","openrouterId":"qwen/qwen-2.5-coder-32b-instruct","slug":"qwen-qwen-2.5-coder-32b-instruct","name":"Qwen2.5 Coder 32B Instruct","description":"Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: - Significantly improvements in **code generation**, **code reasoning**...","contextLength":128000,"pricing":{"prompt":7.92e-7,"completion":0.0000012,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"108476125","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2024-11-11T18:40:00.276Z","avgThroughputTps":19.5,"avgLatencyMs":314.5,"isActive":true},{"id":"cmoxkje1z007f6whdgfgwxfdn","openrouterId":"google/gemma-3-12b-it","slug":"google-gemma-3-12b-it","name":"Google: Gemma 3 12B","description":"Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...","contextLength":131072,"pricing":{"prompt":4.8e-8,"completion":1.56e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":"10746428360","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2025-03-13T17:50:25.140Z","avgThroughputTps":39,"avgLatencyMs":534.6666666666666,"isActive":true},{"id":"cmoxkjbbe001l6whd6k3y4qwm","openrouterId":"openai/gpt-5.4-mini","slug":"openai-gpt-5.4-mini","name":"OpenAI: GPT-5.4 Mini","description":"GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...","contextLength":400000,"pricing":{"prompt":9e-7,"completion":0.0000054,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"107321873649","provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2026-03-17T07:49:38.000Z","avgThroughputTps":83.5,"avgLatencyMs":580.5,"isActive":true},{"id":"cmoxkjdb7005t6whdvl7ppch2","openrouterId":"z-ai/glm-4.5-air","slug":"z-ai-glm-4.5-air","name":"Z.ai: GLM 4.5 Air","description":"GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...","contextLength":131072,"pricing":{"prompt":1.56e-7,"completion":0.00000102,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"106961649000","provider":"z-ai","authorName":"Z.ai","authorSlug":"z-ai","iconUrl":"https://openrouter.ai/images/icons/Z.ai.svg","releaseDate":"2025-07-25T15:20:58.066Z","avgThroughputTps":30.333333333333332,"avgLatencyMs":1910,"isActive":true},{"id":"cmoxkjdmv006i6whda3hmm08a","openrouterId":"x-ai/grok-3","slug":"x-ai-grok-3","name":"xAI: Grok 3","description":"Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in...","contextLength":131072,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1067702785","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-06-10T15:15:08.007Z","avgThroughputTps":15,"avgLatencyMs":1814,"isActive":true},{"id":"cmoxkjdyv00786whd5hkrwcjg","openrouterId":"x-ai/grok-3-beta","slug":"x-ai-grok-3-beta","name":"xAI: Grok 3 Beta","description":"Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in...","contextLength":131072,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"1067702785","provider":"x-ai","authorName":"xAI","authorSlug":"x-ai","iconUrl":"https://openrouter.ai/images/icons/xAI.svg","releaseDate":"2025-06-10T15:15:08.007Z","avgThroughputTps":15,"avgLatencyMs":1814,"isActive":true},{"id":"cmoxkjdo2006l6whduv2pg57x","openrouterId":"anthropic/claude-opus-4","slug":"anthropic-claude-opus-4","name":"Anthropic: Claude Opus 4","description":"Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...","contextLength":200000,"pricing":{"prompt":0.000018,"completion":0.00008999999999999999,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":"1060434787","provider":"anthropic","authorName":"Anthropic","authorSlug":"anthropic","iconUrl":"https://openrouter.ai/images/icons/Anthropic.svg","releaseDate":"2025-05-22T12:27:25.029Z","avgThroughputTps":9.5,"avgLatencyMs":2302,"isActive":true},{"id":"cmoxkjau3000l6whdnwo8hqif","openrouterId":"qwen/qwen3.6-max-preview","slug":"qwen-qwen3.6-max-preview","name":"Qwen: Qwen3.6 Max Preview","description":"Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters. It is optimized for agentic coding, tool use, and...","contextLength":262144,"pricing":{"prompt":0.000001248,"completion":0.000007488,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"10268023171","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2026-04-26T23:24:02.092Z","avgThroughputTps":29,"avgLatencyMs":5192.5,"isActive":true},{"id":"cmoxkjd9r005q6whd7spywu36","openrouterId":"qwen/qwen3-30b-a3b-instruct-2507","slug":"qwen-qwen3-30b-a3b-instruct-2507","name":"Qwen: Qwen3 30B A3B Instruct 2507","description":"Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thinking mode and is designed for high-quality instruction following, multilingual understanding, and...","contextLength":262144,"pricing":{"prompt":1.0800000000000001e-7,"completion":3.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"10230363635","provider":"qwen","authorName":"Qwen","authorSlug":"qwen","iconUrl":"https://openrouter.ai/images/icons/Qwen.svg","releaseDate":"2025-07-29T12:36:05.687Z","avgThroughputTps":47.166666666666664,"avgLatencyMs":1123,"isActive":true},{"id":"cmoxkje66007o6whd305fg539","openrouterId":"perplexity/sonar-deep-research","slug":"perplexity-sonar-deep-research","name":"Perplexity: Sonar Deep Research","description":"Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers...","contextLength":128000,"pricing":{"prompt":0.0000024,"completion":0.0000096,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"101303630","provider":"perplexity","authorName":"Perplexity","authorSlug":"perplexity","iconUrl":"https://openrouter.ai/images/icons/Perplexity.svg","releaseDate":"2025-03-06T20:34:06.000Z","avgThroughputTps":40,"avgLatencyMs":61718,"isActive":true},{"id":"cmoxkjeb5007y6whdhz55igfs","openrouterId":"aion-labs/aion-1.0-mini","slug":"aion-labs-aion-1.0-mini","name":"AionLabs: Aion-1.0-Mini","description":"Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant...","contextLength":131072,"pricing":{"prompt":8.399999999999999e-7,"completion":0.0000016799999999999998,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"10119151","provider":"aion-labs","authorName":"Aion Labs","authorSlug":"aion-labs","iconUrl":"https://openrouter.ai/images/icons/Aion Labs.svg","releaseDate":"2025-02-04T14:25:07.903Z","avgThroughputTps":6,"avgLatencyMs":1553.5,"isActive":true},{"id":"cmoxkjb4000166whdjhncerm6","openrouterId":"google/gemma-4-31b-it:free","slug":"google-gemma-4-31b-it-free","name":"Google: Gemma 4 31B (free)","description":"Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...","contextLength":262144,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image","video->text"],"perWeekTokens":"10105769861","provider":"google","authorName":"Google","authorSlug":"google","iconUrl":"https://openrouter.ai/images/icons/Google.svg","releaseDate":"2026-04-02T12:48:06.471Z","avgThroughputTps":24,"avgLatencyMs":3324,"isActive":true},{"id":"cmoxkjb68001a6whdok3m2hqt","openrouterId":"arcee-ai/trinity-large-thinking","slug":"arcee-ai-trinity-large-thinking","name":"Arcee AI: Trinity Large Thinking","description":"Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7...","contextLength":262144,"pricing":{"prompt":2.64e-7,"completion":0.00000102,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":"10081898488","provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2026-04-01T11:45:18.036Z","avgThroughputTps":68.5,"avgLatencyMs":735.1666666666666,"isActive":true},{"id":"372c1f651e774d2084b8e953","openrouterId":"google/gemini-3.5-flash","slug":null,"name":"Google: Gemini 3.5 Flash","description":"Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...","contextLength":1048576,"pricing":{"prompt":0.0000018,"completion":0.0000108,"image":0.0000018,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":null,"provider":"google","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjapg000b6whduuy3ok7w","openrouterId":"~anthropic/claude-haiku-latest","slug":null,"name":"Anthropic Claude Haiku Latest","description":"This model always redirects to the latest model in the Anthropic Claude Haiku family.","contextLength":200000,"pricing":{"prompt":0.0000012,"completion":0.000006,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":null,"provider":"~anthropic","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjapu000c6whd1dflwiyx","openrouterId":"~openai/gpt-mini-latest","slug":null,"name":"OpenAI GPT Mini Latest","description":"This model always redirects to the latest model in the OpenAI GPT Mini family.","contextLength":400000,"pricing":{"prompt":9e-7,"completion":0.0000054,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":null,"provider":"~openai","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjaqa000d6whd16bx4yzf","openrouterId":"~google/gemini-pro-latest","slug":null,"name":"Google Gemini Pro Latest","description":"This model always redirects to the latest model in the Google Gemini Pro family.","contextLength":1048576,"pricing":{"prompt":0.0000024,"completion":0.0000144,"image":0.0000024,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":null,"provider":"~google","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjaqv000e6whdolcn3olh","openrouterId":"~moonshotai/kimi-latest","slug":null,"name":"MoonshotAI Kimi Latest","description":"This model always redirects to the latest model in the MoonshotAI Kimi family.","contextLength":262144,"pricing":{"prompt":8.76e-7,"completion":0.0000041879999999999995,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":null,"provider":"~moonshotai","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjara000f6whdbv4m6yq4","openrouterId":"~google/gemini-flash-latest","slug":null,"name":"Google Gemini Flash Latest","description":"This model always redirects to the latest model in the Google Gemini Flash family.","contextLength":1048576,"pricing":{"prompt":0.0000018,"completion":0.0000108,"image":0.0000018,"request":0},"modalities":["text","image","file","audio","video->text"],"perWeekTokens":null,"provider":"~google","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjarp000g6whdaku3psiw","openrouterId":"~anthropic/claude-sonnet-latest","slug":null,"name":"Anthropic Claude Sonnet Latest","description":"This model always redirects to the latest model in the Anthropic Claude Sonnet family.","contextLength":1000000,"pricing":{"prompt":0.0000036,"completion":0.000018,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":null,"provider":"~anthropic","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjas5000h6whdy01vevku","openrouterId":"~openai/gpt-latest","slug":null,"name":"OpenAI GPT Latest","description":"This model always redirects to the latest model in the OpenAI GPT family.","contextLength":1050000,"pricing":{"prompt":0.000006,"completion":0.000036,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":null,"provider":"~openai","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjazu000x6whd367vv2c0","openrouterId":"~anthropic/claude-opus-latest","slug":null,"name":"Anthropic: Claude Opus Latest","description":"This model always redirects to the latest model in the Claude Opus family.","contextLength":1000000,"pricing":{"prompt":0.000006,"completion":0.00003,"image":0,"request":0},"modalities":["text","image","file->text"],"perWeekTokens":null,"provider":"~anthropic","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjb0c000y6whd482vjdkf","openrouterId":"openrouter/pareto-code","slug":null,"name":"Pareto Code Router","description":"The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artificial Analysis](https://artificialanalysis.ai/) coding percentiles. Set min_coding_score between 0 and 1 on the [pareto-router plugin](https://openrouter.ai/docs/guides/routing/routers/pareto-router#the-min_coding_score-parameter) to control how...","contextLength":2000000,"pricing":{"prompt":-1.2,"completion":-1.2,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"openrouter","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjbr5002h6whdl6mssw3l","openrouterId":"openrouter/free","slug":null,"name":"Free Models Router","description":"The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that...","contextLength":200000,"pricing":{"prompt":0,"completion":0,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":null,"provider":"openrouter","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjc51003a6whdzmd9zt3m","openrouterId":"openrouter/bodybuilder","slug":null,"name":"Body Builder (beta)","description":"Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example:...","contextLength":128000,"pricing":{"prompt":-1.2,"completion":-1.2,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"openrouter","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjcai003m6whdck5t0jgo","openrouterId":"allenai/olmo-3-32b-think","slug":null,"name":"AllenAI: Olmo 3 32B Think","description":"Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and...","contextLength":65536,"pricing":{"prompt":1.8e-7,"completion":6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"allenai","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjfbd00a46whdevoet836","openrouterId":"openai/gpt-4-0314","slug":"openai-gpt-4-0314","name":"OpenAI: GPT-4 (older v0314)","description":"GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data: up to Sep 2021.","contextLength":8191,"pricing":{"prompt":0.000036,"completion":0.000072,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2023-05-27T20:00:00.000Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjf82009x6whdillgolc4","openrouterId":"openai/gpt-4-1106-preview","slug":"openai-gpt-4-1106-preview","name":"OpenAI: GPT-4 Turbo (older v1106)","description":"The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.\n\nTraining data: up to April 2023.","contextLength":128000,"pricing":{"prompt":0.000012,"completion":0.000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2023-11-05T19:00:00.000Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjf7m009w6whdm5djs0uu","openrouterId":"openrouter/auto","slug":null,"name":"Auto Router","description":"Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...","contextLength":2000000,"pricing":{"prompt":-1.2,"completion":-1.2,"image":0,"request":0},"modalities":["text","image","file","audio","video->text","image"],"perWeekTokens":null,"provider":"openrouter","authorName":null,"authorSlug":null,"iconUrl":null,"releaseDate":null,"avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjf6f009t6whdimqdjoq9","openrouterId":"openai/gpt-4-turbo-preview","slug":"openai-gpt-4-turbo-preview","name":"OpenAI: GPT-4 Turbo Preview","description":"The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while...","contextLength":128000,"pricing":{"prompt":0.000012,"completion":0.000036,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"openai","authorName":"OpenAI","authorSlug":"openai","iconUrl":"https://openrouter.ai/images/icons/OpenAI.svg","releaseDate":"2024-01-24T19:00:00.000Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjcml004c6whdn257ufyd","openrouterId":"baidu/ernie-4.5-21b-a3b-thinking","slug":"baidu-ernie-4.5-21b-a3b-thinking","name":"Baidu: ERNIE 4.5 21B A3B Thinking","description":"ERNIE-4.5-21B-A3B-Thinking is Baidu's upgraded lightweight MoE model, refined to boost reasoning depth and quality for top-tier performance in logical puzzles, math, science, coding, text generation, and expert-level academic benchmarks.","contextLength":131072,"pricing":{"prompt":8.4e-8,"completion":3.36e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"baidu","authorName":"baidu","authorSlug":"baidu","iconUrl":"https://openrouter.ai/images/icons/baidu.svg","releaseDate":"2025-10-09T18:28:07.216Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjdqf006q6whdedihgdbb","openrouterId":"arcee-ai/spotlight","slug":"arcee-ai-spotlight","name":"Arcee AI: Spotlight","description":"Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fine‑tuned by Arcee AI for tight image‑text grounding tasks. It offers a 32 k‑token context window, enabling rich multimodal...","contextLength":131072,"pricing":{"prompt":2.1600000000000003e-7,"completion":2.1600000000000003e-7,"image":0,"request":0},"modalities":["text","image->text"],"perWeekTokens":null,"provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2025-05-05T17:45:52.249Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjdqx006r6whdsv2h225y","openrouterId":"arcee-ai/maestro-reasoning","slug":"arcee-ai-maestro-reasoning","name":"Arcee AI: Maestro Reasoning","description":"Maestro Reasoning is Arcee's flagship analysis model: a 32 B‑parameter derivative of Qwen 2.5‑32 B tuned with DPO and chain‑of‑thought RL for step‑by‑step logic. Compared to the earlier 7 B...","contextLength":131072,"pricing":{"prompt":0.0000010799999999999998,"completion":0.00000396,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2025-05-05T17:41:09.235Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjdrd006s6whdclxkkuqp","openrouterId":"arcee-ai/virtuoso-large","slug":"arcee-ai-virtuoso-large","name":"Arcee AI: Virtuoso Large","description":"Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k...","contextLength":131072,"pricing":{"prompt":9e-7,"completion":0.00000144,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2025-05-05T17:01:25.294Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkjdrv006t6whdzirisfkq","openrouterId":"arcee-ai/coder-large","slug":"arcee-ai-coder-large","name":"Arcee AI: Coder Large","description":"Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32k context window, enabling multi‑file...","contextLength":32768,"pricing":{"prompt":6e-7,"completion":9.6e-7,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"arcee-ai","authorName":"arcee-ai","authorSlug":"arcee-ai","iconUrl":"https://openrouter.ai/images/icons/arcee-ai.svg","releaseDate":"2025-05-05T16:57:43.438Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true},{"id":"cmoxkje8j007t6whdoa78toq9","openrouterId":"meta-llama/llama-guard-3-8b","slug":"meta-llama-llama-guard-3-8b","name":"Llama Guard 3 8B","description":"Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification)...","contextLength":131072,"pricing":{"prompt":5.808e-7,"completion":3.6e-8,"image":0,"request":0},"modalities":["text->text"],"perWeekTokens":null,"provider":"meta-llama","authorName":"Meta Llama","authorSlug":"meta-llama","iconUrl":"https://openrouter.ai/images/icons/Meta Llama.svg","releaseDate":"2025-02-12T18:01:58.468Z","avgThroughputTps":0,"avgLatencyMs":0,"isActive":true}]}