AI Models at Okara
Current as of Sep 10, 2025
Okara's comprehensive model directory: know exactly which AI models are available and what each can do.
| Model Name | Purpose | Company | Credit Usage | Chat |
|---|---|---|---|---|
| Qwen 2.5 | Qwen2.5 has demonstrated top-tier performance across a wide range of benchmarks evaluating language understanding, reasoning, mathematics, coding, and human preference alignment. | Alibaba | | |
| Qwen 3 | Qwen3 235B A22B Instruct 2507: a mixture-of-experts LLM with strong math and reasoning capabilities. | Alibaba | | |
| Qwen 3 Coder | Qwen3 Coder 480B is a specialized programming model designed for ultra-efficient agentic code generation, with long context and state-of-the-art performance. | Alibaba | | |
| Qwen 3 Next | A new generation of open-source, non-thinking-mode models powered by Qwen3. This version demonstrates superior Chinese text understanding, stronger logical reasoning, and enhanced text generation over the previous iteration (Qwen3-235B-A22B-Instruct-2507). | Alibaba | | |
| Qwen 3 Next Thinking | A new generation of Qwen3-based open-source thinking-mode models. This version offers improved instruction following and more concise summary responses than the previous iteration (Qwen3-235B-A22B-Thinking-2507). | Alibaba | | |
| Qwen 3 VL Instruct | The Qwen3 VL series has been comprehensively upgraded in areas such as visual coding and spatial perception. Its visual perception and recognition capabilities are significantly improved, it supports understanding of ultra-long videos, and its OCR has been substantially enhanced. | Alibaba | | |
| Qwen Image/Edit | Qwen image generation and editing model. | Alibaba | | |
| Claude Sonnet 4.5 | Anthropic's Claude Sonnet 4.5 model. | Anthropic | | |
| Flux 2 Dev | Flux 2 Dev image generation model. | Black Forest Labs | | |
| Deepseek Reasoner | DeepSeek-R1 is a state-of-the-art reasoning model optimized for general reasoning tasks, math, science, and code generation. | DeepSeek | | |
| Deepseek V3.1 | DeepSeek V3.1 is an open-source, hybrid Mixture-of-Experts (MoE) model from DeepSeek AI, featuring 671 billion total parameters, 37 billion active parameters per query, and a 128k token context window (see the note on total vs. active parameters after the table). | DeepSeek | | |
| Deepseek V3.2 | DeepSeek-V3.2: a state-of-the-art model optimized for general tasks, math, science, and code generation, featuring a 128k token context window. | DeepSeek | | |
| Deepseek V3.2 Speciale | DeepSeek-V3.2-Speciale: pushes the boundaries of reasoning capabilities (thinking mode only). | DeepSeek | | |
| Deepseek V3.2 Thinking | DeepSeek-V3.2-Thinking: a state-of-the-art model optimized for reasoning tasks, math, science, and code generation, featuring a 128k token context window. | DeepSeek | | |
| Llama 3.3 | An upgraded version of Llama 3.1 70B featuring enhanced reasoning, tool use, and multilingual abilities, along with a significantly expanded 128K context window. These improvements make it well suited for demanding tasks such as long-form summarization, multilingual conversation, and coding assistance. | Meta | | |
| Llama 4 Maverick | Llama 4 Maverick 17B-128E is Llama 4's largest and most capable model. It uses a Mixture-of-Experts (MoE) architecture and early fusion to provide coding, reasoning, and image capabilities. | Meta | | |
| Llama 4 Scout | Llama-4-Scout-17B-16E-Instruct is a state-of-the-art, instruction-tuned, multimodal model developed by Meta as part of the Llama 4 family. It handles both text and image inputs, making it suitable for a wide range of applications, including conversational AI, code generation, and visual reasoning. | Meta | | |
| Minimax M2 | MiniMax-M2 redefines efficiency for agents. It is a compact, fast, and cost-effective MoE model (230 billion total parameters with 10 billion active parameters) built for elite performance in coding and agentic tasks while maintaining strong general intelligence. | MiniMax | | |
| Ministral 3B | A compact, efficient model for on-device tasks like smart assistants and local analytics, offering low-latency performance. Part of the Mistral 3 family. | Mistral AI | | |
| Ministral 8B | A powerful small model with faster, memory-efficient inference, ideal for complex workflows and demanding edge applications. Part of the Mistral 3 family, with multimodal capabilities. | Mistral AI | | |
| Mistral Large 3 | Mistral's most capable model, with 41B active parameters (675B total) in a sparse mixture-of-experts architecture. Excels at multilingual conversation, image understanding, and general reasoning tasks. | Mistral AI | | |
| Mistral Small | Mistral-Small-3.2 is a 24-billion-parameter open-source language model and an incremental update to its predecessor, 3.1. It features improved instruction following, reduced repetitive outputs, and enhanced performance on coding and STEM tasks. | Mistral AI | | |
| Kimi K2 | Kimi K2 0905 has shown strong performance on agentic tasks thanks to its tool calling, reasoning abilities, and long-context handling. As a very large model (1T parameters), it is also resource-intensive; running it in production requires a highly optimized inference stack to avoid excessive latency. | Moonshot AI | | |
| Kimi K2 Thinking | Kimi K2 Thinking is an advanced open-source thinking model by Moonshot AI. It can execute up to 200–300 sequential tool calls without human intervention, reasoning coherently across hundreds of steps to solve complex problems. Built as a thinking agent, it reasons step by step while using tools, achieving state-of-the-art performance on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks, with major gains in reasoning, agentic search, coding, writing, and general capabilities. | Moonshot AI | | |
| GPT-5.1 | OpenAI's GPT-5.1 model. | OpenAI | | |
| GPT-OSS 120B | Excels at efficient reasoning across science, math, and coding applications. Ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads. | OpenAI | | |
| GPT-OSS 20B | A compact, open-weight language model optimized for low-latency and resource-constrained environments, including local and edge deployments. | OpenAI | | |
| Diffusion 3.5 Large | Stability AI's Stable Diffusion 3.5 Large image generation model. | Stability AI | | |
| Z Image Turbo | Z Image Turbo image generation model. | Alibaba | | |
| intellect 3 | INTELLECT-3 delivers state-of-the-art performance for its size across math, code, and reasoning. | Unknown | | |
| GLM 4.5 Air | GLM-4.5-Air is built as a foundational model for agent-oriented applications, leveraging a Mixture-of-Experts (MoE) architecture with a streamlined design of 106B total parameters and 12B active parameters. | Zhipu AI | | |
| GLM 4.6 | GLM-4.6 achieves comprehensive enhancements across multiple domains, including real-world coding, long-context processing, reasoning, searching, writing, and agentic applications. | Zhipu AI | | |
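Several entries above quote both a total and an active parameter count. In a Mixture-of-Experts model, only a small subset of expert weights is routed to for each token, so per-token compute tracks the active count while memory footprint tracks the total count. The snippet below is an illustrative calculation (not an Okara feature) using only the figures quoted in the table:

```python
# Active-parameter fraction for the MoE models whose descriptions quote
# both figures above (values in billions of parameters).
moe_models = {
    "DeepSeek V3.1": (671, 37),
    "MiniMax M2": (230, 10),
    "Mistral Large 3": (675, 41),
    "GLM 4.5 Air": (106, 12),
}

for name, (total_b, active_b) in moe_models.items():
    # Only the active parameters participate in each forward pass;
    # the full set still has to be held in memory.
    print(f"{name}: {active_b}B of {total_b}B active per token "
          f"({active_b / total_b:.1%})")
```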
Showing 32 AI models
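The directory itself does not document how these models are invoked. As an illustration only, the sketch below assumes Okara exposes an OpenAI-compatible chat completions endpoint; the base URL, environment variables, endpoint path, and model identifier are hypothetical placeholders, not documented Okara values.

```python
# Hypothetical sketch: calling a chat-capable model from the directory through
# an OpenAI-compatible chat completions endpoint. The base URL, API key
# variable, and model identifier below are assumptions, not documented values.
import os

import requests

OKARA_BASE_URL = os.environ.get("OKARA_BASE_URL", "https://api.okara.example/v1")  # placeholder
OKARA_API_KEY = os.environ["OKARA_API_KEY"]  # placeholder credential


def chat(model: str, prompt: str) -> str:
    """Send a single-turn prompt to a chat model and return its reply text."""
    response = requests.post(
        f"{OKARA_BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {OKARA_API_KEY}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # "qwen-3-coder" is an illustrative identifier; actual model IDs may differ.
    print(chat("qwen-3-coder", "Write a function that reverses a string."))
```

Image models such as Flux 2 Dev or Stable Diffusion 3.5 Large would use a different endpoint and payload; the same caveat about hypothetical details applies.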