Supported Models

Compare Claude, GPT, Gemini, GLM, and BYOK providers — capabilities, pricing, and recommendations

MiraBridge currently runs four managed AI providers (Anthropic, OpenAI, Google, GLM) and supports an additional set of BYOK providers reachable through the same gateway. The exact model list evolves as provider catalogs change, so this page focuses on the current flagship options and family coverage rather than a frozen full list.

Current Flagship Picks

| Model | Provider | Best For | Context Window | | --------------- | --------- | ------------------------------------------------- | -------------- | | Claude Opus 4.7 | Anthropic | Deep reasoning, complex architecture, coding | 1M tokens | | GPT-5.5 | OpenAI | Complex professional work, coding, agents | 1M tokens | | Gemini 2.5 Pro | Google | Long-context analysis, research, multimodal tasks | 2M tokens | | GLM-5 | GLM | Balanced general-purpose coding and chat | 128K tokens |

Provider Coverage

  • Anthropic — Claude Opus 4.7 is active, alongside Claude 4.6, 4.5, and selected 3.5 models
  • OpenAI — GPT-5.5 is active, alongside GPT-5.4, GPT-5.2, GPT-5.1, GPT-5 mini, GPT-4.1, and reasoning models
  • Google — Gemini 2.5 model family is fully active. Gemini 3.1 Flash Lite is the default Google model. Gemini 3.1 Pro Preview and Gemini 3 Flash Preview are available as preview-tier models.
  • GLM — GLM-5 is the default, alongside GLM-4.5 Air, GLM-4.7, GLM-5 Turbo, and GLM-5.1

Claude (Anthropic)

Claude excels at code generation, architectural reasoning, and following complex instructions. It remains a strong default for BUILD and ARCHITECT-style work. Claude Opus 4.7 is active for the most demanding reasoning and architecture tasks.

GPT (OpenAI)

GPT covers everything from fast day-to-day requests to high-end reasoning and agent workflows. GPT-5.5 is active as the flagship choice for demanding tasks, while smaller GPT variants are useful for speed-sensitive work.

Gemini (Google)

Gemini offers especially strong long-context performance. Gemini 2.5 Pro is the active flagship recommendation for large codebases and research-heavy tasks, with 2M tokens of context window. The Gemini 2.5 Flash variants stay cost-effective for higher-volume usage, and Gemini 3.1 Flash Lite is the default Google model for most agent workflows. Gemini 3.1 Pro Preview is also available as a preview-tier option for teams that want to evaluate the next Gemini generation early.

GLM

GLM models are useful for cost-efficient general-purpose coding and chat workflows. The full GLM family is available through the same Gateway as the other providers:

  • GLM-5 — balanced general-purpose default
  • GLM-4.5 Air — fast, lightweight option for chat and small coding tasks
  • GLM-4.7 — stronger reasoning model
  • GLM-5 Turbo — low-latency variant of GLM-5
  • GLM-5.1 — latest GLM model for complex reasoning and coding

GLM is supported on Desktop, VS Code, and Mobile through the standard provider routing path.

BYOK Providers

In addition to the four managed providers, MiraBridge can route to a wider set of providers when you supply your own API key. See Bring Your Own Key for setup. Flagship model picks per BYOK provider:

| Provider | Flagship model IDs | Context window | Notes | | --- | --- | --- | --- | | xAI Grok | grok-4, grok-4-1-fast, grok-code-fast-1 | up to 2M (grok-4-1-fast) | Native tool calling. Live-search tool carries a separate per-request surcharge. | | DeepSeek | deepseek-v4-pro, deepseek-v4-flash | 128K | Cache-read discount applies to long sessions. | | Mistral | mistral-large-latest, codestral-latest | 128K | Codestral is tuned for code tasks. | | Qwen | qwen3-coder-plus, qwen3-coder-flash | up to 256K | Alibaba DashScope OpenAI-compatible endpoint. | | Perplexity Sonar | sonar, sonar-pro | 128K | Search-augmented. Tool calling limited; Sonar Pro carries a search surcharge. | | Cohere | command-a-plus-05-2026, command-r-plus-08-2024 | 128K | Citations + tool_plan preserved in responses. | | Cerebras | llama-3.3-70b, qwen-3-235b-a22b | 128K | LPU inference; free tier available, paid tier model-dependent. | | Groq | llama-3.3-70b-versatile, qwen3-32b | 128K | Low-latency inference. No cache discount. | | Together AI | Llama-4-Maverick, Qwen3-Coder-32B | model-dependent | OpenAI-compatible serving for open-weight models. | | Fireworks AI | firefunction-v2, llama4-70b | model-dependent | OpenAI-compatible serving; firefunction-v2 is tuned for function calling. | | OpenRouter | passthrough (e.g. anthropic/claude-*, openai/gpt-*) | model-dependent | Routes to the backing provider; per-request markup applies. | | LiteLLM proxy | operator-defined | depends on backing model | Self-hosted proxy in front of any provider. | | Hugging Face Router | passthrough (auto-routed) | model-dependent | Routes to HF Inference Providers based on the model id. | | Ollama | local-only models | model-dependent | Local OpenAI-compatible endpoint. Free at the provider. |

Context windows above reflect the providers' published numbers for the listed flagship models at the time of writing; check each provider's own model documentation for the current limit on a specific model id. Multi-field providers (Azure OpenAI, AWS Bedrock, Cloudflare Workers AI, Hugging Face Inference Endpoints) require additional configuration and roll out in a follow-up release — see the BYOK page for the current status.

Automatic Failover

If one provider is unavailable, MiraBridge can fall back to an alternative provider. This helps keep coding sessions moving when a single provider is degraded.

Next Steps