Claude Opus 4.6 launched on February 5, 2026. The announcement highlights benchmarks and finance use cases. This reference focuses on what matters for Claude Code users: the features that change how you work day-to-day.
The Four Changes That Matter
1. 1M Token Context Window (Beta)
Previous Opus models had a 200k context ceiling. Opus 4.6 extends this to 1 million tokens in beta.
What this means in practice:
| Content Type | ~Token Count | Fits in 200k | Fits in 1M |
|---|---|---|---|
| A single large file (5k lines) | ~20k | Yes | Yes |
| A mid-size codebase (50 files) | ~150k | Barely | Yes |
| A large codebase (200+ files) | ~600k | No | Yes |
| Full repo + conversation history | ~800k | No | Yes |
The practical impact: Claude Code can now hold your entire project in context simultaneously, rather than re-reading files as it goes. This matters most for:
- Cross-file refactoring: Understanding how a change in file A affects files B, C, and D — without losing context of A while reading D
- Architecture analysis: Grasping the full dependency graph in a single pass
- Long sessions: Extended debugging or feature implementation sessions where conversation history itself consumes significant context
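The token counts in the table above can be ballparked with a simple heuristic — roughly 4 characters per token for code and prose. A minimal sketch (the function names and the ~20% headroom figure are illustrative assumptions, not part of any official tooling):

```python
import os

CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by language and content

def estimate_repo_tokens(root: str, exts: tuple = (".py", ".ts", ".md")) -> int:
    """Walk a source tree and estimate its total token count."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                try:
                    total_chars += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # skip unreadable files
    return total_chars // CHARS_PER_TOKEN

def fits(tokens: int, window: int = 200_000) -> bool:
    """Leave ~20% headroom for conversation history and tool output."""
    return tokens <= window * 0.8
```

By this estimate, the 150k mid-size codebase from the table squeezes into 200k only because little headroom remains — which is exactly why the table says "Barely."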
Caveat: Exceeding 200k tokens triggers higher pricing (2x input, 1.5x output) on the API. For Max Plan users, this is covered by the subscription. For API users, long-context sessions get expensive fast.
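The pricing caveat above can be turned into a quick estimator. This sketch uses only the numbers stated in this reference ($5/$25 per million tokens input/output at standard context, 2x input and 1.5x output beyond 200k) and assumes the premium rate applies to the whole request once input crosses the threshold — verify that billing detail against the official pricing docs:

```python
# Rates from this reference: $5/$25 per million tokens (input/output) under 200k.
STANDARD_IN, STANDARD_OUT = 5.00, 25.00                   # USD per million tokens
LONG_IN, LONG_OUT = STANDARD_IN * 2, STANDARD_OUT * 1.5   # 2x input, 1.5x output

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost for one request, assuming the long-context rate
    applies to the entire request once input exceeds 200k tokens."""
    if input_tokens > 200_000:
        rate_in, rate_out = LONG_IN, LONG_OUT
    else:
        rate_in, rate_out = STANDARD_IN, STANDARD_OUT
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000
```

A 150k-input request with 4k output comes to about $0.85; push input to 800k and the same output costs roughly $8.15 — nearly a 10x jump, which is why billing alerts matter for API users.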
2. Context Compaction
As sessions run long, older conversation context is now automatically summarized rather than dropped. Previously, as a session approached the context limit, Claude either lost early context or the session degraded.
Opus 4.6’s compaction:
- Summarizes older parts of the conversation while preserving recent context in full
- Maintains key decisions, constraints, and accumulated understanding
- Operates transparently — you don’t trigger it manually
Why this matters for Claude Code: a 3-hour debugging session no longer loses the context from hour 1. The early investigation, hypotheses tested, and dead ends explored are compressed but retained. Claude remembers that you already tried approach X and it didn’t work.
This directly reduces the problem that cross-session memory patterns were designed to solve. It doesn’t eliminate the need for persistent memory (sessions still end), but within a session, context loss is significantly reduced.
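The compaction behavior described above can be sketched conceptually: keep the most recent messages verbatim and collapse everything older into a summary. This is an illustrative model of the idea, not Anthropic's implementation — the function names, message shape, and stub summarizer are all assumptions:

```python
def summarize(messages: list) -> str:
    # Placeholder: in the real model, this step preserves key decisions,
    # constraints, tested hypotheses, and dead ends from the older turns.
    return f"{len(messages)} earlier messages: decisions, constraints, attempts."

def compact(messages: list, keep_recent: int = 10) -> list:
    """Conceptual sketch of context compaction: summarize everything
    except the most recent messages, which stay verbatim."""
    if len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(older)  # stand-in for the model's own summarizer
    return [{"role": "user", "content": f"[Compacted history] {summary}"}] + recent
```

The key design property: the summary is lossy but the recent window is not, so the model never operates on a paraphrase of the work it is actively doing.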
3. Effort Controls
Four levels: low, medium, high, max.
This controls how much “thinking” (hidden reasoning tokens) Claude uses before responding. In practice:
| Level | Best For | Speed | Cost |
|---|---|---|---|
| Low | Simple file edits, formatting, obvious fixes | Fast | Low |
| Medium | Standard development tasks | Moderate | Moderate |
| High | Complex debugging, architecture decisions | Slow | Higher |
| Max | Problems requiring deep multi-step reasoning | Slowest | Highest |
Practical application in Claude Code:
Most Claude Code tasks don’t need max effort. A typo fix at max effort wastes tokens; debugging a complex race condition at low effort produces shallow analysis.
The right pattern: start at medium, escalate to high/max when Claude’s first attempt is inadequate. Don’t default to max for everything — it’s slower and more expensive without proportional benefit for straightforward tasks.
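The escalation pattern above can be expressed as a small retry loop. This is generic orchestration logic, not a real API call — `task` and `is_adequate` are hypothetical caller-supplied callables, and how the effort level actually reaches the model depends on your client:

```python
EFFORT_LADDER = ["medium", "high", "max"]  # start at medium, escalate on failure

def run_with_escalation(task, is_adequate):
    """Retry a task at increasing effort until the result passes a check.

    task(effort) runs one attempt at the given effort level;
    is_adequate(result) decides whether to accept it or escalate.
    """
    for effort in EFFORT_LADDER:
        result = task(effort)
        if is_adequate(result):
            return effort, result
    return EFFORT_LADDER[-1], result  # best attempt at max effort
```

The ladder deliberately omits "low": escalation starts where you would have started manually, so the common case costs exactly one medium-effort attempt.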
For automated pipelines, effort level should be set per-step:
- HTTP data fetch + simple extraction → low
- Complex analysis requiring reasoning → high
- Strategic synthesis across multiple data sources → max
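A per-step effort mapping like the one above might be declared as plain configuration. Everything here — the step names, the dict shape, the "medium" fallback — is a hypothetical sketch of how a pipeline could encode it:

```python
# Hypothetical pipeline definition mapping each step to an effort level.
PIPELINE = [
    {"step": "fetch",      "desc": "HTTP data fetch + simple extraction",  "effort": "low"},
    {"step": "analyze",    "desc": "complex analysis requiring reasoning", "effort": "high"},
    {"step": "synthesize", "desc": "strategic synthesis across sources",   "effort": "max"},
]

def effort_for(step_name: str) -> str:
    """Look up the configured effort for a pipeline step."""
    for step in PIPELINE:
        if step["step"] == step_name:
            return step["effort"]
    return "medium"  # sensible default for unlisted steps
```

Keeping the mapping in data rather than code means tuning cost/quality per step is a config change, not a code change.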
4. Adaptive Thinking
Related to effort controls but automatic: Opus 4.6 decides for itself whether extended reasoning (chain-of-thought) is worthwhile for a given query. Previous models applied a fixed thinking approach regardless of query difficulty.
For Claude Code users, this means:
- Simple tool calls (read a file, run a grep) execute faster because the model skips unnecessary reasoning
- Complex decisions (which files to modify, how to structure a refactor) trigger deeper thinking automatically
- You don’t need to prompt “think step by step” — the model judges when that’s needed
What Didn’t Change
Tool set: Same tools as before — Read, Write, Edit, Bash, Glob, Grep, etc. No new tools added.
CLAUDE.md behavior: Instructions work the same way. Your existing CLAUDE.md files don’t need updates.
Cross-session memory: Still resets between sessions. Auto-memory and supervisor state patterns remain necessary for continuity.
Pricing for standard context (under 200k tokens): $5/$25 per million tokens input/output — same as Opus 4.5.
Benchmark Context
The headline numbers, for reference:
| Benchmark | Opus 4.6 | Previous Best | What It Measures |
|---|---|---|---|
| Terminal-Bench 2.0 | Highest | — | Real-world agentic coding |
| MRCR v2 (1M, 8-needle) | 76% | 18.5% (Sonnet 4.5) | Long-context recall accuracy |
| GDPval-AA | +144 Elo vs GPT-5.2 | — | Economically valuable knowledge tasks |
| SWE-bench | Slight regression | Opus 4.5 | Open-source issue resolution |
The SWE-bench regression is worth noting: Opus 4.6 is better at long-context work and planning, but slightly worse at short, focused code fixes. This matches the architectural direction — the model is optimized for sustained, complex work rather than one-shot patches.
Migration Checklist
If you’re upgrading from Opus 4.5 or Sonnet 4.5 in Claude Code:
- No configuration changes needed — Model selection happens automatically if you’re on Max Plan
- Test long sessions — If you previously broke work into shorter sessions to avoid context degradation, try longer sessions now
- Adjust effort expectations — Tasks that required careful prompting at lower models may “just work” at default effort
- Monitor token usage — 1M context is powerful but expensive on API. Set up billing alerts if using API directly
- Review agent team use cases — If you have manual multi-step workflows, consider whether native agent teams simplify them
- Keep your memory patterns — Context compaction helps within sessions but doesn’t replace cross-session memory. Don’t remove your MEMORY.md or supervisor state
The Practical Bottom Line
Opus 4.6 makes Claude Code better at the things that were already its strengths: sustained multi-file work, complex reasoning, and large codebase navigation. The biggest quality-of-life improvement is the 1M context + compaction combination — long sessions finally work without degradation.
Effort controls give you a cost/speed dial. Default to medium, escalate when needed.
Agent teams are the structural feature. They change what’s possible in a single session: parallel investigation, concurrent analysis, team-style development. But they’re interactive-only — they don’t replace automated orchestration for persistent workflows.
The model is available now in Claude Code. Model ID: claude-opus-4-6.