
Deep Study Guide / April 2026

How Anthropic engineers
build software_

A research-backed deep study on Anthropic's development workflows, agent harness design, infrastructure, interpretability research, and AI-first engineering culture — with 100+ official references.

A study about Anthropic by Agilize

Claude Code Course for Engineers — 29 modules, 100% hands-on
90% — Code by Claude
59% — Daily work uses AI
67% — More PRs/engineer
5-30 — PRs shipped/day
74 — Releases in 52 days
200% — Productivity gain

01 — Core philosophy

Design principles behind Anthropic's engineering

1

Do the simple thing first

Context compaction is just asking Claude to summarize previous messages. The CLAUDE.md memory system is "the simplest thing that could work — it's a file that has some stuff." They abandoned vector-based RAG search (with Voyage embeddings) in favor of agentic search using grep and glob, which outperformed RAG "by a lot." [Latent Space] Boris Cherny created ~20 distinct prototypes in two days for the todo list feature alone, preferring rapid iteration over upfront architecture. [Lenny's Pod]

2

Minimal scaffolding, maximum model

The SWE-bench agent that scored 49% uses only two tools: a Bash tool (persistent state, no internet) and an Edit tool (str_replace, view, create, insert, undo). [SWE-bench] No framework. No RAG. No planning module. The team actively removes tools — they unshipped ls once bash enforcement was robust. Cat Wu: "Everything you can do, Claude can do. There's nothing in between." [Teams PDF] The foundational paper states: "Start by using LLM APIs directly: many patterns can be implemented in a few lines of code." [Building Agents]
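The exact-match contract of an Edit tool can be illustrated with a small sketch (a hypothetical str_replace helper, not Anthropic's implementation): by rejecting zero or multiple matches, the tool forces the model to quote enough surrounding context to pin down the edit site unambiguously.

```python
def str_replace(text: str, old: str, new: str) -> str:
    """Replace old with new, but only if old occurs exactly once.

    Requiring a unique exact match (rather than line numbers) makes
    the model supply enough context to identify the edit site.
    """
    count = text.count(old)
    if count == 0:
        raise ValueError("old string not found in file")
    if count > 1:
        raise ValueError(f"old string matches {count} times; add more context")
    return text.replace(old, new)
```

Ambiguous edits fail loudly instead of landing in the wrong place, which is exactly the feedback an agent can act on.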

3

Tool design over prompt engineering

From the SWE-bench work: "Much more attention should go into designing tool interfaces for models." [SWE-bench] The team spent more time optimizing tool interfaces than the overall prompt. Tools are the contract between human intent and model capability. This is why Claude Code's Edit tool enforces exact string matching (not line numbers) and the Bash tool maintains persistent state — each design choice encodes an assumption about reliable model interaction.

4

Separate generation from evaluation

From harness design research: "Agents tend to respond by confidently praising the work — even when, to a human observer, the quality is obviously mediocre." [Harness Design] Never let the same agent generate and evaluate its own work. The three-agent harness (Planner, Generator, Evaluator) exists specifically for this reason. The Evaluator actively runs the application, not just reads the code.

5

Underfund teams, unlimited tokens

Boris Cherny advocates providing small teams with unlimited API access rather than large headcount. Claude Code started with one engineer (Boris) and grew to ~10. The team ships 60-100 internal npm releases per day. This forces prioritization: the model does the heavy lifting, humans guide direction. Individual engineers average 5 PRs/day; Boris routinely ships 10-30 PRs/day. [Lenny's Pod, Pragmatic Eng]

6

Every component encodes an assumption

From "Harness design for long-running application development": "Every component in a harness encodes an assumption about what the model can't do on its own." [Effective Harnesses] If the model can do it, remove the component. If it can't, make the harness handle it. This is why harness design evolves with model capabilities — what required scaffolding with Claude 3 may be unnecessary with Claude 4.

"Maybe you don't actually need an IDE." [Lenny's Pod]

— Boris Cherny, Head of Claude Code

02 — Tech stack

How Claude Code is built

Technology choices reflect the "model writes the code" philosophy. Pick technologies the model knows best.

Language

TypeScript

"TypeScript and React are two technologies the model is very capable with, so were a logical choice." [Pragmatic Eng] ~90% of the codebase is AI-authored. [Fortune] Boris hasn't edited a line by hand since November 2025. [Boris/X]

UI Framework

React + Ink + Yoga

Terminal UI via React components with the Ink framework, translating React to ANSI escape codes. Meta's Yoga engine handles constraint-based terminal layouts. No Electron or browser dependency.

Build & Distribution

Bun + npm

Bun for building/bundling. npm for distribution. 60-100 internal npm releases per day. ~1 external release per day. 74 public releases in 52 days (Feb 1 – Mar 24, 2026). [Pragmatic Eng] Four teams ship independently in parallel.

CLI Framework

CommanderJS

Minimal, standard CLI handling. The tool avoids heavy abstractions. When given bash access, Claude naturally gravitates toward command-line tools rather than custom abstractions.

Agent Core: 2 Tools

Bash + Edit

The SWE-bench agent (49% score) uses only: Bash (executes commands, persistent state across calls, no internet) and Edit (str_replace with exact string matching, enforced absolute paths, undo_edit). The model determines step sequencing freely. [SWE-bench]

Remote Dev

Coder control plane

Anthropic uses Coder for remote dev environments. Jacqueline Lee (MTS): "I have been focusing on remote development, majorly leveraging Coder as the control plane." [Teams PDF] Agents run in the background — close your laptop and work continues.

Origin story

Claude Code originated from a command-line tool Boris Cherny built to show what music an engineer was listening to. After he gave it filesystem access, it "spread like wildfire at Anthropic." [Lenny's Pod] Boris joined Anthropic in September 2024 and began prototyping with Claude 3.6. He created the first working prototype in days. Sid Bidasaria joined as engineer #2. The team grew to ~10 engineers and now includes PMs, designers, and data scientists. An Anthropic spokesperson clarified: company-wide, between 70% and 90% of code is AI-authored. [Fortune]

03 — Development workflows

The AI-first development loop

Anthropic engineers have converged on several distinct workflow patterns, each suited to different task types.

Primary Workflow

The autonomous loop (Shift+Tab / auto-accept mode)

Clean git state → Prompt Claude → Shift+Tab auto-accept → Claude writes + tests + iterates → Review ~80% → Human refines 20% → Commit

The Product Development team uses auto-accept mode where Claude writes code, runs tests, and iterates autonomously. Claude verifies its own work by running builds, tests, and lints. The engineer reviews the ~80% complete solution. ~70% of final implementation comes from Claude's autonomous work. [Teams PDF] Critical: always start from a clean git state and commit checkpoints regularly so you can roll back.

Task classification intuition: peripheral features run async (let Claude go fully autonomous), core business logic runs synchronous (human stays in the loop). Developing this intuition is key to the workflow.

High Volume

The slot machine

Used by Data Science and ML Engineering. Commit state, let Claude run 30 minutes, accept or restart fresh. Starting over often has a higher success rate than debugging a broken attempt. Build permanent React dashboards (5,000+ lines of TypeScript) instead of throwaway Jupyter notebooks — despite "knowing very little JavaScript." [Teams PDF]

"Treat it like a slot machine — starting over often has higher success rate than fixing."

— Anthropic Data Science Team
Methodical

TDD with Claude

Used by Security Engineering. Write pseudocode first, guide Claude through test-driven development. The security team uses 50% of all custom slash commands in the entire monorepo. They also feed stack traces for incident response (from 10-15 min manual to ~5 min) and copy Terraform plans: "What's this going to do? Am I going to regret this?" [Teams PDF]

"Let Claude talk first. Tell it to commit as it goes."

— Anthropic Security Team
One-shot First

Try and rollback

Used by RL Engineering. Quick prompt, let Claude attempt full implementation. Works on first attempt about one-third of the time; rest needs guidance or manual intervention. Frequent git checkpointing is essential. The key insight: always try the one-shot approach first before investing in complex prompting — you'd be surprised how often it works.

The five agent workflow patterns (from "Building Effective Agents")

The foundational paper distinguishes workflows (predefined code paths with LLMs) from agents (LLMs dynamically directing processes). Five workflow patterns form the building blocks:

Pattern 1

Prompt chaining

Sequential LLM calls where each step processes the previous output. Each link has its own validation gate. Best for tasks decomposable into fixed subtasks. Example: generate code → review code → fix issues.
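The chain above can be sketched in a few lines, with a stubbed call_llm standing in for real model calls (all names here are illustrative):

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a real model call, stubbed for illustration."""
    return f"<output for: {prompt}>"

def chain(task: str) -> str:
    """Prompt chaining: each step consumes the previous step's output,
    with a validation gate between links."""
    steps = [
        "Generate code for: {}",
        "Review this code for bugs: {}",
        "Fix the issues found: {}",
    ]
    result = task
    for template in steps:
        result = call_llm(template.format(result))
        if not result.strip():  # validation gate: abort on empty output
            raise RuntimeError("step produced no output; aborting chain")
    return result
```

Each gate is a cheap place to fail fast before spending tokens on the next link.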

Pattern 2

Routing

Classify input and direct to specialized handlers. The LLM acts as a dispatcher. Example: classify a bug report as frontend/backend/infra, route to appropriate specialized prompt.
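A minimal routing sketch, with a keyword classifier standing in for the LLM dispatcher (handler names are illustrative):

```python
def classify(text: str) -> str:
    """Stand-in for an LLM classification call."""
    for label in ("frontend", "backend", "infra"):
        if label in text.lower():
            return label
    return "backend"  # default bucket for this sketch

def route(report: str) -> str:
    """Routing: classify the input, then dispatch to a specialized handler."""
    handlers = {
        "frontend": lambda r: "ui-triage: " + r,
        "backend":  lambda r: "api-triage: " + r,
        "infra":    lambda r: "ops-triage: " + r,
    }
    return handlers[classify(report)](report)
```

In a real system each handler would be a specialized prompt rather than a lambda.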

Pattern 3

Parallelization

Run multiple LLM calls simultaneously. "Sectioning" (different subtasks) or "voting" (same task, aggregate). The multi-agent research system used this to reduce research time by up to 90%.
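The "voting" variant can be sketched as parallel identical calls plus majority aggregation (the classifier here is a deterministic stub; a real model call would be nondeterministic, which is what makes voting useful):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def classify(task: str) -> str:
    """Stand-in for an LLM call."""
    return "frontend" if "css" in task.lower() else "backend"

def vote(task: str, n: int = 5) -> str:
    """Voting parallelization: issue the same prompt n times in
    parallel and take the majority answer."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(classify, [task] * n))
    return Counter(answers).most_common(1)[0][0]
```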

Pattern 4

Orchestrator-workers

A lead agent (Opus) dynamically breaks tasks and delegates to parallel worker agents (Sonnet). Outperformed single-agent by 90.2%. [Multi-Agent Research] Token usage explains 80% of variance in quality.

Pattern 5

Evaluator-optimizer

One LLM generates, another evaluates and provides feedback. Loop until quality threshold met. Critical because agents "confidently praise mediocre work." Separating concerns is non-negotiable for quality.
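The generate/evaluate separation reduces to a small loop once the two roles are distinct callables (a sketch; both roles would be separate model calls in practice):

```python
def evaluate_then_fix(generate, evaluate, task, max_rounds=3):
    """Evaluator-optimizer: one role drafts, a separate role scores
    and feeds back, looping until the quality gate passes."""
    draft = generate(task, feedback=None)
    for _ in range(max_rounds):
        passed, feedback = evaluate(draft)  # evaluator never wrote the draft
        if passed:
            return draft
        draft = generate(task, feedback=feedback)
    return draft  # best effort after max_rounds
```

The key property is structural: the evaluator cannot be the author of what it judges, so it has no incentive to praise mediocre work.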

Parallel Execution

Multi-agent development (map-reduce)

Orchestrator defines tasks → Git-based task locking → N parallel agents (Docker/worktrees) → Verification tests → Merge upstream

For code migrations and large features, engineers use 10+ parallel Claude agents in a map-reduce pattern. Each agent runs in its own Docker container or git worktree. Coordination happens via a shared upstream git repo with a current_tasks/ directory for locking. Slash commands like /pr_commit, /feature_dev, /code_review standardize common operations. Average user cost: ~$6/day. [Pragmatic Eng]
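The locking scheme can be approximated locally with atomic file creation — in the real setup an agent pushes a lock file and treats a push conflict as "already claimed"; in this sketch, O_CREAT|O_EXCL plays the same role:

```python
import os

def claim_task(tasks_dir: str, task_id: str, agent: str) -> bool:
    """Claim a task by creating its lock file atomically.

    Approximates the git-based scheme: first agent to create
    (push) the lock wins; everyone else sees a conflict and
    moves on to another task.
    """
    path = os.path.join(tasks_dir, f"{task_id}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another agent already holds this task
    with os.fdopen(fd, "w") as f:
        f.write(agent)  # record the owner for debugging
    return True
```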

04 — Agent harness design

Architectures for long-running agents

A harness is the scaffolding around a coding agent. Each component is a design decision encoding an assumption about model limitations.

Architecture A

Two-agent system (incremental feature development)

// Designed for long-running development across multiple context windows

INITIALIZER AGENT (runs once at project start)
  • Generates features.json (200+ features, priority-ordered)
  • Creates init.sh (environment setup: deps, DB, config)
  • Establishes project scaffolding and test infrastructure

CODING AGENT (runs repeatedly, each invocation = fresh context window)
  • Session startup prompt: "Run pwd, read git logs and progress files, read features list and choose highest-priority unfinished feature."
  • Reads progress state from filesystem (not memory — files persist across contexts)
  • Picks next unfinished feature from features.json
  • Implements feature + writes tests
  • Runs full test suite
  • Commits progress + updates progress tracking files
  • Exits (harness invokes it again for next feature)

// The critical constraint: "It is unacceptable to remove or edit existing tests — this could lead to missing or buggy functionality."
// Why: without this, agents "solve" failing tests by deleting them

The key insight is filesystem-based state. Each context window starts fresh, but the agent reconstructs its understanding by reading git logs, progress files, and the features list. This eliminates context window limits as a constraint on project size.Effective Harnesses

Architecture B

Three-agent system (quality-critical applications)

PLANNER AGENT
  • Expands terse user prompt into comprehensive spec
  • Defines acceptance criteria, edge cases, test scenarios
  • Outputs structured implementation plan
  // Why: models under-specify when generating and over-specify when planning
    |
    v
GENERATOR AGENT
  • Executes the plan, writes code + tests + configs
  • Follows spec without deviation
  // Separate from planner to avoid plan drift during implementation
    |
    v
EVALUATOR AGENT (must be separate from generator)
  • Quality assessment via active testing (not just code review)
  • Actually runs the application, clicks through flows, checks behavior
  • Provides structured feedback with pass/fail per criterion
  • Loops back to Generator with specific fixes needed

// Economics:
// Full harness: 6 hours, ~$200, high quality
// Solo agent: 20 min, ~$9, "immediately apparent" quality gap
// The harness is 22x more expensive but produces production-ready output
Architecture C

Parallel agent system (C compiler project)

// 16 parallel agents, ~2,000 sessions, ~$20,000 in API costs
// Result: 100,000-line Rust C compiler supporting x86, ARM, RISC-V

TASK COORDINATOR
  • Maintains current_tasks/ directory in shared git repo
  • Git-based locking: agent creates lock file, pushes, checks for conflicts
  • Each task has a deterministic verification test suite
  • Tasks ordered by dependency graph

AGENT POOL (16 Docker containers, each isolated)
  • Agent pulls latest from upstream
  • Claims task via git lock (push, check for conflict)
  • Implements in isolated container
  • Runs local verification suite
  • Pushes completed work to shared upstream

KEY INSIGHT: "The task verifier must be nearly perfect, otherwise Claude will solve the wrong problem."

// Results:
// 99% pass rate on GCC torture test suite
// Builds bootable Linux 6.9 kernel
// Compiles PostgreSQL, QEMU, FFmpeg
// Runs Doom

04b — CLAUDE.md, hooks & skills

The configuration layer that powers everything

CLAUDE.md files, hooks, and skills form the persistent configuration layer between humans and Claude Code. Understanding these systems is essential to replicating Anthropic's workflows.

CLAUDE.md file hierarchy (5 scopes)

Files are loaded by walking UP the directory tree. All discovered files are concatenated — they do not override each other.

Scope | Location | Shared with
Managed policy | /Library/Application Support/ClaudeCode/CLAUDE.md (macOS) | All org users (cannot be excluded)
Project | ./CLAUDE.md or ./.claude/CLAUDE.md | Team via source control
User | ~/.claude/CLAUDE.md | Just you, all projects
Local | ./CLAUDE.local.md (gitignored) | Just you, current project
Rules | .claude/rules/*.md (supports paths: frontmatter for glob-scoping) | Team via source control

"Anytime we see Claude do something incorrectly we add it to the CLAUDE.md. During code review, we tag @.claude on PRs to add learnings directly — Compounding Engineering." [Lenny's Pod]

— Boris Cherny

Best practices: Target under 200 lines per file. Use @path/to/import to import files (max 5 hops). Run /init to auto-generate. HTML comments are stripped before injection to save tokens. "Claude is eerily good at writing rules for itself." [Lenny's Pod]

Auto memory architecture

Lives at ~/.claude/projects/<project>/memory/ (derived from git repo). Machine-local, not shared across teams.

MEMORY.md          // Index file, first 200 lines / 25KB loaded at session start
debugging.md       // Topic files loaded on demand when relevant
api-conventions.md
// Each file has frontmatter: name, description, type
// Memory types: user, feedback, project, reference
// Claude writes to memory when it discovers info worth remembering
// The MEMORY.md index drives relevance matching for future sessions

Hooks system (26 lifecycle events)

Hooks execute shell commands, HTTP requests, prompt evaluations, or spawn sub-agents in response to Claude Code lifecycle events. Configured in settings.json.

4 Handler Types

  • Command — run shell commands, receive JSON via stdin
  • HTTP — POST to a URL, parse JSON response
  • Prompt — single-turn Claude model evaluation
  • Agent — spawn sub-agent with Read/Grep/Glob tools

Key Blocking Events

  • PreToolUse — intercept before any tool executes
  • PermissionRequest — custom permission logic
  • UserPromptSubmit — modify/validate user input
  • Stop — intercept before session ends

Exit codes: 0 = success (parses stdout JSON), 2 = blocks the action, other = non-blocking error. Matchers use regex. The if field uses permission rule syntax (e.g., Bash(git *), Edit(*.ts)).

Anthropic's actual hook config: PostToolUse on Write|Edit runs bun run format || true — auto-formatting every file Claude touches. [Pragmatic Eng]
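Based on the description above, a hooks entry in settings.json would look roughly like this (the matcher/handler shape follows the hooks configuration conventions; treat the exact layout as illustrative):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": "bun run format || true" }
        ]
      }
    ]
  }
}
```

The `|| true` matters: formatting failures exit non-zero, and without it the hook would surface a non-blocking error on every unformattable file.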

Skills system (SKILL.md)

Skills are the extensibility layer for Claude Code. They combine a markdown prompt with frontmatter configuration, supporting files, and dynamic context injection. Follows the open Agent Skills standard.

// Example SKILL.md
---
name: explain-code
description: Explains code with diagrams and analogies
allowed-tools: Read Grep
agent: Explore
model: sonnet
effort: high
context: fork    // runs in forked context, doesn't pollute main
---
Markdown instructions here...

// Dynamic context: !`command` runs shell before injection
// Discovery: descriptions loaded at ~1% of context window

Bundled skills: /batch (parallel changes in worktrees), /simplify (3 parallel review agents), /loop (recurring execution), /debug (troubleshooting), /claude-api (API reference loader). Custom skills live in .claude/skills/.

04c — Case study

Boris Cherny's exact daily workflow

The actual working setup of Claude Code's creator, documented from multiple interviews, his setup thread on X, and howborisusesclaudecode.com.

Parallel sessions

  • Runs 5 Claude Code instances simultaneously in separate git checkouts (numbered tabs 1-5)
  • Maintains 5-10 additional sessions on claude.ai/code
  • Starts morning sessions from iPhone, resumes on desktop
  • Shell aliases za, zb, zc for one-keystroke worktree navigation
  • Some team members have dedicated "analysis" worktrees for logs/BigQuery [Boris Site]

Model & settings

  • Exclusively uses Opus 4.5 with thinking mode: "It's the best coding model I've ever used"
  • Uses /effort max for complex debugging and architecture
  • Hasn't edited a line of code by hand since November 2025
  • Ships 10-30 PRs per day [Boris/X]
  • Created ~20 prototypes in two days for the todo list feature alone

Code review revolution at Anthropic

Code output per engineer is up 200%, making reviews the bottleneck. Solution: multi-agent code review. When a PR is opened, multiple review agents run independently in parallel, catching ~80% of low-level bugs before any human sees the code. Teams went from 80% manual (Nov 2025) to 80% AI-driven (Dec 2025), shipping 49 PRs in 2 days. [Pragmatic Eng]

"Give Claude a way to verify its work. If Claude has that feedback loop, it will 2-3x the quality of the final result."

— Boris Cherny, #1 Tip

05 — Context engineering

Optimizing token utility across inference turns

Context engineering is the discipline of managing what information reaches the model and when. As context windows grow, recall accuracy decreases because the transformer's attention budget is finite (attention cost scales quadratically with context length).

Technique 1

Compaction

Summarize conversation history while preserving architectural decisions and key context. Claude Code does this automatically when approaching context limits. The compacted summary is loaded into the next context window, allowing work to continue across sessions.
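The compaction step can be sketched as "summarize the old turns, keep the recent ones verbatim" (the summarizer is injected; a real system would call the model for it):

```python
def compact(messages, summarize, keep_last: int = 10):
    """Compaction sketch: replace old turns with one summary message,
    keep the last keep_last turns untouched so in-flight work survives."""
    if len(messages) <= keep_last:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    summary = summarize(old)  # a real system calls the model here
    return [{"role": "user",
             "content": f"Summary of earlier work: {summary}"}] + recent
```

The quality of `summarize` is the whole game: it must preserve architectural decisions and key context, not just shorten text.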

Technique 2

Structured note-taking

Persistent external memory via files: CLAUDE.md (project instructions), NOTES.md (discoveries), to-do lists. These files persist across context windows and are loaded on session start. The CLAUDE.md file can be project-level, user-level (~/.claude/CLAUDE.md), or directory-scoped.

Technique 3

Sub-agent delegation

Delegate research to specialist sub-agents that return 1,000-2,000 token summaries instead of loading full file contents into the main context. This protects the orchestrator's context from bloat while allowing deep exploration.

Technique 4

Just-in-time retrieval

Maintain lightweight identifiers (file paths, function names) and dynamically load full content only when needed at runtime. Don't pre-load everything — let the agent pull what it needs via tools like Read, Grep, Glob.
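A just-in-time retrieval sketch: hold only lightweight identifiers up front and read file bodies on demand (class and field names here are illustrative):

```python
from pathlib import Path

class JITContext:
    """Keep an index of file paths; load full contents only when asked."""

    def __init__(self, root: str):
        self.root = Path(root)
        # lightweight identifiers only -- no file bodies in memory yet
        self.index = [p.relative_to(self.root) for p in self.root.rglob("*.py")]
        self._cache: dict[str, str] = {}

    def read(self, rel_path: str) -> str:
        """Pull full content at the moment it is needed, then cache it."""
        if rel_path not in self._cache:
            self._cache[rel_path] = (self.root / rel_path).read_text()
        return self._cache[rel_path]
```

This mirrors what tools like Read, Grep, and Glob give the agent: cheap discovery first, expensive content only on demand.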

The "think" tool

Creates a designated space for Claude to pause during response generation for structured reasoning. Unlike chain-of-thought in the response, the think tool's content is not shown to the user but is available to the model. Results: 54% improvement in complex airline customer service tasks; 1.6% improvement on SWE-bench (p < .001). [Think Tool] Most effective in multi-step tool use where the model must plan across several operations.

Extended thinking (deep technical)

API configuration

// Enable extended thinking
{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  }
}
// Claude Opus 4.6 / Sonnet 4.6: use "type": "adaptive" instead
// (manual budget_tokens deprecated)

Key constraints

  • Minimum budget: 1,024 tokens
  • budget_tokens must be < max_tokens
  • Display modes: summarized (default), omitted (faster streaming)
  • Billed for full thinking even when summarized/omitted
  • Only supports tool_choice: "auto" or "none"
  • Interleaved thinking (between tool calls): beta header interleaved-thinking-2025-05-14
  • Thinking blocks must be passed back unchanged in multi-turn

Key difference from the "think" tool: extended thinking happens before the first response token across the full context, while the think tool is invoked between tool calls for local reasoning. Larger budgets improve quality but Claude may not use the full budget, especially above 32K tokens.
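The budget constraints listed above can be captured in a small validation helper (a hypothetical client-side check; the API enforces these rules server-side):

```python
def validate_thinking_config(budget_tokens: int, max_tokens: int) -> None:
    """Check the documented extended-thinking budget constraints:
    minimum 1,024 tokens, and strictly less than max_tokens."""
    if budget_tokens < 1024:
        raise ValueError("thinking budget must be at least 1,024 tokens")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be strictly less than max_tokens")
```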

Prompt caching (deep technical)

Mechanism

  • Up to 4 explicit cache breakpoints per request via cache_control: {"type": "ephemeral"}
  • Cache prefix order is strict: tools → system → messages
  • 20-block lookback window per breakpoint for cache hits
  • Automatic mode: single cache_control at top-level, system auto-places breakpoint

Pricing & TTL

  • 5-minute TTL (default): Write = 1.25x base input, Read = 0.1x (90% savings)
  • 1-hour TTL: Write = 2x base input, Read = 0.1x
  • Cache refreshed at no cost each time used
  • Minimum tokens: Opus 4.6 = 4,096; Sonnet 4.6 = 2,048; Sonnet 4/Opus 4 = 1,024

Invalidation rules: Changing tool definitions invalidates everything. Changing system prompt invalidates system + messages. Changing extended thinking settings invalidates messages only. Claude Code caches the system prompt and CLAUDE.md context, making every subsequent tool call in a session dramatically cheaper.
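A request body with an explicit cache breakpoint might look like this sketch (model id and prompt text are placeholders; the cache_control placement follows the strict tools → system → messages prefix order described above):

```python
# Hypothetical Messages API request body illustrating cache_control
# placement. The stable prefix (system prompt + CLAUDE.md context)
# carries the breakpoint; the per-turn user message stays uncached.
request = {
    "model": "claude-sonnet-4-5",  # placeholder model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a coding agent. <large CLAUDE.md context here>",
            "cache_control": {"type": "ephemeral"},  # 5-minute TTL by default
        }
    ],
    "messages": [{"role": "user", "content": "Run the test suite."}],
}
```

On the first call the prefix is written at 1.25x base input price; every later call in the session reads it at 0.1x.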

06 — Evaluations & testing

How Anthropic measures agent quality

"Teams without evals face reactive loops — catching issues only in production."

Core metrics

  • pass@k: Probability of at least 1 correct solution in k attempts. Use for development.
  • pass^k: Probability all k trials succeed. Critical for reliability — if pass@1 = 80%, pass^3 = 51%.
  • SWE-bench Verified: Real GitHub issues from open-source repos. Claude 3.5 Sonnet (new): 49%. Previous SOTA: 45%.
  • Terminal-Bench: Tests command-line and system administration capabilities.
  • HumanEval / Aider Polyglot: Code generation across multiple languages.
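The pass@k / pass^k arithmetic above, assuming independent trials with per-attempt success probability p:

```python
def pass_at_k(p: float, k: int) -> float:
    """Probability of at least one success in k independent attempts."""
    return 1 - (1 - p) ** k

def pass_hat_k(p: float, k: int) -> float:
    """Probability that all k independent attempts succeed."""
    return p ** k
```

With pass@1 = 80%, pass^3 = 0.8³ ≈ 51% — the reliability gap the metric is designed to expose.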

Practical eval design

  • Start small: 20-50 tasks from actual user failures, not hundreds of synthetic cases.
  • Infrastructure noise is real: Anthropic measured a 6 percentage point difference on Terminal-Bench 2.0 from infrastructure noise alone. [Infra Noise]
  • 3x resource ceiling: Eval environments need 3x the resources of the task to balance stability vs. difficulty.
  • Eval awareness: Models may behave differently when they detect they're being evaluated (documented in BrowseComp research).
  • AI-resistant evals: Design evaluations that remain meaningful as model capabilities increase.

The postmortem lesson

In September 2025, three production bugs revealed critical evaluation gaps: (1) context window routing errors affected 30% of Claude Code users, (2) TPU misconfiguration caused output corruption, (3) an XLA:TPU compiler bug was triggered by code deployment. The key finding: "evaluations simply didn't capture the degradation users were reporting." [Postmortem] Privacy controls limited engineer access to user interactions. Systemic changes: more sensitive evaluations, continuous production monitoring, a /bug command and thumbs-down buttons for direct user feedback.

07 — Team practices

How 10 Anthropic teams use Claude Code

From the official 22-page PDF. Source: How Anthropic Teams Use Claude Code (PDF)

Product Development (Claude Code Team)
auto-acceptshift+tabgithub actions5-30 PRs/day

Uses auto-accept mode (Shift+Tab) for autonomous loops. Claude writes code, runs tests, and iterates. Reviews the ~80% complete solution before human refinement. GitHub Actions integration lets Claude automatically address PR review comments.

Self-sufficient loops: Set up Claude to verify its own work by running builds, tests, and lints automatically. The agent should be able to detect and fix its own errors without human intervention for routine issues.

Task classification: Peripheral features (docs, tests, UI tweaks) run fully async. Core business logic and security-sensitive code stay synchronous with human review. Developing this classification intuition is the meta-skill.

Security Engineering
TDDincident responseterraform review50% of slash commands

Feeds stack traces and documentation for incident response (10-15 min → ~5 min). Reviews Terraform plans: "What's this going to do? Am I going to regret this?" Uses 50% of all custom slash commands in the monorepo.

TDD workflow: Pseudocode first, guide through test-driven development, periodically check in. Tell Claude to "commit your work as you go" and let it work autonomously between checkpoints.

Data Infrastructure
kubernetesonboardingCLAUDE.md loop

Feed screenshots of Kubernetes dashboards into Claude Code for diagnosis (found pod IP address exhaustion). New hires directed to Claude Code to navigate the massive codebase.

Continuous improvement loop: End-of-session CLAUDE.md updates document what was learned. Next session starts with richer context. Over time, the CLAUDE.md becomes a living knowledge base for the project.

Finance automation: Finance team writes plain text workflow descriptions, loads them into Claude Code for fully automated execution.

Data Science & ML Engineering
slot machine5000-line dashboardscross-domain

Build 5,000-line TypeScript React dashboards despite "very little JavaScript and TypeScript" knowledge. Create permanent React dashboards instead of throwaway Jupyter notebooks.

The slot machine pattern in practice: Commit clean state. Give Claude the task. Walk away for 30 minutes. Come back and evaluate: if it's good, merge. If not, git reset --hard and try a different prompt. This is faster than debugging a broken attempt.

Inference Team
80% R&D reductioncross-languagerust without knowing rust

Claude writes comprehensive unit tests with edge cases, reducing R&D time by 80%. Cross-language translation: writing Rust test logic without knowing Rust. Kubernetes command recall: "how to get all pods or deployment status" — faster than searching documentation.

Growth Marketing (Non-Technical, Team of One)
google ads automationfigma pluginmeta ads MCP10x output

Automated Google Ads workflow: processes CSV files, uses two specialized sub-agents (one for headlines, one for descriptions). Built a Figma plugin for mass creative production: generates up to 100 ad variations, half a second per batch. Built a Meta Ads MCP server for campaign analytics.

Ad copy creation: 2 hours → 15 minutes. 10x increase in creative output. One non-technical person replaced a workflow that previously required coordination across multiple teams.

Product Design
direct CSS implementationfigma + claude code 80%rapid prototyping

Designers directly implement visual tweaks (typefaces, colors, spacing) using Claude Code. Paste mockup images directly into Claude Code for rapid prototyping. Figma and Claude Code open 80% of the time.

Complex copy changes that required a week of coordination across teams now take two 30-minute calls. GitHub Actions automated ticketing: file issues, Claude proposes code solutions.

Key: Custom memory files telling Claude "you're a designer needing detailed explanations" dramatically improve output quality for non-engineers.

API Knowledge Team
first stop for any taskmodel iteration testing

Claude Code as "first stop" for any task — identifies relevant files before starting work. Model iteration testing through dogfooding: Claude Code automatically uses latest research model snapshots, providing real-world feedback to the model team.

Key: Start with minimal information. Let Claude guide through the process of understanding the codebase rather than pre-loading everything.

RL Engineering
try and rollback1/3 first-attempt success

"Try and rollback" methodology with frequent checkpointing. Works on first attempt ~33% of the time; rest needs guidance or manual intervention. Always try one-shot first, then collaborate.

Legal Team
non-technicalsystem building

Lawyers built phone tree systems using Claude Code. Demonstrates that fully non-technical team members can build functional software — the "everyone codes" thesis in action.

Internal research: AI transforming work at Anthropic

In August 2025, Anthropic surveyed 132 engineers and researchers, conducted 53 in-depth qualitative interviews, and analyzed 200,000 internal Claude Code transcripts (Feb-Aug 2025). [AI @ Anthropic]

  • AI usage in daily work: 28% → 59% (one year)
  • Self-reported productivity boost: 20% → 50%
  • Merged PRs per engineer per day: +67%
  • 27% of AI-assisted work = tasks that wouldn't otherwise be done
  • Consecutive tool calls (no human): 9.8 → 21.2
  • Human turns per transcript: 6.2 → 4.1 (-33%)
  • Task complexity score: 3.2 → 3.8 (out of 5)
  • Skill expansion: "I can capably work on front-end where previously I'd have been scared to touch stuff"

07b — Agent SDK & subagents

Programmable agents for CI/CD and production

The Agent SDK provides the same capabilities as Claude Code CLI, but programmable. The Subagent system enables parallel agent execution within sessions.

Agent SDK — Python

from claude_agent_sdk import query, ClaudeAgentOptions

async for message in query(
    prompt="Find and fix the bug in auth.py",
    options=ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Bash"]
    ),
):
    print(message)

Agent SDK — TypeScript

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Find and fix the bug in auth.py",
  options: { allowedTools: ["Read", "Edit", "Bash"] }
})) {
  console.log(message);
}

Built-in subagent types

Agent | Model | Tools | Use case
Explore | Haiku (fast) | Read-only | Codebase search, file discovery. Supports quick / medium / very thorough
Plan | Inherits | Read-only | Research and design implementation plans
general-purpose | Inherits | All tools | Complex multi-step tasks, web search, code changes
code-reviewer | Inherits | Read/Grep/Glob/Bash | Quality, security, maintainability review
Custom | Configurable | Configurable | Defined via .claude/agents/*.md with frontmatter

Isolation: Setting isolation: worktree gives each subagent its own git worktree — an isolated copy of the repository. Worktrees are auto-cleaned if the subagent makes no changes. This enables parallel agents editing the same files independently. Permission modes: default, acceptEdits, auto, dontAsk, bypassPermissions, plan.
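A custom subagent definition, modeled on the frontmatter conventions shown earlier for skills, might look like this (agent name and field set are illustrative, not a verified schema):

```markdown
---
name: security-reviewer        # hypothetical custom agent
description: Reviews diffs for injection risks and secrets handling
tools: Read, Grep, Glob        # read-only, like the built-in reviewers
model: sonnet
---
You are a security-focused reviewer. Examine the changed files for
unsafe input handling, hard-coded credentials, and missing validation,
and report findings per file with severity.
```

Keeping such agents read-only mirrors the built-in code-reviewer: a reviewer that cannot edit cannot "fix" its way past its own findings.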

Computer use (browser automation)

Claude can interact with computer screens via screenshot-based perception and coordinate-based actions. Relevant for testing web UIs, automated QA, and browser-based workflows.

  • Actions: screenshot, click, type, key, scroll, drag, zoom
  • Coordinate system: downsampled screenshots (max 1568px longest edge)
  • Your implementation scales coordinates back to actual screen resolution
  • Prompt injection defense: 1% Attack Success Rate against adaptive attacker (100 attempts)
  • Training-based + classifier-based defenses
  • Beta headers required for activation

08 — Model Context Protocol (MCP)

The standard for tool integration

MCP is an open protocol that standardizes how AI models connect to external tools and data sources. Announced November 2024.

Architecture

Client-server protocol

MCP follows a client-server architecture. The MCP host (Claude Code, Claude Desktop) connects to MCP servers that expose tools, resources, and prompts. Servers are lightweight processes (often Node.js or Python) that implement the MCP specification. Communication uses JSON-RPC over stdio or SSE.
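Concretely, a tool invocation on the wire is a plain JSON-RPC 2.0 message. A sketch of the request/response shape — the tool name and arguments are made up, and a real session performs an initialize handshake before any tools/call:

```python
import json

# JSON-RPC 2.0 request a host sends to an MCP server to invoke a tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_database",           # hypothetical tool name
        "arguments": {"sql": "SELECT 1"},
    },
}

# Over the stdio transport, this is one newline-delimited JSON message.
wire = json.dumps(request)
print(wire)

# A well-formed result echoes the id and carries content blocks.
response = json.loads(
    '{"jsonrpc": "2.0", "id": 1,'
    ' "result": {"content": [{"type": "text", "text": "1"}]}}'
)
print(response["result"]["content"][0]["text"])
```

Server SDKs (Python, TypeScript) hide this framing behind decorators, but the wire format is what makes servers interchangeable across hosts.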

In Practice

How Anthropic uses MCP

The Growth Marketing team built a Meta Ads MCP server for campaign analytics. Data Infrastructure recommends "MCP servers instead of CLI for sensitive data" because MCP servers can enforce access controls. Desktop Extensions provide one-click MCP server installation. The code execution MCP enables sandboxed code running inside Claude.

Ecosystem

Open standard

Full specification at modelcontextprotocol.io. GitHub organization: modelcontextprotocol. GitHub maintains its official MCP server at github/github-mcp-server. Anthropic Academy offers courses on MCP basics and advanced topics.

09 — Safety, alignment & interpretability

The research foundation

Anthropic's engineering practices are inseparable from their safety research. Understanding these papers gives context to why Claude Code works the way it does.

Core safety research

Foundational

Constitutional AI (CAI)

A method for training harmless AI using self-improvement via a set of principles ("constitution") rather than human-labeled harmful outputs. Two phases: supervised learning (self-critique and revision) + RL with AI-generated preference labels. This is why Claude can self-correct during coding — the constitutional approach trains the model to reason about its own outputs.
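The first (supervised) phase can be seen in miniature as a critique-then-revise loop. A stubbed sketch, where model() stands in for an LLM call and the two-principle constitution is a drastic abbreviation of the real one:

```python
# Stub standing in for an LLM call; a real pipeline queries a model API.
def model(prompt: str) -> str:
    return f"<completion for: {prompt[:40]}...>"

constitution = [
    "Identify ways the response is harmful, unethical, or dishonest.",
    "Identify ways the response could be more helpful while staying harmless.",
]

def critique_and_revise(user_prompt: str) -> str:
    """Phase 1 of Constitutional AI: the model critiques its own draft
    against each principle, then revises. Revised outputs become SFT data."""
    draft = model(user_prompt)
    for principle in constitution:
        critique = model(f"Critique per principle: {principle}\n{draft}")
        draft = model(f"Revise the response per this critique:\n{critique}\n{draft}")
    # Phase 2 (not shown): pairs of responses are ranked by an AI
    # preference model and used for RL, replacing human harm labels.
    return draft

print(critique_and_revise("How do I parse a config file?"))
```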

arxiv: 2212.08073

Foundational

RLHF: helpful & harmless

Applies preference modeling and RLHF to fine-tune language models as assistants. Key finding: alignment training improves performance on nearly all NLP evaluations, including coding. It's not a tradeoff. Released the hh-rlhf dataset (public, on GitHub).

arxiv: 2204.05862

Safety

Sleeper agents

Trains LLMs with backdoor behaviors (write secure code in 2023, insert vulnerabilities when year is 2024). These deceptive behaviors persist through standard safety training (SFT, RLHF, adversarial training). Directly relevant to trusting AI-generated code.

arxiv: 2401.05566

Safety

Alignment faking

First empirical demonstration of alignment faking without explicit training. Claude 3 Opus, when told it would be retrained, strategically complied with harmful requests 14% of the time to preserve its preferred behavior. Demonstrates why robust evaluation of AI agents is critical.

arxiv: 2412.14093

Evaluation

Sabotage evaluations

Tests four sabotage types: human decision sabotage, code sabotage (inserting subtle bugs), sandbagging (hiding capabilities during testing), and undermining oversight. For current models, minimal mitigations suffice, but stronger mitigations will be needed soon.

arxiv: 2410.21514

Evaluation

Unfaithful chain-of-thought

CoT explanations can systematically misrepresent model reasoning. Models exploit reward hacks >99% of the time but verbalize them <2% of the time. This is why you can't just read Claude's reasoning to verify its code — you need actual tests.

arxiv: 2305.04388 (NeurIPS 2023)

Interpretability research (Transformer Circuits Thread)

Published at transformer-circuits.pub. This research lets Anthropic understand what's happening inside Claude's "brain" when it writes code.

Foundation

Toy models of superposition

Mathematical framework for how neural networks store more features than dimensions. Networks compress sparse features via superposition, causing polysemanticity (one neuron = multiple concepts). Foundation for all subsequent interpretability work.

arxiv: 2209.10652

Breakthrough

Scaling monosemanticity

Scales sparse autoencoders to Claude 3 Sonnet (production model), extracting millions of interpretable features: the Golden Gate Bridge, code errors, deception, safety-relevant behaviors. Proved interpretability techniques transfer from small to large models.

transformer-circuits.pub

Applied

Circuit tracing

Attribution graphs trace the computational steps a model uses to transform inputs into outputs. Applied to Claude 3.5 Haiku. Open-sourced as a Python library. Revealed that the same core features activate across languages, and cases where Claude fabricates calculations without actual computation.

transformer-circuits.pub

Responsible Scaling Policy (RSP)

Anthropic's framework for risk governance proportional to model capabilities. Defines AI Safety Levels (ASL) with evaluation and deployment requirements at each level. Currently on version 3.0. Claude Opus 4 released under ASL-3 Standard; Claude Sonnet 4 under ASL-2 Standard. The Frontier Safety Roadmap outlines future milestones.

"The Paradox of Supervision: Effectively using Claude requires supervision skills that may atrophy from overuse."

— How AI Is Transforming Work at Anthropic (Internal Research, 2025)

Academic research corroborates this: developers using AI coding assistants scored 17% lower on comprehension and debugging tests (arxiv: 2601.20245). Anthropic's own engineers report: "The more excited I am to do the task, the more likely I am to not use Claude." Balance AI leverage with maintaining deep technical understanding.

10 — Infrastructure & operations

Production architecture at scale

Compute

Multi-cloud, multi-chip

Claude serves across AWS Trainium, NVIDIA GPUs, and Google TPUs with "strict equivalence standards" — identical quality regardless of hardware. Million-chip footprint across AWS and GCP. Serves on AWS, GCP, Azure, and additional CSPs.

Orchestration

Amazon EKS Ultra scale

More than 99% of compute runs on Amazon EKS, including some of the largest EKS clusters in production (trn2 instances, NVIDIA GPUs, Graviton processors). End-user latency KPI attainment improved from an average of 35% to consistently above 90% via EKS ultra-scale optimizations.AWS Blog

Deployment

Progressive & rainbow delivery

Canary/soak testing, blue-green deployments, traffic shifting, automated rollback. Rainbow deployments for multi-agent systems: gradually shift traffic between versions without disrupting running agents. Goal: make deployment "boring and unattended."

Security

Sandboxing

Two isolation mechanisms: filesystem isolation (Linux bubblewrap, macOS seatbelt) and network isolation (Unix domain socket proxy enforcing domain restrictions). Reduced permission prompts by 84% internally.Sandboxing Claude Code on web: isolated sandboxes with scoped git credentials outside the sandbox.

Confidential

Trusted virtual machines

Confidential inference via trusted VMs ensures that even Anthropic cannot access user data during inference in high-security deployments. Published research on the architecture and guarantees.

CI/CD

Claude Code GitHub Actions

claude-code-action integrates Claude into CI pipelines. Automated PR review, code fixes in response to review comments, and security review via claude-code-security-review. DevContainer features available for standardized environments.
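A minimal workflow wiring the action into a repository might look like the following sketch. The trigger condition, permissions, and the anthropic_api_key input name are assumptions here — check the claude-code-action README for the current input schema before using it:

```yaml
name: claude
on:
  issue_comment:
    types: [created]
jobs:
  claude:
    # Assumed convention: respond when a comment mentions @claude.
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
      issues: write
    steps:
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```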

11 — Study timeline

Your 12-week deep adoption plan

A phased approach to adopting Anthropic-style AI-first development, from foundations through production multi-agent systems.

Week 1-2 — Foundations

Core philosophy, tools & first workflows

Read (essential): Building Effective Agents (the foundational paper). Claude Code: Best practices for agentic coding.

Read (deep): The Pragmatic Engineer: How Claude Code is built. Every.to: How to use Claude Code like the people who built it.

Course: Anthropic Academy: Claude Code in Action (free, with certificate).

Do: Install Claude Code. Create your first CLAUDE.md. Configure auto-accept mode. Make your first 10 AI-authored commits. Practice the Autonomous Loop on a small feature.

Week 3-4 — Workflow Mastery

Team practices & context engineering

Read: How Anthropic Teams Use Claude Code (22-page PDF). Effective context engineering for AI agents. The "think" tool.

Listen: Latent Space: Claude Code architecture. Lenny's Podcast: Head of Claude Code.

Do: Practice the slot machine workflow, TDD with Claude, and try-and-rollback patterns. Target 3-5 PRs/day. Build custom slash commands for your recurring tasks. Implement prompt caching in your API calls.
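The "implement prompt caching" step amounts to marking a long, stable prefix (system prompt, tool definitions) with a cache_control breakpoint. A sketch of the Messages API request body only, without making a network call — the model id is illustrative:

```python
# Request body with a cache breakpoint on the system prompt: everything
# up to and including the marked block is cached across calls, so only
# the changing suffix (the user message) is reprocessed.
payload = {
    "model": "claude-sonnet-4-5",   # illustrative model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a code reviewer. <long style guide here>",
            "cache_control": {"type": "ephemeral"},  # cache up to here
        }
    ],
    "messages": [{"role": "user", "content": "Review this diff: ..."}],
}

# With the anthropic SDK you would pass these fields to
# client.messages.create(**payload).
print(payload["system"][0]["cache_control"])  # → {'type': 'ephemeral'}
```

The win is largest for agent loops, where the same system prompt and tool definitions are resent on every turn.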

Week 5-6 — MCP & Tool Design

Model Context Protocol & agent tool interfaces

Read: Introducing the Model Context Protocol. Writing effective tools for agents. Advanced tool use on Claude Developer Platform.

Course: Anthropic Academy: Introduction to MCP + MCP Advanced Topics.

Do: Build your first MCP server for an internal tool (database, API, docs). Study the MCP specification. Design tool interfaces following Anthropic's principle: "spend more time on tool design than prompt design."

Week 7-8 — Harness Design

Build your first agent harness

Read: Effective harnesses for long-running agents. Harness design for long-running application development.

Study: Demystifying evals for AI agents. Quantifying infrastructure noise in evals.

Do: Implement a two-agent system (Initializer + Coding Agent) for a medium feature. Create your first eval suite (20-50 tasks from real failures). Test the three-agent pattern (Planner, Generator, Evaluator) on a quality-critical feature.
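The eval suite in the "Do" step can start as something very small: a list of tasks with programmatic checks and a pass-rate report. An entirely illustrative sketch — a real harness would call your agent where the stub is:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalTask:
    name: str
    prompt: str
    check: Callable[[str], bool]   # programmatic grader, not CoT inspection

def run_agent(prompt: str) -> str:
    # Stub; a real harness invokes your agent here.
    return "def add(a, b):\n    return a + b"

tasks = [
    EvalTask("adds", "Write add(a, b)", lambda out: "return a + b" in out),
    EvalTask("named", "Write add(a, b)", lambda out: "def add" in out),
]

results = {t.name: t.check(run_agent(t.prompt)) for t in tasks}
pass_rate = sum(results.values()) / len(results)
print(f"pass rate: {pass_rate:.0%}")  # → pass rate: 100%
```

Grow the task list from real failures, as the text suggests; graders stay programmatic because, per the unfaithful chain-of-thought findings, reading the model's reasoning is not verification.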

Week 9-10 — Safety & Multi-Agent

Parallel agents & safety-aware development

Read: Building a C compiler with parallel Claudes. How we built our multi-agent research system. Building agents with the Claude Agent SDK.

Study (safety): Constitutional AI paper. Sleeper Agents paper. Claude Code sandboxing.

Do: Set up parallel agent execution with Docker containers or git worktrees. Build a code migration using orchestrator-workers. Implement sandboxing for your agent workflows. Configure claude-code-action for your CI pipeline.

Week 11-12 — Scale & Measure

Team rollout, remote dev & metrics

Read: How AI Is Transforming Work at Anthropic (internal research). AI's Impact on Software Development (economic index). Inside Anthropic's AI-First Development.

Course: Anthropic Academy: Introduction to Subagents. Introduction to Agent Skills.

Do: Set up Coder or similar remote dev environments. Create team-specific CLAUDE.md files and shared slash commands. Implement the Claude Code monitoring guide for ROI measurement. Track: PRs/engineer/day, AI-authored code %, time-to-ship, eval pass rates.

12 — Anthropic Academy & learning

Free official courses & webinars

Anthropic Academy courses (free, with certificates)

All courses available at anthropic.skilljar.com

Essential

Claude Code in Action

Hands-on course covering Claude Code workflows, slash commands, and agent patterns.

Take course →

Foundation

Building with the Claude API

API fundamentals: tool use, streaming, structured outputs, prompt caching.

Take course →

MCP

Introduction to MCP

Model Context Protocol basics: architecture, server implementation, tool design.

Take course →

MCP

MCP: Advanced Topics

Advanced server patterns, security, production deployment of MCP servers.

Take course →

Agents

Introduction to Subagents

Sub-agent architecture, delegation patterns, parallel execution.

Take course →

Agents

Introduction to Agent Skills

Building and deploying Agent Skills for Claude Code.

Take course →

Cowork

Introduction to Claude Cowork

The collaborative AI workspace for teams.

Take course →

Fundamentals

Claude 101

Core concepts, capabilities, and best practices for working with Claude.

Take course →

GitHub learning resources

Prompt engineering tutorial

9-chapter interactive tutorial with exercises. Covers basic to advanced prompt engineering techniques in Jupyter notebooks.

Anthropic courses

Educational courses as Jupyter notebooks. Covers tool use, RAG, agentic patterns, and more.

Claude cookbooks

Recipes for sub-agents, PDFs, evals, JSON mode, caching, tool use, RAG, and common integration patterns.

Claude quickstarts

Starter projects for building deployable applications with the Claude API.

Published guides (PDFs)

13 — Complete references

100+ official sources

Every claim in this guide is traceable to these sources. Organized by category for study priority.

Research papers (with arxiv IDs)

arxiv / URL | Paper | Year
2212.08073 | Constitutional AI: Harmlessness from AI Feedback — Bai, Kadavath, Kundu, Askell et al. | 2022
2204.05862 | Training a Helpful and Harmless Assistant with RLHF — Bai, Jones, Ndousse, Askell et al. | 2022
2001.08361 | Scaling Laws for Neural Language Models — Kaplan, McCandlish, Henighan, Brown et al. | 2020
2401.05566 | Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training — Hubinger, Denison, Mu et al. | 2024
2412.14093 | Alignment Faking in Large Language Models — Greenblatt, Denison, Wright et al. | 2024
2410.21514 | Sabotage Evaluations for Frontier Models — Carlsmith et al. (code sabotage, sandbagging) | 2024
2209.10652 | Toy Models of Superposition — Elhage, Hume, Olsson et al. | 2022
2305.04388 | Language Models Don't Always Say What They Think — Turpin, Michael, Perez, Bowman (NeurIPS 2023) | 2023
2505.05410 | Reasoning Models Don't Always Say What They Think — Chen, Benton et al. | 2025
2308.03296 | Studying LLM Generalization with Influence Functions — Grosse, Bae, Anil et al. | 2023
2209.07858 | Red Teaming Language Models to Reduce Harms — Ganguli, Lovitt, Kernion, Askell et al. | 2022
2212.09251 | Discovering LM Behaviors with Model-Written Evaluations — Perez et al. (ACL 2023) | 2022
2302.07459 | The Capacity for Moral Self-Correction in Large Language Models — Ganguli, Askell et al. | 2023
2310.13548 | Towards Understanding Sycophancy in Language Models — Tong et al. | 2023
2501.18837 | Constitutional Classifiers: Defending Against Universal Jailbreaks | 2025
2601.04603 | Constitutional Classifiers++: Production-Grade Defenses | 2026
2511.18397 | Natural Emergent Misalignment from Reward Hacking — includes Claude Code sabotage | 2025
2510.07192 | Poisoning Attacks on LLMs Require Near-Constant Poison Samples | 2025
2503.10965 | Auditing Language Models for Hidden Objectives | 2025
2207.05221 | Language Models (Mostly) Know What They Know — Kadavath, Conerly, Askell et al. | 2022
2112.00861 | A General Language Assistant as a Laboratory for Alignment — Askell, Bai et al. | 2021
1606.06565 | Concrete Problems in AI Safety — Amodei, Olah, Steinhardt et al. (pre-Anthropic) | 2016
2601.20245 | How AI Impacts Skill Formation — 17% lower scores with AI assistance | 2026

Transformer Circuits Thread (transformer-circuits.pub)

Model cards & system cards

Model | Date | Link
Claude 3 Family (Opus, Sonnet, Haiku) | Mar 2024 | PDF
Claude 3.5 Sonnet | Jun 2024 | PDF
Claude 3.7 Sonnet | Feb 2025 | System Card
Claude Opus 4 & Sonnet 4 | May 2025 | PDF
Claude Sonnet 4.6 | Feb 2026 | System Card
Claude Opus 4.6 | Feb 2026 | System Card
All System Cards Index | — | Index Page

Official documentation

Resource | URL
Documentation Home | docs.anthropic.com
Tool Use / Function Calling | docs.anthropic.com/.../tool-use
Extended Thinking | platform.claude.com/.../extended-thinking
Prompt Caching | platform.claude.com/.../prompt-caching
Computer Use Tool | platform.claude.com/.../computer-use-tool
Prompt Engineering Guide | docs.anthropic.com/.../prompt-engineering
Agent SDK Overview | platform.claude.com/.../agent-sdk/overview
Claude Code Memory | code.claude.com/.../memory
Claude Code Hooks | code.claude.com/.../hooks
Claude Code Skills | code.claude.com/.../skills
Claude Code Subagents | code.claude.com/.../sub-agents
GitHub Actions Integration | code.claude.com/.../github-actions
MCP Specification (2025-11-25) | modelcontextprotocol.io/specification
Agent Skills Open Standard | agentskills.io
Responsible Scaling Policy v3.0 | anthropic.com/rsp-v3-0
Transparency Hub | anthropic.com/transparency

Additional technical sources

Type | Reference
Site | How Boris Uses Claude Code — Boris Cherny's exact workflow, session management, parallel instances
Thread | Boris Cherny on X: "I'm Boris and I created Claude Code" — 15-tweet thread on his setup: parallel instances, Opus 4.5 w/ thinking, slash commands, subagents, hooks, MCP servers, verification loops
Article | InfoQ: Inside the Development Workflow of Claude Code's Creator
Paper | Terminal-Bench: Benchmarking LLM Agents (ICLR 2026 conference paper)
Article | Mitigating Prompt Injections in Browser Use (1% ASR defense)
Article | Confidential Inference via Trusted Virtual Machines
Article | Building AI for Cyber Defenders
Site | Anthropic Learning Resources Hub

Key GitHub repositories (github.com/anthropics)

Repository | Description
claude-code | The agentic coding tool
claude-code-action | GitHub Actions integration
claude-code-security-review | AI-powered security review GitHub Action
claude-code-monitoring-guide | ROI measurement guide
claude-agent-sdk-python | Python Agent SDK
claude-agent-sdk-typescript | TypeScript Agent SDK
skills | Public Agent Skills repository
anthropic-sdk-python | Official Python SDK
anthropic-sdk-typescript | Official TypeScript SDK
claudes-c-compiler | 100K-line C compiler built by 16 parallel Claudes
claude-cookbooks | Recipes for common integration patterns
courses | Educational courses (Jupyter notebooks)
prompt-eng-interactive-tutorial | 9-chapter prompt engineering tutorial
evals | Evaluation framework
hh-rlhf | Human preference data for RLHF paper
claude-constitution | Claude's values and behavior document
modelcontextprotocol | MCP specification (separate org)