Anthropic Agent Building Blogs - English | 水码

🚀

Module 1: Agent Foundation

Beginner

Goal: Understand the minimum viable Agent architecture
Learn core Agent concepts, the ReAct pattern, Tool Use, and Planning, and build an Agent mental model

1

Building Effective AI Agents

Agent Architecture Intro: ReAct / Tool Use / Planning

Over the past year, we've worked with dozens of teams building large language model (LLM) agents across industries. The most successful implementations weren't using complex frameworks, but simple, composable patterns.

AI Agent / Agent Mental Model
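
The ReAct pattern named in this card can be sketched as a loop that alternates a model decision with a tool call and feeds each observation back into the history. This is only an illustration under stated assumptions: `fake_model`, `TOOLS`, and `run_agent` are hypothetical names, and no real LLM or SDK is called.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def fake_model(history: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM call (assumption: a real agent queries a model here).

    Returns (tool_name, argument) to act, or ("final", answer) to stop.
    """
    observations = [h for h in history if h.startswith("observation:")]
    if not observations:
        # No tool result yet, so ask for one.
        return ("add", "2+3")
    # A tool result exists, so report it as the final answer.
    return ("final", observations[-1].split(":", 1)[1].strip())


def run_agent(task: str, max_steps: int = 5) -> str:
    """Alternate model decisions and tool calls until a final answer or the step limit."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action, arg = fake_model(history)
        if action == "final":
            return arg
        result = TOOLS[action](arg)  # act
        history.append(f"action: {action}({arg})")
        history.append(f"observation: {result}")  # observe
    return "stopped: step limit reached"
```

The step limit is a deliberate design choice in this sketch: it keeps a misbehaving loop from running forever, which matters once the stub is replaced by a real model.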

2

Building agents with the Claude Agent SDK

Build your first Agent with the SDK

Last year, we shared lessons in building effective agents alongside our customers. Since then, we've released Claude Code, an agentic coding solution that we originally built to support developer productivity at Anthropic.

Agent SDK / Quick Start

🛠️

Module 2: Tools & Capability Extension

Intermediate

Goal: Give the Agent "action capability"
Master parallel tool calls, nested calls, and error handling; learn tool design principles and capability modularization

3

Introducing advanced tool use on the Claude Developer Platform

Parallel / Nested / Error Handling

The future of AI agents is one where models work seamlessly across hundreds or thousands of tools. An IDE assistant that integrates git operations, file manipulation, package managers, testing frameworks, and deployment pipelines.

Tool Use / Complex Tool Calls

4

Writing effective tools for AI agents—using AI agents

Agent Tool Design Principles

The Model Context Protocol (MCP) can empower LLM agents with potentially hundreds of tools to solve real-world tasks. But how do we make those tools maximally effective?

Tools / Tools = Capability Ceiling

5

The "think" tool: Enabling Claude to stop and think

Explicit Reasoning Control

As we continue to enhance Claude's complex problem-solving abilities, we've discovered a particularly effective approach: a "think" tool that creates dedicated space for structured thinking during complex tasks.

Think Tool / Decision Stability
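
A "think" tool of the kind this card describes can be sketched as a tool that takes a single free-text argument and has no side effects beyond recording it. The layout below (`name`, `description`, `input_schema`) is an assumption modeled on JSON-Schema-style tool definitions, the description text is illustrative wording, and `handle_think` is a hypothetical handler, not the article's implementation.

```python
from typing import Any, Dict, List

# Sketch of a tool definition (assumed layout, illustrative wording).
THINK_TOOL: Dict[str, Any] = {
    "name": "think",
    "description": (
        "Record a reasoning step. This does not fetch new information "
        "or change any state; it only notes the thought."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {"type": "string", "description": "The reasoning step to record."}
        },
        "required": ["thought"],
    },
}


def handle_think(tool_input: Dict[str, Any], log: List[str]) -> str:
    """Hypothetical handler: store the thought and return a short acknowledgment."""
    log.append(tool_input["thought"])
    return "ok"
```

Because the handler only appends to a log, calling it is cheap and safe; the value comes from giving the model a dedicated place to reason between other tool calls.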

6

Equipping agents for the real world with Agent Skills

Skills Abstraction & Reuse

As model capabilities improve, we can now build general-purpose agents that interact with full-fledged computing environments. Claude Code, for example, can accomplish complex tasks across domains using local code execution and filesystems.

Agent Skills / Capability Modularization

7

Claude Desktop Extensions: One-click MCP server installation

Skills + MCP Server to Extend the Agent

When we released the Model Context Protocol (MCP) last year, we saw developers build amazing local servers that gave Claude access to everything from file systems to databases.

MCP / System-level Extension

🧠

Module 3: Context & Memory Management

Core

Goal: Solve "memory & attention" problems in long tasks
Learn context structure design and context-aware RAG to keep long conversations stable and make retrieval serve the task

8

Effective context engineering for AI agents

Context Structure Design

After a few years of prompt engineering being the focus of attention in applied AI, a new term has come to prominence: context engineering. Building with language models is becoming less about finding the right words.

Context Engineering / Long Conversation Stability

9

Contextual Retrieval in AI Systems

Context-aware RAG

For an AI model to be useful in specific contexts, it often needs access to background knowledge. Developers typically enhance an AI model's knowledge using Retrieval-Augmented Generation (RAG).

RAG / Retrieval Serving Tasks
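
The basic RAG flow mentioned in this card (retrieve relevant background text, then place it in the prompt) can be sketched as follows. This is a toy illustration, not the article's method: ranking by word overlap stands in for the embedding or BM25 search a real system would likely use, and `tokenize`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: List[str], k: int = 1) -> str:
    """Place the retrieved text ahead of the question in a single prompt string."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

For example, with one document about a refund window and one about shipping times, a question about refunds retrieves only the refund document, so the prompt carries the background the model needs and nothing else.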

⚡

Module 4: Long Tasks & Multi-Agent

Advanced

Goal: Agent Systematization
Master long-task execution frameworks, interruption recovery, state persistence, and multi-Agent collaboration architecture

10

Effective harnesses for long-running agents

Long-task Execution Framework

As AI agents become more capable, developers are increasingly asking them to take on complex tasks requiring work that spans hours, or even days. Getting agents to make consistent progress across multiple context windows remains an open problem.

Long-running / Interruption Recovery

11

How we built our multi-agent research system

Multi-Agent Collaboration Architecture

Claude now has Research capabilities that allow it to search across the web, Google Workspace, and any integrations to accomplish complex tasks. The journey from prototype to production taught us critical lessons.

Multi-Agent / Role Division

12

Code execution with MCP: building more efficient AI agents

Agent Execution Environment

The Model Context Protocol (MCP) is an open standard for connecting AI agents to external systems. Connecting agents to tools and data traditionally requires a custom integration for each pairing.

MCP / Safe & Efficient Execution

🏭

Module 5: Security, Evaluation & Engineering

Production

Goal: Production-ready & Scalable
Learn Agent evaluation methodology, sandboxing & permission isolation, production practices, and real incident postmortems

13

Demystifying evals for AI agents

Agent Evaluation Methodology

Good evaluations help teams ship AI agents more confidently. Without them, it's easy to get stuck in reactive loops—catching issues only in production, where fixing one failure creates others.

Evals / Measurable

14

Making Claude Code more secure and autonomous with sandboxing

Sandboxing & Permission Isolation

In Claude Code, Claude writes, tests, and debugs code alongside you, navigating your codebase, editing multiple files, and running commands to verify its work. Giving Claude this much access can introduce risks.

Sandboxing / Security Boundaries

15

Claude Code Best Practices

Coding Agent Engineering Experience

We recently released Claude Code, a command line tool for agentic coding. Developed as a research project, Claude Code gives Anthropic engineers and researchers a more native way to integrate Claude into their coding workflows.

Claude Code / Production Practices

16

A postmortem of three recent issues

Real Incident Postmortem

Between August and early September, three infrastructure bugs intermittently degraded Claude's response quality. We've now resolved these issues and want to explain what happened.

Infrastructure / Pitfall Guide

17

Claude SWE-Bench Performance

How Anthropic Teams Use Claude

Our latest model, the upgraded Claude 3.5 Sonnet, achieved 49% on SWE-bench Verified, a software engineering evaluation, beating the previous state-of-the-art model's 45%.

SWE-Bench / Org-level Agent Experience