ai-agents
A Skill for building AI agents: tool use, chains, memory, and autonomous workflows with large language models.
📜 Original English description (for reference)
Building AI agents — tool use, chains, memory, and autonomous workflows with LLMs. Use when user mentions "AI agent", "agent development", "tool use", "function calling", "agent loop", "ReAct pattern", "agent memory", "autonomous agent", "multi-agent", "langchain agents", "crew AI", or building systems where LLMs take actions.
🇯🇵 Notes for Japanese creators
A Skill for building AI agents: tool use, chains, memory, and autonomous workflows with large language models.
Note: supplementary commentary by the jpskill.com editors for Japanese business users. It is reference information, independent of the Skill's actual behavior.
Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, unzip, and install, all automatic.
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o ai-agents.zip https://jpskill.com/download/6058.zip && unzip -o ai-agents.zip && rm ai-agents.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/6058.zip -OutFile "$d\ai-agents.zip"; Expand-Archive "$d\ai-agents.zip" -DestinationPath $d -Force; ri "$d\ai-agents.zip"
When finished, restart Claude Code. Then just ask normally, e.g. "help me build an AI agent", and the Skill activates automatically.
💾 Manual download (if the command line is not for you)
1. Click the blue button below to download ai-agents.zip
2. Double-click the ZIP file to extract it; an ai-agents folder appears
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code
⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.
🎯 What this Skill does
The description below explains what this Skill can do for you. It activates automatically when you ask Claude for help in this domain.
📦 Installation (3 steps)
1. Click the "Download" button above to get the .skill file
2. Rename the extension from .skill to .zip and extract it (macOS can auto-extract)
3. Place the extracted folder in .claude/skills/ under your home folder
   · macOS / Linux: ~/.claude/skills/
   · Windows: %USERPROFILE%\.claude\skills\
Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.
- Last updated: 2026-05-17
- Retrieved: 2026-05-17
- Bundled files: 1
📖 SKILL.md (original text read by Claude)
AI Agents
An AI agent is an LLM connected to tools and running in a loop. The LLM decides what to do, calls a tool, observes the result, and repeats until the task is done. Without the loop and tools, it is just a chatbot.
Core Agent Loop
Every agent follows this pattern:
1. OBSERVE - Receive input (user message or tool result)
2. THINK - LLM reasons about what to do next
3. ACT - Call a tool or return a final answer
4. OBSERVE - Get tool result, go back to step 2
The loop terminates when the LLM decides no more tool calls are needed and returns a final response. A maximum iteration limit prevents runaway loops.
Tool / Function Calling
OpenAI Format
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for a query",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"}
                },
                "required": ["query"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
Anthropic Format
tools = [
    {
        "name": "search_web",
        "description": "Search the web for a query",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,
    tools=tools
)
The model returns a tool_use block (Anthropic) or tool_calls array (OpenAI). Your code executes the tool and feeds the result back as the next message.
ReAct Pattern (Reasoning + Acting)
ReAct interleaves reasoning traces with actions. The LLM explicitly writes out its thinking before each tool call, making the decision process inspectable.
Thought: I need to find the current stock price of AAPL.
Action: search_web("AAPL stock price")
Observation: AAPL is trading at $187.44.
Thought: I have the price. I can answer the user now.
Answer: Apple (AAPL) is currently trading at $187.44.
With modern tool-calling APIs, ReAct happens naturally -- the model reasons in its text output and issues tool calls in structured blocks. You do not need to parse "Action:" strings from raw text anymore.
Minimal Agent in Python
No frameworks. Just API calls and a tool dispatch dictionary.
import anthropic

client = anthropic.Anthropic()

# Define tools
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"Wrote {len(content)} bytes to {path}"

tool_definitions = [
    {
        "name": "read_file",
        "description": "Read a file from disk",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"]
        }
    },
    {
        "name": "write_file",
        "description": "Write content to a file",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["path", "content"]
        }
    }
]

dispatch = {
    "read_file": lambda args: read_file(args["path"]),
    "write_file": lambda args: write_file(args["path"], args["content"]),
}

def run_agent(user_message: str, max_iterations: int = 10):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tool_definitions,
            messages=messages,
        )
        # Append assistant response
        messages.append({"role": "assistant", "content": response.content})
        # Check if the model wants to use tools
        tool_blocks = [b for b in response.content if b.type == "tool_use"]
        if not tool_blocks:
            # No tool calls -- agent is done
            text = "".join(b.text for b in response.content if b.type == "text")
            return text
        # Execute each tool and collect results
        tool_results = []
        for block in tool_blocks:
            try:
                result = dispatch[block.name](block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
            except Exception as e:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"Error: {e}",
                    "is_error": True,
                })
        messages.append({"role": "user", "content": tool_results})
    return "Agent hit max iterations without completing."
Minimal Agent in TypeScript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  {
    name: "search_web",
    description: "Search the web",
    input_schema: {
      type: "object" as const,
      properties: { query: { type: "string" } },
      required: ["query"],
    },
  },
];

async function executeTool(name: string, input: Record<string, unknown>): Promise<string> {
  if (name === "search_web") {
    // Replace with real implementation
    return `Results for: ${input.query}`;
  }
  throw new Error(`Unknown tool: ${name}`);
}

async function runAgent(userMessage: string, maxIterations = 10): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];
  for (let i = 0; i < maxIterations; i++) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      tools,
      messages,
    });
    messages.push({ role: "assistant", content: response.content });
    const toolBlocks = response.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use"
    );
    if (toolBlocks.length === 0) {
      return response.content
        .filter((b): b is Anthropic.TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("");
    }
    const toolResults: Anthropic.ToolResultBlockParam[] = await Promise.all(
      toolBlocks.map(async (block) => {
        try {
          const result = await executeTool(block.name, block.input as Record<string, unknown>);
          return { type: "tool_result" as const, tool_use_id: block.id, content: result };
        } catch (e) {
          return {
            type: "tool_result" as const,
            tool_use_id: block.id,
            content: `Error: ${e}`,
            is_error: true,
          };
        }
      })
    );
    messages.push({ role: "user", content: toolResults });
  }
  return "Agent hit max iterations.";
}
Memory Patterns
Conversation History (Short-Term)
Pass the full message array to each API call. This is the simplest form of memory but hits context window limits on long conversations.
Summarization (Medium-Term)
When the conversation grows too long, summarize older messages. Keep the system prompt and last few exchanges intact, replace everything in between with a summary generated by a separate LLM call.
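A minimal sketch of this pruning step, assuming the caller supplies a `summarize` callable (in practice a separate LLM call); the function name and message shape here are illustrative:

```python
def prune_history(messages, summarize, keep_recent=4):
    """Replace older messages with a single summary message.

    messages: list of {"role", "content"} dicts (system prompt handled separately).
    summarize: callable turning a list of messages into a summary string.
    keep_recent: number of trailing messages to keep verbatim.
    """
    if len(messages) <= keep_recent:
        return messages  # nothing to prune
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(old)
    # Inject the summary as a context message ahead of the recent exchanges
    return [{"role": "user", "content": f"[Summary of earlier conversation] {summary}"}] + recent
```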
Vector Store Retrieval (Long-Term)
Store past interactions or documents as embeddings. Before each LLM call, retrieve the top-k relevant chunks and inject them into the prompt. Use any vector database (Pinecone, ChromaDB, pgvector, Qdrant).
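A toy sketch of the retrieval step, with `embed` left as a pluggable callable; a real system would use an embedding model and one of the vector databases above, so every name here is illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Toy long-term memory: swap embed() for a real embedding model."""
    def __init__(self, embed):
        self.embed = embed
        self.items = []  # (vector, text) pairs

    def add(self, text):
        self.items.append((self.embed(text), text))

    def retrieve(self, query, k=3):
        # Rank stored texts by similarity to the query and return the top k
        qv = self.embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(it[0], qv), reverse=True)
        return [text for _, text in ranked[:k]]
```

The retrieved chunks are then prepended to the prompt before each LLM call.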
Multi-Agent Patterns
Supervisor: One coordinating agent delegates subtasks to specialist agents and synthesizes their outputs.
Debate / Critique: Two agents review each other's work. Agent A drafts, Agent B critiques, Agent A revises. Improves output quality at the cost of more API calls.
Pipeline: Agents are chained sequentially. Agent 1 researches, Agent 2 writes, Agent 3 reviews. Each agent sees only the output of the previous stage.
Parallel Fan-Out: A router sends independent subtasks to multiple agents simultaneously, then merges results.
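The pipeline pattern above can be sketched as a simple fold over stages; `run_pipeline` and the stage functions are hypothetical names, and in practice each stage would wrap a full agent loop:

```python
def run_pipeline(stages, initial_input):
    """Chain agents sequentially: each stage sees only the previous output.

    stages: list of (name, agent_fn) pairs, where agent_fn: str -> str.
    Returns the final output plus a trace of intermediate results.
    """
    output, trace = initial_input, []
    for name, agent_fn in stages:
        output = agent_fn(output)     # each agent transforms the prior output
        trace.append((name, output))  # keep intermediates for debugging
    return output, trace
```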
Common Tools to Give Agents
| Tool | Use Case |
|---|---|
| Web search | Grounding in current information |
| Code execution (sandbox) | Running Python/JS to verify answers |
| File read/write | Persisting work products |
| Shell commands | System operations, git, builds |
| API calls (HTTP) | Interacting with external services |
| Database queries | Reading/writing structured data |
| Browser automation | Scraping, form filling |
Keep tool descriptions concise and specific. Vague descriptions cause the model to misuse tools.
Error Handling and Retry Strategies
- Catch tool errors and return them to the LLM as error messages (see `is_error: true` in the examples above). The model can often self-correct.
- Retry on transient failures (rate limits, network errors) with exponential backoff.
- Set a max iteration limit to prevent infinite loops. 10-20 is typical.
- Validate tool inputs before execution. If the model passes invalid arguments, return a clear error describing the expected format.
- Timeout individual tool calls. A hung web request should not stall the entire agent.
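The backoff advice can be sketched as a generic wrapper; which exception types count as retryable depends on your SDK (e.g. rate-limit errors), so `TimeoutError` below is just a placeholder:

```python
import random
import time

def call_with_retry(fn, max_attempts=5, base_delay=1.0, retryable=(TimeoutError,)):
    """Retry a flaky call with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # Sleep 1x, 2x, 4x, ... the base delay, plus jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```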
Token Management
- Track token usage from API response metadata (`usage.input_tokens`, `usage.output_tokens`).
- Prune conversation history when approaching the context window limit. Keep the system prompt, recent messages, and a summary of older ones.
- Use prompt caching (Anthropic) or cached completions (OpenAI) for repeated prefixes. This reduces cost on long conversations.
- Limit tool output size. Truncate large file contents or API responses before feeding them back.
- Choose model by task. Use a smaller/cheaper model for simple tool dispatch and a larger model for complex reasoning steps.
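Output-size limiting can be as simple as keeping the head and tail of the text; this helper is an illustrative sketch, not a library API:

```python
def truncate_tool_output(text, max_chars=4000):
    """Cap tool output size before feeding it back to the model.

    Keeps the head and tail, which usually carry the most signal
    (headers at the top, errors or summaries at the bottom).
    """
    if len(text) <= max_chars:
        return text
    half = max_chars // 2
    head, tail = text[:half], text[-half:]
    omitted = len(text) - len(head) - len(tail)
    return f"{head}\n... [{omitted} chars truncated] ...\n{tail}"
```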
Guardrails
Input Validation
Validate user inputs before they reach the agent. Check for prompt injection attempts, excessively long inputs, and disallowed content.
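A naive sketch of such input checks; the limit and patterns below are illustrative placeholders, and regex matching alone will not catch determined injection attempts:

```python
import re

MAX_INPUT_CHARS = 8000  # illustrative limit
# Naive patterns; real injection detection needs more than regexes
SUSPICIOUS = [re.compile(p, re.I) for p in [
    r"ignore (all )?previous instructions",
    r"system prompt",
]]

def validate_input(text):
    """Return (ok, reason) before the text reaches the agent."""
    if len(text) > MAX_INPUT_CHARS:
        return False, "Input too long"
    for pat in SUSPICIOUS:
        if pat.search(text):
            return False, "Possible prompt injection"
    return True, ""
```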
Output Filtering
Check agent outputs before returning to the user or executing dangerous operations:
BLOCKED_COMMANDS = ["rm -rf /", "DROP TABLE", "FORMAT C:"]

def validate_tool_call(name: str, args: dict) -> bool:
    if name == "run_shell":
        cmd = args.get("command", "")
        if any(blocked in cmd for blocked in BLOCKED_COMMANDS):
            return False
    return True
Human-in-the-Loop
For high-stakes actions (sending emails, making purchases, modifying production data), pause and ask for human approval before executing the tool. Return a rejection message to the LLM if the user declines.
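One way to sketch this approval gate, with `execute` and `approve` injected as callables (e.g. a CLI prompt or UI dialog); the tool names are hypothetical:

```python
HIGH_STAKES_TOOLS = {"send_email", "make_purchase", "delete_record"}  # example names

def gated_execute(name, args, execute, approve):
    """Run a tool only after human approval for high-stakes actions.

    execute: callable(name, args) -> str, the real tool dispatcher.
    approve: callable(name, args) -> bool, the human confirmation hook.
    """
    if name in HIGH_STAKES_TOOLS and not approve(name, args):
        # Feed the refusal back to the LLM as a tool result
        return f"Action '{name}' rejected by user. Ask before retrying."
    return execute(name, args)
```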
Frameworks Overview
| Framework | Language | Key Strength |
|---|---|---|
| LangChain | Python/JS | Large ecosystem, many integrations |
| LangGraph | Python/JS | Stateful, graph-based agent workflows |
| CrewAI | Python | Multi-agent role-based collaboration |
| AutoGen | Python | Multi-agent conversation patterns |
| Claude Agent SDK | Python | Lightweight agent loop with Claude |
| Vercel AI SDK | TypeScript | Streaming-first, React integration |
| Mastra | TypeScript | Agent framework with built-in memory/tools |
Start without a framework. Add one when you need features you are reimplementing (state persistence, complex routing, built-in tool libraries). Frameworks add abstraction layers that make debugging harder.
Evaluation
Testing agents is harder than testing deterministic code. Strategies:
- Unit test individual tools. Each tool function should be testable in isolation.
- Golden path tests. Define input/expected-output pairs and check that the agent reaches the correct final answer. Allow for variation in intermediate steps.
- Tool call assertions. Verify the agent called the right tools in a reasonable order, even if the exact arguments vary.
- Adversarial inputs. Test with confusing, ambiguous, or adversarial prompts to verify guardrails hold.
- Cost tracking. Log token usage per test case. A test that suddenly uses 10x more tokens indicates a regression in agent efficiency.
- Human evaluation. For open-ended tasks, have humans rate agent outputs on correctness, helpfulness, and safety.
def test_research_agent():
    result = run_agent("What is the population of Tokyo?")
    assert "13" in result or "14" in result  # millions, approximately
    # Check that web search was called
    assert any("search_web" in str(m) for m in recorded_messages)
Common Agent Patterns
Research Agent
Tools: web search, URL reader, note-taking. The agent searches for information, reads pages, extracts facts, and compiles a report. Useful for market research, literature review, competitive analysis.
Coding Agent
Tools: file read/write, shell execution, web search. The agent reads existing code, plans changes, writes code, runs tests, and iterates on failures. Key design decision: sandbox the execution environment.
Data Analysis Agent
Tools: code execution (Python with pandas/numpy), file read, chart generation. The agent loads data, explores it, runs statistical analysis, and generates visualizations. Give it a Python sandbox with data science libraries pre-installed.
Customer Support Agent
Tools: knowledge base search, ticket system API, escalation. The agent retrieves relevant documentation, answers questions, and escalates when confidence is low or the request requires human judgment.
Workflow Automation Agent
Tools: email, calendar, project management APIs. The agent performs multi-step business processes (schedule meetings, send follow-ups, update tasks). Always use human-in-the-loop for actions with external side effects.