
resilience-analysis

Assess error handling, isolation boundaries, and recovery mechanisms in agent frameworks. Use when (1) tracing error propagation paths, (2) evaluating sandboxing for code execution, (3) understanding retry and fallback mechanisms, (4) assessing production readiness, or (5) identifying failure modes and recovery patterns.

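⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, extracts it, and places it in the skills folder automatically.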

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o resilience-analysis.zip https://jpskill.com/download/18860.zip && unzip -o resilience-analysis.zip && rm resilience-analysis.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/18860.zip -OutFile "$d\resilience-analysis.zip"; Expand-Archive "$d\resilience-analysis.zip" -DestinationPath $d -Force; ri "$d\resilience-analysis.zip"

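When it finishes, restart Claude Code. The skill then triggers automatically when you make a relevant request in plain language, for example "Assess the error handling in this agent framework."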

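💾 Manual download (if you prefer not to use the command line)

1. Download resilience-analysis.zip using the download button on this page
2. Double-click the ZIP file to extract it; this creates a resilience-analysis folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code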


⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the skill.

🎯 What this Skill does

The description at the top of this page explains what this Skill does for you. When you ask Claude for help in this area, the skill triggers automatically.

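📦 Installation (3 steps)

1. Use the download button on this page to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
   - macOS / Linux: ~/.claude/skills/
   - Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for relevant requests.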


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

📜 Original SKILL.md (the English/Chinese source that Claude reads)

Resilience Analysis

Assesses error handling and isolation boundaries.

Process

1. Trace error propagation: map exception flow from tools to agent
2. Identify isolation: find sandbox mechanisms for dangerous operations
3. Catalog recovery: list retry logic, fallbacks, and circuit breakers
4. Assess boundaries: determine which crashes propagate and which are contained

Error Propagation Analysis

Questions to Answer

Does a tool exception terminate the agent?
Are LLM API errors retried automatically?
Is parsing failure (malformed output) recoverable?
What happens when state updates fail?

Propagation Patterns

Crash Propagation (Dangerous)

def run_tool(self, tool, args):
    return tool.execute(args)  # Exception bubbles up

Exception Wrapping

def run_tool(self, tool, args):
    try:
        return tool.execute(args)
    except Exception as e:
        raise ToolExecutionError(tool.name, e) from e

Error Containment

def run_tool(self, tool, args):
    try:
        return ToolResult(success=True, output=tool.execute(args))
    except Exception as e:
        return ToolResult(success=False, error=str(e))
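
The containment pattern can be tried end to end with a small, self-contained sketch. The `ToolResult` fields follow the snippet above; the dataclass definition, the callable-based tool interface, and the `divide` tool are assumptions made for illustration, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolResult:
    """Outcome of one tool call (field names follow the snippet above)."""
    success: bool
    output: Any = None
    error: Optional[str] = None


def run_tool(tool: Callable[..., Any], args: dict) -> ToolResult:
    """Run a tool and convert any exception into a failed ToolResult."""
    try:
        return ToolResult(success=True, output=tool(**args))
    except Exception as exc:  # contain: a tool failure must not crash the loop
        # Keep the exception type so the LLM (or a log) can tell errors apart.
        return ToolResult(success=False, error=f"{type(exc).__name__}: {exc}")


def divide(a: float, b: float) -> float:
    """Hypothetical tool used only for this example."""
    return a / b


ok = run_tool(divide, {"a": 6, "b": 3})
failed = run_tool(divide, {"a": 1, "b": 0})
print(ok.success, failed.success, failed.error)
```

Returning the error as data lets the agent loop feed it back to the LLM, at the cost of losing the traceback unless it is logged separately.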

Propagation Map Template

User Input
    ↓
┌─────────────────────────────────────────┐
│ Agent Loop                              │
│   ↓                                     │
│ ┌─────────────────────────────────────┐ │
│ │ LLM Call                            │ │
│ │ • APIError → [Retry 3x / Propagate] │ │
│ │ • RateLimit → [Backoff / Propagate] │ │
│ │ • Timeout → [Retry / Propagate]     │ │
│ └─────────────────────────────────────┘ │
│   ↓                                     │
│ ┌─────────────────────────────────────┐ │
│ │ Output Parsing                      │ │
│ │ • ParseError → [Retry / Contained]  │ │
│ │ • ValidationError → [Contained]     │ │
│ └─────────────────────────────────────┘ │
│   ↓                                     │
│ ┌─────────────────────────────────────┐ │
│ │ Tool Execution                      │ │
│ │ • ToolError → [Feedback to LLM]     │ │
│ │ • Timeout → [Kill / Continue]       │ │
│ │ • SecurityError → [Propagate]       │ │
│ └─────────────────────────────────────┘ │
└─────────────────────────────────────────┘
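
One way to make a filled-in map checkable is to encode it as a policy table keyed by exception class. The sketch below is an assumption about how that could look: the exception classes are stand-ins for a framework's own types, and `Action`, `POLICY`, and `action_for` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"
    FEEDBACK = "feedback to LLM"
    PROPAGATE = "propagate"


# Hypothetical exception types standing in for a framework's own classes.
class APIError(Exception): ...
class RateLimitError(APIError): ...
class ParseError(Exception): ...
class ToolError(Exception): ...
class SecurityError(Exception): ...


# One row per entry in the propagation map.
POLICY = {
    RateLimitError: Action.RETRY,
    APIError: Action.RETRY,
    ParseError: Action.FEEDBACK,
    ToolError: Action.FEEDBACK,
    SecurityError: Action.PROPAGATE,
}


def action_for(exc: BaseException) -> Action:
    """Return the action for the most specific matching class."""
    for cls in type(exc).__mro__:
        if cls in POLICY:
            return POLICY[cls]
    # Unknown errors propagate, so nothing is swallowed silently.
    return Action.PROPAGATE


print(action_for(RateLimitError()), action_for(KeyError("missing")))
```

Defaulting unknown exceptions to propagate is a design choice: an unmapped failure surfaces during review instead of being retried or hidden.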

Sandboxing Mechanisms

Code Execution Isolation

| Mechanism | Safety Level | Performance | Complexity |
|-----------|--------------|-------------|------------|
| None | ⚠️ Dangerous | Fast | None |
| RestrictedPython | Medium | Fast | Low |
| AST Validation | Low | Fast | Medium |
| Subprocess | Medium | Overhead | Low |
| Docker/Container | High | High overhead | Medium |
| gVisor/Firecracker | Very High | Medium overhead | High |

Detection Patterns

No Sandboxing

exec(user_code)  # Direct execution
eval(expression)  # Direct eval
subprocess.run(cmd, shell=True)  # Shell injection risk

Basic Sandboxing

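# RestrictedPython: compile to bytecode under a restricted policy.
# The result must still be exec'd with restricted globals to be of any use.
from RestrictedPython import compile_restricted
code = compile_restricted(user_code, '<string>', 'exec')

# AST validation: reject code before it runs
import ast
tree = ast.parse(user_code)
if has_dangerous_nodes(tree):  # project-specific checker, not a library call
    raise SecurityError()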
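
The `has_dangerous_nodes` helper is not defined in the snippet above. A minimal sketch of what such a checker might look like follows; the deny-lists are assumptions chosen for illustration, and a deny-list like this is easy to bypass, which is consistent with the Low safety rating given to AST validation in the table.

```python
import ast

# Assumed deny-lists for illustration; a real policy would need to be stricter.
BANNED_CALLS = {"eval", "exec", "compile", "open", "__import__"}
BANNED_NODES = (ast.Import, ast.ImportFrom)


def has_dangerous_nodes(tree: ast.AST) -> bool:
    """Return True if the tree contains an import, a banned call, or dunder access."""
    for node in ast.walk(tree):
        if isinstance(node, BANNED_NODES):
            return True
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BANNED_CALLS):
            return True
        # Dunder attribute access is a common sandbox-escape route.
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return True
    return False


print(has_dangerous_nodes(ast.parse("import os")))
print(has_dangerous_nodes(ast.parse("total = 1 + 2")))
```

When reviewing a framework, the useful question is which of these node types its own checker misses.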

Process Isolation

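# Subprocess with limits
import subprocess
result = subprocess.run(
    ['python', '-c', user_code],
    timeout=30,  # child is killed if it runs longer than 30 seconds
    capture_output=True,
    user='nobody'  # Drop privileges (Python 3.9+, POSIX only; parent must be allowed to switch user)
)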
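
A framework that uses process isolation also has to decide what happens when the child times out or exits with an error. The following sketch shows one possible containment wrapper; `run_isolated` and its result dictionary are hypothetical, and the privilege drop is omitted here because it needs a parent process that is allowed to switch users.

```python
import subprocess
import sys


def run_isolated(user_code: str, timeout: float = 5.0) -> dict:
    """Run code in a child interpreter and report the outcome as data."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", user_code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout,  # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return {"success": False, "error": f"timed out after {timeout}s"}
    if proc.returncode != 0:
        return {"success": False, "error": proc.stderr.strip()}
    return {"success": True, "output": proc.stdout}


print(run_isolated("print(1 + 1)"))
print(run_isolated("import time; time.sleep(10)", timeout=0.5))
```

A timeout alone does not limit memory, file system, or network access, so this still sits at the Medium level in the table.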

Container Isolation

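import docker
client = docker.from_env()
# With the default detach=False, run() blocks and returns the container's logs (bytes)
output = client.containers.run(
    'python:3.11-slim',
    command=['python', '-c', user_code],
    mem_limit='256m',       # cap memory
    network_disabled=True,  # no network access
    remove=True             # delete the container when it exits
)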

Recovery Patterns

Retry Logic

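# Simple retry (retry and exponential stand for a decorator and backoff
# policy supplied by the framework or a retry library)
@retry(max_attempts=3, backoff=exponential)
def call_llm(self, prompt):
    return self.client.generate(prompt)

# Retry with error feedback
def call_with_retry(self, prompt, max_retries=3):
    errors = []
    attempt_prompt = prompt
    for _ in range(max_retries):
        try:
            return self.llm.generate(attempt_prompt)
        except ParseError as e:
            errors.append(str(e))
            # Rebuild from the original prompt so earlier errors are not duplicated
            attempt_prompt = f"{prompt}\n\nPrevious errors: {errors}"
    raise MaxRetriesExceeded(errors)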
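
The `@retry` decorator above is shown without a definition. As a rough sketch of what such a decorator could do, the version below retries with exponential backoff and a little jitter. The parameter names (`base_delay`, `retry_on`, `sleep`) are assumptions and differ from the snippet's `backoff=exponential` argument; `flaky_call` is a hypothetical function that fails twice before succeeding.

```python
import functools
import random
import time


def retry(max_attempts=3, base_delay=0.5, retry_on=(Exception,), sleep=time.sleep):
    """Retry the wrapped function, doubling the delay after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: let the caller see the error
                    # Delays grow as base, 2*base, 4*base, plus up to 10% jitter.
                    delay = base_delay * 2 ** (attempt - 1)
                    sleep(delay + random.uniform(0, delay / 10))
        return wrapper
    return decorator


attempts = []


@retry(max_attempts=3, base_delay=0.01, retry_on=(ConnectionError,))
def flaky_call():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"


print(flaky_call(), len(attempts))
```

Restricting `retry_on` to transient error types matters when auditing: retrying on every `Exception` can mask programming errors and multiply side effects.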

Fallback Mechanisms

def generate(self, prompt):
    try:
        return self.primary_llm.generate(prompt)
    except APIError:
        return self.fallback_llm.generate(prompt)
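
The two-model fallback generalizes to an ordered list of providers. The sketch below is one possible shape under stated assumptions: providers are plain callables, `AllProvidersFailed` and `generate_with_fallback` are hypothetical names, and only connection and timeout errors trigger a fallback.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} provider(s) failed")
        self.errors = errors


def generate_with_fallback(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
    fallback_on=(ConnectionError, TimeoutError),
) -> str:
    """Try each provider in order; return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except fallback_on as exc:
            errors.append(exc)  # remember why this provider was skipped
    raise AllProvidersFailed(errors)


def primary(prompt: str) -> str:
    raise ConnectionError("primary unavailable")  # simulated outage


def secondary(prompt: str) -> str:
    return f"echo: {prompt}"


print(generate_with_fallback("hi", [primary, secondary]))
```

Collecting the skipped errors keeps the failure of the primary visible even when the fallback succeeds or, in the worst case, when every provider fails.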

Circuit Breaker

import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=60):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds before a trial call
        self.failures = 0
        self.state = 'closed'
        self.last_failure = None

    def call(self, func, *args):
        if self.state == 'open':
            if time.time() - self.last_failure > self.reset_timeout:
                self.state = 'half-open'  # allow one trial call
            else:
                raise CircuitOpen()

        try:
            result = func(*args)
            self.failures = 0
            self.state = 'closed'
            return result
        except Exception:
            self.failures += 1
            self.last_failure = time.time()
            if self.failures >= self.failure_threshold:
                self.state = 'open'
            raise
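
To check the closed, open, and half-open transitions without waiting for a real timeout, a variant with an injectable clock can help. This is a sketch under assumptions, not the pattern as any framework implements it: the `clock` parameter, the `opened_at` attribute, and the `CircuitOpen` definition are introduced here for illustration.

```python
import time


class CircuitOpen(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=60, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so tests can control time (assumption)
        self.failures = 0
        self.state = "closed"
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpen()
            self.state = "half-open"  # timeout elapsed: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call re-opens the circuit immediately.
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.state = "closed"
        return result


def failing():
    raise RuntimeError("backend down")


now = [0.0]  # fake clock value, advanced manually
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60, clock=lambda: now[0])

for _ in range(2):
    try:
        breaker.call(failing)
    except RuntimeError:
        pass
state_after_failures = breaker.state

now[0] = 61.0  # move past the reset timeout
result = breaker.call(lambda: "recovered")
print(state_after_failures, breaker.state, result)
```

Using a monotonic clock by default avoids surprises when the wall clock is adjusted while the circuit is open.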

Output Template

## Resilience Analysis: [Framework Name]

### Error Propagation Map

| Error Source | Error Type | Handling | Propagates? |
|--------------|-----------|----------|-------------|
| LLM API | RateLimitError | Retry 3x with backoff | No |
| LLM API | APIError | Retry 1x | Yes |
| Parser | ParseError | Feed back to LLM | No |
| Tool | Exception | Wrap and feed to LLM | No |
| Tool | Timeout | Kill process | No |
| State | ValidationError | Propagate | Yes |

### Sandboxing Assessment
- **Code Execution**: [Mechanism or None]
- **File System**: [Isolated/Restricted/Open]
- **Network**: [Blocked/Filtered/Open]
- **Resource Limits**: [Memory/CPU/Time limits]

### Recovery Mechanisms

| Pattern | Implementation | Location |
|---------|---------------|----------|
| Retry | Exponential backoff, 3 attempts | llm.py:L45 |
| Fallback | Secondary model | agent.py:L120 |
| Circuit Breaker | None | - |

### Risk Assessment
- **Critical Gaps**: [List any missing protections]
- **Production Ready**: [Yes/No/Needs work]

Integration

Prerequisite: codebase-mapping to identify execution code
Feeds into: antipattern-catalog for error handling issues
Related: execution-engine-analysis for async error handling