🛠️ Development / MCP · Community

verification-protocol

Independent verification of task completion - eliminates self-attestation

⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into Terminal (Mac/Linux) or PowerShell (Windows). It downloads, extracts, and places the Skill automatically.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o verification-protocol.zip https://jpskill.com/download/17921.zip && unzip -o verification-protocol.zip && rm verification-protocol.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/17921.zip -OutFile "$d\verification-protocol.zip"; Expand-Archive "$d\verification-protocol.zip" -DestinationPath $d -Force; ri "$d\verification-protocol.zip"

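After it finishes, restart Claude Code. The Skill then triggers automatically when you make a related request in plain language, for example asking Claude to verify that a task is really complete.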

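💾 Manual download (for those who find the command difficult)

1. Press the download button on this page to download verification-protocol.zip
2. Double-click the ZIP file to extract it; this creates a verification-protocol folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code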


⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

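🎯 What this Skill does

The description below explains what this Skill does for you. When you ask Claude for help in this area, it triggers automatically.

📦 Installation (3 steps)

1. Press the download button on this page to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.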


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1


📜 Original SKILL.md (the English text Claude reads)

Verification Protocol - Elimination of Self-Attestation

Purpose

The Problem: Agents were verifying their own work and always returning success: true by default.

The Solution: Independent verification by a DIFFERENT agent that does NOT trust the original agent's claims.

The Rule: verified=true ONLY when EVIDENCE proves all completion criteria are met.

Core Principle

NEVER verify your own work.
ALWAYS verify with independent evidence.
ASSUME claims are false until proven true.
Block completion without proof.
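One way to make the first principle mechanical is to reject any verification where the verifier and the requesting agent share an ID. The sketch below assumes each agent carries a string ID. The `assertIndependent` helper and the `VerificationRequest` type are hypothetical names, modeled on the `requesting_agent_id` and `verifier_agent_id` fields used elsewhere in this document.

```typescript
// Hypothetical guard: names are illustrative, not part of the protocol's actual API.
interface VerificationRequest {
  task_id: string;
  requesting_agent_id: string;
}

// Throws when an agent is asked to verify its own task.
function assertIndependent(request: VerificationRequest, verifierAgentId: string): void {
  if (request.requesting_agent_id === verifierAgentId) {
    throw new Error(
      `Self-attestation blocked: agent ${verifierAgentId} cannot verify task ${request.task_id}`
    );
  }
}
```

Calling this guard before any check runs means a self-verification attempt fails loudly instead of quietly producing `verified: true`.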

Verification Protocol

Step 1: Task Completion Claim

Agent claims task is complete and provides:

{
  "task_id": "task-123",
  "claimed_outputs": ["/path/to/file.ts", "/path/to/test.ts"],
  "completion_criteria": [
    "file_exists:/path/to/file.ts",
    "no_placeholders:/path/to/file.ts",
    "typescript_compiles:/path/to/file.ts",
    "lint_passes:/path/to/file.ts",
    "tests_pass:/path/to/test.ts"
  ]
}
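The criteria in the claim above look like `name:target` strings. As a minimal sketch, assuming that format holds for every criterion, a verifier could parse them as follows. The `CompletionClaim` and `Criterion` types and the `parseCriterion` helper are illustrative names, not part of the original protocol.

```typescript
// Illustrative types; the field names follow the JSON claim above.
interface CompletionClaim {
  task_id: string;
  claimed_outputs: string[];
  completion_criteria: string[];
}

interface Criterion {
  name: string;   // e.g. "file_exists"
  target: string; // e.g. "/path/to/file.ts"
}

// Splits on the first colon only, so targets that contain colons (URLs, Windows paths) stay intact.
function parseCriterion(raw: string): Criterion {
  const idx = raw.indexOf(":");
  if (idx <= 0 || idx === raw.length - 1) {
    throw new Error(`Malformed criterion: "${raw}"`);
  }
  return { name: raw.slice(0, idx), target: raw.slice(idx + 1) };
}

const claim: CompletionClaim = {
  task_id: "task-123",
  claimed_outputs: ["/path/to/file.ts"],
  completion_criteria: ["file_exists:/path/to/file.ts", "no_placeholders:/path/to/file.ts"],
};
const parsed: Criterion[] = claim.completion_criteria.map(parseCriterion);
```

Rejecting malformed criteria up front keeps a vague claim from slipping through as an unchecked item.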

Step 2: Independent Verification Requested

Orchestrator sends to Independent Verifier Agent (different agent).

Step 3: Verification Execution

Independent Verifier checks EVERY criterion with actual evidence:

file_exists → fs.stat(path) && size > 0
  Proof: /path/to/file.ts, 1,247 bytes, modified 2025-12-02T14:30:00Z

no_placeholders → Scan for TODO, TBD, FIXME, [INSERT], [IMPLEMENT]
  Proof: 0 placeholders found

typescript_compiles → npx tsc --noEmit [file]
  Proof: Compilation successful, 0 errors

lint_passes → npx eslint [file]
  Proof: 0 linting errors

tests_pass → npm test -- [file]
  Proof: 15 tests passed, 0 failed

Step 4: Verification Result Returned

{
  "verified": true,
  "evidence": [
    {
      "criterion": "file_exists:/path/to/file.ts",
      "method": "fs.stat(path) && size > 0",
      "result": "pass",
      "proof": "File: /path/to/file.ts, Size: 1247 bytes"
    },
    // ... more evidence ...
  ],
  "failures": [],
  "verifier_agent_id": "independent-verifier-1",
  "timestamp": "2025-12-02T14:30:00Z"
}
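The rule that `verified` is true only when evidence covers every criterion can be written as a small pure function. This is a sketch under stated assumptions: the `buildResult` helper and its types are hypothetical, and the failure shape (`criterion`, `reason`, `proof`) mirrors the failure report shown later in this document.

```typescript
interface EvidenceItem {
  criterion: string;
  method: string;
  result: "pass" | "fail";
  proof: string;
}

interface Failure {
  criterion: string;
  reason: string;
  proof: string;
}

interface VerificationResult {
  verified: boolean;
  evidence: EvidenceItem[];
  failures: Failure[];
  verifier_agent_id: string;
  timestamp: string;
}

// verified is true only if every requested criterion has passing evidence with a non-empty proof.
function buildResult(
  criteria: string[],
  evidence: EvidenceItem[],
  verifierAgentId: string,
  now: Date = new Date()
): VerificationResult {
  const failures: Failure[] = [];
  criteria.forEach((criterion) => {
    const item = evidence.filter((e) => e.criterion === criterion)[0];
    if (!item || item.proof.trim() === "") {
      failures.push({ criterion, reason: "No evidence collected", proof: "" });
    } else if (item.result !== "pass") {
      failures.push({ criterion, reason: "Check failed", proof: item.proof });
    }
  });
  return {
    verified: criteria.length > 0 && failures.length === 0,
    evidence,
    failures,
    verifier_agent_id: verifierAgentId,
    timestamp: now.toISOString(),
  };
}
```

Note that a criterion with no matching evidence counts as a failure, and an empty criteria list never verifies, so an unchecked item can never default to success.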

Step 5: Task Status Updated

verified=true → Task marked COMPLETE, evidence logged
verified=false → Task returned to agent with failure list
- Agent has 3 attempts to fix and re-submit
- After 3 failures → ESCALATE TO HUMAN REVIEW

Verification Methods

File Verification

Method: fs.existsSync(path) && fs.statSync(path).size > 0
Evidence: File path, size in bytes, last modified timestamp
Failure Triggers:
- File does not exist
- File is empty (0 bytes)
- File not accessible (permission error)
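The file check above can be sketched with Node's synchronous `fs` API. The `checkFileExists` helper and the `FileEvidence` type are hypothetical names. Treating a directory as a failure is an added assumption, since the original only lists missing, empty, and inaccessible files.

```typescript
import * as fs from "node:fs";

interface FileEvidence {
  result: "pass" | "fail";
  proof: string;
}

// Collects path, size, and modification time as evidence.
// Any stat error (missing file, permission denied) is reported as a failure with its error code.
function checkFileExists(filePath: string): FileEvidence {
  try {
    const stats = fs.statSync(filePath);
    if (!stats.isFile()) {
      return { result: "fail", proof: `${filePath} is not a regular file` };
    }
    if (stats.size === 0) {
      return { result: "fail", proof: `${filePath} is empty (0 bytes)` };
    }
    return {
      result: "pass",
      proof: `${filePath}, ${stats.size} bytes, modified ${stats.mtime.toISOString()}`,
    };
  } catch (err) {
    const code = (err as { code?: string }).code ?? "UNKNOWN";
    return { result: "fail", proof: `${filePath} not accessible: ${code}` };
  }
}
```

Every branch returns a proof string, so a failure is as well documented as a success.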

Placeholder Detection

Method: Regex scan for TODO, TBD, FIXME, [INSERT], [IMPLEMENT]
Evidence: Count and line numbers of placeholders found
Failure Triggers:
- ANY placeholder found (not "looks complete enough")
- Incomplete implementation markers remain
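A minimal sketch of the placeholder scan follows. The marker list comes from the method description above; the case-sensitive, whole-word matching and the `findPlaceholders` name are assumptions of this example.

```typescript
interface PlaceholderHit {
  line: number; // 1-based line number
  marker: string;
}

// Whole-word TODO/TBD/FIXME, plus the bracketed [INSERT] and [IMPLEMENT] markers.
const PLACEHOLDER_PATTERN = /\b(?:TODO|TBD|FIXME)\b|\[(?:INSERT|IMPLEMENT)\]/g;

function findPlaceholders(source: string): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    const matches = text.match(PLACEHOLDER_PATTERN);
    if (matches) {
      matches.forEach((marker) => hits.push({ line: index + 1, marker }));
    }
  });
  return hits;
}
```

The criterion passes only when the returned array is empty. The hits supply the count and line numbers required as evidence.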

TypeScript Compilation

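Method: npx tsc --noEmit [file]
Evidence: Compiler output, error count, error details
Failure Triggers:
- Compilation errors (any type mismatches, missing imports)
- Type checking failures
Note: when files are passed on the command line, tsc ignores tsconfig.json, so supply any required compiler options explicitly or check the whole project instead.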
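Once the compiler has run, its captured output still has to be turned into evidence. The sketch below assumes the verifier already holds the process exit code and combined output, and that diagnostics contain `error TS<code>:` as tsc normally prints them. The `summarizeTscOutput` helper is a hypothetical name.

```typescript
interface CompileEvidence {
  result: "pass" | "fail";
  errorCount: number;
  proof: string;
}

// Counts diagnostics such as "src/a.ts(42,5): error TS2322: ..." in captured tsc output.
// A non-zero exit code fails the check even if no diagnostic line could be parsed.
function summarizeTscOutput(exitCode: number, output: string): CompileEvidence {
  const errors = output.match(/\berror TS\d+:/g) || [];
  const failed = exitCode !== 0 || errors.length > 0;
  return {
    result: failed ? "fail" : "pass",
    errorCount: errors.length,
    proof: failed
      ? `tsc exited with code ${exitCode}, ${errors.length} error(s)`
      : "0 TypeScript errors",
  };
}
```

Checking both the exit code and the parsed diagnostics avoids passing a run whose output format was not recognized.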

Linting

Method: npx eslint [file] --format json
Evidence: Lint output, error/warning counts
Failure Triggers:
- ESLint errors (not warnings)
- Code style violations
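With `--format json`, ESLint prints an array with one object per linted file, including `errorCount` and `warningCount` fields. The sketch below totals those counts into evidence. The `summarizeEslintJson` helper is a hypothetical name, and failing when no files were reported is an added assumption.

```typescript
// Subset of the fields ESLint's JSON formatter emits for each linted file.
interface EslintFileResult {
  filePath: string;
  errorCount: number;
  warningCount: number;
}

interface LintEvidence {
  result: "pass" | "fail";
  proof: string;
}

// Errors fail the check; warnings are counted as evidence but do not fail it.
function summarizeEslintJson(jsonText: string): LintEvidence {
  const results = JSON.parse(jsonText) as EslintFileResult[];
  if (!Array.isArray(results) || results.length === 0) {
    return { result: "fail", proof: "ESLint reported no files" };
  }
  let errors = 0;
  let warnings = 0;
  results.forEach((r) => {
    errors += r.errorCount;
    warnings += r.warningCount;
  });
  return {
    result: errors === 0 ? "pass" : "fail",
    proof: `${errors} errors, ${warnings} warnings`,
  };
}
```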

Test Execution

Method: npm test -- [file] --run
Evidence: Test output, pass/fail counts, coverage
Failure Triggers:
- Tests did not pass
- Test file does not exist
- Fewer tests than expected
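The "fewer tests than expected" trigger needs an expected count to compare against. As a sketch, assuming the verifier has already extracted pass/fail counts from the test runner's output, the decision could look like this. The `evaluateTestRun` helper and the idea of an `expectedMinimum` parameter are illustrative.

```typescript
interface TestRunSummary {
  passed: number;
  failed: number;
}

interface TestEvidence {
  result: "pass" | "fail";
  proof: string;
}

// Fails on any failed test, and also when fewer tests ran than expected.
// At least one test must run, so an empty suite can never pass.
function evaluateTestRun(summary: TestRunSummary, expectedMinimum: number): TestEvidence {
  const total = summary.passed + summary.failed;
  const minimum = Math.max(1, expectedMinimum);
  if (summary.failed > 0) {
    return { result: "fail", proof: `${summary.passed} passed, ${summary.failed} failed` };
  }
  if (total < minimum) {
    return { result: "fail", proof: `Expected at least ${minimum} tests, found ${total}` };
  }
  return { result: "pass", proof: `${summary.passed} passed, 0 failed` };
}
```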

API Endpoint Verification

Method: HTTP request to endpoint, check status code and response shape
Evidence: Status code, response time, response body sample
Failure Triggers:
- HTTP 404, 500, or timeout
- Unexpected response format
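The endpoint check can be split into making the request and judging what came back. The sketch below covers only the judging step, assuming the caller records the status (or `null` on timeout), elapsed time, and parsed body. The `evaluateEndpoint` helper is hypothetical, and treating every non-2xx status as a failure is broader than the 404/500 cases listed above.

```typescript
interface EndpointObservation {
  status: number | null; // null when the request timed out
  elapsedMs: number;
  body: unknown;         // parsed response body
}

interface EndpointEvidence {
  result: "pass" | "fail";
  proof: string;
}

// Passes only for a 2xx status whose body is an object containing every required key.
function evaluateEndpoint(obs: EndpointObservation, requiredKeys: string[]): EndpointEvidence {
  if (obs.status === null) {
    return { result: "fail", proof: `Timed out after ${obs.elapsedMs}ms` };
  }
  if (obs.status < 200 || obs.status >= 300) {
    return { result: "fail", proof: `Status ${obs.status}` };
  }
  if (typeof obs.body !== "object" || obs.body === null) {
    return { result: "fail", proof: "Unexpected response format: body is not an object" };
  }
  const record = obs.body as Record<string, unknown>;
  const missing = requiredKeys.filter((key) => !(key in record));
  if (missing.length > 0) {
    return { result: "fail", proof: `Unexpected response format: missing ${missing.join(", ")}` };
  }
  return { result: "pass", proof: `Status ${obs.status}, response time ${obs.elapsedMs}ms` };
}
```

Keeping this step free of network calls makes the pass/fail rule easy to test on its own.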

Evidence Requirements

Every verification must produce EVIDENCE

Criterion	Evidence Type	Example
file_exists	File path, size, timestamp	`/src/lib/file.ts, 2,541 bytes, 2025-12-02 14:30:00`
no_placeholders	Scan results	`0 placeholders found` or `Found 2: Line 15, Line 42`
typescript_compiles	Compiler output	`0 TypeScript errors`
lint_passes	Linter output	`0 errors, 2 warnings`
tests_pass	Test results	`15 passed, 0 failed`
endpoint_responds	Status code + response	`Status 200, response time 45ms`

Prohibited Patterns

❌ SELF-ATTESTATION

// WRONG - Agent grades its own homework
return { verified: true, message: "I completed it" };

❌ ASSUMED SUCCESS

// WRONG - Doesn't actually check
if (claimedFile) {
  return { verified: true }; // No evidence!
}

❌ SKIPPED CHECKS

// WRONG - "This check is slow, skip it for now"
if (criterion === 'tests_pass') {
  return { verified: true }; // NEVER skip checks
}

❌ LOOSE VERIFICATION

// WRONG - "Looks about right"
if (output.includes('success')) {
  return { verified: true }; // No proof!
}

✅ GOOD VERIFICATION

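// RIGHT - Actual evidence collected
// Assumes fs is 'node:fs/promises'; fs.stat rejects if the file is missing, which must count as a failure
const result = await fs.stat(filePath);
if (result.size > 0) {
  return {
    verified: true,
    evidence: [{
      criterion: 'file_exists',
      proof: `File size: ${result.size} bytes`
    }]
  };
}
return {
  verified: false,
  failures: [{
    criterion: 'file_exists',
    reason: 'File is empty',
    proof: `File size: ${result.size} bytes`
  }]
};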

Failure Handling

When Verification Fails

Agent receives detailed failure report:

{
  "verified": false,
  "failures": [
    {
      "criterion": "tests_pass:/tests/unit/feature.test.ts",
      "reason": "Test execution failed",
      "proof": "Expected 10 tests to pass, 3 failed"
    }
  ],
  "retry_count": 1,
  "max_retries": 3
}

Agent Must Fix Issues

Read the failure details
Fix the underlying problem (not the verification)
Re-submit for verification
Repeat up to 3 times

After 3 Failures

Task escalates to human review:

{
  "status": "escalated_to_human",
  "reason": "Failed verification 3 times",
  "failures_history": [...]
}

Examples

Good Example: Complete File Verification

Task: Agent claims file was created and is ready for deployment

Evidence Collected:

✓ file_exists:/src/lib/agents/new-agent.ts
  Size: 3,847 bytes, Created: 2025-12-02 14:30:00

✓ no_placeholders:/src/lib/agents/new-agent.ts
  Scan found 0 TODO/TBD/FIXME markers

✓ typescript_compiles:/src/lib/agents/new-agent.ts
  tsc --noEmit completed successfully

✓ lint_passes:/src/lib/agents/new-agent.ts
  eslint: 0 errors, 0 warnings

✓ tests_pass:/tests/new-agent.test.ts
  npm test: 12 passed, 0 failed

Result: verified: true ✓ All evidence confirms completion

Bad Example: Incomplete File Verification

Task: Agent claims feature is complete

Evidence Collected:

✗ file_exists:/src/lib/features/new-feature.ts
  File not found: ENOENT: no such file or directory

✗ tests_pass:/tests/features/new-feature.test.ts
  Test file not found: ENOENT: no such file or directory

✗ typescript_compiles:/src/lib/features/incomplete.ts
  Compilation failed: Missing return type (line 42)

Result: verified: false ✗ Multiple criteria failed, agent must fix

Implementation in Your Code

Import and Use Independent Verifier

import { independentVerifier } from '@/lib/agents/independent-verifier';

// DO NOT return success directly
// DO call Independent Verifier
const result = await independentVerifier.verify({
  task_id: 'my-task-123',
  claimed_outputs: ['/path/to/file.ts'],
  completion_criteria: [
    'file_exists:/path/to/file.ts',
    'no_placeholders:/path/to/file.ts',
    'typescript_compiles:/path/to/file.ts'
  ],
  requesting_agent_id: this.agent_id
});

// Return the verification result (not your own assessment)
return result;

In Orchestrator

// Before marking task complete:
const verification = await independentVerifier.verify({
  task_id: task.id,
  claimed_outputs: task.outputs,
  completion_criteria: task.criteria,
  requesting_agent_id: task.agent_id
});

if (!verification.verified) {
  // Return task to agent for fixes
  task.status = 'verification_failed';
  task.failures = verification.failures;
  task.retry_count++;

  if (task.retry_count >= 3) {
    task.status = 'escalated_to_human';
  }
  return;
}

// Only mark complete with verification proof
task.status = 'complete';
task.verification = verification;

Health Endpoints for Verification

Endpoint	Status	Use
GET /api/health	✓ Working	Basic system health check
GET /api/health/deep	✓ Working	Comprehensive dependency checks
GET /api/health/routes	✓ Working	Verify all API routes are accessible

All health endpoints return verifiable evidence of system state.

Success Metrics

After implementing Verification Protocol:

Metric	Before	After
Tasks verified without evidence	100%	0%
False completions accepted	Unknown	0%
Completion claims with evidence	0%	100%
Automatic escalation to human	N/A	Happens after 3 failures
Audit trail completeness	Partial	Full with evidence

Key Rules

1. NEVER verify your own work
2. ALWAYS use Independent Verifier
3. ALWAYS provide EVIDENCE
4. NEVER assume success
5. BLOCK completion without proof
6. ESCALATE after 3 failures

Status: Production Ready (v1.0.0)
Last Updated: 2025-12-02
Critical: Yes - Blocks all task completions without proof