jpskill.com
💬 コミュニケーション コミュニティ

multi-ai-code-review

Multi-perspective code review using Claude, Gemini, and Codex as specialized agents. 5-dimensional analysis (security, performance, maintainability, correctness, style) with LLM-as-judge consensus, quality scoring, and CI/CD integration. Use when reviewing PRs, auditing code quality, preparing production releases, or establishing code review workflows.

⚡ おすすめ: コマンド1行でインストール(60秒)

下記のコマンドをコピーしてターミナル(Mac/Linux)または PowerShell(Windows)に貼り付けてください。 ダウンロード → 解凍 → 配置まで全自動。

🍎 Mac / 🐧 Linux
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o multi-ai-code-review.zip https://jpskill.com/download/9444.zip && unzip -o multi-ai-code-review.zip && rm multi-ai-code-review.zip
🪟 Windows (PowerShell)
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/9444.zip -OutFile "$d\multi-ai-code-review.zip"; Expand-Archive "$d\multi-ai-code-review.zip" -DestinationPath $d -Force; ri "$d\multi-ai-code-review.zip"

完了後、Claude Code を再起動 → 普通に「動画プロンプト作って」のように話しかけるだけで自動発動します。

💾 手動でダウンロードしたい(コマンドが難しい人向け)
  1. 1. 下の青いボタンを押して multi-ai-code-review.zip をダウンロード
  2. 2. ZIPファイルをダブルクリックで解凍 → multi-ai-code-review フォルダができる
  3. 3. そのフォルダを C:\Users\あなたの名前\.claude\skills\(Win)または ~/.claude/skills/(Mac)へ移動
  4. 4. Claude Code を再起動

⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。

🎯 このSkillでできること

下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。

📦 インストール方法 (3ステップ)

  1. 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
  2. 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
  3. 3. 展開してできたフォルダを、ホームフォルダの .claude/skills/ に置く
    • · macOS / Linux: ~/.claude/skills/
    • · Windows: %USERPROFILE%\.claude\skills\

Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。

詳しい使い方ガイドを見る →
最終更新
2026-05-18
取得日時
2026-05-18
同梱ファイル
1
📖 Claude が読む原文 SKILL.md(中身を展開)

この本文は AI(Claude)が読むための原文(英語または中国語)です。日本語訳は順次追加中。

Multi-AI Code Review

Overview

multi-ai-code-review provides comprehensive code review using multiple AI models as specialized agents, each analyzing code from a different perspective. Based on 2024-2025 best practices for AI-assisted code review.

Purpose: Multi-perspective code quality assessment using AI ensemble with human oversight

Pattern: Task-based (5 independent review dimensions + orchestration)

Key Principles (validated by tri-AI research):

  1. Multi-Agent Architecture - Specialized agents for each review dimension
  2. LLM-as-Judge Consensus - Flag issues only when 2+ models agree
  3. Progressive Severity - Critical → High → Medium → Low prioritization
  4. Human-in-Loop - AI suggests, human decides
  5. Quality Gates - Block merges for critical unresolved issues
  6. Actionable Feedback - Every comment has What/Where/Why/How

Quality Targets:

  • False Positive Rate: <15%
  • Fix Acceptance Rate: >40%
  • Review Turnaround: <5 minutes
  • Bug Catch Rate: >30% pre-production

When to Use

Use multi-ai-code-review when:

  • Reviewing pull requests (any size)
  • Auditing code quality before release
  • Establishing consistent code review standards
  • Security auditing code changes
  • Performance profiling changes
  • Technical debt assessment
  • Onboarding reviews (mentorship mode)

When NOT to Use:

  • Trivial changes (typos, comments only)
  • Automated dependency updates (use dependabot labels)
  • Generated code (migrations, scaffolds)

Prerequisites

Required

  • Code to review (diff, file, or directory)
  • At least one AI available (Claude required, Gemini/Codex optional)

Recommended

  • Gemini CLI for web research and fast analysis
  • Codex CLI for deep code reasoning
  • Git repository context

Integration

  • GitHub Actions (optional, for CI/CD)
  • Pre-commit hooks (optional, for local checks)

Review Dimensions

5-Dimensional Analysis

Dimension Agent Focus Weight
Security Security Specialist OWASP Top 10, secrets, injection 25%
Performance Performance Engineer Complexity, memory, latency 20%
Maintainability Architect Patterns, modularity, DRY 25%
Correctness QA Engineer Logic, edge cases, tests 20%
Style Nitpicker Naming, formatting, conventions 10%

Severity Levels

Level Action Examples
Critical Block merge SQL injection, exposed secrets, data loss
High Require fix Race conditions, missing auth, memory leaks
Medium Suggest fix Code duplication, missing tests, complexity
Low Optional Style issues, naming, minor refactors

Operations

Operation 1: Quick Security Scan

Time: 2-5 minutes Automation: 80% Purpose: Fast security-focused review

Process:

  1. Scan for Critical Issues:
    
    Review this code for security vulnerabilities:
  • SQL injection
  • XSS vulnerabilities
  • Hardcoded secrets/API keys
  • Authentication bypasses
  • Authorization flaws
  • Input validation gaps
  • Insecure dependencies

Code: [PASTE CODE OR DIFF]

For each issue found, provide:

  • Severity (Critical/High/Medium)
  • Location (file:line)
  • Description (what's wrong)
  • Fix (specific code change)
  1. Validate with Gemini (optional):
    
    gemini -p "Verify these security findings. Are any false positives?
    [PASTE CLAUDE FINDINGS]

Code context: [PASTE RELEVANT CODE]"


3. **Output**: Security report with consensus findings

---

### Operation 2: Comprehensive PR Review

**Time**: 10-30 minutes
**Automation**: 60%
**Purpose**: Full multi-dimensional review

**Process**:

**Step 1: Gather Context**
```bash
# Get PR diff
git diff main...HEAD > /tmp/pr_diff.txt

# Identify affected areas
grep -E "^(\\+\\+\\+|---)" /tmp/pr_diff.txt | head -20

Step 2: Run Parallel Agent Reviews

Use Task tool to launch parallel agents:

Launch 3 parallel review agents:

Agent 1 (Security):
"Review this diff for security issues. Focus on:
- OWASP Top 10 vulnerabilities
- Authentication/authorization
- Input validation
- Secrets exposure
Diff: [DIFF]"

Agent 2 (Maintainability):
"Review this diff for maintainability. Focus on:
- Design patterns used correctly
- Code duplication (DRY)
- Modularity and cohesion
- Documentation quality
Diff: [DIFF]"

Agent 3 (Correctness):
"Review this diff for correctness. Focus on:
- Logic errors
- Edge cases not handled
- Test coverage gaps
- Error handling
Diff: [DIFF]"

Step 3: Orchestrate & Deduplicate

Synthesize findings from all agents:
[PASTE ALL AGENT OUTPUTS]

Tasks:
1. Remove duplicate findings
2. Rank by severity (Critical > High > Medium > Low)
3. Group by file
4. Generate summary table
5. Create final report with consensus issues only

Step 4: Generate Report

Output format:

## PR Review Summary

| File | Risk | Issues | Critical | High | Medium |
|------|------|--------|----------|------|--------|
| auth.py | High | 3 | 1 | 2 | 0 |
| api.py | Medium | 2 | 0 | 1 | 1 |

### Critical Issues (Block Merge)
1. **[auth.py:45]** SQL Injection vulnerability
   - Why: User input directly in query
   - Fix: Use parameterized queries

### High Issues (Require Fix)
...

### Consensus Score: 72/100
- Security: 65/100
- Performance: 80/100
- Maintainability: 70/100
- Correctness: 75/100
- Style: 85/100

Operation 3: LLM-as-Judge Tribunal

Time: 5-15 minutes Automation: 70% Purpose: High-confidence findings through consensus

Process:

  1. Run Code Through Multiple Models:

Claude Analysis:

Analyze this code for issues. Rate severity 1-10 for each:
[CODE]

Gemini Analysis (via CLI):

gemini -p "Analyze this code for issues. Rate severity 1-10 for each:
[CODE]"

Codex Analysis (via CLI):

codex "Analyze this code for issues. Rate severity 1-10 for each:
[CODE]"
  1. Calculate Consensus:
    
    Given these analyses from 3 AI models:

Claude: [FINDINGS] Gemini: [FINDINGS] Codex: [FINDINGS]

Identify issues where at least 2 models agree:

  1. List consensus findings

  2. Average severity scores

  3. Note any disagreements

  4. Final verdict for each issue

  5. Output: High-confidence issue list (≥67% agreement)


Operation 4: Mentorship Review

Time: 15-30 minutes Automation: 40% Purpose: Educational code review for learning

Process:

Review this code in mentorship mode. For a developer learning [LANGUAGE/FRAMEWORK]:

Code: [CODE]

For each finding:
1. **What's the issue** (be encouraging, not critical)
2. **Why it matters** (explain the underlying concept)
3. **How to improve** (show before/after with explanation)
4. **Learn more** (link to relevant documentation)

Also highlight:
- What was done well
- Good patterns to continue using
- Growth opportunities

Tone: Supportive and educational, never condescending.

Operation 5: Pre-Release Audit

Time: 30-60 minutes Automation: 50% Purpose: Comprehensive review before production

Process:

  1. Full Codebase Scan:

    # Identify all changes since last release
    git diff v1.0.0...HEAD --stat
    git log v1.0.0...HEAD --oneline
  2. Security Deep Dive:

  • Run all security checks
  • Verify no new vulnerabilities
  • Check dependency updates
  • Audit secrets management
  1. Performance Review:
  • Identify potential bottlenecks
  • Review database queries
  • Check for N+1 problems
  • Validate caching strategies
  1. Test Coverage:
  • Verify test coverage targets
  • Check critical path coverage
  • Validate edge case tests
  1. Generate Release Report:
    
    ## Pre-Release Audit: v1.1.0

Security Clearance: PASS ✓

  • No critical vulnerabilities
  • All high issues resolved
  • Secrets audit: Clean

Performance Assessment: PASS ✓

  • No new N+1 queries
  • Response time within SLA
  • Memory usage stable

Test Coverage: 82% (target: 80%)

  • Critical paths: 95%
  • Edge cases: 78%

Release Recommendation: APPROVED


---

## Multi-AI Coordination

### Agent Assignment Strategy

| Task | Primary | Verification | Speed |
|------|---------|--------------|-------|
| Security scan | Claude | Gemini | Fast |
| Architecture review | Claude | Codex | Medium |
| Logic validation | Codex | Claude | Medium |
| Style checking | Gemini | Claude | Fast |
| Performance analysis | Claude | Codex | Medium |

### Coordination Commands

**Launch Multi-Agent Review**:
```bash
# Using Task tool for parallel execution
# Each agent reviews independently, orchestrator synthesizes

Gemini Quick Check:

gemini -p "Quick security scan of this code: [CODE]"

Codex Deep Analysis:

codex "Analyze this code architecture and suggest improvements: [CODE]"

CI/CD Integration

GitHub Actions Workflow

# .github/workflows/ai-review.yml
name: Multi-AI Code Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR Diff
        run: |
          git diff origin/main...HEAD > pr_diff.txt

      - name: Claude Review
        uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          model: "claude-sonnet-4-5-20250929"
          review_level: "detailed"

      - name: Post Summary
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## AI Review Summary\n${process.env.REVIEW_SUMMARY}`
            })

Quality Gate Configuration

# Block merge for critical issues
quality_gates:
  critical_issues: 0      # Must be zero
  high_issues: 3          # Max allowed
  coverage_minimum: 80    # Percent
  score_minimum: 70       # Out of 100

Quality Scoring

Scoring Formula

Overall = (Security × 0.25) + (Performance × 0.20) +
          (Maintainability × 0.25) + (Correctness × 0.20) +
          (Style × 0.10)

Grade Mapping

Score Grade Status
≥90 A Excellent - Ship it
80-89 B Good - Minor fixes
70-79 C Acceptable - Address issues
60-69 D Needs work - Significant fixes
<60 F Failing - Major revision needed

Anti-Patterns to Detect

  1. Hardcoded Secrets - API keys, passwords in code
  2. SQL Injection - Unparameterized queries
  3. XSS Vulnerabilities - Unsanitized output
  4. Race Conditions - Unprotected shared state
  5. Memory Leaks - Unclosed resources
  6. N+1 Queries - Loop database calls
  7. Dead Code - Unreachable branches
  8. God Objects - Classes doing too much
  9. Copy-Paste Code - Duplicated logic
  10. Missing Error Handling - Unhandled exceptions

Example Review Session

User: Review this PR for my authentication module

Claude: I'll perform a comprehensive multi-dimensional review.

[Launches parallel agents for security, maintainability, correctness]

## PR Review: Authentication Module

### Critical Issues (1)
1. **[auth.py:67]** Password stored in plaintext
   - Severity: Critical
   - Consensus: 3/3 models agree
   - Fix: Use bcrypt hashing
   ```python
   # Before
   user.password = request.password

   # After
   import bcrypt
   user.password = bcrypt.hashpw(request.password.encode(), bcrypt.gensalt())

High Issues (2)

  1. [auth.py:45] No rate limiting on login endpoint
  2. [auth.py:89] JWT secret hardcoded

Quality Score: 58/100 (Grade: F)

  • Security: 35/100 (Critical issues)
  • Performance: 70/100
  • Maintainability: 65/100
  • Correctness: 60/100
  • Style: 80/100

Recommendation: BLOCK MERGE

Resolve critical security issues before merging.



---

## Related Skills

- **multi-ai-testing**: Generate tests for reviewed code
- **multi-ai-verification**: Validate fixes
- **multi-ai-implementation**: Implement suggested fixes
- **codex-review**: Codex-specific review patterns
- **review-multi**: Skill-specific reviews

---

## References

- `references/security-checklist.md` - OWASP Top 10 checklist
- `references/performance-patterns.md` - Performance anti-patterns
- `references/ci-cd-integration.md` - Full CI/CD setup guide