🛠️ Development / MCP  Community  🔴 For engineers  👤 Engineers and AI developers

🛠️ Project Development

project-development

For projects that use large language models (LLMs), …

⚡ ⏱ Test plan creation: 2 hours → 20 minutes


📜 Original English description (for reference)

This skill covers the principles for identifying tasks suited to LLM processing, designing effective project architectures, and iterating rapidly using agent-assisted development.

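🇯🇵 Commentary for Japanese creators

In a nutshell

For projects that use large language models (LLMs), …

Note: This commentary was added by the jpskill.com editorial team for Japanese business settings. It is reference information and is independent of how the Skill itself behaves.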

⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, extraction, and placement are fully automatic.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o project-development.zip https://jpskill.com/download/3327.zip && unzip -o project-development.zip && rm project-development.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/3327.zip -OutFile "$d\project-development.zip"; Expand-Archive "$d\project-development.zip" -DestinationPath $d -Force; ri "$d\project-development.zip"

When it finishes, restart Claude Code. The Skill then activates automatically whenever you make a related request in ordinary language.

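💾 Manual download (for those who find the command difficult)

1. Press the blue button below to download project-development.zip
2. Double-click the ZIP file to extract it, which creates a project-development folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⬇ Download as .zip (recommended) ⬇ .skill format (for advanced users) Original source ↗

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.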

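🎯 What this Skill can do

The description below explains what this Skill does for you. It activates automatically when you ask Claude for help in this area.

📦 Installation (3 steps)

1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.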


Last updated: 2026-05-17
Retrieved: 2026-05-17
Bundled files: 1

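💬 Just ask like this: sample prompts

› Using Project Development, show me a minimal sample implementation
› Explain the main uses of Project Development and what to watch out for
› Explain how to integrate Project Development into an existing project

Paste any of these into Claude Code and this Skill activates automatically.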

📜 Original SKILL.md (the English/Chinese text that Claude reads)

Project Development Methodology

This skill covers the principles for identifying tasks suited to LLM processing, designing effective project architectures, and iterating rapidly using agent-assisted development. The methodology applies whether building a batch processing pipeline, a multi-agent research system, or an interactive agent application.

When to Use

Activate this skill when:

Starting a new project that might benefit from LLM processing
Evaluating whether a task is well-suited for agents versus traditional code
Designing the architecture for an LLM-powered application
Planning a batch processing pipeline with structured outputs
Choosing between single-agent and multi-agent approaches
Estimating costs and timelines for LLM-heavy projects

Core Concepts

Task-Model Fit Recognition

Not every problem benefits from LLM processing. The first step in any project is evaluating whether the task characteristics align with LLM strengths. This evaluation should happen before writing any code.

LLM-suited tasks share these characteristics:

Characteristic	Why It Fits
Synthesis across sources	LLMs excel at combining information from multiple inputs
Subjective judgment with rubrics	LLMs handle grading, evaluation, and classification with criteria
Natural language output	When the goal is human-readable text, not structured data
Error tolerance	Individual failures do not break the overall system
Batch processing	No conversational state required between items
Domain knowledge in training	The model already has relevant context

LLM-unsuited tasks share these characteristics:

Characteristic	Why It Fails
Precise computation	Math, counting, and exact algorithms are unreliable
Real-time requirements	LLM latency is too high for sub-second responses
Perfect accuracy requirements	Hallucination risk makes 100% accuracy impossible
Proprietary data dependence	The model lacks necessary context
Sequential dependencies	Each step depends heavily on the previous result
Deterministic output requirements	Same input must produce identical output

The evaluation should happen through manual prototyping: take one representative example and test it directly with the target model before building any automation.

The Manual Prototype Step

Before investing in automation, validate task-model fit with a manual test. Copy one representative input into the model interface. Evaluate the output quality. This takes minutes and prevents hours of wasted development.

This validation answers critical questions:

Does the model have the knowledge required for this task?
Can the model produce output in the format you need?
What level of quality should you expect at scale?
Are there obvious failure modes to address?

If the manual prototype fails, the automated system will fail. If it succeeds, you have a baseline for comparison and a template for prompt design.

Pipeline Architecture

LLM projects benefit from staged pipeline architectures where each stage is:

Discrete: Clear boundaries between stages
Idempotent: Re-running produces the same result
Cacheable: Intermediate results persist to disk
Independent: Each stage can run separately

The canonical pipeline structure:

acquire → prepare → process → parse → render

Acquire: Fetch raw data from sources (APIs, files, databases)
Prepare: Transform data into prompt format
Process: Execute LLM calls (the expensive, non-deterministic step)
Parse: Extract structured data from LLM outputs
Render: Generate final outputs (reports, files, visualizations)

Stages 1, 2, 4, and 5 are deterministic. Stage 3 is non-deterministic and expensive. This separation allows re-running the expensive LLM stage only when necessary, while iterating quickly on parsing and rendering.
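As a rough illustration of that separation, the five stages can be written as independent functions so that only `process` touches the model. This is a minimal sketch, not the skill's own code: every function name, the sample text, and the `fake_llm` stub (a hypothetical stand-in for a real model client) are assumptions made for the example.

```python
from typing import Callable


def acquire(item_id: str) -> dict:
    """Stage 1: fetch raw data (hard-coded here; normally an API, file, or DB)."""
    return {"id": item_id, "text": "Example discussion text."}


def prepare(raw: dict) -> str:
    """Stage 2: turn raw data into a prompt. Deterministic."""
    return f"Summarize the following in one sentence:\n\n{raw['text']}"


def process(prompt: str, call_llm: Callable[[str], str]) -> str:
    """Stage 3: the only expensive, non-deterministic step."""
    return call_llm(prompt)


def parse(response: str) -> dict:
    """Stage 4: extract structured data from the model's text. Deterministic."""
    return {"summary": response.strip()}


def render(parsed: dict) -> str:
    """Stage 5: produce the final human-facing output. Deterministic."""
    return f"<p>{parsed['summary']}</p>"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, used only for this sketch."""
    return "  A short summary.  "


def run_pipeline(item_id: str, call_llm: Callable[[str], str] = fake_llm) -> str:
    """Chain the stages; each one can also be called and tested on its own."""
    raw = acquire(item_id)
    prompt = prepare(raw)
    response = process(prompt, call_llm)
    parsed = parse(response)
    return render(parsed)
```

Because `parse` and `render` take plain values, they can be re-run against a saved response as often as needed without paying for another model call.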

File System as State Machine

Use the file system to track pipeline state rather than databases or in-memory structures. Each processing unit gets a directory. Each stage completion is marked by file existence.

data/{id}/
├── raw.json         # acquire stage complete
├── prompt.md        # prepare stage complete
├── response.md      # process stage complete
└── parsed.json      # parse stage complete

To check if an item needs processing: check if the output file exists. To re-run a stage: delete its output file and downstream files. To debug: read the intermediate files directly.

This pattern provides:

Natural idempotency (file existence gates execution)
Easy debugging (all state is human-readable)
Simple parallelization (each directory is independent)
Trivial caching (files persist across runs)
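The file-existence check can be sketched in a few lines. The file names below follow the directory layout shown earlier; the helper names `next_stage` and `invalidate_from` are hypothetical, chosen for this example only.

```python
from pathlib import Path
from typing import Optional

# Stage name -> output file, in pipeline order (names taken from the layout above).
STAGE_FILES = [
    ("acquire", "raw.json"),
    ("prepare", "prompt.md"),
    ("process", "response.md"),
    ("parse", "parsed.json"),
]


def next_stage(item_dir: Path) -> Optional[str]:
    """Return the first stage whose output file is missing, or None if all exist."""
    for stage, filename in STAGE_FILES:
        if not (item_dir / filename).exists():
            return stage
    return None


def invalidate_from(item_dir: Path, stage: str) -> None:
    """Delete the output of `stage` and of every downstream stage so they re-run."""
    names = [name for name, _ in STAGE_FILES]
    for _, filename in STAGE_FILES[names.index(stage):]:
        (item_dir / filename).unlink(missing_ok=True)
```

Calling `invalidate_from(item_dir, "process")` leaves `raw.json` and `prompt.md` in place, so the next run resumes at the model call rather than re-fetching data.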

Structured Output Design

When LLM outputs must be parsed programmatically, prompt design directly determines parsing reliability. The prompt must specify exact format requirements with examples.

Effective structure specification includes:

Section markers: Explicit headers or prefixes for parsing
Format examples: Show exactly what output should look like
Rationale disclosure: "I will be parsing this programmatically"
Constrained values: Enumerated options, score ranges, formats

Example prompt structure:

Analyze the following and provide your response in exactly this format:

## Summary
[Your summary here]

## Score
Rating: [1-10]

## Details
- Key point 1
- Key point 2

Follow this format exactly because I will be parsing it programmatically.

The parsing code must handle variations gracefully. LLMs do not follow instructions perfectly. Build parsers that:

Use regex patterns flexible enough to handle minor formatting variations
Provide sensible defaults when sections are missing
Log parsing failures for later review rather than crashing
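One way to meet those three requirements for the example format above is sketched here. The function names, the default values, and the exact regex tolerances are assumptions for illustration; a real parser would be tuned to the variations actually observed in responses.

```python
import logging
import re

logger = logging.getLogger("response_parser")


def _section(text: str, name: str) -> str:
    """Return the body under a '## name' header, tolerating case and spacing."""
    pattern = rf"^#+[ \t]*{re.escape(name)}[ \t]*:?[ \t]*$(.*?)(?=^#+\s|\Z)"
    match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE | re.DOTALL)
    return match.group(1).strip() if match else ""


def parse_response(text: str) -> dict:
    """Parse Summary / Score / Details, falling back to defaults instead of raising."""
    summary = _section(text, "Summary")
    if not summary:
        logger.warning("Summary section missing; using empty string")

    # Accept 'Rating: 8', 'rating: 8/10', 'Rating: [8]' and similar.
    score = None
    number = re.search(r"\d+", _section(text, "Score"))
    if number and 1 <= int(number.group()) <= 10:
        score = int(number.group())
    else:
        logger.warning("Score missing or out of range; using None")

    # Accept '-', '*', or '•' as bullet markers.
    details = re.findall(
        r"^[ \t]*[-*•][ \t]+(.+?)[ \t]*$", _section(text, "Details"), re.MULTILINE
    )
    if not details:
        logger.warning("Details section missing; using empty list")

    return {"summary": summary, "score": score, "details": details}
```

A response with no recognizable structure yields `{"summary": "", "score": None, "details": []}` plus three log warnings, so a batch run can continue and the failures can be reviewed afterwards.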

Agent-Assisted Development

Modern agent-capable models can accelerate development significantly. The pattern is:

Describe the project goal and constraints
Let the agent generate initial implementation
Test and iterate on specific failures
Refine prompts and architecture based on results

This is about rapid iteration: generate, test, fix, repeat. The agent handles boilerplate and initial structure while you focus on domain-specific requirements and edge cases.

Key practices for effective agent-assisted development:

Provide clear, specific requirements upfront
Break large projects into discrete components
Test each component before moving to the next
Keep the agent focused on one task at a time

Cost and Scale Estimation

LLM processing has predictable costs that should be estimated before starting. The formula:

Total cost = (items × tokens_per_item × price_per_token) + API overhead

For batch processing:

Estimate input tokens per item (prompt + context)
Estimate output tokens per item (typical response length)
Multiply by item count
Add 20-30% buffer for retries and failures

Track actual costs during development. If costs exceed estimates significantly, re-evaluate the approach. Consider:

Reducing context length through truncation
Using smaller models for simpler items
Caching and reusing partial results
Parallel processing to reduce wall-clock time (not token cost)
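The formula and the retry buffer can be combined into a small estimator. This sketch assumes input and output tokens are priced separately and quoted per million tokens, which is common but should be checked against the provider's actual pricing; the function name and the figures in the usage note are hypothetical.

```python
def estimate_batch_cost(
    items: int,
    input_tokens_per_item: int,
    output_tokens_per_item: int,
    input_price_per_million: float,
    output_price_per_million: float,
    retry_buffer: float = 0.25,  # middle of the 20-30% range suggested above
) -> dict:
    """Estimate batch cost in the same currency as the per-million-token prices."""
    input_cost = items * input_tokens_per_item * input_price_per_million / 1_000_000
    output_cost = items * output_tokens_per_item * output_price_per_million / 1_000_000
    base = input_cost + output_cost
    return {
        "base": base,
        "buffer": base * retry_buffer,
        "total": base * (1 + retry_buffer),
    }
```

For example, `estimate_batch_cost(1000, 3000, 500, 3.0, 15.0)` uses made-up prices to show the shape of the calculation: per-item token counts come from the manual prototype, and the buffer covers retries and failures.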

Detailed Topics

Choosing Single vs Multi-Agent Architecture

Single-agent pipelines work for:

Batch processing with independent items
Tasks where items do not interact
Projects that need simpler cost and complexity management

Multi-agent architectures work for:

Parallel exploration of different aspects
Tasks exceeding single context window capacity
Tasks where specialized sub-agents improve quality

The primary reason for multi-agent is context isolation, not role anthropomorphization. Sub-agents get fresh context windows for focused subtasks. This prevents context degradation on long-running tasks.

See multi-agent-patterns skill for detailed architecture guidance.

Architectural Reduction

Start with minimal architecture. Add complexity only when proven necessary. Production evidence shows that removing specialized tools often improves performance.

Vercel's d0 agent achieved a 100% success rate (up from 80%) by reducing from 17 specialized tools to 2 primitives: bash command execution and SQL. The file system agent pattern uses standard Unix utilities (grep, cat, find, ls) instead of custom exploration tools.
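To make the idea of two primitives concrete, the sketch below shows what such a minimal tool surface might look like. It is not Vercel's implementation, which the source does not show: the function names are hypothetical, and SQLite stands in for whatever database a real system would query.

```python
import sqlite3
import subprocess


def run_bash(command: str, timeout: float = 10.0) -> str:
    """Primitive 1: run a shell command and return its stdout followed by stderr.

    Assumption: the caller runs this inside a sandbox; commands produced by a
    model should never be executed unsandboxed on a real machine.
    """
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr


def run_sql(conn: sqlite3.Connection, query: str) -> list:
    """Primitive 2: run a SQL query and return all rows as a list of tuples."""
    return conn.execute(query).fetchall()
```

With only these two entry points, the agent explores files through ordinary shell utilities and answers data questions through SQL, so there is far less custom tooling to maintain.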

When reduction outperforms complexity:

Your data layer is well-documented and consistently structured
The model has sufficient reasoning capability
Your specialized tools were constraining rather than enabling
You are spending more time maintaining scaffolding than improving outcomes

When complexity is necessary:

Your underlying data is messy, inconsistent, or poorly documented
The domain requires specialized knowledge the model lacks
Safety constraints require limiting agent capabilities
Operations are truly complex and benefit from structured workflows

See tool-design skill for detailed tool architecture guidance.

Iteration and Refactoring

Expect to refactor. Production agent systems at scale require multiple architectural iterations. Manus has refactored its agent framework five times since launch. The Bitter Lesson suggests that structures added for current model limitations become constraints as models improve.

Build for change:

Keep architecture simple and unopinionated
Test across model strengths to verify your harness is not limiting performance
Design systems that benefit from model improvements rather than locking in limitations

Practical Guidance

Project Planning Template

Task Analysis
- What is the input? What is the desired output?
- Is this synthesis, generation, classification, or analysis?
- What error rate is acceptable?
- What is the value per successful completion?

Manual Validation
- Test one example with target model
- Evaluate output quality and format
- Identify failure modes
- Estimate tokens per item

Architecture Selection
- Single pipeline vs multi-agent
- Required tools and data sources
- Storage and caching strategy
- Parallelization approach

Cost Estimation
- Items × tokens × price
- Development time
- Infrastructure requirements
- Ongoing operational costs

Development Plan
- Stage-by-stage implementation
- Testing strategy per stage
- Iteration milestones
- Deployment approach

Anti-Patterns to Avoid

Skipping manual validation: Building automation before verifying the model can do the task wastes significant time when the approach is fundamentally flawed.

Monolithic pipelines: Combining all stages into one script makes debugging and iteration difficult. Separate stages with persistent intermediate outputs.

Over-constraining the model: Adding guardrails, pre-filtering, and validation logic that the model could handle on its own. Test whether your scaffolding helps or hurts.

Ignoring costs until production: Token costs compound quickly at scale. Estimate and track from the beginning.

Perfect parsing requirements: Expecting LLMs to follow format instructions perfectly. Build robust parsers that handle variations.

Premature optimization: Adding caching, parallelization, and optimization before the basic pipeline works correctly.

Examples

Example 1: Batch Analysis Pipeline (Karpathy's HN Time Capsule)

Task: Analyze 930 HN discussions from 10 years ago with hindsight grading.

Architecture:

5-stage pipeline: fetch → prompt → analyze → parse → render
File system state: data/{date}/{item_id}/ with stage output files
Structured output: 6 sections with explicit format requirements
Parallel execution: 15 workers for LLM calls

Results: $58 total cost, ~1 hour execution, static HTML output.
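The parallel-execution step can be approximated with a bounded thread pool. This is a generic sketch rather than the project's actual code: `process_item` is a hypothetical placeholder for the model call, and the default of 15 workers simply mirrors the figure quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item_id: str) -> str:
    """Hypothetical per-item work; a real version would call the model and save the response."""
    if not item_id:
        raise ValueError("empty item id")
    return f"processed:{item_id}"


def safe_process(item_id: str) -> tuple:
    """Wrap one item so a single failure is recorded instead of stopping the batch."""
    try:
        return (item_id, process_item(item_id), None)
    except Exception as exc:  # record the error and keep going
        return (item_id, None, str(exc))


def process_all(item_ids: list, max_workers: int = 15) -> list:
    """Process items concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_process, item_ids))
```

Threads are a reasonable fit here because model calls are I/O-bound, and capturing errors per item matches the error-tolerance property that makes a task suitable for batch LLM processing.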

Example 2: Architectural Reduction (Vercel d0)

Task: Text-to-SQL agent for internal analytics.

Before: 17 specialized tools, 80% success rate, 274s average execution.

After: 2 tools (bash + SQL), 100% success rate, 77s average execution.

Key insight: The semantic layer was already good documentation. Claude just needed access to read files directly.

See Case Studies for detailed analysis.

Guidelines

Validate task-model fit with manual prototyping before building automation
Structure pipelines as discrete, idempotent, cacheable stages
Use the file system for state management and debugging
Design prompts for structured, parseable outputs with explicit format examples
Start with minimal architecture; add complexity only when proven necessary
Estimate costs early and track throughout development
Build robust parsers that handle LLM output variations
Expect and plan for multiple architectural iterations
Test whether scaffolding helps or constrains model performance
Use agent-assisted development for rapid iteration on implementation

Integration

This skill connects to:

context-fundamentals - Understanding context constraints for prompt design
tool-design - Designing tools for agent systems within pipelines
multi-agent-patterns - When to use multi-agent versus single pipelines
evaluation - Evaluating pipeline outputs and agent performance
context-compression - Managing context when pipelines exceed limits

References

Internal references:

Case Studies - Karpathy HN Capsule, Vercel d0, Manus patterns
Pipeline Patterns - Detailed pipeline architecture guidance

Related skills in this collection:

tool-design - Tool architecture and reduction patterns
multi-agent-patterns - When to use multi-agent architectures
evaluation - Output evaluation frameworks

External resources:

Karpathy's HN Time Capsule project: https://github.com/karpathy/hn-time-capsule
Vercel d0 architectural reduction: https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools
Manus context engineering: Peak Ji's blog on context engineering lessons
Anthropic multi-agent research: How we built our multi-agent research system

Skill Metadata

Created: 2025-12-25
Last Updated: 2025-12-25
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0

Limitations

Use this skill only when the task clearly matches the scope described above.
Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.