
experiment-report-writer

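A Skill that turns machine learning or research experiment notes, configs, logs, and results into a structured experiment report that covers motivation through conclusions.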

📜 Original English description (for reference)

Write structured experiment report documents from ML/research experiment notes, configs, logs, metrics, tables, and figures. Use this skill whenever the user asks to write an experiment report, research update, mentor update, weekly experiment summary, result analysis document, or presentation-ready experiment writeup, especially when the output should explain motivation, setup, algorithms, metrics, results, figures, interpretation, conclusions, limitations, and next steps.



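🎯 What this Skill does

The Skill description on this page tells you what the Skill will do for you. When you give Claude a request in this area, the Skill is triggered automatically.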

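📦 Installation (3 steps)

1. Download the .skill file for this Skill.
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically).
3. Place the extracted folder in .claude/skills/ under your home folder:
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and the installation is complete. You do not need to say "use this Skill"; it is invoked automatically for related requests.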


Last updated: 2026-05-17
Retrieved: 2026-05-17
Bundled files: 1


📜 Original SKILL.md (the English/Chinese source text that Claude reads)

Experiment Report Writer

Turn experiment evidence into a clear research report that a reader can evaluate without rerunning the experiment.

Use this skill to write a standalone document, a section for a paper or lab note, a mentor-facing update, or a presentation-ready experiment summary.

Pair this skill with research-project-memory when completed results should update project claims, evidence, risks, actions, figures, or worktree decisions.

Skill Directory Layout

<installed-skill-dir>/
├── SKILL.md
└── templates/
    └── experiment-report.md

Progressive Loading

Use templates/experiment-report.md as the default Markdown skeleton when saving a report.
If the user only wants a draft in chat, follow the same section order without needing to read or copy the template verbatim.

Core Principles

Ground every claim in evidence: configs, commands, logs, metrics, tables, figures, commit hashes, or user-provided notes.
Separate observed results from interpretation. Do not present a hypothesis as a measured fact.
Make the report reproducible enough that another researcher can identify what was run.
Explain why the experiment matters before listing numbers.
Compare against the right reference point: baseline, previous run, ablation control, expected behavior, or published number.
Preserve uncertainty. If evidence is missing, mark it as missing and ask for the smallest useful clarification.
Write for the intended audience. A lab notebook can be dense; a mentor update should emphasize decisions, evidence, and next steps.

Step 1 - Classify the Report

Identify the report mode:

single-experiment: one run or one controlled comparison
ablation-report: several variants testing one factor
batch-summary: many related runs from a sweep or experiment batch
mentor-update: concise progress report with decision-oriented discussion
paper-section: polished text intended to become part of a paper

Also identify:

audience
output format: Markdown, LaTeX, slide outline, or chat draft
save path, if the user wants a file
expected length
whether figures, tables, configs, logs, or notebooks are available

If the user gives no format, default to Markdown. If they ask for a file and no path is given, use:

docs/reports/experiment_report_YYYY-MM-DD.md
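
As an illustration of the default path rule, here is a minimal Python sketch. The function name `default_report_path` and its parameters are assumptions made for this example; the Skill itself only specifies the path pattern.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def default_report_path(today: Optional[date] = None,
                        root: Path = Path("docs/reports")) -> Path:
    """Build docs/reports/experiment_report_YYYY-MM-DD.md for a given date.

    Hypothetical helper: only the path pattern comes from the Skill text.
    """
    day = today or date.today()
    # date.isoformat() yields YYYY-MM-DD, matching the documented pattern.
    return root / f"experiment_report_{day.isoformat()}.md"


print(default_report_path(date(2026, 5, 17)).as_posix())
# prints: docs/reports/experiment_report_2026-05-17.md
```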

Step 2 - Gather Evidence

Prefer primary evidence over memory.

Look for:

experiment commands or scripts
config files and parameter overrides
random seeds and number of runs
dataset name, split, preprocessing, and sample count
model, method variant, checkpoint, or algorithm version
hardware and runtime if relevant
metrics, logs, result tables, figures, and failure cases
git commit hash or code version, when available

Useful local checks include:

git rev-parse --short HEAD
find . -maxdepth 3 -type f \( -name "*.yaml" -o -name "*.yml" -o -name "*.json" -o -name "*.csv" -o -name "*.md" \)
find . -maxdepth 4 -type f \( -name "*.png" -o -name "*.jpg" -o -name "*.pdf" -o -name "*.svg" \)

If the user only provides informal notes, use them but label missing reproducibility details explicitly.
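
For environments where a script is more convenient than shell commands, the local checks above could be approximated in Python. This is a sketch under assumptions: the helper names `find_files` and `short_commit` are made up for illustration, and the depth rule mirrors `find -maxdepth` only roughly (it walks the whole tree and filters afterwards).

```python
import subprocess
from pathlib import Path
from typing import Iterable, List, Optional

# Suffix sets copied from the two `find` commands above.
CONFIG_SUFFIXES = {".yaml", ".yml", ".json", ".csv", ".md"}
FIGURE_SUFFIXES = {".png", ".jpg", ".pdf", ".svg"}


def find_files(root: Path, suffixes: Iterable[str], max_depth: int) -> List[Path]:
    """Return files under root whose suffix matches, at most max_depth levels deep."""
    wanted = set(suffixes)
    hits = []
    for path in root.rglob("*"):
        # A file directly inside root has depth 1, like `find -maxdepth 1`.
        depth = len(path.relative_to(root).parts)
        if path.is_file() and depth <= max_depth and path.suffix in wanted:
            hits.append(path)
    return sorted(hits)


def short_commit(root: Path) -> Optional[str]:
    """Return `git rev-parse --short HEAD`, or None if git or a repo is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=root, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip() or None
```

Calling `find_files(Path("."), CONFIG_SUFFIXES, 3)` and `find_files(Path("."), FIGURE_SUFFIXES, 4)` corresponds to the two `find` commands; a `None` from `short_commit` is a cue to label the code version as missing in the report.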

Step 3 - Extract the Experiment Story

Before drafting, organize the experiment into:

question: what was this experiment trying to learn?
motivation: why does the question matter?
hypothesis: what did we expect and why?
method: what changed compared with the baseline?
controls: what stayed fixed?
measurement: which metrics answer the question?
outcome: what happened?
interpretation: what does the outcome suggest?
decision: what should happen next?

For ablations or sweeps, make the independent variable explicit and keep the comparison fair.
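
One way to keep the nine story elements together before drafting is a small record type. The sketch below is an assumption about how this could be organized, not part of the Skill; the class name `ExperimentStory` and the example values are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ExperimentStory:
    """The nine story elements listed above; None means 'not yet known'."""
    question: Optional[str] = None
    motivation: Optional[str] = None
    hypothesis: Optional[str] = None
    method: Optional[str] = None
    controls: Optional[str] = None
    measurement: Optional[str] = None
    outcome: Optional[str] = None
    interpretation: Optional[str] = None
    decision: Optional[str] = None

    def missing(self) -> List[str]:
        """Names of elements that are still empty and should be asked about or labeled."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical, made-up example values for illustration only.
story = ExperimentStory(
    question="Does learning-rate warmup reduce early loss spikes?",
    method="Added warmup; everything else follows the baseline config",
)
print(story.missing())
# prints: ['motivation', 'hypothesis', 'controls', 'measurement', 'outcome', 'interpretation', 'decision']
```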

Required Report Structure

Use these sections unless the user requests a different format:

# [Experiment Report Title]

## Summary
## 1. Experiment Motivation
## 2. Experiment Setup
## 3. Core Algorithm or Method
## 4. Metrics
## 5. Results
## 6. How to Read the Figures
## 7. Interpretation
## 8. Conclusion and Discussion
## 9. Limitations and Caveats
## 10. Next Steps
## Reproducibility Notes

If there is no core algorithm, write "Not applicable" and briefly explain whether the experiment changes data, hyperparameters, evaluation, infrastructure, or analysis instead.

If there are no figures, omit "How to Read the Figures" or replace it with "How to Read the Tables" when tables are the main evidence.
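
The section rules above can be sketched as a small skeleton generator. Treat this as an illustration only: the function name `report_skeleton` and its flags are assumptions, and renumbering the later sections when the figures section is omitted is a choice made here that the Skill text does not specify.

```python
from typing import List

FIGURES = "How to Read the Figures"
TABLES = "How to Read the Tables"

# Numbered sections in the order given by the required structure.
NUMBERED = [
    "Experiment Motivation",
    "Experiment Setup",
    "Core Algorithm or Method",
    "Metrics",
    "Results",
    FIGURES,
    "Interpretation",
    "Conclusion and Discussion",
    "Limitations and Caveats",
    "Next Steps",
]


def report_skeleton(title: str, has_figures: bool = True,
                    tables_are_main_evidence: bool = False) -> str:
    """Return the Markdown heading skeleton for a report."""
    names: List[str] = []
    for name in NUMBERED:
        if name == FIGURES and not has_figures:
            # No figures: swap in the tables section, or drop the section entirely.
            if tables_are_main_evidence:
                names.append(TABLES)
            continue
        names.append(name)
    lines = [f"# {title}", "", "## Summary"]
    # Assumption: sections are renumbered consecutively after an omission.
    lines += [f"## {i}. {name}" for i, name in enumerate(names, start=1)]
    lines.append("## Reproducibility Notes")
    return "\n".join(lines) + "\n"
```

For example, `report_skeleton("Warmup Ablation", has_figures=False, tables_are_main_evidence=True)` yields a skeleton whose sixth section is "How to Read the Tables".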

Section Guidance

Summary

Write 3-6 bullets covering:

experiment question
most important setup details
headline result
interpretation
recommended next step

1. Experiment Motivation

Explain the research or engineering reason for the experiment:

problem being tested
expected mechanism
why the result would affect the project
what decision the experiment supports

2. Experiment Setup

Include enough detail to reproduce or audit the run:

dataset, split, preprocessing
baseline and compared variants
key hyperparameters and parameter changes
training/evaluation command, config file, or run ID
random seed and number of trials
hardware, runtime, and code version when relevant

Use a table for parameters when there are more than five important settings.
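
The "more than five settings" rule could be applied mechanically, as in the following sketch. The function name `render_params` and the two-column table layout are assumptions for illustration.

```python
from typing import Dict


def render_params(params: Dict[str, object]) -> str:
    """Render settings as a Markdown table when there are more than five, else as bullets."""
    if len(params) > 5:
        lines = ["| Parameter | Value |", "| --- | --- |"]
        lines += [f"| {name} | {value} |" for name, value in params.items()]
    else:
        lines = [f"- {name}: {value}" for name, value in params.items()]
    return "\n".join(lines)


# Hypothetical settings, made up for the example.
print(render_params({"lr": 0.001, "batch_size": 64, "seed": 0}))
# prints:
# - lr: 0.001
# - batch_size: 64
# - seed: 0
```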

3. Core Algorithm or Method

Describe the algorithm only at the level needed to understand the experiment:

what input it consumes
what output it produces
key steps or objective
what is new or different from the baseline
complexity, assumptions, or implementation details that affect interpretation

Do not over-explain standard background unless the audience needs it.

4. Metrics

For each metric, explain:

definition
direction: higher is better, lower is better, or target range
unit
aggregation: mean, median, best checkpoint, final epoch, confidence interval, or standard deviation
why it is relevant to the experiment question

Flag metrics that can conflict with each other.
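
To make the per-metric fields concrete, here is a minimal sketch of a metric record plus a mean and standard deviation summary across runs. The names `MetricSpec` and `summarize` are hypothetical, and the three direction labels are simply one encoding of "higher is better, lower is better, or target range".

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Optional, Sequence, Tuple

DIRECTIONS = ("higher", "lower", "target-range")


@dataclass(frozen=True)
class MetricSpec:
    """One entry per metric, mirroring the five fields listed above."""
    name: str
    definition: str
    direction: str
    unit: str
    aggregation: str
    relevance: str

    def __post_init__(self) -> None:
        if self.direction not in DIRECTIONS:
            raise ValueError(f"direction must be one of {DIRECTIONS}")


def summarize(values: Sequence[float]) -> Tuple[float, Optional[float]]:
    """Return (mean, sample standard deviation); the deviation is None for a single run."""
    if not values:
        raise ValueError("need at least one value")
    spread = stdev(values) if len(values) > 1 else None
    return mean(values), spread
```

Returning `None` for the spread of a single run keeps the report from implying a variance estimate that was never measured.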

5. Results

Present results before interpretation.

Use:

tables for exact numeric comparisons
figures for trends, distributions, or qualitative examples
short text for the main deltas

Always identify the baseline and report absolute values plus meaningful deltas when possible.
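
Reporting absolute values plus deltas against a named baseline can be sketched as follows. The function name `results_table`, the column layout, and the numbers in the usage note are assumptions made up for illustration, not results from any real experiment.

```python
from typing import Dict


def results_table(results: Dict[str, float], baseline: str,
                  metric: str, higher_is_better: bool) -> str:
    """Render a Markdown table with absolute values and deltas against the baseline."""
    base = results[baseline]
    lines = [
        f"| Run | {metric} | Delta vs {baseline} | Better? |",
        "| --- | --- | --- | --- |",
    ]
    for name, value in results.items():
        if name == baseline:
            lines.append(f"| {name} | {value:.3f} | (baseline) | n/a |")
            continue
        delta = value - base
        # "Better" depends on the metric direction, not on the sign alone.
        better = delta > 0 if higher_is_better else delta < 0
        lines.append(f"| {name} | {value:.3f} | {delta:+.3f} | {'yes' if better else 'no'} |")
    return "\n".join(lines)
```

With hypothetical accuracies `{"baseline": 0.800, "variant-a": 0.815}` and `higher_is_better=True`, the `variant-a` row shows the absolute value 0.815 together with a delta of +0.015.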

6. How to Read the Figures

For every figure, explain:

what the figure is meant to show
x-axis: variable, unit, and scale
y-axis: metric, unit, and direction
legend: method names, groups, colors, markers, or line styles
error bars or shaded regions, if present
whether points are individual runs, averages, checkpoints, epochs, or samples
the main visual pattern the reader should notice

If an axis is log-scaled, normalized, clipped, or unitless, say so explicitly.

7. Interpretation

Connect the observed results back to the motivation:

whether the hypothesis was supported
what changed relative to the baseline
likely explanation
alternative explanations
surprising or negative results
whether the evidence is strong enough to act on

Use cautious wording when there is only one seed, weak statistical evidence, or missing controls.
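
As a rough illustration of the cautious-wording rule, the sketch below collects the three named conditions and picks a hedged or unhedged sentence. Everything here is an assumption: the function names, the exact phrasing, and the idea that "weak statistical evidence" arrives as a flag decided by the author.

```python
from typing import List


def caution_reasons(n_seeds: int, has_controls: bool, weak_statistics: bool) -> List[str]:
    """List which of the three conditions above call for cautious wording."""
    reasons = []
    if n_seeds <= 1:
        reasons.append("only one seed")
    if weak_statistics:
        reasons.append("weak statistical evidence")
    if not has_controls:
        reasons.append("missing controls")
    return reasons


def hedge(claim: str, reasons: List[str]) -> str:
    """Phrase a claim plainly when no caveats apply, cautiously otherwise."""
    if not reasons:
        return f"The results support the claim that {claim}."
    caveats = ", ".join(reasons)
    return f"The results suggest, but do not establish, that {claim} (caveats: {caveats})."


print(hedge("warmup reduces early loss spikes", caution_reasons(1, True, False)))
# prints: The results suggest, but do not establish, that warmup reduces early loss spikes (caveats: only one seed).
```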

8. Conclusion and Discussion

State the practical conclusion:

what we learned
what decision this supports
whether to keep, reject, or further test the method
how the result affects the broader project

9. Limitations and Caveats

Include risks that could change the conclusion:

small number of seeds
narrow dataset or subset
missing baseline
unstable training
possible implementation bug
metric mismatch
data leakage or evaluation contamination risk
hardware/runtime constraints

10. Next Steps

Recommend concrete follow-ups:

one immediate verification step
one high-value extension
one cleanup or documentation task when needed

Tie each next step to the uncertainty it resolves.

Project Memory Writeback

If the project uses research-project-memory, write back the result after the report is drafted:

memory/evidence-board.md: completed EVD-### summary, source paths, linked claim IDs, limitations, and certainty
memory/claim-board.md: mark claims as supported, weakened, revised, unsupported, or cut based on the observed result
memory/risk-board.md: close mitigated risks or add new risks exposed by the result
memory/action-board.md: next steps from the report, including rerun, write, revise-method, park, or kill decisions
memory/current-status.md: latest reliable experiment state and next session entry point
worktree .agent/worktree-status.md: latest result and exit condition if the experiment belongs to a worktree

Do not write an interpretation as a measured fact. Use observed for metrics from logs/tables and inferred for explanations.
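
The observed versus inferred distinction could be enforced at write time, as in this sketch. The `Statement` class, the requirement that observed statements carry a source path, and the rendered line format are all assumptions; the actual layout of the research-project-memory files is not described in this document.

```python
from dataclasses import dataclass

KINDS = ("observed", "inferred")


@dataclass(frozen=True)
class Statement:
    """A single claim tagged as measured (observed) or explanatory (inferred)."""
    text: str
    kind: str
    source: str = ""

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"kind must be one of {KINDS}")
        # Assumption: an observed value must point at the log or table it came from.
        if self.kind == "observed" and not self.source:
            raise ValueError("observed statements need a source path")

    def render(self) -> str:
        """Hypothetical one-line format for a memory entry."""
        suffix = f" (source: {self.source})" if self.source else ""
        return f"- [{self.kind}] {self.text}{suffix}"


# Made-up example values for illustration only.
print(Statement("variant beats baseline on val accuracy", "observed", "logs/eval.csv").render())
# prints: - [observed] variant beats baseline on val accuracy (source: logs/eval.csv)
```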

Output Quality Checklist

Before finalizing, check that:

the report states the experiment question and decision context
all key parameters and baselines are named
metrics include direction and units
results are separated from interpretation
every figure/table has reading guidance
missing evidence is labeled instead of invented
conclusions do not overclaim beyond the data
next steps are actionable
project memory is updated when present and relevant