
pdf-2

Advanced PDF v2: OCR, form extraction, table parsing, digital signatures, merge/split, annotation

⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, unzips, and places the files automatically.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o pdf-2.zip https://jpskill.com/download/22139.zip && unzip -o pdf-2.zip && rm pdf-2.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/22139.zip -OutFile "$d\pdf-2.zip"; Expand-Archive "$d\pdf-2.zip" -DestinationPath $d -Force; ri "$d\pdf-2.zip"

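When it finishes, restart Claude Code. The Skill then triggers automatically when you make a relevant request in plain language, for example "extract the tables from this PDF".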

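💾 Manual download (if you prefer not to use the command line)

1. Press the blue button below to download pdf-2.zip
2. Double-click the ZIP file to extract it; this creates a pdf-2 folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⬇ Download as .zip (recommended) ⬇ .skill format (for advanced users) Original source ↗

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.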

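🎯 What this Skill can do

Read the description below to see what this Skill does for you. When you ask Claude for something in this area, the Skill triggers automatically.

📦 How to install (3 steps)

1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Put the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You don't need to say "use this Skill"; it is invoked automatically for relevant requests.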


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

📜 Original SKILL.md (the text Claude reads)

pdf-2

Purpose

This skill enables advanced PDF processing, including OCR for text extraction from images, form data extraction, table parsing into structured formats, digital signature verification and addition, PDF merging/splitting, and annotation handling. It's designed for automating document workflows in OpenClaw.

When to Use

Use this skill for tasks involving scanned or complex PDFs, such as extracting data from invoices with tables and forms, verifying signed legal documents, or preparing reports by merging files. Apply it when dealing with non-searchable PDFs (which need OCR before text can be extracted) or when a document pipeline needs integration with other tools.

Key Capabilities

OCR: Uses the Tesseract engine to extract text; selects the language via the --lang flag (e.g., --lang eng for English).
Form Extraction: Parses PDF forms into JSON; extracts fields such as text boxes or checkboxes using the --fields flag.
Table Parsing: Detects tables and converts them to CSV; specify the layout with --layout auto or --layout grid.
Digital Signatures: Verifies signatures with the --verify flag; adds new ones using --sign with a certificate path.
Merge/Split: Merges multiple PDFs via the --inputs flag; splits by page range, e.g., --pages 1-5.
Annotation: Extracts or adds annotations (e.g., highlights) using --extract-annotations or --add-annotation type=highlight text="Note".

Usage Patterns

Invoke via CLI for one-off tasks or API for scripted workflows. Chain commands with pipes, e.g., OCR output to a text processor. For batch processing, use loops in scripts. Always specify input/output paths explicitly. If handling large files, set --timeout 300 for extended operations. Configure defaults in a JSON config file, e.g., {"default_lang": "eng"}.
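
The batch-processing advice above can be sketched in Python. This is a minimal illustration, not part of the skill: build_ocr_command and build_batch are hypothetical helper names, and combining --timeout with the ocr subcommand is an assumption based on the flags this document lists.

```python
from pathlib import Path


def build_ocr_command(pdf_path, out_dir, lang="eng", timeout=300):
    """Return the argv list for one OCR run, with explicit input/output paths."""
    pdf_path = Path(pdf_path)
    output = Path(out_dir) / (pdf_path.stem + ".txt")
    return [
        "openclaw", "pdf-2", "ocr",
        "--input", str(pdf_path),
        "--lang", lang,
        "--timeout", str(timeout),  # assumption: ocr accepts --timeout
        "--output", str(output),
    ]


def build_batch(pdf_paths, out_dir, **options):
    """Build one command per PDF, skipping anything that is not a .pdf file."""
    return [
        build_ocr_command(p, out_dir, **options)
        for p in sorted(map(Path, pdf_paths))
        if p.suffix.lower() == ".pdf"
    ]
```

Each returned list can be handed to subprocess.run(command, check=True) inside a loop. Building argv lists rather than shell strings avoids quoting problems with file names that contain spaces.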

Common Commands/API

CLI Commands:

OCR: openclaw pdf-2 ocr --input path/to/file.pdf --lang eng --dpi 300 --output extracted.txt
Form Extraction: openclaw pdf-2 extract-form --file form.pdf --fields name,address --output data.json
Table Parsing: openclaw pdf-2 parse-table --input table.pdf --page 2 --output tables.csv
Signature Verification: openclaw pdf-2 verify-signature --file signed.pdf --cert path/to/cert.pem
Merge PDFs: openclaw pdf-2 merge --inputs file1.pdf file2.pdf --output merged.pdf
Split PDF: openclaw pdf-2 split --input original.pdf --pages 1-3 --output split_folder/
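
A page range such as the --pages 1-3 value above can be validated before the CLI is invoked. The helper below is hypothetical and not part of OpenClaw; it assumes ranges are 1-based and inclusive, which this document does not state.

```python
def parse_page_range(spec):
    """Parse a range like "1-3" (or a single page "4") into 1-based page numbers."""
    first, _, last = spec.partition("-")
    start = int(first)                     # raises ValueError on non-numeric input
    end = int(last) if last else start
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {spec!r}")
    return list(range(start, end + 1))     # assumption: the end page is included
```

Rejecting a malformed range up front gives a clearer message than a CLI failure partway through a batch.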

API Endpoints:

OCR: POST /api/pdf-2/ocr with JSON body: {"file": "base64encoded_string", "lang": "eng", "dpi": 300}
Form Extraction: POST /api/pdf-2/extract-form with body: {"file": "base64encoded", "fields": ["name", "address"]}
Table Parsing: POST /api/pdf-2/parse-table with body: {"file": "base64encoded", "page": 2}
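
The three request bodies above share one shape: a base64-encoded file plus endpoint-specific options. A minimal sketch of building them follows; build_request_body is a hypothetical helper and the sample bytes stand in for a real PDF.

```python
import base64
import json


def build_request_body(pdf_bytes, **options):
    """Base64-encode the PDF and merge in endpoint options such as lang or page."""
    body = {"file": base64.b64encode(pdf_bytes).decode("ascii")}
    body.update(options)
    return body


sample = b"%PDF-1.7 placeholder bytes"  # stand-in for real file contents
ocr_body = build_request_body(sample, lang="eng", dpi=300)
form_body = build_request_body(sample, fields=["name", "address"])
table_body = build_request_body(sample, page=2)

# json.dumps produces the text that would be sent as the POST body.
ocr_json = json.dumps(ocr_body)
```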

Code Snippets:

Python API call for OCR:

import base64
import os
import requests

api_key = os.environ['OPENCLAW_API_KEY']
with open('file.pdf', 'rb') as f:  # close the file handle after reading
    payload = {'file': base64.b64encode(f.read()).decode()}
response = requests.post('https://api.openclaw.ai/api/pdf-2/ocr', headers={'Authorization': f'Bearer {api_key}'}, json=payload, timeout=60)

CLI in a shell script:

export OPENCLAW_API_KEY=your_key
openclaw pdf-2 ocr --input input.pdf --output output.txt

Config Formats: Use JSON for API requests or CLI configs, e.g., {"lang": "eng", "timeout": 60}. Save it as a .json file and pass it with --config path/to/config.json.
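
As a sketch of the config workflow just described, the snippet below writes the example settings to a file and appends --config to a command. write_config and with_config are hypothetical helpers; the config keys are the ones shown above, and whether the CLI accepts other keys is not stated here.

```python
import json
import tempfile
from pathlib import Path


def write_config(directory, settings):
    """Write settings to config.json in directory and return the file path."""
    path = Path(directory) / "config.json"
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return path


def with_config(command, config_path):
    """Return a copy of the argv list with the --config flag appended."""
    return list(command) + ["--config", str(config_path)]


with tempfile.TemporaryDirectory() as tmp:
    config_path = write_config(tmp, {"lang": "eng", "timeout": 60})
    loaded = json.loads(config_path.read_text(encoding="utf-8"))
    command = with_config(
        ["openclaw", "pdf-2", "ocr", "--input", "input.pdf", "--output", "output.txt"],
        config_path,
    )
```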

Integration Notes

Authentication requires an environment variable: set OPENCLAW_API_KEY=your_api_key for API calls. For the CLI, ensure the OpenClaw CLI is installed (e.g., via npm install openclaw) and authenticated. Integrate with other services by piping outputs, e.g., send OCR results to an NLP tool. Handle file uploads by base64-encoding the file in the API body. For webhooks, use the /api/pdf-2/webhook endpoint and include a callback URL in the request.
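
One way to handle the environment-variable requirement is to fail early with a clear message when the key is missing. This is a sketch only: auth_headers is a hypothetical helper, and the Bearer scheme is taken from the Python snippet earlier in this document.

```python
import os


def auth_headers(environ=None):
    """Build the Authorization header from OPENCLAW_API_KEY, failing early if unset."""
    environ = os.environ if environ is None else environ
    api_key = environ.get("OPENCLAW_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENCLAW_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Accepting the environment as a parameter keeps the function easy to test without modifying the real process environment.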

Error Handling

Check HTTP status codes for API errors (e.g., 400 for bad input, 401 for unauthorized) and parse the JSON response for details such as {"error": "File not found"}. For the CLI, errors are written to stderr and signaled by a nonzero exit code (e.g., exit code 1 for failures). Handle common issues explicitly: in Python, wrap the requests.post(...) call in a try block and catch requests.exceptions.Timeout; note that requests applies no timeout unless a timeout= argument is passed. Validate inputs before running commands, e.g., check that the file exists with os.path.exists(). Retry transient errors with exponential backoff.
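
The retry advice above can be sketched with the standard library alone. All names here are hypothetical, send is any callable returning (status, body_text), and treating timeouts and 5xx responses as the transient cases is an assumption, since this document does not say which errors are retryable.

```python
import json
import time


class ApiError(Exception):
    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def parse_error(status, body_text):
    """Turn an error response into an ApiError, reading the "error" field if present."""
    try:
        message = json.loads(body_text).get("error", body_text)
    except (ValueError, AttributeError):
        message = body_text  # body was not a JSON object
    return ApiError(status, message)


def call_with_retry(send, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds; retry timeouts and 5xx with doubling delays."""
    for attempt in range(retries + 1):
        last_attempt = attempt == retries
        try:
            status, body_text = send()
        except TimeoutError:
            if last_attempt:
                raise
        else:
            if status < 400:
                return body_text
            # 4xx errors (bad input, unauthorized) will not succeed on retry.
            if status < 500 or last_attempt:
                raise parse_error(status, body_text)
        sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing sleep as a parameter lets tests record the delays instead of actually waiting.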

Concrete Usage Examples

Extracting and parsing tables from a scanned invoice PDF:
- First, perform OCR: openclaw pdf-2 ocr --input invoice_scanned.pdf --lang eng --output ocr_output.txt
- Then, parse tables: openclaw pdf-2 parse-table --input invoice_scanned.pdf --page 1 --output invoice_tables.csv
- This produces a CSV for further analysis, e.g., import into a database.

Merging PDFs and verifying signatures for a report:
- Merge files: openclaw pdf-2 merge --inputs report_part1.pdf report_part2.pdf --output final_report.pdf
- Verify signature: openclaw pdf-2 verify-signature --file final_report.pdf --cert company_cert.pem
- If valid, proceed; otherwise, handle with error logging.
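
The "if valid, proceed; otherwise log" step can be sketched as a small pipeline runner that stops at the first nonzero exit code. run_step and run_pipeline are hypothetical helpers; the commands are the ones from the second example, and the exit-code convention is the one described under Error Handling.

```python
import subprocess
import sys


def run_step(command, runner=subprocess.run):
    """Run one CLI step; return True on exit code 0, else log stderr and return False."""
    result = runner(command, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"step failed ({result.returncode}): {result.stderr.strip()}", file=sys.stderr)
        return False
    return True


def run_pipeline(steps, runner=subprocess.run):
    """Run steps in order and stop at the first failure; return how many succeeded."""
    completed = 0
    for command in steps:
        if not run_step(command, runner):
            break
        completed += 1
    return completed


REPORT_STEPS = [
    ["openclaw", "pdf-2", "merge", "--inputs", "report_part1.pdf", "report_part2.pdf",
     "--output", "final_report.pdf"],
    ["openclaw", "pdf-2", "verify-signature", "--file", "final_report.pdf",
     "--cert", "company_cert.pem"],
]
```

Calling run_pipeline(REPORT_STEPS) returns 2 only when both the merge and the verification succeed, so a caller can compare the result with len(REPORT_STEPS) before continuing.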

Graph Relationships

Depends on: ocr-1 (provides core OCR functionality)
Integrates with: document-1 (for general document storage and retrieval)
Conflicts with: none
Related to: pdf-1 (basic PDF handling, as this is an advanced version)
Clusters with: community (shared cluster for collaborative skills)