theharvester
Passive email, subdomain, and IP harvesting from public sources using theHarvester. Use when: gathering corporate email lists, enumerating subdomains passively, pre-engagement recon, finding exposed employee contacts without triggering alerts.
Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, extracts, and installs the skill automatically.
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o theharvester.zip https://jpskill.com/download/15475.zip && unzip -o theharvester.zip && rm theharvester.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/15475.zip -OutFile "$d\theharvester.zip"; Expand-Archive "$d\theharvester.zip" -DestinationPath $d -Force; ri "$d\theharvester.zip"
When it finishes, restart Claude Code. The skill then activates automatically whenever you make a related request in plain language.
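Manual download (for those who prefer not to use the command line)
- 1. Download theharvester.zip using the download button on this page.
- 2. Double-click the ZIP file to extract it; this creates a theharvester folder.
- 3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac).
- 4. Restart Claude Code.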
Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the skill.
Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1
SKILL.md source (the original text that Claude reads)
theHarvester
Overview
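theHarvester is a passive OSINT tool that aggregates information about a target domain from multiple public sources. It finds email addresses, subdomains, hostnames, and IP addresses. When only passive sources are used it sends no requests to the target's own systems, which makes it well suited to low-noise recon during the pre-engagement phase of penetration tests or OSINT investigations. Options that resolve hosts (such as -n and -v) generate DNS queries and are therefore not strictly passive.
Sources include Google, Bing, DuckDuckGo, LinkedIn, Shodan, Hunter.io, CertSpotter, DNSDumpster, VirusTotal, and more. The available set changes between releases; theHarvester -h lists the sources your installed version supports.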
Instructions
Step 1: Install theHarvester
# Option 1: pip (in a virtual environment recommended)
pip install theHarvester
# Option 2: Clone from GitHub (most up-to-date)
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
pip install -r requirements/base.txt
# Option 3: Docker
docker pull ghcr.io/laramies/theharvester
docker run ghcr.io/laramies/theharvester -d example.com -b google
Step 2: Basic usage
# Syntax: theHarvester -d <domain> -b <source> [options]
# -d target domain
# -b data source(s)
# -l limit results (default: 500)
# -f output filename (supports XML and JSON)
# -n DNS lookup on discovered hosts
# -v verify host via DNS resolution
# Search a single source
theHarvester -d example.com -b google
# Search all available sources
theHarvester -d example.com -b all
# Limit results, enable DNS lookup, save output
theHarvester -d example.com -b google,bing,linkedin -l 200 -n -f results_example
# Run from cloned repo
python3 theHarvester.py -d example.com -b all -l 500 -f output
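The flag combinations above can also be assembled programmatically, which helps when the same scan is repeated across several domains. The sketch below builds an argument list from keyword options. `build_harvester_cmd` is a hypothetical helper, not part of theHarvester, and it uses only the flags listed in this step.

```python
def build_harvester_cmd(domain, sources, limit=500, dns_lookup=False, output=None):
    """Return the theHarvester argument list for the given options."""
    # Accept either a comma-separated string or a list of source names.
    if isinstance(sources, str):
        sources = [s.strip() for s in sources.split(",") if s.strip()]
    if not sources:
        raise ValueError("at least one source is required")
    cmd = ["theHarvester", "-d", domain, "-b", ",".join(sources), "-l", str(limit)]
    if dns_lookup:
        cmd.append("-n")          # DNS lookup on discovered hosts
    if output:
        cmd += ["-f", output]     # output filename
    return cmd


if __name__ == "__main__":
    print(" ".join(build_harvester_cmd(
        "example.com", ["google", "bing"], limit=200,
        dns_lookup=True, output="results_example")))
```

Returning a list rather than a single string lets the result be passed straight to `subprocess.run` without shell quoting concerns.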
Step 3: Choose sources strategically
# Email harvesting — best sources
theHarvester -d example.com -b google,bing,hunter,linkedin
# Subdomain enumeration — best sources
theHarvester -d example.com -b certspotter,dnsdumpster,virustotal,shodan
# Comprehensive (slower, uses all sources)
theHarvester -d example.com -b all -l 1000 -f full_recon_example
# LinkedIn employee discovery (API key optional; see Available Sources Reference)
theHarvester -d example.com -b linkedin -l 200
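One way to keep these source choices consistent between runs is to store them as named profiles. The following is a minimal sketch: the profile names and the `sources_for` helper are hypothetical, and the source lists are copied from the commands in this step, so they should be adjusted to whatever your installed version supports.

```python
# Source groupings taken from the commands in this step (assumed to be
# available in your theHarvester version; check with `theHarvester -h`).
SOURCE_PROFILES = {
    "emails": ["google", "bing", "hunter", "linkedin"],
    "subdomains": ["certspotter", "dnsdumpster", "virustotal", "shodan"],
}


def sources_for(*goals):
    """Merge the source lists for the requested goals, preserving order."""
    merged = []
    for goal in goals:
        if goal not in SOURCE_PROFILES:
            raise KeyError(f"unknown goal: {goal!r}")
        for source in SOURCE_PROFILES[goal]:
            if source not in merged:   # avoid passing a source twice
                merged.append(source)
    return ",".join(merged)


if __name__ == "__main__":
    # Value suitable for the -b flag
    print(sources_for("emails", "subdomains"))
```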
Step 4: Configure API keys
# api-keys.yaml (place in the theHarvester directory when running from a clone,
# or in ~/.theHarvester/ or /etc/theHarvester/ for an installed copy;
# note that -c enables DNS brute forcing and does not select a config file)
apikeys:
  hunter:
    key: YOUR_HUNTER_IO_KEY
  shodan:
    key: YOUR_SHODAN_KEY
  virustotal:
    key: YOUR_VIRUSTOTAL_KEY
  binaryedge:
    key: YOUR_BINARYEDGE_KEY
  fullhunt:
    key: YOUR_FULLHUNT_KEY
  securityTrails:
    key: YOUR_SECURITYTRAILS_KEY
  github:
    key: YOUR_GITHUB_TOKEN
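Before a long run it can be useful to check which sources actually have a key filled in. The sketch below reads the layout shown above with a deliberately minimal line-based parser, so it needs no third-party YAML library. It is not a general YAML parser, `configured_sources` is a hypothetical helper, and treating values that start with `YOUR_` as unset is an assumption based on the placeholders in this file.

```python
def configured_sources(yaml_text):
    """Return source names whose `key` field holds a non-placeholder value.

    Handles only the simple `source:` / `key: value` layout; indentation
    is ignored, so it is not a substitute for a real YAML parser.
    """
    configured = []
    current = None
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line == "apikeys:":
            continue
        name, _, value = line.partition(":")
        value = value.strip().strip("'\"")
        if name == "key":
            if current and value and not value.startswith("YOUR_"):
                configured.append(current)
        elif not value:
            current = name                    # a heading such as "hunter:"
    return configured


if __name__ == "__main__":
    sample = "apikeys:\n  hunter:\n    key: abc123\n  shodan:\n    key: YOUR_SHODAN_KEY\n"
    print(configured_sources(sample))
```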
Step 5: Parse and process output with Python
import json
import re
import subprocess


def run_harvester(domain, sources="google,bing,certspotter,dnsdumpster", limit=500):
    """Run theHarvester and return parsed results."""
    output_file = f"harvester_{domain.replace('.', '_')}"
    cmd = [
        "theHarvester",
        "-d", domain,
        "-b", sources,
        "-l", str(limit),
        "-f", output_file,
    ]
    print(f"Running: {' '.join(cmd)}")
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
    print(result.stdout)

    # Parse JSON output
    json_file = f"{output_file}.json"
    try:
        with open(json_file) as f:
            return json.load(f)
    except FileNotFoundError:
        # Fall back to parsing stdout
        return parse_stdout(result.stdout)


def parse_stdout(output):
    """Extract emails, hosts, and IPs from raw stdout."""
    emails = set(re.findall(r'[\w\.-]+@[\w\.-]+\.\w+', output))
    # Filter out false positives
    emails = {e for e in emails if not e.endswith(('.png', '.jpg', '.css', '.js'))}
    hosts = set(re.findall(r'[\w\.-]+\.\w{2,}', output))
    ips = set(re.findall(r'\b(?:\d{1,3}\.){3}\d{1,3}\b', output))
    return {"emails": list(emails), "hosts": list(hosts), "ips": list(ips)}


def deduplicate_and_report(data, domain):
    """Clean and summarize harvested data."""
    emails = sorted(set(data.get("emails", [])))
    hosts = sorted(set(data.get("hosts", [])))
    ips = sorted(set(data.get("ips", [])))

    def in_domain(name):
        # Exact or subdomain match; a substring test would also accept
        # look-alikes such as notexample.com
        name = name.lower()
        return name == domain.lower() or name.endswith("." + domain.lower())

    # Filter to target domain (host entries may carry a ":ip" suffix)
    domain_emails = [e for e in emails if in_domain(e.rsplit("@", 1)[-1])]
    domain_hosts = [h for h in hosts if in_domain(h.split(":", 1)[0])]

    print(f"\n=== Harvest Report: {domain} ===")
    print(f"Emails found: {len(domain_emails)}")
    print(f"Subdomains: {len(domain_hosts)}")
    print(f"IP addresses: {len(ips)}")
    if domain_emails:
        print("\nEmails:")
        for e in domain_emails[:20]:
            print(f"  {e}")
    if domain_hosts:
        print("\nSubdomains:")
        for h in domain_hosts[:20]:
            print(f"  {h}")
    return {
        "emails": domain_emails,
        "subdomains": domain_hosts,
        "ips": ips,
    }


if __name__ == "__main__":
    # Usage
    results = run_harvester("target-company.com", sources="google,bing,certspotter,hunter")
    clean = deduplicate_and_report(results, "target-company.com")
    # Save cleaned results
    with open("clean_results.json", "w") as f:
        json.dump(clean, f, indent=2)
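Depending on the version and flags used, entries in the `hosts` list may carry a resolved address after a colon (for example `www.example.com:93.184.216.34`). That format is an assumption worth checking against your own output. The sketch below separates such entries and keeps only hosts that are the target domain or one of its subdomains, so look-alike names are excluded. `normalize_hosts` is a hypothetical helper, and it does not handle IPv6 addresses.

```python
def normalize_hosts(entries, domain):
    """Split 'host:ip' entries and keep hosts that belong to `domain`.

    Returns (sorted hosts, sorted IPs). IPv6 addresses are not handled,
    because the first colon is treated as the host/IP separator.
    """
    domain = domain.lower().rstrip(".")
    hosts, ips = set(), set()
    for entry in entries:
        host, _, ip = entry.strip().partition(":")
        host = host.lower().rstrip(".")
        # Exact match or a true subdomain; "notexample.com" is rejected.
        if host == domain or host.endswith("." + domain):
            hosts.add(host)
            if ip:
                ips.add(ip)
    return sorted(hosts), sorted(ips)


if __name__ == "__main__":
    found = ["www.example.com:93.184.216.34", "mail.example.com", "notexample.com"]
    print(normalize_hosts(found, "example.com"))
```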
Step 6: Combine with other tools
# Extract discovered hosts to a file for follow-up tools such as nmap
# (active scanning only with explicit authorization)
theHarvester -d example.com -b all -f hosts
python3 -c "
import json, sys
data = json.load(sys.stdin)
for host in data.get('hosts', []):
    print(host)
" < hosts.json > subdomains.txt
# Feed subdomains into amass for deeper DNS enumeration
amass enum -passive -df subdomains.txt
# Check emails against breach databases
# (emails.txt: one address per line, e.g. exported from the 'emails' key of hosts.json)
while IFS= read -r email; do
  curl -s "https://haveibeenpwned.com/api/v3/breachedaccount/$email" \
    -H "hibp-api-key: YOUR_HIBP_KEY" \
    -H "user-agent: recon-script"
  sleep 7  # pause between requests; adjust to the rate limit of your HIBP plan
done < emails.txt
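The shell steps above read `subdomains.txt` and `emails.txt`. A small script can produce both files from the JSON output in one pass. This is a sketch that assumes the JSON file has top-level `hosts` and `emails` lists, as used elsewhere in this document; `export_lists` is a hypothetical helper.

```python
import json
from pathlib import Path


def export_lists(json_path, out_dir="."):
    """Write hosts and emails from a theHarvester JSON file to text files.

    Returns a dict mapping each output filename to the number of lines written.
    """
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}
    for key, filename in (("hosts", "subdomains.txt"), ("emails", "emails.txt")):
        values = sorted(set(data.get(key, [])))   # deduplicate and sort
        (out / filename).write_text(
            "".join(v + "\n" for v in values), encoding="utf-8")
        written[filename] = len(values)
    return written


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(export_lists(sys.argv[1]))
```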
Available Sources Reference
| Source | Data Type | API Key Required |
|---|---|---|
| google | Emails, subdomains | No |
| bing | Emails, subdomains | No |
| duckduckgo | Emails, subdomains | No |
| linkedin | Employees, emails | Optional |
| hunter | Emails | Yes |
| certspotter | Subdomains (SSL certs) | No |
| dnsdumpster | Subdomains, IPs | No |
| virustotal | Subdomains | Yes |
| shodan | IPs, open ports | Yes |
| securitytrails | Subdomains, DNS | Yes |
| github | Emails, code | Yes |
| binaryedge | IPs, services | Yes |
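The key requirements in this table can be checked before a run so that sources without a configured key are skipped instead of failing partway through. The sketch below hard-codes the "Yes" rows from the table; `split_by_key_status` is a hypothetical helper, and the set should be updated if your version's requirements differ.

```python
# Sources marked "Yes" in the table above.
KEY_REQUIRED = {"hunter", "virustotal", "shodan", "securitytrails", "github", "binaryedge"}


def split_by_key_status(requested, configured):
    """Partition requested sources into (usable, missing_key) lists.

    `configured` is the collection of source names that have an API key set.
    Comparison is case-insensitive; the original spelling is preserved.
    """
    have = {c.lower() for c in configured}
    usable, missing = [], []
    for source in requested:
        name = source.lower()
        if name in KEY_REQUIRED and name not in have:
            missing.append(source)
        else:
            usable.append(source)
    return usable, missing


if __name__ == "__main__":
    usable, missing = split_by_key_status(["google", "hunter", "shodan"], ["hunter"])
    print("run with:", ",".join(usable))
    print("skipped (no key):", ",".join(missing))
```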
Guidelines
- Always get authorization before running theHarvester against a target — passive does not mean invisible. Data queries may be logged by third-party services.
- Rate limits: Without API keys, theHarvester relies on scraping search engines which may throttle or block requests. Add API keys for reliable results.
- Combine sources: No single source is complete. Use multiple sources and deduplicate.
- Email format detection: Once you have a few emails (e.g., jsmith@corp.com, john.smith@corp.com), infer the naming convention and use it to generate a target list.
- DNS verification: Always use -n or -v to verify discovered hosts are live before reporting.
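The email format guideline can be sketched as a small pattern matcher. The example assumes you already know the first and last names behind a few harvested addresses (for instance from employee listings). The pattern set, `infer_pattern`, and `apply_pattern` are all hypothetical and cover only a handful of common conventions.

```python
from collections import Counter

# Candidate local-part formats (an illustrative, non-exhaustive set).
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "firstl": lambda f, l: f"{f}{l[0]}",
    "last.first": lambda f, l: f"{l}.{f}",
    "first": lambda f, l: f,
}


def infer_pattern(known):
    """Return the most common pattern name among (first, last, email) triples.

    Returns None when no address matches any candidate pattern.
    """
    counts = Counter()
    for first, last, email in known:
        first, last = first.strip().lower(), last.strip().lower()
        if not first or not last:
            continue                      # cannot match without both names
        local = email.split("@", 1)[0].lower()
        for name, build in PATTERNS.items():
            if build(first, last) == local:
                counts[name] += 1
                break
    return counts.most_common(1)[0][0] if counts else None


def apply_pattern(pattern, first, last, domain):
    """Build one candidate address using a pattern name from PATTERNS."""
    local = PATTERNS[pattern](first.strip().lower(), last.strip().lower())
    return f"{local}@{domain}"


if __name__ == "__main__":
    known = [("John", "Smith", "jsmith@corp.com"), ("Ann", "Lee", "alee@corp.com")]
    pattern = infer_pattern(known)
    print(pattern, apply_pattern(pattern, "Jane", "Doe", "corp.com"))
```

Counting matches across several known addresses, rather than trusting a single one, guards against organizations that mix conventions.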