string-database
Query STRING API for protein-protein interactions (59M proteins, 20B interactions). Network analysis, GO/KEGG enrichment, interaction discovery, 5000+ species, for systems biology.
Copy the command below and paste it into Terminal (Mac/Linux) or PowerShell (Windows). It downloads, unzips, and installs the Skill automatically.
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o string-database.zip https://jpskill.com/download/18558.zip && unzip -o string-database.zip && rm string-database.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/18558.zip -OutFile "$d\string-database.zip"; Expand-Archive "$d\string-database.zip" -DestinationPath $d -Force; ri "$d\string-database.zip"
When it finishes, restart Claude Code. The Skill then activates automatically whenever you make a related request in plain language.
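💾 Manual download (for those who find the command difficult)
- 1. Use the download button on this page to download string-database.zip
- 2. Double-click the ZIP file to unzip it; this creates the string-database folder
- 3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
- 4. Restart Claude Code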
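⚠️ Download and use at your own risk. This site accepts no responsibility for the content, operation, or safety of the Skill.
🎯 What this Skill can do
Read the description below to see what this Skill does for you. When you ask Claude for help in this area, it activates automatically.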
📦 How to install (3 steps)
- 1. Press the "Download" button above to get the .skill file
- 2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
- 3. Place the extracted folder in .claude/skills/ under your home folder
  - macOS / Linux: ~/.claude/skills/
  - Windows: %USERPROFILE%\.claude\skills\
Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.
- Last updated: 2026-05-18
- Retrieved: 2026-05-18
- Bundled files: 3
STRING Database
Overview
STRING is a comprehensive database of known and predicted protein-protein interactions covering 59M proteins and 20B+ interactions across 5000+ organisms. Query interaction networks, perform functional enrichment, discover partners via REST API for systems biology and pathway analysis.
When to Use This Skill
This skill should be used when:
- Retrieving protein-protein interaction networks for single or multiple proteins
- Performing functional enrichment analysis (GO, KEGG, Pfam) on protein lists
- Discovering interaction partners and expanding protein networks
- Testing if proteins form significantly enriched functional modules
- Generating network visualizations with evidence-based coloring
- Analyzing homology and protein family relationships
- Conducting cross-species protein interaction comparisons
- Identifying hub proteins and network connectivity patterns
Quick Start
The skill provides:
- Python helper functions (scripts/string_api.py) for all STRING REST API operations
- Comprehensive reference documentation (references/string_reference.md) with detailed API specifications
When users request STRING data, determine which operation is needed and use the appropriate function from scripts/string_api.py.
Core Operations
1. Identifier Mapping (string_map_ids)
Convert gene names, protein names, and external IDs to STRING identifiers.
When to use: Starting any STRING analysis, validating protein names, finding canonical identifiers.
Usage:
from scripts.string_api import string_map_ids
# Map single protein
result = string_map_ids('TP53', species=9606)
# Map multiple proteins
result = string_map_ids(['TP53', 'BRCA1', 'EGFR', 'MDM2'], species=9606)
# Map with multiple matches per query
result = string_map_ids('p53', species=9606, limit=5)
Parameters:
- species: NCBI taxon ID (9606 = human, 10090 = mouse, 7227 = fly)
- limit: Number of matches per identifier (default: 1)
- echo_query: Include query term in output (default: 1)
Best practice: Always map identifiers first for faster subsequent queries.
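The mapping result becomes easier to reuse once it is turned into a lookup table. The sketch below assumes the helper returns raw tab-separated text with the `queryItem` and `stringId` columns used by STRING's identifier-mapping output. The sample rows and the `mapping_to_dict` and `unmapped` helpers are illustrative and not part of scripts/string_api.py.

```python
import csv
import io

# Hypothetical sample of the TSV text a mapping call might return.
# Column names follow STRING's identifier-mapping output; rows are illustrative.
SAMPLE_MAPPING_TSV = (
    "queryIndex\tqueryItem\tstringId\tncbiTaxonId\ttaxonName\tpreferredName\tannotation\n"
    "0\tTP53\t9606.ENSP00000269305\t9606\tHomo sapiens\tTP53\tCellular tumor antigen p53\n"
    "1\tEGFR\t9606.ENSP00000275493\t9606\tHomo sapiens\tEGFR\tEpidermal growth factor receptor\n"
)


def mapping_to_dict(tsv_text):
    """Return {query term: STRING identifier}, keeping the first match per query."""
    mapping = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        mapping.setdefault(row["queryItem"], row["stringId"])
    return mapping


def unmapped(queries, mapping):
    """List the query terms that received no STRING identifier."""
    return [q for q in queries if q not in mapping]


ids = mapping_to_dict(SAMPLE_MAPPING_TSV)
print(ids)
print(unmapped(["TP53", "EGFR", "NOTAGENE"], ids))
```

Checking for unmapped names before running network or enrichment queries makes typos and wrong-species inputs visible early.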
2. Network Retrieval (string_network)
Get protein-protein interaction network data in tabular format.
When to use: Building interaction networks, analyzing connectivity, retrieving interaction evidence.
Usage:
from scripts.string_api import string_network
# Get network for single protein
network = string_network('9606.ENSP00000269305', species=9606)
# Get network with multiple proteins
proteins = ['9606.ENSP00000269305', '9606.ENSP00000275493']
network = string_network(proteins, required_score=700)
# Expand network with additional interactors
network = string_network('TP53', species=9606, add_nodes=10, required_score=400)
# Physical interactions only
network = string_network('TP53', species=9606, network_type='physical')
Parameters:
- required_score: Confidence threshold (0-1000)
  - 150: low confidence (exploratory)
  - 400: medium confidence (default, standard analysis)
  - 700: high confidence (conservative)
  - 900: highest confidence (very stringent)
- network_type: 'functional' (all evidence, default) or 'physical' (direct binding only)
- add_nodes: Add N most connected proteins (0-10)
Output columns: Interaction pairs, confidence scores, and individual evidence scores (neighborhood, fusion, coexpression, experimental, database, text-mining).
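The per-channel scores make it possible to keep only edges backed by a particular kind of evidence. The sketch below filters for experimental support. It assumes the network is returned as tab-separated text whose score columns are on a 0-1 scale, so the 0-1000 `required_score` is divided by 1000 before comparing. The sample rows, the protein name XYZ1, and the function name are illustrative assumptions.

```python
import csv
import io

# Illustrative rows only; XYZ1 is a made-up protein name.
SAMPLE_NETWORK_TSV = (
    "preferredName_A\tpreferredName_B\tscore\tescore\tdscore\ttscore\n"
    "TP53\tMDM2\t0.999\t0.994\t0.900\t0.997\n"
    "TP53\tATM\t0.995\t0.872\t0.900\t0.981\n"
    "TP53\tXYZ1\t0.450\t0.000\t0.000\t0.450\n"
)


def experimentally_supported_edges(tsv_text, required_score=400, min_escore=0.4):
    """Keep edges that pass the combined cutoff AND have experimental evidence."""
    combined_cutoff = required_score / 1000.0  # assumption: TSV scores are 0-1
    kept = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        escore = float(row["escore"])
        if float(row["score"]) >= combined_cutoff and escore >= min_escore:
            kept.append((row["preferredName_A"], row["preferredName_B"], escore))
    return kept


edges = experimentally_supported_edges(SAMPLE_NETWORK_TSV)
for a, b, escore in edges:
    print(f"{a} - {b}: experimental score {escore}")
```

The third sample edge passes the combined cutoff on text-mining alone and is dropped, which is the kind of case this filter is meant to catch.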
3. Network Visualization (string_network_image)
Generate network visualization as PNG image.
When to use: Creating figures, visual exploration, presentations.
Usage:
from scripts.string_api import string_network_image
# Get network image
proteins = ['TP53', 'MDM2', 'ATM', 'CHEK2', 'BRCA1']
img_data = string_network_image(proteins, species=9606, required_score=700)
# Save image
with open('network.png', 'wb') as f:
    f.write(img_data)
# Evidence-colored network
img = string_network_image(proteins, species=9606, network_flavor='evidence')
# Confidence-based visualization
img = string_network_image(proteins, species=9606, network_flavor='confidence')
# Actions network (activation/inhibition)
img = string_network_image(proteins, species=9606, network_flavor='actions')
Network flavors:
- 'evidence': Colored lines show evidence types (default)
- 'confidence': Line thickness represents confidence
- 'actions': Shows activating/inhibiting relationships
4. Interaction Partners (string_interaction_partners)
Find all proteins that interact with given protein(s).
When to use: Discovering novel interactions, finding hub proteins, expanding networks.
Usage:
from scripts.string_api import string_interaction_partners
# Get top 10 interactors of TP53
partners = string_interaction_partners('TP53', species=9606, limit=10)
# Get high-confidence interactors
partners = string_interaction_partners('TP53', species=9606,
                                       limit=20, required_score=700)
# Find interactors for multiple proteins
partners = string_interaction_partners(['TP53', 'MDM2'],
                                       species=9606, limit=15)
Parameters:
- limit: Maximum number of partners to return (default: 10)
- required_score: Confidence threshold (0-1000)
Use cases:
- Hub protein identification
- Network expansion from seed proteins
- Discovering indirect connections
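Hub identification usually comes down to counting how many distinct partners each protein has. The sketch below ranks proteins by degree from a list of interaction pairs. The edge list is illustrative and `degree_ranking` is a hypothetical helper, not a function from scripts/string_api.py.

```python
from collections import Counter

# Illustrative undirected edges, e.g. parsed from the partner or network output.
EDGES = [
    ("TP53", "MDM2"),
    ("TP53", "ATM"),
    ("TP53", "CHEK2"),
    ("TP53", "BRCA1"),
    ("ATM", "CHEK2"),
    ("BRCA1", "ATM"),
]


def degree_ranking(edge_list):
    """Return (protein, degree) pairs, highest degree first, ties by name."""
    seen = set()
    degree = Counter()
    for a, b in edge_list:
        key = frozenset((a, b))
        if a == b or key in seen:  # skip self-loops and duplicate pairs
            continue
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


ranking = degree_ranking(EDGES)
for protein, deg in ranking:
    print(protein, deg)
```

Degree is only one notion of a hub; within a small query set it mostly reflects which proteins were chosen as seeds.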
5. Functional Enrichment (string_enrichment)
Perform enrichment analysis across Gene Ontology, KEGG pathways, Pfam domains, and more.
When to use: Interpreting protein lists, pathway analysis, functional characterization, understanding biological processes.
Usage:
import io
import pandas as pd
from scripts.string_api import string_enrichment
# Enrichment for a protein list
proteins = ['TP53', 'MDM2', 'ATM', 'CHEK2', 'BRCA1', 'ATR', 'TP73']
enrichment = string_enrichment(proteins, species=9606)
# Parse the tab-separated result to find significant terms
df = pd.read_csv(io.StringIO(enrichment), sep='\t')
significant = df[df['fdr'] < 0.05]
Enrichment categories:
- Gene Ontology: Biological Process, Molecular Function, Cellular Component
- KEGG Pathways: Metabolic and signaling pathways
- Pfam: Protein domains
- InterPro: Protein families and domains
- SMART: Domain architecture
- UniProt Keywords: Curated functional keywords
Output columns:
- category: Annotation database (e.g., "KEGG Pathways", "GO Biological Process")
- term: Term identifier
- description: Human-readable term description
- number_of_genes: Input proteins with this annotation
- p_value: Uncorrected enrichment p-value
- fdr: False discovery rate (corrected p-value)
Statistical method: Fisher's exact test with Benjamini-Hochberg FDR correction.
Interpretation: FDR < 0.05 indicates statistically significant enrichment.
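To make the relationship between the `p_value` and `fdr` columns concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment named above. STRING computes the correction server-side; this standalone function only illustrates the procedure on made-up p-values and is not STRING's own code.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the minimum
    # of p * n / rank over this rank and all larger ranks (keeps monotonicity).
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
for p, q in zip(raw, fdr):
    print(f"p={p:<6} fdr={q:.3f} significant={q < 0.05}")
```

With four tests, the smallest p-value (0.005) is multiplied by 4/1 and the largest (0.04) by 4/4, which is why the adjusted values are never smaller than the raw ones.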
6. PPI Enrichment (string_ppi_enrichment)
Test if a protein network has significantly more interactions than expected by chance.
When to use: Validating whether proteins form a functional module, testing network connectivity.
Usage:
from scripts.string_api import string_ppi_enrichment
import json
# Test network connectivity
proteins = ['TP53', 'MDM2', 'ATM', 'CHEK2', 'BRCA1']
result = string_ppi_enrichment(proteins, species=9606, required_score=400)
# Parse JSON result (STRING's JSON output is typically a list of records)
data = json.loads(result)
if isinstance(data, list):
    data = data[0]
print(f"Observed edges: {data['number_of_edges']}")
print(f"Expected edges: {data['expected_number_of_edges']}")
print(f"P-value: {data['p_value']}")
Output fields:
- number_of_nodes: Proteins in network
- number_of_edges: Observed interactions
- expected_number_of_edges: Expected in random network
- p_value: Statistical significance
Interpretation:
- p-value < 0.05: Network is significantly enriched (proteins likely form a functional module)
- p-value ≥ 0.05: No significant enrichment (proteins may be unrelated)
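The interpretation rule above can be wrapped in a small function that also reports how far the observed edge count exceeds the expectation. The sketch below works on an already-parsed record with the output fields listed earlier. The sample numbers are invented and `summarize_ppi_enrichment` is a hypothetical helper.

```python
def summarize_ppi_enrichment(record, alpha=0.05):
    """Summarize a parsed PPI enrichment record (a dict of the output fields)."""
    observed = record["number_of_edges"]
    expected = record["expected_number_of_edges"]
    # The ratio is undefined when no edges are expected, so report None then.
    ratio = observed / expected if expected else None
    return {
        "observed": observed,
        "expected": expected,
        "observed_over_expected": ratio,
        "significant": record["p_value"] < alpha,
    }


# Invented values, for illustration only.
sample = {
    "number_of_nodes": 5,
    "number_of_edges": 10,
    "expected_number_of_edges": 2,
    "p_value": 1.2e-05,
}
summary = summarize_ppi_enrichment(sample)
print(summary)
```

Reporting the observed-to-expected ratio next to the p-value helps separate a strongly connected module from one that is only marginally denser than chance.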
7. Homology Scores (string_homology)
Retrieve protein similarity and homology information.
When to use: Identifying protein families, paralog analysis, cross-species comparisons.
Usage:
from scripts.string_api import string_homology
# Get homology between proteins
proteins = ['TP53', 'TP63', 'TP73'] # p53 family
homology = string_homology(proteins, species=9606)
Use cases:
- Protein family identification
- Paralog discovery
- Evolutionary analysis
8. Version Information (string_version)
Get current STRING database version.
When to use: Ensuring reproducibility, documenting methods.
Usage:
from scripts.string_api import string_version
version = string_version()
print(f"STRING version: {version}")
Common Analysis Workflows
Workflow 1: Protein List Analysis (Standard Workflow)
Use case: Analyze a list of proteins from an experiment (e.g., differential expression, proteomics).
from scripts.string_api import (string_map_ids, string_network,
                                string_enrichment, string_ppi_enrichment,
                                string_network_image)
# Step 1: Map gene names to STRING IDs
gene_list = ['TP53', 'BRCA1', 'ATM', 'CHEK2', 'MDM2', 'ATR', 'BRCA2']
mapping = string_map_ids(gene_list, species=9606)
# Step 2: Get interaction network
network = string_network(gene_list, species=9606, required_score=400)
# Step 3: Test if network is enriched
ppi_result = string_ppi_enrichment(gene_list, species=9606)
# Step 4: Perform functional enrichment
enrichment = string_enrichment(gene_list, species=9606)
# Step 5: Generate network visualization
img = string_network_image(gene_list, species=9606,
                           network_flavor='evidence', required_score=400)
with open('protein_network.png', 'wb') as f:
    f.write(img)
# Step 6: Parse and interpret results
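Step 6 is left open in the workflow above. One way to fill it in is to list the most significant enrichment terms, sketched below under the assumption that the enrichment result is tab-separated text with the `category`, `description`, and `fdr` columns described earlier. The sample table and `top_terms` are illustrative, and the numeric values are invented.

```python
import csv
import io

# Invented values for illustration; column names follow the output columns above.
SAMPLE_ENRICHMENT_TSV = (
    "category\tterm\tdescription\tnumber_of_genes\tp_value\tfdr\n"
    "KEGG Pathways\thsa04115\tp53 signaling pathway\t5\t3.0e-08\t1.5e-06\n"
    "GO Biological Process\tGO:0006281\tDNA repair\t6\t1.0e-09\t2.0e-07\n"
    "GO Biological Process\tGO:0008150\tbiological_process\t7\t0.40\t0.62\n"
)


def top_terms(tsv_text, max_fdr=0.05, limit=10):
    """Return (category, description, fdr) for significant terms, best first."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = [
        (float(r["fdr"]), r["category"], r["description"])
        for r in rows
        if float(r["fdr"]) < max_fdr
    ]
    hits.sort()  # ascending FDR, so the strongest terms come first
    return [(category, description, fdr) for fdr, category, description in hits[:limit]]


terms = top_terms(SAMPLE_ENRICHMENT_TSV)
for category, description, fdr in terms:
    print(f"{category}: {description} (FDR {fdr:.1e})")
```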
Workflow 2: Single Protein Investigation
Use case: Deep dive into one protein's interactions and partners.
from scripts.string_api import (string_map_ids, string_interaction_partners,
                                string_network_image)
# Step 1: Map protein name
protein = 'TP53'
mapping = string_map_ids(protein, species=9606)
# Step 2: Get all interaction partners
partners = string_interaction_partners(protein, species=9606,
                                       limit=20, required_score=700)
# Step 3: Visualize expanded network
img = string_network_image(protein, species=9606, add_nodes=15,
                           network_flavor='confidence', required_score=700)
with open('tp53_network.png', 'wb') as f:
    f.write(img)
Workflow 3: Pathway-Centric Analysis
Use case: Identify and visualize proteins in a specific biological pathway.
from scripts.string_api import string_enrichment, string_network
# Step 1: Start with known pathway proteins
dna_repair_proteins = ['TP53', 'ATM', 'ATR', 'CHEK1', 'CHEK2',
                       'BRCA1', 'BRCA2', 'RAD51', 'XRCC1']
# Step 2: Get network
network = string_network(dna_repair_proteins, species=9606,
                         required_score=700, add_nodes=5)
# Step 3: Enrichment to confirm pathway annotation
enrichment = string_enrichment(dna_repair_proteins, species=9606)
# Step 4: Parse enrichment for DNA repair pathways
import pandas as pd
import io
df = pd.read_csv(io.StringIO(enrichment), sep='\t')
dna_repair = df[df['description'].str.contains('DNA repair', case=False)]
Workflow 4: Cross-Species Analysis
Use case: Compare protein interactions across different organisms.
from scripts.string_api import string_network
# Human network
human_network = string_network('TP53', species=9606, required_score=700)
# Mouse network
mouse_network = string_network('Trp53', species=10090, required_score=700)
# Yeast network (if ortholog exists)
yeast_network = string_network('gene_name', species=4932, required_score=700)
Workflow 5: Network Expansion and Discovery
Use case: Start with seed proteins and discover connected functional modules.
from scripts.string_api import (string_interaction_partners, string_network,
                                string_enrichment)
# Step 1: Start with seed protein(s)
seed_proteins = ['TP53']
# Step 2: Get first-degree interactors
partners = string_interaction_partners(seed_proteins, species=9606,
                                       limit=30, required_score=700)
# Step 3: Parse partners to get protein list
import pandas as pd
import io
df = pd.read_csv(io.StringIO(partners), sep='\t')
all_proteins = list(set(df['preferredName_A'].tolist() +
                        df['preferredName_B'].tolist()))
# Step 4: Perform enrichment on expanded network
enrichment = string_enrichment(all_proteins[:50], species=9606)
# Step 5: Filter for interesting functional modules
enrichment_df = pd.read_csv(io.StringIO(enrichment), sep='\t')
modules = enrichment_df[enrichment_df['fdr'] < 0.001]
Common Species
When specifying species, use NCBI taxon IDs:
| Organism | Common Name | Taxon ID |
|---|---|---|
| Homo sapiens | Human | 9606 |
| Mus musculus | Mouse | 10090 |
| Rattus norvegicus | Rat | 10116 |
| Drosophila melanogaster | Fruit fly | 7227 |
| Caenorhabditis elegans | C. elegans | 6239 |
| Saccharomyces cerevisiae | Yeast | 4932 |
| Arabidopsis thaliana | Thale cress | 3702 |
| Escherichia coli K-12 MG1655 | E. coli | 511145 |
| Danio rerio | Zebrafish | 7955 |
Full list available at: https://string-db.org/cgi/input?input_page_active_form=organisms
Understanding Confidence Scores
STRING provides combined confidence scores (0-1000) integrating multiple evidence types:
Evidence Channels
- Neighborhood (nscore): Conserved genomic neighborhood across species
- Fusion (fscore): Gene fusion events
- Phylogenetic Profile (pscore): Co-occurrence patterns across species
- Coexpression (ascore): Correlated RNA expression
- Experimental (escore): Biochemical and genetic experiments
- Database (dscore): Curated pathway and complex databases
- Text-mining (tscore): Literature co-occurrence and NLP extraction
Recommended Thresholds
Choose threshold based on analysis goals:
- 150 (low confidence): Exploratory analysis, hypothesis generation
- 400 (medium confidence): Standard analysis, balanced sensitivity/specificity
- 700 (high confidence): Conservative analysis, high-confidence interactions
- 900 (highest confidence): Very stringent, experimental evidence preferred
Trade-offs:
- Lower thresholds: More interactions (higher recall, more false positives)
- Higher thresholds: Fewer interactions (higher precision, more false negatives)
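The trade-off is easy to see by counting how many edges survive each recommended cutoff. The sketch below uses invented combined scores on the 0-1000 scale; `edges_per_threshold` is a hypothetical helper for illustration.

```python
# Recommended cutoffs from the list above.
THRESHOLDS = {"low": 150, "medium": 400, "high": 700, "highest": 900}


def edges_per_threshold(scores, thresholds=THRESHOLDS):
    """Count how many combined scores meet or exceed each cutoff."""
    return {
        label: sum(1 for score in scores if score >= cutoff)
        for label, cutoff in thresholds.items()
    }


# Invented combined scores (0-1000) for eight edges.
sample_scores = [950, 910, 720, 650, 430, 410, 200, 160]
counts = edges_per_threshold(sample_scores)
print(counts)
```

In this toy set, moving from the medium to the high cutoff halves the number of edges, so the choice of threshold can change downstream hub and enrichment results considerably.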
Network Types
Functional Networks (Default)
Includes all evidence types (experimental, computational, text-mining). Represents proteins that are functionally associated, even without direct physical binding.
When to use:
- Pathway analysis
- Functional enrichment studies
- Systems biology
- Most general analyses
Physical Networks
Only includes evidence for direct physical binding (experimental data and database annotations for physical interactions).
When to use:
- Structural biology studies
- Protein complex analysis
- Direct binding validation
- When physical contact is required
API Best Practices
- Always map identifiers first: Use string_map_ids() before other operations for faster queries
- Use STRING IDs when possible: Use format 9606.ENSP00000269305 instead of gene names
- Specify species for networks >10 proteins: Required for accurate results
- Respect rate limits: Wait 1 second between API calls
- Use versioned URLs for reproducibility: Available in reference documentation
- Handle errors gracefully: Check for "Error:" prefix in returned strings
- Choose appropriate confidence thresholds: Match threshold to analysis goals
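Two of these practices, spacing out calls and checking for the "Error:" prefix, can be combined in a thin wrapper. The sketch below is one possible shape for it. `RateLimitedCaller` and `fake_string_call` are hypothetical names; the fake function stands in for any helper from scripts/string_api.py so the example runs without network access.

```python
import time


class RateLimitedCaller:
    """Call functions no more often than once per `min_interval` seconds."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last_call = None

    def call(self, func, *args, **kwargs):
        # Sleep only for the part of the interval that has not yet elapsed.
        if self._last_call is not None:
            wait = self.min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
        try:
            result = func(*args, **kwargs)
        finally:
            self._last_call = time.monotonic()
        # Turn the documented "Error:" string convention into an exception.
        if isinstance(result, str) and result.startswith("Error:"):
            raise RuntimeError(result)
        return result


def fake_string_call(name):
    """Stand-in for a STRING helper; returns an error string for 'bad'."""
    return "Error: not found" if name == "bad" else f"data for {name}"


caller = RateLimitedCaller(min_interval=1.0)
print(caller.call(fake_string_call, "TP53"))
```

Raising an exception instead of returning the error string means a failed call cannot be passed silently into a TSV or JSON parser.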
Detailed Reference
For comprehensive API documentation, complete parameter lists, output formats, and advanced usage, refer to references/string_reference.md. This includes:
- Complete API endpoint specifications
- All supported output formats (TSV, JSON, XML, PSI-MI)
- Advanced features (bulk upload, values/ranks enrichment)
- Error handling and troubleshooting
- Integration with other tools (Cytoscape, R, Python libraries)
- Data license and citation information
Troubleshooting
No proteins found:
- Verify species parameter matches identifiers
- Try mapping identifiers first with string_map_ids()
- Check for typos in protein names
Empty network results:
- Lower confidence threshold (required_score)
- Check if proteins actually interact
- Verify species is correct
Timeout or slow queries:
- Reduce number of input proteins
- Use STRING IDs instead of gene names
- Split large queries into batches
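Splitting a long protein list into batches can be done with a small generator, sketched below. `batches` is a hypothetical helper and the identifiers are placeholders.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder identifiers standing in for a long protein list.
protein_ids = [f"protein_{i}" for i in range(7)]
chunks = list(batches(protein_ids, 3))
for chunk in chunks:
    print(chunk)
```

Batching suits per-protein operations such as identifier mapping. For network queries it drops any interaction whose two proteins land in different batches, so results from batched network calls are not equivalent to one large query.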
"Species required" error:
- Add species parameter for networks with >10 proteins
- Always include species for consistency
Results look unexpected:
- Check STRING version with string_version()
- Verify network_type is appropriate (functional vs physical)
- Review confidence threshold selection
Additional Resources
For proteome-scale analysis or complete species network upload:
- Visit https://string-db.org
- Use "Upload proteome" feature
- STRING will generate complete interaction network and predict functions
For bulk downloads of complete datasets:
- Download page: https://string-db.org/cgi/download
- Includes complete interaction files, protein annotations, and pathway mappings
Data License
STRING data is freely available under Creative Commons BY 4.0 license:
- Free for academic and commercial use
- Attribution required when publishing
- Cite latest STRING publication
Citation
When using STRING in publications, cite the most recent publication from: https://string-db.org/cgi/about
Bundled files
Note: List of files included in the ZIP. In addition to `SKILL.md` itself, it may include reference material, samples, and scripts.
- 📄 SKILL.md (18,168 bytes)
- 📎 references/string_reference.md (13,697 bytes)
- 📎 scripts/string_api.py (11,927 bytes)