🛠️ Adme Property Predictor
新薬候補が体内でどのように吸収・
📺 まず動画で見る(YouTube)
▶ 【衝撃】最強のAIエージェント「Claude Code」の最新機能・使い方・プログラミングをAIで効率化する超実践術を解説! ↗
※ jpskill.com 編集部が参考用に選んだ動画です。動画の内容と Skill の挙動は厳密には一致しないことがあります。
📜 元の英語説明(参考)
Predict ADME (Absorption, Distribution, Metabolism, Excretion) properties for drug candidates using cheminformatics models and molecular descriptors. Evaluates drug-likeness, bioavailability, and pharmacokinetic profile to guide lead optimization and candidate selection in drug discovery.
🇯🇵 日本人クリエイター向け解説
新薬候補が体内でどのように吸収・
※ jpskill.com 編集部が日本のビジネス現場向けに補足した解説です。Skill本体の挙動とは独立した参考情報です。
下記のコマンドをコピーしてターミナル(Mac/Linux)または PowerShell(Windows)に貼り付けてください。 ダウンロード → 解凍 → 配置まで全自動。
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o adme-property-predictor.zip https://jpskill.com/download/4278.zip && unzip -o adme-property-predictor.zip && rm adme-property-predictor.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/4278.zip -OutFile "$d\adme-property-predictor.zip"; Expand-Archive "$d\adme-property-predictor.zip" -DestinationPath $d -Force; ri "$d\adme-property-predictor.zip"
完了後、Claude Code を再起動 → 普通に「動画プロンプト作って」のように話しかけるだけで自動発動します。
💾 手動でダウンロードしたい(コマンドが難しい人向け)
- 1. 下の青いボタンを押して
adme-property-predictor.zipをダウンロード - 2. ZIPファイルをダブルクリックで解凍 →
adme-property-predictorフォルダができる - 3. そのフォルダを
C:\Users\あなたの名前\.claude\skills\(Win)または~/.claude/skills/(Mac)へ移動 - 4. Claude Code を再起動
⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。
🎯 このSkillでできること
下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。
📦 インストール方法 (3ステップ)
- 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
- 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
- 3. 展開してできたフォルダを、ホームフォルダの
.claude/skills/に置く- · macOS / Linux:
~/.claude/skills/ - · Windows:
%USERPROFILE%\.claude\skills\
- · macOS / Linux:
Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。
詳しい使い方ガイドを見る →- 最終更新
- 2026-05-17
- 取得日時
- 2026-05-18
- 同梱ファイル
- 2
💬 こう話しかけるだけ — サンプルプロンプト
- › Adme Property Predictor を使って、最小構成のサンプルコードを示して
- › Adme Property Predictor の主な使い方と注意点を教えて
- › Adme Property Predictor を既存プロジェクトに組み込む方法を教えて
これをClaude Code に貼るだけで、このSkillが自動発動します。
📖 Skill本文(日本語訳)
※ 原文(英語/中国語)を Gemini で日本語化したものです。Claude 自身は原文を読みます。誤訳がある場合は原文をご確認ください。
[スキル名] adme-property-predictor
ADME特性予測ツール
概要
検証済みのケモインフォマティクスモデル、分子記述子、および構造-特性関係を用いて、低分子のドラッグライクネスとADME特性を評価する包括的な薬物動態予測ツールです。
主な機能:
- 多特性予測: 吸収、分布、代謝、排泄
- ドラッグライクネススコアリング: リピンスキーの5つの法則、ヴェバーの法則、QEDスコア
- バッチ処理: 化合物ライブラリを効率的に分析
- 構造に基づく洞察: 問題点となるホットスポットと最適化の機会を特定
- 比較分析: 予測されるPKプロファイルに基づいて候補をランク付け
使用場面
✅ このスキルを使用する場面:
- 初期探索段階で、ドラッグライクな特性を持つ化合物ライブラリをスクリーニングする場合
- 予測されるPKに基づいて、リード化合物を優先順位付けして開発を進める場合
- 構造最適化が必要なADME上の問題点を特定する場合
- 最適なADMEプロファイルを持つ候補を選択するためにアナログを比較する場合
- 合成前にバーチャルスクリーニングのヒットをフィルタリングする場合
- 規制当局への事前提出パッケージ用のADMEデータを生成する場合
- 薬物動態学と薬物設計の原則を教育する場合
❌ このスキルを使用しない場面:
- 投与に必要な正確なPKパラメータが必要な場合 → 実験的なPK研究を使用してください
- 生物製剤(抗体、タンパク質)の場合 →
antibody-pk-predictorを使用してください - 複雑な構造を持つ天然物の場合 → 合成低分子でトレーニングされたモデルを使用してください
- 代謝活性化を必要とするプロドラッグの場合 →
prodrug-activation-predictorを使用してください - 臨床投与決定のための予測の場合 → 重要: 実験的検証が必要です
- 毒性または安全性を評価する場合 →
toxicity-structure-alertまたはadmetox-predictorを使用してください
関連スキル:
- 上流:
chemical-structure-converter(構造準備)、lipinski-rule-filter(ルールベースのフィルタリング) - 下流:
drug-candidate-evaluator(統合スコアリング)、molecular-dynamics-sim(詳細な結合)
他のスキルとの統合
上流スキル:
chemical-structure-converter: SMILES、InChI、MOL形式間の変換lipinski-rule-filter: 初期ルールベースのドラッグライクネススクリーニングchemical-structure-converter: 構造ベースの予測のために3Dコンフォマーを生成smiles-de-salter: 分析前に塩の対イオンを除去
下流スキル:
drug-candidate-evaluator: ADMEを含む多パラメータ最適化toxicity-structure-alert: ADMEと並行して安全性を評価target-novelty-scorer: 選択された候補のターゲットの独自性を評価biotech-pitch-deck-narrative: PKデータを含む投資家向け資料を作成
完全なワークフロー:
Chemical Structure Converter (prepare structures) →
Lipinski Rule Filter (initial filtering) →
ADME Property Predictor (this skill, detailed PK) →
Drug Candidate Evaluator (integrated scoring) →
Toxicity Structure Alert (safety check)
コア機能
1. 吸収 (A) 予測
腸管吸収、溶解度、透過性を予測します。
from scripts.adme_predictor import ADMEPredictor
predictor = ADMEPredictor()
# Predict absorption properties
absorption = predictor.predict_absorption(
smiles="CC(=O)Oc1ccccc1C(=O)O", # Aspirin
properties=["all"] # or specific: ["hia", "caco2", "solubility"]
)
print(absorption.summary())
予測される特性: | 特性 | モデル | 単位 | 解釈 | |----------|-------|-------|----------------| | HIA | ML + 物理化学 | % | ヒト腸管吸収; >80%で良好 | | Caco-2 | QSPR | 10⁻⁶ cm/s | 透過性; >70で高、<25で低 | | Solubility | QSPR | mg/mL | 水溶性; >0.1 mg/mLで許容 | | LogS | QSPR | 無単位 | 固有溶解度; >-4で許容 | | Lipinski Pass | ルールベース | ブール値 | 5つのルールすべてに合格 | | Veber Pass | ルールベース | ブール値 | PSA <140、回転結合 <10 |
ベストプラクティス:
- ✅ HIAと溶解度を合わせて考慮する(HIAが高くても溶解度が低い場合は溶解律速)
- ✅ Caco-2は経口吸収予測に優れていますが、BBB透過性には不向きです
- ✅ コンセンサスを得るために、ルールベース(リピンスキー)とMLベースの両方の予測を使用します
- ✅ 生理的pHでの溶解度を確認します(固有溶解度だけでなく)
よくある問題と解決策:
問題: リピンスキーに合格するが溶解度が低い
- 症状: 「5つの法則に合格するが、LogS = -5」
- 解決策: リピンスキーはMWとLogPをチェックし、溶解度を直接チェックしません。明示的な溶解度予測を使用してください。
問題: Caco-2は高い吸収を予測するが、HIAが低い
- 症状: 「Caco-2 = 85(高)だが、HIA = 60%」
- 解決策: モデルは異なるトレーニングセットを持っています。Caco-2はin vitro、HIAはin vivoです。HIAの方が一般的に信頼性が高いです。
2. 分布 (D) 予測
組織分布、タンパク質結合、脳透過性を予測します。
# Predict distribution properties
distribution = predictor.predict_distribution(
smiles="CC(=O)Oc1ccccc1C(=O)O",
properties=["vd", "ppb", "bbb"]
)
# Access specific predictions
vd = distribution.volume_of_distribution
bbb = distribution.blood_brain_barrier
ppb = distribution.plasma_protein_binding
予測される特性: | 特性 | モデル | 単位 | 解釈 | |----------|-------|-------|----------------| | Vd | QSPR | L/kg | 分布容積; 0.1-10が典型的 | | PPB | ML | % | 血漿タンパク質結合; >90%で高、<50%で低 | | BBB | LogBB | 無単位 | 脳透過性; >0.3で透過性あり | | fu | 計算値 | 分数 | 遊離(非結合)画分; 1 - PPB/100 |
ベストプラクティス:
- ✅ 高いPPB(>90%)は高用量を必要とする可能性がありますが、半減期は長くなります
- ✅ 低いVd(<0.3)は主に血漿中、高いVd(>3)は広範な組織分布を示します
- ✅ BBB透過性はCNS薬にとって重要ですが、末梢作用薬では避けるべきです
- ✅ fu(遊離画分)が薬理活性を駆動し、総濃度ではありません
よくある問題と解決策:
問題: 特定のケモタイプに対するBBB予測が信頼できない
- 症状: 「BBBモデルはペプチドに対して矛盾する予測を与える」
- 解決策: モデルは低分子でトレーニングされています。ペプチド、マクロサイクルには専門のBBB予測ツールを使用してください。
📜 原文 SKILL.md(Claudeが読む英語/中国語)を展開
ADME Property Predictor
Overview
Comprehensive pharmacokinetic prediction tool that assesses drug-likeness and ADME properties of small molecules using validated cheminformatics models, molecular descriptors, and structure-property relationships.
Key Capabilities:
- Multi-Property Prediction: Absorption, Distribution, Metabolism, Excretion
- Drug-Likeness Scoring: Lipinski's Rule of 5, Veber rules, QED score
- Batch Processing: Analyze compound libraries efficiently
- Structure-Based Insights: Identify liability hotspots and optimization opportunities
- Comparative Analysis: Rank candidates by predicted PK profile
When to Use
✅ Use this skill when:
- Screening compound libraries for drug-like properties in early discovery
- Prioritizing lead compounds for advancement based on predicted PK
- Identifying ADME liabilities requiring structural optimization
- Comparing analogs to select candidates with optimal ADME profiles
- Filtering virtual screening hits before synthesis
- Generating ADME data for regulatory pre-submission packages
- Teaching pharmacokinetics and drug design principles
❌ Do NOT use when:
- Exact PK parameters needed for dosing → Use experimental PK studies
- Biologics (antibodies, proteins) → Use
antibody-pk-predictor - Natural products with complex structures → Models trained on synthetic small molecules
- Prodrugs requiring metabolic activation → Use
prodrug-activation-predictor - Prediction for clinical dosing decisions → CRITICAL: Experimental validation required
- Assessing toxicity or safety → Use
toxicity-structure-alertoradmetox-predictor
Related Skills:
- 上游:
chemical-structure-converter(structure preparation),lipinski-rule-filter(rule-based filtering) - 下游:
drug-candidate-evaluator(integrated scoring),molecular-dynamics-sim(detailed binding)
Integration with Other Skills
Upstream Skills:
chemical-structure-converter: Convert between SMILES, InChI, MOL formatslipinski-rule-filter: Initial rule-based drug-likeness screeningchemical-structure-converter: Generate 3D conformers for structure-based predictionssmiles-de-salter: Remove salt counterions before analysis
Downstream Skills:
drug-candidate-evaluator: Multi-parameter optimization including ADMEtoxicity-structure-alert: Assess safety alongside ADMEtarget-novelty-scorer: Evaluate target uniqueness for selected candidatesbiotech-pitch-deck-narrative: Create investor materials with PK data
Complete Workflow:
Chemical Structure Converter (prepare structures) →
Lipinski Rule Filter (initial filtering) →
ADME Property Predictor (this skill, detailed PK) →
Drug Candidate Evaluator (integrated scoring) →
Toxicity Structure Alert (safety check)
Core Capabilities
1. Absorption (A) Prediction
Predict intestinal absorption, solubility, and permeability:
from scripts.adme_predictor import ADMEPredictor
predictor = ADMEPredictor()
# Predict absorption properties
absorption = predictor.predict_absorption(
smiles="CC(=O)Oc1ccccc1C(=O)O", # Aspirin
properties=["all"] # or specific: ["hia", "caco2", "solubility"]
)
print(absorption.summary())
Predicted Properties: | Property | Model | Units | Interpretation | |----------|-------|-------|----------------| | HIA | ML + physicochemical | % | Human intestinal absorption; >80% good | | Caco-2 | QSPR | 10⁻⁶ cm/s | Permeability; >70 high, <25 low | | Solubility | QSPR | mg/mL | Aqueous solubility; >0.1 mg/mL acceptable | | LogS | QSPR | unitless | Intrinsic solubility; >-4 acceptable | | Lipinski Pass | Rule-based | boolean | Passes all 5 rules | | Veber Pass | Rule-based | boolean | PSA <140, rotatable bonds <10 |
Best Practices:
- ✅ Consider HIA and solubility together (high HIA but low solubility = dissolution-limited)
- ✅ Caco-2 good for oral absorption prediction; poor for BBB penetration
- ✅ Use both rule-based (Lipinski) and ML-based predictions for consensus
- ✅ Check solubility at physiological pH (not just intrinsic)
Common Issues and Solutions:
Issue: Lipinski pass but poor solubility
- Symptom: "Passes Rule of 5 but LogS = -5"
- Solution: Lipinski checks MW and LogP, not solubility directly; use explicit solubility prediction
Issue: Caco-2 predicts high absorption but HIA low
- Symptom: "Caco-2 = 85 (high) but HIA = 60%"
- Solution: Models have different training sets; Caco-2 is in vitro, HIA in vivo; HIA generally more reliable
2. Distribution (D) Prediction
Predict tissue distribution, protein binding, and brain penetration:
# Predict distribution properties
distribution = predictor.predict_distribution(
smiles="CC(=O)Oc1ccccc1C(=O)O",
properties=["vd", "ppb", "bbb"]
)
# Access specific predictions
vd = distribution.volume_of_distribution
bbb = distribution.blood_brain_barrier
ppb = distribution.plasma_protein_binding
Predicted Properties: | Property | Model | Units | Interpretation | |----------|-------|-------|----------------| | Vd | QSPR | L/kg | Volume of distribution; 0.1-10 typical | | PPB | ML | % | Plasma protein binding; >90% high, <50% low | | BBB | LogBB | unitless | Brain penetration; >0.3 penetrant | | fu | Calculated | fraction | Free (unbound) fraction; 1 - PPB/100 |
Best Practices:
- ✅ High PPB (>90%) may require higher doses but longer half-life
- ✅ Low Vd (<0.3) = mainly in plasma; high Vd (>3) = extensive tissue distribution
- ✅ BBB penetration critical for CNS drugs; avoid for peripherally-acting drugs
- ✅ fu (free fraction) drives pharmacological activity, not total concentration
Common Issues and Solutions:
Issue: BBB predictions unreliable for certain chemotypes
- Symptom: "BBB model gives conflicting predictions for peptides"
- Solution: Models trained on small molecules; use specialized BBB predictors for peptides, macrocycles
Issue: PPB overestimated for acidic drugs
- Symptom: "PPB predicted 95% but experimental is 70%"
- Solution: Some models biased toward neutral/basic compounds; check model training set overlap
3. Metabolism (M) Prediction
Predict metabolic stability, CYP interactions, and liability sites:
# Predict metabolism properties
metabolism = predictor.predict_metabolism(
smiles="CC(=O)Oc1ccccc1C(=O)O",
include_site_prediction=True
)
# Check CYP interactions
cyp_profile = metabolism.cyp_profile
stability = metabolism.metabolic_stability
Predicted Properties: | Property | Model | Output | Interpretation | |----------|-------|--------|----------------| | CYP Inhibition | ML | IC50 or class | Potential DDI; <1 μM high risk | | CYP Substrate | Classification | Boolean/Probability | Metabolized by specific CYP | | Stability | ML | T1/2 or class | Microsomal/ hepatocyte stability | | Liability Sites | Reactivity models | Atom indices | Soft spots for metabolism | | MAO Substrate | Classification | Boolean | Monoamine oxidase substrate |
Best Practices:
- ✅ Screen for CYP3A4 inhibition early (most common DDI)
- ✅ Check if compound is CYP substrate (for polymorphism concerns)
- ✅ Identify metabolic hotspots for structural blocking
- ✅ Consider species differences (human vs rodent metabolism)
Common Issues and Solutions:
Issue: False negatives for time-dependent inhibition (TDI)
- Symptom: "No CYP inhibition predicted but TDI observed experimentally"
- Solution: Standard models predict reversible inhibition; use specialized TDI predictors
Issue: Metabolic site prediction shows multiple hotspots
- Symptom: "5 different atoms flagged as metabolic liabilities"
- Solution: Prioritize by reactivity score; consider blocking highest-risk site first
4. Excretion (E) Prediction
Predict clearance routes and elimination kinetics:
# Predict excretion properties
excretion = predictor.predict_excretion(
smiles="CC(=O)Oc1ccccc1C(=O)O",
properties=["clearance", "half_life", "route"]
)
# Access predictions
clearance = excretion.clearance_ml_min_kg
t12 = excretion.half_life_hours
route = excretion.primary_route
Predicted Properties: | Property | Model | Units | Interpretation | |----------|-------|-------|----------------| | CL | QSPR | mL/min/kg | Clearance; <5 low, 5-15 moderate, >15 high | | T1/2 | QSPR | hours | Half-life; 2-8h typical for oral drugs | | Route | Classification | renal/biliary/mixed | Primary excretion pathway | | LogD | QSPR | unitless | Distribution coefficient; affects clearance |
Best Practices:
- ✅ Half-life determines dosing frequency (T1/2 × 5 = time to steady state)
- ✅ Renal clearance predictable for polar compounds; hepatic less predictable
- ✅ High clearance (>15) may require high doses or prodrug approach
- ✅ Very long T1/2 (>24h) good for adherence but risk accumulation
Common Issues and Solutions:
Issue: Clearance predictions highly variable
- Symptom: "Same compound, different models give CL = 5 vs 20 mL/min/kg"
- Solution: Allometry-based methods unreliable for novel scaffolds; use average of multiple models
Issue: Route prediction contradicts structure
- Symptom: "Highly polar compound predicted biliary, expected renal"
- Solution: Check LogP/LogD; polar compounds (<0) usually renal; neutral/lipophilic (>1) usually hepatic
5. Integrated Drug-Likeness Scoring
Overall assessment combining all ADME properties:
# Generate comprehensive drug-likeness score
druglikeness = predictor.calculate_druglikeness(
smiles="CC(=O)Oc1ccccc1C(=O)O",
methods=["qed", "muegge", "golden_triangle"]
)
# Multi-parameter optimization
mpo_score = predictor.mpo_score(
smiles="CC(=O)Oc1ccccc1C(=O)O",
target_profile={"hia": >80, "bbb": <0.3, "t12": "2-8h"}
)
Scoring Methods: | Method | Description | Range | Good Score | |--------|-------------|-------|------------| | QED | Quantitative Estimation of Drug-likeness | 0-1 | >0.6 | | Muegge | Bioavailability score | 0-6 | >4 | | MPO | Multi-Parameter Optimization | 0-10 | >6 |
Best Practices:
- ✅ Use QED as quick overall metric; MPO for property-weighted scoring
- ✅ Don't rely solely on drug-likeness; efficacy and safety equally important
- ✅ Compare to marketed drugs in same class for context
- ✅ Track drug-likeness trends during optimization (should improve)
Common Issues and Solutions:
Issue: Drug-likeness score conflicts with project needs
- Symptom: "CNS drug has low QED (0.5) because high LogP needed for BBB"
- Solution: Drug-likeness rules biased toward oral drugs; use category-specific models (CNS, oncology, etc.)
6. Batch Processing and Library Screening
Analyze compound libraries efficiently:
# Batch process library
results = predictor.batch_predict(
input_file="library.smi", # SMILES file
properties=["all"],
output_format="csv",
n_workers=4 # Parallel processing
)
# Filter by criteria
filtered = results.filter(
lipinski_pass=True,
hia__gt=80,
t12__between=(2, 8)
)
# Rank by multi-parameter score
ranked = results.rank(by="mpo_score", ascending=False)
Best Practices:
- ✅ Process in batches of 1000-10000 for memory efficiency
- ✅ Save intermediate results (crash recovery)
- ✅ Apply filters sequentially (Lipinski first, then detailed ADME)
- ✅ Check property distributions to identify outliers
Common Issues and Solutions:
Issue: Batch processing runs out of memory
- Symptom: "Killed: Out of memory" with 50K compounds
- Solution: Process in chunks; use generators instead of loading all into RAM
Issue: Some compounds fail prediction
- Symptom: "30% of library returns NaN"
- Solution: Check for invalid SMILES, unusual atoms, or molecules outside training set domain
Complete Workflow Example
From SMILES to prioritized candidates:
# Step 1: Predict ADME for single compound
python scripts/main.py \
--smiles "CC(=O)Oc1ccccc1C(=O)O" \
--properties all \
--output aspirin_adme.json
# Step 2: Batch process compound library
python scripts/main.py \
--input library.smi \
--properties absorption,distribution \
--format csv \
--output library_adme.csv
# Step 3: Filter and rank
python scripts/main.py \
--input library_adme.csv \
--filter "lipinski_pass=True,hia>80" \
--rank-by qed \
--top-n 100 \
--output top_candidates.csv
Python API Usage:
from scripts.adme_predictor import ADMEPredictor
from scripts.batch_processor import BatchProcessor
# Initialize
predictor = ADMEPredictor()
batch = BatchProcessor()
# Single compound analysis
aspirin = predictor.predict_all("CC(=O)Oc1ccccc1C(=O)O")
print(f"HIA: {aspirin.absorption.hia}%")
print(f"Half-life: {aspirin.excretion.t12} hours")
# Batch screening
results = batch.process(
input_file="library.smi",
predictor=predictor,
properties=["absorption", "distribution"],
n_workers=4
)
# Filter good candidates
good_candidates = results[
(results.lipinski_pass == True) &
(results.hia > 80) &
(results.bbb < 0.3) &
(results.t12.between(2, 8))
]
Expected Output Files:
output/
├── aspirin_adme.json # Single compound detailed results
├── library_adme.csv # Batch screening results
├── top_candidates.csv # Filtered and ranked candidates
Quality Checklist
Pre-Prediction Checks:
- [ ] SMILES string is valid and canonical
- [ ] Salt forms removed (if analyzing parent compound)
- [ ] Tautomeric state appropriate for physiological pH
- [ ] Stereochemistry specified (if relevant for activity)
During Prediction:
- [ ] Compound within model applicability domain (check similarity to training set)
- [ ] No unusual atoms or functional groups (models trained on typical drug-like space)
- [ ] MW in range 100-800 Da (outside range predictions less reliable)
- [ ] Predictions complete (no missing values for critical properties)
Post-Prediction Verification:
- [ ] Drug-likeness scores in reasonable range (sanity check)
- [ ] Individual properties internally consistent (e.g., high LogP predicts low solubility)
- [ ] CRITICAL: Comparison to experimental data if available (validate model for chemotype)
- [ ] Rankings align with medicinal chemistry intuition
Before Making Decisions:
- [ ] CRITICAL: Predictions are NOT experimental data; use for prioritization only
- [ ] Multiple orthogonal models give consistent results
- [ ] Structural alerts checked (toxicity, reactivity)
- [ ] Top candidates selected for experimental validation
- [ ] Documentation of model versions and confidence intervals
For Regulatory Submissions:
- [ ] Model validation documented (training set, test set performance)
- [ ] Applicability domain clearly defined
- [ ] Prediction uncertainty quantified
- [ ] Experimental confirmation for key predictions
Common Pitfalls
Over-Reliance Issues:
-
❌ Treating predictions as experimental facts → Poor decision making
- ✅ Use predictions for prioritization; experimental validation required for lead optimization
-
❌ Single model dependency → Miss model-specific biases
- ✅ Compare multiple models; consensus predictions more reliable
-
❌ Ignoring prediction confidence → False sense of certainty
- ✅ Check confidence intervals; low confidence predictions need higher scrutiny
Input Issues:
-
❌ Invalid or non-canonical SMILES → Wrong compound analyzed
- ✅ Validate SMILES before prediction; use canonical forms
-
❌ Analyzing salt forms → Properties skewed by counterion
- ✅ Remove salts using
smiles-de-salter; analyze free base/acid
- ✅ Remove salts using
-
❌ Ignoring stereochemistry → Inaccurate predictions for chiral drugs
- ✅ Specify stereochemistry explicitly; use 3D descriptors if available
Interpretation Issues:
-
❌ Focusing on single property → Miss overall profile
- ✅ Consider all ADME properties; use integrated scores like QED or MPO
-
❌ Rigid cutoff application → Discard good candidates
- ✅ Use cutoffs as guidelines; consider project-specific needs
-
❌ Ignoring property correlations → Unrealistic optimization
- ✅ Recognize trade-offs (e.g., increasing LogP improves BBB but reduces solubility)
Domain Issues:
-
❌ Applying to biologics → Completely inappropriate
- ✅ These models for small molecules only; use specialized tools for biologics
-
❌ Extrapolating beyond training set → Unreliable predictions
- ✅ Check applicability domain; novel scaffolds need experimental validation
Workflow Issues:
-
❌ No experimental validation → Continue with false leads
- ✅ Always validate top predictions experimentally
-
❌ Not documenting model versions → Irreproducible results
- ✅ Record software version, model versions, prediction dates
Troubleshooting
Problem: All predictions show "out of domain" warning
- Symptoms: "Compound outside training set" for entire library
- Causes: Library contains unusual chemotypes (peptidomimetics, macrocycles, etc.)
- Solutions:
- Use specialized models for non-traditional chemotypes
- Check if input format correct (SMILES vs InChI)
- Verify no strange atoms (metals, silicon, etc.)
Problem: Extreme predictions (negative solubility, >100% absorption)
- Symptoms: "LogS = -15" or "HIA = 150%"
- Causes: Model extrapolation errors; invalid input structures
- Solutions:
- Check input structure validity
- Cap extreme values at physiologically plausible limits
- Flag for manual review if outside typical ranges
Problem: Batch processing extremely slow
- Symptoms: "100 compounds taking 30 minutes"
- Causes: Single-threaded execution; complex models
- Solutions:
- Enable parallel processing (--n-workers 4)
- Use faster models for initial screening (QSAR vs ML)
- Pre-filter with rule-based methods (Lipinski) before detailed ADME
Problem: Inconsistent predictions across runs
- Symptoms: "Same compound, different predictions on re-run"
- Causes: Random seed issues; stochastic models
- Solutions:
- Set random seeds for reproducibility
- Use deterministic models when consistency critical
- Average multiple predictions if stochastic models necessary
Problem: Properties contradict each other
- Symptoms: "High LogP (4.5) but predicted very soluble"
- Causes: Model inconsistencies; prediction errors
- Solutions:
- Check input structure (tautomeric form matters for both)
- Lipophilic compounds (LogP > 3) typically have poor solubility
- Use thermodynamic cycle checks if available
Problem: Cannot process certain file formats
- Symptoms: "Error: Unsupported format" for SDF or MOL files
- Causes: Format limitations; parser issues
- Solutions:
- Convert to SMILES using
chemical-structure-converter - Check file encoding (UTF-8 vs Latin-1)
- Verify structure validity with external tools
- Convert to SMILES using
References
Available in references/ directory:
lipinski_rules.md- Detailed explanation of Rule of 5 and variantsqsar_models.md- Technical documentation of predictive modelsadme_databases.md- Experimental ADME data sources for validationproperty_ranges.md- Acceptable ranges for marketed drugs by classmodel_validation.md- Validation statistics and applicability domainscheminformatics_basics.md- Introduction to molecular descriptors
Scripts
Located in scripts/ directory:
main.py- CLI interface for ADME predictionadme_predictor.py- Core prediction engineabsorption.py- Absorption property modelsdistribution.py- Distribution property modelsmetabolism.py- Metabolism prediction modelsexcretion.py- Excretion and clearance modelsdruglikeness.py- QED, MPO, and other scoring functionsbatch_processor.py- Library screening and parallel processingvalidator.py- Input validation and applicability domain checking
Performance and Resources
Prediction Speed: | Task | Time | Hardware | |------|------|----------| | Single compound | 0.5-2 sec | CPU | | 100 compounds | 30-60 sec | CPU | | 1000 compounds | 5-10 min | CPU | | 1000 compounds | 2-3 min | 4-core parallel | | 10,000 compounds | 30-60 min | 4-core parallel |
System Requirements:
- RAM: 4 GB minimum; 8 GB for large libraries (>10K compounds)
- Storage: 100 MB for models and dependencies
- CPU: Multi-core recommended for batch processing
- No GPU required: All models CPU-based
Optimization Tips:
- Process libraries in batches of 5000-10000
- Use rule-based filters (Lipinski) before expensive ML predictions
- Cache results to avoid re-prediction
- Parallel processing scales nearly linearly up to 8 cores
Limitations
- Small Molecules Only: Models trained on drugs with MW 100-800 Da; unreliable for larger compounds
- pH 7.4 Assumption: Most models predict properties at physiological pH
- Human-Specific: Predictions for human PK; animal models may differ
- Healthy Subject Assumption: Does not account for disease states, drug interactions
- Single Compound: Does not predict formulation effects, salt form impact
- Static Models: Do not account for induction, inhibition, or time-dependent changes
- Training Set Bias: Underperforms for novel scaffolds not in training data
- Qualitative Only: For Go/No-Go decisions; not for precise quantitative predictions
- No Toxicity: ADME only; use separate tools for safety assessment
Model Accuracy (Typical):
- LogP: R² = 0.85-0.95 (very good)
- Solubility: R² = 0.65-0.80 (moderate)
- HIA: Accuracy = 75-85% (good)
- BBB: Accuracy = 70-80% (moderate)
- Metabolic stability: R² = 0.60-0.75 (moderate)
- T1/2: R² = 0.50-0.65 (challenging)
Version History
- v1.0.0 (Current): Initial release with 20+ ADME endpoints, QED scoring, batch processing
- Planned: Integration with PK simulation, population variability modeling, formulation effects
⚠️ CRITICAL DISCLAIMER: These predictions are computational estimates for prioritization and guidance only. They do NOT replace experimental ADME studies required for regulatory submissions or clinical decision-making. Always validate predictions with appropriate in vitro and in vivo assays before advancing compounds.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--smiles |
str | Required | SMILES string of the molecule |
--properties |
str | ["all"] | Specific properties to calculate |
--format |
str | "json" | Output format |
--input |
str | Required | Input CSV file with SMILES column |
--output |
str | Required | Output file for results |
同梱ファイル
※ ZIPに含まれるファイル一覧。`SKILL.md` 本体に加え、参考資料・サンプル・スクリプトが入っている場合があります。
- 📄 SKILL.md (23,399 bytes)
- 📎 scripts/main.py (15,617 bytes)