🛠️ Gtars

gtars

Specific … of an organism's genetic information (the genome)

📜 Original English description (for reference)

High-performance toolkit for genomic interval analysis in Rust with Python bindings. Use when working with genomic regions, BED files, coverage tracks, overlap detection, tokenization for ML models, or fragment analysis in computational genomics and machine learning applications.

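⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, unpacks it, and places the files automatically.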

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o gtars.zip https://jpskill.com/download/4167.zip && unzip -o gtars.zip && rm gtars.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/4167.zip -OutFile "$d\gtars.zip"; Expand-Archive "$d\gtars.zip" -DestinationPath $d -Force; ri "$d\gtars.zip"

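When it finishes, restart Claude Code. After that, the skill activates automatically when you ask in plain language, for example "Show me a minimal sample using Gtars".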

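💾 Manual download (if you prefer not to use the command line)

1. Download gtars.zip (https://jpskill.com/download/4167.zip)
2. Double-click the ZIP file to extract it; this creates a gtars folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code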


⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the skill.

🎯 What this skill can do

The description below explains what this skill does for you. When you ask Claude for something in this area, the skill activates automatically.

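📦 Installation (3 steps)

1. Get the .skill file using the download button on the skill's page
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this skill"; it is invoked automatically for related requests.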
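
A quick way to confirm that the folder landed in the right place is sketched below. It assumes the archive unpacks to a folder named gtars that contains SKILL.md directly; the helper name `skill_installed` is hypothetical and not part of Claude Code or gtars.

```python
from pathlib import Path

def skill_installed(skills_dir: Path, name: str = "gtars") -> bool:
    """Return True if <skills_dir>/<name>/SKILL.md exists (assumed layout)."""
    return (skills_dir / name / "SKILL.md").is_file()

# Check the default per-user skills directory.
print(skill_installed(Path.home() / ".claude" / "skills"))
```

If this prints False after installation, the folder was probably extracted one level too deep or under a different name.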


Last updated: 2026-05-17
Retrieved: 2026-05-18
Bundled files: 7

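💬 Sample prompts: just ask like this

› Show me a minimal sample using Gtars
› Explain the main ways to use Gtars and what to watch out for
› Explain how to integrate Gtars into an existing project

Paste any of these into Claude Code and the skill activates automatically.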

📜 Original SKILL.md (the English text that Claude reads)

Gtars: Genomic Tools and Algorithms in Rust

Overview

Gtars is a high-performance Rust toolkit for manipulating, analyzing, and processing genomic interval data. It provides specialized tools for overlap detection, coverage analysis, tokenization for machine learning, and reference sequence management.

Use this skill when working with:

Genomic interval files (BED format)
Overlap detection between genomic regions
Coverage track generation (WIG, BigWig)
Genomic ML preprocessing and tokenization
Fragment analysis in single-cell genomics
Reference sequence retrieval and validation

Installation

Python Installation

Install gtars Python bindings:

uv pip install gtars

CLI Installation

Install command-line tools (requires Rust/Cargo):

# Install with all features
cargo install gtars-cli --features "uniwig overlaprs igd bbcache scoring fragsplit"

# Or install specific features only
cargo install gtars-cli --features "uniwig overlaprs"

Rust Library

Add to Cargo.toml for Rust projects:

[dependencies]
gtars = { version = "0.1", features = ["tokenizers", "overlaprs"] }

Core Capabilities

Gtars is organized into specialized modules, each focused on specific genomic analysis tasks:

1. Overlap Detection and IGD Indexing

Efficiently detect overlaps between genomic intervals using the Integrated Genome Database (IGD) data structure.

When to use:

Finding overlapping regulatory elements
Variant annotation
Comparing ChIP-seq peaks
Identifying shared genomic features

Quick example:

import gtars

# Build IGD index and query overlaps
igd = gtars.igd.build_index("regions.bed")
overlaps = igd.query("chr1", 1000, 2000)

See references/overlap.md for comprehensive overlap detection documentation.
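
To make the overlap semantics concrete, here is a minimal pure-Python sketch. It does not call gtars and does not implement the IGD structure; it assumes BED-style half-open intervals on a single chromosome, and the helper name `find_overlaps` is hypothetical.

```python
from bisect import bisect_left

def find_overlaps(intervals, q_start, q_end):
    """Return (start, end) intervals overlapping the half-open query [q_start, q_end).

    `intervals` must be sorted by start. This is a bisect-bounded scan,
    not the IGD algorithm.
    """
    # Intervals that start at or after q_end cannot overlap the query.
    stop = bisect_left(intervals, (q_end,))
    # Of the remaining ones, keep those that end after the query starts.
    return [(s, e) for s, e in intervals[:stop] if e > q_start]

regions = sorted([(100, 500), (900, 1200), (1500, 2500), (2000, 2100), (3000, 4000)])
hits = find_overlaps(regions, 1000, 2000)
print(hits)  # [(900, 1200), (1500, 2500)]
```

The interval (2000, 2100) is not reported because the query end is exclusive. A real index avoids the scan over earlier intervals, which is where a structure such as IGD would pay off.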

2. Coverage Track Generation

Generate coverage tracks from sequencing data with the uniwig module.

When to use:

ATAC-seq accessibility profiles
ChIP-seq coverage visualization
RNA-seq read coverage
Differential coverage analysis

Quick example:

# Generate BigWig coverage track
gtars uniwig generate --input fragments.bed --output coverage.bw --format bigwig

See references/coverage.md for detailed coverage analysis workflows.
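
What a coverage track contains can be illustrated without gtars. The sketch below computes per-base depth from half-open fragments with a difference array; it is an assumption-laden illustration (one chromosome, fragments within bounds), not the uniwig implementation, and `coverage` is a hypothetical helper.

```python
from itertools import accumulate

def coverage(fragments, chrom_length):
    """Per-base depth for half-open fragments (start, end) on one chromosome."""
    diff = [0] * (chrom_length + 1)
    for start, end in fragments:
        diff[start] += 1  # depth rises where a fragment begins
        diff[end] -= 1    # and falls just past its last base
    # A running sum of the differences gives the depth at each position.
    return list(accumulate(diff[:chrom_length]))

depth = coverage([(0, 4), (2, 6), (5, 8)], chrom_length=10)
print(depth)  # [1, 1, 2, 2, 1, 2, 1, 1, 0, 0]
```

Formats such as WIG and BigWig store this kind of signal in compressed form so that genome browsers can display it.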

3. Genomic Tokenization

Convert genomic regions into discrete tokens for machine learning applications, particularly for deep learning models on genomic data.

When to use:

Preprocessing for genomic ML models
Integration with geniml library
Creating position encodings
Training transformer models on genomic sequences

Quick example:

from gtars.tokenizers import TreeTokenizer

tokenizer = TreeTokenizer.from_bed_file("training_regions.bed")
token = tokenizer.tokenize("chr1", 1000, 2000)

See references/tokenizers.md for tokenization documentation.
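
The idea of turning regions into discrete tokens can be sketched in plain Python. This assumes a fixed "universe" of regions acting as the vocabulary and maps a query to the ids of the universe regions it overlaps; it is not how TreeTokenizer is implemented, and `build_vocab`, `tokenize`, and `UNKNOWN` are hypothetical names.

```python
UNKNOWN = -1  # hypothetical id for queries that match nothing in the universe

def build_vocab(universe):
    """Map each universe region (chrom, start, end) to an integer token id."""
    return {region: idx for idx, region in enumerate(sorted(universe))}

def tokenize(vocab, chrom, start, end):
    """Return ids of universe regions overlapping the half-open query."""
    ids = [idx for (c, s, e), idx in vocab.items()
           if c == chrom and s < end and start < e]
    return sorted(ids) or [UNKNOWN]

universe = [("chr1", 0, 1000), ("chr1", 1500, 2500), ("chr2", 0, 500)]
vocab = build_vocab(universe)
print(tokenize(vocab, "chr1", 900, 1600))  # [0, 1]
print(tokenize(vocab, "chr3", 0, 10))      # [-1]
```

The resulting integer ids are what a downstream model would consume, in the same way that word ids feed a language model.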

4. Reference Sequence Management

Handle reference genome sequences and compute digests following the GA4GH refget protocol.

When to use:

Validating reference genome integrity
Extracting specific genomic sequences
Computing sequence digests
Cross-reference comparisons

Quick example:

import gtars

# Load reference and extract sequences
store = gtars.RefgetStore.from_fasta("hg38.fa")
sequence = store.get_subsequence("chr1", 1000, 2000)

See references/refget.md for reference sequence operations.
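
For readers unfamiliar with sequence digests, the following standard-library sketch shows the GA4GH truncated digest as I understand it: SHA-512, first 24 bytes, URL-safe base64. Uppercasing the sequence and the `SQ.` prefix are assumptions based on the GA4GH convention, and nothing here calls gtars.

```python
import base64
import hashlib

def sha512t24u(data: bytes) -> str:
    """Truncated digest: SHA-512, first 24 bytes, URL-safe base64."""
    digest = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(digest).decode("ascii")

def sequence_digest(sequence: str) -> str:
    """Digest an uppercased sequence and add the assumed 'SQ.' prefix."""
    return "SQ." + sha512t24u(sequence.upper().encode("ascii"))

print(sequence_digest("ACGT"))
```

Because 24 bytes encode to exactly 32 base64 characters, the digest never needs padding, which keeps it safe to use in URLs and file names.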

5. Fragment Processing

Split and analyze fragment files, particularly useful for single-cell genomics data.

When to use:

Processing single-cell ATAC-seq data
Splitting fragments by cell barcodes
Cluster-based fragment analysis
Fragment quality control

Quick example:

# Split fragments by clusters
gtars fragsplit cluster-split --input fragments.tsv --clusters clusters.txt --output-dir ./by_cluster/

See references/cli.md for fragment processing commands.
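
Splitting by cluster amounts to grouping fragment lines by the cluster assigned to each barcode. The sketch below assumes the common five-column fragment layout (chrom, start, end, barcode, count) and a barcode-to-cluster mapping held in a dict; the cluster file format that gtars expects is not described here, and `split_by_cluster` is a hypothetical helper.

```python
from collections import defaultdict

def split_by_cluster(fragment_lines, barcode_to_cluster):
    """Group fragment TSV lines by the cluster assigned to their barcode."""
    groups = defaultdict(list)
    for line in fragment_lines:
        fields = line.rstrip("\n").split("\t")
        barcode = fields[3]  # assumed layout: chrom, start, end, barcode, count
        cluster = barcode_to_cluster.get(barcode)
        if cluster is not None:  # skip barcodes with no cluster assignment
            groups[cluster].append(line)
    return dict(groups)

fragments = [
    "chr1\t100\t250\tAAAC\t1\n",
    "chr1\t300\t480\tGGTT\t2\n",
    "chr2\t50\t190\tAAAC\t1\n",
    "chr2\t70\t210\tTTTT\t1\n",
]
clusters = {"AAAC": "cluster_0", "GGTT": "cluster_1"}
by_cluster = split_by_cluster(fragments, clusters)
print({name: len(lines) for name, lines in by_cluster.items()})
```

Each group could then be written to its own file, which is the shape of output the `--output-dir` option above suggests.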

6. Fragment Scoring

Score fragment overlaps against reference datasets.

When to use:

Evaluating fragment enrichment
Comparing experimental data to references
Quality metrics computation
Batch scoring across samples

Quick example:

# Score fragments against reference
gtars scoring score --fragments fragments.bed --reference reference.bed --output scores.txt
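
The exact metric behind the scoring command is not described above. As one possible illustration, the sketch below computes the share of fragments that overlap at least one reference region; `fraction_in_reference` is a hypothetical helper and should not be read as the score gtars reports.

```python
def fraction_in_reference(fragments, reference):
    """Share of fragments overlapping at least one reference region.

    Both inputs are lists of (chrom, start, end) with half-open coordinates.
    """
    if not fragments:
        return 0.0
    hits = sum(
        any(fc == rc and fs < r_end and r_start < fe
            for rc, r_start, r_end in reference)
        for fc, fs, fe in fragments
    )
    return hits / len(fragments)

fragments = [("chr1", 100, 200), ("chr1", 900, 950), ("chr2", 10, 60), ("chr2", 500, 600)]
reference = [("chr1", 150, 400), ("chr2", 0, 50)]
print(fraction_in_reference(fragments, reference))  # 0.5
```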

Common Workflows

Workflow 1: Peak Overlap Analysis

Identify overlapping genomic features:

import gtars

# Load two region sets
peaks = gtars.RegionSet.from_bed("chip_peaks.bed")
promoters = gtars.RegionSet.from_bed("promoters.bed")

# Find overlaps
overlapping_peaks = peaks.filter_overlapping(promoters)

# Export results
overlapping_peaks.to_bed("peaks_in_promoters.bed")

Workflow 2: Coverage Track Pipeline

Generate coverage tracks for visualization:

# Step 1: Generate coverage
gtars uniwig generate --input atac_fragments.bed --output coverage.wig --resolution 10

# Step 2: Generate a BigWig track from the same fragments for genome browsers
gtars uniwig generate --input atac_fragments.bed --output coverage.bw --format bigwig

Workflow 3: ML Preprocessing

Prepare genomic data for machine learning:

from gtars.tokenizers import TreeTokenizer
import gtars

# Step 1: Load training regions
regions = gtars.RegionSet.from_bed("training_peaks.bed")

# Step 2: Create tokenizer
tokenizer = TreeTokenizer.from_bed_file("training_peaks.bed")

# Step 3: Tokenize regions
tokens = [tokenizer.tokenize(r.chromosome, r.start, r.end) for r in regions]

# Step 4: Use tokens in ML pipeline
# (integrate with geniml or custom models)

Python vs CLI Usage

Use the Python API for:

Integration with analysis pipelines
Programmatic control
Work with NumPy/Pandas
Custom workflows

Use the CLI for:

Quick one-off analyses
Shell scripting
Batch processing of files
Workflow prototyping

Reference Documentation

Comprehensive module documentation:

references/python-api.md - Complete Python API reference with RegionSet operations, NumPy integration, and data export
references/overlap.md - IGD indexing, overlap detection, and set operations
references/coverage.md - Coverage track generation with uniwig
references/tokenizers.md - Genomic tokenization for ML applications
references/refget.md - Reference sequence management and digests
references/cli.md - Command-line interface complete reference

Integration with geniml

Gtars serves as the foundation for the geniml Python package, providing core genomic interval operations for machine learning workflows. When working on geniml-related tasks, use gtars for data preprocessing and tokenization.

Performance Characteristics

Native Rust performance: Fast execution with low memory overhead
Parallel processing: Multi-threaded operations for large datasets
Memory efficiency: Streaming and memory-mapped file support
Zero-copy operations: NumPy integration with minimal data copying

Data Formats

Gtars works with standard genomic formats:

BED: Genomic intervals (3-column or extended)
WIG/BigWig: Coverage tracks
FASTA: Reference sequences
Fragment TSV: Single-cell fragment files with barcodes
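
BED coordinates are 0-based with an exclusive end, so a region's length is simply end minus start. The sketch below parses one BED line with the standard library only; `parse_bed_line` is a hypothetical helper, not a gtars function.

```python
def parse_bed_line(line):
    """Parse the three required BED columns; extra columns are returned as-is."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError(f"invalid interval: {line!r}")
    return chrom, start, end, fields[3:]

chrom, start, end, extra = parse_bed_line("chr1\t1000\t2000\tpeak_1\t960\n")
print(chrom, start, end, end - start, extra)
```

Here the region covers 1000 bases, positions 1000 through 1999 in 0-based numbering.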

Error Handling and Debugging

Enable verbose logging for troubleshooting:

import gtars

# Enable debug logging
gtars.set_log_level("DEBUG")

# CLI verbose mode
gtars --verbose <command>

Bundled files

※ Files contained in the ZIP. Besides `SKILL.md` itself, the archive may include reference material, samples, and scripts.

📄 SKILL.md (7,803 bytes)
📎 references/cli.md (5,060 bytes)
📎 references/coverage.md (3,995 bytes)
📎 references/overlap.md (3,679 bytes)
📎 references/python-api.md (4,222 bytes)
📎 references/refget.md (3,165 bytes)
📎 references/tokenizers.md (2,444 bytes)