tribev2-brain-encoding
Use TRIBE v2, Meta's multimodal foundation model for predicting fMRI brain responses to video, audio, and text stimuli
Copy the command below and paste it into Terminal (Mac/Linux) or PowerShell (Windows). It handles everything automatically: download → extract → place.
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o tribev2-brain-encoding.zip https://jpskill.com/download/23104.zip && unzip -o tribev2-brain-encoding.zip && rm tribev2-brain-encoding.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/23104.zip -OutFile "$d\tribev2-brain-encoding.zip"; Expand-Archive "$d\tribev2-brain-encoding.zip" -DestinationPath $d -Force; ri "$d\tribev2-brain-encoding.zip"
When it finishes, restart Claude Code. Then just ask naturally, e.g. "Make me a video prompt", and the skill activates automatically.
💾 Manual download (for those who find the commands difficult)
- 1. Click the blue button below to download tribev2-brain-encoding.zip
- 2. Double-click the ZIP file to extract it; this creates a tribev2-brain-encoding folder
- 3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
- 4. Restart Claude Code
⚠️ Download and use at your own risk. This site accepts no responsibility for the skill's content, behavior, or safety.
🎯 What this Skill can do
The description below explains what this Skill does for you. When you ask Claude for help in this area, it activates automatically.
📦 Installation (3 steps)
- 1. Click the "Download" button above to get the .skill file
- 2. Rename the extension from .skill to .zip and extract it (macOS can extract it automatically)
- 3. Place the extracted folder in .claude/skills/ under your home folder:
  - macOS / Linux: ~/.claude/skills/
  - Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you're done. You don't need to say "Use this Skill to…"; related requests invoke it automatically.
See the detailed usage guide →
- Last updated: 2026-05-18
- Retrieved: 2026-05-18
- Bundled files: 1
📖 Original SKILL.md as read by Claude
The text below is the original (English or Chinese) that the AI (Claude) reads. Japanese translations are being added progressively.
TRIBE v2 Brain Encoding Model
Skill by ara.so — Daily 2026 Skills collection
TRIBE v2 is Meta's multimodal foundation model that predicts fMRI brain responses to naturalistic stimuli (video, audio, text). It combines LLaMA 3.2 (text), V-JEPA2 (video), and Wav2Vec-BERT (audio) encoders into a unified Transformer architecture that maps multimodal representations onto the cortical surface (fsaverage5, ~20k vertices).
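The fsaverage5 surface behind the "~20k vertices" figure has a fixed geometry: 10,242 vertices per hemisphere, 20,484 in total. A minimal NumPy sketch of the shape convention used throughout this document, with a dummy array standing in for real model output:

```python
import numpy as np

# fsaverage5 geometry: 10,242 vertices per hemisphere
N_VERTICES_PER_HEMI = 10_242
N_VERTICES = 2 * N_VERTICES_PER_HEMI  # 20,484 total ("~20k")

# Predictions are arrays of shape (n_timesteps, n_vertices)
n_timesteps = 300  # e.g., 300 fMRI TRs; illustrative value
preds = np.zeros((n_timesteps, N_VERTICES))

print(preds.shape)  # (300, 20484)
```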
Installation
# Inference only
pip install -e .
# With brain visualization (PyVista & Nilearn)
pip install -e ".[plotting]"
# Full training dependencies (PyTorch Lightning, W&B, etc.)
pip install -e ".[training]"
Quick Start — Inference
Load the pretrained model and predict from a video
from tribev2 import TribeModel
# Load from HuggingFace (downloads weights to cache)
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
# Build events dataframe from a video file
df = model.get_events_dataframe(video_path="path/to/video.mp4")
# Predict brain responses
preds, segments = model.predict(events=df)
print(preds.shape) # (n_timesteps, n_vertices) on fsaverage5
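Since predictions are plain NumPy arrays, they can be cached to disk with standard NumPy I/O so you don't re-run inference; this is a generic NumPy pattern, not a TribeModel API:

```python
import os
import tempfile
import numpy as np

# Stand-in for model output: (n_timesteps, n_vertices)
preds = np.random.rand(10, 20484).astype(np.float32)

# Save once, reload later without re-running the model
path = os.path.join(tempfile.gettempdir(), "tribe_preds.npy")
np.save(path, preds)
loaded = np.load(path)

assert np.array_equal(preds, loaded)
```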
Multimodal input — video + audio + text
from tribev2 import TribeModel
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
# All modalities together (text is auto-converted to speech and transcribed)
df = model.get_events_dataframe(
video_path="path/to/video.mp4",
audio_path="path/to/audio.wav", # optional, overrides video audio
text_path="path/to/script.txt", # optional, auto-timed
)
preds, segments = model.predict(events=df)
print(preds.shape) # (n_timesteps, n_vertices)
Text-only prediction
from tribev2 import TribeModel
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
df = model.get_events_dataframe(text_path="path/to/narration.txt")
preds, segments = model.predict(events=df)
Brain Visualization
from tribev2 import TribeModel
from tribev2.plotting import plot_brain_surface
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
df = model.get_events_dataframe(video_path="path/to/video.mp4")
preds, segments = model.predict(events=df)
# Plot a single timepoint on the cortical surface
plot_brain_surface(preds[0], backend="nilearn") # or backend="pyvista"
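plot_brain_surface above renders a single timepoint; a common variant is to average over time first to get one summary value per vertex. A pure-NumPy sketch with a dummy array (the commented plotting call is the source API, left out so the snippet runs without the package):

```python
import numpy as np

# Dummy predictions standing in for model output
preds = np.random.rand(120, 20484)  # (n_timesteps, n_vertices)

# Average across timesteps → one summary map over the surface
mean_map = preds.mean(axis=0)
print(mean_map.shape)  # (20484,)

# With the package installed, this map plots the same way:
# from tribev2.plotting import plot_brain_surface
# plot_brain_surface(mean_map, backend="nilearn")
```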
Training a Model from Scratch
1. Set environment variables
export DATAPATH="/path/to/studies"
export SAVEPATH="/path/to/output"
export SLURM_PARTITION="your_slurm_partition"
2. Authenticate with HuggingFace (required for LLaMA 3.2)
huggingface-cli login
# Paste a HuggingFace read token when prompted
# Request access at: https://huggingface.co/meta-llama/Llama-3.2-3B
3. Local test run
python -m tribev2.grids.test_run
4. Full grid search on Slurm
# Cortical surface model
python -m tribev2.grids.run_cortical
# Subcortical regions
python -m tribev2.grids.run_subcortical
Key API — TribeModel
from tribev2 import TribeModel
# Load pretrained weights
model = TribeModel.from_pretrained(
"facebook/tribev2",
cache_folder="./cache" # local cache for HuggingFace weights
)
# Build events dataframe (word-level timings, chunking, etc.)
df = model.get_events_dataframe(
video_path=None, # str path to .mp4
audio_path=None, # str path to .wav
text_path=None, # str path to .txt
)
# Run prediction
preds, segments = model.predict(events=df)
# preds: np.ndarray of shape (n_timesteps, n_vertices)
# segments: list of segment metadata dicts
Project Structure
tribev2/
├── main.py # Experiment pipeline: Data, TribeExperiment
├── model.py # FmriEncoder: Transformer multimodal→fMRI model
├── pl_module.py # PyTorch Lightning training module
├── demo_utils.py # TribeModel and inference helpers
├── eventstransforms.py # Event transforms (word extraction, chunking)
├── utils.py # Multi-study loading, splitting, subject weighting
├── utils_fmri.py # Surface projection (MNI / fsaverage) and ROI analysis
├── grids/
│ ├── defaults.py # Full default experiment configuration
│ └── test_run.py # Quick local test entry point
├── plotting/ # Brain visualization backends
└── studies/ # Dataset definitions (Algonauts2025, Lahner2024, …)
Configuration — Defaults
Edit tribev2/grids/defaults.py or set environment variables:
# tribev2/grids/defaults.py (key fields)
{
"datapath": "/path/to/studies", # override with DATAPATH env var
"savepath": "/path/to/output", # override with SAVEPATH env var
"slurm_partition": "learnfair", # override with SLURM_PARTITION env var
"model": "FmriEncoder",
"modalities": ["video", "audio", "text"],
"surface": "fsaverage5", # ~20k vertices
}
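The env-var overrides above follow the standard fallback pattern: use the variable if set, otherwise keep the value from defaults.py. A sketch of that logic; the key names mirror the config above, but the helper itself is illustrative, not part of the package:

```python
import os

# Values mirroring tribev2/grids/defaults.py
DEFAULTS = {
    "datapath": "/path/to/studies",
    "savepath": "/path/to/output",
    "slurm_partition": "learnfair",
}

def resolve_config():
    # Environment variables win over defaults.py values
    return {
        "datapath": os.environ.get("DATAPATH", DEFAULTS["datapath"]),
        "savepath": os.environ.get("SAVEPATH", DEFAULTS["savepath"]),
        "slurm_partition": os.environ.get("SLURM_PARTITION", DEFAULTS["slurm_partition"]),
    }

os.environ["DATAPATH"] = "/data/studies"
print(resolve_config()["datapath"])  # /data/studies
```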
Custom Experiment with PyTorch Lightning
from tribev2.main import Data, TribeExperiment
from tribev2.pl_module import TribePLModule
import pytorch_lightning as pl
# Configure experiment
experiment = TribeExperiment(
datapath="/path/to/studies",
savepath="/path/to/output",
modalities=["video", "audio", "text"],
)
data = Data(experiment)
module = TribePLModule(experiment)
trainer = pl.Trainer(
max_epochs=50,
accelerator="gpu",
devices=4,
)
trainer.fit(module, data)
Working with fMRI Surfaces
from tribev2.utils_fmri import project_to_fsaverage, get_roi_mask
# Project MNI coordinates to fsaverage5 surface
surface_data = project_to_fsaverage(mni_data, target="fsaverage5")
# Get a specific ROI mask (e.g., early visual cortex)
roi_mask = get_roi_mask(roi_name="V1", surface="fsaverage5")
v1_responses = preds[:, roi_mask]
print(v1_responses.shape) # (n_timesteps, n_v1_vertices)
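Boolean ROI masks compose with ordinary NumPy indexing, so per-ROI summaries such as a mean time course are one line each. A sketch using a dummy mask in place of get_roi_mask output:

```python
import numpy as np

preds = np.random.rand(100, 20484)      # (n_timesteps, n_vertices)
roi_mask = np.zeros(20484, dtype=bool)  # stand-in for get_roi_mask(...)
roi_mask[:500] = True                   # pretend the ROI covers 500 vertices

roi_responses = preds[:, roi_mask]           # (n_timesteps, n_roi_vertices)
roi_timecourse = roi_responses.mean(axis=1)  # one value per timestep

print(roi_responses.shape)   # (100, 500)
print(roi_timecourse.shape)  # (100,)
```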
Common Patterns
Batch prediction over multiple videos
from tribev2 import TribeModel
import numpy as np
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
video_paths = ["video1.mp4", "video2.mp4", "video3.mp4"]
all_predictions = []
for vp in video_paths:
df = model.get_events_dataframe(video_path=vp)
preds, segments = model.predict(events=df)
all_predictions.append(preds)
# all_predictions: list of (n_timesteps_i, n_vertices) arrays
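If the per-video arrays need to become one array, note that n_timesteps differs per video while the vertex count is fixed, so stacking must go along the time axis. A sketch with dummy arrays:

```python
import numpy as np

# Dummy per-video predictions with different lengths
all_predictions = [
    np.random.rand(80, 20484),
    np.random.rand(120, 20484),
    np.random.rand(95, 20484),
]

# Concatenate along time; np.stack would fail on unequal lengths
combined = np.concatenate(all_predictions, axis=0)
print(combined.shape)  # (295, 20484)

# Row offsets record which rows came from which video
offsets = np.cumsum([0] + [p.shape[0] for p in all_predictions])
```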
Extract predictions for specific brain region
from tribev2 import TribeModel
from tribev2.utils_fmri import get_roi_mask
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
df = model.get_events_dataframe(video_path="video.mp4")
preds, segments = model.predict(events=df)
# Focus on auditory cortex
ac_mask = get_roi_mask("auditory_cortex", surface="fsaverage5")
auditory_responses = preds[:, ac_mask] # (n_timesteps, n_ac_vertices)
Access segment timing metadata
preds, segments = model.predict(events=df)
for i, seg in enumerate(segments):
print(f"Segment {i}: onset={seg['onset']:.2f}s, duration={seg['duration']:.2f}s")
print(f" Brain response shape: {preds[i].shape}")
Troubleshooting
LLaMA 3.2 access denied
# Must request access at https://huggingface.co/meta-llama/Llama-3.2-3B
# Then authenticate:
huggingface-cli login
# Use a HuggingFace token with read permissions
CUDA out of memory during inference
# Use CPU for inference on smaller machines
import torch
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
model.to("cpu")
Missing visualization dependencies
pip install -e ".[plotting]"
# Installs pyvista and nilearn backends
Slurm training not submitting
# Check env vars are set
echo $DATAPATH $SAVEPATH $SLURM_PARTITION
# Or edit tribev2/grids/defaults.py directly
Video without audio track causes error
# Provide audio separately or use text-only mode
df = model.get_events_dataframe(
video_path="silent_video.mp4",
audio_path="separate_audio.wav",
)
Citation
@article{dAscoli2026TribeV2,
title={A foundation model of vision, audition, and language for in-silico neuroscience},
author={d'Ascoli, St{\'e}phane and Rapin, J{\'e}r{\'e}my and Benchetrit, Yohann and Brookes, Teon
and Begany, Katelyn and Raugel, Jos{\'e}phine and Banville, Hubert and King, Jean-R{\'e}mi},
year={2026}
}