
fiftyone-embeddings-visualization

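A Skill that supports analysis by projecting a dataset into 2D with methods such as UMAP or t-SNE and visualizing it, so you can understand the dataset's structure and find groups within its images.

📜 Original English description (for reference)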

Visualize datasets in 2D using embeddings with UMAP or t-SNE dimensionality reduction. Use when users want to explore dataset structure, find clusters in images, identify outliers, color samples by class or metadata, or understand data distribution. Requires FiftyOne MCP server with @voxel51/brain plugin installed.


⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, extracts, and places the files automatically.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o fiftyone-embeddings-visualization.zip https://jpskill.com/download/17013.zip && unzip -o fiftyone-embeddings-visualization.zip && rm fiftyone-embeddings-visualization.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/17013.zip -OutFile "$d\fiftyone-embeddings-visualization.zip"; Expand-Archive "$d\fiftyone-embeddings-visualization.zip" -DestinationPath $d -Force; ri "$d\fiftyone-embeddings-visualization.zip"

When it finishes, restart Claude Code. The Skill then triggers automatically when you ask in plain language, for example "visualize my dataset's embeddings".

💾 Manual download (if you prefer not to use the command line)

1. Download fiftyone-embeddings-visualization.zip from https://jpskill.com/download/17013.zip
2. Double-click the ZIP file to extract it; this creates a fiftyone-embeddings-visualization folder
3. Move that folder to C:\Users\<your-name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

🎯 What this Skill can do

The description below explains what this Skill does for you. When you ask Claude for something in this area, it triggers automatically.

📦 Installation (3 steps)

1. Download the .skill file from this page
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Put the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.

Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1


Embeddings Visualization in FiftyOne

Overview

Visualize your dataset in 2D using deep learning embeddings and dimensionality reduction (UMAP/t-SNE). Explore clusters, find outliers, and color samples by any field.

Use this skill when:

Visualizing dataset structure in 2D
Finding natural clusters in images
Identifying outliers or anomalies
Exploring data distribution by class or metadata
Understanding embedding space relationships

Prerequisites

FiftyOne MCP server installed and running
@voxel51/brain plugin installed and enabled
Dataset with image samples loaded in FiftyOne

Key Directives

ALWAYS follow these rules:

1. Set context first

set_context(dataset_name="my-dataset")

2. Launch FiftyOne App

Brain operators are delegated and require the app:

launch_app()

Wait 5-10 seconds for initialization.
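
The fixed 5-10 second wait can also be written as a bounded poll. The sketch below is one way to do that in plain Python. `is_ready` is a hypothetical, caller-supplied check (nothing in this skill provides it), for example a function that tests whether the App responds at http://localhost:5151/.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or `timeout` seconds elapse.

    is_ready is a hypothetical readiness check supplied by the caller; the
    skill itself only says to wait 5-10 seconds after launch_app().
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real delays.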

3. Discover operators dynamically

# List all brain operators
list_operators(builtin_only=False)

# Get schema for specific operator
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

4. Compute embeddings before visualization

Embeddings are required for dimensionality reduction:

execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "img_sim",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

5. Close app when done

close_app()

Complete Workflow

Step 1: Setup

# Set context
set_context(dataset_name="my-dataset")

# Launch app (required for brain operators)
launch_app()

Step 2: Verify Brain Plugin

# Check if brain plugin is available
list_plugins(enabled=True)

# If not installed:
download_plugin(
    url_or_repo="voxel51/fiftyone-plugins",
    plugin_names=["@voxel51/brain"]
)
enable_plugin(plugin_name="@voxel51/brain")

Step 3: Discover Brain Operators

# List all available operators
list_operators(builtin_only=False)

# Get schema for compute_visualization
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

Step 4: Check for Existing Embeddings or Compute New Ones

First, check if the dataset already has embeddings by looking at the operator schema:

get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
# Look for existing embeddings fields in the "embeddings" choices
# (e.g., "clip_embeddings", "dinov2_embeddings")

If embeddings exist: Skip to Step 5 and use the existing embeddings field.

If no embeddings exist: Compute them:

execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "img_viz",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",  # Field name to store embeddings
        "backend": "sklearn",
        "metric": "cosine"
    }
)

Required parameters for compute_similarity:

brain_key - Unique identifier for this brain run
model - Model from FiftyOne Model Zoo to generate embeddings
embeddings - Field name where embeddings will be stored
backend - Similarity backend (use "sklearn")
metric - Distance metric (use "cosine" or "euclidean")
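
As an illustration, the parameter list above can be turned into a small builder that fails early on an empty value or an unlisted metric. This is only a sketch: `build_similarity_params` is a hypothetical helper name, and the operator schema returned by get_operator_schema() remains the authority on what the operator accepts.

```python
REQUIRED_KEYS = ("brain_key", "model", "embeddings", "backend", "metric")
ALLOWED_METRICS = ("cosine", "euclidean")


def build_similarity_params(
    brain_key: str,
    model: str = "clip-vit-base32-torch",
    embeddings: str = "clip_embeddings",
    backend: str = "sklearn",
    metric: str = "cosine",
) -> dict:
    """Build the params dict for @voxel51/brain/compute_similarity.

    Hypothetical helper: it checks only what the list above states.
    """
    params = {
        "brain_key": brain_key,
        "model": model,
        "embeddings": embeddings,
        "backend": backend,
        "metric": metric,
    }
    # Every listed parameter must be a non-empty value.
    missing = [key for key in REQUIRED_KEYS if not params[key]]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {ALLOWED_METRICS}, got {metric!r}")
    return params
```

The returned dict is what would be passed as `params` to execute_operator.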

Recommended embedding models:

clip-vit-base32-torch - Best for general visual + semantic similarity
dinov2-vits14-torch - Best for visual similarity only
resnet50-imagenet-torch - Classic CNN features
mobilenet-v2-imagenet-torch - Fast, lightweight option
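
The recommendations above amount to a lookup from goal to model name. A minimal sketch follows; the goal labels ("semantic", "visual", "classic", "fast") are made up for this example, while the model names are the ones listed above.

```python
# Goal labels are illustrative; model names come from the list above.
MODEL_BY_GOAL = {
    "semantic": "clip-vit-base32-torch",    # general visual + semantic similarity
    "visual": "dinov2-vits14-torch",        # visual similarity only
    "classic": "resnet50-imagenet-torch",   # classic CNN features
    "fast": "mobilenet-v2-imagenet-torch",  # fast, lightweight option
}


def pick_model(goal: str = "semantic") -> str:
    """Return the recommended Model Zoo name for a (hypothetical) goal label."""
    try:
        return MODEL_BY_GOAL[goal]
    except KeyError:
        raise ValueError(f"unknown goal {goal!r}; expected one of {sorted(MODEL_BY_GOAL)}")
```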

Step 5: Compute 2D Visualization

Use existing embeddings field OR the brain_key from Step 4:

# Option A: Use existing embeddings field (e.g., clip_embeddings)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "img_viz",
        "embeddings": "clip_embeddings",  # Use existing field
        "method": "umap",
        "num_dims": 2
    }
)

# Option B: Use brain_key from compute_similarity
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "img_viz",  # Same key used in compute_similarity
        "method": "umap",
        "num_dims": 2
    }
)

Dimensionality reduction methods:

umap - (Recommended) Preserves local and global structure, faster. Requires umap-learn package.
tsne - Better local structure, slower on large datasets. No extra dependencies.
pca - Linear reduction, fastest but less informative
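
Because only umap needs an extra package, the choice of method can be guarded by a quick import check. The sketch below assumes the umap-learn package is importable under the module name `umap`. `choose_method` is a hypothetical helper, and it only reflects the environment it runs in, which may not be the one the MCP server uses.

```python
import importlib.util
from typing import Callable, Optional

METHODS = ("umap", "tsne", "pca")


def choose_method(
    preferred: str = "umap",
    has_module: Optional[Callable[[str], bool]] = None,
) -> str:
    """Return `preferred`, falling back to "tsne" when umap-learn is missing."""
    if preferred not in METHODS:
        raise ValueError(f"method must be one of {METHODS}, got {preferred!r}")
    if has_module is None:
        # find_spec returns None when a top-level module cannot be found.
        has_module = lambda name: importlib.util.find_spec(name) is not None
    if preferred == "umap" and not has_module("umap"):
        return "tsne"
    return preferred
```

The result can be passed as the "method" value in the compute_visualization params.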

Step 6: Direct User to Embeddings Panel

After computing visualization, direct the user to open the FiftyOne App at http://localhost:5151/ and:

Click the Embeddings panel icon (scatter plot icon, looks like a grid of dots) in the top toolbar
Select the brain key (e.g., img_viz) from the dropdown
Points represent samples in 2D embedding space
Use the "Color by" dropdown to color points by a field (e.g., ground_truth, predictions)
Click points to select samples; use the lasso tool to select groups

IMPORTANT: Do NOT use set_view(exists=["brain_key"]) - this filters samples and is not needed for visualization. The Embeddings panel automatically shows all samples with computed coordinates.

Step 7: Explore and Filter (Optional)

To filter samples while viewing in the Embeddings panel:

# Filter to specific class
set_view(filters={"ground_truth.label": "dog"})

# Filter by tag
set_view(tags=["validated"])

# Clear filter to show all
clear_view()

These filters will update the Embeddings panel to show only matching samples.

Step 8: Find Outliers

Outliers appear as isolated points far from clusters:

# Compute uniqueness scores (higher = more unique/outlier)
execute_operator(
    operator_uri="@voxel51/brain/compute_uniqueness",
    params={
        "brain_key": "img_viz"
    }
)

# View most unique samples (potential outliers)
set_view(sort_by="uniqueness", reverse=True, limit=50)
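
To make "isolated points far from clusters" concrete, here is a small pure-Python sketch that scores each 2D point by its mean distance to its k nearest neighbours. It is an illustration of the idea only and is not how compute_uniqueness calculates its scores; the function name and the sample coordinates are invented for the example.

```python
import math
from typing import List, Sequence


def knn_isolation_scores(points: Sequence[Sequence[float]], k: int = 3) -> List[float]:
    """Mean distance from each point to its k nearest neighbours.

    Larger scores mean the point sits farther from its neighbours.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    if k < 1:
        raise ValueError("k must be at least 1")
    scores = []
    for i, point in enumerate(points):
        # Distances to every other point, nearest first.
        distances = sorted(
            math.dist(point, other) for j, other in enumerate(points) if j != i
        )
        nearest = distances[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores


# Four points in a tight square plus one far away.
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_isolation_scores(coords, k=2)
most_isolated = max(range(len(coords)), key=scores.__getitem__)  # index 4
```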

Step 9: Find Clusters

Use the App's Embeddings panel to visually identify clusters, then:

Option A: Lasso selection in App

Use lasso tool to select a cluster
Selected samples are highlighted
Tag or export selected samples

Option B: Use similarity to find cluster members

# Sort by similarity to a representative sample
execute_operator(
    operator_uri="@voxel51/brain/sort_by_similarity",
    params={
        "brain_key": "img_viz",
        "query_id": "sample_id_from_cluster",
        "k": 100
    }
)
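
Conceptually, sorting by similarity with a cosine metric ranks every other sample by the cosine of the angle between its embedding and the query's. The sketch below shows that ranking on toy vectors. It is not the operator's implementation, and the dict-of-lists input shape and function names are assumptions for the example.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    embeddings: Dict[str, Sequence[float]], query_id: str, k: int = 100
) -> List[str]:
    """Return up to k sample ids, most similar to the query first."""
    query = embeddings[query_id]
    scored = [
        (sample_id, cosine_similarity(query, vector))
        for sample_id, vector in embeddings.items()
        if sample_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [sample_id for sample_id, _ in scored[:k]]


toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
neighbours = top_k_similar(toy, "a", k=2)  # "b" first, then "c"
```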

Step 10: Clean Up

close_app()

Available Tools

Session View Tools

Tool	Description
`set_view(filters={...})`	Filter samples by field values
`set_view(tags=[...])`	Filter samples by tags
`set_view(sort_by="...", reverse=True)`	Sort samples by field
`set_view(limit=N)`	Limit to N samples
`clear_view()`	Clear filters, show all samples

Brain Operators for Visualization

Use list_operators() to discover and get_operator_schema() to see parameters:

Operator	Description
`@voxel51/brain/compute_similarity`	Compute embeddings and similarity index
`@voxel51/brain/compute_visualization`	Reduce embeddings to 2D/3D for visualization
`@voxel51/brain/compute_uniqueness`	Score samples by uniqueness (outlier detection)
`@voxel51/brain/sort_by_similarity`	Sort by similarity to a query sample

Common Use Cases

Use Case 1: Basic Dataset Exploration

Visualize dataset structure and explore clusters:

set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If embeddings exist (e.g., clip_embeddings), use them directly:
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "exploration",
        "embeddings": "clip_embeddings",
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)

# Direct user to App Embeddings panel at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "exploration" from dropdown
# 3. Use "Color by" to color by ground_truth or predictions

Use Case 2: Find Outliers in Dataset

Identify anomalous or mislabeled samples:

set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "outliers",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Compute uniqueness scores
execute_operator(
    operator_uri="@voxel51/brain/compute_uniqueness",
    params={"brain_key": "outliers"}
)

# Generate visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "outliers",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)

# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "outliers" from dropdown
# 3. Outliers appear as isolated points far from clusters
# 4. Optionally sort by uniqueness field in the App sidebar

Use Case 3: Compare Classes in Embedding Space

See how different classes cluster:

set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "class_viz",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Generate visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "class_viz",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)

# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "class_viz" from dropdown
# 3. Use "Color by" dropdown to color by ground_truth or predictions
# Look for:
# - Well-separated clusters = good class distinction
# - Overlapping clusters = similar classes or confusion
# - Scattered points = high variance within class
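
One rough way to put a number on "well-separated" versus "overlapping" is to check how often a point's nearest class centroid is its own class. The sketch below does this for 2D coordinates in plain Python. It is a hypothetical helper for illustration, not something the Embeddings panel or the brain plugin computes.

```python
import math
from collections import defaultdict
from typing import Sequence


def centroid_agreement(points: Sequence[Sequence[float]], labels: Sequence[str]) -> float:
    """Fraction of points whose nearest class centroid matches their label.

    Values near 1.0 suggest well-separated classes; lower values suggest overlap.
    """
    if not points or len(points) != len(labels):
        raise ValueError("points and labels must be non-empty and equal in length")
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    # Centroid = per-dimension mean of the points in each class.
    centroids = {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in groups.items()
    }
    hits = 0
    for point, label in zip(points, labels):
        nearest = min(centroids, key=lambda name: math.dist(point, centroids[name]))
        if nearest == label:
            hits += 1
    return hits / len(points)
```

For example, two tight groups far apart score 1.0, while a "dog" point sitting inside the "cat" group lowers the score.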

Use Case 4: Analyze Model Predictions

Compare ground truth vs predictions in embedding space:

set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "pred_analysis",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Generate visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "pred_analysis",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)

# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "pred_analysis" from dropdown
# 3. Color by ground_truth - see true class distribution
# 4. Color by predictions - see model's view
# 5. Look for mismatches to find errors
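
Once labels have been exported, the "look for mismatches" step can also be done programmatically. The sketch below assumes each sample has already been reduced to a plain dict with hypothetical keys `id`, `ground_truth`, and `predictions` holding label strings; real FiftyOne samples are structured differently.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def find_mismatches(samples: Iterable[Dict[str, str]]) -> Tuple[List[str], Counter]:
    """Return ids where prediction != ground truth, plus (truth, predicted) counts."""
    mismatched_ids = []
    confusions = Counter()
    for sample in samples:
        truth, predicted = sample["ground_truth"], sample["predictions"]
        if truth != predicted:
            mismatched_ids.append(sample["id"])
            confusions[(truth, predicted)] += 1
    return mismatched_ids, confusions


# Hypothetical exported labels.
records = [
    {"id": "s1", "ground_truth": "cat", "predictions": "cat"},
    {"id": "s2", "ground_truth": "cat", "predictions": "dog"},
    {"id": "s3", "ground_truth": "cat", "predictions": "dog"},
    {"id": "s4", "ground_truth": "dog", "predictions": "dog"},
]
ids, confusions = find_mismatches(records)
```

The most frequent (truth, predicted) pairs point at the classes worth inspecting first in the Embeddings panel.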

Use Case 5: t-SNE for Publication-Quality Plots

Use t-SNE for better local structure (no extra dependencies):

set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them (DINOv2 for visual similarity):
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "tsne_viz",
        "model": "dinov2-vits14-torch",
        "embeddings": "dinov2_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Generate t-SNE visualization (no umap-learn dependency needed)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "tsne_viz",
        "embeddings": "dinov2_embeddings",  # Use existing field if available
        "method": "tsne",
        "num_dims": 2
    }
)

# Direct user to App at http://localhost:5151/
# 1. Click Embeddings panel icon
# 2. Select "tsne_viz" from dropdown
# 3. t-SNE provides better local cluster structure than UMAP
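
The use cases above share one call sequence and differ mainly in brain_key, model, and method. The sketch below builds that sequence as data, so it can be reviewed before any tool is called. `build_visualization_plan` is a hypothetical helper, and `existing_fields` stands for whatever embeddings fields were found in the operator schema.

```python
from typing import List, Sequence, Tuple


def build_visualization_plan(
    dataset_name: str,
    brain_key: str,
    embeddings_field: str = "clip_embeddings",
    model: str = "clip-vit-base32-torch",
    method: str = "umap",
    existing_fields: Sequence[str] = (),
) -> List[Tuple[str, dict]]:
    """Return ordered (tool_name, arguments) pairs for one visualization run."""
    plan = [
        ("set_context", {"dataset_name": dataset_name}),
        ("launch_app", {}),
    ]
    # Compute embeddings only when the field is not already on the dataset.
    if embeddings_field not in existing_fields:
        plan.append((
            "execute_operator",
            {
                "operator_uri": "@voxel51/brain/compute_similarity",
                "params": {
                    "brain_key": brain_key,
                    "model": model,
                    "embeddings": embeddings_field,
                    "backend": "sklearn",
                    "metric": "cosine",
                },
            },
        ))
    plan.append((
        "execute_operator",
        {
            "operator_uri": "@voxel51/brain/compute_visualization",
            "params": {
                "brain_key": brain_key,
                "embeddings": embeddings_field,
                "method": method,
                "num_dims": 2,
            },
        },
    ))
    return plan
```

close_app() is left out on purpose, since the user explores the Embeddings panel after the last step.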

Troubleshooting

Error: "No executor available"

Cause: Delegated operators require the App executor
Solution: Ensure launch_app() was called and wait 5-10 seconds

Error: "Brain key not found"

Cause: Embeddings not computed
Solution: Run compute_similarity first with a brain_key

Error: "Operator not found"

Cause: Brain plugin not installed
Solution: Install with download_plugin() and enable_plugin()

Error: "You must install the umap-learn>=0.5 package"

Cause: UMAP method requires the umap-learn package
Solutions:
1. Install umap-learn: Ask user if they want to run pip install umap-learn
2. Use t-SNE instead: Change method to "tsne" (no extra dependencies)
3. Use PCA instead: Change method to "pca" (fastest, no extra dependencies)
After installing umap-learn, restart Claude Code/MCP server and retry

Visualization is slow

Use UMAP instead of t-SNE for large datasets
Use faster embedding model: mobilenet-v2-imagenet-torch
Process subset first: set_view(limit=1000)

Embeddings panel not showing

Ensure visualization was computed (not just embeddings)
Check brain_key matches in both compute_similarity and compute_visualization
Refresh the App page

Points not colored correctly

Verify the field exists on samples
Check field type is compatible (Classification, Detections, or string)

Best Practices

Discover dynamically - Use list_operators() and get_operator_schema() to get current operator names and parameters
Choose the right model - CLIP for semantic similarity, DINOv2 for visual similarity
Start with UMAP - Faster and often better than t-SNE for exploration
Use uniqueness for outliers - More reliable than visual inspection alone
Store embeddings - Reuse for multiple visualizations via brain_key
Subset large datasets - Compute on subset first, then full dataset

Performance Notes

Embedding computation time:

1,000 images: ~1-2 minutes
10,000 images: ~10-15 minutes
100,000 images: ~1-2 hours

Visualization computation time:

UMAP: ~30 seconds for 10,000 samples
t-SNE: ~5-10 minutes for 10,000 samples
PCA: ~5 seconds for 10,000 samples

Memory requirements:

~2KB per image for embeddings
~16 bytes per image for 2D coordinates
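
The per-image figures above are consistent with a 512-dimensional float32 embedding (512 × 4 = 2,048 bytes) and two float64 coordinates (2 × 8 = 16 bytes). Those sizes are assumptions chosen because they reproduce the numbers; actual dimensions depend on the model. A short sketch for scaling the estimate to a dataset:

```python
def estimate_storage_bytes(
    num_images: int,
    embedding_dim: int = 512,    # assumed embedding length
    embedding_itemsize: int = 4,  # assumed float32
    coord_dims: int = 2,
    coord_itemsize: int = 8,      # assumed float64
) -> dict:
    """Estimate raw storage for embeddings and 2D coordinates, in bytes."""
    per_embedding = embedding_dim * embedding_itemsize
    per_coordinates = coord_dims * coord_itemsize
    return {
        "embeddings": num_images * per_embedding,
        "coordinates": num_images * per_coordinates,
    }


estimate = estimate_storage_bytes(10_000)
embeddings_mb = estimate["embeddings"] / 1_000_000  # 20.48 MB for 10,000 images
```

The estimate covers raw array data only and ignores database and index overhead.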