
vector-db

A Skill for working with vector databases for AI search features such as embeddings, semantic search, and building RAG pipelines.

📜 Original English description (for reference)

Vector databases for embeddings, semantic search, and RAG pipelines. Use when user mentions "vector database", "embeddings", "semantic search", "RAG", "retrieval augmented generation", "pinecone", "chromadb", "pgvector", "qdrant", "weaviate", "similarity search", "embedding store", or building AI search features.

🇯🇵 Notes for Japanese creators

In short

A Skill for working with vector databases for AI search features such as embeddings, semantic search, and building RAG pipelines.

※ This commentary was added by the jpskill.com editorial team for Japanese business users. It is reference information, independent of the Skill's actual behavior.

⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It automates everything: download → extract → place in the right folder.

🍎 Mac / 🐧 Linux
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o vector-db.zip https://jpskill.com/download/6140.zip && unzip -o vector-db.zip && rm vector-db.zip
🪟 Windows (PowerShell)
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/6140.zip -OutFile "$d\vector-db.zip"; Expand-Archive "$d\vector-db.zip" -DestinationPath $d -Force; ri "$d\vector-db.zip"

When it finishes, restart Claude Code → then just ask naturally, for example "set up semantic search with a vector database", and the Skill activates automatically.

💾 Manual download (if the commands feel intimidating)
  1. Click the blue button below to download vector-db.zip
  2. Double-click the ZIP file to extract it → a vector-db folder appears
  3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
  4. Restart Claude Code

⚠️ Download and use at your own risk. This site accepts no responsibility for the Skill's content, behavior, or safety.

🎯 What this Skill can do

The description below explains what this Skill will do for you. When you give Claude a request in this area, it activates automatically.

📦 Installation (3 steps)

  1. Click the "Download" button above to get the .skill file
  2. Rename the extension from .skill to .zip and extract it (macOS can extract it automatically)
  3. Place the extracted folder in .claude/skills/ under your home folder
    • macOS / Linux: ~/.claude/skills/
    • Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.

See the detailed usage guide →
Last updated: 2026-05-17
Retrieved: 2026-05-17
Included files: 1

📜 SKILL.md (the original English text that Claude reads)

Vector Databases

What Vector Databases Do

Vector databases store high-dimensional numerical representations (embeddings) and enable fast similarity search. Unlike traditional databases that match exact values, vector databases find the closest vectors to a query vector, enabling semantic matching.

Core capabilities:

  • Store embeddings alongside metadata and original content
  • Perform approximate nearest neighbor (ANN) search at scale
  • Filter results by metadata combined with vector similarity
  • Handle millions to billions of vectors with sub-second query times

Embedding Basics

An embedding is a fixed-length array of floats capturing semantic meaning. Text with similar meaning produces vectors that are close together in the embedding space.

  • Dimensions: Vector length. Common sizes: 384, 768, 1536, 3072. Higher = more nuance, more cost.
  • Embedding model: Converts raw data into vectors. Different models produce different dimensions.
  • Distance metric: How similarity between two vectors is measured.
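
As a quick illustration of what "close together in the embedding space" means, here is a minimal sketch using the sentence-transformers model shown in the next section (exact scores depend on the model, but the related pair should score clearly higher):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384 dimensions
vecs = model.encode(["vector database", "embedding store", "banana bread recipe"])

# Cosine similarity: higher = closer in embedding space
print(util.cos_sim(vecs[0], vecs[1]))  # related terms, higher score
print(util.cos_sim(vecs[0], vecs[2]))  # unrelated terms, lower score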

Generating Embeddings

OpenAI

from openai import OpenAI
client = OpenAI()
response = client.embeddings.create(
    input="What is a vector database?",
    model="text-embedding-3-small"  # 1536 dimensions
)
vector = response.data[0].embedding

Sentence-Transformers (Local)

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")  # 384 dimensions
vectors = model.encode(["What is a vector database?", "How does search work?"])

Cohere

import cohere
co = cohere.Client("your-api-key")
response = co.embed(
    texts=["What is a vector database?"],
    model="embed-english-v3.0",
    input_type="search_document"  # Use "search_query" for queries
)
vector = response.embeddings[0]

ChromaDB (Local, Python)

Lightweight, embedded vector database. Good for prototyping and small-to-medium workloads.

pip install chromadb

import chromadb

client = chromadb.Client()  # In-memory
# client = chromadb.PersistentClient(path="./chroma_data")  # Persistent

collection = client.create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"}  # cosine, l2, or ip
)

# Add documents (ChromaDB auto-generates embeddings with its default model)
collection.add(
    ids=["doc1", "doc2", "doc3"],
    documents=[
        "Vector databases store embeddings",
        "PostgreSQL is a relational database",
        "Semantic search finds similar meaning"
    ],
    metadatas=[
        {"source": "wiki", "topic": "vectors"},
        {"source": "docs", "topic": "sql"},
        {"source": "wiki", "topic": "search"}
    ]
)

# Query
results = collection.query(
    query_texts=["How do vector stores work?"],
    n_results=2,
    where={"source": "wiki"}  # Optional metadata filter
)
# results["documents"], results["distances"], results["metadatas"]

To use pre-computed embeddings, pass embeddings=[[...]] instead of documents in both add() and query() (via query_embeddings).
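
A minimal sketch of that, assuming model is the all-MiniLM-L6-v2 SentenceTransformer shown earlier (its 384 dimensions happen to match ChromaDB's default embedding model, so the vectors are compatible with the collection above):

docs = ["Vector databases store embeddings", "SQL databases use tables"]
vecs = model.encode(docs).tolist()

collection.add(ids=["pre1", "pre2"], embeddings=vecs, documents=docs)

query_vec = model.encode("How do vector stores work?").tolist()
results = collection.query(query_embeddings=[query_vec], n_results=2)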

pgvector (PostgreSQL Extension)

Adds vector column type and similarity operators to PostgreSQL. Use when you already run Postgres and want vectors alongside relational data.

Setup and Schema

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(1536),
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

Indexing

-- HNSW index (recommended)
CREATE INDEX ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- IVFFlat index (faster to build, slower to query)
CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

Operator classes: vector_cosine_ops (<=>), vector_l2_ops (<->), vector_ip_ops (<#>).

Query

SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE metadata->>'category' = 'technical'
ORDER BY embedding <=> $1::vector
LIMIT 5;

Python: use psycopg with pgvector.psycopg.register_vector(conn) to pass vectors directly as parameters.
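
A minimal sketch of that pattern (the connection string is a placeholder, and query_vec stands in for a real 1536-dimensional embedding matching the table definition above):

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("dbname=mydb")  # placeholder DSN
register_vector(conn)                  # lets NumPy arrays be passed as vector parameters

query_vec = np.array([0.1] * 1536, dtype=np.float32)  # placeholder embedding
rows = conn.execute(
    "SELECT id, content, 1 - (embedding <=> %s) AS similarity "
    "FROM documents ORDER BY embedding <=> %s LIMIT 5",
    (query_vec, query_vec),
).fetchall()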

Pinecone (Managed Cloud)

Fully managed vector database. No infrastructure to maintain. Supports namespaces for logical partitioning.

pip install pinecone

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",  # cosine, euclidean, dotproduct
    spec={"serverless": {"cloud": "aws", "region": "us-east-1"}}
)

index = pc.Index("my-index")

# Upsert vectors
index.upsert(
    vectors=[
        {"id": "doc1", "values": [0.1, 0.2, ...],
         "metadata": {"source": "wiki", "topic": "databases"}},
        {"id": "doc2", "values": [0.3, 0.4, ...],
         "metadata": {"source": "blog", "topic": "search"}}
    ],
    namespace="articles"
)

# Query with metadata filter
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="articles",
    filter={"topic": {"$eq": "databases"}},
    include_metadata=True
)

Namespaces partition data within an index. Queries only search within one namespace. Use them to separate tenants, environments, or document types.
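
For example (a sketch; vec stands for an already computed 1536-dimensional embedding, and the namespace names are illustrative):

# Same index, one namespace per tenant
index.upsert(vectors=[{"id": "t1-doc1", "values": vec}], namespace="tenant-1")
index.upsert(vectors=[{"id": "t2-doc1", "values": vec}], namespace="tenant-2")

# This query only sees tenant-1's vectors
index.query(vector=vec, top_k=3, namespace="tenant-1")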

Qdrant (Self-Hosted or Cloud)

High-performance vector database written in Rust. Self-hosted via Docker or managed cloud.

docker run -p 6333:6333 qdrant/qdrant
pip install qdrant-client

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct, Filter, FieldCondition, MatchValue

client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

client.upsert(
    collection_name="documents",
    points=[
        PointStruct(id=1, vector=[0.1, 0.2, ...],
                    payload={"text": "Vector databases store embeddings", "source": "wiki"}),
        PointStruct(id=2, vector=[0.3, 0.4, ...],
                    payload={"text": "SQL databases use tables", "source": "docs"})
    ]
)

results = client.query_points(
    collection_name="documents",
    query=[0.1, 0.2, ...],
    limit=5,
    query_filter=Filter(must=[FieldCondition(key="source", match=MatchValue(value="wiki"))])
)

Distance Metrics

Cosine: distance ranges 0 to 2 (often 0 to 1 for text embeddings). Best for text similarity; direction matters, magnitude is ignored. The most common default.
Euclidean (L2): 0 to infinity. Best for image features and spatial data; sensitive to magnitude.
Dot product: -infinity to infinity. Best for pre-normalized vectors; fastest, and equivalent to cosine for unit vectors.
  • Use cosine unless you have a specific reason not to.
  • Use dot product when vectors are already unit-normalized (see the normalization sketch after this list).
  • Use Euclidean when vector magnitude carries meaning.
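
A minimal normalization sketch (assuming embedding is a Python list or NumPy array of floats):

import numpy as np

v = np.asarray(embedding, dtype=np.float32)
v_unit = v / np.linalg.norm(v)  # unit length: dot product now equals cosine similarity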

Indexing Types

Flat (Brute Force): Compares query against every vector. Perfect recall, slowest. Use for under 10k vectors.

HNSW (Hierarchical Navigable Small World): Graph-based approximate search. High recall (>95%), fast queries, higher memory. Best general-purpose index. Key params: M (connections per node), ef_construction (build quality), ef (search quality).

IVF (Inverted File Index): Clusters vectors, searches nearby clusters only. Faster to build than HNSW, lower recall. Key param: nlist (number of clusters). Good when you need fast index builds.
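
As a concrete example of these knobs, here is a hedged sketch using Qdrant's parameter names (HnswConfigDiff at build time, hnsw_ef at query time); pgvector's m and ef_construction shown earlier play the same role:

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, HnswConfigDiff, SearchParams

client = QdrantClient(url="http://localhost:6333")
client.create_collection(
    collection_name="docs_hnsw",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    hnsw_config=HnswConfigDiff(m=16, ef_construct=128),  # build-time recall/memory trade-off
)

query_vec = [0.0] * 384  # placeholder embedding
client.query_points(
    collection_name="docs_hnsw",
    query=query_vec,
    limit=5,
    search_params=SearchParams(hnsw_ef=256),  # higher = better recall, slower queries
)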

Chunking Strategies

Before embedding, long documents must be split into chunks.

Fixed-Size: Split into fixed-size chunks with overlap. Simple and predictable. (The sketch below counts words as a rough stand-in for tokens; swap in a tokenizer for exact token counts.)

def fixed_chunks(text, size=512, overlap=50):
    # size and overlap are measured in words here, an approximation of tokens
    words = text.split()
    return [" ".join(words[i:i+size]) for i in range(0, len(words), size - overlap)]

Sentence-Based: Split on sentence boundaries using nltk.sent_tokenize(). Preserves grammatical units.

Recursive Character Splitting: Split by paragraphs, then sentences, then words. Keeps semantically related text together. Used by LangChain's RecursiveCharacterTextSplitter.
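
A minimal sketch with LangChain (the import path below is the one used by recent releases; older versions expose the same class from langchain.text_splitter):

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(long_document)  # long_document: any large string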

Semantic Chunking: Group sentences by embedding similarity. Start a new chunk when similarity drops below threshold. Most coherent results, but slower and more expensive.

Guidelines:

  • 256-512 tokens is a good default chunk size.
  • Use 10-20% overlap to preserve context at boundaries.
  • Smaller chunks = more precise retrieval; larger chunks = more context per result.
  • Q&A benefits from smaller chunks; summarization from larger ones.

Metadata Filtering

All major vector databases support combining vector similarity with metadata filters.

Common operations:

  • Equality: {"category": "technical"}
  • Range: {"date": {"$gte": "2024-01-01"}}
  • List membership: {"tags": {"$in": ["python", "rust"]}}
  • Boolean: {"$and": [...]}, {"$or": [...]}

Pre-filtering reduces the number of vectors compared and improves query speed.
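
The exact filter syntax varies by database. As one example, ChromaDB's where argument combines these operators like this (the metadata field names are illustrative):

results = collection.query(
    query_texts=["recent Python articles"],
    n_results=5,
    where={"$and": [
        {"language": {"$in": ["python", "rust"]}},
        {"year": {"$gte": 2024}},
    ]},
)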

RAG Pipeline Pattern

Retrieval-Augmented Generation: embed query -> vector search -> inject context -> LLM generates answer.

def rag_query(question, collection, llm_client, embed_model, top_k=5):
    query_vector = embed_model.encode(question).tolist()
    results = collection.query(query_embeddings=[query_vector], n_results=top_k)
    context = "\n\n".join(results["documents"][0])

    response = llm_client.chat.completions.create(
        model="claude-sonnet-4-20250514",
        messages=[
            {"role": "system", "content":
                "Answer using only the provided context. "
                "If the context lacks the answer, say so."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
        ]
    )
    return response.choices[0].message.content

Key points:

  • Always use the same embedding model for documents and queries.
  • Retrieve more chunks than you think you need, then let the LLM filter relevance.
  • Include metadata (source, page number) so the LLM can cite sources (see the sketch after this list).
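
A minimal sketch of a citation-friendly context string, assuming Chroma-style query results and whatever source/page keys you actually stored in the metadata:

docs = results["documents"][0]
metas = results["metadatas"][0]
context = "\n\n".join(
    f"[{m.get('source', 'unknown')}, p.{m.get('page', '?')}] {d}"
    for d, m in zip(docs, metas)
)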

Common Patterns

Document Q&A: Chunk documents (256-512 tokens with overlap), embed and store with metadata (doc ID, page, section), retrieve top-k at query time, pass to LLM.

Code Search: Parse into functions/classes, embed both code and natural language descriptions, use metadata filters for language/repo/path.

Recommendation Engine: Embed items by description/features, embed user preferences or recent interactions, search for similar items filtering out already-seen content.
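
For instance, the "filter out already-seen content" step could look like this in ChromaDB (the items collection, the item_id field, and user_profile_vector are all illustrative):

seen_ids = ["item-12", "item-87"]            # items the user has already interacted with
recs = items.query(
    query_embeddings=[user_profile_vector],  # e.g. an average of recently liked item vectors
    n_results=10,
    where={"item_id": {"$nin": seen_ids}},
)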

Choosing a Vector Database

  • Prototyping, local dev: ChromaDB
  • Already using PostgreSQL: pgvector
  • Managed, zero-ops: Pinecone
  • Self-hosted, high performance: Qdrant
  • Large-scale production: Qdrant or Pinecone
  • Tight budget, moderate scale: pgvector, or ChromaDB with persistent storage