golang-performance

A Skill that applies profiling, memory optimization, concurrency patterns, and escape analysis to get the most performance out of Go programs.

📜 Original English description (for reference)

Go performance optimization techniques including profiling with pprof, memory optimization, concurrency patterns, and escape analysis.

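Note: the commentary on this page was added by the jpskill.com editorial team for a Japanese business audience. It is reference material and is independent of the Skill's actual behavior.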

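⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, extracts it, and places it in the skills directory automatically.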

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o golang-performance.zip https://jpskill.com/download/6915.zip && unzip -o golang-performance.zip && rm golang-performance.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/6915.zip -OutFile "$d\golang-performance.zip"; Expand-Archive "$d\golang-performance.zip" -DestinationPath $d -Force; ri "$d\golang-performance.zip"

When it finishes, restart Claude Code. The Skill then activates automatically whenever you make a related request in ordinary language.

💾 Manual download (if you prefer not to use the command line)

1. Download golang-performance.zip
2. Double-click the ZIP file to extract it; this creates a golang-performance folder
3. Move that folder to C:\Users\YourName\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

🎯 What this Skill does

The description below shows what this Skill does for you. It activates automatically when you ask Claude for help in this area.

📦 Installation (3 steps)

1. Download the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.

Last updated: 2026-05-17
Retrieved: 2026-05-17
Bundled files: 1

Golang Performance

This skill provides guidance on optimizing Go application performance including profiling, memory management, concurrency optimization, and avoiding common performance pitfalls.

When to Use This Skill

When profiling Go applications for CPU or memory issues
When optimizing memory allocations and reducing GC pressure
When implementing efficient concurrency patterns
When analyzing escape analysis results
When optimizing hot paths in production code

Profiling with pprof

Enable Profiling in HTTP Server

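import (
    "log"
    "net/http"
    _ "net/http/pprof" // Registers its handlers on http.DefaultServeMux
)

func main() {
    // pprof endpoints available at /debug/pprof/
    go func() {
        // Bound to localhost so profiles are not exposed on public interfaces
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // Main application
}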

CPU Profiling

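# Collect 30-second CPU profile (URL quoted so the shell does not expand "?")
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

# Interactive commands
(pprof) top10          # Top 10 functions by CPU
(pprof) list FuncName  # Show source with timing
(pprof) web            # Open call graph (SVG) in browser; requires Graphviz

# For a flame graph, open the profile in the web UI instead
go tool pprof -http=:8080 cpu.prof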

Memory Profiling

# Heap profile
go tool pprof http://localhost:6060/debug/pprof/heap

# Allocs profile (all allocations)
go tool pprof http://localhost:6060/debug/pprof/allocs

# Interactive commands
(pprof) top10 -cum     # Top by cumulative allocations
(pprof) list FuncName  # Show allocation sites

Programmatic Profiling

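import (
    "log"
    "os"
    "runtime"
    "runtime/pprof"
)

func profileCPU() {
    f, err := os.Create("cpu.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()

    // Code to profile
}

func profileMemory() {
    f, err := os.Create("mem.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    runtime.GC() // Run a collection first so the profile reflects live objects
    if err := pprof.WriteHeapProfile(f); err != nil {
        log.Fatal(err)
    }
}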

Memory Optimization

Reduce Allocations

// BAD: append regrows and copies the backing array repeatedly as the slice fills
func Process(items []string) []string {
    result := []string{}
    for _, item := range items {
        result = append(result, transform(item))
    }
    return result
}

// GOOD: Pre-allocate with known capacity
func Process(items []string) []string {
    result := make([]string, 0, len(items))
    for _, item := range items {
        result = append(result, transform(item))
    }
    return result
}
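
One way to see the effect of pre-allocation without a profiler is to count how often the backing array changes capacity while appending. The sketch below is illustrative only: `growths` is a hypothetical helper, and the number of regrowths for an unsized slice depends on the growth policy of the Go release in use.

```go
package main

import "fmt"

// growths appends n ints to a slice and reports how many times the
// backing array was reallocated, detected as a change in capacity.
func growths(n int, prealloc bool) int {
	var s []int
	if prealloc {
		s = make([]int, 0, n)
	}
	count := 0
	lastCap := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != lastCap {
			count++
			lastCap = cap(s)
		}
	}
	return count
}

func main() {
	fmt.Println("regrowths without preallocation:", growths(1000, false))
	fmt.Println("regrowths with preallocation:   ", growths(1000, true))
}
```

Each regrowth copies every element written so far, which is why the pre-sized version does less work when the final length is known.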

Use sync.Pool for Frequent Allocations

var bufferPool = sync.Pool{
    New: func() interface{} {
        return new(bytes.Buffer)
    },
}

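func ProcessRequest(data []byte) []byte {
    buf := bufferPool.Get().(*bytes.Buffer)
    buf.Reset()
    defer bufferPool.Put(buf)

    // Use buffer
    buf.Write(data)

    // Copy the result out: buf.Bytes() aliases memory that goes back to the
    // pool, so returning it directly would let a later caller overwrite it
    return append([]byte(nil), buf.Bytes()...)
}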

Avoid String Concatenation in Loops

// BAD: Each += allocates a new string and copies the old one, O(n^2) bytes copied overall
func BuildString(parts []string) string {
    result := ""
    for _, part := range parts {
        result += part
    }
    return result
}

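// GOOD: Pre-sizing with Grow makes this a single allocation
func BuildString(parts []string) string {
    n := 0
    for _, part := range parts {
        n += len(part)
    }

    var builder strings.Builder
    builder.Grow(n)
    for _, part := range parts {
        builder.WriteString(part)
    }
    return builder.String()
}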

Slice Memory Leaks

// BAD: Keeps entire backing array alive
func GetFirst(data []byte) []byte {
    return data[:10]
}

// GOOD: Copy to release backing array
func GetFirst(data []byte) []byte {
    result := make([]byte, 10)
    copy(result, data[:10])
    return result
}
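
The difference between the two versions shows up in the capacity of the returned slice: a sub-slice keeps the capacity (and therefore the backing array) of its parent, while a copy has only what it needs. The following is a small sketch with hypothetical helper names; it also clamps the length so short inputs do not panic.

```go
package main

import "fmt"

// firstAliased returns a sub-slice that still shares data's backing array.
func firstAliased(data []byte, n int) []byte {
	if n > len(data) {
		n = len(data)
	}
	return data[:n]
}

// firstCopied returns an independent copy, so the original array can be
// collected once nothing else references it.
func firstCopied(data []byte, n int) []byte {
	if n > len(data) {
		n = len(data)
	}
	out := make([]byte, n)
	copy(out, data[:n])
	return out
}

func main() {
	big := make([]byte, 1<<20) // 1 MiB buffer
	fmt.Println("aliased cap:", cap(firstAliased(big, 10)))
	fmt.Println("copied cap: ", cap(firstCopied(big, 10)))
}
```

The copy is only worth its cost when the small slice outlives the large buffer; for short-lived results the aliased form is cheaper.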

Escape Analysis

# Show escape analysis decisions
go build -gcflags="-m" ./...

# More verbose
go build -gcflags="-m -m" ./...

Avoiding Heap Escapes

// ESCAPES: Returned pointer
func NewUser() *User {
    return &User{}  // Heap-allocated, unless the call is inlined and the result does not escape the caller
}

// STAYS ON STACK: Value return
func NewUser() User {
    return User{}  // May stay on stack
}

// MAY ESCAPE: Interface conversion
func Process(v interface{}) { ... }

func main() {
    x := 42
    Process(x)  // x is boxed; it escapes to heap if Process retains v or the compiler cannot prove otherwise
}
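
To try the flags above on something small, a file like the following can be built with `go build -gcflags="-m"`. This is a sketch under assumptions: the `User` fields and the function names are hypothetical, inlining is disabled with `//go:noinline` so the decisions are easier to read, and the exact wording of the compiler's diagnostics varies between Go releases.

```go
package main

import "fmt"

type User struct {
	Name string
	Age  int
}

// newUserPtr returns a pointer, so the User must outlive this frame.
// With inlining disabled, the compiler is expected to report that the
// composite literal escapes to the heap.
//
//go:noinline
func newUserPtr(name string) *User {
	return &User{Name: name}
}

// newUserVal returns a copy, so the value can live in the caller's frame.
//
//go:noinline
func newUserVal(name string) User {
	return User{Name: name}
}

func main() {
	p := newUserPtr("a")
	v := newUserVal("b")
	fmt.Println(p.Name, v.Name)
}
```

Returning a value is not automatically faster: for large structs the copy can cost more than the allocation, so the choice is worth measuring.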

Concurrency Optimization

Worker Pool Pattern

func ProcessItems(items []Item, workers int) []Result {
    jobs := make(chan Item, len(items))
    results := make(chan Result, len(items))

    // Start workers
    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for item := range jobs {
                results <- process(item)
            }
        }()
    }

    // Send jobs
    for _, item := range items {
        jobs <- item
    }
    close(jobs)

    // Wait and collect
    go func() {
        wg.Wait()
        close(results)
    }()

    // Results arrive in completion order, not input order
    output := make([]Result, 0, len(items))
    for r := range results {
        output = append(output, r)
    }
    return output
}
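
The pool above collects results in whatever order the workers finish. When the caller needs results in input order, one option is to send indexes instead of items and let each worker write into its own slot. This is a minimal sketch; `mapOrdered` and its signature are hypothetical, and it uses `int` in place of the `Item` and `Result` types.

```go
package main

import (
	"fmt"
	"sync"
)

// mapOrdered applies fn to every item using a fixed number of workers and
// returns the results in input order. Each worker writes only to the
// index it was handed, so no lock is needed around out.
func mapOrdered(items []int, workers int, fn func(int) int) []int {
	if workers < 1 {
		workers = 1
	}
	out := make([]int, len(items))
	idx := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range idx {
				out[i] = fn(items[i])
			}
		}()
	}

	for i := range items {
		idx <- i
	}
	close(idx)
	wg.Wait()
	return out
}

func main() {
	squares := mapOrdered([]int{1, 2, 3, 4}, 3, func(n int) int { return n * n })
	fmt.Println(squares) // [1 4 9 16]
}
```

Writing by index also removes the results channel and the extra collector goroutine, at the cost of allocating the full output slice up front.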

Buffered Channels for Throughput

// SLOW under bursty load: every send blocks until a receiver is ready
ch := make(chan int)

// FAST: sends block only when the buffer is full, which absorbs bursts
ch := make(chan int, 100)

Avoid Lock Contention

// BAD: Global lock
var mu sync.Mutex
var cache = make(map[string]string)

func Get(key string) string {
    mu.Lock()
    defer mu.Unlock()
    return cache[key]
}

// GOOD: Sharded locks
type shard struct {
    mu    sync.RWMutex
    items map[string]string // Must be initialized with make before any write
}

type ShardedCache struct {
    shards [256]shard
}

// getShard hashes the key to pick one of the 256 shards
func (c *ShardedCache) getShard(key string) *shard {
    h := fnv.New32a()
    h.Write([]byte(key))
    return &c.shards[h.Sum32()%256]
}

func (c *ShardedCache) Get(key string) string {
    shard := c.getShard(key)
    shard.mu.RLock()
    defer shard.mu.RUnlock()
    return shard.items[key]
}
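
The snippet above only reads from the shards. A complete version also needs a constructor, because writing to an uninitialized map panics, and a write path that takes the exclusive lock. The following is a self-contained sketch; `NewShardedCache`, `Set`, `shardFor`, and `shardCount` are names introduced here for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 256

type shard struct {
	mu    sync.RWMutex
	items map[string]string
}

type ShardedCache struct {
	shards [shardCount]shard
}

// NewShardedCache initializes every shard's map up front.
func NewShardedCache() *ShardedCache {
	c := &ShardedCache{}
	for i := range c.shards {
		c.shards[i].items = make(map[string]string)
	}
	return c
}

// shardFor hashes the key with FNV-1a to pick a shard.
func (c *ShardedCache) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key)) // writes to a hash.Hash never return an error
	return &c.shards[h.Sum32()%shardCount]
}

// Get takes only the read lock of the one shard that owns the key.
func (c *ShardedCache) Get(key string) (string, bool) {
	s := c.shardFor(key)
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.items[key]
	return v, ok
}

// Set takes the exclusive lock of the one shard that owns the key.
func (c *ShardedCache) Set(key, value string) {
	s := c.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = value
}

func main() {
	c := NewShardedCache()
	c.Set("lang", "go")
	v, ok := c.Get("lang")
	fmt.Println(v, ok) // go true
}
```

Sharding helps only when contention on a single lock is the measured bottleneck; with few goroutines, one mutex and one map is simpler and often just as fast.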

Use sync.Map for Specific Cases

// Good for: keys written once, read many; disjoint key sets
var cache sync.Map

func Get(key string) (string, bool) {
    v, ok := cache.Load(key)
    if !ok {
        return "", false
    }
    return v.(string), true
}

func Set(key, value string) {
    cache.Store(key, value)
}

Data Structure Optimization

Struct Field Ordering (Memory Alignment)

// BAD: 24 bytes (14 bytes of padding)
type Bad struct {
    a bool   // 1 byte + 7 padding
    b int64  // 8 bytes
    c bool   // 1 byte + 7 padding
}

// GOOD: 16 bytes (6 bytes of trailing padding)
type Good struct {
    b int64  // 8 bytes
    a bool   // 1 byte
    c bool   // 1 byte + 6 padding
}
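
The sizes in the comments can be checked with `unsafe.Sizeof`. The sketch below redeclares the two structs; the 24 and 16 byte figures assume a 64-bit platform where `int64` is 8-byte aligned, and the helper function names are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Bad interleaves small and large fields, forcing padding after each bool.
type Bad struct {
	a bool
	b int64
	c bool
}

// Good places the largest field first, so the two bools share one word.
type Good struct {
	b int64
	a bool
	c bool
}

func sizeOfBad() uintptr  { return unsafe.Sizeof(Bad{}) }
func sizeOfGood() uintptr { return unsafe.Sizeof(Good{}) }

func main() {
	fmt.Println("Bad: ", sizeOfBad(), "bytes")
	fmt.Println("Good:", sizeOfGood(), "bytes")
}
```

The saving matters mostly for structs stored in large slices or maps, where the padding is multiplied by the element count.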

Avoid interface{} When Possible

// SLOW: Type assertions, boxing
func Sum(values []interface{}) float64 {
    var sum float64
    for _, v := range values {
        sum += v.(float64)
    }
    return sum
}

// FAST: Concrete types
func Sum(values []float64) float64 {
    var sum float64
    for _, v := range values {
        sum += v
    }
    return sum
}
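
When the same function has to serve several numeric types, type parameters (Go 1.18 and later) are one way to keep concrete element types without falling back to `interface{}`. This is a sketch, not part of the original guidance: the `Number` constraint is a hypothetical name and lists only a few types.

```go
package main

import "fmt"

// Number is a hypothetical constraint listing the element types accepted.
type Number interface {
	~int | ~int64 | ~float64
}

// Sum works on concrete element types, so the loop needs no type assertion.
func Sum[T Number](values []T) T {
	var sum T
	for _, v := range values {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(Sum([]int{1, 2, 3}))       // 6
	fmt.Println(Sum([]float64{0.5, 1.5})) // 2
}
```

Whether the generic version matches a hand-written concrete function depends on the compiler and the types involved, so a benchmark is the safest way to compare.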

Benchmarking Patterns

func BenchmarkProcess(b *testing.B) {
    data := generateTestData()
    b.ResetTimer() // Exclude setup time

    for i := 0; i < b.N; i++ {
        Process(data)
    }
}

// Memory benchmarks
var sink []byte // Package-level sink keeps the compiler from eliding the allocation

func BenchmarkAllocs(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        sink = make([]byte, 1024)
    }
}

// Compare implementations
func BenchmarkComparison(b *testing.B) {
    b.Run("old", func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            OldImplementation()
        }
    })
    b.Run("new", func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            NewImplementation()
        }
    })
}

Run with:

go test -bench=. -benchmem ./...
go test -bench=. -benchtime=5s ./...  # Longer runs
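
Benchmarks can also be run from an ordinary program with `testing.Benchmark`, which is handy for a quick allocation check outside `go test`. The sketch below is illustrative: `measureAllocs` and `sink` are hypothetical names, and the result is stored in a package-level variable so the compiler cannot discard the allocation being measured.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so the allocation in the loop is observable
// and cannot be optimized away.
var sink []byte

// measureAllocs runs a benchmark body without "go test" and returns the result.
func measureAllocs() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = make([]byte, 1024)
		}
	})
}

func main() {
	res := measureAllocs()
	fmt.Println("iterations:", res.N)
	fmt.Println("allocs/op: ", res.AllocsPerOp())
	fmt.Println("bytes/op:  ", res.AllocedBytesPerOp())
}
```

For comparing two implementations, the `go test -bench` workflow above remains the better fit, since its output can be fed to comparison tools.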

Common Pitfalls

Defer in Hot Loops

// BAD: Deferred calls run only when the enclosing function returns
for _, item := range items {
    mu.Lock()
    defer mu.Unlock()  // Defers stack up, so the second iteration blocks forever on mu.Lock()
    process(item)
}

// GOOD: Explicit unlock
for _, item := range items {
    mu.Lock()
    process(item)
    mu.Unlock()
}

// BETTER: Extract to function
for _, item := range items {
    processWithLock(item)
}

func processWithLock(item Item) {
    mu.Lock()
    defer mu.Unlock()
    process(item)
}
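
The timing difference between the loop form and the extracted-function form can be made visible by recording the order of events. The following sketch uses hypothetical helpers and plain strings in place of a mutex, so it runs without blocking.

```go
package main

import (
	"fmt"
	"strings"
)

// deferInLoop registers cleanup with defer inside the loop body, so every
// cleanup waits until the enclosing function returns.
func deferInLoop() string {
	var events []string
	func() {
		for i := 0; i < 3; i++ {
			events = append(events, fmt.Sprintf("work%d", i))
			defer func(n int) {
				events = append(events, fmt.Sprintf("cleanup%d", n))
			}(i)
		}
	}()
	return strings.Join(events, " ")
}

// deferPerCall moves the body into its own function, so each cleanup runs
// at the end of its own iteration.
func deferPerCall() string {
	var events []string
	step := func(n int) {
		defer func() { events = append(events, fmt.Sprintf("cleanup%d", n)) }()
		events = append(events, fmt.Sprintf("work%d", n))
	}
	for i := 0; i < 3; i++ {
		step(i)
	}
	return strings.Join(events, " ")
}

func main() {
	fmt.Println(deferInLoop())  // work0 work1 work2 cleanup2 cleanup1 cleanup0
	fmt.Println(deferPerCall()) // work0 cleanup0 work1 cleanup1 work2 cleanup2
}
```

With a real mutex in place of the cleanup strings, the first form never reaches its second unit of work, because the lock is still held.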

JSON Encoding Performance

// SLOW: Allocates a new output slice on every call
json.Marshal(v)

// FAST: Reuse an encoder and its buffer (still reflection-based; Encode appends a newline)
var buf bytes.Buffer
encoder := json.NewEncoder(&buf)
encoder.Encode(v)

// FASTER: Code generation (easyjson, ffjson)
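
As a concrete version of the encoder-reuse pattern, the sketch below writes several values through one `json.Encoder` into one buffer and checks the error from `Encode`. The `event` type and `encodeAll` function are hypothetical; note that `Encode` terminates each value with a newline.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type event struct {
	ID int `json:"id"`
}

// encodeAll writes every event through one Encoder into one buffer,
// instead of calling json.Marshal once per value.
func encodeAll(events []event) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	for _, e := range events {
		if err := enc.Encode(e); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	out, err := encodeAll([]event{{ID: 1}, {ID: 2}})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

In a server, the same idea usually means encoding straight into the `http.ResponseWriter` rather than into an intermediate buffer.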

Best Practices

Measure before optimizing - Profile to find actual bottlenecks
Pre-allocate slices - Use make([]T, 0, capacity) when size is known
Pool frequently allocated objects - Use sync.Pool for buffers
Minimize allocations in hot paths - Reuse objects, avoid interfaces
Right-size channels - Buffer to reduce blocking without wasting memory
Avoid premature optimization - Clarity first, optimize measured problems
Use value receivers for small structs - Avoid pointer indirection
Order struct fields by size - Largest to smallest reduces padding