
zoom-rtms

Reference skill for Zoom RTMS. Use after routing to a live-media workflow when processing real-time audio, video, chat, transcripts, screen share, or contact-center voice streams.

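⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, extracts it, and places the files automatically.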

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o zoom-rtms.zip https://jpskill.com/download/22702.zip && unzip -o zoom-rtms.zip && rm zoom-rtms.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/22702.zip -OutFile "$d\zoom-rtms.zip"; Expand-Archive "$d\zoom-rtms.zip" -DestinationPath $d -Force; ri "$d\zoom-rtms.zip"

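When it finishes, restart Claude Code. The skill then activates automatically when you make a related request in plain language (for example, asking for help processing Zoom real-time media).

💾 Manual download (if you prefer not to use the command line)

1. Download zoom-rtms.zip using the download button on this page
2. Double-click the ZIP file to extract it; this creates a zoom-rtms folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the skill.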

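🎯 What this Skill does

The description below explains what this Skill does for you. When you ask Claude for something in this area, the Skill activates automatically.

📦 Installation (3 steps)

1. Download the .skill file from this page
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Once you restart Claude Code, setup is complete. You do not need to say "use this Skill"; it is invoked automatically for related requests.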

Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 7


📜 Original SKILL.md (the English text that Claude reads)

Zoom Realtime Media Streams (RTMS)

Background reference for live Zoom media pipelines. Prefer build-zoom-bot first, then use this skill for stream types, capabilities, and RTMS-specific implementation constraints.

Expert guidance for accessing live audio, video, transcript, chat, and screen share data from Zoom meetings, webinars, Video SDK sessions, and Zoom Contact Center Voice in real-time. RTMS uses a WebSocket-based protocol with open standards and does not require a meeting bot to capture the media plane.

Read This First (Critical)

RTMS is primarily a backend media ingestion service.

Your backend receives and processes live media: audio, video, screen share, chat, transcript.
RTMS is not a frontend UI SDK by itself.
Processing is event-triggered: backend waits for RTMS start webhook events before stream handling begins.

Optional architecture (common):

Add a Zoom App SDK frontend for in-client UI/controls.
Stream backend RTMS outputs to frontend via WebSocket (or SSE, gRPC, queue workers, etc.).

Use RTMS for media/data plane, and use frontend frameworks/Zoom Apps for presentation + user interactions.

Official Documentation: https://developers.zoom.us/docs/rtms/
SDK Reference (JS): https://zoom.github.io/rtms/js/
SDK Reference (Python): https://zoom.github.io/rtms/py/
Sample Repository: https://github.com/zoom/rtms-samples

Quick Links

New to RTMS? Follow this path:

Connection Architecture - Two-phase WebSocket design
SDK Quickstart - Fastest way to receive media (recommended)
Manual WebSocket - Full protocol control without SDK
Media Types - Audio, video, transcript, chat, screen share

Complete Implementation:

RTMS Bot - End-to-end bot implementation guide

Reference:

Lifecycle Flow - Complete webhook-to-streaming flow
Data Types - All enums and constants
Webhooks - Event subscription details
Environment Variables - credential modes and runtime knobs
Quickstart Notes - Secondary quickstart guide
Integrated Index - see the section below in this file

Having issues?

Connection fails -> Common Issues
Duplicate connections -> Webhook Gotchas
No audio/video -> Media Configuration
Start with preflight checks -> 5-Minute Runbook

Supported Products

Product	Webhook Event	Payload ID	App Type
Meetings	`meeting.rtms_started` / `meeting.rtms_stopped`	`meeting_uuid`	General App
Webinars	`webinar.rtms_started` / `webinar.rtms_stopped`	`meeting_uuid` (same!)	General App
Video SDK	`session.rtms_started` / `session.rtms_stopped`	`session_id`	Video SDK App
Zoom Contact Center Voice	Product-specific RTMS/ZCC Voice events	Product-specific stream/session identifiers	Contact Center / approved RTMS integration

Once connected, the core signaling/media socket model is shared across products. Meetings, webinars, and Video SDK sessions use the familiar start/stop webhooks. Zoom Contact Center Voice adds its own RTMS/ZCC Voice event family and should be treated as the same transport model with product-specific event payloads.
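
The event-to-identifier mapping in the table above can be sketched as a small lookup. This is a minimal illustration, not part of any Zoom SDK: `resolveStreamTarget` and `ID_FIELD_BY_EVENT` are hypothetical names, and Zoom Contact Center Voice is left out because its event names and identifiers are product-specific.

```javascript
// Maps each RTMS start event to the payload field that carries the identifier.
// Event names and field names come from the Supported Products table.
const ID_FIELD_BY_EVENT = new Map([
  ['meeting.rtms_started', 'meeting_uuid'],
  ['webinar.rtms_started', 'meeting_uuid'], // webinars reuse meeting_uuid
  ['session.rtms_started', 'session_id'],
]);

// Hypothetical helper: returns null for events that are not RTMS start events.
function resolveStreamTarget(event, payload) {
  const idField = ID_FIELD_BY_EVENT.get(event);
  if (!idField) return null;

  const idValue = payload[idField];
  if (!idValue) {
    throw new Error(`${event} payload is missing ${idField}`);
  }
  return { idField, idValue, streamId: payload.rtms_stream_id };
}
```

Keeping the mapping in one place avoids scattering `meeting_uuid || session_id` fallbacks through the handler code.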

RTMS Overview

RTMS is a data pipeline that gives your app access to live media from Zoom meetings, webinars, and Video SDK sessions without participant bots. Instead of having automated clients join meetings, use RTMS to collect media data directly from Zoom's infrastructure.

What RTMS Provides

Media Type	Format	Use Cases
Audio	PCM (L16), G.711, G.722, Opus	Transcription, voice analysis, recording
Video	H.264, JPG, PNG	Recording, AI vision, thumbnails, active participant selection
Screen Share	H.264, JPG, PNG	Content capture, slide extraction
Transcript	JSON text	Meeting notes, search, compliance
Chat	JSON text	Archive, sentiment analysis

March 2026 Protocol Changes

Zoom Contact Center Voice support: RTMS now covers Contact Center Voice audio and transcript scenarios.
Transcript Language Identification control: transcript media handshakes now support src_language and enable_lid. Default behavior is LID enabled. Set enable_lid: false to force a fixed language.
Single individual video stream subscription: RTMS can now stream one participant's camera feed at a time when data_opt is set to VIDEO_SINGLE_INDIVIDUAL_STREAM.
Graceful client-initiated shutdown: backends can send STREAM_CLOSE_REQ over the signaling socket and wait for STREAM_CLOSE_RESP.
Media keep-alive tolerance increased: media socket keep-alive timeout is now 65 seconds, not 35.

Two Approaches

Approach	Best For	Complexity
SDK (`@zoom/rtms`)	Most use cases	Low - handles WebSocket complexity
Manual WebSocket	Custom protocols, other languages	High - full protocol implementation

Prerequisites

Node.js 20.3.0+ (24 LTS recommended) for JavaScript SDK
Python 3.10+ for Python SDK
Zoom General App (for meetings/webinars) or Video SDK App (for Video SDK) with RTMS feature enabled
Webhook endpoint for RTMS events
Server to receive WebSocket streams

Need RTMS access? Post in the Zoom Developer Forum requesting RTMS access and describe your use case.

Quick Start (SDK - Recommended)

import rtms from "@zoom/rtms";

// RTMS start events across products (stop events are not handled in this example)
const RTMS_EVENTS = ["meeting.rtms_started", "webinar.rtms_started", "session.rtms_started"];

// Handle webhook events
rtms.onWebhookEvent(({ event, payload }) => {
  if (!RTMS_EVENTS.includes(event)) return;

  const client = new rtms.Client();

  client.onAudioData((data, timestamp, metadata) => {
    console.log(`Audio from ${metadata.userName}: ${data.length} bytes`);
  });

  client.onTranscriptData((data, timestamp, metadata) => {
    const text = data.toString('utf8');
    console.log(`${metadata.userName}: ${text}`);
  });

  client.onJoinConfirm((reason) => {
    console.log(`Joined session: ${reason}`);
  });

  // SDK handles all WebSocket connections automatically
  // Accepts both meeting_uuid and session_id transparently
  client.join(payload);
});

Quick Start (Manual WebSocket)

For full control or non-SDK languages, implement the two-phase WebSocket protocol:

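const crypto = require('crypto');
const express = require('express');
const WebSocket = require('ws');

// Credentials, read from the manual implementation variables listed under Environment Variables
const CLIENT_ID = process.env.ZOOM_CLIENT_ID;
const CLIENT_SECRET = process.env.ZOOM_CLIENT_SECRET;

const app = express();
app.use(express.json());  // Parse webhook JSON bodies into req.body

const RTMS_EVENTS = ['meeting.rtms_started', 'webinar.rtms_started', 'session.rtms_started'];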

// 1. Generate signature
// For meetings/webinars: uses meeting_uuid. For Video SDK: uses session_id.
function generateSignature(clientId, idValue, streamId, clientSecret) {
  const message = `${clientId},${idValue},${streamId}`;
  return crypto.createHmac('sha256', clientSecret).update(message).digest('hex');
}

// 2. Handle webhook
app.post('/webhook', (req, res) => {
  res.status(200).send();  // CRITICAL: Respond immediately!

  const { event, payload } = req.body;
  if (RTMS_EVENTS.includes(event)) {
    connectToRTMS(payload);
  }
});

// 3. Connect to signaling WebSocket
function connectToRTMS(payload) {
  const { server_urls, rtms_stream_id } = payload;
  // meeting_uuid for meetings/webinars, session_id for Video SDK
  const idValue = payload.meeting_uuid || payload.session_id;
  const signature = generateSignature(CLIENT_ID, idValue, rtms_stream_id, CLIENT_SECRET);

  const signalingWs = new WebSocket(server_urls);

  signalingWs.on('open', () => {
    signalingWs.send(JSON.stringify({
      msg_type: 1,  // Handshake request
      protocol_version: 1,
      meeting_uuid: idValue,
      rtms_stream_id,
      signature,
      media_type: 9  // AUDIO(1) | TRANSCRIPT(8)
    }));
  });

  // ... handle responses, connect to media WebSocket
}

See: Manual WebSocket Guide for complete implementation.
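
As a worked example of the signature step, the sketch below runs the same HMAC-SHA256 computation on placeholder values. All four input strings are made up for illustration; real values come from your app credentials and the webhook payload.

```javascript
const crypto = require('node:crypto');

// Same formula as above: HMAC-SHA256 over "clientId,idValue,streamId",
// keyed with the client secret, encoded as lowercase hex.
function generateSignature(clientId, idValue, streamId, clientSecret) {
  const message = `${clientId},${idValue},${streamId}`;
  return crypto.createHmac('sha256', clientSecret).update(message).digest('hex');
}

// Placeholder inputs, not real credentials or identifiers.
const exampleSignature = generateSignature(
  'my_client_id',
  'example-meeting-uuid',
  'example-stream-id',
  'my_client_secret'
);

// SHA-256 produces 32 bytes, so the hex digest is always 64 characters.
console.log(exampleSignature.length);
```

The signature is deterministic for a given set of inputs, so a mismatch usually means one of the three comma-joined values (or the secret) differs from what Zoom expects.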

Media Type Bitmask

Combine types with bitwise OR:

Type	Value	Description
Audio	1	PCM audio samples
Video	2	H.264/JPG video frames
Screen Share	4	Separate from video!
Transcript	8	Real-time speech-to-text
Chat	16	In-meeting chat messages
All	32	All media types

Example: Audio + Transcript = 1 | 8 = 9
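
A minimal sketch of working with the bitmask in code. The `MediaType` object and both helper names are hypothetical; the numeric values are the ones from the table above. Treating a mask of 32 as "everything" in `includesMediaType` is an assumption based on the table's description of the All value.

```javascript
// Flag values from the Media Type Bitmask table.
const MediaType = Object.freeze({
  AUDIO: 1,
  VIDEO: 2,
  SCREEN_SHARE: 4,
  TRANSCRIPT: 8,
  CHAT: 16,
  ALL: 32,
});

// Combine any number of flags with bitwise OR.
function combineMediaTypes(...types) {
  return types.reduce((mask, type) => mask | type, 0);
}

// Check whether a mask requests a given type.
// Assumption: a mask equal to ALL (32) implies every type.
function includesMediaType(mask, type) {
  return mask === MediaType.ALL || (mask & type) !== 0;
}

const audioAndTranscript = combineMediaTypes(MediaType.AUDIO, MediaType.TRANSCRIPT);
console.log(audioAndTranscript); // 9
```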

Critical Gotchas

Issue	Solution
Only 1 connection allowed	New connections kick out existing ones. Track active sessions!
Respond 200 immediately	If webhook delays, Zoom retries creating duplicate connections
Heartbeat mandatory	Respond to msg_type 12 with msg_type 13, or connection dies
Reconnection is YOUR job	RTMS doesn't auto-reconnect. Media keep-alive tolerance is now about 65s; signaling remains around 60s
Transcript language drift	Use `src_language` plus `enable_lid: false` when you want fixed-language transcription instead of automatic language switching
Single participant video only	`VIDEO_SINGLE_INDIVIDUAL_STREAM` supports one participant at a time. A new `VIDEO_SUBSCRIPTION_REQ` overrides the previous selection
Graceful close is explicit now	Use `STREAM_CLOSE_REQ` / `STREAM_CLOSE_RESP` when your backend wants to terminate the stream cleanly
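
The heartbeat rule can be sketched as a small message handler. This is an illustration for a manual WebSocket implementation: `answerKeepAlive` is a hypothetical helper, and echoing the request's `timestamp` field in the response is an assumption that should be checked against the official data-type reference.

```javascript
const KEEP_ALIVE_REQ = 12;  // msg_type sent by the server
const KEEP_ALIVE_RESP = 13; // msg_type the client must send back

// Answers a keep-alive request through the supplied send function.
// Returns true when the message was a keep-alive, false otherwise.
function answerKeepAlive(rawMessage, send) {
  let msg;
  try {
    msg = JSON.parse(rawMessage.toString());
  } catch {
    return false; // not JSON, so not a keep-alive
  }
  if (!msg || msg.msg_type !== KEEP_ALIVE_REQ) return false;

  // Assumption: the response echoes the request timestamp.
  send(JSON.stringify({ msg_type: KEEP_ALIVE_RESP, timestamp: msg.timestamp }));
  return true;
}

// Usage with a ws socket (not executed here):
// signalingWs.on('message', (data) => answerKeepAlive(data, (m) => signalingWs.send(m)));
```

Passing `send` as a function keeps the handler independent of the socket library, so the same code can serve both the signaling and media sockets.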

Environment Variables

SDK Environment Variables

# Required - Authentication
ZM_RTMS_CLIENT=your_client_id          # Zoom OAuth Client ID
ZM_RTMS_SECRET=your_client_secret      # Zoom OAuth Client Secret

# Optional - Webhook server
ZM_RTMS_PORT=8080                      # Default: 8080
ZM_RTMS_PATH=/webhook                  # Default: /

# Optional - Logging
ZM_RTMS_LOG_LEVEL=info                 # error, warn, info, debug, trace
ZM_RTMS_LOG_FORMAT=progressive         # progressive or json
ZM_RTMS_LOG_ENABLED=true
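
The SDK reads these variables itself, but it can help to validate them at startup so a missing credential fails fast. The sketch below is a hypothetical helper (`readRtmsEnv` is not an SDK function); the defaults for port and path are the ones stated in the comments above.

```javascript
// Validates the SDK environment variables and applies the documented defaults.
function readRtmsEnv(env = process.env) {
  const missing = ['ZM_RTMS_CLIENT', 'ZM_RTMS_SECRET'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required variables: ${missing.join(', ')}`);
  }

  const port = Number(env.ZM_RTMS_PORT ?? '8080'); // default: 8080
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`ZM_RTMS_PORT is not a valid port: ${env.ZM_RTMS_PORT}`);
  }

  return {
    clientId: env.ZM_RTMS_CLIENT,
    clientSecret: env.ZM_RTMS_SECRET,
    port,
    path: env.ZM_RTMS_PATH ?? '/', // default: /
  };
}
```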

Manual Implementation Variables

ZOOM_CLIENT_ID=your_client_id
ZOOM_CLIENT_SECRET=your_client_secret
ZOOM_SECRET_TOKEN=your_webhook_token   # For webhook validation
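
A sketch of how `ZOOM_SECRET_TOKEN` might be used for webhook validation. It assumes Zoom's general webhook scheme as I understand it: an `x-zm-signature` header of the form `v0=<hex HMAC-SHA256 of "v0:<timestamp>:<body>">`, and an `endpoint.url_validation` challenge answered with the HMAC of `plainToken`. Neither detail appears in this document, so verify both against the official webhook documentation. The function names are hypothetical.

```javascript
const crypto = require('node:crypto');

function hmacHex(secret, message) {
  return crypto.createHmac('sha256', secret).update(message).digest('hex');
}

// Assumed scheme: signature header equals "v0=" + HMAC over "v0:<timestamp>:<body>".
function isValidZoomSignature(secretToken, timestamp, rawBody, signatureHeader) {
  const expected = Buffer.from(`v0=${hmacHex(secretToken, `v0:${timestamp}:${rawBody}`)}`);
  const received = Buffer.from(String(signatureHeader ?? ''));
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

// Assumed response body for the endpoint.url_validation challenge.
function buildUrlValidationResponse(secretToken, plainToken) {
  return { plainToken, encryptedToken: hmacHex(secretToken, plainToken) };
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct.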

Zoom App Setup

For Meetings and Webinars (General App)

Go to marketplace.zoom.us -> Develop -> Build App
Choose General App -> User-Managed
Features -> Access -> Enable Event Subscription
Add Events -> Search "rtms" -> Select:
- meeting.rtms_started
- meeting.rtms_stopped
- webinar.rtms_started (if using webinars)
- webinar.rtms_stopped (if using webinars)
Scopes -> Add Scopes -> Search "rtms" -> Add:
- meeting:read:meeting_audio
- meeting:read:meeting_video
- meeting:read:meeting_transcript
- meeting:read:meeting_chat
- webinar:read:webinar_audio (if using webinars)
- webinar:read:webinar_video (if using webinars)
- webinar:read:webinar_transcript (if using webinars)
- webinar:read:webinar_chat (if using webinars)

For Video SDK (Video SDK App)

Go to marketplace.zoom.us -> Develop -> Build App
Choose Video SDK App
Use your SDK Key and SDK Secret (not OAuth Client ID/Secret)
Add Events:
- session.rtms_started
- session.rtms_stopped

Sample Repositories

Official Samples

Repository	Description
rtms-samples	RTMSManager, boilerplates, AI samples
rtms-quickstart-js	JavaScript SDK quickstart
rtms-quickstart-py	Python SDK quickstart
rtms-sdk-cpp	C++ SDK
zoom-rtms	Main SDK repository

AI Integration Samples

Sample	Description
rtms-meeting-assistant-starter-kit	AI meeting assistant with summaries
arlo-meeting-assistant	Production meeting assistant with DB
videosdk-rtms-transcribe-audio	Whisper transcription

Complete Documentation

Concepts

Connection Architecture - Two-phase WebSocket design
Lifecycle Flow - Webhook to streaming flow

Examples

SDK Quickstart - Using @zoom/rtms SDK
Manual WebSocket - Raw protocol implementation
RTMS Bot - Complete bot implementation guide
AI Integration - Transcription and analysis patterns

References

Media Types - Audio, video, transcript, chat, screen share
Data Types - All enums and constants
Connection - WebSocket protocol details
Webhooks - Event subscription

Troubleshooting

Common Issues - FAQ and solutions

Resources

Official docs: https://developers.zoom.us/docs/rtms/
Data types: https://developers.zoom.us/docs/rtms/data-types/
Media params: https://developers.zoom.us/docs/rtms/media-parameter-definition/
Developer forum: https://devforum.zoom.us/

Need help? Start with the Integrated Index section below for complete navigation.

Integrated Index

This section was migrated from SKILL.md.

RTMS provides real-time access to live audio, video, transcript, chat, and screen share from Zoom meetings, webinars, and Video SDK sessions.

Critical Positioning

Treat RTMS as a backend service for receiving and processing media streams.

Backend role: ingest audio/video/share/chat/transcript, run AI/analytics, persist/forward data.
Optional frontend role: Zoom App SDK or web dashboard that consumes processed stream data from backend transport (WebSocket/SSE/other).
Kickoff model: backend waits for RTMS start webhook events, then starts stream processing.

Do not model RTMS as a frontend-only SDK.

Quick Start Path

If you're new to RTMS, follow this order:

Run preflight checks first -> RUNBOOK.md
Understand the architecture -> concepts/connection-architecture.md
- Two-phase WebSocket: Signaling + Media
- Why RTMS doesn't use bots
Choose your approach -> SDK or Manual
- SDK (recommended): examples/sdk-quickstart.md
- Manual WebSocket: examples/manual-websocket.md
Understand the lifecycle -> concepts/lifecycle-flow.md
- Webhook -> Signaling -> Media -> Streaming
Configure media types -> references/media-types.md
- Audio, video, transcript, chat, screen share
Troubleshoot issues -> troubleshooting/common-issues.md
- Connection problems, duplicate webhooks, missing data

Documentation Structure

rtms/
├── SKILL.md                           # Main skill overview and navigation guide (this file)
│
├── concepts/                          # Core architectural patterns
│   ├── connection-architecture.md     # Two-phase WebSocket design
│   └── lifecycle-flow.md              # Webhook to streaming flow
│
├── examples/                          # Complete working code
│   ├── sdk-quickstart.md              # Using @zoom/rtms SDK
│   ├── manual-websocket.md            # Raw protocol implementation
│   ├── rtms-bot.md                    # Complete RTMS bot implementation
│   └── ai-integration.md              # Transcription and analysis
│
├── references/                        # Reference documentation
│   ├── media-types.md                 # Audio, video, transcript, chat, share
│   ├── data-types.md                  # All enums and constants
│   ├── connection.md                  # WebSocket protocol details
│   └── webhooks.md                    # Event subscription
│
└── troubleshooting/                   # Problem solving guides
    └── common-issues.md               # FAQ and solutions

By Use Case

I want to get meeting transcripts

SDK Quickstart - Fastest approach
Media Types - Transcript configuration
AI Integration - Whisper, Deepgram, AssemblyAI

I want to record meetings

Media Types - Audio + Video configuration
SDK Quickstart - Receiving media
AI Integration - Gap-filled recording

I want to build an AI meeting assistant

AI Integration - Complete patterns
SDK Quickstart - Media ingestion
Lifecycle Flow - Event handling

I want to build a complete RTMS bot

RTMS Bot - Complete implementation guide
Lifecycle Flow - Webhook to streaming flow
Connection Architecture - Two-phase design

I need full protocol control

Manual WebSocket - START HERE
Connection Architecture - Two-phase design
Data Types - All message types and enums
Connection - Protocol details

I'm getting connection errors

Common Issues - Diagnostic checklist
Connection Architecture - Verify flow
Webhooks - Validation and timing

I want to understand the architecture

Connection Architecture - Two-phase WebSocket
Lifecycle Flow - Complete flow diagram
Data Types - Protocol constants

By Product

I'm building for Zoom Meetings

Standard RTMS setup. Webhook event: meeting.rtms_started. Uses General App with OAuth.
Start with SDK Quickstart or Manual WebSocket.

I'm building for Zoom Webinars

Same as meetings, but webhook event is webinar.rtms_started. Payload still uses meeting_uuid (NOT webinar_uuid).
Add webinar scopes and event subscriptions. See Webhooks.
Only panelist streams are confirmed available. Attendee streams may not be individual.

I'm building for Zoom Video SDK

Webhook event: session.rtms_started. Payload uses session_id (NOT meeting_uuid).
Requires a Video SDK App with SDK Key/Secret (not OAuth Client ID/Secret).
Once connected, the protocol is identical to meetings.
See Webhooks for payload details.

Key Documents

1. Connection Architecture (CRITICAL)

concepts/connection-architecture.md

RTMS uses two separate WebSocket connections:

Signaling WebSocket: Authentication, control, heartbeats
Media WebSocket: Actual audio/video/transcript data

2. SDK vs Manual (DECISION POINT)

examples/sdk-quickstart.md vs examples/manual-websocket.md

SDK	Manual
Handles WebSocket complexity	Full protocol control
Automatic reconnection	DIY reconnection
Less code	More code
Best for most use cases	Best for custom requirements

3. Critical Gotchas (MOST COMMON ISSUES)

troubleshooting/common-issues.md

Respond 200 immediately - Delayed webhook responses cause duplicates
Only 1 connection per stream - New connections kick out existing
Heartbeat required - Must respond to keep-alive or connection dies
Track active sessions - Prevent duplicate join attempts
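
One way to track active sessions and suppress duplicate joins is a registry keyed by stream ID. This is a minimal in-memory sketch; `ActiveStreams` is a hypothetical class, and a multi-process deployment would need shared storage instead of a `Map`.

```javascript
// In-memory registry of streams that already have a connection.
class ActiveStreams {
  constructor() {
    this.streams = new Map();
  }

  // Registers a stream. Returns false if it is already active,
  // which signals that a retried webhook should be ignored.
  tryStart(streamId, connection) {
    if (this.streams.has(streamId)) return false;
    this.streams.set(streamId, connection);
    return true;
  }

  // Removes a stream, for example on an rtms_stopped event.
  // Returns true if the stream was being tracked.
  stop(streamId) {
    return this.streams.delete(streamId);
  }

  get size() {
    return this.streams.size;
  }
}

// Usage inside a webhook handler (not executed here):
// if (!active.tryStart(payload.rtms_stream_id, client)) return; // duplicate webhook
```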

Key Learnings

Critical Discoveries:

Two-Phase WebSocket Design
- Signaling: Control plane (handshake, heartbeat, start/stop)
- Media: Data plane (audio, video, transcript, chat, share)
- See: Connection Architecture
Webhook Response Timing
- MUST respond 200 BEFORE any processing
- Delayed response -> Zoom retries -> duplicate connections
- See: Common Issues
Heartbeat is Mandatory
- Signaling: Receive msg_type 12, respond with msg_type 13
- Media: Same pattern
- Failure to respond = connection closed
- See: Connection
Signature Generation
- Format: HMAC-SHA256(clientSecret, "clientId,meetingUuid,streamId")
- For Video SDK, use session_id in place of meetingUuid
- Webinars still use meeting_uuid (not webinar_uuid)
- Required for both signaling and media handshakes
- See: Manual WebSocket
Media Types are Bitmasks
- Audio=1, Video=2, Share=4, Transcript=8, Chat=16, All=32
- Combine with OR: Audio+Transcript = 1|8 = 9
- See: Media Types
Screen Share is SEPARATE from Video
- Different msg_type (16 vs 15)
- Different media flag (4 vs 2)
- Must subscribe separately
- See: Media Types

Quick Reference

"Connection fails"

-> Common Issues

"Duplicate connections"

-> Webhook timing

"No audio/video data"

-> Media Types - Check configuration

Document Version

Based on Zoom RTMS SDK v1.x and official documentation as of 2026.

Happy coding!

Remember: Start with SDK Quickstart for the fastest path, or Manual WebSocket if you need full control.

Bundled files

※ List of files included in the ZIP. In addition to `SKILL.md` itself, the archive may contain reference material, samples, and scripts.

📄 SKILL.md (23,952 bytes)
📎 references/connection.md (8,226 bytes)
📎 references/data-types.md (13,981 bytes)
📎 references/environment-variables.md (1,303 bytes)
📎 references/media-types.md (6,103 bytes)
📎 references/quickstart.md (6,734 bytes)
📎 references/webhooks.md (7,624 bytes)
