8 changes: 8 additions & 0 deletions examples/180-zoom-recording-transcription-node/.env.example
@@ -0,0 +1,8 @@
# Deepgram — https://console.deepgram.com/
DEEPGRAM_API_KEY=

# Zoom Server-to-Server OAuth — https://marketplace.zoom.us/
ZOOM_ACCOUNT_ID=
ZOOM_CLIENT_ID=
ZOOM_CLIENT_SECRET=
ZOOM_WEBHOOK_SECRET_TOKEN=
72 changes: 72 additions & 0 deletions examples/180-zoom-recording-transcription-node/README.md
@@ -0,0 +1,72 @@
# Zoom Cloud Recording Transcription with Deepgram

Automatically transcribe Zoom cloud recordings using Deepgram's nova-3 speech-to-text model. When a Zoom meeting recording completes, this server receives the webhook, downloads the audio, and produces a formatted transcript with speaker labels.

## What you'll build

A Node.js/Express server that receives Zoom `recording.completed` webhook events, downloads the recording via Zoom's Server-to-Server OAuth, and transcribes it using Deepgram nova-3 with speaker diarization and smart formatting.

## Prerequisites

- Node.js 18 or later
- Deepgram account — [get a free API key](https://console.deepgram.com/)
- Zoom account with a Server-to-Server OAuth app — [create one](https://developers.zoom.us/docs/internal-apps/create/)

## Environment variables

Copy `.env.example` to `.env` and fill in your credentials:

| Variable | Where to find it |
|----------|-----------------|
| `DEEPGRAM_API_KEY` | [Deepgram console → API Keys](https://console.deepgram.com/) |
| `ZOOM_ACCOUNT_ID` | [Zoom Marketplace](https://marketplace.zoom.us/) → your Server-to-Server OAuth app → App Credentials |
| `ZOOM_CLIENT_ID` | Same app → App Credentials |
| `ZOOM_CLIENT_SECRET` | Same app → App Credentials |
| `ZOOM_WEBHOOK_SECRET_TOKEN` | Same app → Feature tab → Event Subscriptions → Secret Token |

## Install and run

```bash
npm install
npm start
```

The server starts on port 3000 (override with `PORT` env var). Expose it publicly with a tunnel for Zoom webhooks:

```bash
npx localtunnel --port 3000
```

## Zoom app setup

1. Go to [Zoom Marketplace](https://marketplace.zoom.us/) → Develop → Build App
2. Choose **Server-to-Server OAuth**
3. Add scopes: `cloud_recording:read:list_recording_files:admin`
4. Under **Feature** → **Event Subscriptions**, add:
   - Event subscription URL: `https://your-domain.com/webhook`
   - Event type: `recording.completed`
5. Zoom will send a validation request — the server handles it automatically
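
The validation handshake in step 5 can be sketched as below. This is a minimal illustration, not the server's exact code: `buildValidationResponse` is a hypothetical helper name, and the token and secret values are placeholders. It hashes the `plainToken` Zoom sends with the webhook secret token (HMAC-SHA256, hex-encoded) and returns both values.

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds the JSON body returned in reply to an
// endpoint.url_validation event.
function buildValidationResponse(plainToken, secretToken) {
  const encryptedToken = crypto
    .createHmac('sha256', secretToken)
    .update(plainToken)
    .digest('hex');
  return { plainToken, encryptedToken };
}

// Placeholder values for illustration only.
const reply = buildValidationResponse('example-plain-token', 'example-secret');
console.log(reply.plainToken);
console.log(reply.encryptedToken.length); // a hex SHA-256 digest has 64 characters
```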

## Key parameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| `model` | `nova-3` | Deepgram's latest general-purpose STT model |
| `smart_format` | `true` | Adds punctuation, capitalization, number formatting |
| `diarize` | `true` | Labels speakers (Speaker 0, Speaker 1, etc.) |
| `paragraphs` | `true` | Groups transcript into readable paragraphs |
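
With `diarize` enabled, each word in the response carries a `speaker` index, and with `smart_format` a `punctuated_word` is typically present as well. The sketch below shows one way to turn that word list into speaker-labelled lines. `formatBySpeaker` and the sample data are hypothetical and not part of this example's server code.

```javascript
// Hypothetical helper: groups consecutive words from the same speaker
// into one line per speaker turn.
function formatBySpeaker(words) {
  const turns = [];
  for (const w of words) {
    // Fall back to the raw word if no punctuated form is present.
    const text = w.punctuated_word || w.word;
    const last = turns[turns.length - 1];
    if (last && last.speaker === w.speaker) {
      last.text += ` ${text}`;
    } else {
      turns.push({ speaker: w.speaker, text });
    }
  }
  return turns.map((t) => `Speaker ${t.speaker}: ${t.text}`).join('\n');
}

// Made-up sample shaped like a diarized word list.
const sampleWords = [
  { word: 'hello', punctuated_word: 'Hello.', speaker: 0 },
  { word: 'hi', punctuated_word: 'Hi', speaker: 1 },
  { word: 'there', punctuated_word: 'there.', speaker: 1 },
];

const formatted = formatBySpeaker(sampleWords);
console.log(formatted);
// Speaker 0: Hello.
// Speaker 1: Hi there.
```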

## How it works

1. A Zoom cloud recording finishes → Zoom fires a `recording.completed` webhook
2. The server validates the webhook signature using your secret token
3. It extracts the recording download URL from the payload, preferring audio-only files
4. It authenticates with Zoom's Server-to-Server OAuth to get an access token
5. It downloads the recording audio file
6. It sends the audio buffer to Deepgram's pre-recorded STT API (`transcribeFile`)
7. Deepgram returns a transcript with speaker labels and smart formatting
8. The transcript is logged (extend this to store, email, or post to Slack)
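
Step 2 (signature validation) can be sketched as follows. This is an illustrative variant under stated assumptions, not the server's exact code: `isValidZoomSignature` is a hypothetical helper, it takes the request body as a raw string, and it compares signatures in constant time with `crypto.timingSafeEqual`. The secret, timestamp, and body are placeholders.

```javascript
const crypto = require('crypto');

// Hypothetical helper: recomputes the v0 signature over
// "v0:{timestamp}:{body}" and compares it with the received header value.
function isValidZoomSignature(rawBody, timestamp, signature, secretToken) {
  const digest = crypto
    .createHmac('sha256', secretToken)
    .update(`v0:${timestamp}:${rawBody}`)
    .digest('hex');
  const expected = Buffer.from(`v0=${digest}`);
  const received = Buffer.from(String(signature || ''));
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length
    && crypto.timingSafeEqual(expected, received);
}

// Placeholder values for illustration only.
const secret = 'example-secret';
const timestamp = '1700000000';
const rawBody = JSON.stringify({ event: 'recording.completed', payload: {} });
const goodDigest = crypto
  .createHmac('sha256', secret)
  .update(`v0:${timestamp}:${rawBody}`)
  .digest('hex');
const signature = `v0=${goodDigest}`;

console.log(isValidZoomSignature(rawBody, timestamp, signature, secret)); // true
console.log(isValidZoomSignature(`${rawBody} `, timestamp, signature, secret)); // false
```

In an Express app, the raw body could be captured with the `verify` option of `express.json()`, so the bytes that are hashed are the ones that were received rather than a re-serialized copy.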

## Starter templates

[deepgram-starters](https://github.com/orgs/deepgram-starters/repositories)
18 changes: 18 additions & 0 deletions examples/180-zoom-recording-transcription-node/package.json
@@ -0,0 +1,18 @@
{
  "name": "zoom-recording-transcription-node",
  "version": "1.0.0",
  "description": "Transcribe Zoom cloud recordings using Deepgram nova-3",
  "main": "src/server.js",
  "scripts": {
    "start": "node src/server.js",
    "test": "node tests/test.js"
  },
  "dependencies": {
    "@deepgram/sdk": "^5.0.0",
    "dotenv": "^16.4.0",
    "express": "^4.21.0"
  },
  "engines": {
    "node": ">=18"
  }
}
176 changes: 176 additions & 0 deletions examples/180-zoom-recording-transcription-node/src/server.js
@@ -0,0 +1,176 @@
'use strict';

require('dotenv').config();

const crypto = require('crypto');
const express = require('express');
const { DeepgramClient } = require('@deepgram/sdk');

const PORT = process.env.PORT || 3000;

const REQUIRED_ENV = [
  'DEEPGRAM_API_KEY',
  'ZOOM_ACCOUNT_ID',
  'ZOOM_CLIENT_ID',
  'ZOOM_CLIENT_SECRET',
  'ZOOM_WEBHOOK_SECRET_TOKEN',
];

for (const key of REQUIRED_ENV) {
  if (!process.env[key]) {
    console.error(`Error: ${key} environment variable is not set.`);
    console.error('Copy .env.example to .env and add your credentials.');
    process.exit(1);
  }
}

// SDK v5: constructor takes an options object, not a bare string.
const deepgram = new DeepgramClient({ apiKey: process.env.DEEPGRAM_API_KEY });

const app = express();
app.use(express.json());

// ── Zoom webhook endpoint ────────────────────────────────────────────────────
// Zoom sends two event types here:
//   1. endpoint.url_validation: a challenge/response handshake when you first
//      register the webhook URL in the Zoom Marketplace.
//   2. recording.completed: fired when a cloud recording finishes processing.
app.post('/webhook', async (req, res) => {
  const { event, payload } = req.body;

  // ← THIS handles Zoom's webhook URL validation handshake.
  // Zoom POSTs a plainToken that must be hashed with your secret and returned.
  if (event === 'endpoint.url_validation') {
    const hashForValidation = crypto
      .createHmac('sha256', process.env.ZOOM_WEBHOOK_SECRET_TOKEN)
      .update(payload.plainToken)
      .digest('hex');

    return res.json({
      plainToken: payload.plainToken,
      encryptedToken: hashForValidation,
    });
  }

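  // Verify the webhook signature to ensure the request came from Zoom.
  // Note: this re-serializes the parsed body, so it assumes JSON.stringify
  // reproduces the bytes Zoom signed.
  const message = `v0:${req.headers['x-zm-request-timestamp']}:${JSON.stringify(req.body)}`;
  const expectedSig = `v0=${crypto
    .createHmac('sha256', process.env.ZOOM_WEBHOOK_SECRET_TOKEN)
    .update(message)
    .digest('hex')}`;

  // Compare in constant time; timingSafeEqual requires equal-length buffers.
  const expected = Buffer.from(expectedSig);
  const received = Buffer.from(String(req.headers['x-zm-signature'] || ''));
  if (expected.length !== received.length || !crypto.timingSafeEqual(expected, received)) {
    console.error('Invalid webhook signature, rejecting request');
    return res.status(401).json({ error: 'Invalid signature' });
  }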

  if (event !== 'recording.completed') {
    return res.json({ status: 'ignored', event });
  }

  res.json({ status: 'processing' });

  try {
    await handleRecordingCompleted(payload);
  } catch (err) {
    console.error('Error processing recording:', err.message);
  }
});

// ── Zoom OAuth ───────────────────────────────────────────────────────────────
// Server-to-Server OAuth uses the account_credentials grant with account_id.
// The token is short-lived (1 hour); fetch a fresh one each time for simplicity.
async function getZoomAccessToken() {
  const credentials = Buffer.from(
    `${process.env.ZOOM_CLIENT_ID}:${process.env.ZOOM_CLIENT_SECRET}`
  ).toString('base64');

  const resp = await fetch(
    `https://zoom.us/oauth/token?grant_type=account_credentials&account_id=${process.env.ZOOM_ACCOUNT_ID}`,
    {
      method: 'POST',
      headers: { Authorization: `Basic ${credentials}` },
    }
  );

  if (!resp.ok) {
    throw new Error(`Zoom OAuth failed: ${resp.status} ${await resp.text()}`);
  }

  const data = await resp.json();
  return data.access_token;
}

// ── Recording handler ────────────────────────────────────────────────────────
async function handleRecordingCompleted(payload) {
  const { object } = payload;
  const meetingTopic = object.topic || 'Untitled Meeting';
  // Default to an empty list so a payload without recording_files reaches the
  // guard below instead of throwing a TypeError.
  const recordingFiles = object.recording_files || [];

  // Prefer audio_only files: smaller and faster to transcribe than video.
  const audioFile = recordingFiles.find(
    (f) => f.recording_type === 'audio_only'
  ) || recordingFiles[0];

  if (!audioFile) {
    console.log('No recording files found in payload');
    return;
  }

  console.log(`\nProcessing: "${meetingTopic}"`);
  console.log(`Recording type: ${audioFile.recording_type}, format: ${audioFile.file_extension}`);

  const accessToken = await getZoomAccessToken();

  // Zoom download URLs require an OAuth token, passed here as the access_token
  // query parameter. Build the URL with URL/searchParams so the token is encoded
  // and any query string already present on download_url is preserved.
  // Download the file as a buffer so we can send it to Deepgram.
  const downloadUrl = new URL(audioFile.download_url);
  downloadUrl.searchParams.set('access_token', accessToken);
  const downloadResp = await fetch(downloadUrl);

  if (!downloadResp.ok) {
    throw new Error(`Failed to download recording: ${downloadResp.status}`);
  }

  const audioBuffer = Buffer.from(await downloadResp.arrayBuffer());
  console.log(`Downloaded ${(audioBuffer.length / 1024 / 1024).toFixed(1)} MB`);

  // SDK v5: transcribeFile takes (buffer, options); the buffer is the first arg.
  // SDK v5: all options are flat in a single object.
  // SDK v5: throws on error, so use try/catch, not { result, error } destructuring.
  const data = await deepgram.listen.v1.media.transcribeFile(audioBuffer, {
    model: 'nova-3',
    smart_format: true,
    // ← THIS enables speaker labels, essential for multi-speaker meetings.
    diarize: true,
    // ← THIS enables paragraph detection for readable output.
    paragraphs: true,
  });

  // The first alternative of the first channel holds the transcript data:
  // data.results.channels[0].alternatives[0]
  const { transcript, paragraphs, words } = data.results.channels[0].alternatives[0];

  console.log(`\n── Transcript: "${meetingTopic}" ──`);
  console.log(transcript);

  if (paragraphs?.paragraphs) {
    console.log(`\n── Paragraphs: ${paragraphs.paragraphs.length} ──`);
  }

  if (words?.length > 0) {
    const duration = words.at(-1).end;
    console.log(`\nDuration: ${(duration / 60).toFixed(1)} min | Words: ${words.length}`);
  }

  return { meetingTopic, transcript };
}

app.get('/health', (_req, res) => res.json({ status: 'ok' }));

app.listen(PORT, () => {
  console.log(`Zoom recording transcription server running on port ${PORT}`);
  console.log(`Webhook endpoint: POST http://localhost:${PORT}/webhook`);
  console.log(`Health check: GET http://localhost:${PORT}/health`);
});

module.exports = { app, getZoomAccessToken, handleRecordingCompleted };
111 changes: 111 additions & 0 deletions examples/180-zoom-recording-transcription-node/tests/test.js
@@ -0,0 +1,111 @@
'use strict';

const fs = require('fs');
const path = require('path');

// ── Credential check: MUST be first ──────────────────────────────────────────
// Exit code convention used across all examples in this repo:
//   0 = all tests passed
//   1 = real test failure (code bug, assertion error, unexpected API response)
//   2 = missing credentials (expected in CI until secrets are configured)
const envExample = path.join(__dirname, '..', '.env.example');
const required = fs.readFileSync(envExample, 'utf8')
  .split('\n')
  .filter(l => /^[A-Z][A-Z0-9_]+=/.test(l.trim()))
  .map(l => l.split('=')[0].trim());

const missing = required.filter(k => !process.env[k]);
if (missing.length > 0) {
  console.error(`MISSING_CREDENTIALS: ${missing.join(',')}`);
  process.exit(2);
}
// ──────────────────────────────────────────────────────────────────────────────

const { DeepgramClient } = require('@deepgram/sdk');

const KNOWN_AUDIO_URL = 'https://dpgr.am/spacewalk.wav';
const EXPECTED_WORDS = ['spacewalk', 'astronaut', 'nasa'];

async function run() {
  // ── Test 1: Deepgram pre-recorded STT works with transcribeUrl ──
  console.log('Test 1: Deepgram pre-recorded STT (nova-3)...');

  const deepgram = new DeepgramClient({ apiKey: process.env.DEEPGRAM_API_KEY });

  const data = await deepgram.listen.v1.media.transcribeUrl({
    url: KNOWN_AUDIO_URL,
    model: 'nova-3',
    smart_format: true,
    diarize: true,
    paragraphs: true,
  });

  const transcript = data?.results?.channels?.[0]?.alternatives?.[0]?.transcript;

  if (!transcript || transcript.length < 20) {
    throw new Error(`Transcript too short or empty: "${transcript}"`);
  }

  const lower = transcript.toLowerCase();
  const found = EXPECTED_WORDS.filter(w => lower.includes(w));
  if (found.length === 0) {
    throw new Error(
      `Expected words not found in transcript.\nGot: "${transcript.substring(0, 200)}"`
    );
  }

  console.log(`✓ Transcript received (${transcript.length} chars)`);
  console.log(`✓ Expected content verified (found: ${found.join(', ')})`);

  // ── Test 2: Zoom OAuth token retrieval ──
  console.log('\nTest 2: Zoom OAuth token retrieval...');

  const credentials = Buffer.from(
    `${process.env.ZOOM_CLIENT_ID}:${process.env.ZOOM_CLIENT_SECRET}`
  ).toString('base64');

  const tokenResp = await fetch(
    `https://zoom.us/oauth/token?grant_type=account_credentials&account_id=${process.env.ZOOM_ACCOUNT_ID}`,
    {
      method: 'POST',
      headers: { Authorization: `Basic ${credentials}` },
    }
  );

  if (!tokenResp.ok) {
    throw new Error(`Zoom OAuth failed: ${tokenResp.status} ${await tokenResp.text()}`);
  }

  const tokenData = await tokenResp.json();
  if (!tokenData.access_token) {
    throw new Error('No access_token in Zoom OAuth response');
  }

  console.log('✓ Zoom OAuth token retrieved successfully');

  // ── Test 3: Webhook validation logic ──
  console.log('\nTest 3: Webhook signature validation logic...');

  const crypto = require('crypto');
  const testToken = 'test-plain-token';
  const hash = crypto
    .createHmac('sha256', process.env.ZOOM_WEBHOOK_SECRET_TOKEN)
    .update(testToken)
    .digest('hex');

  if (!hash || hash.length !== 64) {
    throw new Error('HMAC hash generation failed');
  }

  console.log('✓ Webhook validation HMAC logic works');
}

run()
  .then(() => {
    console.log('\n✓ All tests passed');
    process.exit(0);
  })
  .catch(err => {
    console.error(`\n✗ Test failed: ${err.message}`);
    process.exit(1);
  });