3 changes: 3 additions & 0 deletions .github/.release-please-manifest.json
@@ -0,0 +1,3 @@
{
  ".": "1.2.0"
}
16 changes: 16 additions & 0 deletions .github/release-please-config.json
@@ -0,0 +1,16 @@
{
  "$schema": "https://raw.githubusercontent.com/googleapis/release-please/main/schemas/config.json",
  "release-type": "python",
  "bump-minor-pre-major": true,
  "bump-patch-for-minor-pre-major": true,
  "include-v-in-tag": true,
  "packages": {
    ".": {
      "component": "deepgram-captions",
      "include-component-in-tag": false,
      "extra-files": [
        "deepgram_captions/_version.py"
      ]
    }
  }
}
41 changes: 41 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,41 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  lint:
    name: Lint & typecheck
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: "3.12"
      - name: Install dev dependencies
        run: pip install -e ".[dev]"
      - name: Ruff format check
        run: ruff format --check deepgram_captions/ test/
      - name: Ruff lint
        run: ruff check deepgram_captions/ test/
      - name: Mypy
        run: mypy deepgram_captions/

  test:
    name: Test Python ${{ matrix.python-version }}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dev dependencies
        run: pip install -e ".[dev]"
      - name: Run tests
        run: pytest test/ -v
69 changes: 41 additions & 28 deletions .github/workflows/release.yml
@@ -1,42 +1,55 @@
-# This workflow will upload a Python Package using Twine when a release is created
-# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
-
-# This workflow uses actions that are not certified by GitHub.
-# They are provided by a third-party and are governed by
-# separate terms of service, privacy policy, and support
-# documentation.
-
 name: Release

 on:
-  release:
-    types: [published]
+  push:
+    branches: [main]
+  workflow_dispatch:

 permissions:
-  contents: read
+  contents: write
+  pull-requests: write

 jobs:
-  deploy:
+  release-please:
+    name: Release Please
     runs-on: ubuntu-latest
+    outputs:
+      release_created: ${{ steps.release.outputs.release_created }}
+      tag_name: ${{ steps.release.outputs.tag_name }}
+    steps:
+      - uses: googleapis/release-please-action@v4
+        id: release
+        with:
+          token: ${{ github.token }}
+          config-file: .github/release-please-config.json
+          manifest-file: .github/.release-please-manifest.json

+  publish:
+    name: Publish to PyPI
+    needs: release-please
+    if: ${{ needs.release-please.outputs.release_created }}
+    runs-on: ubuntu-latest
+    environment:
+      name: pypi
+      url: https://pypi.org/p/deepgram-captions
+    permissions:
+      id-token: write # required for trusted publishing
+
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
+
       - name: Set up Python
-        uses: actions/setup-python@v3
+        uses: actions/setup-python@v4
         with:
-          python-version: "3.x"
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install build
-      - name: Update Version in _version.py
-        run: sed -i "s/0.0.0/${{ github.event.release.tag_name }}/g" ./deepgram_captions/_version.py
+          python-version: "3.12"
+
+      - name: Install build tools
+        run: pip install --upgrade pip build
+
       - name: Build package
         run: python -m build
-      - name: Install twine
-        run: python -m pip install --upgrade twine
-      - name: Publish package
-        uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
-        with:
-          user: __token__
-          password: ${{ secrets.PYPI_API_TOKEN }}
+
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        # No API token needed — uses OIDC trusted publishing.
+        # Configure at: https://pypi.org/manage/project/deepgram-captions/settings/publishing/
88 changes: 88 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,88 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.2.0] - 2024-03-15

### Added
- `pyproject.toml` as the canonical build configuration (replaces `setup.py` as the primary build definition)
- `py.typed` marker file for PEP 561 compliance — fully typed package
- `Makefile` with `install`, `test`, `lint`, `lint-fix`, `format`, `format-check`, `typecheck`, `check`, and `dev` targets
- GitHub Actions CI workflow (`ci.yml`) running lint, type checking, and tests across Python 3.10–3.13
- `ruff` for linting and formatting (replaces `black`)
- `mypy` for static type checking
- Full type annotations on all public APIs in `helpers.py`, `converters.py`, `webvtt.py`, and `srt.py`
- Comprehensive docstrings for all public classes and functions
- `SECURITY.md` with responsible disclosure policy
- `CHANGELOG.md` (this file)

### Changed
- `DeepgramConverter`, `AssemblyAIConverter`, and `WhisperTimestampedConverter` now carry full type hints
- `webvtt()` and `srt()` functions are now fully typed, with the converter parameter annotated as `Any`
- `EmptyTranscriptException` and `ConverterException` are now exported from the top-level `deepgram_captions` package
- Updated classifiers to reflect Production/Stable status and Python 3.10–3.13 support
- Release workflow updated to use `actions/checkout@v4` and `actions/setup-python@v4`
- Release workflow version bumping now targets `pyproject.toml` rather than only `_version.py`

### Fixed
- `chunk_array` simplified to a single list comprehension (functionally identical, more idiomatic)
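
For illustration, a fixed-length chunker written as a single list comprehension might look like the sketch below. The function name `chunk_words` and its signature are assumptions made for this example; they are not taken from `helpers.py`.

```python
from typing import Any


def chunk_words(items: list[Any], length: int) -> list[list[Any]]:
    """Split `items` into consecutive groups of at most `length` elements.

    Hypothetical stand-in for `chunk_array`; the real signature may differ.
    """
    # Step through the list in strides of `length` and slice out each group.
    return [items[i : i + length] for i in range(0, len(items), length)]
```

The last group is simply shorter when the input length is not a multiple of `length`.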

## [1.1.0] - 2023-11-08

### Added
- `AssemblyAIConverter` — support for AssemblyAI speech-to-text API responses
- `WhisperTimestampedConverter` — support for [Whisper Timestamped](https://github.com/linto-ai/whisper-timestamped) responses (word-level timestamps required)
- `replace_text_with_word()` helper to normalise `"text"` key to `"word"` for Whisper Timestamped compatibility
- Documentation note clarifying that OpenAI Whisper (without word timestamps) is not supported directly; users should use Deepgram's hosted Whisper Cloud (`model=whisper`) with `DeepgramConverter`

### Changed
- `get_lines()` on `AssemblyAIConverter` now respects `utterances` array when present, falling back to flat `words` array
- `WhisperTimestampedConverter.get_lines()` processes `segments` array and applies `replace_text_with_word` normalisation
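
A rough sketch of the `"text"` to `"word"` key normalisation described above, assuming each word is a plain dictionary. The name `normalise_word_key` and the list-of-dicts input shape are assumptions for this example, not the signature of `replace_text_with_word()`.

```python
from typing import Any


def normalise_word_key(words: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the word dicts with the "text" key renamed to "word".

    Hypothetical illustration; other keys (such as timings) are left untouched.
    """
    return [
        {("word" if key == "text" else key): value for key, value in word.items()}
        for word in words
    ]
```

Building new dictionaries instead of mutating the input keeps the original response data intact.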

## [1.0.0] - 2023-10-15

### Added
- Speaker diarisation support in `DeepgramConverter.get_lines()`: when word objects include a `"speaker"` field, caption lines break on speaker changes in addition to `line_length` limits
- Speaker labels in WebVTT output using voice tags: `<v Speaker 0>text</v>`
- Speaker labels in SRT output as `[speaker N]` prefix lines, emitted once per speaker change
- `use_exception` parameter on `DeepgramConverter.__init__()` — set to `False` to suppress `ConverterException` when no valid transcript is found
- `EmptyTranscriptException` raised by `webvtt()` and `srt()` when the converter returns an empty first line
- `line_length` parameter on `webvtt()` and `srt()` — controls the maximum number of words per caption cue (default: 8)
- `get_headers()` on `DeepgramConverter` returns a `NOTE` block for WebVTT output containing request ID, creation time, duration, and channel count from the Deepgram response metadata
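
The speaker-label formats listed above can be sketched as follows. Both helper names and the `(speaker, text)` input shape are assumptions for this example; the real output code in `webvtt.py` and `srt.py` may be organised differently.

```python
def vtt_voice_tag(speaker: int, text: str) -> str:
    """Wrap caption text in a WebVTT voice tag, e.g. <v Speaker 0>text</v>."""
    return f"<v Speaker {speaker}>{text}</v>"


def srt_lines_with_speakers(lines: list[tuple[int, str]]) -> list[str]:
    """Emit a `[speaker N]` prefix line once per speaker change.

    Hypothetical simplification: cue numbers and timestamps are omitted.
    """
    output: list[str] = []
    previous_speaker = None
    for speaker, text in lines:
        if speaker != previous_speaker:
            output.append(f"[speaker {speaker}]")
            previous_speaker = speaker
        output.append(text)
    return output
```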

### Changed
- `DeepgramConverter` now prefers the `utterances` array over `channels[0].alternatives[0].words` when both are present, producing more natural sentence-level caption breaks
- `webvtt()` checks for `get_headers()` capability via `hasattr`/`callable` — custom converters do not need to implement it
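
The optional-capability check mentioned above can be sketched with `hasattr` and `callable`. The helper name `optional_headers` and the two example classes are hypothetical; only the `get_headers()` method name comes from this changelog.

```python
from typing import Any


def optional_headers(converter: Any) -> list[str]:
    """Return header lines if the converter provides get_headers(), else []."""
    if hasattr(converter, "get_headers") and callable(converter.get_headers):
        return list(converter.get_headers())
    return []


class WithHeaders:
    """Example converter that implements the optional method."""

    def get_headers(self) -> list[str]:
        return ["NOTE", "Example header line"]


class WithoutHeaders:
    """Example converter that omits the optional method."""
```

With this pattern a custom converter that lacks `get_headers()` simply produces no header block instead of raising `AttributeError`.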

### Fixed
- Microsecond precision in `seconds_to_timestamp()` is now correctly truncated to milliseconds for both WebVTT (`.`) and SRT (`,`) formats
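
As a sketch of the truncation behaviour, a formatter along these lines drops anything finer than a millisecond and lets the caller choose the separator. The name `format_timestamp` and the `separator` parameter are assumptions for this example and not the actual signature of `seconds_to_timestamp()`.

```python
def format_timestamp(seconds: float, separator: str = ".") -> str:
    """Format seconds as HH:MM:SS<sep>mmm, truncating to whole milliseconds.

    Hypothetical illustration: use "." for WebVTT and "," for SRT.
    """
    total_ms = int(seconds * 1000)  # int() truncates the sub-millisecond part
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"
```

Working in integer milliseconds after one truncation avoids carrying floating-point fractions through the hour, minute, and second arithmetic.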

## [0.1.0] - 2023-09-20

### Added
- `DeepgramConverter` class wrapping Deepgram pre-recorded and streaming API responses
- `webvtt()` function generating valid WebVTT documents from any converter
- `srt()` function generating valid SRT documents from any converter
- `seconds_to_timestamp()` utility converting float seconds to `HH:MM:SS.mmm` or `HH:MM:SS,mmm`
- `chunk_array()` utility splitting word lists into fixed-length groups
- `EmptyTranscriptException` for empty transcript detection
- Support for Deepgram SDK response objects via `.to_json()` method detection
- Initial test suite covering Deepgram pre-recorded responses

## [0.0.1] - 2023-08-01

### Added
- Initial project scaffold
- Package structure: `deepgram_captions/` with `__init__.py`, `helpers.py`, `converters.py`, `webvtt.py`, `srt.py`
- `setup.py` with basic package metadata
- MIT License
- Initial README

[1.2.0]: https://github.com/deepgram/deepgram-python-captions/compare/v1.1.0...v1.2.0
[1.1.0]: https://github.com/deepgram/deepgram-python-captions/compare/v1.0.0...v1.1.0
[1.0.0]: https://github.com/deepgram/deepgram-python-captions/compare/v0.1.0...v1.0.0
[0.1.0]: https://github.com/deepgram/deepgram-python-captions/compare/v0.0.1...v0.1.0
[0.0.1]: https://github.com/deepgram/deepgram-python-captions/releases/tag/v0.0.1