Overview

RinggSTTService is a streaming Speech-to-Text integration that delegates the WebSocket connection, handshake, and event parsing to the official ringglabs Python SDK. It supports streaming interim and final transcripts, server-side capitalization and punctuation, and client-driven VAD endpointing by forwarding Pipecat’s VAD frames to the server as start_speaking / stop_speaking cues.
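The client-driven endpointing described above can be pictured as a small mapping from VAD frames to cue names. The following is only a sketch: the frame classes are local stand-ins named after Pipecat's VAD frames so that the example runs without Pipecat installed, and `vad_cue_for` is a hypothetical helper, not part of the integration or the ringglabs SDK. How the cue is actually sent over the wire is handled by the SDK and is not shown.

```python
from dataclasses import dataclass
from typing import Optional


# Local stand-ins (assumption) for Pipecat's VAD frames, so the sketch is
# self-contained. The real service receives frames from an upstream VADProcessor.
@dataclass
class VADUserStartedSpeakingFrame:
    pass


@dataclass
class VADUserStoppedSpeakingFrame:
    pass


@dataclass
class AudioFrame:
    audio: bytes = b""


def vad_cue_for(frame: object) -> Optional[str]:
    """Return the cue name for a VAD frame, or None for any other frame.

    Hypothetical helper: it only illustrates the frame-to-cue mapping.
    """
    if isinstance(frame, VADUserStartedSpeakingFrame):
        return "start_speaking"
    if isinstance(frame, VADUserStoppedSpeakingFrame):
        return "stop_speaking"
    return None  # audio and other frames carry no endpointing cue
```

Frames that are not VAD events map to `None`, which mirrors the idea that only speech boundaries are forwarded as cues while audio continues to stream normally.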

  • Source Repository: Source code, examples, and issues for the Ringg AI integration
  • Ringg AI: Learn more about Ringg AI and sign up for an API key
  • Ringg SDK: The ringglabs SDK that powers the integration

Installation

This is a community-maintained package distributed separately from pipecat-ai. It is not yet published to PyPI, so install it directly from GitHub:

# With uv (recommended)
uv pip install git+https://github.com/Stonkr/pipecat-ringg.git

# With pip
pip install git+https://github.com/Stonkr/pipecat-ringg.git

A PyPI package (pip install pipecat-ringg) will be published after community review. See the source repository for the latest install instructions.

Prerequisites

Ringg AI Account Setup

Before using the Ringg AI STT service, you need:
  1. Ringg AI Account: Sign up at ringg.ai
  2. API Key: Get a Ringg AI API key from your account

Required Environment Variables

  • RINGG_API_KEY: Your Ringg AI API key for authentication
  • RINGG_BASE_URL (optional): Override the Ringg API base URL. Leave unset to use the SDK default.
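
One way to read these variables at startup is sketched below. `read_ringg_env` is a hypothetical helper introduced for illustration and is not part of the package. It fails early when the required key is missing and treats an unset or empty RINGG_BASE_URL as "use the SDK default" by returning `None`.

```python
import os
from typing import Mapping, Optional, Tuple


def read_ringg_env(env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (api_key, base_url) from the environment.

    Hypothetical helper: raises if RINGG_API_KEY is missing or blank, and
    returns None for base_url when RINGG_BASE_URL is unset or blank.
    """
    api_key = env.get("RINGG_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("RINGG_API_KEY is not set")
    base_url = env.get("RINGG_BASE_URL", "").strip() or None
    return api_key, base_url
```

The returned pair lines up with the `api_key` field of the params and the `base_url` constructor argument documented below.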

Configuration

Constructor parameters

  • base_url (str, default: None): Optional override for the Ringg API base URL. Passed through to the SDK; leave as None to use the SDK default.
  • sample_rate (int, default: None): Sample rate (Hz) of the audio to be streamed. If not provided, the value is taken from the StartFrame.
  • params (RinggSTTParams, default: None): Service configuration parameters. See Params below. Defaults are used if omitted.
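
The sample-rate fallback can be summarized in a few lines. This is a sketch of the rule as documented, not the service's actual code: `resolve_sample_rate` and its argument names are hypothetical, and `start_frame_rate` stands for whatever input rate the StartFrame carries.

```python
from typing import Optional


def resolve_sample_rate(explicit: Optional[int], start_frame_rate: int) -> int:
    """Prefer the constructor value; otherwise fall back to the StartFrame rate.

    Hypothetical helper illustrating the documented precedence only.
    """
    rate = explicit if explicit is not None else start_frame_rate
    if rate <= 0:
        raise ValueError(f"sample rate must be positive, got {rate}")
    return rate
```

For example, `resolve_sample_rate(None, 16000)` yields the pipeline's rate, while `resolve_sample_rate(8000, 16000)` keeps the explicit override.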

Params

Configuration passed via the params constructor argument using RinggSTTParams(...).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | "" | Ringg API key for authentication. |
| encoding | str | "int16" | Audio encoding (signed 16-bit PCM). |
| language | str | "hi" | Transcription language code. |
| mode | str | "stream" | "on_final" emits a final transcript on stop_speaking; "stream" emits interim transcripts. |
| vad_tail_sil_ms | int | 200 | Trailing silence (ms) for server VAD. |
| vad_confidence | float | 0.55 | Server VAD confidence threshold (0.0–1.0). |
| enable_cap_punc | bool | True | Enable server-side capitalization/punctuation. |
| accept_client_vad_events | bool | True | Use client-sent VAD events for endpointing. |

See the source repository for the authoritative, up-to-date list of parameters and defaults.
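
To make the documented defaults and value ranges concrete, here is a sketch that mirrors the parameters above in a plain dataclass and checks them before use. `ParamsSketch` and `validate` are hypothetical stand-ins, not the real RinggSTTParams. The assumption that "stream" and "on_final" are the only valid modes is inferred from the parameter description, and the real class may validate differently or not at all.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParamsSketch:
    """Hypothetical stand-in mirroring the documented defaults."""

    api_key: str = ""
    encoding: str = "int16"
    language: str = "hi"
    mode: str = "stream"
    vad_tail_sil_ms: int = 200
    vad_confidence: float = 0.55
    enable_cap_punc: bool = True
    accept_client_vad_events: bool = True


def validate(params: ParamsSketch) -> List[str]:
    """Return a list of human-readable problems; empty means the values look sane."""
    problems: List[str] = []
    if not params.api_key:
        problems.append("api_key is empty")
    # Assumption: only the two modes named in the documentation are accepted.
    if params.mode not in ("stream", "on_final"):
        problems.append(f"unknown mode: {params.mode!r}")
    if not 0.0 <= params.vad_confidence <= 1.0:
        problems.append("vad_confidence must be between 0.0 and 1.0")
    if params.vad_tail_sil_ms < 0:
        problems.append("vad_tail_sil_ms must not be negative")
    return problems
```

Running a check like this before constructing the service surfaces typos such as a misspelled mode at startup rather than during a live call.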

Usage

import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.processors.audio.vad_processor import VADProcessor

from pipecat_ringg import RinggSTTParams, RinggSTTService

stt = RinggSTTService(
    base_url=os.environ.get("RINGG_BASE_URL"),  # optional
    params=RinggSTTParams(
        api_key=os.environ["RINGG_API_KEY"],
        language="hi",
        mode="on_final",  # "stream" for interim transcripts
    ),
)

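# transport, context_aggregator, llm, and tts are assumed to be created
# elsewhere in your bot; they are not defined in this snippet.
#
# A VADProcessor upstream of the STT service produces the VAD frames the
# service forwards to the server for endpointing.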
pipeline = Pipeline([
    transport.input(),
    VADProcessor(vad_analyzer=SileroVADAnalyzer(params=VADParams(confidence=0.55))),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant(),
])

See example.py in the source repository for a complete, runnable example.

Compatibility

Tested with Pipecat v1.2.1. Check the source repository for the latest tested version and changelog.