Overview

FalSTTService provides speech-to-text capabilities using Fal’s Wizper API with Voice Activity Detection (VAD) to process only speech segments, optimizing API usage and improving response time.

Installation

To use Fal services, install the required dependency:
pip install "pipecat-ai[fal]"
You’ll also need to set up your Fal API key as an environment variable: FAL_KEY.
Get your API key from the Fal platform.

Frames

Input

  • InputAudioRawFrame - Raw PCM audio data (16-bit, mono)
  • UserStartedSpeakingFrame - VAD detection of speech start
  • UserStoppedSpeakingFrame - VAD detection of speech end (triggers processing)
  • STTUpdateSettingsFrame - Runtime transcription configuration updates
  • STTMuteFrame - Mute audio input for transcription

Output

  • TranscriptionFrame - Final transcription results after speech segment ends
  • ErrorFrame - API or processing errors

Models

Fal offers the Wizper model with version options:
Model    Version    Description
wizper   3          Latest Wizper model (default)
wizper   2          Previous version for compatibility

VAD-Based Processing

FalSTTService extends SegmentedSTTService, which uses Voice Activity Detection to process complete speech segments:
  • Segment Processing: Only processes complete utterances, not continuous audio
  • Audio Buffering: Maintains a 1-second buffer to capture speech before VAD detection
  • VAD Requirement: Requires a VAD component like SileroVADAnalyzer in your transport
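The buffering behavior described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Pipecat's actual implementation: a rolling pre-speech buffer holds roughly the most recent second of audio so the start of an utterance, which arrives before the VAD fires, is not lost.

```python
from collections import deque

SAMPLE_RATE = 16000                  # 16 kHz mono, 16-bit PCM
BYTES_PER_SECOND = SAMPLE_RATE * 2   # 2 bytes per sample

class SegmentBuffer:
    """Illustrative sketch of VAD-segmented buffering (not Pipecat's real code)."""

    def __init__(self, pre_speech_secs=1.0):
        self._max_pre_bytes = int(BYTES_PER_SECOND * pre_speech_secs)
        self._pre_speech = deque()   # rolling buffer of audio chunks before speech
        self._pre_bytes = 0
        self._segment = bytearray()
        self._in_speech = False

    def add_audio(self, chunk: bytes):
        if self._in_speech:
            self._segment.extend(chunk)
        else:
            self._pre_speech.append(chunk)
            self._pre_bytes += len(chunk)
            # Drop the oldest chunks once the rolling window exceeds ~1 second
            while self._pre_bytes > self._max_pre_bytes:
                dropped = self._pre_speech.popleft()
                self._pre_bytes -= len(dropped)

    def user_started_speaking(self):
        # Seed the segment with the buffered pre-speech audio
        self._in_speech = True
        self._segment = bytearray(b"".join(self._pre_speech))
        self._pre_speech.clear()
        self._pre_bytes = 0

    def user_stopped_speaking(self) -> bytes:
        # Return the complete utterance; this is what would be sent to the API
        self._in_speech = False
        segment, self._segment = bytes(self._segment), bytearray()
        return segment
```

Because processing only starts after UserStoppedSpeakingFrame, the service sends one complete utterance per API call instead of streaming continuous audio.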

Language Support

Common languages:
  • Language.EN - English - en
  • Language.ES - Spanish - es
  • Language.FR - French - fr
  • Language.DE - German - de
  • Language.IT - Italian - it
  • Language.JA - Japanese - ja
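Wizper expects ISO 639-1 language codes like those listed above. As a minimal sketch of the kind of enum-to-code mapping involved (the helper name and dictionary here are hypothetical; Pipecat's Language enum ships its own conversion):

```python
# Hypothetical mapping sketch; mirrors the codes listed above.
LANGUAGE_CODES = {
    "EN": "en", "ES": "es", "FR": "fr",
    "DE": "de", "IT": "it", "JA": "ja",
}

def to_wizper_language(language_name: str, default: str = "en") -> str:
    """Return the ISO 639-1 code Wizper expects, falling back to English."""
    return LANGUAGE_CODES.get(language_name.upper(), default)
```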

Usage Example

Basic Configuration

Initialize the FalSTTService and use it in a pipeline:
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.fal.stt import FalSTTService
from pipecat.transcriptions.language import Language

# Simple setup
stt = FalSTTService(
    api_key=os.getenv("FAL_KEY")
)

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Dynamic Configuration

Update settings at runtime by pushing an STTUpdateSettingsFrame to the FalSTTService:
from pipecat.frames.frames import STTUpdateSettingsFrame

await task.queue_frame(
    STTUpdateSettingsFrame(settings={"language": Language.FR})
)

Metrics

The service provides performance metrics:
  • Time to First Byte (TTFB) - Latency from audio input to first transcription
  • Processing Duration - Total time spent processing audio
Learn how to enable Metrics in your Pipeline.
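As a rough sketch of what TTFB measures (the timer class here is hypothetical; Pipecat's metrics system reports this automatically when enabled):

```python
import time

class TTFBTimer:
    """Hypothetical sketch of Time to First Byte measurement."""

    def __init__(self):
        self._start = None
        self.ttfb = None

    def audio_submitted(self):
        # Clock starts when the speech segment is handed to the STT API
        self._start = time.monotonic()

    def first_result_received(self):
        # Clock stops at the first transcription result
        if self._start is not None and self.ttfb is None:
            self.ttfb = time.monotonic() - self._start
```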

Additional Notes

  • VAD Dependency: Requires a VAD component in your transport for speech segment detection
  • Segment Processing: Processes complete utterances rather than streaming audio
  • Translation Support: Can translate non-English speech directly to English when using the translate task
  • Error Handling: Comprehensive error handling for API failures and network issues