Voice Interface API Documentation

“We consistently achieved 100% accuracy” - MacLife

Flow’s Voice Interface API lets you create magical voice interactions in your applications. Flow converts audio into text with high accuracy across 100+ languages through a blazing fast API. What separates Flow from traditional “speech-to-text” models is that it’s optimized to understand what people say and output text in their style: it applies auto-edits, removes filler words, gets names right, and makes the message more concise while maintaining the speaker’s tone, among other refinements.

Auto edits: “Let’s meet at 6pm, actually let’s do 7” -> “Let’s meet at 7pm”
Concise while maintaining tone and removing filler words: “Search for, um, what’s it called, my flight tickets” -> “Search for my flight tickets”
Getting names right: “Hi Tony, how’s it going?” -> “Hi Tanay, how’s it going?”

This lets you create in your application the same magical experience users get from the Wispr Flow desktop and mobile clients, which let them use voice everywhere on their computers.

Two easy ways to integrate Flow

WebSocket API (recommended): Stream audio via WebSocket to /ws (lower latency, streaming response)
REST API: Send a POST request to /api (slower, for when WebSockets are not available)
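
As a rough illustration of the two integration paths, the sketch below derives the endpoint URL for each transport from a single base URL. Only the /ws and /api paths come from this page; the host `flow.example.com` is a placeholder, and the helper name `endpoint_url` is hypothetical. It builds URLs only and does not send audio or assume any payload format.

```python
from urllib.parse import urlsplit, urlunsplit

WS_PATH = "/ws"     # WebSocket endpoint path, as stated above
REST_PATH = "/api"  # REST endpoint path, as stated above


def endpoint_url(base_url: str, use_websocket: bool = True) -> str:
    """Return the endpoint URL for the chosen transport.

    `base_url` is a placeholder for wherever the API is hosted;
    this page does not specify the host.
    """
    parts = urlsplit(base_url)
    prefix = parts.path.rstrip("/")  # keep any path prefix on the base URL
    if use_websocket:
        # WebSocket URLs use ws:// or wss:// instead of http:// or https://
        scheme = "wss" if parts.scheme == "https" else "ws"
        path = prefix + WS_PATH
    else:
        scheme = parts.scheme
        path = prefix + REST_PATH
    return urlunsplit((scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    base = "https://flow.example.com"  # hypothetical host
    print(endpoint_url(base, use_websocket=True))
    print(endpoint_url(base, use_websocket=False))
```

A client could default to the WebSocket URL and fall back to the REST URL only in environments where WebSockets are unavailable, which matches the recommendation above.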

Two ways to authenticate

API-key based auth: Keep your API key securely on your backend. This is easier to set up, but has higher latency, since requests can’t be sent to the API directly from the client.
Client side auth (recommended): Generate access tokens for your clients so they can access the API directly. This gives lower latency and makes WebSockets easier to implement.
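
The latency difference between the two options comes down to how many hops a request takes. The sketch below models that reasoning only; the `AuthPlan` type and `plan_auth` function are hypothetical names for illustration, and nothing here reflects the real token format or the call used to generate tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthPlan:
    credential: str             # which credential authorizes the request
    credential_holder: str      # where that credential lives
    request_path: tuple[str, ...]  # hops a request takes to reach the API


def plan_auth(client_side: bool) -> AuthPlan:
    """Describe the request path implied by each auth option (illustrative only)."""
    if client_side:
        # The client holds an access token and talks to the API directly.
        return AuthPlan("access token", "client", ("client", "flow"))
    # The API key never leaves the backend, so every request is relayed through it.
    return AuthPlan("API key", "backend", ("client", "backend", "flow"))


if __name__ == "__main__":
    for mode in (False, True):
        plan = plan_auth(client_side=mode)
        hops = len(plan.request_path) - 1
        print(f"{plan.credential} held by {plan.credential_holder}: {hops} hop(s)")
```

With API-key auth every request passes through your backend first, which is the extra hop behind the higher latency; with client side auth the API key still stays on the backend and is used only to generate tokens.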

Getting Started

Basics

Other Endpoints

Client Side Auth

Sample Projects
