Transcribe video and audio recordings

Upload and transcribe recordings from interviews, usability tests, and more, in one click, with our accurate and fast AI-powered speech engine.

Dovetail’s built-in transcription and video highlights are a powerful way to share stories about your research and develop a repository of searchable audio and video clips.

Upload a video or audio recording you’ve taken from an interview, usability test, sales call, or product demo. Dovetail will process the recording into a fast, streamable format, and transcribe it using an advanced AI-powered speech engine. Then, create highlights to turn your raw recording into tagged, searchable audio and video clips.

Get started with transcription

To upload a video or audio recording:

  1. Click Data in the sidebar.
  2. Click + Add data.
  3. Click the Video (🎞) or Audio (🎵) icon.
  4. Choose a file from your computer.
  5. Next to Transcribe this file, click Begin.

Your recording will be uploaded into a note and processed to ensure fast playback. How long this takes depends on the length of the recording. In general, processing takes about 30% of the length of the file; for example, a 60-minute recording will take a little under 20 minutes to process and transcribe.
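The estimate above is simple arithmetic, and a minimal Python sketch may help when planning a batch of uploads. The function name and the constant below are hypothetical, chosen for illustration; the 0.30 ratio is only the "about 30%" rule of thumb from this article, so treat the result as a rough guide rather than a guarantee.

```python
# Assumption: the "about 30%" rule of thumb quoted in this article.
PROCESSING_RATIO = 0.30


def estimate_processing_minutes(recording_minutes: float,
                                ratio: float = PROCESSING_RATIO) -> float:
    """Return a rough processing-time estimate, in minutes."""
    if recording_minutes < 0:
        raise ValueError("recording length cannot be negative")
    # Round to one decimal place; the estimate is not more precise than that.
    return round(recording_minutes * ratio, 1)


if __name__ == "__main__":
    for length in (15, 60, 90):
        estimate = estimate_processing_minutes(length)
        print(f"{length}-minute recording: roughly {estimate} minutes to process")
```

Actual processing time will vary from file to file.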

You can close the note and continue using other parts of Dovetail while your file is uploading. You can safely leave Dovetail entirely (e.g. close the browser window or turn off your computer) while it’s processing and being transcribed, and come back later.

Change and rename speakers

Our AI speech engine attempts to detect multiple speakers automatically, but it sometimes gets this wrong. By default, speakers are named Speaker 1 and Speaker 2. To change or rename the speaker for a monologue, click the speaker's name.

Supported file formats

Dovetail supports the following file formats:

Video formats    Audio formats
mp4              mp3
mov              m4a
mpeg             wav
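If you are preparing many recordings, it can help to check them against the list above before uploading. The following is a minimal sketch in Python. It assumes the file extension reflects the real format, and the function and constant names are hypothetical; it does not describe how Dovetail itself validates files.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported formats listed above.
SUPPORTED_VIDEO = {"mp4", "mov", "mpeg"}
SUPPORTED_AUDIO = {"mp3", "m4a", "wav"}


def recording_kind(filename: str) -> Optional[str]:
    """Return "video", "audio", or None if the extension is not listed."""
    # Compare case-insensitively and without the leading dot.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension in SUPPORTED_VIDEO:
        return "video"
    if extension in SUPPORTED_AUDIO:
        return "audio"
    return None


if __name__ == "__main__":
    for name in ("interview.MP4", "sales-call.wav", "notes.docx"):
        print(name, "->", recording_kind(name))
```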

Supported languages

At the moment, transcription only supports English.

Tips for optimal results

Here are a few tips to improve the quality of your recordings and transcripts:

  • Record in a quiet setting with minimal background noise.
  • Invest in quality recording equipment, such as a microphone or recorder.
  • Speak clearly, loudly, and slowly.
  • Avoid talking over other people.

How our model is trained

The AI speech engine we use is trained on 50,000+ hours of human-transcribed content across a diversity of topics, industries, and accents. This makes our transcripts some of the most accurate available.

Listen to a comparison on our product tour

Transcription pricing

Transcription and video highlights are free to use while the feature is in beta. After the beta, all paid plans will include transcription minutes, and additional minutes can be purchased for $0.20 USD per minute.
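To get a feel for the cost of additional minutes, the calculation can be sketched in Python. This is an illustration only: the $0.20 rate comes from the paragraph above, but this article does not state how many minutes each plan includes, so `minutes_included` is a hypothetical value you would need to supply from the pricing page.

```python
from decimal import Decimal

# Rate quoted above, in USD per additional minute.
PRICE_PER_EXTRA_MINUTE = Decimal("0.20")


def extra_minutes_cost(minutes_used: int, minutes_included: int) -> Decimal:
    """Return the cost in USD of minutes used beyond the plan allowance."""
    if minutes_used < 0 or minutes_included < 0:
        raise ValueError("minutes cannot be negative")
    # Only minutes beyond the allowance are charged.
    extra_minutes = max(0, minutes_used - minutes_included)
    return extra_minutes * PRICE_PER_EXTRA_MINUTE


if __name__ == "__main__":
    # 600 included minutes is a made-up allowance for illustration.
    cost = extra_minutes_cost(minutes_used=700, minutes_included=600)
    print(f"Additional cost: ${cost} USD")
```

Using `Decimal` rather than floating-point numbers keeps currency amounts exact.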

Learn more on our pricing page

2 minute read
Updated 30 Apr 2020


Benjamin Humphrey
Kai Forsyth

© Dovetail Research Pty. Ltd.
