Free Text to Speech

Convert any text to speech using your device's built-in voices. Free, no sign-up, no download.

About This Tool

This Text to Speech tool uses the Web Speech API built into your browser. No downloads, no installation, no account. Voice availability depends on your operating system and browser: on Safari (macOS / iOS) the synthesis happens entirely on-device, while Chrome and Edge can send the text to Google's or Microsoft's speech service to render the audio when a cloud voice is selected. Absolutool itself never sees or stores your text.

To get more voices in your language, install your OS's speech packs. Windows: Settings → Time & Language → Speech. macOS and iOS include dozens of voices out of the box. Android: Settings → Accessibility → Text-to-speech output.

How the Web Speech API Works

Browsers expose a SpeechSynthesis interface (part of the Web Speech API, originally drafted by the W3C Speech API Community Group) that takes text and a chosen voice and produces audible speech, usually via the underlying operating system's TTS engine. The full API surface is small but powerful: speechSynthesis.speak(utterance) queues an utterance for speech, cancel() / pause() / resume() control playback, and getVoices() lists every voice available to the browser. Each SpeechSynthesisUtterance carries the text, language tag, voice, rate, pitch, and volume.
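The calls described above can be sketched roughly as follows. This is a minimal illustration, not this tool's actual code: speakText and the three "Like" interfaces are hypothetical names that mirror only the parts of the API used here, so the sketch also type-checks and runs outside a browser.

```typescript
// Hypothetical structural types mirroring the parts of the Web Speech API used below.
interface VoiceLike { name: string; lang: string; localService: boolean; }
interface UtteranceLike {
  text: string;
  lang: string;
  voice: VoiceLike | null;
  rate: number;
  pitch: number;
  volume: number;
}
interface SynthLike {
  getVoices(): VoiceLike[];
  speak(utterance: UtteranceLike): void;
  cancel(): void;
}

// Hypothetical helper: configure one utterance and hand it to the synthesizer.
function speakText(
  synth: SynthLike,
  makeUtterance: (text: string) => UtteranceLike,
  text: string,
  lang: string = "en-US",
): UtteranceLike {
  const utterance = makeUtterance(text);
  // Use the first voice whose language tag matches; null keeps the browser default.
  utterance.voice = synth.getVoices().filter((v) => v.lang === lang)[0] ?? null;
  utterance.lang = lang;
  utterance.rate = 1;   // normal speed
  utterance.pitch = 1;  // normal pitch
  utterance.volume = 1; // full volume
  synth.cancel();       // drop anything still queued or speaking
  synth.speak(utterance);
  return utterance;
}

// In a browser this would be called along the lines of:
// speakText(window.speechSynthesis, (t) => new SpeechSynthesisUtterance(t), "Hello there");
```

Passing the synthesizer and the utterance factory in as parameters is only a convenience for the sketch; in page code you would normally use window.speechSynthesis and new SpeechSynthesisUtterance(text) directly.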

For local voices, the audio itself is generated by the OS, not the browser. macOS and iOS ship with dozens of high-quality voices built into the system. Windows surfaces voices installed via Settings → Time & Language → Speech. Android uses Google's Text-to-Speech engine (or alternatives like Samsung TTS). Linux falls through to whatever speech-dispatcher / espeak setup the distro provides, which often sounds robotic by default unless you've installed a richer engine.

The Cloud-vs-Local Privacy Distinction

Not every "browser" voice runs on your device. Some browsers send the text to a remote server to render the audio for higher-quality voices, then stream the result back. This matters for privacy, because the text leaves your machine whenever a cloud voice is used.

If your text is sensitive (drafts of confidential documents, internal company memos, anything you wouldn't want copied to a third party), pick a voice marked as local. If you don't see local voices in the dropdown, install OS voice packs and they'll appear there.
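For sensitive text, one way to enforce that advice in code is to refuse any voice that is not flagged as on-device. The sketch below assumes the voice list has already been read from getVoices(); pickLocalVoice, primarySubtag, and VoiceInfo are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: the three SpeechSynthesisVoice fields this sketch relies on.
interface VoiceInfo { name: string; lang: string; localService: boolean; }

// Reduce a tag such as "en-US" (or "en_US") to its primary language, "en".
function primarySubtag(tag: string): string {
  return tag.toLowerCase().replace(/[-_].*$/, "");
}

// Return an on-device voice for `lang`, or null when none is installed.
// Cloud voices (localService === false) are never returned.
function pickLocalVoice(voices: VoiceInfo[], lang: string): VoiceInfo | null {
  const local = voices.filter((v) => v.localService);
  const wanted = lang.toLowerCase();
  // First preference: exact tag match, e.g. "en-GB" for "en-GB".
  const exact = local.filter((v) => v.lang.toLowerCase() === wanted)[0];
  if (exact) return exact;
  // Second preference: same language, different region, e.g. "en-US" for "en-GB".
  const sameLanguage = local.filter(
    (v) => primarySubtag(v.lang) === primarySubtag(lang),
  )[0];
  return sameLanguage ?? null;
}
```

When the helper returns null, it is probably safer to tell the user that no local voice is installed than to fall back silently to a cloud voice.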

Common Use Cases

Quirks and Limitations to Know About

Why Voice Quality Varies So Much

The quality of a TTS voice depends on the underlying engine, which depends on the OS and on what you've installed. Older voices (eSpeak, Microsoft Anna, the old Mac "Fred") use formant or concatenative synthesis and sound robotic and stilted. Modern voices (Apple's Siri voices, Microsoft's Online Natural voices, Google's WaveNet-based voices, ElevenLabs' subscription voices) use deep learning to generate audio that comes much closer to a human reader.

If the voices in your dropdown sound robotic, the fix isn't this tool; it's installing better voices in your OS.

Common Mistakes

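  1. Expecting Firefox to behave like Chrome or Safari. Firefox's Web Speech API support has lagged: it implements speech synthesis but not recognition, and voice availability can be thinner on some platforms. If the Speak button is disabled when you visit in Firefox, use a Chromium-based browser or Safari for reliable TTS.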
  2. Pasting confidential text into a Chrome session and assuming it's local. The default Chrome "Google" voices send text to Google's TTS service. Pick a local voice or use Safari for sensitive content.
  3. Long blocks of text in Chrome. The 15-second / ~250-character cut-off catches anyone who pastes a paragraph and expects it to read all the way through. Either split the text or use Safari (no cut-off).
  4. Setting rate or pitch too far out of range. The spec allows rate 0.1–10 and pitch 0–2, but not every engine clamps extreme values; some silently produce no audio. Stay within rate 0.5–2.5 and pitch 0.5–1.5 for predictable results.
  5. Treating browser TTS as production-quality voiceover. Even the best browser voices are good enough for proofreading, accessibility, and rough drafts, not for published podcasts or commercial voiceover. For that, look at ElevenLabs, Murf, or similar paid services.
  6. Forgetting that voices load asynchronously. On a first page visit, Chrome may show no voices until the browser fires its voiceschanged event; if the dropdown stays empty, refresh after a moment and they'll appear.
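Two of the mistakes above (out-of-range rate or pitch, and reading the voice list before it has loaded) can be guarded against in code. The following is a sketch under stated assumptions: the limits are simply the conservative ranges suggested in this list, the browser is assumed to fire voiceschanged on speechSynthesis, and clampTo, safeRate, safePitch, whenVoicesReady, VoiceEntry, and VoiceSource are hypothetical names.

```typescript
// Clamp a number into [min, max]; NaN and infinities fall back to a default.
function clampTo(value: number, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || !isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Limits taken from the advice in this list, not from the spec.
function safeRate(rate: number): number { return clampTo(rate, 0.5, 2.5, 1); }
function safePitch(pitch: number): number { return clampTo(pitch, 0.5, 1.5, 1); }

// Hypothetical structural types covering the parts of speechSynthesis used below.
interface VoiceEntry { name: string; lang: string; }
interface VoiceSource {
  getVoices(): VoiceEntry[];
  addEventListener(type: "voiceschanged", listener: () => void): void;
  removeEventListener(type: "voiceschanged", listener: () => void): void;
}

// Call onReady once with a non-empty voice list, waiting for voiceschanged if needed.
function whenVoicesReady(
  source: VoiceSource,
  onReady: (voices: VoiceEntry[]) => void,
): void {
  const initial = source.getVoices();
  if (initial.length > 0) {
    onReady(initial); // already loaded, no need to wait
    return;
  }
  const listener = (): void => {
    const loaded = source.getVoices();
    if (loaded.length === 0) return; // still empty, keep listening
    source.removeEventListener("voiceschanged", listener);
    onReady(loaded);
  };
  source.addEventListener("voiceschanged", listener);
}

// In a browser, roughly:
// whenVoicesReady(window.speechSynthesis, (voices) => fillDropdown(voices)); // fillDropdown is hypothetical
```

If a browser never fires voiceschanged and returns an empty list, the callback never runs, so a real page would likely want a timeout or a visible "no voices found" state as well.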

More Frequently Asked Questions

How do I tell if a voice is local or cloud-based?

Programmatically, the SpeechSynthesisVoice.localService property is true for on-device voices and false for cloud-based ones. In practice, voice naming conventions help: Chrome's voices labelled "Google" are usually cloud-based; voices that match your OS's installed voices (Microsoft David, Apple Samantha) are local if the OS has them. Safari's voices are always local.
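As an illustration of the localService check, a page could label each entry before showing it in a dropdown. This is only a sketch of one possible approach; describeVoice, groupByOrigin, and ListedVoice are hypothetical names, and the "local" / "cloud" wording is an arbitrary choice.

```typescript
// Hypothetical shape: the SpeechSynthesisVoice fields needed for labelling.
interface ListedVoice { name: string; lang: string; localService: boolean; }

// Build a dropdown label that makes the voice's origin visible.
function describeVoice(voice: ListedVoice): string {
  const origin = voice.localService ? "local" : "cloud";
  return `${voice.name} (${voice.lang}, ${origin})`;
}

// Split a voice list into on-device and cloud-backed voices.
function groupByOrigin(
  voices: ListedVoice[],
): { local: ListedVoice[]; cloud: ListedVoice[] } {
  return {
    local: voices.filter((v) => v.localService),
    cloud: voices.filter((v) => !v.localService),
  };
}

// In a browser, roughly:
// window.speechSynthesis.getVoices().map(describeVoice);
```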

Can I save the audio as an MP3 file?

Not with the browser's Web Speech API directly: the spec doesn't expose the audio stream for capture. If you need a downloadable MP3 / WAV, options include an audio recorder like Audacity capturing your system audio, a paid TTS API (Google Cloud TTS, Amazon Polly, ElevenLabs) that returns the audio file, or a screen-recording app capturing the playback.

Why is the audio choppy or stopping mid-sentence?

The most common cause on Chrome is the long-text bug: speech stops at ~15 seconds. Refresh and try again with a shorter passage, or switch to Safari, which doesn't have that limit. Other causes: a system glitch in the OS speech engine (a restart usually fixes it), or a cloud voice failing to fetch when offline (switch to a local voice).
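A common workaround for the long-text cut-off is to split the passage at sentence boundaries and queue each piece as its own utterance. The sketch below assumes a 200-character limit, chosen to stay under the ~250-character figure mentioned on this page; splitIntoChunks is a hypothetical helper, and its sentence detection is deliberately simple.

```typescript
// Split text into chunks of at most maxLength characters, preferring sentence
// boundaries and falling back to word boundaries for very long sentences.
// A single word longer than maxLength is kept whole and will exceed the limit.
function splitIntoChunks(text: string, maxLength: number = 200): string[] {
  // Each match is one sentence plus its trailing punctuation and whitespace.
  const sentences: string[] = text.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks: string[] = [];
  let current = "";

  const flush = (): void => {
    const trimmed = current.trim();
    if (trimmed) chunks.push(trimmed);
    current = "";
  };

  for (const sentence of sentences) {
    // Start a new chunk if adding this sentence would exceed the limit.
    if (current && (current + sentence).trim().length > maxLength) flush();

    if (sentence.trim().length <= maxLength) {
      current += sentence;
      continue;
    }

    // The sentence alone is too long: pack it word by word instead.
    for (const word of sentence.split(/\s+/)) {
      if (!word) continue;
      if (current && (current + " " + word).length > maxLength) flush();
      current = current ? current + " " + word : word;
    }
    flush();
  }

  flush();
  return chunks;
}

// In a browser, speak() appends each utterance to the queue, so roughly:
// for (const chunk of splitIntoChunks(longText)) {
//   window.speechSynthesis.speak(new SpeechSynthesisUtterance(chunk));
// }
```

Shorter chunks add small pauses between utterances, so the limit is a trade-off between reliability and smooth pacing.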

Does this work in any language?

Any language your operating system has a voice installed for. macOS and iOS ship with dozens of languages built in. Windows requires installing speech packs per language (Settings → Time & Language → Speech → Add voices). Android needs Google TTS or a third-party engine to have the language data downloaded. The Voice dropdown lists everything available; the language tag (en-US, fr-FR, ja-JP, etc.) tells you which language each voice produces.
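To show which languages are covered, the voice list can be grouped by language tag. This sketch assumes the list came from getVoices(); groupVoicesByLanguage and LangVoice are hypothetical names, and the underscore handling is a defensive assumption, since some browsers have been reported to return tags like en_US instead of en-US.

```typescript
// Hypothetical shape: the SpeechSynthesisVoice fields needed for grouping.
interface LangVoice { name: string; lang: string; }

// Map each language tag (normalised to use hyphens) to the voice names that speak it.
function groupVoicesByLanguage(voices: LangVoice[]): { [lang: string]: string[] } {
  const groups: { [lang: string]: string[] } = {};
  for (const voice of voices) {
    const tag = voice.lang.replace(/_/g, "-"); // "en_US" becomes "en-US"
    const names = groups[tag] || [];
    names.push(voice.name);
    groups[tag] = names;
  }
  return groups;
}

// In a browser, roughly:
// groupVoicesByLanguage(window.speechSynthesis.getVoices());
```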

Is this useful for podcasting?

For drafts and pacing tests, yes. For published episodes, the quality bar is higher: even the best browser voices have subtle artefacts that listeners pick up on quickly. Paid services like ElevenLabs and Murf offer voice models trained for long-form narration and produce noticeably better results, usually billed per character or by subscription.

Can I use this for blind / low-vision users on my own site?

A site doesn't usually need to embed TTS for accessibility; assistive technologies like screen readers (VoiceOver on Apple devices, NVDA / JAWS on Windows, TalkBack on Android) handle that universally. Embedded TTS is more useful as an occasional read-aloud convenience for sighted users with reading fatigue or for learners. For accessibility, focus on semantic HTML, ARIA labels, keyboard navigation, and contrast; those help every screen reader work better, including the user's own.