
Product Published 2026-06-24 · 3 min read

AI Voice Generation v2 is here: natural speech and your own cloned voice, on your device

A text-to-speech and voice-cloning studio built on the cross-platform AI Suite v2 stack: type any text and hear it in a natural voice, or clone a voice of your own from a short clip — with an offline default, encrypted-on-device storage, a seekable player and exports to WAV or MP3.

Today we're releasing AI Voice Generation v2, a text-to-speech and voice-cloning studio built on the cross-platform AI Suite v2 stack. Type any text and hear it spoken in a natural voice, or record a short clip and build a custom voice that sounds like you, then reuse it any time. The point is simple: turn text into speech, and clone a voice of your own, without sending your script or your voice samples to the cloud unless you explicitly opt in.

What it does

Paste a sentence, a paragraph, or a whole script; pick a voice; press generate; and play back natural-sounding speech. The built-in Amy voice runs fully offline on your machine once its small model downloads on first use — turn the network off and it keeps working. Adjust the speaking speed from 0.5× to 2.0× without changing pitch, choose your output format, and replay your last clip instantly with a seekable waveform player. When you like a result, save it as a WAV or MP3 file.
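
As a rough illustration of the settings described above, the sketch below models a single generation request in Python. The SpeechRequest class and its field names are hypothetical and are not the app's actual API; only the 0.5× to 2.0× speed range, the Amy voice, and the WAV and MP3 export formats come from this article.

```python
from dataclasses import dataclass

# Limits taken from the article; everything else is an illustrative assumption.
MIN_SPEED = 0.5
MAX_SPEED = 2.0
EXPORT_FORMATS = ("wav", "mp3")


@dataclass(frozen=True)
class SpeechRequest:
    """Hypothetical container for one text-to-speech job."""

    text: str
    voice: str = "Amy"
    speed: float = 1.0
    export_format: str = "wav"

    def __post_init__(self) -> None:
        # Reject settings outside the ranges the article describes.
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not MIN_SPEED <= self.speed <= MAX_SPEED:
            raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
        if self.export_format not in EXPORT_FORMATS:
            raise ValueError(f"export_format must be one of {EXPORT_FORMATS}")
```

Validating at construction time means a batch script would fail on a bad setting before any audio is synthesized.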

Want a specific voice? Record or drop in a short reference clip and the app builds a custom cloned voice — no training step, no studio. Your clones appear at the top of the voice picker for instant reuse, so a consistent voice travels across every clip you make.

New in v2

  • A real voice library. One searchable picker holds the offline built-in voice, premium cloud voices, high-quality on-device neural voices, and your own clones. A quick filter narrows the list by name or language as you type, and a download badge shows which voices fetch their files the first time you use them.
  • Voice cloning, with your first clone free. Record 6–15 seconds of clean speech (or import a clip), name the voice, and create a clone with the free Chatterbox engine. It lands at the top of the picker with no Pro badge — a real custom voice, not a trial.
  • Engines for every situation. The offline Piper engine is the default and runs entirely on your device. Pro unlocks more Piper voices, OpenAI-compatible cloud voices (bring your own key), and on-device neural voices — Kokoro and Parler-TTS — that stay local. A keep-models-loaded option means repeat generations skip the load time.
  • Replay without re-synthesizing. A waveform player replays your last generated clip and your clone reference clips instantly, so you never regenerate just to listen again. You decide how many recent clips stay cached.
  • Offload heavy work when you want to. Cloning is compute-heavy; if this machine has no fast GPU you can offload it to an AI Server on your network. Even then your reference clip stays encrypted on your device — a one-time copy is sent only to synthesize each request and is never stored on the server.
  • In 16 languages, and scriptable. The whole interface follows your operating system's language, and the app speaks and clones voices across many languages. A headless command-line tool generates speech from the terminal, for batch voiceovers and automation on your own infrastructure.
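
The 6–15 second guideline for reference clips mentioned in the list above is easy to check before importing a clip. The following is a minimal sketch using Python's standard wave module; the function names are hypothetical, it handles uncompressed WAV input only, and it says nothing about how the app itself validates clips.

```python
import wave

# Clip length guideline quoted in the article.
MIN_CLIP_SECONDS = 6.0
MAX_CLIP_SECONDS = 15.0


def clip_duration_seconds(source) -> float:
    """Return the duration of a WAV clip given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(source) -> bool:
    """True when the clip length falls inside the 6-15 second guideline."""
    return MIN_CLIP_SECONDS <= clip_duration_seconds(source) <= MAX_CLIP_SECONDS
```

A length check like this cannot judge whether the speech is clean, so it is only a first filter before listening to the clip.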

Free vs Pro

The free tier is genuinely usable on its own: it includes the offline Amy voice and your first cloned voice, made with the free cloning engine, with no usage cap and no per-word billing. Pro is about more choice: every other built-in voice, premium cloud voices, high-quality on-device neural voices, the advanced cloning engines, and unlimited clones beyond your first. Voice cloning itself is not locked behind Pro; the basic engine is free and covers one clone, and Pro simply adds more clones and more engines.

Why on-device matters

Scripts, narration, a voice that sounds like you — the text and voice samples people most want to work with are often exactly what can't go to a third-party cloud. AI Voice Generation brings the AI to your machine instead: the offline Amy voice and your on-device clones send nothing off the device, your clone reference clips and generation history are encrypted at rest under a key only your AI Suite apps can read, and there's no per-character meter. Only the optional cloud voices and the optional clone offload ever send anything online, and both are explicit opt-ins. It's the same on-premises, privacy-first principle behind everything Software Tailor builds [1].

Get it

AI Voice Generation v2 is available now on the Microsoft Store for Windows as a free download, with Pro available when you want the full voice library and unlimited clones [2]. Browser, iOS and Android editions also ship, and macOS and Linux desktop builds are part of the same cross-platform stack. Read the full feature tour on the product page.

References

  1. Software Tailor. "AI Voice Generation — product page." softwaretailor.com/ai-voice-generation.htm. Accessed 2026-06-24.
  2. Microsoft. "AI Voice Generation on the Microsoft Store." apps.microsoft.com/detail/9P3F6R35CNSX. Accessed 2026-06-24.

Give your text a voice, privately.

Free to start, on-device by design — with your first cloned voice included. Get it on Windows from the Microsoft Store, or explore the full feature tour first.
