Privacy-first speech-to-text: why TalkScribe never keeps your audio

Barak Laniado - 4 min read

When I started building TalkScribe, I did what every founder does: I studied the competition. The products were impressive. The privacy policies were not. Buried in page after page of legal language was the same quiet arrangement - your audio gets stored, your words become training data, and your "deleted" recordings live on in backups for an unspecified while.

For casual notes, maybe that's a fair trade. But the people I was building for - writers dictating unpublished manuscripts, journalists protecting sources, founders talking through unannounced products - can't make that trade. So TalkScribe was architected the other way around.

What privacy-first actually means here

"Privacy-first" is a marketing phrase until you pin it to mechanics. Here are ours:

  • Live audio is processed in real time and discarded. The recorder streams speech to our engine, text comes back, and the audio is gone. There is no recordings library on our servers because recordings never land there.
  • Uploaded files are deleted after transcription. An upload sits in encrypted storage exactly as long as transcription takes - minutes - and is then deleted automatically.
  • Saved history is opt-in, and off by default. Until you flip the toggle in settings, your transcripts stay in your browser. For billing we record the minutes you used - never the words.
  • Nothing trains AI. Your audio and your transcripts are not used to train models. Not ours, not anyone's.
  • No data resale. The business model is the subscription. You are the customer, not the product.
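
To make those mechanics concrete, here is a minimal sketch of what "off by default" and "minutes, never the words" could look like in code. It is an illustration, not TalkScribe's actual implementation: the names Settings, UsageRecord, Server, and finish_session are hypothetical, and rounding usage up to whole minutes is an assumption made for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Settings:
    # Hypothetical settings object: history ships OFF until the user opts in.
    save_history: bool = False


@dataclass(frozen=True)
class UsageRecord:
    # The only thing kept for billing: an account and a minute count.
    account_id: str
    minutes: int


@dataclass
class Server:
    # Illustrative in-memory stand-in for server-side storage.
    usage: list = field(default_factory=list)
    history: dict = field(default_factory=dict)

    def finish_session(self, account_id, settings, audio, transcript, seconds):
        # Record usage only. Rounding up to whole minutes is an assumption.
        self.usage.append(UsageRecord(account_id, math.ceil(seconds / 60)))

        # Transcripts are kept server-side only after an explicit opt-in.
        if settings.save_history:
            self.history.setdefault(account_id, []).append(transcript)

        # The audio bytes are never written anywhere; once this function
        # returns, nothing on the server refers to them.
        return transcript


server = Server()
server.finish_session("acct-1", Settings(), b"\x00\x01", "hello world", 90)
print(server.usage)    # one usage record, no words in it
print(server.history)  # empty, because history was never switched on
```

The point of the sketch is the shape of the data: the billing record has no field that could hold a transcript, and the history store is only touched when the setting has been switched on.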

What this costs us

Honest privacy architecture has a price, and it's mostly paid by us. If you transcribe something important without opting into history and close the tab, we cannot recover it for you - there is nothing to recover. Support tickets that other companies solve with "here's your recording" end differently here, and we accept that.

It also means we gave up the thing most AI companies prize most: a giant corpus of user data to mine. Our transcription engine improves through model updates and through the custom vocabulary you explicitly configure - not by quietly learning from your dictation. That is a slower path. It is also the only one I could explain to a journalist protecting a source with a straight face.

The toggle is the philosophy

My favorite detail in the product is the save-history toggle in settings, because it makes the default visible: it ships OFF. Convenience features that require storage exist - a searchable transcript archive is genuinely useful - but you walk in, see the price of admission, and decide. Defaults are where a company's real values live. Anyone can offer a setting; what they preselect tells you who they are.

The architecture is aligned with GDPR and CCPA, and the full details live in our privacy policy - written to be read, not to be survived.

Where this goes

TalkScribe will grow - more languages, deeper AI insights, and team features are all on the roadmap. The privacy architecture is the part that won't move. Every feature we add gets designed around ephemerality first, with storage only by explicit choice. If a feature can't be built that way, we don't ship it.

If that matches how you think about your words, the free trial runs in your browser - no signup, no card, and true to form, we never see your audio.

See the privacy-first engine yourself

30 minutes free. No signup, no credit card - the trial runs in your browser.