robgehring's comments

robgehring · 2026-03-10T15:51:55

Hi HN! I built AI Document Translator for Slack because I got fed up with translators that wreck document layout. Google Translate / ChatGPT / similar tools return plain text, leaving you to reflow fonts, tables, and slides, which often takes longer than the translation itself.

What we did differently:

- Preserve layout: the service keeps the original structure (fonts, tables, slides, sections) so you don't need to fix formatting after translation.

- Context-aware translation: instead of translating sentence-by-sentence, we use LLMs to translate large context chunks (paragraphs/sections). That gives more natural phrasing and consistent terminology across the whole file.

- Slack-first: not just a web UI, but a Slack integration so teams can translate files where they already work.

Why it matters:

- Saves time: less manual layout cleanup.

- Better consistency: one term = one translation across the doc.

- Practical for real docs: marketing decks, contracts, manuals, not just chat snippets.

I'd be happy to hear your feedback!

robgehring · 2025-09-09T19:36:17

Hi HN,

I built SpeechText.AI for research interviews. It is a transcription service aimed at researchers and students who record interviews and focus groups.

What it does:

1. Upload audio → get a transcript in minutes

2. Separates speakers and adds timecodes

3. Editable transcript in a browser editor

4. Export to DOCX, TXT, SRT, VTT (works with NVivo, Atlas.ti, etc.)

What is different:

- Hosted in the EU with GDPR compliance (many research institutions require this)

- Focus on qualitative research audio (not general dictation or meeting notes)

- Users keep control over their data: you can delete files permanently

- Custom domain-trained models optimized for research terminology and noisy field audio

Happy to answer any questions or hear thoughts on how to make this more useful.

robgehring · on Nov 26, 2020

Could you suggest a better and cheaper alternative?

robgehring · on Nov 23, 2020

Hello! We are happy to release a speech recognition API service for developers. The SpeechText.AI API automatically transcribes speech to text and summarizes audio data with high accuracy in multiple languages. SpeechText.AI uses a combination of speech recognition and natural language processing models to auto-summarize your recordings and highlight key moments in the discussion. Its domain-specific speech recognition technology lets users improve the accuracy of automatic transcription for industries such as finance, healthcare, legal, IT, HR, and others. The API can recognize multiple speakers and add word-by-word timestamps, punctuation, and casing to transcription results. SpeechText.AI supports almost all common media file formats and can transcribe audio/video files stored on your hard drive or files accessible over public URLs (HTTP, FTP, Google Drive, Dropbox, etc.).

robgehring · on Oct 9, 2020

We are happy to release new transcription software for journalists. The interview transcription service supports 30+ languages and is powered by domain-optimized speech recognition models. These models were trained on domain-specific language data to better understand domain-specific terminology. Speech recognition accuracy for different domains is up to 96%, depending on audio quality.

robgehring · on Sept 16, 2020

We've recently released a new transcription service for podcasters. We trained our speech recognition model on 40,000+ hours of human-transcribed podcasts, which helps us achieve up to 97% transcription accuracy (depending on audio quality). You can check our free trial plan to see how the new deep learning model works for podcasts. Select the audio type as 'Podcast' and your files will be accurately converted to text in just a few minutes.

jclos · on Sept 16, 2020

This looks nice, and I might give it a try to transcribe the live Q&A sessions I am doing for the course I teach. My question is how well does it handle accents? My slight French accent often trips the Cortana-powered transcription that is integrated in PowerPoint, but I assume your models are a bit more complex than those.

robgehring · on Sept 16, 2020

We offer a Global English model for the en-US language. Training data covering different accents was combined into a single model, so it should work with different speaker accents (about 90% accuracy). But of course it depends on audio quality. For very noisy speech the accuracy may be lower than expected.

robgehring · on Aug 28, 2020

Hey everyone! I’m very excited to introduce our new product, SpeechText.AI.

The SpeechText.AI transcription service can accurately transcribe conference calls, interviews, podcasts, lectures, and meeting recordings in more than 30 different languages and dialects. Our award-winning speech recognition technology achieves a word error rate of 3.8% on the open source LibriSpeech dataset (~1000 hours of clear English speech). SpeechText.AI's speech recognition technology is now almost as accurate as human transcriptionists. Please feel free to create a trial account at https://speechtext.ai to see how it works. We look forward to your feedback and questions about SpeechText.AI in the comments.
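For readers unfamiliar with the metric, word error rate (WER) is usually defined as the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The snippet below is a minimal sketch of that standard definition, not SpeechText.AI's evaluation code; the function name `word_error_rate` and the simple lowercase/whitespace normalization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 3))
```

Under this definition, a 3.8% WER corresponds to roughly 4 word-level errors per 100 reference words.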

loxias · on Aug 28, 2020

Exciting! Can't wait to play with it. Particularly interested in how someone w/o Google/Microsoft/Amazon resources for training models can produce better output.

Also, I might have missed it, but I couldn't find the list of supported languages.

hamsta · on Aug 30, 2020

Do you provide an API? If yes, you should add this information to your website. What are the max and min durations that you support? Do you charge per second/minute?