LingoFlow - Real-Time Voice Translation, Live Interpretation, and Audio Transcription

Q: What's the difference between Conversation Mode, Realtime Interpretation, and Realtime Caption Translation?

Conversation Mode is two-way voice interpretation for 1:1 talks. Realtime Interpretation is one-way voice interpretation for meetings and presentations. Realtime Caption Translation shows original and translated text side-by-side with no voice output, ideal for screen-sharing.

Q: What languages are supported?

Voice output for interpretation: Japanese, English, Chinese. Realtime Caption Translation: input Japanese, English, Vietnamese, Chinese / translated text Japanese, English, Chinese. Audio transcription: input and output in Japanese, English, Vietnamese, Chinese.

Four ways to use LingoFlow

Speak with someone, listen to a talk, project captions on a screen, or upload a file. Pick the mode that fits.

Conversation Mode (two-way voice)

Switch between speakers in a 1:1 talk and let LingoFlow voice both sides. Voice output: Japanese, English, Chinese.

Realtime Interpretation (one-way voice)

Live interpretation that reads back the translated speech as the speaker continues. Voice output: Japanese, English, Chinese.

Realtime Caption Translation

Live captions and translation shown side-by-side, ideal for screen-sharing and projecting on a meeting room display. Recognized input: Japanese, English, Vietnamese, Chinese. Translated text: Japanese, English, Chinese.

Audio transcription (upload)

Upload .m4a, .mp3, .wav, .mp4, .mpeg, .mpga, or .webm files and get a transcript, translation, summary, and Markdown export. Input and output are supported in Japanese, English, Vietnamese, and Chinese.

How your data is handled

What we actually do with audio and text — written from the implementation, not as marketing.

Not used to train AI

We do not use your audio, transcripts, translations, or summaries to train LingoFlow-owned AI models.

Explicit consent before recording

An audio-processing consent screen is shown before any session. Please confirm with the people in the conversation before you start.

User-deletable history

You can delete history entries, recorded transcription jobs, and related recording assets from the app where implemented.

See Security · Data Management · Subprocessors for details.

How it works

From speech to reusable notes in a few steps.

1. Start live or upload audio

Use the microphone for a live conversation, or upload a recording from your device.

2. Review transcript and translation

LingoFlow creates the original transcript and translated text with language settings you control.

3. Save the outcome

Generate a summary, reopen results from history, and export Markdown for follow-up work.

Use cases

Built for repeated communication work, not one-off demos.

International meetings

Follow discussions across Japanese, English, Vietnamese, and Chinese, then keep a searchable record.

Interviews and field notes

Record on your phone, upload later, and turn the recording into structured text and summaries.

Lectures and internal training

Convert long recordings into text that is easier to review, translate, and share with teammates.

Our Products

Tools that help global teams communicate without friction.

Voice Interpretation

Real-time voice translation that speaks for you in meetings, 1:1 conversations, and live events.

Realtime Caption Translation

Live captions and translation, side-by-side, for meetings and presentations.

Meeting Translation

Real-time translation for in-person international meetings on a single smartphone. Reduce misunderstandings and align faster on decisions.

Audio Transcription

Upload recordings and get transcripts, translations, summaries, and Markdown notes.

Example scenarios

The same conversation often calls for a different mode depending on who is in the room.

1:1 talks with overseas teammates

Use Voice Interpretation in Conversation Mode to switch between speakers. Each side hears the other's words spoken in their own language, so people can keep eye contact instead of reading a screen.

International conferences and internal training

Run Realtime Interpretation for the spoken translation, and project Realtime Caption Translation on a separate screen — covering both listeners and readers in the same session.

Field notes, interviews, and recording archives

Upload phone-recorded audio to Audio Transcription for transcript, translation, summary, and Markdown export in one pass. History entries can be deleted at any time.

Pricing

Simple monthly plans. You're charged immediately at checkout and your subscription auto-renews monthly.

Free

¥0/month

Up to 10 minute-equivalent credits/month

7-day retention

Pro

¥980/month (tax included)

Up to 120 minute-equivalent credits/month (no daily limit)

30-day retention

Ticket top-up: Pro plan only

Quick usage guide

Realtime features and recorded transcription draw on separate usage pools. With the Pro plan, the realtime pool covers about 120 minutes of Realtime Caption Translation or about 60 minutes of voice interpretation per month, and the recorded pool covers about 480 minutes of recorded transcription per month.

Mode	How usage is counted	Approx. monthly usage with Pro plan
Realtime Caption Translation	Each 1 minute used consumes 1 minute from the realtime pool	About 120 min
Voice Interpretation / Conversation Mode	Each 1 minute used consumes 2 minutes from the realtime pool	About 60 min
Recorded Transcription	Each 1 minute of uploaded audio consumes 1 recorded transcription minute	About 480 min

Usage times are estimates. Actual usage may vary depending on audio length, processing type, and whether translation or summarization is enabled.

Quotas are tracked in minute-equivalent credits. Realtime Caption Translation uses 1 credit per minute, while voice interpretation and voice-output translation use 2 credits per minute. So the Pro plan's 120 minute-equivalent credits can be spent, for example, as 120 minutes of caption translation, 60 minutes of voice interpretation, or any mix in between. See Pricing for full plan details.
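
To make the credit arithmetic concrete, here is a minimal Python sketch of how a mix of realtime usage adds up. The rates (1 and 2 credits per minute) and the monthly totals come from the plan description above; the names `CREDITS_PER_MINUTE`, `credits_used`, and `remaining_minutes` are illustrative assumptions and are not part of any LingoFlow API. Recorded transcription is left out because it draws on a separate pool.

```python
# Credit rates per minute of realtime use, as stated in the pricing text.
# The dictionary keys are hypothetical labels chosen for this sketch.
CREDITS_PER_MINUTE = {
    "caption_translation": 1,   # Realtime Caption Translation
    "voice_interpretation": 2,  # Realtime Interpretation / Conversation Mode
}

FREE_MONTHLY_CREDITS = 10
PRO_MONTHLY_CREDITS = 120


def credits_used(minutes_by_mode: dict[str, float]) -> float:
    """Total minute-equivalent credits consumed by a mix of realtime usage."""
    return sum(
        CREDITS_PER_MINUTE[mode] * minutes
        for mode, minutes in minutes_by_mode.items()
    )


def remaining_minutes(
    monthly_credits: float, minutes_by_mode: dict[str, float], mode: str
) -> float:
    """Minutes of `mode` still available after the given usage."""
    credits_left = max(monthly_credits - credits_used(minutes_by_mode), 0)
    return credits_left / CREDITS_PER_MINUTE[mode]


# Example: 40 minutes of captions plus 30 minutes of voice interpretation.
usage = {"caption_translation": 40, "voice_interpretation": 30}
print(credits_used(usage))  # 40 * 1 + 30 * 2 = 100
print(remaining_minutes(PRO_MONTHLY_CREDITS, usage, "voice_interpretation"))  # 10.0
```

In this example, 100 of the 120 Pro credits are spent, which leaves room for 10 more minutes of voice interpretation or 20 more minutes of caption translation.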

Frequently Asked Questions

What's the difference between Conversation Mode, Realtime Interpretation, and Realtime Caption Translation?

Conversation Mode is two-way voice interpretation. Switch between speakers and the app reads each side's speech back in the other side's language. Use it for 1:1 talks and face-to-face meetings.

Realtime Interpretation is one-way voice interpretation. The app listens to a presenter or meeting and produces a translated voice in your target language while speech continues.

Realtime Caption Translation shows the original transcript and translation side-by-side as text — no voice output. It's intended for screen-sharing and projecting captions for an audience.

What languages are supported?

Voice output for interpretation: Japanese, English, Chinese.

Realtime Caption Translation: input Japanese, English, Vietnamese, Chinese / translated text Japanese, English, Chinese.

Audio transcription (upload): input and output in Japanese, English, Vietnamese, Chinese.
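
The language lists above can be read as a small lookup table. The Python sketch below is one way to check a language pair before starting a session; the mode labels and the `is_supported` helper are hypothetical names for illustration. The page lists only voice output languages for the interpretation modes, so this sketch checks only the target language there and makes no claim about voice input.

```python
# Language sets copied from the answer above.
VOICE_OUTPUT = {"Japanese", "English", "Chinese"}
CAPTION_INPUT = {"Japanese", "English", "Vietnamese", "Chinese"}
CAPTION_OUTPUT = {"Japanese", "English", "Chinese"}
TRANSCRIPTION = {"Japanese", "English", "Vietnamese", "Chinese"}


def is_supported(mode: str, source: str, target: str) -> bool:
    """Return True if the pair fits the language lists for the given mode.

    `mode` is a hypothetical label: "voice", "caption", or "transcription".
    """
    if mode == "voice":
        # Only output languages are documented, so only the target is checked.
        return target in VOICE_OUTPUT
    if mode == "caption":
        return source in CAPTION_INPUT and target in CAPTION_OUTPUT
    if mode == "transcription":
        return source in TRANSCRIPTION and target in TRANSCRIPTION
    raise ValueError(f"unknown mode: {mode}")


print(is_supported("caption", "Vietnamese", "English"))        # True
print(is_supported("caption", "English", "Vietnamese"))        # False
print(is_supported("transcription", "English", "Vietnamese"))  # True
```

The second call returns False because Vietnamese is listed as a recognized caption input but not as a translated caption output.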

Are recordings or transcripts used to train AI?

No. We do not use your audio, transcripts, translations, or summaries to train LingoFlow-owned AI models. See Security for details.

Can I delete my history?

Yes. History entries, recorded transcription jobs, and related recording assets can be deleted from the app. See Data Management for the deletion scope.

What does the Pro plan's "120 minute-equivalent credits" actually mean?

Quotas are tracked in minute-equivalent credits. Realtime Caption Translation uses 1 credit per minute, while voice interpretation and voice-output translation use 2 credits per minute. So 120 credits can become 120 minutes of caption translation, 60 minutes of voice interpretation, or any mix of the two.

Can I use LingoFlow without installing an app?

Yes. LingoFlow runs in a modern browser. You can also install it as a PWA for an app-like experience.

Which audio files can I upload?

Audio transcription supports .m4a, .mp3, .wav, .mp4, .mpeg, .mpga, and .webm files. The Free plan accepts up to 10 minutes or 25 MB per file; the Pro plan accepts up to 90 minutes or 100 MB per file. On Pro, files over 25 MB are split automatically for processing.
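
If you want to check a recording before uploading, the limits above can be expressed as a short pre-flight check. This Python sketch is illustrative only: `check_upload` and the `LIMITS` table are hypothetical names, it assumes a file must satisfy both the duration and the size limit, and it assumes 1 MB means 1024 × 1024 bytes, which the page does not specify.

```python
from pathlib import Path

# File types listed in the answer above.
ALLOWED_EXTENSIONS = {".m4a", ".mp3", ".wav", ".mp4", ".mpeg", ".mpga", ".webm"}

# Assumption: the page does not say whether MB means 10^6 or 2^20 bytes.
MB = 1024 * 1024

# Per-file limits by plan, as stated above.
LIMITS = {
    "free": {"max_minutes": 10, "max_bytes": 25 * MB},
    "pro": {"max_minutes": 90, "max_bytes": 100 * MB},
}


def check_upload(
    plan: str, filename: str, size_bytes: int, duration_minutes: float
) -> list[str]:
    """Return a list of problems; an empty list means the file fits the plan."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file type")
    limits = LIMITS[plan]
    # Assumption: exceeding either limit is enough to reject the file.
    if size_bytes > limits["max_bytes"]:
        problems.append("file too large")
    if duration_minutes > limits["max_minutes"]:
        problems.append("audio too long")
    return problems


# An 8-minute, 30 MB interview: too large for Free, fine on Pro.
print(check_upload("free", "interview.m4a", 30 * MB, 8))  # ['file too large']
print(check_upload("pro", "interview.m4a", 30 * MB, 8))   # []
```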

Refund & Cancellation

Refunds: As a rule, we do not offer refunds.

Cancellation: You can cancel anytime via the Stripe Customer Portal.

Access after cancellation depends on your plan's renewal date (see Terms of Service for details).

Related links: Privacy Policy · Terms of Service · Contact

Start translating in seconds

Free to start. No credit card required.
10 minute-equivalent credits per month on the Free plan.

Real-time voice translation that speaks for you, and live captions that translate as people talk.
