09/11/2025
💡 What if your AI could interrupt you naturally, just like a real conversation?
🔹 Train with Dataocean AI's 9,000-Hour Chinese Full-Duplex Corpus, powering the next generation of real-time, interruptible AI.
✅ 10,000 speakers across diverse scenarios
✅ Rich annotations: interruptions, overlaps, laughter, feedback cues
✅ Diverse scenarios: daily conversations, business meetings, AI assistants, new energy scenarios, and more
✅ High transcription accuracy: up to 97%
If you want your models to reach GPT Realtime-level fluency, this dataset is your starting point.
Explore the full story here:
Currently, most speech training datasets consist of continuous recordings with complete conversational turns, lacking the naturally occurring, hard-to-model