Gemini gives Google Docs a voice with new ‘audio’ feature

Google has rolled out a feature that lets your documents speak back to you. Starting this week, Google Docs users can have their content read aloud, thanks to the integration of Gemini’s new “Audio” text-to-speech technology.

The feature, announced through a low-key update on Google’s Workspace blog, is already being hailed as one of the most user-focused AI enhancements to the platform in years.

The feature, called “Listen to this tab,” is nestled under the “Tools” menu.

With just a click, an embedded mini audio player appears at the top of your document. Users can press play and hear their words read back in a smooth, AI-generated voice, ideal for multitasking, editing, or simply absorbing information in a different format.

For creators and educators who want their readers to listen instead of read, a second option, found under Insert and then Audio, lets users place custom play buttons directly in the document.

These buttons can be resized, styled, and labelled, turning even a plain text doc into a dynamic, audio-friendly experience.

Google is reimagining Docs as more than just a writing tool: it is now a listening tool too.

Students can review notes while commuting or doing other tasks, and writers can hear how their articles sound when read aloud.

Professionals can turn internal memos into bite-sized briefings. Most importantly, for users with visual or cognitive disabilities, the accessibility benefits are immediate and meaningful.

The new functionality is backed by Gemini, Google’s flagship AI platform. And unlike traditional robotic narration, Gemini’s voices are context-aware and customisable.

Users can select from six voice styles (Narrator, Educator, Coach, Persuader, Explainer, and Motivator), giving the spoken word a touch of personality.

The rollout is already underway. As of August 18, the “Audio” feature is available to users on Rapid Release domains, with Scheduled Release users getting access from August 25.

The feature is being made available across multiple Google Workspace tiers, including Business Standard and Plus; Enterprise Standard and Plus; Education Plus and Gemini for Education; and Google AI Pro and Ultra.

Notably, those on discontinued Gemini Business plans will retain access.

However, not everyone is convinced. Privacy advocates have raised questions about the implications of documents being turned into streamed audio.

Additionally, early users report occasional hiccups: mispronunciations of names, uneven pacing, or awkward emphasis.

Google has promised improvements, but like most AI features, the system will likely rely on user feedback to get smarter.
