Developers can access two variants tailored to different use cases. Lyria 3 Pro focuses on full-length tracks with detailed structure, while Lyria 3 Clip is optimized for speed, producing shorter segments suited for rapid iteration or social content. Both models support multilingual vocals and a range of genres.
Both models add more granular controls, which users steer through natural-language prompts. The controls cover tempo, song structure such as intros and choruses, and the timing of lyrics. Lyria 3 also accepts image inputs to influence mood and style, extending beyond text prompts.
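To make the idea of prompt-level control concrete, here is a minimal Python sketch of how a developer might assemble such a prompt from structured fields (style, tempo, sections, lyric timing). Everything in it is an assumption for illustration: the `Section` and `MusicPrompt` names, the bracketed timestamp layout, and the wording of the rendered text are hypothetical, and Google's reporting does not specify any required prompt syntax. The sketch only builds a string and does not call any Google API.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One song section (hypothetical helper, not part of any Google SDK)."""
    name: str         # e.g. "intro", "chorus"
    start_s: int      # desired start time, in seconds from the beginning
    lyrics: str = ""  # optional lyric line to place in this section


@dataclass
class MusicPrompt:
    """Collects the controls the article mentions into one text prompt."""
    style: str
    tempo_bpm: int
    sections: list[Section] = field(default_factory=list)

    def render(self) -> str:
        # First line: overall style and tempo, in plain language.
        lines = [f"{self.style}, around {self.tempo_bpm} BPM."]
        # One line per section, with an assumed [m:ss] timestamp layout.
        for s in self.sections:
            minutes, seconds = divmod(s.start_s, 60)
            line = f"[{minutes}:{seconds:02d}] {s.name}"
            if s.lyrics:
                line += f', lyrics: "{s.lyrics}"'
            lines.append(line)
        return "\n".join(lines)


prompt = MusicPrompt(
    style="Upbeat synth-pop with female vocals",
    tempo_bpm=120,
    sections=[
        Section("intro", 0),
        Section("chorus", 75, "City lights are calling"),
    ],
)
text = prompt.render()
print(text)
```

Keeping the controls in structured fields and rendering them to text last makes it easy to change one parameter, such as tempo, between iterations without rewriting the whole prompt.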
Google is integrating the models across its broader ecosystem. Lyria 3 Pro is available in the Gemini app for paid users and is being added to products including Google Vids, Vertex AI, and AI Studio. It is also accessible through the Gemini API for developers building applications.
The company said the model was trained using partner data along with permitted data from YouTube and Google. It added that the system does not replicate specific artists, though prompts referencing artists may guide output by taking “broad inspiration.”
All generated tracks are marked with SynthID, a digital watermark designed to identify AI-created audio even after modification.
The release positions Lyria as part of Google’s broader effort to expand generative AI tools across developer and enterprise platforms, while addressing transparency concerns around AI-generated content.
This analysis is based on reporting from Google.
Image courtesy of Google.
This article was generated with AI assistance and reviewed for accuracy and quality.