One of the more unexpected products to launch out of the Microsoft Ignite 2023 event is a tool that can create a photorealistic avatar of a person and animate that avatar saying things that the person didn’t necessarily say.
Called Azure AI Speech text to speech avatar, the new feature, available in public preview as of today, lets users generate videos of an avatar speaking by uploading images of a person they want the avatar to resemble and writing a script. Microsoft’s tool trains a model to drive the animation, while a separate text-to-speech model (either prebuilt or trained on the person’s voice) “reads” the script aloud.
“With text to speech avatar, users can more efficiently create video … to build training videos, product introductions, customer testimonials [and so on] simply with text input,” writes Microsoft in a blog post. “You can use the avatar to build conversational agents, virtual assistants, chatbots and more.”
Avatars can speak in multiple languages. And, for chatbot scenarios, they can tap AI models like OpenAI’s GPT-3.5 to respond to off-script questions from customers.
Now, there are countless ways such a tool could be abused, which Microsoft, to its credit, realizes. (Similar avatar-generating tech from AI startup Synthesia has been misused to produce propaganda in Venezuela and false news reports promoted by pro-China social media accounts.) Most Azure subscribers will only be able to access prebuilt, not custom, avatars at launch; custom avatars are currently a “limited access” capability available by registration only and “only for certain use cases,” Microsoft says.
But the feature raises a host of uncomfortable ethical questions.
One of the major sticking points in the recent SAG-AFTRA strike was the use of AI to create digital likenesses. Studios ultimately agreed to pay actors for their AI-generated likenesses. But what about Microsoft and its customers?
I asked Microsoft for its position on companies using actors’ likenesses without, in the actors’ views, proper compensation or even notification. The company didn’t respond, nor did it say whether it would require that companies label avatars as AI-generated, as YouTube and a growing number of other platforms do.
Personal voice
Microsoft appears to have more guardrails around a related generative AI tool, personal voice, that’s also launching at Ignite.
Personal voice, a new capability within Microsoft’s custom neural voice service, can replicate a user’s voice in a few seconds when given a one-minute speech sample as an audio prompt. Microsoft pitches it as a way to create personalized voice assistants, dub content into different languages and generate bespoke narrations for stories, audiobooks and podcasts.
To ward off potential legal headaches, Microsoft is requiring that users give “explicit consent” in the form of a recorded statement before a customer can use personal voice to synthesize their voices. Access to the feature is gated behind a registration form for the time being, and customers must agree to use personal voice only in applications “where the voice does not read user-generated or open-ended content.”
“Voice model usage must remain within an application and output must not be publishable or shareable from the application,” Microsoft writes in a blog post. “[C]ustomers who meet limited access eligibility criteria maintain sole control over the creation of, access to and use of the voice models and their output [where it concerns] dubbing for films, TV, video and audio for entertainment scenarios only.”
Microsoft didn’t answer TechCrunch’s questions about how actors might be compensated for their personal voice contributions, or whether it plans to implement any sort of watermarking tech so that AI-generated voices might be more easily identified.
This story was originally published at 8am PT on Nov. 15 and updated at 3:30pm PT.