YouTube Shorts adds AI avatars: a workflow for cloning creators’ faces and voices inside short-form video

This article was generated by AI and cites original sources.

YouTube Shorts is rolling out a new AI-powered feature that lets creators make a digital “avatar” of themselves—using their face and voice—and then insert that avatar into existing Shorts or generate entirely new clips from prompts. As reported by The Verge, the rollout is framed by YouTube as a “safer and more secure way” to use AI, while the company’s broader relationship with AI-generated content remains “fraught,” with ongoing struggles around AI-generated “slop,” deepfake scams, and impersonations (source: The Verge, Apr. 9, 2026).

What YouTube is shipping: an avatar that can be reused in Shorts

The core capability described in the source is an AI avatar system integrated into YouTube Shorts. YouTube says the tool will let users create a digital version of themselves, called an avatar, that can be inserted into existing Shorts videos or used to generate entirely new ones (source: The Verge, Apr. 9, 2026). In other words, the feature is not just a one-off “generate a clip” button; it is positioned as a creator-specific asset that can be reused during Shorts creation.

YouTube’s stated goal is that avatars will “look and sound like you” (source: The Verge). That phrasing matters technically because it implies a combination of facial and vocal capture, as well as generation that aims to preserve identity cues. The source notes that creating an avatar involves a guided recording process rather than a single click.

After an avatar is created, the workflow splits into two use cases:

Prompt-to-video generation: while creating a video, users can select “make a video with my avatar” to generate a clip from prompts. The clip length is described as “up to eight seconds long” (source: The Verge via 9to5Google).
Avatar insertion into existing content: users can add their avatar to “eligible Shorts” in their feed. The source says YouTube did not specify what makes a Short eligible (source: The Verge).

This distinction suggests YouTube is building both a generative pipeline (text/prompts to short clips) and a reuse/remixing mechanism (avatar integration into other Shorts), even if eligibility rules are not fully disclosed in the article.

The capture step: “live selfie” face and voice recording

According to the source, building an avatar is “a bit more involved” than pressing a button. YouTube’s blog post describes a requirement to record a “live selfie” capturing both the face and voice while following a series of prompts (source: The Verge, Apr. 9, 2026). That design choice indicates YouTube is attempting to standardize the input data used for avatar creation, because the quality of voice and facial capture can affect downstream synthesis quality.

The source lists specific recommendations YouTube provides for best results:

Good lighting
A quiet area
A background free of other people or images of faces
Holding the phone at eye level

While these are framed as usability guidance, the technical implication is that the avatar model likely depends on clean visual and audio inputs. The “background free of other people or images of faces” recommendation is particularly relevant because it points to the risk of confusing identity signals or contaminating training/conditioning data with additional faces.

From an engineering perspective, this resembles a controlled enrollment flow: YouTube asks creators to provide a consistent capture setup so the system can learn or condition on a specific face/voice representation. How strictly YouTube enforces the prompts and capture conditions is not stated. The source describes the process as “fairly straightforward” and presents the conditions as recommendations for best results, so observers may watch for whether poor recordings are actually rejected.
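The four recommendations listed above can be written down as a simple pre-capture checklist. The following is an illustrative sketch only: the `CaptureConditions` type, its fields, and the `unmet_recommendations` function are hypothetical names invented for this example, and nothing in the source describes how (or whether) YouTube checks these conditions in software.

```python
from dataclasses import dataclass


@dataclass
class CaptureConditions:
    """Hypothetical, self-reported conditions for a "live selfie" recording."""
    good_lighting: bool
    quiet_area: bool
    other_faces_in_background: bool
    phone_at_eye_level: bool


def unmet_recommendations(c: CaptureConditions) -> list[str]:
    """Return the recommendations from the article that this capture misses."""
    issues = []
    if not c.good_lighting:
        issues.append("good lighting")
    if not c.quiet_area:
        issues.append("a quiet area")
    if c.other_faces_in_background:
        issues.append("a background free of other people or images of faces")
    if not c.phone_at_eye_level:
        issues.append("holding the phone at eye level")
    return issues


# Example: a noisy room with a poster of a face on the wall.
print(unmet_recommendations(CaptureConditions(True, False, True, True)))
```

An empty list means all four recommendations are met. Whether a real enrollment flow would block, warn, or simply proceed when the list is non-empty is not something the source addresses.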

Guardrails: restrictions on where avatars can appear and who can remix

The source emphasizes that the avatar feature includes “fairly tight restrictions.” These constraints are central to how YouTube is trying to address the trust problems that have surrounded generative AI video—especially deepfakes and impersonations.

Per The Verge, avatars can only be used in the creator’s own original videos. The creator also controls whether their Shorts can be remixed. Additionally, the creator can delete their avatar or videos where it appears at any time (source: The Verge, Apr. 9, 2026).

There is also a time-based constraint mentioned at the end of the provided excerpt: avatars that aren’t used to create new content for three years will … (the source text cuts off before completing the rule) (source: The Verge, Apr. 9, 2026). Because the excerpt ends mid-sentence, the exact details of this retention/expiration policy are not fully available in the supplied material.

Even with that limitation, the described controls show a policy-and-product coupling: YouTube is tying avatar usage to creator ownership and adding user-controlled takedown options. This suggests YouTube is attempting to reduce the surface area for impersonation by limiting avatar application to the account that created it, rather than allowing broader reuse across the platform.
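To make the ownership constraint concrete, here is a minimal sketch of how the restrictions reported by The Verge could be expressed as a single permission check. It is an assumption-laden illustration, not YouTube's implementation: the `Avatar` and `Video` types, their fields, and `may_use_avatar` are all hypothetical, and the sketch does not model the undisclosed eligibility rules for avatar insertion or the incomplete three-year rule.

```python
from dataclasses import dataclass


@dataclass
class Avatar:
    owner_id: str          # hypothetical: account that recorded the avatar
    deleted: bool = False  # creators can delete their avatar at any time


@dataclass
class Video:
    creator_id: str        # hypothetical: account that made the video
    is_original: bool      # True if the video is the creator's own original


def may_use_avatar(avatar: Avatar, requester_id: str, video: Video) -> bool:
    """Allow use only by the avatar's owner, in the owner's own original video."""
    if avatar.deleted:
        return False
    if requester_id != avatar.owner_id:
        return False
    return video.creator_id == avatar.owner_id and video.is_original


alice_avatar = Avatar(owner_id="alice")
print(may_use_avatar(alice_avatar, "alice", Video("alice", True)))  # owner, own video
print(may_use_avatar(alice_avatar, "bob", Video("bob", True)))      # someone else
```

The point of the sketch is that every path to a positive answer runs through the owning account, which is the property that narrows the room for impersonation.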

At the same time, the source notes YouTube “did not specify what makes a Short eligible” for avatar insertion in the feed (source: The Verge). That uncertainty leaves room for interpretation: eligibility could be driven by factors like the original creator’s remix permissions, the content format, or other internal criteria. Since the article does not enumerate the rules, it is not possible to confirm the mechanism from the provided text.

Why this matters: AI avatars inside a platform already dealing with “AI slop” and scams

The feature lands in the context of what The Verge describes as YouTube’s “fraught relationship” with AI-generated content. The article points to ongoing struggles to contain AI-generated “slop,” deepfake scams, and impersonations, even as the platform adds more generative features (source: The Verge, Apr. 9, 2026).

In that environment, the avatar tool is a notable product direction: it operationalizes identity cloning in a way that is explicitly linked to creator-controlled capture and usage. YouTube frames avatars as “safer and more secure” for creating new content (source: The Verge). The technical and procedural details provided—live selfie enrollment, recommended capture conditions, creator-only usage, and deletion controls—are part of that framing.

At the industry level, this could signal a broader pattern for short-form video platforms: rather than treating generative AI as purely content-creation tooling, platforms may increasingly integrate AI identity workflows into their existing creator ecosystems. The source does not explicitly compare YouTube to competitors, but it does show YouTube building a repeatable pipeline for avatar creation and prompt-based clip generation with an eight-second maximum (source: The Verge via 9to5Google).

There are also product implications for how remixing and distribution work. The ability to add an avatar to “eligible Shorts” and the mention that creators control whether their Shorts can be remixed suggest that permissions and eligibility will become a key part of the user experience. However, because the source does not define eligibility criteria, any deeper conclusions about how broadly avatars can spread across the feed would be speculation beyond the provided material.

Finally, the three-year rule for unused avatars (incomplete in the excerpt) hints at lifecycle management for generated identity assets. If it is implemented as a time-bound policy, it could affect how creators plan long-term usage of avatars, and it could influence how platforms handle storage and risk over time. The supplied text does not finish the sentence, so the exact behavior remains unclear.

Source: The Verge