“Increasing demand is driving the boom in digital humans,” says Shiyan Li, head of Baidu’s robotics and digital humans business, which created the digital model actor, Gong. “In China alone, there are over 400 million fans of ACGN (animation, comics, games and novels) and a business market worth hundreds of billions of dollars focused on digital humans.” According to Qichacha, a company that tracks business records, China now has more than 280,000 businesses engaged in digital human-related activities.
Another type of digital human
Baidu’s digital celebrity debut may not seem like much at first, as the concept of “virtual idols” has been around for years. For example, American virtual influencer Lil Miquela has appeared alongside real human celebrities in online ads and TV commercials since 2016, gaining more than three million followers on Instagram. However, there is something different about the virtual Chinese star: a digital human with the ability to hear, speak and interact with real humans on a level never seen before. And Gong’s digital duties aren’t limited to singing. In the latest update of the Baidu app, China’s leading feed and search app, Gong appears on users’ phones, helping with searches and queries using the model actor’s real voice. Since this interactive search experience was launched in 2021, it has increased the number of voice search queries on the Baidu app by 18.2%.
Baidu AI Cloud started developing a digital employee in 2019 in collaboration with Shanghai Pudong Development (SPD) Bank. They subsequently focused their efforts on building a digital financial advisor to provide service equivalent to that of a human bank representative when real-life employees were not available. Today, SPD Bank says more than 460,000 customers rely on digital humans for banking services and portfolio management every month. “Access to digital humans outside of regular business hours allows SPD Bank to provide 24/7 customer service at low cost and high efficiency,” says a bank representative.
More recently, a virtual anchor created by Baidu provided live sign language commentary at the 2022 Beijing Winter Games for hearing-impaired viewers. In addition to looking like a real person, the avatar was equipped with voice recognition and sign language interpretation skills to ensure fast and highly accurate input and output. With approximately 430 million people worldwide suffering from “disabling” hearing loss, according to the World Health Organization, there is great potential for this technology to widen their access to a broad range of content.
XiLing: A Next-Generation AI Platform
From entertainment to public services, digital humans will play a larger role in our daily lives. But behind their natural and effortless appearance is a complex web of new and emerging technologies that push the boundaries of AI innovation.
Baidu AI Cloud’s digital celebrities and virtual sign language anchors were created through XiLing, a new digital platform launched in 2021. At the Baidu World 2022 event held on July 21, the company announced a new capability in XiLing, which supports the creation of digital humans that can host live broadcasts, sing, dance and respond to comments in real time, without needing a single break. XiLing is unique in its ability to support the entire process of creating a digital human, from creating a realistic persona to equipping it with conversational and content generation skills. One of its most striking attributes is speed. The platform can generate a 3D avatar based on a real person in a week or two, while a 2D avatar can be made in minutes.
Additionally, using XiLing’s intelligent dialogue tools, creators can quickly customize a digital human’s conversational ability, letting it adapt and learn over time. This capability is powered by Baidu’s PLATO, a one-hundred-billion-parameter dialogue model that enables digital humans to engage in open-domain conversations, meaning they can understand any topic and provide relevant responses. High-precision speech recognition and lip-syncing with over 98.5% accuracy enable the digital human to have smoother and more human-like interactions. “The use of advanced AI technologies will continue to reduce the cost of building digital humans and significantly improve their interactions with real humans,” says Li.
Just as each real human has their own set of skills and talents, so does each of the new generation of digital humans. This may even include giving digital humans the ability to be creative themselves, thanks to recent breakthroughs made by big AI models like Baidu’s ERNIE, which can generate text and create realistic images when asked. Digital humans designed to serve as brand spokespersons, for example, can independently create and post on social media, design posters, and act in videos.