Yubo, the live social discovery app for Gen Z, has expanded its audio moderation technology for livestreams across four of its largest markets: the United States, the United Kingdom, Australia, and Canada. In partnership with Hive, Yubo introduced this technology in the United States at the end of May, becoming the first major social media platform in the world to tackle the challenges of real-time audio analysis.
While significant strides have been made in the advancement of real-time image and video moderation technology, audio moderation has remained an unsolved challenge. Roughly half of people who have reported experiencing harassment in online gaming – where livestreaming has historically been most prevalent – were targeted by voice, according to a report by the Anti-Defamation League.
Yubo has since expanded the trial phase to include all majority-English-speaking regions where it has large user bases, reaching a scale large enough to gather substantive insights into this breakthrough safety tool. Although still in its infancy, the technology has proven particularly effective at detecting potential real-world risk, such as violence toward others or self-harm.
Hive's audio moderation technology on Yubo currently works by recording and automatically transcribing 10-second snippets of audio in livestreams of 10 or more people. The text is then instantly scanned using artificial intelligence. Only transcripts containing words or phrases that violate the app's Community Guidelines are flagged for review by Yubo's Safety Specialists, who begin investigating the incidents in real time to determine what actions should be taken, including whether it is necessary to escalate to law enforcement. Transcripts with no suspected violations are neither reviewed nor kept.
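The flow described above can be pictured with a minimal Python sketch. It is an illustration only, not Hive's or Yubo's implementation: the `Snippet` type, the `triage` function, and the placeholder phrases are all hypothetical, and a plain substring match stands in for the AI-based scan, whose details are not public. Only the 10-second snippet length and the 10-person threshold come from the announcement.

```python
from __future__ import annotations

from dataclasses import dataclass

SNIPPET_SECONDS = 10   # snippet length stated in the announcement
MIN_PARTICIPANTS = 10  # livestream size threshold stated in the announcement

# Hypothetical placeholder phrases; the real guideline terms are not public.
BLOCKED_PHRASES = ("example threat", "example self-harm phrase")


@dataclass
class Snippet:
    """One transcribed audio snippet (assumed to cover SNIPPET_SECONDS of audio)."""
    stream_id: str
    participants: int
    transcript: str  # text assumed to come from an upstream speech-to-text step


def matched_phrases(transcript: str) -> list[str]:
    """Return the placeholder phrases found in the transcript (case-insensitive)."""
    text = transcript.lower()
    return [phrase for phrase in BLOCKED_PHRASES if phrase in text]


def triage(snippet: Snippet) -> str:
    """Decide what happens to a snippet under the assumptions above."""
    if snippet.participants < MIN_PARTICIPANTS:
        return "skipped"  # stream is too small to be scanned
    if matched_phrases(snippet.transcript):
        return "flagged_for_human_review"  # a Safety Specialist takes over
    return "discarded"  # no suspected violation, so no review
```

In this sketch the code never takes a moderation action itself: a flagged snippet is only routed to a human reviewer, mirroring the division of labor the announcement describes.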
The algorithms that power this audio moderation technology utilize machine learning and will therefore continue to improve and become more precise with time. To protect user privacy, livestream transcripts that have not been flagged for investigation are deleted after 24 hours. Transcripts that are flagged and require investigation internally or by law enforcement are stored for up to a year.
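The retention rule in this paragraph can be written as a small Python sketch. The function names are hypothetical, and treating "up to a year" as a fixed 365-day ceiling is an assumption made for illustration; the announcement does not say how the limit is enforced.

```python
from datetime import datetime, timedelta

UNFLAGGED_RETENTION = timedelta(hours=24)    # stated in the announcement
FLAGGED_MAX_RETENTION = timedelta(days=365)  # "up to a year"; 365 days is an assumption


def delete_after(created_at: datetime, flagged: bool) -> datetime:
    """Return the latest moment a transcript may be kept under the stated policy."""
    window = FLAGGED_MAX_RETENTION if flagged else UNFLAGGED_RETENTION
    return created_at + window


def should_delete(created_at: datetime, flagged: bool, now: datetime) -> bool:
    """True once the retention window for this transcript has elapsed."""
    return now >= delete_after(created_at, flagged)
```

A scheduled cleanup job could call `should_delete` for each stored transcript, so that unflagged transcripts disappear within a day while flagged ones remain available for an ongoing investigation.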
"Our expansion of audio moderation technology is not only a key element of Yubo's ever-evolving safety product roadmap, but a critical development in expanding the parameters of online safety industry-wide," said Yubo Chief Operating Officer Marc-Antoine Durand. "There is still a lot of progress to be made in the area of voice detection, but we are proud to be forging a path for our peers by being the first to launch audio moderation with Hive and helping make this tool more reliable and effective through this trial."
Yubo first deployed audio moderation technology this summer to a small cross-section of users in the US. With greater scale across four primary markets, audio moderation keywords now flag an average of 600 livestreams per day for review by Safety Specialists. Still, a significant share of these are "false positives": instances where, for example, a song playing in the background or playful language containing triggering keywords is flagged for review but is not actually harmful speech. False positives highlight not just the complexity of effective online content moderation, but also the importance of combining technical tools with human oversight for nuance and context. That's why, at Yubo, human moderators always have the final say on what moderation action to take and continuously supervise the detection algorithms.
"Effective content moderation is a responsibility that we know firsthand is of the utmost importance to Yubo. We are excited to be powering its ecosystem with our technology and learning together how to make online communities safer through continuous innovation," said Hive Chief Executive Officer Kevin Guo.