CSE DSI Machine Learning with Neil Gong (EE, Duke University)

Secure Content Moderation for Generative AI

Generative AI–such as GPT-4 and DALL-E 3–raises many ethical and legal concerns such as the generation of harmful content, scaling disinformation and misinformation campaigns, as well as disrupting education and learning. Content moderation for generative AI aims to address these ethical and legal concerns via 1) preventing a generative AI model from synthesizing harmful content, and 2) detecting AI-generated content. Prevention is often implemented using safety filters, while detection is implemented by watermark. Both prevention and watermark-based detection have been recently widely deployed by industry. In this talk, we will discuss the security of existing prevention and watermark-based detection methods in adversarial settings.

Neil Gong is an Assistant Professor in the Department of Electrical and Computer Engineering and Department of Computer Science (secondary appointment) at Duke University. His research interests are cybersecurity and privacy with a recent focus on AI security. He received an NSF CAREER Award, Army Research Office Young Investigator Program (YIP) Award, Rising Star Award from the Association of Chinese Scholars in Computing, IBM Faculty Award, Facebook Research Award, and multiple best paper or best paper honorable mention awards. He received a B.E. from the University of Science and Technology of China in 2010 (with the highest honor) and a Ph.D in Computer Science from the University of California Berkeley in 2015.

CSE DSI Machine Learning with Neil Gong (EE, Duke University)

Secure Content Moderation for Generative AI

Share