Time: Monday, January 20, 2025
Location: Capital Suite 7
This tutorial provides a fresh and comprehensive view of AI safety, examining issues such as harmful content, deceptive and persuasive model behaviors, and the possibility of "model consciousness." We cover both theoretical and practical aspects of large language models (LLMs), multimodal systems, and agentic AI, illustrating how to identify and address vulnerabilities, from common jailbreak attacks to complex autonomous decision-making scenarios. Attendees will learn cutting-edge attack and defense techniques and explore emerging research directions that balance AI innovation with robust safety strategies. The tutorial is geared toward AI researchers, developers, and security professionals who aim to stay ahead of evolving threats and ensure the responsible development of next-generation AI systems.