111: Aligning our strategies on building and adopting a Safe and Secure Artificial Intelligence

Science, Knowledge, and Policy

Panel: 111

Organized by: IVADO / Université de Montréal

Panel Date: November 20, 2024

Speakers:

Dhanya Sridhar
Christopher Pal
Nicolas Papernot
Surdas Mohit
Marthe Kassouf

Abstract:
As artificial intelligence becomes integral across our economy, addressing both the incredible potential and inherent risks associated with its advancement and implementation is imperative. The panel will elucidate the challenges and strategies for developing safe and secure AI technologies that are innovative and beneficial to society at large. It will explore the current state of the art, potential pitfalls, and risk mitigation approaches. Discussions will focus on effective design, management, and deployment of AI systems, addressing unpredictability, interpretability, and the need for robust safety measures. Expert panelists will highlight research, government, and industry efforts to improve AI safety and alignment with societal values.

Summary of Conversations

The panel delved into AI safety, emphasizing the role of machine learning and associated risks such as harmful stereotypes and unsafe actions. The discussion highlighted the challenges of aligning AI with human values, noting that current alignment techniques are insufficient. Panellists explored the prevalence of hallucinations in language models and methods to improve factuality, such as retrieval-augmented generation. A crucial point was the necessity of formally defining trust assumptions among AI stakeholders, with panellists advocating the use of cryptography to ensure responsible AI use. The conversation touched on the need for context-specific AI regulation that balances innovation and risk mitigation, and on the significance of testing AI systems under various operational constraints to prevent unintended consequences.

Takeaway Messages/Current Status of Challenges

AI systems, especially large language models, are prone to generating harmful stereotypes and unsafe responses, necessitating robust alignment mechanisms.
Current AI alignment techniques are insufficient to fully address unanticipated behaviours and consequences in AI systems.
Trust assumptions among AI stakeholders must be formally defined to ensure stronger guarantees from deployed AI technologies.
AI models often hallucinate facts, which underscores the need for techniques such as retrieval-augmented generation to improve accuracy.
Achieving a balance between fostering AI innovation and mitigating risks through regulation remains a central challenge.
Testing AI systems across diverse operational constraints is crucial to uncovering vulnerabilities and potential failures.
AI systems risk being trained on their own incorrect outputs, which can lead to model collapse over time.
A collaborative approach involving academia, industry, and government is essential for framing the design and use of AI technologies.
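
As a rough illustration of the retrieval-augmented generation idea mentioned above, the sketch below ranks a few passages against a question and builds a prompt that restricts the answer to the retrieved text. It is a toy under stated assumptions, not a method described by the panel: the word-overlap scoring, the function names, and the mini-corpus are all hypothetical, and a real system would typically use a learned retriever and pass the prompt to a language model.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Count how many query tokens also occur in the passage."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return sum(min(count, p[token]) for token, count in q.items())


def retrieve(query, passages, k=2):
    """Return the k passages with the highest overlap score, best first."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble a prompt that asks a model to answer only from retrieved text."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical mini-corpus (assumption: built from this page's own details).
PASSAGES = [
    "The panel was organized by IVADO and Université de Montréal.",
    "The panel took place on November 20, 2024.",
    "Retrieval grounds a model's answer in documents fetched at query time.",
]

prompt = build_prompt("Who organized the panel?", PASSAGES, k=1)
print(prompt)
```

The point of the design is that the model is handed checkable source text at query time, so a reader can trace the answer back to a passage instead of relying on what the model memorized.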

Recommendations/Next Steps

Develop rigorous testing frameworks for AI systems, especially those acting as agents, to identify and mitigate potential risks.
Implement mechanisms for AI systems to provide feedback on their reasoning, enhancing transparency and trust for human operators.
Incorporate cryptography to enable companies to prove the responsible use of AI, promoting accountability and building trust with regulators and the public.
Establish clear, adaptable regulatory frameworks that address the specific risks associated with different AI use cases.
Foster international collaboration to coordinate AI regulation, ensuring consistent standards and preventing the pollution of the internet with incorrect information.
Promote the development and adoption of standards for risk assessment of AI systems to build trust and facilitate responsible innovation.
Encourage embedding metadata that indicates whether content is AI-generated and includes reliability estimates, to address the challenge of AI-generated misinformation.
Advance research into verifiable building blocks for AI systems, enabling more transparent and adaptable regulation.
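
The metadata and cryptography recommendations above can be made concrete with a small sketch: a record that carries an AI-generated flag and a reliability estimate, plus a keyed tag so that later edits to the label are detectable. This is only an illustration under assumptions. The field names, the key, and the choice of HMAC-SHA256 are hypothetical, and a shared-secret tag is a far simpler primitive than the verifiable guarantees the panel may have had in mind; a public deployment would more plausibly rely on public-key signatures.

```python
import hashlib
import hmac
import json


def label_content(text, ai_generated, reliability, key):
    """Attach provenance metadata to text, plus an HMAC-SHA256 tag over both."""
    record = {"text": text, "ai_generated": ai_generated, "reliability": reliability}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["tag"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record


def verify_label(record, key):
    """Recompute the tag from the record's fields and compare in constant time."""
    fields = {k: v for k, v in record.items() if k != "tag"}
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("tag", ""))


KEY = b"demo-key-not-for-production"  # hypothetical shared secret

labelled = label_content("Panel summary draft.", ai_generated=True, reliability=0.8, key=KEY)
print(verify_label(labelled, KEY))  # True: the label is intact

tampered = dict(labelled, ai_generated=False)  # someone flips the flag
print(verify_label(tampered, KEY))  # False: the tag no longer matches
```

Because the tag covers the text and the metadata together, neither the content nor its AI-generated flag can be changed without the check failing.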

* This summary has been generated with the assistance of AI tools