Upcoming Events
-

Weird AI Wednesday: What the Luddites Can Teach Us About Societal Response to AI
Wednesday, November 19, 6:00 PM - 9:00 PM
The Luddites were an 1800s-era movement of textile workers who smashed weaving machines because they feared being replaced.
Jason Yung revisits the Luddites as a case study in societal resistance to technological disruption and asks: what lessons do they offer for today’s adaptation to AI, especially under scenarios of large-scale displacement and economic restructuring?
-

AI Safety Thursday: Introduction to Corrigibility
Thursday, November 20, 6:00 PM - 9:00 PM
Rubi Hudson will discuss "corrigibility", the property of an AI being willing to accept updates to its goals. He will cover why it matters for AI safety, the challenges to achieving it, and some promising new work on the subject.
-

AI Policy Tuesday: Predicting Shifts in AI-driven Security Risks
Tuesday, November 25, 6:00 PM - 9:00 PM
Will AI-driven security risks be more severe before the development of transformative AI, or after?
Wim Howson Creutzberg will give an overview of current research on how the severity and nature of risks stemming from the development of advanced AI are expected to change over time, drawing centrally on “The Artificial General Intelligence Race and International Security".
-

AI Safety Thursday: Sandbagging - How Models Use Reward-Hacking to Downplay Their True Capabilities
Thursday, November 27, 6:00 PM - 9:00 PM
Robert Adragna will report the results of his research on the growing ability and willingness of models to "sandbag" - that is, to deliberately understate their capabilities during training via reward-hacking.
Past Events
-
AI Safety Thursdays: Avoiding Gradual Disempowerment
Thursday, July 3rd, 6pm-8pm
This talk explored the concept of gradual disempowerment as an alternative to the abrupt takeover scenarios often discussed in AI safety. Dr David Duvenaud examined how even incremental improvements in AI capabilities can erode human influence over critical societal systems, including the economy, culture, and governance.
-
AI Policy Tuesdays: Agent Governance
Tuesday, June 24th.
Kathrin Gardhouse presented on the nascent field of Agent Governance, drawing from a recent report by IAPS.
The presentation covered current agent capabilities, expected developments, governance challenges, and proposed solutions.
-
Hackathon: Apart x Martian Mechanistic Router Interpretability Hackathon
Friday, May 30 - Sunday, June 1.
We hosted a jamsite for Apart Research and Martian's hackathon.
-
AI Safety Thursdays: Advanced AI's Impact on Power and Society
Thursday, May 29th, 6pm-8pm
Historically, significant technological shifts have often coincided with political instability, and sometimes violent transfers of power. Should we expect AI to follow this pattern, or are there reasons to hope for a smooth transition to the post-AI world?
Anson Ho drew upon economic models, broad historical trends, and recent developments in deep learning to guide us through an exploration of this question.
-
AI + Human Flourishing: Policy Levers for AI Governance
Sunday, May 4, 2025.
Considerations of AI governance are increasingly urgent as powerful models become more capable and widely deployed. Kathrin Gardhouse delivered a presentation on the mechanisms available to govern AI, from policy levers to technical AI governance, serving as a high-level introduction to the lay of the land in AI policy.
-
AI Safety Thursday: "AI-2027"
Thursday April 24th, 2025.
On April 3rd, a team of AI experts and superforecasters at The AI Futures Project published a narrative called AI-2027, outlining a possible scenario of explosive AI development and takeover occurring over the coming two years.
Mario Gibney guided us through a presentation and discussion of the scenario, exploring how likely it is to track reality in the coming years.

