Category Archives: AI safety

Is there a tradeoff between immediate and longer-term AI safety efforts?

Something I often hear from the machine learning community and in media articles is "Worries about superintelligence are a distraction from the *real* problem X that we are facing today with AI" (where X = algorithmic bias, technological unemployment, interpretability, data privacy, etc.). This competitive attitude gives the impression that immediate and longer-term safety concerns are in conflict. But is there actually a tradeoff between them?


We can make this question more specific: what resources might these two types of efforts be competing for?

Continue reading


2017-18 New Year review

2017 progress


FLI / other AI safety:

Continue reading

NIPS 2017 Report


This year’s NIPS gave me a general sense that near-term AI safety is now mainstream and long-term safety is slowly going mainstream. On the near-term side, I particularly enjoyed Kate Crawford’s keynote on neglected problems in AI fairness, the ML security workshops, and the Interpretable ML symposium debate that addressed the “do we even need interpretability?” question in a somewhat sloppy but entertaining way. There was a lot of great content on the long-term side, including several oral / spotlight presentations and the Aligned AI workshop.

Continue reading

Tokyo AI & Society Symposium

I just spent a week in Japan to speak at the inaugural symposium on AI & Society – my first conference in Asia. It was inspiring to take part in an increasingly global conversation about AI impacts, and interesting to see how the Japanese AI community thinks about these issues. Overall, Japanese researchers seemed more open to discussing controversial topics like human-level AI and consciousness than their Western counterparts. Most attendees were more interested in near-term AI ethics concerns, but were also curious about long-term problems.

The talks were a mix of English and Japanese with translation available over audio (high quality but still hard to follow when the slides are in Japanese). Here are some tidbits from my favorite talks and sessions.

Continue reading

Portfolio approach to AI safety research

Long-term AI safety is an inherently speculative research area, aiming to ensure the safety of advanced future systems despite uncertainty about their design, algorithms, or objectives. It thus seems particularly important to have different research teams tackle the problems from different perspectives and under different assumptions. While some fraction of the research might not end up being useful, a portfolio approach makes it more likely that at least some of us will be right.

In this post, I look at some dimensions along which assumptions differ, and identify some reasonable but underexplored assumptions that might be relevant for prioritizing safety research. (In the interest of making this breakdown as comprehensive and useful as possible, please let me know if I got something wrong or missed anything important.)

Continue reading

Highlights from the ICLR conference: food, ships, and ML security

It’s been an eventful few days at ICLR in the coastal town of Toulon in Southern France, after a pleasant train ride from London with a stopover in Paris for some sightseeing. There was more food than is usually provided at conferences, and I ended up almost entirely subsisting on tasty appetizers. The parties were memorable this year, including one in a vineyard and one in a naval museum. The overall theme of the conference setting could be summarized as “finger food and ships”.


There were a lot of interesting papers this year, especially on machine learning security, which will be the focus of this post. (Here is a great overview of the topic.)

Continue reading

2016-17 New Year review

2016 progress

Research / career:

  • Got a job at DeepMind as a research scientist in AI safety.
  • Presented MiniSPN paper at ICLR workshop.
  • Finished RNN interpretability paper and presented at ICML and NIPS workshops.
  • Attended the Deep Learning Summer School.
  • Finished and defended PhD thesis.
  • Moved to London and started working at DeepMind.


AI safety outreach:

  • Talk and panel (moderator) at Effective Altruism Global X Boston
  • Talk and panel at the Governance of Emerging Technologies conference at ASU
  • Talk and panel at Brain Bar Budapest
  • AI safety session at OpenAI unconference
  • Talk and panel at Effective Altruism Global X Oxford
  • Talk and panel at Cambridge Catastrophic Risk Conference run by CSER

Continue reading