This year’s Neural Information Processing Systems conference, held in a huge convention center in Barcelona, Spain, was larger than ever, with almost 6000 attendees. The conference started off with two exciting announcements on open-sourcing collections of environments for training and testing general AI capabilities – the DeepMind Lab and the OpenAI Universe. Among other things, this is promising for testing safety properties of ML algorithms. OpenAI has already used their Universe environment to give an entertaining and instructive demonstration of reward hacking that illustrates the challenge of designing robust reward functions.
I was happy to see a lot of AI-safety-related content at NIPS this year. The ML and the Law symposium and Interpretable ML for Complex Systems workshop focused on near-term AI safety issues, while the Reliable ML in the Wild workshop also covered long-term problems. Here are some papers relevant to long-term AI safety:
Last weekend, I attended OpenAI’s self-organizing conference on machine learning (SOCML 2016), meta-organized by Ian Goodfellow (thanks Ian!). It was held at OpenAI’s new office, with several floors of large open spaces. The unconference format was intended to encourage people to present current ideas alongside completed work. The schedule mostly consisted of 2-hour blocks with broad topics like “reinforcement learning” and “generative models”, guided by volunteer moderators. I especially enjoyed the sessions on neuroscience and AI and transfer learning, which had smaller and more manageable groups than the crowded popular sessions, and diligent moderators who wrote down the important points on the whiteboard. Overall, I had more interesting conversations but also more auditory overload at SOCML than at other conferences.
To my excitement, there was a block for AI safety along with the other topics. The safety session became a broad introductory Q&A, moderated by Nate Soares, Jelena Luketina and me. Some topics that came up: value alignment, interpretability, adversarial examples, weaponization of AI.
AI safety discussion group (image courtesy of Been Kim)
I recently defended my PhD thesis, and a chapter of my life has now come to an end. It feels both exciting and a bit disorienting to be done with this phase of much stress and growth. My past self who started this five years ago, with a very vague idea of what she was getting into, was a rather different person from my current self.
I have developed various skills over these five years, both professionally and otherwise. I learned to read papers and explain them to others, to work on problems that take months rather than hours, and to be content with small bits of progress. I used to believe that I should be interested in everything, and gradually gave myself permission not to care about most topics so that I could focus on the things that actually interest me, developing some sense of discernment. In 2012 I was afraid to comment on the LessWrong forum because I might say something stupid and get downvoted – in 2013 I wrote my first post, and in 2014 I started this blog. I went through the Toastmasters program and learned to speak in front of groups, though I still feel nervous when speaking on technical topics, especially about my own work. I co-founded a group house and a nonprofit, both of which are still flourishing. I learned how to run events and lead organizations, starting with LessWrong meetups and the Harvard Toastmasters club, which were later displaced by running FLI.
A few weeks ago, Janos and I attended the Deep Learning Summer School at the University of Montreal. Various well-known researchers covered topics related to deep learning, from reinforcement learning to computational neuroscience (see the list of speakers with slides and videos). Here are a few ideas that I found interesting in the talks (this list is far from exhaustive):
You can do transfer learning in convolutional neural nets by freezing the parameters in some layers and retraining the others on a different domain for the same task (paper). For example, if you have a neural net for scene recognition trained on real images of bedrooms, you could reuse the same architecture to recognize drawings of bedrooms. The last few layers represent abstractions like “bed” or “lamp”, which apply to drawings just as well as to real images, while the first few layers represent textures, which differ between the two modalities. More generally, the last few layers are task-dependent but modality-independent, while the first few layers are modality-dependent but task-independent.
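The freezing pattern above can be sketched in a few lines. This is a minimal toy illustration, not the paper's actual code: the layer weights and gradients here are made-up scalars, and `Layer` and `sgd_step` are hypothetical helpers standing in for a real deep learning framework's parameters and optimizer.

```python
class Layer:
    """Stand-in for one network layer: a scalar weight and a trainable flag."""
    def __init__(self, weight):
        self.weight = weight
        self.trainable = True

def sgd_step(layers, grads, lr=0.1):
    """Update only trainable layers; frozen layers keep their weights."""
    for layer, grad in zip(layers, grads):
        if layer.trainable:
            layer.weight -= lr * grad

# Pretend this net was pretrained on real bedroom images:
# two early "texture" layers followed by one abstract last layer.
net = [Layer(1.0), Layer(2.0), Layer(3.0)]

# Freeze the early, modality-dependent layers before retraining on drawings...
for layer in net[:2]:
    layer.trainable = False

# ...then take a gradient step on the new domain (made-up gradients).
sgd_step(net, grads=[0.5, 0.5, 0.5])

print([layer.weight for layer in net])  # → [1.0, 2.0, 2.95]
```

In a real framework the same idea is one flag per parameter (e.g. marking early layers as non-trainable) so the optimizer skips them while the retained architecture and pretrained weights do the rest.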
There has been a lot of discussion about the appropriate level of openness in AI research in the past year – the OpenAI announcement, the blog post Should AI Be Open?, a response to the latter, and Nick Bostrom’s thorough paper Strategic Implications of Openness in AI Development.
There is disagreement on this question within the AI safety community as well as outside it. Many people are justifiably afraid of concentrating the power to create AGI, and to determine its values, in the hands of a single company or organization. Many others are concerned about the information hazards of open-sourcing AGI and the resulting potential for misuse. In this post, I argue that some sort of compromise between openness and secrecy will be necessary, as both extremes of complete secrecy and complete openness seem really bad. The good news is that there isn’t a single axis of openness vs secrecy – we can make separate judgment calls for different aspects of AGI development, and develop a set of guidelines.
Google Brain just released an inspiring research agenda, Concrete Problems in AI Safety, co-authored by researchers from OpenAI, Berkeley and Stanford. This document is a milestone in setting concrete research objectives for keeping reinforcement learning agents and other AI systems robust and beneficial. The problems studied are relevant both to near-term and long-term AI safety, from cleaning robots to higher-stakes applications. The paper takes an empirical approach, focusing on avoiding accidents as modern machine learning systems become increasingly autonomous and powerful.
Reinforcement learning is currently the most promising framework for building artificial agents – it is thus especially important to develop safety guidelines for this subfield of AI. The research agenda describes a comprehensive (though likely non-exhaustive) set of safety problems, corresponding to where things can go wrong when building AI systems:
- Finished paper on the Selective Bayesian Forest Classifier algorithm
- Made an R package for SBFC (beta)
- Worked at Google on unsupervised learning for the Knowledge Graph with Moshe Looks during the summer (paper)
- Joined the HIPS research group at Harvard CS and started working with the awesome Finale Doshi-Velez
- Ratio of coding time to writing time was too high overall
- Co-organized two meetings to brainstorm biotechnology risks
- Co-organized two Machine Learning Safety meetings
- Gave a talk at the Shaping Humanity’s Trajectory workshop at EA Global
- Helped organize NIPS symposium on societal impacts of AI
Rationality / effectiveness:
- Extensive use of FollowUpThen for sending reminders to future selves
- Mapped out my personal bottlenecks
- Tracked insomnia (26% of nights) and sleep time (average 1:30am, stayed up past 1am on 31% of nights)
- Started working on sleep hygiene
- Stopped using melatonin (found it ineffective)
This year’s NIPS was an epicenter of the current enthusiasm about AI and deep learning – there was a visceral sense of how quickly the field of machine learning is progressing, and two new AI startups were announced. Attendance has almost doubled compared to the 2014 conference (I hope they make it multi-track next year), and several popular workshops were standing room only. Given that there were only 400 accepted papers and almost 4000 people attending, most people were there to learn and socialize. The conference was a socially intense experience that reminded me a bit of Burning Man – the overall sense of excitement, the high density of spontaneous interesting conversations, the number of parallel events at any given time, and of course the accumulating exhaustion.
Some interesting talks and posters
Sergey Levine’s robotics demo at the crowded Deep Reinforcement Learning workshop (we showed up half an hour early to claim spots on the floor). This was one of the talks that gave me a sense of fast progress in the field. The presentation started with videos from this summer’s DARPA robotics challenge, where the robots kept falling down while trying to walk or open a door. Levine proceeded to outline his recent work on guided policy search, alternating between trajectory optimization and supervised training of the neural network, and granularizing complex tasks. He showed demos of robots successfully performing various high-dexterity tasks, like opening a door, screwing on a bottle cap, or putting a coat hanger on a rack. Impressive!