I recently defended my PhD thesis, and a chapter of my life has now come to an end. It feels both exciting and a bit disorienting to be done with this phase of much stress and growth. My past self who started this five years ago, with a very vague idea of what she was getting into, was a rather different person from my current self.
I have developed various skills over these five years, both professionally and otherwise. I learned to read papers and explain them to others, to work on problems that take months rather than hours and be content with small bits of progress. I used to believe that I should be interested in everything, and gradually gave myself permission not to care about most topics to be able to focus on things that are actually interesting to me, developing some sense of discernment. In 2012 I was afraid to comment on the LessWrong forum because I might say something stupid and get downvoted – in 2013 I wrote my first post, and in 2014 I started this blog. I went through the Toastmasters program and learned to speak in front of groups, though I still feel nervous when speaking on technical topics, especially about my own work. I co-founded a group house and a nonprofit, both of which are still flourishing. I learned how to run events and lead organizations, starting with LessWrong meetups and the Harvard Toastmasters club, which were later displaced by running FLI.
I remember agonizing over whether I should do a PhD or not, and I wish I had instead spent more time deciding where and how to do it. I applied to a few statistics departments in the Boston area and joined the same department that Janos was in, without seriously considering computer science, even though my only research experience back in undergrad was in that field. The statistics department was full of interesting courses and brilliant people that taught me a great deal, but the cultural fit wasn’t quite right and I felt a bit out of place there. I eventually found my way to the computer science department at the end of my fourth year, but I wish I had started out there to begin with.
My research work took a rather meandering path that somehow came together in the end. My first project was part of the astrostatistics seminar, which I was not particularly motivated about, but I expected myself to be interested in everything. I never quite understood what people were talking about in the seminar or what I was supposed to be doing, and quietly dropped the project at the end of my first year when leaving for my quantitative analyst internship at D.E.Shaw. The internship was my first experience in industry, where I learned factor analysis and statistical coding in Python (the final review from my manager boiled down to “great coder, research skills need work”). In second year, my advisor offered me a project that was unfinished by his previous students, which would take a few months to polish up. The project was on a new method for classification and variable selection called SBFC. I dug up a bunch of issues with the existing model and code, from runtime performance to MCMC detailed balance, and ended up stuck on the project for 3 years. During that time, I dabbled with another project that sort of petered out, did a Google internship on modeling ad quality, and sank a ton of time into FLI. In the middle of fourth year, SBFC was still my only project, and things were not looking great for graduating.
This was when I realized that the part of statistics that was interesting to me was the overlap with computer science and AI, a.k.a. machine learning. I went to the NIPS conference for the first time, and met a lot of AI researchers – I didn’t understand a lot of their work, but I liked the way they thought. I co-organized FLI’s Puerto Rico conference and met more AI people there. I finally ventured outside the stats department and started sitting in on ML lab meetings at the CS department, which mostly consisted of research updates on variational autoencoders that went right over my head. I studied a lot to fill the gaps in my ML knowledge that were not covered by my statistics background, namely neural networks and reinforcement learning (still need to read Sutton & Barto…). To my surprise, many people at the ML lab were also transplants from other departments, officially doing PhDs in math or physics.
That summer I did my second internship at Google, on sum-product network models (SPNs) for anomaly detection in the Knowledge Graph. I wondered if it would result in a paper that could be shoehorned into my thesis, and whether I could find a common thread between SPNs, SBFC and my upcoming project at the ML lab. This unifying theme turned out to be interpretability – the main selling point of SBFC, an advantage of SPNs over other similarly expressive models, and one of my CS advisor’s interests. Working on interpretability was a way to bring more of the statistical perspective into machine learning, and seemed relevant to AI safety as well. With this newfound sense of direction, in a new environment, my fifth year had as much research output as the previous three put together, and I presented two workshop posters in 2016 – on SPNs at ICLR, and on RNN interpretability at ICML.
Volunteering for FLI during grad school started out as a kind of double life, and ended up interacting with my career in interesting ways. For a while I didn’t tell anyone in my department that I co-founded a nonprofit trying to save the world from existential risk, which was often taking up more of my time than research. However, FLI’s outreach work on AI safety was also beneficial to me – as one of the AI experts on the FLI core team, I met a lot of AI researchers who I may not have connected with otherwise. When I met the DeepMind founders at the Puerto Rico conference, I would not have predicted that I’ll be interviewing for their AI safety team a year later. The two streams of my interests, ML and AI safety, have finally crossed, and the double life is no more.
What lessons have I drawn from the grad school experience, and what advice could I give to others?
- Going to conferences and socializing with other researchers was super useful and fun. I highly recommend attending NIPS and ICML even if you’re not presenting.
- Academic departments vary widely in their requirements. For example, the statistics department expected PhD students to teach 10 sections (I got away with doing 5 sections and it was still a lot of work), while the CS department only expected 1-2 sections.
- Internships were a great source of research experience and funding (a better use of time than teaching, in my opinion). It’s worth spending a summer interning at a good company, even if you are definitely going into academia.
- Contrary to common experience, writer’s block was not an obstacle for me. My actual bottleneck was coding, debugging and running experiments, which was often tedious and took over half of my research time, so it’s well worth optimizing those aspects of the work.
- The way FLI ended up contributing to my career path reminds me of a story about Steve Jobs sitting in on a calligraphy class that later turned out to be super relevant to creating snazzy fonts for Apple computers. I would recommend making time for seemingly orthogonal activities during grad school that you’re passionate about, both because they provide a stimulating break from research, and because they could become unexpectedly useful later.
Doing a PhD was pretty stressful for me, but ultimately worthwhile. A huge thank you to everyone who guided and supported me through it!