
2016-17 New Year review

2016 progress

Research / career:

  • Got a job at DeepMind as a research scientist in AI safety.
  • Presented MiniSPN paper at ICLR workshop.
  • Finished RNN interpretability paper and presented at ICML and NIPS workshops.
  • Attended the Deep Learning Summer School.
  • Finished and defended PhD thesis.
  • Moved to London and started working at DeepMind.


  • Talk and panel (moderator) at Effective Altruism Global X Boston
  • Talk and panel at the Governance of Emerging Technologies conference at ASU
  • Talk and panel at Brain Bar Budapest
  • AI safety session at OpenAI unconference
  • Talk and panel at Effective Altruism Global X Oxford
  • Talk and panel at Cambridge Catastrophic Risk Conference run by CSER

Rationality / effectiveness:

  • Went to a 5-day Zentensive meditation retreat with Janos, in between grad school and moving to London. This was very helpful for practicing connecting with my direct emotional experience, and a good way to reset during a life transition.
  • Stopped using 42goals (too glitchy) and started recording data in a Google form emailed to myself daily. Now I am actually entering accurate data every day instead of doing it retroactively whenever I remember. I tried a number of goal tracking apps, but all of them seemed too inflexible (I was surprised not to find anything that provides correlation charts between different goals, e.g. meditation vs. hours of sleep).
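The missing "correlation chart" feature can be approximated by hand once the daily form responses are exported. The following is a minimal sketch, assuming the log is available as two equal-length lists; the function name `pearson` and the sample values are hypothetical illustrations, not the actual tracking data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 entries")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical week of entries (made-up values for illustration only):
# 1 = meditated that day, 0 = did not; sleep measured in hours.
meditated = [1, 0, 1, 1, 0, 1, 0]
sleep_hours = [7.5, 6.0, 8.0, 7.0, 6.5, 7.5, 6.0]

print(round(pearson(meditated, sleep_hours), 2))
```

A value near +1 or -1 would suggest the two tracked quantities move together; a value near 0 would suggest no linear relationship. With only a few weeks of data the estimate is noisy, so it is better read as a hint than as a conclusion.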

Random cool things:

  • Hiked in the Andes to an altitude of 17,000 feet.
  • Visited the Grand Canyon.
  • New countries visited: UK, Bolivia, Spain.
  • Started a group house in London (moving there in a few weeks).
  • Started contributing to the new blog Approximately Correct on societal impacts of machine learning.


2016 prediction outcomes


  1. Finish PhD thesis (70%) – done
  2. Write at least 12 blog posts (40%) – 9
  3. Meditate at least 200 days (50%) – 245
  4. Exercise at least 200 days (50%) – 282
  5. Do at least 5 pullups in a row (40%) – still only 2-3
  6. Record at least 50 new thoughts (50%) – 29
  7. Stay up past 1:30am at most 20% of the nights (40%) – 26.8%
  8. Do at least 10 pomodoros per week on average (50%) – 13


  1. At least one paper accepted for publication (70%) – two papers accepted to workshops
  2. I will get at least one fellowship (40%)
  3. Insomnia at most 20% of nights (20%) – 18.3%
  4. FLI will co-organize at least 3 AI safety workshops (50%) – AAAI, ICML, NIPS


  • Low predictions (20-40%): 1/5 = 20% (overconfident)
  • Medium predictions (50-70%): 6/7 = 86% (underconfident)
  • It’s interesting that my 40% predictions were all wrong, and my 50% predictions were almost all correct. I seem to be translating system 1 labels of ‘not that likely’ and ‘reasonably likely’ to 40% and 50% respectively, while they should translate to something more like 25% and 70%. After the overconfident predictions last year, I tried to tone down the predictions for this year, but the lower ones didn’t get toned down enough.
  • I seem to be more accurate on predictions than resolutions, probably due to wishful thinking. Experimenting with no resolutions for next year.
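The bucket arithmetic above can be reproduced with a short script. This is a sketch under stated assumptions: the outcomes are transcribed from the 2016 lists, the fellowship outcome is not annotated in the post and is assumed false (which is what the 1/5 total implies), and the names `PREDICTIONS_2016` and `bucket_hit_rate` are made up for illustration.

```python
# (stated probability in percent, came true?) for each 2016 item listed above.
PREDICTIONS_2016 = [
    (70, True),   # finish PhD thesis
    (40, False),  # 12 blog posts (wrote 9)
    (50, True),   # meditate 200 days (245)
    (50, True),   # exercise 200 days (282)
    (40, False),  # 5 pullups in a row (2-3)
    (50, False),  # 50 new thoughts (29)
    (40, False),  # stay up late on at most 20% of nights (26.8%)
    (50, True),   # 10 pomodoros per week (13)
    (70, True),   # paper accepted (workshop papers counted)
    (40, False),  # fellowship (outcome assumed, not stated in the post)
    (20, True),   # insomnia on at most 20% of nights (18.3%)
    (50, True),   # FLI co-organizes 3 workshops
]


def bucket_hit_rate(predictions, low, high):
    """Return (hits, total) for predictions with low <= probability <= high."""
    outcomes = [hit for prob, hit in predictions if low <= prob <= high]
    return sum(outcomes), len(outcomes)


for label, low, high in [("low", 20, 40), ("medium", 50, 70)]:
    hits, total = bucket_hit_rate(PREDICTIONS_2016, low, high)
    print(f"{label} ({low}-{high}%): {hits}/{total} = {hits / total:.0%}")
```

A bucket is overconfident when its hit rate falls below the stated probabilities and underconfident when it lands above them. The same function works for any other year's list by swapping in that year's pairs and bucket boundaries.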

2017 predictions

  1. Our AI safety team will have at least two papers accepted for publication at a major conference, not counting workshops (70%).
  2. I will write at least 9 blog posts (50%).
  3. I will meditate at least 250 days (45%).
  4. I will exercise at least 250 days (55%).
  5. I will visit at least 2 new countries (80%).
  6. I will attend Burning Man (85%).

Using humility to counteract shame

“Pride is not the opposite of shame, but its source. True humility is the only antidote to shame.”

Uncle Iroh, “Avatar: The Last Airbender”


Shame is one of the trickiest emotions to deal with. It is difficult to think about, not to mention discuss with others, and gives rise to insidious ugh fields and negative spirals. Shame often underlies other negative emotions without making itself apparent – anxiety or anger at yourself can be caused by unacknowledged shame about the possibility of failure. It can stack on top of other emotions – e.g. you start out feeling upset with someone, and end up being ashamed of yourself for feeling upset, and maybe even ashamed of feeling ashamed if meta-shame is your cup of tea. The most useful approach I have found against shame is invoking humility.

What is humility, anyway? It is often defined as a low view of your own importance, and tends to be conflated with modesty. Another common definition that I find more useful is acceptance of your own flaws and shortcomings. This is more compatible with confidence, and helpful irrespective of your level of importance or comparison to other people. What humility feels like to me on a system 1 level is a sense of compassion and warmth towards yourself while being fully aware of your imperfections (whereas focusing on imperfections without compassion can lead to beating yourself up). According to LessWrong, “to be humble is to take specific actions in anticipation of your own errors”, which seems more like a possible consequence of being humble than a definition.

Humility is a powerful tool for psychological well-being and instrumental rationality that is more broadly applicable than just the ability to anticipate errors by seeing your limitations more clearly. I can summon humility when I feel anxious about too many upcoming deadlines, or angry at myself for being stuck on a rock climbing route, or embarrassed about forgetting some basic fact in my field that I am surely expected to know by the 5th year of grad school.

While humility comes naturally to some people, others might find it useful to explicitly build an identity as a humble person. How can you invoke this mindset? One way is through negative visualization or pre-hindsight, considering how your plans could fail, which can be time-consuming and usually requires system 2. A faster and less effortful way is to imagine a person, real or fictional, who you consider to be humble. I often bring to mind my grandfather, or Uncle Iroh from the Avatar series, sometimes literally repeating the above quote in my head, sort of like an affirmation. I don’t actually agree that humility is the only antidote to shame, but it does seem to be one of the most effective.

(Cross-posted to LessWrong. Thanks to Janos Kramar for his feedback on this post.)

2015-16 New Year review

2015 progress


  • Finished paper on the Selective Bayesian Forest Classifier algorithm
  • Made an R package for SBFC (beta)
  • Worked at Google on unsupervised learning for the Knowledge Graph with Moshe Looks during the summer (paper)
  • Joined the HIPS research group at Harvard CS and started working with the awesome Finale Doshi-Velez
  • Ratio of coding time to writing time was too high overall


  • Co-organized two meetings to brainstorm biotechnology risks
  • Co-organized two Machine Learning Safety meetings
  • Gave a talk at the Shaping Humanity’s Trajectory workshop at EA Global
  • Helped organize NIPS symposium on societal impacts of AI

Rationality / effectiveness:

  • Extensive use of FollowUpThen for sending reminders to future selves
  • Mapped out my personal bottlenecks
  • Sleep:
    • Tracked insomnia (26% of nights) and sleep time (average 1:30am, stayed up past 1am on 31% of nights)
    • Started working on sleep hygiene
    • Stopped using melatonin (found it ineffective)

Random cool things I did:

  • Improv class
  • Aerial silks class
  • Climbed out of a glacial abyss (moulin)
  • Placed second at Toastmasters area speech contest

2015 prediction outcomes

Out of the 17 predictions I made a year ago, 5 were true, and the rest were false.

  1. Submit the SBFC paper for publication (95%)
  2. Submit another paper besides SBFC (40%)
  3. Present SBFC results at a conference (JSM, ICML or NIPS) (40%) – presented at a workshop (NESS)
  4. Get a new external fellowship to replace my expiring NSERC fellowship (50%)
  5. Skim at least 20 research papers in machine learning (70%) – probably a lot more
  6. Write at least 12 blog posts (70%) – wrote 9 posts
  7. Climb a 5.12 without rope cheating (50%) – no longer endorsed at this level
  8. Lead climb a 5.11a (50%) – no longer endorsed at this level
  9. Do 10 pullups in a row (60%) – no longer endorsed at this level
  10. Meditate at least 150 times (80%) – 206 times
  11. Record at least 150 new thoughts (70%) – recorded 62, no longer endorsed at this level
  12. Make at least 100 Anki cards by the end of the year (70%)
  13. Read at least 10 books (60%) – read 4 books, no longer endorsed at this level
  14. Attend Burning Man (90%)
  15. Boston will have a second rationalist house by the end of the year (30%)
  16. FLI will hire a full-time project manager or administrator (80%) – no, but we now have a full time website editor…
  17. FLI will start a project on biotech safety (70%) – had some meetings, but no concrete action plan yet


  • low predictions, 30-60%: 0/8 = 0% (super overconfident)
  • high predictions, 70-95%: 5/9 = 56% (overconfident)

(Yikes! Worse than last year…)


  • I forgot about most of these goals after a few months – will need a recurring reminder for next year.
  • All 3 physical goals ended up disendorsed – I think I set those way too high. My climbing habits got disrupted by moving to California in summer and a hand injury, so I’m still trying to return to my spring 2014 skill level.

2016 goals and predictions

Given the overconfidence of last year’s predictions, toning it down for next year.


  1. Finish PhD thesis (70%)
  2. Write at least 12 blog posts (40%)
  3. Meditate at least 200 days (50%)
  4. Exercise at least 200 days (50%)
  5. Do at least 5 pullups in a row (40%)
  6. Record at least 50 new thoughts (50%)
  7. Stay up past 1:30am at most 20% of the nights (40%)
  8. Do at least 10 pomodoros per week on average (50%)


  1. At least one paper accepted for publication (70%)
  2. I will get at least one fellowship (40%)
  3. Insomnia at most 20% of nights (20%)
  4. FLI will co-organize at least 3 AI safety workshops (50%)

Systems I have tried: an overview

I have used various organization and productivity systems in the past few years – this is an overview of what worked and what didn’t.

Main systems I currently use:

  1. Follow Up Then: Sends an email to a future self, with the date and time specified in the email address. I use it for delaying tasks, recurring reminders, and following up on email threads. This reduces clutter in my todo list, calendar and inbox, and frees my working memory. Lately, I noticed myself remembering a thing shortly before receiving a follow up about it – probably due to the same mechanism that sometimes wakes me up a few minutes before the morning alarm.
  2. Complice: Daily to-do list organized according to goals, with archives and regular reviews. Helpful for specifying the next action to take at a given time, and for tracking progress on individual goals. Downside: I sometimes hesitate to enter tasks into the list, because entered tasks cannot be erased, and leaving a task unfinished is aversive, so I often end up entering tasks after they are done instead.
  3. Workflowy: Nested list structure – searchable, with collapsible and sharable sublists. I keep my ongoing todo list (in GTD form) and most of my notes here. Downside: doesn’t work for goal factoring, since it only supports tree structures.
  4. Google Calendar: Self-explanatory. I have recently started adding tentative meeting slots, indicated by a question mark, e.g. “dinner with Janos?”. This has been helpful for keeping track of which time slots I’ve offered to someone. I also added a calendar that shows Facebook events that I’ve been invited to, which is handy.
  5. 42 Goals: Goal tracking with summary graphs and cute symbols. I use this for tracking habits (like exercise and meditation) and other random things (like insomnia occurrences). The graphs are useful – this is how I know that I have the most insomnia on Mondays! Downsides: doesn’t allow non-binary categories, and the phone app is so unreliable that I never use it – if you know good alternative tracking systems, let me know!

Systems I no longer use:

  • Beeminder: Goal tracking with nice graphs, and goal setting with reminders and financial penalties in case of failure. I liked the graphs and reminders, but the penalties made me feel even more overwhelmed than usual, and sometimes induced suboptimal short-term priorities. I decided to obtain the different benefits separately, setting recurring reminders for habits on Follow Up Then, and using 42 Goals for tracking.
  • Toggl: Time tracking for activities and tasks, organized by project or goal, with an option for retroactive time entries. I started out using it to track all my time, and though I stopped after about a month due to the excessive overhead of tracking and categorizing short activities, I learned a lot about where my time was going. I used it for about a year after that to track work hours, and eventually stopped because of overhead and redundancy with Complice.
  • Paper checklist: Checklist for daily habits. Worked well in terms of catching my eye in the morning, but was often forgotten when traveling. It was redundant with 42 Goals, and required double data entry, so I eventually gave up on the paper version.
  • Habit tracking with reminders, with a pretty good phone app. I found it particularly useful for several-times-a-week habits. It also has built-in habit programs like building up to a certain number of chinups. I mostly stopped using it because I had too many other systems that were redundant with it.
  • Pomodoros: Setting a timer to focus on a specific task for 25-40 minutes, followed by a break of 5 minutes. I found it unpleasant to be forced to take breaks, developed a habit of ignoring the break signal, and gave up on using pomodoros altogether.

Over the past couple of years, I have become less willing to force myself to do things or overwhelm myself with instructions or data entry overhead, which has led me to reduce the number of systems I use, and to prefer gently guiding systems to strict ones.

Hamming questions and bottlenecks

The CFAR alumni workshop on the first weekend of May was focused on the Hamming question. Mathematician Richard Hamming was known to approach experts from other fields and ask “what are the important problems in your field, and why aren’t you working on them?”. The same question can be applied to personal life: “what are the important problems in your life, and what is stopping you from working on them?”.

Over the course of the weekend, the twelve of us asked this question of ourselves and each other, in many forms and guises: “if Vika isn’t making a major impact on the world in 5 years, what would have stopped her?”, “what are your greatest bottlenecks?”, “how can we actually try?”, etc. The intense focus on mental pain points was interspersed with naps and silly games to let off steam. On the last day, we did a group brainstorm, where everyone who wanted to receive feedback took a turn in the center of the circle, and everyone else speculated on what they thought were the biggest bottlenecks of the person in the center. By this time, we had mostly gotten to know each other, and even the impressions from those who knew me less well were surprisingly accurate. I am very grateful to everyone at the workshop for being so insightful and supportive of each other (and actually caring).

Most of the issues that came up were things I was aware of on some level, but over the course of the workshop it became particularly salient to me how interconnected my problems are and how the gears in the system affect each other. Working memory overload leads to confusion, which reduces confidence. Sleep deprivation reduces working memory and increases anxiety. Anxiety reduces the affordance for exploration and creativity, and increases the frequency of insomnia. Ignoring or neglecting signals from system 1 takes up working memory slots with looping messages from system 1, and increases anxiety. After a few of these circular explanations, I gave up on writing them all down, and made a diagram instead.

[Diagram: my bottlenecks and how they affect each other]

A few things jump out at me about this diagram. The highest degree nodes are anxiety and working memory, both of which are difficult to affect directly. The two nodes I have the most influence over are the amount of sleep I get and the degree to which I listen to system 1 signals. I have started experimenting with sleep interventions that I haven’t yet tried, like taking melatonin 4 hours before bedtime, using a weighted blanket, etc. Attunement to system 1 can be improved through meditation, Focusing, belief reporting and such. While I have sporadically meditated for years, I could use more practice at the other techniques, which involve more explicit internal querying than meditation.

Curiously, the graph also appears to have a source and a sink. The source node is my overdeveloped sense of duty: a tendency to assume I should do things, or be able to do things, causes a lot of downstream issues. It would be impactful to directly hack this and become more selfish, but it appears to be a bit trickier than doing a find-and-replace on my source code, replacing “I have to do X” with “my goals require X”. The sink node has to do with my capacity to allow myself time and mental space for exploration and creativity, which would among other things enable me to do my high-level goals better (e.g. research and organization strategy).
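The graph terms used here (degree, source, sink) can be checked mechanically on an edge list. The sketch below is an illustration under assumptions: it encodes only the cause-and-effect links spelled out in the text above, not the full diagram, and the node labels and the function name `degree_summary` are hypothetical. The sense-of-duty node is left out because its edges are not described in the text, so the sources and sinks computed from this partial list will differ from those of the full diagram.

```python
from collections import defaultdict

# Directed "A affects B" links taken from the prose description only.
EDGES = [
    ("sleep deprivation", "working memory"),
    ("sleep deprivation", "anxiety"),
    ("working memory", "confusion"),
    ("confusion", "confidence"),
    ("anxiety", "exploration and creativity"),
    ("anxiety", "insomnia"),
    ("ignoring system 1", "working memory"),
    ("ignoring system 1", "anxiety"),
]


def degree_summary(edges):
    """Return (total degree per node, source nodes, sink nodes)."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    nodes = set(in_deg) | set(out_deg)
    total = {node: in_deg[node] + out_deg[node] for node in nodes}
    sources = sorted(node for node in nodes if in_deg[node] == 0)  # nothing feeds in
    sinks = sorted(node for node in nodes if out_deg[node] == 0)   # nothing flows out
    return total, sources, sinks


total, sources, sinks = degree_summary(EDGES)
for node, deg in sorted(total.items(), key=lambda item: -item[1]):
    print(f"{deg}  {node}")
print("sources:", sources)
print("sinks:", sinks)
```

Even on this partial edge list, anxiety and working memory come out with the highest degree, which matches the observation about the diagram. Adding the remaining edges from the full diagram would be needed to recover its single source and sink.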

A week after the workshop, I moved to California for my summer internship at Google. The context shift and my new location a few blocks down from the CFAR office will allow me to work on my bottlenecks more systematically. I have wrestled with these for a long time, but now I feel that I have better tools and resources than ever before.

Negative visualization, radical acceptance and stoicism

In anxious, frustrating or aversive situations, I find it helpful to visualize the worst case that I fear might happen, and try to accept it. I call this “radical acceptance”, since the imagined worst case is usually an unrealistic scenario that would be extremely unlikely to happen, e.g. “suppose I get absolutely nothing done in the next month”. This is essentially the negative visualization component of stoicism.

There are many benefits to visualizing the worst case:

  • Feeling better about the present situation by contrast.
  • Turning attention to the good things that would still be in my life even if everything went wrong in one particular domain.
  • Weakening anxiety using humor (by imagining an exaggerated “doomsday” scenario).
  • Being more prepared for failure, and making contingency plans (pre-hindsight).
  • Helping make more accurate predictions about the future by reducing the “X isn’t allowed to happen” effect (or, as Anna Salamon once put it, “putting X into the realm of the thinkable”).
  • Reducing the effect of ugh fields / aversions, which thrive on the “X isn’t allowed to happen” flinch.
  • Weakening unhelpful identities like “person who is always productive” or “person who doesn’t make stupid mistakes”.

Let’s say I have an aversion around meetings with my advisor, because I expect him to be disappointed with my research progress. When I notice myself worrying about the next meeting or finding excuses to postpone it so that I have more time to make progress, I can imagine the worst imaginable outcome a meeting with my advisor could have – perhaps he might yell at me or even decide to expel me from grad school (neither of these has actually happened so far). If the scenario is starting to sound silly, that’s a good sign. I can then imagine how this plays out in great detail, from the disappointed faces and words of the rest of the department to the official letter of dismissal in my hands, and consider what I might do in that case, like applying for industry jobs. While building up these layers of detail in my mind, I breathe deeply, which I associate with meditative acceptance of reality. (I use the word “acceptance” to mean “acknowledgement” rather than “resignation”.)

I am trying to use this technique more often, both in the regular and situational sense. A good default time is my daily meditation practice. I might also set up a trigger-action habit of the form “if I notice myself repeatedly worrying about something, visualize that thing (or an exaggerated version of it) happening, and try to accept it”. Some issues have more natural triggers than others – while worrying tends to call attention to itself, aversions often manifest as a quick flinch away from a thought, so it’s better to find a trigger among the actions that are often caused by an aversion, e.g. procrastination. A trigger for a potentially unhelpful identity could be a thought like “I’m not good at X, but I should be”. A particular issue can simultaneously have associated worries (e.g. “will I be productive enough?”), aversions (e.g. towards working on the project) and identities (“productive person”), so there is likely to be something there that makes a good trigger. Visualizing myself getting nothing done for a month can help with all of these to some degree.

System 1 is good at imagining scary things – why not use this as a tool?


2014-15 New Year review

2014 progress

If someone told me at the beginning of 2014 that I would co-found an organization to mitigate technological risks to humanity, I might not have believed them. Thanks Max, Meia, Anthony and Jaan for the great initiative!

I am almost done with my first research project on variable selection and classification using a Bayesian forest model – I simplified the variable partition in the model, came up with better tree updates, added a hyperprior, sped up the algorithm by an order of magnitude, and started testing on real data. Among the other ambitious projects of the past year are two MIRIx workshops plus writing up the results, and starting this blog.

Improvements in personal effectiveness:

  • started using a daily checklist of morning habits
  • started taking melatonin every night
  • started tagging new thoughts
  • started using FollowUpThen to schedule future tasks without overloading my todo list
  • started using Toggl to track work hours only
  • stopped using Beeminder (too stressful), and replaced it with a combination of 42goals and FollowUpThen (works well)
  • quit as President of the Toastmasters club
  • made a volunteer application form for FLI, so that instead of being inundated with 7 freeform emails per month from interested folks, I get the relevant information in an organized spreadsheet and I’m not required to respond

Random cool things I did:

  • climbing a 5.11a
  • climbing outdoors
  • indoor surfing
  • polar bear swim

2014 resolutions

A year ago, I made a number of New Year resolutions (and assigned a probability of completion for each goal). Here is how they worked out:


  • continue meditation practice (>10 minutes daily, > 120 times) (70%) – did ~200 times
  • start a new research project (70%) – started working with a new advisor, did some background reading, narrowed down a topic, wrote a grant proposal
  • do at least 5 pullups in a row (85%)
  • reading stats blogs
  • go to 3 conferences, including a MIRI workshop (80%) – went to NESS, JSM, NIPS, and two MIRIx workshops
  • give 5 speeches at Toastmasters (75%)
  • give at least 5 LessWrong meetup talks (70%)
  • run comfort zone expansion outings at least twice (80%)
  • start a group project at Citadel (40%) – FLI work groups and MIRIx workshops sort of count for this
  • introduce at least 3 friends to LW meetups (50%)
  • help people achieve their goals – helped run a weekly habit training session at Citadel

Essentially succeeded:

  • publish paper about current research project (90%) – almost done, hope to submit by March 1
  • write at least 5 LW posts (80%) – wrote 4 posts
  • more writing – reflections (did some), stories (sort of), poems (nope), also journals and blog posts

Failed and no longer endorsed:

  • continue to avoid Beeminder debt – didn’t work, then stopped using Beeminder, now use 42goals for goal tracking and FollowUpThen for reminders
  • do consulting for Metamed (60%)
  • read 90% of the LW Sequences (70%) – made progress, but no longer want to read such a high fraction (waiting for the ebook to come out)
  • finish Pearl’s Causality (50%) – read the LW review of the book instead
  • learn more economics and biology


  • low predictions, 40-60%: 2/4 = 50% (perfectly calibrated)
  • high predictions, 70-90%: 7/10 = 70% (overconfident)


  • Reading goals mostly don’t work for me – if I do set them, flexible goals of the form “read some of X” (like stats blogs) do better than more fixed and time-consuming goals like “read most of X” (like the Sequences)
  • The rate of goal disendorsement is 5/19 = 26%.
  • I don’t tend to completely fail on goals I continue to endorse – yay!

2015 goals and habits

Broad categories of goals for the coming year:

  1. Research:
    • wrap up and submit BFC project,
    • start and make significant progress on 1-2 new projects
  2. FLI:
    • help streamline operations and communication,
    • continue work on AI safety outreach to AI researchers,
    • encourage AI safety research,
    • start projects in risk areas other than AI safety
  3. Self-improvement:
    • reduce weekend/free-time anxiety,
    • increase acceptance of suboptimal situations,
    • improve introspection ability and my model of myself,
    • improve retention of information (using Anki),
    • eliminate cluttery speech pattern

Habits to maintain:

  • daily meditation
  • daily pushups
  • daily melatonin
  • tagging and writing down new thoughts
  • tracking goals in 42goals
  • tracking work hours in Toggl
  • journaling 1-2 times / week

2015 predictions

  1. I will submit the Bayesian Forest Classifier paper for publication (95%)
  2. I will submit another paper besides BFC (40%)
  3. I will present BFC results at a conference (JSM, ICML or NIPS) (40%)
  4. I will get a new external fellowship to replace my expiring NSERC fellowship (50%)
  5. I will skim at least 20 research papers in machine learning (70%)
  6. I will write at least 12 blog posts (70%)
  7. I will climb a 5.12 without rope cheating (50%)
  8. I will lead climb a 5.11a (50%)
  9. I will be able to do 10 pullups in a row (60%)
  10. I will meditate at least 150 times (80%)
  11. I will record at least 150 new thoughts (70%)
  12. I will make at least 100 Anki cards by the end of the year (70%)
  13. I will read at least 10 books (60%)
  14. I will attend Burning Man (90%)
  15. Boston will have a second rationalist house by the end of the year (30%)
  16. FLI will hire a full-time project manager or administrator (80%)
  17. FLI will start a project on biotech safety (70%)

I will update this post if other goals and predictions for the year come to mind before the end of January.