2023-24 New Year review

This is an annual post reviewing the last year and setting intentions for next year. I look over different life areas (work, health, parenting, effectiveness, travel, etc) and draw conclusions from my life tracking data.

Overall, this year went pretty well (and definitely better than the previous two). Highlights include a second kid, hiking in Newfoundland, some parenting milestones (night potty training and stopping breastfeeding), and learning that iron deficiency can feel like burnout.

  1. 2023 review
    1. Life updates
    2. Work
    3. Health
    4. Parenting 
    5. Effectiveness
    6. Travel
    7. Fun stuff
  2. 2023 prediction outcomes
  3. 2024 goals and predictions

2023 review

Life updates

We received a special gift for New Year’s – Michael (“Misha”) arrived just in time to be born in 2023! Daniel is already getting the hang of rocking his brother and singing him lullabies.

Work

Last year I mostly focused on threat models for AI risk. This year my work was split between projects aiming to address different aspects of our main threat model: dangerous capability evaluations, mentoring MATS scholars on understanding agency and power-seeking in language models, and internal outreach about AI alignment. In retrospect I was spread a bit too thin, and didn’t make as much progress on the evaluations project as I would have liked, since outreach and mentoring usually seemed more urgent. So I will try to be more focused next year.

Research

  • Paper: Power-seeking can be probable and predictive for trained agents, investigating how the training process affects power-seeking incentives and showing that they are still likely to hold for trained RL agents under some simplifying assumptions.
  • Mentoring a research group on understanding agency and power-seeking in language models for the ML Alignment & Theory Scholars program. Out of 60 people who applied to the group, I supervised 11 scholars in the initial training phase of the program, and continued working with 5 of them (on two projects) in the main program.
    • One team of my scholars made some progress on understanding the limitations of agents simulated by predictive models (paper).
    • The other team of scholars performed a theoretical investigation of how well shutdown obedience is preserved when the environment changes slightly (paper).
  • Blog post: Some high-level thoughts on the DeepMind alignment team’s strategy
  • Dangerous capability evaluations (focused on persuasion and situational awareness capabilities).

Outreach

Health

Physical health. This year I got sick less often than last year (similar number of colds but no other illnesses). The colds were also shorter – possibly due to copious use of the “dual defense” nasal spray, and also because we were past the initial barrage of germs from Daniel starting nursery (though he still had a runny nose most of the time).

Proportion of days I had a cold per month in 2022-23. Daniel started nursery in June 2022.

Mental health. Generally better this year. There were fewer episodes of particularly bad mental states compared to the previous two years, though still not at the pre-parenting baseline. Looking over my bug reports, I found that bad mental episodes are strongly associated with traveling with Daniel. Maybe it’s a good thing we probably won’t be traveling as much now that we have two kids.

After Daniel was born, I had recurring night terrors (being woken up by worries about the baby and looking for him in random places). Thankfully these have stopped at this point. Hopefully this was a first-time-parent-anxiety thing and won’t happen as much with the second baby…

Proportion of nights with night terrors per month in 2021-23

In spring, I took Irene Lyon’s self-therapy course based on somatic experiencing practice, which involves a lot of “neurosensory” exercises where you pay attention to your body and perceptions in different ways. I found the exercises more helpful for my wellbeing than my previous meditation practice, so I mostly do these in place of meditation now.

I didn’t do as much meditation as intended, somewhat less than in previous years (likely due to travel). Unsurprisingly, I found that I meditate less often on Saturdays, which are the days we don’t have childcare.

Sleep. Average sleep was pretty similar to last year (7 hours a night), though there was more insomnia (23% of nights). This is probably because I was going to sleep earlier and waking up at the same time. (As usual, all the sleep metrics are excluding jetlag.)

Hours awake between 12 and 6am in 2022-23

Pregnancy. Overall I felt pretty well, though had somewhat more symptoms than the first time (some backache, sciatica, rib pain, etc). Had a bit of nausea in the first trimester, which I found I could keep at bay by continually snacking (thank you, office microkitchens). I also had some food aversions that for some mysterious reason were only triggered by the office cafeteria (maybe the sauces they use) so I stuck to plain food there.

Thankfully I didn’t have fatigue and was able to stay active until the end. My usual exercise routine of prenatal yoga and personal training once a week worked pretty well (for mental wellbeing and reducing various muscle pains). I continued riding my bike (on flat ground or a slight incline), which was actually more comfortable than walking since it didn’t give me backache. I found that exercise got easier in the last few weeks before birth, since the baby moved downwards and got his feet out of my lungs.

I had some issues with an iron deficiency that was not diagnosed right away. I was pretty tired during the Newfoundland hiking trip in June, got a blood test upon returning home and was advised to take 100mg of iron a day (for reference, a typical women’s multivitamin contains around 15mg of iron). My next blood test in September showed that my iron levels were still going down, so I was told to take 400mg of iron a day (luckily I don’t get side effects from this). I wasn’t physically tired this time, but often felt unmotivated and spacey, which magically disappeared once I quadrupled the iron dose (now I know that iron deficiency can feel like burnout). These episodes of iron deficiency were bad for productivity and for generally feeling like myself, and could probably have been avoided by more frequent testing and taking a multivitamin from the beginning.

Comparing this pregnancy to the first one (in 2020), I had somewhat better sleep and less insomnia for some reason – possibly exercise, since the first time was during the pandemic, so I was more sedentary. (Hours awake at night is not a good comparison point since it’s confounded by Daniel waking up.)

Parenting 

Potty training. This year, Daniel got better at using the potty with less prompting and more reliably communicating when he needs to go. During our travels, we didn’t bring a potty and used a kids toilet seat instead, but at home he preferred the potty. After we got back from Canada in August, we put the potty away and continued using the toilet by default – he complained a bit but seemed to get used to it. At some point he rejected the kids toilet seat and used the toilet without it (with a toddler stool). Once he entered the “oppositional” phase of toddlerhood, asking him directly whether he needs the toilet would always get a “no”. I found it was much more effective to say that I’m going to the toilet, and then he would say “no, Daniel is going to the toilet!”.

Since Daniel was staying dry at night some of the time, we made a few attempts to ditch the night diapers this year – once in April when he was 2.5, once in July, and once in October when he was almost 3. The last time worked much better, and he seemed to better understand what’s going on, e.g. talking about how we don’t pee in the bed. He’s been out of night diapers since late October with 5 pee accidents, so I think we can declare victory on this one.

The timing was consistent with our potty training book (“Oh Crap”), which recommended night training around 3 years of age. I’m happy that we didn’t have to go through the full process described in the book, which involves waking up in the middle of the night and putting a sleepy child on the potty. This happened to be a convenient time to try, since I often woke up at night anyway due to insomnia. Daniel was also waking at night more often for some reason and coming to our bed, so it was easy to put him on the potty then. After a couple of weeks he refused to go on the potty at night and would just hold it until morning. 

Breastfeeding. A year ago I was still breastfeeding Daniel around twice a day and wasn’t sure what the endgame looked like. He rarely fed at night, though there was a spike in wakeups and night feeds in winter (maybe a 2-year sleep regression). A few months later we went down to one feed at bedtime – he would have a brief suck (probably out of habit) and then move on to reading his books, rather than falling asleep. During early pregnancy it became painful to breastfeed, so I decided to stop and told him the boob is empty (which was true). He asked for it a few more times and eventually stopped. Overall he breastfed until 2.5 years, though more like 2 years in terms of actually receiving milk. One consequence (or maybe correlation) of stopping breastfeeding was that he became a lot less interested in me and more interested in papa :).

Screen time. Daniel likes to watch videos of nursery songs and moving vehicles (trains or airplanes). We are not sure what the right amount of video watching is and generally try to limit it unless it’s useful in some way. Sometimes showing videos helps motivate him to get dressed / go to the potty / brush his teeth when he would otherwise refuse, which is pretty handy. It also helps get him out of tantrums when other things don’t work. At these times he often pushes us away, but his favorite video of a train coming out of a tunnel would catch his attention and calm him down enough to receive comfort from us. Overall the heuristic of “useful video-watching” has worked reasonably well.

Effectiveness

Lights. I continued using the Lights spreadsheet for tracking daily habits, which I started using in 2021. The set of habits changed slightly from last year, but overall has largely stabilized.

Here are the habits that I kept from last year:

  • Life tracking
  • Fill out lights
  • Make a list of intentions for the day
  • Ask myself what I want today
  • Meditation
  • Exercise (today or yesterday)
  • Leg & shoulder stretches
  • At least 2 hours of deep work (changed to 1 hour at some point)
  • Braindump / journaling
  • Reading (today or yesterday)
  • Negative visualization on making mistakes
  • Take supplements to avoid / mitigate colds (replaced with “use defense spray”)
  • Appreciate a thing I did today
  • Exchange appreciation with Janos
  • Go to bed by 11:30pm (changed to 11pm)

Habits from last year that I dropped:

  • Practice effective rest (breaks during the day where I pay attention to what I want). I often forgot to do this, possibly because it was not concrete enough, and simpler ways to take a break like naps or walks were more helpful.
  • Use eye drops (to address dryness from using contact lenses) – no longer needed.

I did about 64% of the Lights on an average week (down from 70% last year), probably because I wasn’t filling them out as reliably every day. I started feeling more self-conscious about filling them out at my desk at work for some reason, and it was harder to do at home where Daniel would try to take over my laptop (because watching airplane videos is clearly a higher priority).

Time tracking. This year I continued doing time tracking during work hours. In an average work week, I spent 30 hours on work activities (compared to 27 last year) and 8.5 hours on non-work activities (compared to 10 last year). The overall amount of research and reading went down somewhat in favour of outreach and comms (much of the comms was also outreach-related). The overall increase in work hours was likely due to some combination of less illness, less time spent on exercise, and adding another office night every other week.

| Type | 2023 | 2022 (Jun-Dec) |
|---|---|---|
| Meetings | 7:56:31 | 8:52:12 |
| Research | 6:17:33 | 6:53:00 |
| Reading | 2:28:43 | 4:00:48 |
| Outreach | 4:04:53 | 0:00:00 |
| Comms | 6:38:29 | 3:30:12 |
| Planning | 1:34:46 | 1:55:00 |
| Admin | 1:50:00 | 1:41:36 |
| Total work | 30:50:56 | 26:52:48 |
| Self-care | 5:14:25 | 6:10:48 |
| Parenting | 1:27:40 | 0:52:12 |
| Random | 1:44:39 | 2:55:36 |
| Total non-work | 8:26:45 | 9:58:36 |
Time spent on different activities in an average work week
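
For anyone who wants to reproduce the totals from their own time-tracking export, here is a minimal Python sketch that sums the per-category durations for work activities. The helper names (`parse_hms`, `format_hms`, `weekly_total`) and the dictionary layout are illustrative assumptions, not the actual spreadsheet setup. The 2023 rows sum to within a second of the listed total, presumably because each row is a rounded average.

```python
from datetime import timedelta

# Weekly averages for work categories, copied from the table above.
WORK_2023 = {
    "Meetings": "7:56:31", "Research": "6:17:33", "Reading": "2:28:43",
    "Outreach": "4:04:53", "Comms": "6:38:29", "Planning": "1:34:46",
    "Admin": "1:50:00",
}
WORK_2022 = {
    "Meetings": "8:52:12", "Research": "6:53:00", "Reading": "4:00:48",
    "Outreach": "0:00:00", "Comms": "3:30:12", "Planning": "1:55:00",
    "Admin": "1:41:36",
}


def parse_hms(text):
    """Parse an 'H:MM:SS' string into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta):
    """Format a timedelta as 'H:MM:SS' without rolling hours into days."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def weekly_total(rows):
    """Sum all category durations for one year."""
    return sum((parse_hms(value) for value in rows.values()), timedelta())


if __name__ == "__main__":
    print("2023 work total:", format_hms(weekly_total(WORK_2023)))
    print("2022 work total:", format_hms(weekly_total(WORK_2022)))
```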

Deep work. I tracked 248 hours of deep work this year, compared to 363 last year. This was due to the increased focus on outreach and shifting my research focus from conceptual / theory work to empirical work. Deep work seems to be less of a good metric for productivity on empirical projects like capability evaluations, since I didn’t count running experiments / debugging / iterating on prompts as deep work. In light of this, my original goal of 400 deep work hours for 2023 seems particularly unrealistic, since I never exceeded 400 deep work hours except for 2019.

Digital habits. The number of times I unlock my phone per day went down from 25-30 in the first half of the year to 20-25 in the second half (a bit lower than the average of 20-40 last year). Phone use is correlated with travel, so the decrease over the year is mostly explained by a lot of travel in Mar-June and August followed by no travel. I also use my phone less at work and more on weekends (likely because I use more navigation and videos for Daniel then). There is a dip in the middle of the week, probably corresponding to office nights.

I still use the Intenty app (formerly called Actuflow) that asks me to enter an intention tag when I unlock the phone. It works a bit less well than a year ago, e.g. sometimes I automatically enter a tag without thinking about the intention. Overall it still creates an extra inconvenience to unlocking, and seems to reduce phone use.

External engagements. This year the number of external requests (speaking, interviews, etc) went up substantially compared to the previous few years. I got 75 requests this year, which hasn’t happened since 2018 (when I got 89 requests and had some overcommitment issues). Some of this is due to the pandemic winding down, but there is also a snowball effect where doing more external outreach leads to more requests, which I need to be mindful of. 

| Year | Number of requests | Accepted requests | Acceptance rate | Average regret (0-1) |
|---|---|---|---|---|
| 2021 | 26 | 7 (0 research-related) | 0.27 | 0.21 |
| 2022 | 29 | 9 (2 research-related) | 0.31 | 0.15 |
| 2023 | 75 | 17 (2 research-related) | 0.23 | 0.25 |
External engagements in 2021-23
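
As a quick sanity check, the acceptance rates above are just accepted requests divided by total requests. The sketch below recomputes them from the counts in the table; the `REQUESTS` layout and function names are illustrative assumptions rather than the actual request evaluation form.

```python
# year -> (requests received, requests accepted, accepted research-related)
REQUESTS = {
    2021: (26, 7, 0),
    2022: (29, 9, 2),
    2023: (75, 17, 2),
}


def acceptance_rate(received, accepted):
    """Fraction of incoming requests that were accepted."""
    return accepted / received


def non_research_accepted(accepted, research_related):
    """Accepted requests that were not research-related."""
    return accepted - research_related


if __name__ == "__main__":
    for year, (received, accepted, research) in REQUESTS.items():
        rate = acceptance_rate(received, accepted)
        other = non_research_accepted(accepted, research)
        print(f"{year}: rate={rate:.2f}, non-research accepted={other}")
```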

I accepted 17 requests this year (2 research-related and 15 non-research), which seems too high. Even though I was filling out my request evaluation form, I didn’t sufficiently stick to the default of saying no to everything that isn’t an exceptional opportunity. In particular, I gave too many repeated talks that seemed easy individually, but were a distraction in aggregate. I no longer have a clear rationale for giving these talks – getting even more students interested in AI alignment is low priority at this point. The average regret for accepted commitments was 25%, higher than previous years. I also spent too much time making decisions about requests – I will try setting a 5 minute timer when I fill out the form, and say no at the end unless I decide otherwise. 

Travel

In March we went to southern France to visit my aunt and cousin. We enjoyed hiking in the calanques along the coast (without Daniel for a change).

In May we spent a weekend in the Brecon Beacons national park in Wales, which we had been meaning to visit for a while. We did the Mt Pen y Fan hike and the Four Waterfalls trail, and Daniel did a lot of it on his own. He was very excited to play in the waterfall pools, so we had to be careful with him there.

In June we hiked the East Coast trail in Newfoundland, a lovely Canadian province we had never visited before. It was an organized hike where we stayed in a cabin, and the tour company dropped us off at different sections of the trail every day and picked us up when we finished. This was very convenient since we didn’t have to carry a lot of stuff. The nature did not disappoint – very beautiful and few other people. We saw some whales puffing in the distance off the coast (and even an iceberg once).

In August we returned to the cottage on Manitoulin Island. Daniel was very excited about climbing big rocks and throwing smaller rocks in the water.

Fun stuff

  • I read 7 books this year: The Artist’s Way, Kindred, The Handmaid’s Tale, A World Without Email, One Hundred Years of Solitude, Boom!, How to Have a Baby.
  • Swimming and gymnastics for Daniel. We did regular classes for a while, which he enjoyed, but he started to resist the structure more as he got older, so we shifted to free-play sessions without instruction.
  • I took a solo weekend trip to Edinburgh to give a talk, with a side of tourism and getting a break from parenting. This was my first visit to the city, not counting that time in 2017 when our sleeper train broke down and we wandered around the deserted downtown with suitcases for an hour at 4am. It was really nice to do some solo touristing, learning about the history and just exploring.

2023 prediction outcomes

Goals

  • Meditate on at least 250 days (80%) – no (215 days)
  • At least 400 deep work hours (60%) – no (248 hours)
  • Write at least 4 blog posts (70%) – yes (4 posts)

Predictions

  • I will avoid processed sugar for at least 10 months of the year (80%) – yes (11 months, Jan-Nov)
  • I will read at least 7 books (80%) – yes (7 books)
  • I will catch at most 4 colds (60%) – no (6 colds)
  • Daniel will be potty-trained for the night by August (70%) – no (night-trained in October)

Calibration was not that great for lower-confidence predictions:

  • 60%: 0/2 😦
  • 70%: 1/2
  • 80%: 2/3

In particular, the targets for meditation and deep work were clearly too ambitious in retrospect. Some of the other predictions came out really close to the target (blog posts, books, and night training), so lower targets would have made sense for these as well.
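
The calibration tallies above can be recomputed by grouping the goals and predictions by stated confidence and counting how many came true. This is a minimal sketch: the `OUTCOMES_2023` list is transcribed from the outcomes listed in this section, and the function name is an illustrative assumption.

```python
from collections import defaultdict

# (description, stated confidence in percent, whether it came true)
OUTCOMES_2023 = [
    ("Meditate on at least 250 days", 80, False),
    ("At least 400 deep work hours", 60, False),
    ("Write at least 4 blog posts", 70, True),
    ("Avoid processed sugar for at least 10 months", 80, True),
    ("Read at least 7 books", 80, True),
    ("Catch at most 4 colds", 60, False),
    ("Daniel potty-trained for the night by August", 70, False),
]


def calibration_by_bucket(outcomes):
    """Return {confidence: (number correct, number of predictions)}."""
    buckets = defaultdict(lambda: [0, 0])
    for _description, confidence, came_true in outcomes:
        buckets[confidence][0] += int(came_true)
        buckets[confidence][1] += 1
    return {conf: tuple(counts) for conf, counts in sorted(buckets.items())}


if __name__ == "__main__":
    for confidence, (correct, total) in calibration_by_bucket(OUTCOMES_2023).items():
        print(f"{confidence}%: {correct}/{total}")
```

With only two or three predictions per bucket, the tallies are too noisy to say much about calibration in any single year; pooling buckets across several years of reviews would give a more meaningful picture.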

2024 goals and predictions

Goals

  • Meditate on at least 200 days (70%)
  • At most 10 external engagements that are not research-related (15 last year) (70%)
  • Write at least 3 blog posts (4 last year) (60%)

Predictions

  • I will avoid processed sugar for at least 9 months of the year (11 last year) (80%)
  • I will be able to do a chinup by end of year (2 last year) (60%)
  • I will read at least 4 books (7 last year) (70%)
  • We will move near a good school for Daniel (80%)

Past new year reviews: 2022-23, 2021-22, 2020-21, 2019-20, 2018-19, 2017-18, 2016-17, 2015-16, 2014-15.
