Whymo?

My journey from Google DeepMind to Waymo

Vincent Vanhoucke
3 min read · Aug 13, 2024
Gemini’s rendition of a self-driving car in SF

A few months ago, I was faced with a choice: some health issues needed my time and focus, while my work, leading Robotics research at Google DeepMind, demanded increased attention. I quickly realized that doing both in parallel would ultimately be a disservice to the team and to my own recovery. Stepping back wasn't a difficult decision: the team at Google DeepMind is absolutely wonderful, and my tech leads in particular were ready to step up and take the reins. And so they did, which opened up a tantalizing question for me: after 8 years of taking this effort from zero to what I still humbly consider the best research team on robotics and AI today, what could be next?

Building this team was my third 'career' at Alphabet, after working on speech recognition in Android and computer vision on the Brain Team. Through every transition, I've sought roles where I could bring 50% of the familiar with me and learn something new for the other 50%. It was time to take another step toward maximizing my impostor syndrome.

While going from medical appointment to physical therapy session, still unable to drive myself due to physical limitations, I got to spend a lot of this reflection time inside a Waymo car, clocking dozens of rides all across San Francisco. I can confidently say that I have never interacted with an AI product that is such a delight. Within the first couple of rides, as the novelty inevitably fades, you're left with a deep sense that this is how personal transport ought to be. The case for this use of AI being of net social benefit was immediately self-evident: it helped me directly in a time of need, and I genuinely don't want to drive a car ever again. I also don't need to hear another appeal to the inevitability of AGI to make the case for this being the right product/market fit at the right time.

It is not perfect of course, and I find grounds to file user feedback almost every other ride. Which is a good segue into the question: why work here? I made the case at CoRL last year that too much of robotics research was focused on reaching the least publishable unit of ~70% performance, and not enough on the 9X+% problem. I also argued that taking robotic systems to high levels of performance and safety was not about 'engineering details' or 'more data,' but required different breakthroughs. My first few weeks at Waymo have only reinforced this perspective.

In robotics, autonomous driving included, the long tail of difficulty is increasingly about common-sense reasoning rather than low-level planning and control. It is about a deep understanding of the situation and reasoning not just about geometry, but about the semantics of a scene, which in a multi-agent scenario includes other agents' actions. Much of it is also about scaling: every order of magnitude you grow your usage, and every new context you bring your autonomous system into, pushes you to the very edge of your generalization capabilities. In that sense, my favorite grand challenges in perception, semantic understanding and reasoning have not aged a bit. But what is different today is that with large multimodal models we have new tools at our disposal to address them.

And it's not just a theoretical exercise either. These approaches are making it into real deployment today and making a difference. Simply put, there are few places today that can legitimately claim to have connected AI, robotics, and real-world users, and can even start looking at this next generation of technological challenges. Perhaps even fewer are doing it with such a strong culture of safety at their core, one that lets each and every engineer work knowing they are backed by the best supporting processes every step of the way. Most 'AI robotics' startups and industry efforts are rightfully still focused on the data problem (I discussed some of the challenges here), but there is something liberating about exploring what happens when data collection is no longer the dominant bottleneck and you've already made contact with reality at some scale.

If this sounds like an ad, it kind of is: we're investing, expanding, and hiring for every ML function, across research, engineering, and leadership. Join us! (Opinions remain my own, not those of Waymo.)


Vincent Vanhoucke

I am a Distinguished Engineer at Waymo, working on Machine Learning and Robotics. Previously head of robotics research at Google DeepMind.