Rafael Rafailov is an artificial intelligence researcher working primarily on decision making and reinforcement learning. He is currently at Thinking Machines, where he worked on Tinker, the company's first public release. Before that, he completed a Ph.D. in Computer Science at Stanford University and was a student researcher at Google DeepMind, where he co-authored influential works on embodied AI (the RT-X and OpenVLA series) and on RLHF post-training (Direct Preference Optimization and Generative Reward Models), which have been widely adopted in industry. These days he thinks a lot about meta-learning.