This breakthrough in robotic training relies on imitation learning, which greatly simplifies the programming of surgical robots. Because the robot learns from visual input instead of having each movement coded by hand, the approach brings robots closer to performing complex surgeries autonomously.
"It's really magical to have this model and all we do is feed it camera input and it can predict the robotic movements needed for surgery," said senior author Axel Krieger. "We believe this marks a significant step forward toward a new frontier in medical robotics."
The research, presented at the Conference on Robot Learning in Munich, is a collaboration between Johns Hopkins University and Stanford University. The team trained the da Vinci Surgical System, which is widely used but known for its limited precision, to perform tasks such as needle manipulation, tissue lifting, and suturing. Unlike traditional approaches, which require precise, step-by-step programming, this model uses machine learning similar to that behind ChatGPT. Instead of processing language, the model works with kinematic data, a representation that breaks robotic motion down into mathematical terms.
Researchers trained their model on hundreds of recordings from wrist cameras mounted on da Vinci robots during surgeries. These recordings, collected around the world for postoperative analysis, provide a vast dataset for imitation learning. With nearly 7,000 da Vinci units in use worldwide and more than 50,000 surgeons familiar with the system, there was ample video data to draw on.
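In its simplest form, imitation learning treats each recorded demonstration as a set of (observation, action) pairs and asks a model to reproduce the demonstrated action when it sees a similar observation. The sketch below illustrates only that idea and is not the team's model. It assumes a nearest-neighbor lookup in place of a neural network and short hypothetical feature vectors in place of camera images; every name and value in it is made up for illustration.

```python
from typing import List, Sequence, Tuple

Observation = Sequence[float]          # stand-in for features extracted from an image
Action = Tuple[float, float, float]    # stand-in for a small tool-tip motion (dx, dy, dz)


def squared_distance(a: Observation, b: Observation) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestNeighborPolicy:
    """Imitates demonstrations by copying the action of the closest observation."""

    def __init__(self, demonstrations: List[Tuple[Observation, Action]]):
        if not demonstrations:
            raise ValueError("at least one demonstration is required")
        self.demonstrations = list(demonstrations)

    def act(self, observation: Observation) -> Action:
        # Pick the demonstrated pair whose observation is nearest to the input.
        _, action = min(
            self.demonstrations,
            key=lambda pair: squared_distance(pair[0], observation),
        )
        return action


# Hypothetical demonstrations: (observation features, demonstrated action).
demos = [
    ((0.0, 0.0), (1.0, 0.0, 0.0)),
    ((1.0, 0.0), (0.0, 1.0, 0.0)),
    ((1.0, 1.0), (0.0, 0.0, -1.0)),
]

policy = NearestNeighborPolicy(demos)
predicted = policy.act((0.9, 0.1))  # closest to the second demonstration
print(predicted)
```

A learned model would generalize between demonstrations instead of copying the nearest one, but the input and output are the same: an observation goes in, a motion command comes out.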
The innovation lies in training the model to recognize and execute relative motions, avoiding inaccuracies associated with absolute actions. "All we need is image input and then this AI system finds the right action," explained lead author Ji Woong "Brian" Kim. With just a few hundred demonstrations, the model can learn and adapt to new environments.
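One way to see why relative motions can be more robust than absolute ones is sketched below. The article does not say what causes the inaccuracy in absolute actions, so the sketch assumes, purely for illustration, that the robot's reported tool-tip position is shifted by a constant unknown offset. Under that assumption the absolute positions are wrong, but the step-to-step displacements are unaffected. The function names and numbers are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def to_relative_actions(positions: List[Vec3]) -> List[Vec3]:
    """Convert absolute tool-tip positions into per-step displacements."""
    return [
        (b[0] - a[0], b[1] - a[1], b[2] - a[2])
        for a, b in zip(positions, positions[1:])
    ]


def add_offset(positions: List[Vec3], offset: Vec3) -> List[Vec3]:
    """Simulate a constant measurement bias on every reported position."""
    return [
        (p[0] + offset[0], p[1] + offset[1], p[2] + offset[2])
        for p in positions
    ]


# Hypothetical demonstrated path of the tool tip (arbitrary units).
true_path = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (1.0, 2.0, 0.5)]

# The same path as reported by a robot with an assumed constant bias.
biased_path = add_offset(true_path, (0.5, -0.25, 0.125))

rel_true = to_relative_actions(true_path)
rel_biased = to_relative_actions(biased_path)

print(biased_path != true_path)  # absolute positions disagree
print(rel_biased == rel_true)    # relative motions agree
```

The values are chosen so the floating-point arithmetic is exact; with real measurements the comparison would need a tolerance, and a bias that changes over time would cancel only approximately.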
The robot demonstrated proficiency in performing the selected surgical tasks, mirroring human skill levels. Remarkably, it adapted to unexpected situations, such as picking up a dropped needle autonomously. "Here the model is so good learning things we haven't taught it," Krieger noted.
The researchers envision rapid training for a range of surgical procedures, in contrast to the lengthy hand-coding previously required. "It's very limiting," Krieger said of the hand-coded approach. "What is new here is we only have to collect imitation learning of different procedures, and we can train a robot to learn it in a couple days. It allows us to accelerate to the goal of autonomy while reducing medical errors and achieving more accurate surgery."
The team is now working on expanding this method to train robots for complete surgeries. Contributors from Johns Hopkins included PhD student Samuel Schmidgall, Associate Research Engineer Anton Deguet, and Associate Professor Marin Kobilarov. The Stanford team included PhD student Tony Z. Zhao.