Abstract
Purpose: This paper presents a deep learning approach to recognizing and predicting surgical activity in robot-assisted minimally invasive surgery (RAMIS). Our primary objective is to deploy the developed model in a real-time surgical risk monitoring system for RAMIS.

Methods: We propose a modified Transformer model whose architecture comprises no positional encoding, 5 fully connected layers, 1 encoder, and 3 decoders. The model is designed to address 3 primary tasks in surgical robotics: gesture recognition, gesture prediction, and end-effector trajectory prediction. Notably, it operates solely on kinematic data obtained from the joints of the robotic arm.

Results: The model's performance was evaluated on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) dataset, achieving a highest accuracy of 94.4% for gesture recognition, 84.82% for gesture prediction, and a notably low distance error of 1.34 mm when predicting 1 s in advance. The computational time per iteration was minimal, recorded at only 4.2 ms.

Conclusion: The results demonstrate that our proposed model outperforms previous studies, highlighting its potential for integration into real-time systems. We believe our model could significantly advance surgical activity recognition and prediction within RAMIS and make a substantial and meaningful contribution to the healthcare sector.
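The abstract describes the architecture only at a high level (no positional encoding, 1 encoder, 3 decoders, 5 fully connected layers, kinematic input, gesture and trajectory outputs). The following is a minimal PyTorch sketch of such a layout, not the authors' implementation; the model dimension, number of attention heads, kinematic feature count, gesture vocabulary size, and head wiring are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class SurgicalActivityTransformer(nn.Module):
    """Illustrative Transformer without positional encoding: 1 encoder layer,
    3 decoder layers, and fully connected layers mapping kinematic inputs to
    gesture and end-effector trajectory outputs (sizes are assumptions)."""

    def __init__(self, kin_dim=14, d_model=128, n_heads=8, n_gestures=15):
        super().__init__()
        # Project raw kinematic features to the model dimension.
        self.input_proj = nn.Linear(kin_dim, d_model)
        # No positional encoding is applied, per the abstract's description.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            num_layers=1)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True),
            num_layers=3)
        # Fully connected layers; the exact sizes/placement are not specified
        # in the abstract, so this stack is purely illustrative.
        self.fc = nn.Sequential(
            nn.Linear(d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, d_model), nn.ReLU(),
        )
        self.gesture_head = nn.Linear(d_model, n_gestures)  # recognition / prediction
        self.trajectory_head = nn.Linear(d_model, 3)         # end-effector x, y, z

    def forward(self, src_kin, tgt_kin):
        # src_kin: (batch, src_len, kin_dim) observed kinematic window
        # tgt_kin: (batch, tgt_len, kin_dim) decoder queries (e.g. shifted targets)
        memory = self.encoder(self.input_proj(src_kin))
        out = self.decoder(self.input_proj(tgt_kin), memory)
        out = self.fc(out)
        return self.gesture_head(out), self.trajectory_head(out)


# Quick shape check with random kinematic data
model = SurgicalActivityTransformer()
src = torch.randn(2, 30, 14)
tgt = torch.randn(2, 10, 14)
gestures, traj = model(src, tgt)
print(gestures.shape, traj.shape)  # torch.Size([2, 10, 15]) torch.Size([2, 10, 3])
```

In this sketch the same encoder-decoder backbone serves all three tasks through separate output heads; whether the paper shares a backbone or trains task-specific decoders is not stated in the abstract.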
Original language | English |
---|---|
Article number | 106151 |
Pages (from-to) | 743-752 |
Number of pages | 10 |
Journal | International Journal of Computer Assisted Radiology and Surgery |
Volume | 20 |
Issue number | 4 |
DOIs | |
Publication status | Published - Apr 2025 |
All Science Journal Classification (ASJC) codes
- Surgery
- Biomedical Engineering
- Radiology, Nuclear Medicine and Imaging
- Computer Vision and Pattern Recognition
- Computer Science Applications
- Health Informatics
- Computer Graphics and Computer-Aided Design