A Vision-Based Assistive Robotic System with Real-Time Gesture Recognition for Communication Support in Speech-Impaired Cancer Patients: A Pilot Feasibility Study
DOI:
https://doi.org/10.63561/jca.v3i1.1199
Keywords:
Cross-Domain Multi-Task Learning, Gesture Recognition, Human–Robot Interaction, Assistive Robotics, Histopathology Image Classification
Abstract
Assistive robotics in healthcare frequently lacks seamless integration between human-robot interaction (HRI) and diagnostic support. This challenge is especially pronounced for speech-impaired cancer patients (e.g., those with head and neck, oral, or laryngeal cancer, or post-treatment voice loss), who often face significant barriers in non-verbal communication and in controlling medical interfaces. This pilot feasibility study presents a vision-based assistive robotic system that combines gesture-driven HRI with preliminary histopathology-based cancer detection in a closed-loop architecture. I propose a Cross-Domain Adaptive Multi-Task Network (CDAM-Net) that mitigates negative transfer between two heterogeneous visual domains (natural hand gestures and microscopic tissue textures) through domain-adaptive feature modulation and dynamic uncertainty-based task weighting. The system integrates AI inference with an Arduino-controlled 4-DOF robotic arm and real-time clinician notification via WebSocket. In a controlled laboratory evaluation (n = 10 volunteers, 100 trials), the framework achieved 85% gesture top-1 accuracy (F1 = 0.83), 94% cancer classification accuracy (ROC-AUC = 0.98), 90% actuation success, and sub-second end-to-end latency. Adaptive parameter sharing reduced the number of trainable weights by approximately 15% compared with separate single-task models while maintaining performance. These results demonstrate the technical feasibility of an efficiency-aware, cross-domain adaptive assistive robotic framework for simulated tele-oncology support and establish a foundation for future clinical validation.
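The abstract does not give the exact form of the dynamic uncertainty-based task weighting, so the following is only a minimal sketch of one common formulation, in which each task loss is scaled by a learned log-variance and a regularizing term discourages the weights from collapsing to zero. The function names, the two-task setup, and the numeric values are hypothetical and chosen for illustration; they are not taken from CDAM-Net.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses using per-task log-variances.

    Assumed form (a common simplification, not confirmed by the source):
        total = sum_i( exp(-s_i) * L_i + s_i ),  with s_i = log(sigma_i^2)
    A larger s_i down-weights task i but is penalized by the additive s_i term.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("one log-variance per task is required")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def effective_weights(log_vars):
    """Return the multiplicative weight exp(-s_i) applied to each task loss."""
    return [math.exp(-s) for s in log_vars]


# Hypothetical loss values for the two tasks named in the abstract.
gesture_loss = 0.9
histology_loss = 0.3

# With both log-variances at zero, each task has weight 1 and the
# combined loss is the plain sum of the task losses.
equal_total = uncertainty_weighted_loss([gesture_loss, histology_loss], [0.0, 0.0])

# Raising the gesture task's log-variance lowers its weight.
weights = effective_weights([1.0, 0.0])
```

In a trainable setting the log-variances would be parameters updated by the optimizer alongside the network weights, which is what would make the weighting dynamic; here they are fixed numbers so the arithmetic stays easy to follow.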


