29 November 2025

AI-Integrated Unmanned Ground Rover with Visual Perception and Autonomous Decision-Making (Empowered with AI | Vision | Emotions | Protection)

Authors: 1. Nellaturu Praveen, 2. Prof. S. Swarnalatha

Authors' Affiliations:

1. Master of Technology, 2. Professor (Ph.D.),

Department of Electronics and Communication Engineering, Sri Venkateswara University College of Engineering, Tirupati, Andhra Pradesh, India

DOI: 10.2015/IJIRMF/202511020     |     Paper ID: IJIRMF202511020


Abstract

This paper presents a next-generation humanoid smart rover that seamlessly integrates Artificial Intelligence (AI), Machine Learning (ML), real-time vision, and voice-driven interaction. It serves not only as a vigilant security companion capable of patrolling and surveillance but also as an emotionally aware entertainer that lifts the morale of soldiers at outposts. The system uses a dual-controller architecture: an ESP32 master module handles sensory intelligence, mobility, and voice synthesis, while an AI-Thinker ESP32-CAM slave module delivers live video streaming and pan/tilt camera control.
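The abstract does not specify how the ESP32 master and the ESP32-CAM slave exchange commands; a minimal sketch of one plausible scheme, framed ASCII commands with a simple XOR checksum, is shown below. The frame layout and command names (e.g. "PAN") are illustrative assumptions, not the paper's actual protocol.

```cpp
#include <cstdint>
#include <string>
#include <sstream>

// Hypothetical framing for master -> ESP32-CAM commands such as pan/tilt
// moves: "<CMD:ARG:CHK>", where CHK is an XOR checksum over "CMD:ARG".
static uint8_t checksum(const std::string& payload) {
    uint8_t c = 0;
    for (char ch : payload) c ^= static_cast<uint8_t>(ch);
    return c;
}

// Build a frame on the master side, e.g. frame("PAN", 45).
std::string frame(const std::string& cmd, int arg) {
    std::ostringstream body;
    body << cmd << ':' << arg;
    std::ostringstream out;
    out << '<' << body.str() << ':' << static_cast<int>(checksum(body.str())) << '>';
    return out.str();
}

// On the slave side the frame is verified before the servo or camera
// task is executed; corrupted frames are rejected.
bool verify(const std::string& framed) {
    if (framed.size() < 2 || framed.front() != '<' || framed.back() != '>') return false;
    std::string inner = framed.substr(1, framed.size() - 2);
    size_t last = inner.rfind(':');
    if (last == std::string::npos) return false;
    std::string body = inner.substr(0, last);
    int chk = std::stoi(inner.substr(last + 1));
    return chk == checksum(body);
}
```

In a real build the framed string would travel over the UART link between the two boards; the checksum keeps a garbled byte from triggering a spurious camera move.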

Enhancements include the AI-Thinker VC-02 voice assistant module, which enables natural voice commands and conversational interaction, and an on-board AI/ML module that adds smart patrolling and autonomous area learning. Designed for soldiers at remote outposts, the platform:

Patrols & Protects: Learns its environment, automatically follows safe routes, and raises alerts if something unusual happens.

Sees & Understands: Streams real-time video (via the ESP32-CAM) and can pan or tilt its camera to "look" around corners or track targets.

Speaks & Listens: Uses the AI-Thinker VC-02 module to recognize spoken commands ("Start patrol," "Show me the map," "Dance!") and to reply naturally in voice.

Learns & Adapts: Employs an AI/ML model right on the ESP32 master to improve its patrol routes over time and to trigger a playful dancing routine when duty is done.

Feels & Expresses: Displays simple emotive icons on a small OLED ("😊" when all clear, "⚠️" on alert) and modulates its voice tone to match the situation—reassuring in emergencies, upbeat when entertaining.
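The voice commands quoted above suggest a simple intent table on the master controller. A minimal sketch follows; the phrase strings and action names are illustrative stand-ins, since the VC-02 typically reports a fixed code per trained phrase rather than free text.

```cpp
#include <map>
#include <string>

// Rover behaviors triggered by recognized phrases (illustrative set).
enum class Action { StartPatrol, ShowMap, Dance, Unknown };

// Map a recognized phrase to an action; unrecognized input falls
// through to Unknown so the rover can ask the speaker to repeat.
Action dispatch(const std::string& phrase) {
    static const std::map<std::string, Action> intents = {
        {"Start patrol",    Action::StartPatrol},
        {"Show me the map", Action::ShowMap},
        {"Dance!",          Action::Dance},
    };
    auto it = intents.find(phrase);
    return it == intents.end() ? Action::Unknown : it->second;
}
```

Keeping the table data-driven makes it easy to add new phrases as the VC-02 vocabulary grows, without touching the dispatch logic.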

By merging these capabilities—secure patrols, environmental sensing (BMP280, MQ135), GPS location, AI-driven voice dialogue, and emotion-aware feedback—this rover becomes both a reliable guardian and a cheerful companion.
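The pairing of environmental sensing with emotion-aware feedback implies some rule that folds sensor readings into the status shown on the OLED. A minimal sketch of such a rule is below; the thresholds are placeholder assumptions, not calibrated values from the paper.

```cpp
// Illustrative fusion of BMP280 and MQ135 readings into the emotive
// status displayed on the OLED.
enum class Status { AllClear, Alert };

struct Readings {
    float pressure_hpa;  // BMP280 barometric pressure in hPa
    int   air_quality;   // MQ135 raw ADC value (higher = worse air)
};

Status assess(const Readings& r) {
    const int   kGasAlert       = 600;     // hypothetical MQ135 threshold
    const float kPressureLowHpa = 950.0f;  // hypothetical low-pressure bound
    if (r.air_quality > kGasAlert || r.pressure_hpa < kPressureLowHpa)
        return Status::Alert;    // OLED shows the warning icon
    return Status::AllClear;     // OLED shows the smiley icon
}
```

The same status value could also select the voice tone (reassuring versus upbeat) described in the abstract.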

Keywords: ESP32 microcontroller, ESP32-CAM, L293D motor shield, DC motor, servo motor, BMP280, GPS and GSM modules, AI-Thinker VC-02 voice assistant module.

Cite this article as: Nellaturu Praveen and S. Swarnalatha (2025); "AI-Integrated Unmanned Ground Rover with Visual Perception and Autonomous Decision-Making (Empowered with AI | Vision | Emotions | Protection)," International Journal for Innovative Research in Multidisciplinary Field, ISSN(O): 2455-0620, Vol. 11, Issue 11, pp. 121-131. Available at https://www.ijirmf.com/

  1. A. G. Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv:1704.04861, 2017.
  2. S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. Cambridge, MA: MIT Press, 2005.
  3. R. Siegwart and I. R. Nourbakhsh, Introduction to Autonomous Mobile Robots. Cambridge, MA: MIT Press, 2004.
  4. G. Hinton et al., "Deep Neural Networks for Acoustic Modeling in Speech Recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, Nov. 2012.
  5. U.S. Department of Defense, Summary of the 2018 National Defense Strategy, Washington, DC, 2018.
  6. A. Al-Fuqaha et al., "Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications," IEEE Communications Surveys & Tutorials, 2015.
  7. T. Fong, I. Nourbakhsh, and K. Dautenhahn, "A Survey of Socially Interactive Robots," Robotics and Autonomous Systems, 2003.
  8. P. Warden and D. Situnayake, TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers. O'Reilly Media, 2019.
  9. D. L. Hall and J. Llinas, "An Introduction to Multisensor Data Fusion," Proceedings of the IEEE, 1997.

