Thesis: "Hexapod autonomous control system based on auditory scene analysis: real-time sound source localization and keyword spotting for voice command recognition"
--- In progress
Diploma project completed at Warsaw University of Science and Technology as part of the Master of Science in Engineering programme in Computer Science.
This project aims to develop an autonomous control system for a hexapod walking robot, using auditory scene analysis as the primary modality for navigation and environmental interaction. The system integrates sound source localization, i.e. Direction of Arrival (DoA) estimation, and beamforming techniques via the ODAS framework, employing a circular microphone array for enhanced spatial precision. This enables the robot to accurately detect and characterize sound sources, allowing real-time responses to acoustic stimuli for dynamic, context-aware behavior.
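A DoA tracker of this kind typically reports each source as a unit direction vector. As a minimal sketch (the coordinate convention is an assumption, not taken from the thesis), converting such a vector into azimuth and elevation angles looks like this:

```python
import math

def doa_to_angles(x, y, z):
    """Convert a unit DoA direction vector into (azimuth, elevation)
    in degrees. Convention assumed here: azimuth measured counter-
    clockwise from the +x axis in the horizontal plane, elevation
    measured upward from that plane."""
    azimuth = math.degrees(math.atan2(y, x))
    # Clamp z to [-1, 1] to guard against small numerical drift.
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return azimuth, elevation

# A source directly to the robot's left (+y), at array height:
print(doa_to_angles(0.0, 1.0, 0.0))  # → (90.0, 0.0)
```

The same two angles are what the elevation/azimuth time charts in the demonstration video plot over time.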
A Keyword Spotting (KWS) module, powered by machine learning, is incorporated to recognize predefined voice commands, enabling effective human-robot interaction. The research focuses on developing the hardware and software infrastructure to seamlessly integrate acoustic processing with the robot's control system.
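Once the KWS model emits a label and a confidence score, the control layer has to turn that into a robot action. A minimal, hedged sketch of such a dispatch step is shown below; the keyword set, action names, and threshold are hypothetical placeholders, not the project's actual command vocabulary:

```python
# Hypothetical mapping from KWS labels to controller actions.
COMMANDS = {
    "forward": ("walk", {"direction_deg": 0.0}),
    "left":    ("turn", {"angle_deg": 90.0}),
    "right":   ("turn", {"angle_deg": -90.0}),
    "stop":    ("halt", {}),
}

def dispatch(keyword, confidence, threshold=0.8):
    """Accept a KWS detection only above a confidence threshold and
    return the (action, params) pair for the gait controller, or
    None for low-confidence or unknown keywords."""
    if confidence < threshold or keyword not in COMMANDS:
        return None
    return COMMANDS[keyword]

print(dispatch("left", 0.93))  # → ('turn', {'angle_deg': 90.0})
print(dispatch("left", 0.42))  # → None
```

Thresholding rejected detections here is what keeps spurious, low-confidence triggers from moving the robot.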
The project includes designing and building the robot's platform, encompassing both the mechanical structure and embedded systems. The hexapod's platform is engineered to support advanced auditory processing, ensuring optimal performance in real-world scenarios. This involves creating a robust mechanical framework for stable, agile locomotion and an embedded system architecture for real-time processing and decision-making.
The hardware is designed to accommodate the circular microphone array, ensuring precise sound capture, while the software facilitates seamless communication between auditory processing modules, the control system, and actuators. This comprehensive approach ensures the robot can perform complex tasks, such as navigating dynamic environments and responding accurately to auditory cues.
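Communication between the auditory modules and the control system can be illustrated with the kind of per-frame source data a tracker like ODAS streams out as JSON. The sketch below parses one such frame; the exact field names and the activity cutoff are assumptions for illustration:

```python
import json

def parse_tracked_sources(frame, min_activity=0.2):
    """Parse one JSON frame of tracked sound sources and keep only
    the active ones. A {"src": [{"id", "x", "y", "z", "activity"}]}
    layout is assumed here."""
    data = json.loads(frame)
    return [s for s in data.get("src", [])
            if s.get("activity", 0.0) > min_activity]

sample = ('{"timeStamp": 128, "src": ['
          '{"id": 1, "x": 0.0, "y": 1.0, "z": 0.0, "activity": 0.9},'
          '{"id": 0, "x": 0.0, "y": 0.0, "z": 0.0, "activity": 0.0}]}')
active = parse_tracked_sources(sample)
print([s["id"] for s in active])  # → [1]
```

In a live system the frames would arrive over a socket or pipe rather than a string, but the per-frame filtering step is the same.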
[Click the image below to watch the full demonstration video]
This video demonstrates an autonomous hexapod robot performing advanced auditory scene analysis in real-time. The complete ODAS (Open embeddeD Audition System) pipeline with beamforming is showcased, featuring:
- Real-time Direction of Arrival (DoA) estimation using a 6-microphone circular array
- Live GUI visualization showing sound source tracking and spatial mapping
- Terminal debug output displaying active sound sources with coordinates and activity levels
- Elevation and azimuth time charts showing temporal tracking of sound source positions
- System monitoring panel showing CPU usage, temperature, memory usage, and IP address
- Robot view - top-down view of the hexapod responding to acoustic stimuli
- LED feedback system indicating detected sound sources through visual cues
- Multi-source tracking - demonstrating the system's ability to track up to 4 simultaneous sound sources
- Automatic audio stream separation and recording of individual source audio files
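The LED feedback described above amounts to mapping a source's azimuth onto a discrete light on a ring. A minimal sketch, assuming a 12-LED ring with LED 0 at azimuth 0 and indices increasing counter-clockwise (ring size and orientation are assumptions, not the robot's actual layout):

```python
def azimuth_to_led(azimuth_deg, n_leds=12):
    """Map an azimuth in degrees to the index of the nearest LED on
    a ring of n_leds evenly spaced lights."""
    step = 360.0 / n_leds
    # Normalize into [0, 360) first so negative azimuths wrap around.
    return round((azimuth_deg % 360.0) / step) % n_leds

print(azimuth_to_led(90.0))   # → 3
print(azimuth_to_led(170.0))  # → 6
```

With one such call per tracked source, up to four LEDs can light simultaneously, matching the multi-source tracking shown in the video.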
This represents a complete autonomous control system where the hexapod can navigate and interact based purely on auditory cues, enabling sophisticated human-robot interaction through voice commands and environmental sound awareness.
Copyright (c) 2025 Krystian Głodek [email protected]. All rights reserved.