
Embodied Drone Agents

Welcome to the Embodied Drone Agents project! This repository contains code for creating embodied AI-driven drone agents using the AutoGen framework and MavSDK. Our goal is to enable the conversion of language prompts into parameterized function calls to control drone behavior programmatically.

Table of Contents

  • Introduction
  • System Information
  • Features
  • Skills & Prompting
  • Installation
  • Usage
  • Example Prompts
  • Contributing
  • To-Do
  • Acknowledgements

Introduction

Embodied Drone Agents is a project focused on developing embodied AI agents in drones that can understand and execute complex tasks based on natural language instructions. By leveraging AutoGen for language processing and MavSDK for drone control, we aim to create a seamless interface for drone operation.

The repository currently has two main directories: InitialAgent-D and Agent-Drone. InitialAgent-D was our first attempt at using a single AutoGen agent to achieve autonomous, programmatic control of the drone. The working project lives in the Agent-Drone directory; run it by activating a virtual environment, installing the requirements listed in requirements.txt, and running Agent-Drone/main.py.

Project demo video: Watch the video

System Information

This project relies on the following technologies:

  • Ubuntu 22.04 LTS
  • Windows Subsystem for Linux (when developing on a Windows computer)
  • PX4
  • Python 3.9+
  • MavSDK Python package
  • Gazebo PX4 shell (make px4_sitl gazebo)
  • Other Python package requirements

Read more about software dependencies in the Installation section below.

The hardware platform for this project is the Holybro X500 drone.

Features

  • Language to Action: Convert natural language prompts into parameterized function calls.
  • Hierarchical Agent Structure: A high-level planner LLM and an executor agent communicate with each other to achieve the desired complex behavior (see the sketch after this list).
  • Drone Control: Utilize MavSDK to control various aspects of drone behavior.
  • Extensibility: Easily add new commands and behaviors through modular design.
  • Voice capability: Users can input either voice or text queries into the agent system.
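
As a rough sketch of the hierarchical structure (assuming AutoGen's AssistantAgent/UserProxyAgent classes; the names, prompts, and llm_config below are illustrative, not the exact setup in Agent-Drone/agent_d/):

    from autogen import AssistantAgent, UserProxyAgent

    # Placeholder LLM configuration; the real project reads its own config.
    llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

    # High-level planner: decides which registered skills to call and with what parameters.
    planner = AssistantAgent(
        name="planner",
        system_message="Plan drone maneuvers as a sequence of registered skill calls.",
        llm_config=llm_config,
    )

    # Executor: no LLM of its own; it only runs the skill functions the planner proposes.
    executor = UserProxyAgent(
        name="executor",
        human_input_mode="NEVER",
        code_execution_config=False,
    )

    # A natural language query kicks off the planner/executor conversation.
    executor.initiate_chat(planner, message="Take off to 10 meters, then land.")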

Skills & Prompting

Agent-Drone/agent_d/skills/ contains the Python scripts that implement the skills the agent can use to move the drone; all of them use the MavSDK Python package. The planner LLM agent knows which skills can be executed, while the control agent is only able to execute them.

Accessible skills: takeoff, land, circle a point, fly to coordinates, hover at location, return to launch, & rotate to specific yaw

To-be-implemented skills: follow me, get objects, & get poses

Prompts.py, found in Agent-Drone/agent_d/, is crucial to the AutoGen agent system: it contains the prompts that define each agent's role, as well as a prompt for each skill outlining what it does. Furthermore, AutoGen understands the parameters and return types of external functions via Annotated type hints from Python's typing module; each of our skill functions is annotated accordingly.
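
For illustration, here is a minimal, hypothetical skill written in the same style; the function name, default port, and messages are assumptions rather than the repository's exact code, but it shows how typing.Annotated describes each parameter for AutoGen and how MavSDK issues the command:

    from typing import Annotated
    from mavsdk import System

    # Hypothetical skill in the style of Agent-Drone/agent_d/skills/ (illustrative only).
    async def takeoff(
        altitude: Annotated[float, "Target takeoff altitude in meters"],
    ) -> Annotated[str, "Status message describing the result"]:
        drone = System()
        await drone.connect(system_address="udp://:14540")  # PX4 SITL default MAVLink port
        await drone.action.set_takeoff_altitude(altitude)
        await drone.action.arm()
        await drone.action.takeoff()
        return f"Takeoff to {altitude} m commanded"

A function like this is then registered with both the planner LLM (so it can propose the call) and the executor agent (so the call actually runs).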

Installation

To get started with the Embodied Drone Agents project, first check which operating system you are on. This project is not compatible with Apple devices; you must use either a Windows or Ubuntu Linux machine. Then follow these steps:

  1. Clone the repository:

    git clone https://github.com/EmergenceAI/embodied-drone-agents.git
    cd embodied-drone-agents
  2. Install dependencies: Ensure you have Python 3.9+ installed (see System Information above). Then, install the required Python packages:

    pip install -r requirements.txt
  3. Setup MavSDK: Follow the MavSDK installation guide to install MavSDK.

  4. Setup for Windows:

    a. Setup Ubuntu Subsystem:

    • Open Windows Powershell as Admin
    • Run the following commands:
      dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
      dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
      wsl --set-default-version 2
      
    • Install Ubuntu 22.04 via the Microsoft Store
    • Run: wsl --install
    • In the Ubuntu terminal, run:
      sudo apt update
      sudo apt upgrade
      

    b. Setup PX4:

    cd
    git clone https://github.com/PX4/PX4-Autopilot.git --recursive
    bash ./PX4-Autopilot/Tools/setup/ubuntu.sh
    # Close terminal and reopen terminal
    cd PX4-Autopilot/
    make px4_sitl
    

    c. Setup MAVSDK:

    cd ..
    sudo apt update
    sudo apt install -y git cmake build-essential libboost-system-dev libeigen3-dev libopencv-dev
    git clone https://github.com/mavlink/MAVSDK.git
    cd MAVSDK
    pip install mavsdk
    cd ..
    

    d. Setup Gazebo:

    sudo apt install gazebo
    cd ~/PX4-Autopilot/
    

    Setup environment: First, check if setup_gazebo.bash exists by looking through the full PX4-Autopilot directory, as it may be in a different location. Check within both ~/PX4-Autopilot/Tools/ and ~/PX4-Autopilot/Tools/simulation/gazebo-classic/ for setup_gazebo.bash. If setup_gazebo.bash exists already, then note the full path where it exists and jump directly to step 'e'. If it doesn't exist, then create it using the following steps and skip step 'e':

    cd ~/PX4-Autopilot/Tools
    nano setup_gazebo.bash
    

    Use the following code within the bash file:

    #!/bin/bash
    
    # Set Gazebo plugin path
    export GAZEBO_PLUGIN_PATH=$GAZEBO_PLUGIN_PATH:~/PX4-Autopilot/build/px4_sitl_default/build_gazebo
    
    # Set Gazebo model path
    export GAZEBO_MODEL_PATH=$GAZEBO_MODEL_PATH:~/PX4-Autopilot/Tools/sitl_gazebo/models
    
    # Set library path
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
    
    # Set ROS package path
    export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:~/PX4-Autopilot:~/PX4-Autopilot/Tools/sitl_gazebo
    

    Save and exit (Ctrl+O, Enter, Ctrl+X)

    chmod +x setup_gazebo.bash
    nano ~/.bashrc
    

    Add these lines to the end of the file:

    source ~/PX4-Autopilot/Tools/setup_gazebo.bash ~/PX4-Autopilot ~/PX4-Autopilot/build/px4_sitl_default
    export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:~/PX4-Autopilot:~/PX4-Autopilot/Tools/sitl_gazebo
    

    Save and exit (Ctrl+O, Enter, Ctrl+X)

    source ~/.bashrc
    

    e. Choose the option that best matches your project structure and contains the correct directories for setup_gazebo.bash and sitl_gazebo on your machine. Then, add the appropriate lines to the end of your ~/.bashrc file:

    nano ~/.bashrc
    

    Option 1:

    source ~/PX4-Autopilot/Tools/simulation/gazebo-classic/setup_gazebo.bash ~/PX4-Autopilot ~/PX4-Autopilot/build/px4_sitl_default
    export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:~/PX4-Autopilot:~/PX4-Autopilot/Tools/simulation/gazebo-classic/sitl_gazebo-classic
    

    Option 2:

    source ~/PX4-Autopilot/Tools/setup_gazebo.bash ~/PX4-Autopilot ~/PX4-Autopilot/build/px4_sitl_default
    export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:~/PX4-Autopilot:~/PX4-Autopilot/Tools/sitl_gazebo
    

    Option 3:

    source ~/PX4-Autopilot/PathToBashFile/setup_gazebo.bash ~/PX4-Autopilot ~/PX4-Autopilot/build/px4_sitl_default
    export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:~/PX4-Autopilot:~/PX4-Autopilot/PathToGazeboFile/sitl_gazebo
    

    Save and exit (Ctrl+O, Enter, Ctrl+X)

    source ~/.bashrc
    

    f. Run Gazebo:

    make px4_sitl gazebo
    
  5. Setup for Linux (Ubuntu):

    a. Open an Ubuntu terminal and run:

    sudo apt update
    sudo apt upgrade
    

    b. Setup PX4:

    cd
    git clone https://github.com/PX4/PX4-Autopilot.git --recursive
    bash ./PX4-Autopilot/Tools/setup/ubuntu.sh
    # Close terminal and reopen terminal
    cd PX4-Autopilot/
    make px4_sitl
    

    c. Setup MAVSDK:

    cd ..
    sudo apt update
    sudo apt install -y git cmake build-essential libboost-system-dev libeigen3-dev libopencv-dev
    git clone https://github.com/mavlink/MAVSDK.git
    cd MAVSDK
    pip install mavsdk
    cd ..
    

    d. Setup Gazebo: Follow the same Gazebo setup steps as in the Windows instructions above.

    e. Run Gazebo:

    make px4_sitl gazebo
    
  6. Start mavsdk_server: Open a new terminal (making sure to keep the Gazebo terminal open) and run:

    mavsdk_server udp://:14540
    
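    With the server running, a Python client can attach to it instead of spawning its own. A minimal sketch (assuming mavsdk_server's default gRPC port 50051; this is illustrative, not the exact connection code in main.py):

    import asyncio
    from mavsdk import System

    async def wait_for_drone():
        # Attach to the already-running mavsdk_server (gRPC on port 50051 by default)
        # rather than letting the Python package start its own server instance.
        drone = System(mavsdk_server_address="localhost", port=50051)
        await drone.connect()

        # Wait until the server reports a vehicle on udp://:14540.
        async for state in drone.core.connection_state():
            if state.is_connected:
                print("Drone discovered")
                break

    asyncio.run(wait_for_drone())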

Usage

Windows

  1. Start Gazebo

    a. Open Powershell and click "run as administrator":

    b. Launch WSL:

    wsl
    

    c. Launch Gazebo:

    cd PX4-Autopilot/
    make px4_sitl gazebo
    
  2. Start MavSDK server

    a. Open Powershell and click "run as administrator":

    b. Launch WSL:

    wsl
    

    c. Start the server on UDP port 14540:

    mavsdk_server udp://:14540
    
  3. Run Repo Code

    a. Open Powershell and click "run as administrator":

    b. Launch WSL:

    wsl
    

    c. Navigate to your local clone of this repository (Windows drives are mounted under /mnt):

    cd /mnt/[path/to/this/github/repo/]
    

    d. Run main.py

    cd Agent-Drone/
    python3 main.py
    

When running this project, you should have three terminals open: one for the Gazebo/PX4 simulation, one for mavsdk_server, and one to run main.py. You should then be able to provide natural language queries to control the drone in the Gazebo simulator. Check out the Example Prompts section below for test prompts to query the agent system.

You can customize the behavior by modifying the configuration files and scripts provided in the config and scripts directories.

Example Prompts

After running main.py, you will be prompted to enter a query for the drone. Here are some example queries to enter.

  1. Takeoff and Land: Take off to a height of 10m, and then come back down

  2. Fly to a Specific Location: Fly to coordinates 10, 8, 6. Then, fly to the origin.

  3. Complex Multi-Step Tasks: Takeoff to 12 meters, go to 2, 6 at that same altitude, then go left, then in a circle, and then land back where you took off from.

Contributing

We welcome contributions to the Embodied Drone Agents project! If you have any improvements or new features to add, please follow these steps:

  1. Fork the repository.
  2. Create a new branch: git checkout -b feature-branch
  3. Commit your changes: git commit -m 'Add new feature'
  4. Push to the branch: git push origin feature-branch
  5. Create a pull request.

Please ensure your code follows the project's coding standards and includes appropriate tests.

To-Do

Voice enablement

Currently, the script voice_to_text.py in the InitialAgent-D directory records voice through a GUI and transcribes it into text. This voice-to-text script needs to be integrated with the Agent-Drone project to create an end-to-end voice-to-drone-movement system. The entire pipeline should look like this:

---
title: System Pipeline
---
flowchart TD;
    voice-->id1[text query];
    id1[text query]-->autogen;
    autogen-->planner & control;
    planner <-->  control;
    control --> id2[MavSDK skill calls];
    id2[MavSDK skill calls]-->Gazebo;
    id2[MavSDK skill calls]-->Hardware;
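
A hedged sketch of the glue code this integration would need is below; transcribe() and run_agents() are placeholder names for the functions voice_to_text.py and Agent-Drone/main.py would actually expose, not existing repo code:

    # Hypothetical end-to-end glue; both helpers are placeholders to be wired
    # to the real voice_to_text.py and Agent-Drone entry points.
    def transcribe() -> str:
        """Record audio via the GUI and return the transcribed text query."""
        raise NotImplementedError("wrap InitialAgent-D/voice_to_text.py here")

    def run_agents(query: str) -> None:
        """Hand the text query to the AutoGen planner/executor pair."""
        raise NotImplementedError("call into Agent-Drone/main.py here")

    if __name__ == "__main__":
        run_agents(transcribe())  # voice -> text -> agents -> MavSDK skill calls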

Autogen State Input

One of our primary goals was to find a way to feed the state of the Gazebo simulator into the agent system. Our skills include an example get_objects() skill that has yet to be implemented. We were unable to get the pygazebo and gz-transport libraries working, so we could not pull object data out of the Gazebo simulator. Currently, Agent-Drone/agent_d/skills/ contains the scripts get_objects.py and get_poses.sh; both were unsuccessful attempts to extract information about the objects in the simulator.

To implement this, we recommend getting the get_objects.py skill working so that it returns formatted data about the objects in the simulator, and then adding it as a skill in drone_control_agent.py. When adding it, be sure to register it for both the LLM and the executor.
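
As a hedged sketch of that registration step (using AutoGen's register_for_llm / register_for_execution decorators; the agent variables are placeholders for the ones built in drone_control_agent.py, and the Gazebo query itself is left as the unsolved stub):

    from typing import Annotated
    from autogen import AssistantAgent, UserProxyAgent

    # Placeholder agents; in the repo, use the planner and executor that
    # drone_control_agent.py already constructs.
    planner = AssistantAgent(
        name="planner",
        llm_config={"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]},
    )
    executor = UserProxyAgent(name="executor", human_input_mode="NEVER", code_execution_config=False)

    @executor.register_for_execution()
    @planner.register_for_llm(description="Return a formatted list of objects currently in the Gazebo world")
    async def get_objects() -> Annotated[str, "Formatted description of the simulator's objects"]:
        # The unsolved piece: query Gazebo (e.g. via pygazebo or gz-transport)
        # and format the result for the planner to reason over.
        raise NotImplementedError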

Drone Hardware

We are still having difficulty getting the drone hardware to pass pre-flight checks. The next step is to demonstrate basic programmatic control of the physical drone, after which the AutoGen system can be used on top of it.

Acknowledgements

We would like to thank the following individuals and organizations for their invaluable contributions as we developed this project.

  • Aniketh Arvind & Mihir Kulshreshtha, developers
  • Ashish Jagmohan, primary supervisor
  • Tamer Abuelsaad, Aditya Vempaty, Deepak Akkil, & Ravi Kokku, technical advisors
  • Emergence AI, AI agent company