A production-ready multi-agent robot task planning system using Large Language Models (LLMs) and the AI2-THOR simulation environment.
- 4-Stage Pipeline: Task Decomposition → Allocation → Code Generation → AI2-THOR Execution
- Real-Time Execution: Direct AI2-THOR integration with real robot simulation
- LLM Integration: OpenAI and Anthropic API support
- Metrics Calculation: SR, TC, GCR, EXE, RU metrics from real execution
- Production Ready: No demo or mock code paths; 100% real execution in AI2-THOR
- Python 3.8+
- AI2-THOR
- OpenAI API key or Anthropic API key
- Required packages (see smartllm_3agent/requirements.txt)
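Before installing the framework, it can help to confirm that AI2-THOR itself runs on your machine. A minimal sketch, assuming the ai2thor package is already installed via pip (the scene name is only an example):

```python
# Quick check that AI2-THOR can launch a scene and respond.
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan6")
event = controller.step(action="Pass")  # no-op action; confirms the simulator responds
print("AI2-THOR responding:", event.metadata["lastActionSuccess"])
controller.stop()
```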
- Clone the repository:
git clone <repository-url>
cd SMART_LLM_LangGraph
- Install dependencies:
cd smartllm_3agent
pip install -r requirements.txt
- Set up environment variables:
cp env.example .env
# Edit .env with your API keys
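After editing .env, you can verify from Python that the keys are visible. A small sketch assuming the python-dotenv package (the framework itself may load its configuration differently):

```python
# Check that the API keys configured in .env are readable from Python.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
for key in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
    print(key, "is set" if os.getenv(key) else "is missing")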
Run a task in a floor plan:
cd smartllm_3agent
python3 run_llm_langgraph.py --floor-plan 6 --task-id 1
Supported floor plans:
- FloorPlan6, FloorPlan15, FloorPlan21
- FloorPlan201, FloorPlan209, FloorPlan303, FloorPlan414
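To run the pipeline across several floor plans or task IDs, the CLI shown above can be driven from a small script. A sketch; only the flags documented above are assumed, and the task IDs here are illustrative:

```python
# Sweep the documented CLI over a few floor plans and task IDs.
import subprocess

for floor_plan in (6, 15, 21):
    for task_id in (1, 2):
        subprocess.run(
            ["python3", "run_llm_langgraph.py",
             "--floor-plan", str(floor_plan),
             "--task-id", str(task_id)],
            check=True,  # stop the sweep if a run exits with an error
        )
```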
List available tasks for a floor plan:
python3 run_llm_langgraph.py --floor-plan 6 --list-tasks
The pipeline consists of four agents:
- Task Decomposition Agent: Breaks down complex tasks into subtasks
- Task Allocation Agent: Assigns subtasks to available robots
- Code Generation Agent: Generates executable Python code
- Execution Agent: Runs code in the AI2-THOR environment
- graph/nodes.py - All 4 agents in unified script
- graph/graph_batch.py - Complete 4-stage pipeline
- adapters/skills.py - AI2-THOR skill interface
- adapters/thor_runtime.py - AI2-THOR environment management
- run_llm_langgraph.py - Main execution script
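For orientation, the 4-stage pipeline in graph/graph_batch.py maps naturally onto a linear LangGraph of four nodes. The sketch below is illustrative only: node functions are stubbed and the state fields are assumptions, not the actual schema in graph/schemas.py.

```python
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class PlanState(TypedDict, total=False):
    task: str
    subtasks: List[str]
    allocation: dict
    code: str
    result: dict


# Each node would normally call an LLM or AI2-THOR; stubbed here to show the wiring.
def decompose(state: PlanState) -> dict:
    return {"subtasks": [state["task"]]}

def allocate(state: PlanState) -> dict:
    return {"allocation": {"robot_1": state["subtasks"]}}

def generate_code(state: PlanState) -> dict:
    return {"code": "# generated AI2-THOR skill calls"}

def execute(state: PlanState) -> dict:
    return {"result": {"success": True}}


builder = StateGraph(PlanState)
builder.add_node("decompose", decompose)
builder.add_node("allocate", allocate)
builder.add_node("generate_code", generate_code)
builder.add_node("execute", execute)
builder.set_entry_point("decompose")
builder.add_edge("decompose", "allocate")
builder.add_edge("allocate", "generate_code")
builder.add_edge("generate_code", "execute")
builder.add_edge("execute", END)

graph = builder.compile()
print(graph.invoke({"task": "put the apple in the fridge"}))
```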
The framework calculates real-time metrics from AI2-THOR execution:
- SR (Success Rate): Overall task completion success
- TC (Task Completion): Binary task completion status
- GCR (Goal Condition Rate): Percentage of goal conditions met
- EXE (Execution Rate): Successful execution percentage
- RU (Resource Utilization): Efficiency of resource usage
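The exact metric definitions live in the framework's execution code; as a rough illustration of their shape, here is a sketch (the formulas below are assumptions, not the implementation):

```python
def compute_metrics(goals_met: int, goals_total: int,
                    actions_ok: int, actions_attempted: int,
                    robots_used: int, robots_available: int) -> dict:
    """Illustrative metric shapes; the framework's own definitions may differ."""
    gcr = goals_met / max(goals_total, 1)          # GCR: fraction of goal conditions met
    exe = actions_ok / max(actions_attempted, 1)   # EXE: fraction of actions that succeeded
    tc = 1.0 if goals_met == goals_total else 0.0  # TC: binary task completion
    sr = tc                                        # SR: success of the run as a whole
    ru = robots_used / max(robots_available, 1)    # RU: share of available robots used
    return {"SR": sr, "TC": tc, "GCR": gcr, "EXE": exe, "RU": ru}

print(compute_metrics(3, 4, 18, 20, 2, 3))
```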
SMART_LLM_LangGraph/
├── smartllm_3agent/ # Main framework
│ ├── graph/ # LangGraph components
│ │ ├── nodes.py # All 4 agents
│ │ ├── graph_batch.py # 4-stage pipeline
│ │ ├── schemas.py # Data models
│ │ ├── edges.py # Routing logic
│ │ └── validators.py # Validation functions
│ ├── adapters/ # AI2-THOR integration
│ │ ├── skills.py # Robot skill interface
│ │ └── thor_runtime.py # Environment management
│ ├── prompts/ # LLM prompt templates
│ ├── scripts/ # Utility scripts
│ ├── data/ # Training data and examples
│ ├── logs/ # Execution logs
│ └── run_llm_langgraph.py # Main runner
├── final_test/ # Test data
├── provided/ # AI2-THOR integration scripts
└── README.md
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
LLM_MAX_TOKENS=4000
ENABLE_VISUAL_DISPLAY=true

Robots are automatically configured based on task requirements with full skill sets:
- GoToObject, PickupObject, PutObject
- OpenObject, CloseObject, SwitchOn, SwitchOff
- SliceObject, BreakObject, CleanObject, ThrowObject
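Each of these skills ultimately bottoms out in AI2-THOR controller actions. A standalone sketch of the underlying calls, independent of adapters/skills.py (the scene and object choice are illustrative):

```python
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan6")

# Pick the first apple reported in the scene metadata (kitchen scenes usually contain one).
objects = controller.last_event.metadata["objects"]
apple = next(o for o in objects if o["objectType"] == "Apple")

# PickupObject is a native AI2-THOR action; forceAction skips visibility/reach checks.
event = controller.step(action="PickupObject",
                        objectId=apple["objectId"],
                        forceAction=True)
print("PickupObject success:", event.metadata["lastActionSuccess"])
controller.stop()
```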
Run benchmark tests:
cd smartllm_3agent/scripts/testing
python3 run_benchmarks.py

Execution logs are saved to logs/FloorPlan{ID}_{timestamp}/:
- decomposed_plan.py - Task decomposition results
- allocated_plan.py - Task allocation results
- code_plan.py - Generated Python code
- log.txt - Complete execution log with metrics
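A small convenience sketch for inspecting the most recent run's log. It assumes the timestamped folder names sort lexicographically and that logs/ sits under smartllm_3agent/:

```python
# Print the log of the most recent execution run.
from pathlib import Path

log_root = Path("smartllm_3agent/logs")
runs = sorted(p for p in log_root.glob("FloorPlan*_*") if p.is_dir())
if runs:
    latest = runs[-1]
    print(f"Latest run: {latest.name}")
    print((latest / "log.txt").read_text())
else:
    print("No execution logs found under", log_root)
```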
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- AI2-THOR team for the simulation environment
- LangGraph for multi-agent orchestration
- OpenAI and Anthropic for LLM APIs