2 changes: 2 additions & 0 deletions .gitignore
@@ -72,6 +72,8 @@ Pipfile.lock
*.pem
*.crt
*.key
*.pub
*-key

# Testing
.pytest_cache/
162 changes: 162 additions & 0 deletions DATA_LOADING_IMPLEMENTATION.md
@@ -0,0 +1,162 @@
# Data Loading Implementation - Step 3

## Summary
Modified `examples/nanda_agent.py` to read `DATA_PATH` from the environment and load the attached data (CSV and JSON are supported). The agent now loads the data on startup and makes it available via `AGENT_CONFIG["data"]`.

## Changes Made

### 1. Added Pandas Import (Lines 28-34)
```python
# Try to import pandas for CSV support
try:
    import pandas as pd
    PANDAS_AVAILABLE = True
except ImportError:
    PANDAS_AVAILABLE = False
    print("⚠️ Warning: pandas library not available. CSV data loading will be disabled. Install with: pip install pandas")
```

### 2. Added DATA_PATH Environment Variable (Line 100)
```python
# Data path from environment
DATA_PATH = os.getenv("DATA_PATH", None)
```

### 3. Added load_attached_data() Function (Lines 106-145)
- Supports CSV files (requires pandas)
- Supports JSON files (built-in json module)
- Graceful error handling
- Informative logging

**Function signature:**
```python
def load_attached_data(path):
    """
    Load data from a file path. Supports CSV and JSON formats.

    Args:
        path: File path to load data from

    Returns:
        Loaded data (DataFrame for CSV, dict/list for JSON) or None if failed
    """
```

**Features** (a minimal sketch of the full function body follows the list):
- CSV: Returns pandas DataFrame, logs shape
- JSON: Returns dict/list, logs keys or item count
- Error handling: FileNotFoundError, general exceptions
- Format validation: Only .csv and .json supported
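
Consistent with the features above and with the expected log lines in the Testing section, a minimal sketch of the function body (assuming the `PANDAS_AVAILABLE` flag and `pd` alias from section 1) might look like this; the actual implementation in `examples/nanda_agent.py` may differ in details:

```python
# Minimal sketch of load_attached_data(); assumes PANDAS_AVAILABLE and `pd`
# from section 1 are defined at module level. Illustrative, not verbatim.
import json
import os


def load_attached_data(path):
    """Load data from a file path. Supports CSV and JSON formats."""
    try:
        if not os.path.exists(path):
            raise FileNotFoundError(f"File not found at {path}")

        if path.endswith(".csv"):
            if not PANDAS_AVAILABLE:
                print("❌ Cannot load CSV: pandas is not installed")
                return None
            data = pd.read_csv(path)
            print(f"✅ Loaded CSV data with shape: {data.shape}")
            return data

        if path.endswith(".json"):
            with open(path, "r") as f:
                data = json.load(f)
            print("✅ Loaded JSON data")
            if isinstance(data, dict):
                print(f"   Keys: {list(data.keys())}")
            else:
                print(f"   Items: {len(data)}")
            return data

        print(f"❌ Unsupported format: {path} (only .csv and .json are supported)")
        return None

    except Exception as e:
        # Covers FileNotFoundError as well as parsing and I/O errors
        print(f"❌ Failed to load data: {e}")
        return None
```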

### 4. Updated main() Function (Lines 245-257)
Added data loading logic:
```python
# Load attached data if DATA_PATH is provided
attached_data = None
if DATA_PATH:
    print(f"📂 Loading data from: {DATA_PATH}")
    attached_data = load_attached_data(DATA_PATH)
    if attached_data is not None:
        if PANDAS_AVAILABLE and isinstance(attached_data, pd.DataFrame):
            print(f"📊 Loaded data with shape: {attached_data.shape}")
        AGENT_CONFIG["data"] = attached_data
    else:
        print("⚠️ Data loading failed, continuing without attached data")
else:
    print("ℹ️ No DATA_PATH provided; starting agent without attached data")
```

### 5. Updated setup.py
Added `pandas` to requirements (line 21):
```python
"pandas" # For CSV data loading
```

### 6. Created Sample Test Data
Created `tests/sample_hr.csv` with sample HR data for testing.

## Data Access

After loading, the data is available in the following places (a usage sketch follows the list):
- `AGENT_CONFIG["data"]` - Contains the loaded DataFrame (CSV) or dict/list (JSON)
- Can be accessed in agent logic functions via the config parameter
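
For illustration, an agent logic function could read the attached data from the config roughly like this (hypothetical; the actual function names and signatures in `nanda_agent.py` may differ):

```python
# Hypothetical sketch of reading the attached data inside agent logic.
# Assumes AGENT_CONFIG, PANDAS_AVAILABLE, and the `pd` alias defined in
# examples/nanda_agent.py; the real agent logic signature may differ.
def describe_attached_data(config=None):
    config = config or AGENT_CONFIG
    data = config.get("data")
    if data is None:
        return "No dataset is attached to this agent."
    if PANDAS_AVAILABLE and isinstance(data, pd.DataFrame):
        rows, cols = data.shape
        return f"The attached dataset has {rows} rows and {cols} columns."
    return f"The attached JSON data has {len(data)} top-level entries."
```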

## Testing

### Test Without Data
```bash
python examples/nanda_agent.py
```
Expected output:
```
ℹ️ No DATA_PATH provided; starting agent without attached data
```

### Test With CSV Data
```bash
# Windows PowerShell
$env:DATA_PATH="tests/sample_hr.csv"; python examples/nanda_agent.py

# Linux/Mac
DATA_PATH=tests/sample_hr.csv python examples/nanda_agent.py
```
Expected output:
```
📂 Loading data from: tests/sample_hr.csv
✅ Loaded CSV data with shape: (10, 5)
📊 Loaded data with shape: (10, 5)
```

### Test With JSON Data
```bash
# Create a sample JSON file first
echo '{"key": "value", "numbers": [1, 2, 3]}' > tests/sample.json

# Then run
DATA_PATH=tests/sample.json python examples/nanda_agent.py
```
Expected output:
```
📂 Loading data from: tests/sample.json
✅ Loaded JSON data
Keys: ['key', 'numbers']
```

### Test With Invalid Path
```bash
DATA_PATH=/nonexistent/file.csv python examples/nanda_agent.py
```
Expected output:
```
📂 Loading data from: /nonexistent/file.csv
❌ Failed to load data: File not found at /nonexistent/file.csv
⚠️ Data loading failed, continuing without attached data
```

## Next Steps (Step 4 - Tomorrow)

After data loads correctly, expose it via MCP tools (a hypothetical sketch follows this list):
- Create MCP tool: `get_row_count`
- Description: returns number of rows in attached dataset
- Input: none
- Output: integer
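
A rough sketch of what that tool function could look like (the MCP registration mechanism itself is part of Step 4 and is not shown here):

```python
# Hypothetical sketch of the planned get_row_count tool. Assumes AGENT_CONFIG,
# PANDAS_AVAILABLE, and `pd` from examples/nanda_agent.py; the actual MCP
# wiring will be added in Step 4.
def get_row_count() -> int:
    """Return the number of rows in the attached dataset (0 if none)."""
    data = AGENT_CONFIG.get("data")
    if data is None:
        return 0
    if PANDAS_AVAILABLE and isinstance(data, pd.DataFrame):
        return len(data)  # number of DataFrame rows
    return len(data)      # JSON: number of top-level items
```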

## Files Modified

1. `NEST/examples/nanda_agent.py` - Added data loading functionality
2. `NEST/setup.py` - Added pandas to requirements
3. `NEST/tests/sample_hr.csv` - Created sample test data

## Verification Checklist

- ✅ DATA_PATH read from environment
- ✅ CSV loading with pandas support
- ✅ JSON loading with built-in json
- ✅ Error handling for missing files
- ✅ Error handling for unsupported formats
- ✅ Logging for data loading status
- ✅ Data added to AGENT_CONFIG["data"]
- ✅ Agent starts normally without data
- ✅ Agent starts normally with data
- ✅ Sample test data created

86 changes: 86 additions & 0 deletions DATA_PATH_HOOK_CHANGES.md
@@ -0,0 +1,86 @@
# DATA_PATH Hook Implementation - Step 2

## Summary
Added DATA_PATH parameter support to the NEST AWS deployment script. This allows Maria to optionally pass a data path that will be available to the agent as the `DATA_PATH` environment variable.

## Changes Made

### 1. Added DATA_PATH Parameter (Line 21)
```bash
DATA_PATH="${12:-}" # Optional data path
```

### 2. Updated Usage/Help Text (Lines 25, 28, 38-42)
- Added `[DATA_PATH]` to usage statement
- Added DATA_PATH example in example command
- Added DATA_PATH parameter description

### 3. Added DATA_PATH to Deployment Output (Line 54)
```bash
echo "Data Path: ${DATA_PATH:-"None (no data attached)"}"
```

### 4. Added DATA_PATH Logging in User-Data Script (Lines 169-174)
```bash
# Log data path status
if [ -n "$DATA_PATH" ]; then
    echo "DATA_PATH is set to: $DATA_PATH"
else
    echo "No DATA_PATH provided; starting agent without attached data."
fi
```

### 5. Exported DATA_PATH Environment Variable (Line 191)
```bash
export DATA_PATH='$DATA_PATH'
```

## Updated Defaults
- `REGISTRY_URL` default changed to: `http://registry.chat39.com:6900`
- `INSTANCE_TYPE` default changed to: `t3.small`

## Testing

### Quick Syntax Check
```bash
bash -n scripts/aws-single-agent-deployment.sh
```

### Simulate Script Call (No AWS)
```bash
# Test with DATA_PATH
bash scripts/aws-single-agent-deployment.sh \
"test-agent" \
"sk-ant-test" \
"Test Agent" \
"testing" \
"test specialist" \
"Test description" \
"testing" \
"" \
"6000" \
"us-east-1" \
"t3.small" \
"/data/hr.csv"

# Test without DATA_PATH (12th argument omitted; defaults to empty)
bash scripts/aws-single-agent-deployment.sh \
"test-agent" \
"sk-ant-test" \
"Test Agent" \
"testing" \
"test specialist" \
"Test description" \
"testing"
```

### Verify User-Data Script Generation
After running the script (even with invalid AWS credentials), check the generated `user_data_${AGENT_ID}.sh` file:
- Should contain `export DATA_PATH='...'` line
- Should contain the logging section for DATA_PATH

## Next Steps
- Agent code (`examples/nanda_agent.py`) can now read `DATA_PATH` from environment
- Data loading logic can be added to agent initialization
- MCP tools can be registered to access the data

111 changes: 111 additions & 0 deletions DATA_SETUP_INSTRUCTIONS.md
@@ -0,0 +1,111 @@
# Creating HR Dataset for NEST Ecosystem

## Quick Setup (Already Done ✅)

The HR dataset has been created at: `NEST/data/hr.csv`

## Manual Setup Instructions

If you need to recreate or modify the dataset:

### Steps:

1. **Navigate to NEST directory:**
```bash
cd NEST
```

2. **Create data directory (if it doesn't exist):**
```bash
# Windows PowerShell
if (-not (Test-Path "data")) { New-Item -ItemType Directory -Path "data" }

# Linux/Mac
mkdir -p data
```

3. **Create the HR dataset:**
```bash
# Windows PowerShell (opens an editor to create the file)
notepad data\hr.csv

# Linux/Mac
nano data/hr.csv
```

4. **Paste this EXACT content:**
```csv
employee_id,department,salary,years_at_company
1001,Engineering,145000,5
1002,Sales,92000,2
1003,HR,86000,4
1004,Marketing,78000,1
```

5. **Save the file:**
- **Windows**: Save in your editor
- **Linux/Mac**: CTRL+O, ENTER, CTRL+X

## Using the Dataset

### Local Testing

```bash
# From NEST directory
DATA_PATH=data/hr.csv python examples/nanda_agent.py
```

### AWS Deployment

```bash
bash scripts/aws-single-agent-deployment.sh \
"hr-agent" \
"sk-ant-xxxxx" \
"HR Assistant" \
"human resources" \
"HR specialist" \
"I help with HR questions and employee data" \
"HR,employee data,payroll,benefits" \
"http://registry.chat39.com:6900" \
"6000" \
"us-east-1" \
"t3.small" \
"data/hr.csv" # <-- 12th parameter: DATA_PATH
```

## Dataset Structure

The HR dataset contains the following (a quick inspection snippet follows the list):
- **4 employees** across different departments
- **Columns:**
  - `employee_id`: Unique identifier (1001-1004)
  - `department`: Engineering, Sales, HR, Marketing
  - `salary`: Annual salary (78k-145k)
  - `years_at_company`: Years of service (1-5)
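
To sanity-check the file locally (assuming pandas is installed and the working directory is `NEST/`), something like the following can be used:

```python
# Quick local check of data/hr.csv; assumes pandas is installed and the
# script is run from the NEST directory.
import pandas as pd

df = pd.read_csv("data/hr.csv")
print(df.shape)                    # expected: (4, 4)
print(df["department"].tolist())   # ['Engineering', 'Sales', 'HR', 'Marketing']
print(df["salary"].mean())         # average salary across the four employees
```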

## Purpose

This dataset allows the HR agent to:
- Demonstrate "having data"
- Respond to questions based on actual employee data
- Show how agents can use attached datasets

## File Location in NEST

```
NEST/
├── data/
│   ├── hr.csv                            ← HR dataset (ready to use)
│   └── README.md                         ← Data directory documentation
├── examples/
│   └── nanda_agent.py                    ← Agent that loads data
└── scripts/
    └── aws-single-agent-deployment.sh    ← Deployment with DATA_PATH
```

## Next Steps

1. ✅ Dataset created at `data/hr.csv`
2. ✅ Agent can load it via `DATA_PATH=data/hr.csv`
3. 🔜 Step 4: Expose data via MCP tools (get_row_count, query_data, etc.); a hypothetical `query_data` sketch follows below
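
For reference, here is one possible shape of such a `query_data` tool over the attached DataFrame (hypothetical; the actual tool interface will be defined in Step 4):

```python
# Hypothetical sketch of a query_data tool over the attached HR DataFrame.
# Assumes AGENT_CONFIG, PANDAS_AVAILABLE, and `pd` from examples/nanda_agent.py.
def query_data(department: str) -> list:
    """Return rows matching a department from the attached dataset."""
    data = AGENT_CONFIG.get("data")
    if data is None or not (PANDAS_AVAILABLE and isinstance(data, pd.DataFrame)):
        return []
    matches = data[data["department"] == department]
    return matches.to_dict(orient="records")
```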
