Add Brightdata Deep Web Researcher #192

base: main

Conversation
Walkthrough

Introduces a new Bright Data–powered, multi-agent deep-research workflow with CrewAI, a Streamlit chat UI to drive it, configuration scaffolding, and project packaging. Adds flow orchestration, MCP server setup, an async research entrypoint, documentation, an example output report, and environment/dependency configuration.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant UI as Streamlit App (app.py)
    participant R as run_deep_research (research.py)
    participant Flow as DeepResearchFlow (flow.py)
    participant MCP as Bright Data MCP Server
    participant LLMs as LLMs (search/specialist/response)
    User->>UI: Enter prompt
    UI->>UI: Validate BRIGHT_DATA_API_TOKEN
    alt Missing API key
        UI-->>User: Prompt to enter Bright Data API Key
    else Key present
        UI->>R: asyncio.run(run_deep_research(prompt))
        R->>Flow: Instantiate + state.query = prompt
        R->>Flow: kickoff_async()
        par Start
            Flow->>Flow: start_flow()
        and Collect URLs
            Flow->>MCP: Search/query web sources
            MCP-->>Flow: URL buckets (instagram/linkedin/youtube/x/web)
        and Specialists
            Flow->>LLMs: Per-platform extraction tasks
            LLMs-->>Flow: Specialist outputs (summaries/metadata)
        and Synthesis
            Flow->>LLMs: Final synthesis request
            LLMs-->>Flow: Markdown response
        end
        Flow-->>R: {"result": final_response}
        R-->>UI: final_response
        UI-->>User: Render assistant message
    end
    note over UI,Flow: Errors caught and returned as error string
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 16
🧹 Nitpick comments (8)
brightdata_deep_researcher/output.md (1)

`78-84`: Tighten citations and fix minor style nits.

- Add publication dates next to sources.
- For YouTube titles, keep exact casing but add channel + upload date.
- Consider archiving links for durability.

Apply:

```diff
-- 9to5Mac: ["iPhone 17 release date: Here’s when to expect Apple’s big launch"](https://9to5mac.com/2025/09/04/iphone-17-release-date-heres-when-to-expect-apples-big-launch/)
+- 9to5Mac (2025-09-04): ["iPhone 17 release date: Here’s when to expect Apple’s big launch"](https://9to5mac.com/2025/09/04/iphone-17-release-date-heres-when-to-expect-apples-big-launch/)
```

brightdata_deep_researcher/.env.example (1)
`3-3`: Add a trailing newline. Conventional POSIX text files end with a newline.

Apply:

```diff
-GEMINI_API_KEY=
+GEMINI_API_KEY=
+
```

brightdata_deep_researcher/research.py (1)
`6-12`: Avoid a blanket `except Exception` without logging. At least log the error for diagnostics; keep the user-friendly return.

```diff
+import logging
@@
-    except Exception as e:
-        return f"An error occurred: {e!s}"
+    except Exception as e:
+        logging.exception("run_deep_research failed")
+        return f"An error occurred: {e!s}"
```

brightdata_deep_researcher/README.md (1)
`25-33`: Add an install step using `uv` and `pyproject`. Recommend an explicit install so imports work outside the package dir.

```diff
-uv sync
-source .venv/bin/activate
+uv sync && source .venv/bin/activate
+uv pip install -e .
```

brightdata_deep_researcher/app.py (2)

`81-84`: Avoid blind `except Exception` and use explicit conversion. Tighten the exception handling and adopt `!s` as Ruff suggests.

```diff
-    except Exception as e:
-        response = f"An error occurred: {str(e)}"
+    except Exception as e:
+        response = f"An error occurred: {e!s}"
```
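The next nitpick concerns calling `asyncio.run` while an event loop is already running; the constraint it guards against can be reproduced with the standard library alone. A minimal sketch (the coroutine names are hypothetical and unrelated to the app):

```python
import asyncio

async def inner():
    return "ok"

async def outer():
    # Calling asyncio.run() while a loop is already running is illegal.
    coro = inner()
    try:
        asyncio.run(coro)
        return "no error"
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return "RuntimeError"

result_outside = asyncio.run(inner())  # no loop running yet: fine
result_inside = asyncio.run(outer())   # nested run raises RuntimeError
```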
`81-81`: Async in Streamlit: guard for a running loop (optional). In some embeddings, an event loop may already be running; `asyncio.run` would raise. This guard keeps it robust.

```diff
-    result = asyncio.run(run_deep_research(prompt))
+    try:
+        loop = asyncio.get_running_loop()
+    except RuntimeError:
+        # No loop: safe to run
+        result = asyncio.run(run_deep_research(prompt))
+    else:
+        # Loop exists: schedule coroutine and wait
+        fut = asyncio.run_coroutine_threadsafe(run_deep_research(prompt), loop)
+        result = fut.result()
```

brightdata_deep_researcher/flow.py (2)
`51-53`: Verify LLM provider credentials across OpenAI and Gemini. You’re mixing `openai/*` and `gemini/*`. Ensure keys/config are wired for both in your deployment env, or unify to one provider to reduce setup friction.

I can parameterize these via env and default to a single provider; want a patch?
`280-280`: Remove the unnecessary f-string.

```diff
-print(f"FINAL RESULT")
+print("FINAL RESULT")
```
📜 Review details

- Configuration used: CodeRabbit UI
- Review profile: CHILL
- Plan: Pro

⛔ Files ignored due to path filters (1)

- `brightdata_deep_researcher/uv.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (7)

- `brightdata_deep_researcher/.env.example` (1 hunks)
- `brightdata_deep_researcher/README.md` (1 hunks)
- `brightdata_deep_researcher/app.py` (1 hunks)
- `brightdata_deep_researcher/flow.py` (1 hunks)
- `brightdata_deep_researcher/output.md` (1 hunks)
- `brightdata_deep_researcher/pyproject.toml` (1 hunks)
- `brightdata_deep_researcher/research.py` (1 hunks)
🧰 Additional context used

🧬 Code graph analysis (3)

brightdata_deep_researcher/research.py (2)
- brightdata_deep_researcher/flow.py (1): `DeepResearchFlow` (50-270)
- web-browsing-agent/flow.py (1): `main` (147-155)

brightdata_deep_researcher/app.py (1)
- brightdata_deep_researcher/research.py (1): `run_deep_research` (4-12)

brightdata_deep_researcher/flow.py (3)
- brand-monitoring/brand_monitoring_flow/src/brand_monitoring_flow/main.py (3): `BrandMonitoringFlow` (46-260), `scrape_data_and_analyse` (68-260), `x_analysis` (152-180)
- web-browsing-agent/flow.py (1): `BrowserAutomationFlow` (45-143)
- Multi-Agent-deep-researcher-mcp-windows-linux/agents.py (1): `create_research_crew` (58-126)
🪛 Ruff (0.12.2)

brightdata_deep_researcher/research.py
- 11-11: Do not catch blind exception: `Exception` (BLE001)
- 12-12: Use explicit conversion flag; replace with conversion flag (RUF010)

brightdata_deep_researcher/app.py
- 83-83: Do not catch blind exception: `Exception` (BLE001)
- 84-84: Use explicit conversion flag; replace with conversion flag (RUF010)

brightdata_deep_researcher/flow.py
- 116-116: Do not catch blind exception: `Exception` (BLE001)
- 280-280: f-string without any placeholders; remove extraneous `f` prefix (F541)
🪛 LanguageTool

brightdata_deep_researcher/output.md
- [grammar] ~19: There might be a mistake here. Context: "...ment Event:** Tuesday, September 9, 2025 - Pre-orders Begin: Friday, September 12..." (QB_NEW_EN)
- [style] ~20: Some style guides suggest that commas should set off the year in a month-day-year date. Context: "...e-orders Begin:** Friday, September 12, 2025 - Market Release: Friday, Septembe..." (MISSING_COMMA_AFTER_YEAR)
- [grammar] ~20: There might be a mistake here. Context: "...ders Begin:** Friday, September 12, 2025 - Market Release: Friday, September 19, ..." (QB_NEW_EN)
- [grammar] ~78: There might be a mistake here. Context: "...lowing sources: - YouTube Analysis: - Matt Talks Tech: ["iPhone 17 Pro Max — 8..." (QB_NEW_EN)
- [grammar] ~79: There might be a mistake here. Context: "...hone 17 Pro Max — 8 NEW LEAKS REVEALED!"](https://www.youtube.com/watch?v=ovRP80RLgWQ) - Matt Talks Tech: ["iPhone 17 — 10 HUGE L..." (QB_NEW_EN)
- [grammar] ~80: There might be a mistake here. Context: "...Phone 17 — 10 HUGE Leaks Before Launch!"](https://www.youtube.com/watch?v=e9Nab1zYBF0) - TT Technology: ["Apple iPhone 17 Air - I..." (QB_NEW_EN)
- [grammar] ~81: There might be a mistake here. Context: "...: "Apple iPhone 17 Air - Its Official!" - Web Reporting: - 9to5Mac: ["iPhone ..." (QB_NEW_EN)
- [grammar] ~82: There might be a mistake here. Context: "...watch?v=Cc9OcxBJPoo) - Web Reporting: - 9to5Mac: ["iPhone 17 release date: Here’..." (QB_NEW_EN)
- [grammar] ~84: There might be a mistake here. Context: "...tegic positioning and market challenges." (QB_NEW_EN)

brightdata_deep_researcher/README.md
- [grammar] ~7: There might be a mistake here. Context: "...data.com/ai/mcp-server) (Web MCP server) - CrewAI (Agent..." (QB_NEW_EN)
- [grammar] ~8: There might be a mistake here. Context: "...tps://docs.crewai.com/) (Agentic design) - Streamlit to wra..." (QB_NEW_EN)
- [style] ~42: Consider a different adjective to strengthen your wording. Context: "...andle the multi-agent orchestration for deep web research using Brightdata's Web MCP..." (DEEP_PROFOUND)
- [grammar] ~54: There might be a mistake here. Context: "... ## 📬 Stay Updated with Our Newsletter! Get a FREE Data Science eBook 📖 with ..." (QB_NEW_EN)
- [grammar] ~56: There might be a mistake here. Context: "... 📖 with 150+ essential lessons in Data Science when you subscribe to our newsletter! S..." (QB_NEW_EN)
- [style] ~62: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 1750 characters long). Context: "...ontribution Contributions are welcome! Feel free to fork this repository and s..." (EN_EXCESSIVE_EXCLAMATION)
- [style] ~62: Consider using a less common alternative to make your writing sound more unique and professional. Context: "...tribution Contributions are welcome! Feel free to fork this repository and submit pull re..." (FEEL_FREE_TO_STYLE_ME)
🔇 Additional comments (5)

brightdata_deep_researcher/output.md (1)

`60-61`: Flag the speculative vendor-integration claim. “Apple Intelligence with Google Gemini” reads as speculation. Either cite a concrete source or rephrase it as a rumor.

Would you like me to update this line to “rumored” and attach a specific cited report?
brightdata_deep_researcher/.env.example (1)

`1-3`: GEMINI_API_KEY is the required env var for CrewAI’s Gemini provider. No update needed.

brightdata_deep_researcher/README.md (1)
`17-23`: Clarify the `.env` location. You say “root directory,” but this package lives under `brightdata_deep_researcher/`. Specify where the app loads `.env` from to avoid confusion.

Would you like me to update the README to “create .env at the repository root; at runtime we call load_dotenv() from the working directory” (or move the load to an explicit path)?
brightdata_deep_researcher/app.py (1)

`44-61`: Nice, clean Streamlit header and reset UX.

brightdata_deep_researcher/flow.py (1)
`236-268`: Good synthesis stage and markdown output.
```text
OPENAI_API_KEY=
BRIGHT_DATA_API_TOKEN=
GEMINI_API_KEY=
```
💡 Verification agent

🧩 Analysis chain

Add guidance and ensure `.env` is ignored. Add comments explaining each key and verify `.env` is git-ignored to prevent leaks.

Proposed:

```diff
+# OpenAI API key for GPT models (used by CrewAI LLM provider "openai/*")
 OPENAI_API_KEY=
-BRIGHT_DATA_API_TOKEN=
+# Bright Data Web MCP server token
+BRIGHT_DATA_API_TOKEN=
+# Gemini API key (used by CrewAI LLM provider "gemini/*")
 GEMINI_API_KEY=
```

Check ignore:

🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
{ rg -nP '^\s*\.env\s*$' -g .gitignore -S || true; } | sed 's/^/FOUND: /'
echo "If not found above, please add '.env' to .gitignore."
```

Length of output: 823
Add explanatory comments to `.env.example` and ensure `.env` is ignored:

- In `brightdata_deep_researcher/.env.example`, prepend each key with a comment:

  ```text
  # OpenAI API key for GPT models (used by CrewAI LLM provider "openai/*")
  OPENAI_API_KEY=
  # Bright Data Web MCP server token
  BRIGHT_DATA_API_TOKEN=
  # Gemini API key (used by CrewAI LLM provider "gemini/*")
  GEMINI_API_KEY=
  ```

- Add the line `.env` to `brightdata_deep_researcher/.gitignore` to prevent committing secrets.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents

In brightdata_deep_researcher/.env.example lines 1-3, add short explanatory comments above each variable describing what value should be placed (e.g., OpenAI API key, Bright Data API token, Gemini API key) so contributors know what to set; and in brightdata_deep_researcher/.gitignore add a line containing exactly ".env" to ensure local environment secrets are not committed.
```python
import streamlit as st

from research import run_deep_research
```
🛠️ Refactor suggestion

Fix the import to use the package/relative path. This will fail when `app.py` lives inside the `brightdata_deep_researcher` package. Import from the package to avoid shadowing/path issues.

```diff
-from research import run_deep_research
+from .research import run_deep_research
+# or, if running as a script outside the package
+# from brightdata_deep_researcher.research import run_deep_research
```
🤖 Prompt for AI Agents

In brightdata_deep_researcher/app.py around line 6, the top-level import "from research import run_deep_research" can break when this file is inside the brightdata_deep_researcher package; replace it with a package-aware import such as a relative import ("from .research import run_deep_research") or an absolute package import ("from brightdata_deep_researcher.research import run_deep_research") so Python resolves the module from the package rather than the global path.
```python
brightdata_api_key = st.text_input("Enter your Brightdata API Key", type="password")
if brightdata_api_key:
    st.session_state.brightdata_api_key = brightdata_api_key
    # Update the environment variable
    os.environ["BRIGHT_DATA_API_TOKEN"] = brightdata_api_key
    st.success("API Key stored successfully!")
```
💡 Verification agent

🧩 Analysis chain

Env var timing bug: token set after module import. `flow.py` reads `BRIGHT_DATA_API_TOKEN` at import time; setting it here won’t affect already-imported module-level config. I’m proposing a fix in `flow.py` to lazily read the token per call. No change needed here once that lands.

Run the app after applying the `flow.py` fix; verify the sidebar token enables successful searches on first run.

🏁 Script executed:

```bash
#!/bin/bash
# Locate flow.py in the repository
fd flow.py
# Search for direct reads of BRIGHT_DATA_API_TOKEN
rg -n "BRIGHT_DATA_API_TOKEN"
# Search for any os.getenv or environ[] usage patterns
rg -n "os\.environ"
rg -n "getenv"
```

Length of output: 13570

🏁 Script executed:

```bash
#!/bin/bash
sed -n '1,200p' brightdata_deep_researcher/flow.py
```

Length of output: 8819

Lazy-load `BRIGHT_DATA_API_TOKEN` in `flow.py`. The global `server_configurations` is initialized at import with `os.getenv("BRIGHT_DATA_API_TOKEN")` (brightdata_deep_researcher/flow.py), so when the app later sets the env var in `app.py`, MCP still uses the old (empty) token on first run. Move the `env={"API_TOKEN": …}` assignment into the call site (e.g. inside `DeepResearchFlow.collect_urls` and `dispatch_to_specialists`) so it reads `os.getenv("BRIGHT_DATA_API_TOKEN")` at runtime.
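The bug reduces to a general Python pattern; a tiny stdlib demonstration with a hypothetical `DEMO_TOKEN` variable (no relation to the real config):

```python
import os

os.environ.pop("DEMO_TOKEN", None)

# Import-time capture: the value is frozen when this line executes.
CONFIG = {"token": os.getenv("DEMO_TOKEN")}

def current_token():
    # Call-time read: sees whatever the environment holds right now.
    return os.getenv("DEMO_TOKEN")

os.environ["DEMO_TOKEN"] = "set-after-import"

stale = CONFIG["token"]   # still None: captured before the var was set
fresh = current_token()   # reads the up-to-date value
```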
```python
server_configurations = StdioServerParameters(
    command="npx",
    args=["@brightdata/mcp"],
    env={"API_TOKEN": os.getenv("BRIGHT_DATA_API_TOKEN"), "PRO_MODE": "true"},
)
```
Read the API token lazily; don’t capture the environment at import time. Module-level capture makes the sidebar token ineffective. Build `StdioServerParameters` at call time and fail fast if the token is missing.

```diff
-# ---------- MCP Server Configurations ----------
-server_configurations = StdioServerParameters(
-    command="npx",
-    args=["@brightdata/mcp"],
-    env={"API_TOKEN": os.getenv("BRIGHT_DATA_API_TOKEN"), "PRO_MODE": "true"},
-)
+# ---------- MCP Server Configurations ----------
+def server_params() -> StdioServerParameters:
+    token = os.getenv("BRIGHT_DATA_API_TOKEN")
+    if not token:
+        raise RuntimeError("BRIGHT_DATA_API_TOKEN is not set")
+    return StdioServerParameters(
+        command="npx",
+        args=["@brightdata/mcp"],
+        env={"API_TOKEN": token, "PRO_MODE": "true"},
+    )
```
🤖 Prompt for AI Agents

In brightdata_deep_researcher/flow.py around lines 42 to 46, the code currently captures the API token from the environment at module import by creating server_configurations with env={"API_TOKEN": os.getenv("BRIGHT_DATA_API_TOKEN"), ...}; change this to construct and return the StdioServerParameters at call time (inside the function that starts the server) so the env var is read lazily, and add a fast-fail check that raises a clear error if BRIGHT_DATA_API_TOKEN is missing or empty before creating StdioServerParameters.
```python
    def collect_urls(self) -> Dict[str, Any]:
        """Search web for user query and return URLBuckets object."""
        try:
            with MCPServerAdapter(server_configurations) as mcp_tools:
                search_agent = Agent(
                    role="Multiplatform Web Discovery Specialist",
                    goal=(
                        "Your objective is to identify and return a well-organized JSON object containing only public, directly relevant links for a given user query. "
                        "The links should be grouped by platform: Instagram, LinkedIn, YouTube, X (formerly Twitter), and the open web."
                    ),
                    backstory=(
                        "You are an expert web researcher skilled in using advanced search operators and platform-specific techniques. "
                        "You rigorously verify that every link is public, accessible, and highly relevant to the query. "
                        "You never include duplicates or irrelevant results, and you never fabricate information. "
                        "If no suitable links are found for a platform, you return an empty list for that platform. "
                        "Your output is always precise, clean, and strictly follows the required schema."
                    ),
                    tools=[mcp_tools["search_engine"]],
                    llm=self.search_llm,
                )
```
🛠️ Refactor suggestion

Ensure the tool adapter uses a fresh env; keep the return contract on error. Use the lazy `server_params()` and return an empty `URLBuckets` on failure so downstream steps don’t hit a KeyError.

```diff
-        try:
-            with MCPServerAdapter(server_configurations) as mcp_tools:
+        try:
+            with MCPServerAdapter(server_params()) as mcp_tools:
                 ...
-            out: URLBuckets = crew.kickoff()
-            return {"urls_buckets": out.model_dump(mode="raw")}
+            out: URLBuckets = crew.kickoff()
+            return {"urls_buckets": out.model_dump(mode="raw")}
         except Exception as e:
-            return {"result": f"Error processing web search task: {e}"}
+            logging.exception("collect_urls failed")
+            empty = URLBuckets().model_dump(mode="raw")
+            return {"urls_buckets": empty, "error": f"{e!s}"}
```

Also applies to: 112-118
🤖 Prompt for AI Agents

In brightdata_deep_researcher/flow.py around lines 62-82 (and similarly 112-118), the MCPServerAdapter is instantiated with a static server_configurations which can stale env; instead call the lazy server_params() inside the try so the adapter gets fresh environment values, and wrap adapter creation/agent setup in the try/except so any failure returns an empty URLBuckets instance (preserve the function's return contract to avoid downstream KeyError). Ensure the except block logs the error and returns URLBuckets with empty lists for all platforms rather than raising.
```toml
dependencies = [
    "crewai-tools[mcp]>=0.69.0",
    "mcp>=1.13.1",
    "ollama>=0.5.3",
    "python-dotenv>=1.1.1",
    "streamlit>=1.49.1",
]
```
Missing core dependency: add `crewai`. `Agent`, `Task`, and `Crew` come from the `crewai` package, not `crewai-tools`. Without `crewai`, runtime will fail when importing the flow. Please add it.

Apply:

```diff
 dependencies = [
+    "crewai>=0.75.0",
     "crewai-tools[mcp]>=0.69.0",
     "mcp>=1.13.1",
     "ollama>=0.5.3",
     "python-dotenv>=1.1.1",
     "streamlit>=1.49.1",
 ]
```
🤖 Prompt for AI Agents

In brightdata_deep_researcher/pyproject.toml around lines 7 to 13, the pyproject is missing the core runtime dependency "crewai" (Agent, Task, Crew are provided by crewai, not crewai-tools); update the dependencies list to include crewai (e.g., add "crewai" or "crewai>=<minimum_version>" alongside the existing entries) so imports succeed at runtime.
```bash
python flow.py
```
🛠️ Refactor suggestion

Fix the run commands to work from the repo root. These files are inside the package folder; the current commands will fail from the root.

```diff
-python flow.py
+python -m brightdata_deep_researcher.flow
```
🤖 Prompt for AI Agents

In brightdata_deep_researcher/README.md around lines 38-41, the example run command uses "python flow.py" which fails when run from the repository root because the file lives inside the package folder; update the README to show a repo-root-safe invocation such as the module form (python -m brightdata_deep_researcher.flow) or the explicit relative path (python ./brightdata_deep_researcher/flow.py), and adjust any other run examples similarly so they work when executed from the project root.
```bash
streamlit run app.py
```
🛠️ Refactor suggestion

Fix the Streamlit run path. Make the path explicit for a reliable DX.

```diff
-streamlit run app.py
+streamlit run brightdata_deep_researcher/app.py
```
🤖 Prompt for AI Agents

In brightdata_deep_researcher/README.md around lines 48 to 50, the Streamlit run command is not path-explicit; replace the simple "streamlit run app.py" with an explicit path to the app file (for example "streamlit run brightdata_deep_researcher/app.py" or the project-relative path that matches the repo layout) and add a short note about the expected working directory if necessary so the command works reliably from the repository root.
```diff
@@ -0,0 +1,12 @@
+from flow import DeepResearchFlow
```
Fix the import to be package-safe. An absolute `from flow import ...` will break when installed as a package. Use a relative import.

```diff
-from flow import DeepResearchFlow
+from .flow import DeepResearchFlow
```
🤖 Prompt for AI Agents

In brightdata_deep_researcher/research.py around line 1, the file uses an absolute import "from flow import DeepResearchFlow" which will break when the project is installed as a package; change it to a package-safe relative import by replacing the absolute import with "from .flow import DeepResearchFlow" (use a single leading dot to import from the same package) and run the module/package to verify the import resolves.
```python
async def run_deep_research(prompt):
    """Run the deep research flow and return the result."""
    try:
        flow = DeepResearchFlow()
        flow.state.query = prompt
        result = await flow.kickoff_async()
        return result["result"]
    except Exception as e:
        return f"An error occurred: {str(e)}"
```
🛠️ Refactor suggestion

Add type hints and resilient result extraction. Annotate the API and guard against a missing `result` key to avoid a `KeyError`.

```diff
-async def run_deep_research(prompt):
-    """Run the deep research flow and return the result."""
+from typing import Any, Dict
+
+async def run_deep_research(prompt: str) -> str:
+    """Run the deep research flow and return the markdown result."""
     try:
         flow = DeepResearchFlow()
         flow.state.query = prompt
-        result = await flow.kickoff_async()
-        return result["result"]
+        payload: Dict[str, Any] = await flow.kickoff_async()
+        return str(payload.get("result", "No result returned"))
     except Exception as e:
-        return f"An error occurred: {str(e)}"
+        return f"An error occurred: {e!s}"
```
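A quick check of the failure mode the suggested `payload.get(...)` pattern guards against (the payload shape here is hypothetical):

```python
payload = {"error": "flow failed"}  # a result dict missing the "result" key

try:
    _ = payload["result"]           # direct indexing: raises KeyError
    raised = False
except KeyError:
    raised = True

safe = payload.get("result", "No result returned")  # degrades cleanly
```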
🤖 Prompt for AI Agents

In brightdata_deep_researcher/research.py around lines 4 to 12, add proper type hints to the async function signature (e.g., annotate prompt as str and the return type as Union[str, Any] or Optional[Any]) and change the result extraction to be resilient to a missing "result" key by using result.get("result") (or checking 'result' in result) and handling the None case explicitly (return a clear error string or a fallback value). Also keep the try/except but ensure the exception return type matches the annotated return type.