diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 0000000..a1f9be9 --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,25 @@ +# License + +DeepCritical is licensed under the MIT License. + +## MIT License + +Copyright (c) 2024 DeepCritical Team + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/docs/api/orchestrators.md b/docs/api/orchestrators.md index 0e3c663..9e3d22d 100644 --- a/docs/api/orchestrators.md +++ b/docs/api/orchestrators.md @@ -137,4 +137,4 @@ Runs Magentic orchestration. 
## See Also - [Architecture - Orchestrators](../architecture/orchestrators.md) - Architecture overview -- [Graph Orchestration](../architecture/graph-orchestration.md) - Graph execution details +- [Graph Orchestration](../architecture/graph_orchestration.md) - Graph execution details diff --git a/mkdocs.yml b/mkdocs.yml index 1d3d31b..cd3cee7 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -59,6 +59,8 @@ plugins: - codeinclude - git-revision-date-localized: enable_creation_date: true + enable_git_follow: false # Disable follow to avoid timestamp ordering issues + strict: false # Bypass warnings about timestamp ordering issues type: timeago # Shows "2 days ago" format fallback_to_build_date: true - minify: diff --git a/site/api/agents/index.html b/site/api/agents/index.html index 5f6acf4..920ce21 100644 --- a/site/api/agents/index.html +++ b/site/api/agents/index.html @@ -1 +1 @@ - Agents API Reference - The DETERMINATOR

Agents API Reference

This page documents the API for DeepCritical agents.

KnowledgeGapAgent

Module: src.agents.knowledge_gap

Purpose: Evaluates research state and identifies knowledge gaps.

Methods

evaluate

Evaluates research completeness and identifies outstanding knowledge gaps.

Parameters: - query: Research query string - background_context: Background context for the query (default: "") - conversation_history: History of actions, findings, and thoughts as string (default: "") - iteration: Current iteration number (default: 0) - time_elapsed_minutes: Elapsed time in minutes (default: 0.0) - max_time_minutes: Maximum time limit in minutes (default: 10)

Returns: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps

ToolSelectorAgent

Module: src.agents.tool_selector

Purpose: Selects appropriate tools for addressing knowledge gaps.

Methods

select_tools

Selects tools for addressing a knowledge gap.

Parameters: - gap: The knowledge gap to address - query: Research query string - background_context: Optional background context (default: "") - conversation_history: History of actions, findings, and thoughts as string (default: "")

Returns: AgentSelectionPlan with list of AgentTask objects.

WriterAgent

Module: src.agents.writer

Purpose: Generates final reports from research findings.

Methods

write_report

Generates a markdown report from research findings.

Parameters: - query: Research query string - findings: Research findings to include in report - output_length: Optional description of desired output length (default: "") - output_instructions: Optional additional instructions for report generation (default: "")

Returns: Markdown string with numbered citations.

LongWriterAgent

Module: src.agents.long_writer

Purpose: Long-form report generation with section-by-section writing.

Methods

write_next_section

Writes the next section of a long-form report.

Parameters: - original_query: The original research query - report_draft: Current report draft as string (all sections written so far) - next_section_title: Title of the section to write - next_section_draft: Draft content for the next section

Returns: LongWriterOutput with formatted section and references.

write_report

Generates final report from draft.

Parameters: - query: Research query string - report_title: Title of the report - report_draft: Complete report draft

Returns: Final markdown report string.

ProofreaderAgent

Module: src.agents.proofreader

Purpose: Proofreads and polishes report drafts.

Methods

proofread

Proofreads and polishes a report draft.

Parameters: - query: Research query string - report_title: Title of the report - report_draft: Report draft to proofread

Returns: Polished markdown string.

ThinkingAgent

Module: src.agents.thinking

Purpose: Generates observations from conversation history.

Methods

generate_observations

Generates observations from conversation history.

Parameters: - query: Research query string - background_context: Optional background context (default: "") - conversation_history: History of actions, findings, and thoughts as string (default: "") - iteration: Current iteration number (default: 1)

Returns: Observation string.

InputParserAgent

Module: src.agents.input_parser

Purpose: Parses and improves user queries, detects research mode.

Methods

parse

Parses and improves a user query.

Parameters: - query: Original query string

Returns: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: "iterative" or "deep" - key_entities: List of key entities - research_questions: List of research questions

Factory Functions

All agents have factory functions in src.agent_factory.agents:

Parameters: - model: Optional Pydantic AI model. If None, uses get_model() from settings. - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)

Returns: Agent instance.

See Also

\ No newline at end of file + Agents API Reference - The DETERMINATOR

Agents API Reference

This page documents the API for DeepCritical agents.

KnowledgeGapAgent

Module: src.agents.knowledge_gap

Purpose: Evaluates research state and identifies knowledge gaps.

Methods

evaluate

Evaluates research completeness and identifies outstanding knowledge gaps.

Parameters: - query: Research query string - background_context: Background context for the query (default: "") - conversation_history: History of actions, findings, and thoughts as string (default: "") - iteration: Current iteration number (default: 0) - time_elapsed_minutes: Elapsed time in minutes (default: 0.0) - max_time_minutes: Maximum time limit in minutes (default: 10)

Returns: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps

ToolSelectorAgent

Module: src.agents.tool_selector

Purpose: Selects appropriate tools for addressing knowledge gaps.

Methods

select_tools

Selects tools for addressing a knowledge gap.

Parameters: - gap: The knowledge gap to address - query: Research query string - background_context: Optional background context (default: "") - conversation_history: History of actions, findings, and thoughts as string (default: "")

Returns: AgentSelectionPlan with list of AgentTask objects.

WriterAgent

Module: src.agents.writer

Purpose: Generates final reports from research findings.

Methods

write_report

Generates a markdown report from research findings.

Parameters: - query: Research query string - findings: Research findings to include in report - output_length: Optional description of desired output length (default: "") - output_instructions: Optional additional instructions for report generation (default: "")

Returns: Markdown string with numbered citations.

LongWriterAgent

Module: src.agents.long_writer

Purpose: Long-form report generation with section-by-section writing.

Methods

write_next_section

Writes the next section of a long-form report.

Parameters: - original_query: The original research query - report_draft: Current report draft as string (all sections written so far) - next_section_title: Title of the section to write - next_section_draft: Draft content for the next section

Returns: LongWriterOutput with formatted section and references.

write_report

Generates final report from draft.

Parameters: - query: Research query string - report_title: Title of the report - report_draft: Complete report draft

Returns: Final markdown report string.

ProofreaderAgent

Module: src.agents.proofreader

Purpose: Proofreads and polishes report drafts.

Methods

proofread

Proofreads and polishes a report draft.

Parameters: - query: Research query string - report_title: Title of the report - report_draft: Report draft to proofread

Returns: Polished markdown string.

ThinkingAgent

Module: src.agents.thinking

Purpose: Generates observations from conversation history.

Methods

generate_observations

Generates observations from conversation history.

Parameters: - query: Research query string - background_context: Optional background context (default: "") - conversation_history: History of actions, findings, and thoughts as string (default: "") - iteration: Current iteration number (default: 1)

Returns: Observation string.

InputParserAgent

Module: src.agents.input_parser

Purpose: Parses and improves user queries, detects research mode.

Methods

parse

Parses and improves a user query.

Parameters: - query: Original query string

Returns: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: "iterative" or "deep" - key_entities: List of key entities - research_questions: List of research questions

Factory Functions

All agents have factory functions in src.agent_factory.agents:

Parameters: - model: Optional Pydantic AI model. If None, uses get_model() from settings. - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)

Returns: Agent instance.

See Also

\ No newline at end of file diff --git a/site/api/models/index.html b/site/api/models/index.html index 863fc63..2373645 100644 --- a/site/api/models/index.html +++ b/site/api/models/index.html @@ -1 +1 @@ - Models API Reference - The DETERMINATOR

Models API Reference

This page documents the Pydantic models used throughout DeepCritical.

Evidence

Module: src.utils.models

Purpose: Represents evidence from search results.

Fields: - citation: Citation information (title, URL, date, authors) - content: Evidence text content - relevance: Relevance score (0.0-1.0) - metadata: Additional metadata dictionary

Citation

Module: src.utils.models

Purpose: Citation information for evidence.

Fields: - source: Source name (e.g., "pubmed", "clinicaltrials", "europepmc", "web", "rag") - title: Article/trial title - url: Source URL - date: Publication date (YYYY-MM-DD or "Unknown") - authors: List of authors (optional)

KnowledgeGapOutput

Module: src.utils.models

Purpose: Output from knowledge gap evaluation.

Fields: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps

AgentSelectionPlan

Module: src.utils.models

Purpose: Plan for tool/agent selection.

Fields: - tasks: List of agent tasks to execute

AgentTask

Module: src.utils.models

Purpose: Individual agent task.

Fields: - gap: The knowledge gap being addressed (optional) - agent: Name of agent to use - query: The specific query for the agent - entity_website: The website of the entity being researched, if known (optional)

ReportDraft

Module: src.utils.models

Purpose: Draft structure for long-form reports.

Fields: - sections: List of report sections

ReportSection

Module: src.utils.models

Purpose: Individual section in a report draft.

Fields: - section_title: The title of the section - section_content: The content of the section

ParsedQuery

Module: src.utils.models

Purpose: Parsed and improved query.

Fields: - original_query: Original query string - improved_query: Refined query string - research_mode: Research mode ("iterative" or "deep") - key_entities: List of key entities - research_questions: List of research questions

Conversation

Module: src.utils.models

Purpose: Conversation history with iterations.

Fields: - history: List of iteration data

IterationData

Module: src.utils.models

Purpose: Data for a single iteration.

Fields: - gap: The gap addressed in the iteration - tool_calls: The tool calls made - findings: The findings collected from tool calls - thought: The thinking done to reflect on the success of the iteration and next steps

AgentEvent

Module: src.utils.models

Purpose: Event emitted during research execution.

Fields: - type: Event type (e.g., "started", "search_complete", "complete") - iteration: Iteration number (optional) - data: Event data dictionary

BudgetStatus

Module: src.utils.models

Purpose: Current budget status.

Fields: - tokens_used: Total tokens used - tokens_limit: Token budget limit - time_elapsed_seconds: Time elapsed in seconds - time_limit_seconds: Time budget limit (default: 600.0 seconds / 10 minutes) - iterations: Number of iterations completed - iterations_limit: Maximum iterations (default: 10) - iteration_tokens: Tokens used per iteration (iteration number -> token count)

See Also

\ No newline at end of file + Models API Reference - The DETERMINATOR

Models API Reference

This page documents the Pydantic models used throughout DeepCritical.

Evidence

Module: src.utils.models

Purpose: Represents evidence from search results.

Fields: - citation: Citation information (title, URL, date, authors) - content: Evidence text content - relevance: Relevance score (0.0-1.0) - metadata: Additional metadata dictionary

Citation

Module: src.utils.models

Purpose: Citation information for evidence.

Fields: - source: Source name (e.g., "pubmed", "clinicaltrials", "europepmc", "web", "rag") - title: Article/trial title - url: Source URL - date: Publication date (YYYY-MM-DD or "Unknown") - authors: List of authors (optional)

KnowledgeGapOutput

Module: src.utils.models

Purpose: Output from knowledge gap evaluation.

Fields: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps
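A minimal sketch of constructing this output, using a dataclass stand-in for the real Pydantic model in src.utils.models:

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGapOutput:
    """Dataclass stand-in mirroring the documented fields."""

    research_complete: bool
    outstanding_gaps: list[str] = field(default_factory=list)


# An evaluation that found one remaining gap:
result = KnowledgeGapOutput(
    research_complete=False,
    outstanding_gaps=["No dosing data for pediatric patients"],
)
```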

AgentSelectionPlan

Module: src.utils.models

Purpose: Plan for tool/agent selection.

Fields: - tasks: List of agent tasks to execute

AgentTask

Module: src.utils.models

Purpose: Individual agent task.

Fields: - gap: The knowledge gap being addressed (optional) - agent: Name of agent to use - query: The specific query for the agent - entity_website: The website of the entity being researched, if known (optional)

ReportDraft

Module: src.utils.models

Purpose: Draft structure for long-form reports.

Fields: - sections: List of report sections

ReportSection

Module: src.utils.models

Purpose: Individual section in a report draft.

Fields: - section_title: The title of the section - section_content: The content of the section

ParsedQuery

Module: src.utils.models

Purpose: Parsed and improved query.

Fields: - original_query: Original query string - improved_query: Refined query string - research_mode: Research mode ("iterative" or "deep") - key_entities: List of key entities - research_questions: List of research questions

Conversation

Module: src.utils.models

Purpose: Conversation history with iterations.

Fields: - history: List of iteration data

IterationData

Module: src.utils.models

Purpose: Data for a single iteration.

Fields: - gap: The gap addressed in the iteration - tool_calls: The tool calls made - findings: The findings collected from tool calls - thought: The thinking done to reflect on the success of the iteration and next steps

AgentEvent

Module: src.utils.models

Purpose: Event emitted during research execution.

Fields: - type: Event type (e.g., "started", "search_complete", "complete") - iteration: Iteration number (optional) - data: Event data dictionary

BudgetStatus

Module: src.utils.models

Purpose: Current budget status.

Fields: - tokens_used: Total tokens used - tokens_limit: Token budget limit - time_elapsed_seconds: Time elapsed in seconds - time_limit_seconds: Time budget limit (default: 600.0 seconds / 10 minutes) - iterations: Number of iterations completed - iterations_limit: Maximum iterations (default: 10) - iteration_tokens: Tokens used per iteration (iteration number -> token count)
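A sketch of how these fields combine into a budget check, using a dataclass stand-in for the real Pydantic model. The `exhausted()` helper and the token limit value are illustrative assumptions, not part of the documented API; only the 600-second and 10-iteration defaults come from the docs above:

```python
from dataclasses import dataclass, field


@dataclass
class BudgetStatus:
    """Dataclass stand-in mirroring the documented fields and defaults."""

    tokens_used: int = 0
    tokens_limit: int = 100_000          # assumed limit; not specified above
    time_elapsed_seconds: float = 0.0
    time_limit_seconds: float = 600.0    # documented default: 10 minutes
    iterations: int = 0
    iterations_limit: int = 10           # documented default
    iteration_tokens: dict[int, int] = field(default_factory=dict)

    def exhausted(self) -> bool:
        # Illustrative helper: the budget is spent when any limit is hit.
        return (
            self.tokens_used >= self.tokens_limit
            or self.time_elapsed_seconds >= self.time_limit_seconds
            or self.iterations >= self.iterations_limit
        )


status = BudgetStatus(tokens_used=5_000, time_elapsed_seconds=120.0, iterations=3)
```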

See Also

\ No newline at end of file diff --git a/site/api/orchestrators/index.html b/site/api/orchestrators/index.html index 2398678..756118e 100644 --- a/site/api/orchestrators/index.html +++ b/site/api/orchestrators/index.html @@ -1 +1 @@ - Orchestrators API Reference - The DETERMINATOR

Orchestrators API Reference

This page documents the API for DeepCritical orchestrators.

IterativeResearchFlow

Module: src.orchestrator.research_flow

Purpose: Single-loop research with search-judge-synthesize cycles.

Methods

run

Runs iterative research flow.

Parameters: - query: Research query string - background_context: Background context (default: "") - output_length: Optional description of desired output length (default: "") - output_instructions: Optional additional instructions for report generation (default: "")

Returns: Final report string.

Note: max_iterations, max_time_minutes, and token_budget are constructor parameters, not run() parameters.

DeepResearchFlow

Module: src.orchestrator.research_flow

Purpose: Multi-section parallel research with planning and synthesis.

Methods

run

Runs deep research flow.

Parameters: - query: Research query string

Returns: Final report string.

Note: max_iterations_per_section, max_time_minutes, and token_budget are constructor parameters, not run() parameters.

GraphOrchestrator

Module: src.orchestrator.graph_orchestrator

Purpose: Graph-based execution using Pydantic AI agents as nodes.

Methods

run

Runs graph-based research orchestration.

Parameters: - query: Research query string

Yields: AgentEvent objects during graph execution.

Note: research_mode and use_graph are constructor parameters, not run() parameters.

Orchestrator Factory

Module: src.orchestrator_factory

Purpose: Factory for creating orchestrators.

Functions

create_orchestrator

Creates an orchestrator instance.

Parameters: - search_handler: Search handler protocol implementation (optional, required for simple mode) - judge_handler: Judge handler protocol implementation (optional, required for simple mode) - config: Configuration object (optional) - mode: Orchestrator mode ("simple", "advanced", "magentic", "iterative", "deep", "auto", or None for auto-detect) - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)

Returns: Orchestrator instance.

Raises: - ValueError: If requirements not met

Modes: - "simple": Legacy orchestrator - "advanced" or "magentic": Magentic orchestrator (requires OpenAI API key) - None: Auto-detect based on API key availability

MagenticOrchestrator

Module: src.orchestrator_magentic

Purpose: Multi-agent coordination using Microsoft Agent Framework.

Methods

run

Runs Magentic orchestration.

Parameters: - query: Research query string

Yields: AgentEvent objects converted from Magentic events.

Note: max_rounds and max_stalls are constructor parameters, not run() parameters.

Requirements: - agent-framework-core package - OpenAI API key

See Also

\ No newline at end of file + Orchestrators API Reference - The DETERMINATOR

Orchestrators API Reference

This page documents the API for DeepCritical orchestrators.

IterativeResearchFlow

Module: src.orchestrator.research_flow

Purpose: Single-loop research with search-judge-synthesize cycles.

Methods

run

Runs iterative research flow.

Parameters: - query: Research query string - background_context: Background context (default: "") - output_length: Optional description of desired output length (default: "") - output_instructions: Optional additional instructions for report generation (default: "")

Returns: Final report string.

Note: max_iterations, max_time_minutes, and token_budget are constructor parameters, not run() parameters.

DeepResearchFlow

Module: src.orchestrator.research_flow

Purpose: Multi-section parallel research with planning and synthesis.

Methods

run

Runs deep research flow.

Parameters: - query: Research query string

Returns: Final report string.

Note: max_iterations_per_section, max_time_minutes, and token_budget are constructor parameters, not run() parameters.

GraphOrchestrator

Module: src.orchestrator.graph_orchestrator

Purpose: Graph-based execution using Pydantic AI agents as nodes.

Methods

run

Runs graph-based research orchestration.

Parameters: - query: Research query string

Yields: AgentEvent objects during graph execution.

Note: research_mode and use_graph are constructor parameters, not run() parameters.

Orchestrator Factory

Module: src.orchestrator_factory

Purpose: Factory for creating orchestrators.

Functions

create_orchestrator

Creates an orchestrator instance.

Parameters: - search_handler: Search handler protocol implementation (optional, required for simple mode) - judge_handler: Judge handler protocol implementation (optional, required for simple mode) - config: Configuration object (optional) - mode: Orchestrator mode ("simple", "advanced", "magentic", "iterative", "deep", "auto", or None for auto-detect) - oauth_token: Optional OAuth token from HuggingFace login (takes priority over env vars)

Returns: Orchestrator instance.

Raises: - ValueError: If requirements not met

Modes: - "simple": Legacy orchestrator - "advanced" or "magentic": Magentic orchestrator (requires OpenAI API key) - None: Auto-detect based on API key availability

MagenticOrchestrator

Module: src.orchestrator_magentic

Purpose: Multi-agent coordination using Microsoft Agent Framework.

Methods

run

Runs Magentic orchestration.

Parameters: - query: Research query string

Yields: AgentEvent objects converted from Magentic events.

Note: max_rounds and max_stalls are constructor parameters, not run() parameters.

Requirements: - agent-framework-core package - OpenAI API key

See Also

\ No newline at end of file diff --git a/site/api/services/index.html b/site/api/services/index.html index 1207a98..a231d58 100644 --- a/site/api/services/index.html +++ b/site/api/services/index.html @@ -46,4 +46,4 @@ evidence: list[Evidence], hypothesis: dict[str, Any] | None = None ) -> AnalysisResult -

Analyzes a research question using statistical methods.

Parameters: - query: The research question - evidence: List of Evidence objects to analyze - hypothesis: Optional hypothesis dict with drug, target, pathway, effect, confidence keys

Returns: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - confidence: Confidence in verdict (0.0-1.0) - statistical_evidence: Summary of statistical findings - code_generated: Python code that was executed - execution_output: Output from code execution - key_takeaways: Key takeaways from analysis - limitations: List of limitations

Note: Requires Modal credentials for sandbox execution.

See Also

\ No newline at end of file +

Analyzes a research question using statistical methods.

Parameters: - query: The research question - evidence: List of Evidence objects to analyze - hypothesis: Optional hypothesis dict with drug, target, pathway, effect, confidence keys

Returns: AnalysisResult with: - verdict: SUPPORTED, REFUTED, or INCONCLUSIVE - confidence: Confidence in verdict (0.0-1.0) - statistical_evidence: Summary of statistical findings - code_generated: Python code that was executed - execution_output: Output from code execution - key_takeaways: Key takeaways from analysis - limitations: List of limitations

Note: Requires Modal credentials for sandbox execution.

See Also

\ No newline at end of file diff --git a/site/api/tools/index.html b/site/api/tools/index.html index b7d76db..d88763a 100644 --- a/site/api/tools/index.html +++ b/site/api/tools/index.html @@ -48,4 +48,4 @@ auto_ingest_to_rag: bool = True, oauth_token: str | None = None ) -> None -

Parameters: - tools: List of search tools to use - timeout: Timeout for each search in seconds (default: 30.0) - include_rag: Whether to include RAG tool in searches (default: False) - auto_ingest_to_rag: Whether to automatically ingest results into RAG (default: True) - oauth_token: Optional OAuth token from HuggingFace login (for RAG LLM)

Methods

execute

Searches multiple tools in parallel.

Parameters: - query: Search query string - max_results_per_tool: Maximum results per tool (default: 10)

Returns: SearchResult with: - query: The search query - evidence: Aggregated list of evidence - sources_searched: List of source names searched - total_found: Total number of results - errors: List of error messages from failed tools

Raises: - SearchError: If search times out

Note: Uses asyncio.gather() for parallel execution. Handles tool failures gracefully (returns errors in SearchResult.errors). Automatically ingests evidence into RAG if enabled.

See Also

\ No newline at end of file +

Parameters: - tools: List of search tools to use - timeout: Timeout for each search in seconds (default: 30.0) - include_rag: Whether to include RAG tool in searches (default: False) - auto_ingest_to_rag: Whether to automatically ingest results into RAG (default: True) - oauth_token: Optional OAuth token from HuggingFace login (for RAG LLM)

Methods

execute

Searches multiple tools in parallel.

Parameters: - query: Search query string - max_results_per_tool: Maximum results per tool (default: 10)

Returns: SearchResult with: - query: The search query - evidence: Aggregated list of evidence - sources_searched: List of source names searched - total_found: Total number of results - errors: List of error messages from failed tools

Raises: - SearchError: If search times out

Note: Uses asyncio.gather() for parallel execution. Handles tool failures gracefully (returns errors in SearchResult.errors). Automatically ingests evidence into RAG if enabled.
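The gather-with-graceful-failure pattern described above can be sketched as follows. The `search_tool` coroutine and tool names are stand-ins; the point is `return_exceptions=True`, which turns a failed tool into a recorded error instead of cancelling the other searches:

```python
import asyncio


async def search_tool(name: str, query: str) -> list[str]:
    # Stand-in for a real search tool; one tool fails on purpose.
    if name == "broken":
        raise RuntimeError(f"{name} unavailable")
    return [f"{name} result for {query!r}"]


async def execute(query: str, tools: list[str]) -> tuple[list[str], list[str]]:
    """Run all tools in parallel; collect failures rather than raising."""
    results = await asyncio.gather(
        *(search_tool(t, query) for t in tools), return_exceptions=True
    )
    evidence: list[str] = []
    errors: list[str] = []
    for tool, result in zip(tools, results):
        if isinstance(result, Exception):
            errors.append(f"{tool}: {result}")  # recorded, not re-raised
        else:
            evidence.extend(result)
    return evidence, errors


evidence, errors = asyncio.run(execute("aspirin", ["pubmed", "broken", "web"]))
```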

See Also

\ No newline at end of file diff --git a/site/architecture/agents/index.html b/site/architecture/agents/index.html index bcc3d61..bd3053f 100644 --- a/site/architecture/agents/index.html +++ b/site/architecture/agents/index.html @@ -1 +1 @@ - Agents - The DETERMINATOR

Agents Architecture

DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types.

Agent Pattern

Pydantic AI Agents

Pydantic AI agents use the Agent class with the following structure:

  • System Prompt: Module-level constant with date injection
  • Agent Class: __init__(model: Any | None = None)
  • Main Method: Async method (e.g., async def evaluate(), async def write_report())
  • Factory Function: def create_agent_name(model: Any | None = None, oauth_token: str | None = None) -> AgentName

Note: Factory functions accept an optional oauth_token parameter for HuggingFace authentication, which takes priority over environment variables.

Model Initialization

Agents use get_model() from src/agent_factory/judges.py if no model is provided. This supports:

  • OpenAI models
  • Anthropic models
  • HuggingFace Inference API models

The model selection is based on the configured LLM_PROVIDER in settings.

Error Handling

Agents return fallback values on failure rather than raising exceptions:

  • KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])
  • Empty strings for text outputs
  • Default structured outputs

All errors are logged with context using structlog.
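This log-and-fall-back rule can be sketched as a wrapper. The helper name and dict shape are illustrative (the real code returns a KnowledgeGapOutput and logs via structlog; stdlib logging stands in here):

```python
import logging

logger = logging.getLogger(__name__)  # structlog in the real project


def evaluate_safely(run_agent) -> dict:
    """Illustrative sketch: log the failure with context and return a
    conservative fallback instead of raising."""
    try:
        return run_agent()
    except Exception as exc:
        logger.warning("agent call failed, returning fallback", exc_info=exc)
        # Mirrors KnowledgeGapOutput(research_complete=False, ...):
        # research continues rather than aborting on a transient error.
        return {"research_complete": False, "outstanding_gaps": ["evaluation failed"]}
```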

Input Validation

All agents validate inputs:

  • Check that queries/inputs are not empty
  • Truncate very long inputs with warnings
  • Handle None values gracefully

Output Types

Agents use structured output types from src/utils/models.py:

  • KnowledgeGapOutput: Research completeness evaluation
  • AgentSelectionPlan: Tool selection plan
  • ReportDraft: Long-form report structure
  • ParsedQuery: Query parsing and mode detection

For text output (writer agents), agents return str directly.

Agent Types

Knowledge Gap Agent

File: src/agents/knowledge_gap.py

Purpose: Evaluates research state and identifies knowledge gaps.

Output: KnowledgeGapOutput with: - research_complete: Boolean indicating if research is complete - outstanding_gaps: List of remaining knowledge gaps

Methods: - async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput

Tool Selector Agent

File: src/agents/tool_selector.py

Purpose: Selects appropriate tools for addressing knowledge gaps.

Output: AgentSelectionPlan with list of AgentTask objects.

Available Agents: - WebSearchAgent: General web search for fresh information - SiteCrawlerAgent: Research specific entities/companies - RAGAgent: Semantic search within collected evidence

Writer Agent

File: src/agents/writer.py

Purpose: Generates final reports from research findings.

Output: Markdown string with numbered citations.

Methods: - async def write_report(query, findings, output_length, output_instructions) -> str

Features: - Validates inputs - Truncates very long findings (max 50000 chars) with warning - Retry logic for transient failures (3 retries) - Citation validation before returning

Long Writer Agent

File: src/agents/long_writer.py

Purpose: Long-form report generation with section-by-section writing.

Input/Output: Uses ReportDraft models.

Methods: - async def write_next_section(original_query, report_draft, next_section_title, next_section_draft) -> LongWriterOutput - async def write_report(query, report_title, report_draft) -> str

Features: - Writes sections iteratively - Aggregates references across sections - Reformats section headings and references - Deduplicates and renumbers references

Proofreader Agent

File: src/agents/proofreader.py

Purpose: Proofreads and polishes report drafts.

Input: ReportDraft. Output: Polished markdown string.

Methods: - async def proofread(query, report_title, report_draft) -> str

Features: - Removes duplicate content across sections - Adds executive summary if multiple sections - Preserves all references and citations - Improves flow and readability

Thinking Agent

File: src/agents/thinking.py

Purpose: Generates observations from conversation history.

Output: Observation string

Methods: - async def generate_observations(query, background_context, conversation_history, iteration) -> str

Input Parser Agent

File: src/agents/input_parser.py

Purpose: Parses and improves user queries, detects research mode.

Output: ParsedQuery with: - original_query: Original query string - improved_query: Refined query string - research_mode: "iterative" or "deep" - key_entities: List of key entities - research_questions: List of research questions

Magentic Agents

The following agents use the BaseAgent pattern from agent-framework and are used exclusively with MagenticOrchestrator:

Hypothesis Agent

File: src/agents/hypothesis_agent.py

Purpose: Generates mechanistic hypotheses based on evidence.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features: - Uses internal Pydantic AI Agent with HypothesisAssessment output type - Accesses shared evidence_store for evidence - Uses embedding service for diverse evidence selection (MMR algorithm) - Stores hypotheses in shared context

Search Agent

File: src/agents/search_agent.py

Purpose: Wraps SearchHandler as an agent for Magentic orchestrator.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features: - Executes searches via SearchHandlerProtocol - Deduplicates evidence using embedding service - Searches for semantically related evidence - Updates shared evidence store

Analysis Agent

File: src/agents/analysis_agent.py

Purpose: Performs statistical analysis using Modal sandbox.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features: - Wraps StatisticalAnalyzer service - Analyzes evidence and hypotheses - Returns verdict (SUPPORTED/REFUTED/INCONCLUSIVE) - Stores analysis results in shared context

Report Agent (Magentic)

File: src/agents/report_agent.py

Purpose: Generates structured scientific reports from evidence and hypotheses.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features: - Uses internal Pydantic AI Agent with ResearchReport output type - Accesses shared evidence store and hypotheses - Validates citations before returning - Formats report as markdown

Judge Agent

File: src/agents/judge_agent.py

Purpose: Evaluates evidence quality and determines if sufficient for synthesis.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse - async def run_stream(messages, thread, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]

Features: - Wraps JudgeHandlerProtocol - Accesses shared evidence store - Returns JudgeAssessment with sufficient flag, confidence, and recommendation

Agent Patterns

DeepCritical uses two distinct agent patterns:

1. Pydantic AI Agents (Traditional Pattern)

These agents use the Pydantic AI Agent class directly and are used in iterative and deep research flows:

  • Pattern: Agent(model, output_type, system_prompt)
  • Initialization: __init__(model: Any | None = None)
  • Methods: Agent-specific async methods (e.g., async def evaluate(), async def write_report())
  • Examples: KnowledgeGapAgent, ToolSelectorAgent, WriterAgent, LongWriterAgent, ProofreaderAgent, ThinkingAgent, InputParserAgent

2. Magentic Agents (Agent-Framework Pattern)

These agents use the BaseAgent class from agent-framework and are used in Magentic orchestrator:

  • Pattern: BaseAgent from agent-framework with async def run() method
  • Initialization: __init__(evidence_store, embedding_service, ...)
  • Methods: async def run(messages, thread, **kwargs) -> AgentRunResponse
  • Examples: HypothesisAgent, SearchAgent, AnalysisAgent, ReportAgent, JudgeAgent

Note: Magentic agents are used exclusively with the MagenticOrchestrator and follow the agent-framework protocol for multi-agent coordination.

Factory Functions

All agents have factory functions in src/agent_factory/agents.py:

Factory functions: - Use get_model() if no model provided - Accept oauth_token parameter for HuggingFace authentication - Raise ConfigurationError if creation fails - Log agent creation

See Also

\ No newline at end of file + Agents - The DETERMINATOR
Skip to content

Agents Architecture

DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types.

Agent Pattern

Pydantic AI Agents

Pydantic AI agents use the Agent class with the following structure:

  • System Prompt: Module-level constant with date injection
  • Agent Class: __init__(model: Any | None = None)
  • Main Method: Async method (e.g., async def evaluate(), async def write_report())
  • Factory Function: def create_agent_name(model: Any | None = None, oauth_token: str | None = None) -> AgentName

Note: Factory functions accept an optional oauth_token parameter for HuggingFace authentication, which takes priority over environment variables.

Model Initialization

Agents use get_model() from src/agent_factory/judges.py if no model is provided. This supports:

  • OpenAI models
  • Anthropic models
  • HuggingFace Inference API models

The model selection is based on the configured LLM_PROVIDER in settings.

Error Handling

Agents return fallback values on failure rather than raising exceptions:

  • KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])
  • Empty strings for text outputs
  • Default structured outputs

All errors are logged with context using structlog.
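A sketch of this fallback behavior, as a simplified stand-in: plain logging instead of structlog, and a dict instead of the typed model.

```python
import asyncio
import logging

logger = logging.getLogger("agents")


async def safe_evaluate(call, query: str) -> dict:
    """Return a fallback result instead of raising on agent failure."""
    try:
        return await call(query)
    except Exception as exc:
        logger.warning("evaluate failed: %s", exc)  # logged with context
        return {"research_complete": False,
                "outstanding_gaps": [f"evaluation failed: {exc}"]}
```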

Input Validation

All agents validate inputs:

  • Check that queries/inputs are not empty
  • Truncate very long inputs with warnings
  • Handle None values gracefully

Output Types

Agents use structured output types from src/utils/models.py:

  • KnowledgeGapOutput: Research completeness evaluation
  • AgentSelectionPlan: Tool selection plan
  • ReportDraft: Long-form report structure
  • ParsedQuery: Query parsing and mode detection

For text output (writer agents), agents return str directly.

Agent Types

Knowledge Gap Agent

File: src/agents/knowledge_gap.py

Purpose: Evaluates research state and identifies knowledge gaps.

Output: KnowledgeGapOutput with:

  • research_complete: Boolean indicating if research is complete
  • outstanding_gaps: List of remaining knowledge gaps

Methods: - async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput

Tool Selector Agent

File: src/agents/tool_selector.py

Purpose: Selects appropriate tools for addressing knowledge gaps.

Output: AgentSelectionPlan with list of AgentTask objects.

Available Agents:

  • WebSearchAgent: General web search for fresh information
  • SiteCrawlerAgent: Research specific entities/companies
  • RAGAgent: Semantic search within collected evidence

Writer Agent

File: src/agents/writer.py

Purpose: Generates final reports from research findings.

Output: Markdown string with numbered citations.

Methods: - async def write_report(query, findings, output_length, output_instructions) -> str

Features:

  • Validates inputs
  • Truncates very long findings (max 50000 chars) with warning
  • Retry logic for transient failures (3 retries)
  • Citation validation before returning

Long Writer Agent

File: src/agents/long_writer.py

Purpose: Long-form report generation with section-by-section writing.

Input/Output: Uses ReportDraft models.

Methods:

  • async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput
  • async def write_report(query, report_title, report_draft) -> str

Features:

  • Writes sections iteratively
  • Aggregates references across sections
  • Reformats section headings and references
  • Deduplicates and renumbers references

Proofreader Agent

File: src/agents/proofreader.py

Purpose: Proofreads and polishes report drafts.

Input: ReportDraft

Output: Polished markdown string

Methods: - async def proofread(query, report_title, report_draft) -> str

Features:

  • Removes duplicate content across sections
  • Adds executive summary if multiple sections
  • Preserves all references and citations
  • Improves flow and readability

Thinking Agent

File: src/agents/thinking.py

Purpose: Generates observations from conversation history.

Output: Observation string

Methods: - async def generate_observations(query, background_context, conversation_history) -> str

Input Parser Agent

File: src/agents/input_parser.py

Purpose: Parses and improves user queries, detects research mode.

Output: ParsedQuery with:

  • original_query: Original query string
  • improved_query: Refined query string
  • research_mode: "iterative" or "deep"
  • key_entities: List of key entities
  • research_questions: List of research questions

Magentic Agents

The following agents use the BaseAgent pattern from agent-framework and are used exclusively with MagenticOrchestrator:

Hypothesis Agent

File: src/agents/hypothesis_agent.py

Purpose: Generates mechanistic hypotheses based on evidence.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features:

  • Uses internal Pydantic AI Agent with HypothesisAssessment output type
  • Accesses shared evidence_store for evidence
  • Uses embedding service for diverse evidence selection (MMR algorithm)
  • Stores hypotheses in shared context

Search Agent

File: src/agents/search_agent.py

Purpose: Wraps SearchHandler as an agent for Magentic orchestrator.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features:

  • Executes searches via SearchHandlerProtocol
  • Deduplicates evidence using embedding service
  • Searches for semantically related evidence
  • Updates shared evidence store

Analysis Agent

File: src/agents/analysis_agent.py

Purpose: Performs statistical analysis using Modal sandbox.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features:

  • Wraps StatisticalAnalyzer service
  • Analyzes evidence and hypotheses
  • Returns verdict (SUPPORTED/REFUTED/INCONCLUSIVE)
  • Stores analysis results in shared context

Report Agent (Magentic)

File: src/agents/report_agent.py

Purpose: Generates structured scientific reports from evidence and hypotheses.

Pattern: BaseAgent from agent-framework

Methods: - async def run(messages, thread, **kwargs) -> AgentRunResponse

Features:

  • Uses internal Pydantic AI Agent with ResearchReport output type
  • Accesses shared evidence store and hypotheses
  • Validates citations before returning
  • Formats report as markdown

Judge Agent

File: src/agents/judge_agent.py

Purpose: Evaluates evidence quality and determines if sufficient for synthesis.

Pattern: BaseAgent from agent-framework

Methods:

  • async def run(messages, thread, **kwargs) -> AgentRunResponse
  • async def run_stream(messages, thread, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]

Features:

  • Wraps JudgeHandlerProtocol
  • Accesses shared evidence store
  • Returns JudgeAssessment with sufficient flag, confidence, and recommendation

Agent Patterns

DeepCritical uses two distinct agent patterns:

1. Pydantic AI Agents (Traditional Pattern)

These agents use the Pydantic AI Agent class directly and are used in iterative and deep research flows:

  • Pattern: Agent(model, output_type, system_prompt)
  • Initialization: __init__(model: Any | None = None)
  • Methods: Agent-specific async methods (e.g., async def evaluate(), async def write_report())
  • Examples: KnowledgeGapAgent, ToolSelectorAgent, WriterAgent, LongWriterAgent, ProofreaderAgent, ThinkingAgent, InputParserAgent

2. Magentic Agents (Agent-Framework Pattern)

These agents use the BaseAgent class from agent-framework and are used in Magentic orchestrator:

  • Pattern: BaseAgent from agent-framework with async def run() method
  • Initialization: __init__(evidence_store, embedding_service, ...)
  • Methods: async def run(messages, thread, **kwargs) -> AgentRunResponse
  • Examples: HypothesisAgent, SearchAgent, AnalysisAgent, ReportAgent, JudgeAgent

Note: Magentic agents are used exclusively with the MagenticOrchestrator and follow the agent-framework protocol for multi-agent coordination.

Factory Functions

All agents have factory functions in src/agent_factory/agents.py:

Factory functions:

  • Use get_model() if no model provided
  • Accept oauth_token parameter for HuggingFace authentication
  • Raise ConfigurationError if creation fails
  • Log agent creation

See Also


Graph Structure

Nodes

Graph nodes represent different stages in the research workflow:

  1. Agent Nodes: Execute Pydantic AI agents
     • Input: Prompt/query
     • Output: Structured or unstructured response
     • Examples: KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent

  2. State Nodes: Update or read workflow state
     • Input: Current state
     • Output: Updated state
     • Examples: Update evidence, update conversation history

  3. Decision Nodes: Make routing decisions based on conditions
     • Input: Current state/results
     • Output: Next node ID
     • Examples: Continue research vs. complete research

  4. Parallel Nodes: Execute multiple nodes concurrently
     • Input: List of node IDs
     • Output: Aggregated results
     • Examples: Parallel iterative research loops

Edges

Edges define transitions between nodes:

  1. Sequential Edges: Always traversed (no condition)
     • From: Source node
     • To: Target node
     • Condition: None (always True)

  2. Conditional Edges: Traversed based on condition
     • From: Source node
     • To: Target node
     • Condition: Callable that returns bool
     • Example: If research complete → go to writer, else → continue loop

  3. Parallel Edges: Used for parallel execution branches
     • From: Parallel node
     • To: Multiple target nodes
     • Execution: All targets run concurrently

State Management

State is managed via WorkflowState using ContextVar for thread-safe isolation:
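A minimal sketch of ContextVar-backed state isolation; the real WorkflowState carries evidence, conversation history, and more, while this stand-in keeps only a list.

```python
import asyncio
from contextvars import ContextVar
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    evidence: list[str] = field(default_factory=list)


_state: ContextVar[WorkflowState] = ContextVar("workflow_state")


def init_state() -> None:
    _state.set(WorkflowState())


def get_state() -> WorkflowState:
    return _state.get()


async def run_isolated(tag: str) -> int:
    init_state()  # each asyncio task sees its own copy of the ContextVar
    get_state().evidence.append(tag)
    return len(get_state().evidence)


async def _demo() -> list[int]:
    return await asyncio.gather(run_isolated("a"), run_isolated("b"))


counts = asyncio.run(_demo())  # no cross-task leakage between the two loops
```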

State transitions occur at state nodes, which update the global workflow state.

Execution Flow

  1. Graph Construction: Build graph from nodes and edges using create_iterative_graph() or create_deep_graph()
  2. Graph Validation: Ensure graph is valid (no cycles, all nodes reachable) via ResearchGraph.validate_structure()
  3. Graph Execution: Traverse graph from entry node using GraphOrchestrator._execute_graph()
  4. Node Execution: Execute each node based on type:
  5. Agent Nodes: Call agent.run() with transformed input
  6. State Nodes: Update workflow state via state_updater function
  7. Decision Nodes: Evaluate decision_function to get next node ID
  8. Parallel Nodes: Execute all parallel nodes concurrently via asyncio.gather()
  9. Edge Evaluation: Determine next node(s) based on edges and conditions
  10. Parallel Execution: Use asyncio.gather() for parallel nodes
  11. State Updates: Update state at state nodes via GraphExecutionContext.update_state()
  12. Event Streaming: Yield AgentEvent objects during execution for UI

GraphExecutionContext

The GraphExecutionContext class manages execution state during graph traversal:

Methods:

  • set_node_result(node_id, result): Store result from node execution
  • get_node_result(node_id): Retrieve stored result
  • has_visited(node_id): Check if node was visited
  • mark_visited(node_id): Mark node as visited
  • update_state(updater, data): Update workflow state

Conditional Routing

Decision nodes evaluate conditions and return next node IDs:
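For example, a decision function for the gap-evaluation step might look like this; the node IDs are illustrative, not taken from the source.

```python
def route_after_gap_check(result: dict) -> str:
    """Decision node: map the evaluation result to the next node ID."""
    if result.get("research_complete"):
        return "writer"          # research done, synthesize the report
    return "tool_selector"       # gaps remain, continue the loop
```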

Parallel Execution

Parallel nodes execute multiple nodes concurrently:
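The concurrent fan-out is plain asyncio.gather(); a hedged sketch of the aggregation step:

```python
import asyncio
from typing import Awaitable, Callable


async def run_parallel(node_ids: list[str],
                       run_node: Callable[[str], Awaitable[object]]) -> dict:
    """Execute all child nodes concurrently and aggregate their results."""
    results = await asyncio.gather(*(run_node(n) for n in node_ids))
    return dict(zip(node_ids, results))


async def _fake_node(node_id: str) -> str:
    return node_id.upper()  # stands in for an iterative research loop


aggregated = asyncio.run(run_parallel(["search", "rag"], _fake_node))
```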

Budget Enforcement

Budget constraints are enforced at decision nodes:

If any budget is exceeded, execution routes to the exit node.
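A budget check at a decision node can be sketched as follows; the Budget fields and node names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_iterations: int
    max_minutes: float


def route_with_budget(iteration: int, elapsed_minutes: float, budget: Budget,
                      next_node: str, exit_node: str = "writer") -> str:
    """Route to the exit node as soon as any budget is exhausted."""
    if iteration >= budget.max_iterations or elapsed_minutes >= budget.max_minutes:
        return exit_node
    return next_node
```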

Error Handling

Errors are handled at multiple levels:

  1. Node Level: Catch errors in individual node execution
  2. Graph Level: Handle errors during graph traversal
  3. State Level: Rollback state changes on error

Errors are logged and yield error events for UI.

Backward Compatibility

Graph execution is optional via feature flag:

This allows gradual migration and fallback if needed.
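The flag presumably gates execution along these lines; the function names are stand-ins, not the real implementation.

```python
import asyncio


async def _execute_graph(query: str) -> str:
    return f"graph:{query}"        # stand-in for graph traversal


async def _execute_agent_chain(query: str) -> str:
    return f"chain:{query}"        # stand-in for the legacy agent chain


async def run_flow(query: str, use_graph: bool = True) -> str:
    if use_graph:
        return await _execute_graph(query)
    return await _execute_agent_chain(query)  # fallback path
```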

See Also


Middleware

Models

All middleware models are defined in src/utils/models.py:

Thread Safety

All middleware components use ContextVar for thread-safe isolation:

See Also


Orchestrators Architecture

DeepCritical supports multiple orchestration patterns for research workflows.

Research Flows

IterativeResearchFlow

File: src/orchestrator/research_flow.py

Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete

Agents Used:

  • KnowledgeGapAgent: Evaluates research completeness
  • ToolSelectorAgent: Selects tools for addressing gaps
  • ThinkingAgent: Generates observations
  • WriterAgent: Creates final report
  • JudgeHandler: Assesses evidence sufficiency

Features:

  • Tracks iterations, time, budget
  • Supports graph execution (use_graph=True) and agent chains (use_graph=False)
  • Iterates until research complete or constraints met

Usage:
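The stripped code sample likely showed a call shape along these lines; the class body below is a stand-in, not the real implementation.

```python
import asyncio


class IterativeResearchFlow:
    """Stand-in showing the documented use_graph toggle and async entry point."""

    def __init__(self, use_graph: bool = True):
        self.use_graph = use_graph

    async def run(self, query: str) -> str:
        mode = "graph" if self.use_graph else "chain"
        return f"[{mode}] report for: {query}"


flow = IterativeResearchFlow(use_graph=True)
report = asyncio.run(flow.run("What causes antibiotic resistance?"))
```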

DeepResearchFlow

File: src/orchestrator/research_flow.py

Pattern: Planner → Parallel iterative loops per section → Synthesizer

Agents Used:

  • PlannerAgent: Breaks query into report sections
  • IterativeResearchFlow: Per-section research (parallel)
  • LongWriterAgent or ProofreaderAgent: Final synthesis

Features:

  • Uses WorkflowManager for parallel execution
  • Budget tracking per section and globally
  • State synchronization across parallel loops
  • Supports graph execution and agent chains

Usage:
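The planner → parallel loops → synthesizer pattern reduces to this shape, with every agent call stubbed out for illustration.

```python
import asyncio


async def plan(query: str) -> list[str]:
    return ["Background", "Mechanisms", "Open questions"]  # PlannerAgent stub


async def research_section(title: str) -> str:
    return f"## {title}\n(findings)"  # per-section IterativeResearchFlow stub


async def synthesize(drafts: list[str]) -> str:
    return "\n\n".join(drafts)  # LongWriter/Proofreader stub


async def deep_research(query: str) -> str:
    sections = await plan(query)
    drafts = await asyncio.gather(*(research_section(s) for s in sections))
    return await synthesize(list(drafts))


report = asyncio.run(deep_research("gut microbiome and mood"))
```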

Graph Orchestrator

File: src/orchestrator/graph_orchestrator.py

Purpose: Graph-based execution using Pydantic AI agents as nodes

Features:

  • Uses graph execution (use_graph=True) or agent chains (use_graph=False) as fallback
  • Routes based on research mode (iterative/deep/auto)
  • Streams AgentEvent objects for UI
  • Uses GraphExecutionContext to manage execution state

Node Types:

  • Agent Nodes: Execute Pydantic AI agents
  • State Nodes: Update or read workflow state
  • Decision Nodes: Make routing decisions
  • Parallel Nodes: Execute multiple nodes concurrently

Edge Types:

  • Sequential Edges: Always traversed
  • Conditional Edges: Traversed based on condition
  • Parallel Edges: Used for parallel execution branches

Special Node Handling:

The GraphOrchestrator has special handling for certain nodes:

  • execute_tools node: State node that uses search_handler to execute searches and add evidence to workflow state
  • parallel_loops node: Parallel node that executes IterativeResearchFlow instances for each section in deep research mode
  • synthesizer node: Agent node that calls LongWriterAgent.write_report() directly with ReportDraft instead of using agent.run()
  • writer node: Agent node that calls WriterAgent.write_report() directly with findings instead of using agent.run()

GraphExecutionContext:

The orchestrator uses GraphExecutionContext to manage execution state:

  • Tracks current node, visited nodes, and node results
  • Manages workflow state and budget tracker
  • Provides methods to store and retrieve node execution results

Orchestrator Factory

File: src/orchestrator_factory.py

Purpose: Factory for creating orchestrators

Modes:

  • Simple: Legacy orchestrator (backward compatible)
  • Advanced: Magentic orchestrator (requires OpenAI API key)
  • Auto-detect: Chooses based on API key availability

Usage:
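The documented mode selection can be sketched as follows; the return values are stand-ins for orchestrator instances.

```python
import os


def create_orchestrator(mode: str = "auto") -> str:
    """Auto-detect falls back to simple mode without an OpenAI key."""
    if mode == "auto":
        mode = "advanced" if os.environ.get("OPENAI_API_KEY") else "simple"
    return "MagenticOrchestrator" if mode == "advanced" else "LegacyOrchestrator"
```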

Magentic Orchestrator

File: src/orchestrator_magentic.py

Purpose: Multi-agent coordination using Microsoft Agent Framework

Features:

  • Uses agent-framework-core
  • ChatAgent pattern with internal LLMs per agent
  • MagenticBuilder with participants:
      • searcher: SearchAgent (wraps SearchHandler)
      • hypothesizer: HypothesisAgent (generates hypotheses)
      • judge: JudgeAgent (evaluates evidence)
      • reporter: ReportAgent (generates final report)
  • Manager orchestrates agents via chat client (OpenAI or HuggingFace)
  • Event-driven: converts Magentic events to AgentEvent for UI streaming via _process_event() method
  • Supports max rounds, stall detection, and reset handling

Event Processing:

The orchestrator processes Magentic events and converts them to AgentEvent:

  • MagenticOrchestratorMessageEvent → AgentEvent with type based on message content
  • MagenticAgentMessageEvent → AgentEvent with type based on agent name
  • MagenticAgentDeltaEvent → AgentEvent for streaming updates
  • MagenticFinalResultEvent → AgentEvent with type "complete"

Requirements:

  • agent-framework-core package
  • OpenAI API key or HuggingFace authentication

Hierarchical Orchestrator

File: src/orchestrator_hierarchical.py

Purpose: Hierarchical orchestrator using middleware and sub-teams

Features:

  • Uses SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
  • Adapts Magentic ChatAgent to SubIterationTeam protocol
  • Event-driven via asyncio.Queue for coordination
  • Supports sub-iteration patterns for complex research tasks

Legacy Simple Mode

File: src/legacy_orchestrator.py

Purpose: Linear search-judge-synthesize loop

Features:

  • Uses SearchHandlerProtocol and JudgeHandlerProtocol
  • Generator-based design yielding AgentEvent objects
  • Backward compatibility for simple use cases

State Initialization

All orchestrators must initialize workflow state:

Event Streaming

All orchestrators yield AgentEvent objects:

Event Types:

  • started: Research started
  • searching: Search in progress
  • search_complete: Search completed
  • judging: Evidence evaluation in progress
  • judge_complete: Evidence evaluation completed
  • looping: Iteration in progress
  • hypothesizing: Generating hypotheses
  • analyzing: Statistical analysis in progress
  • analysis_complete: Statistical analysis completed
  • synthesizing: Synthesizing results
  • complete: Research completed
  • error: Error occurred
  • streaming: Streaming update (delta events)

Event Structure:
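The event shape implied by the docs can be sketched as follows; the field names beyond type are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class AgentEvent:
    type: str                     # one of the documented event types
    message: str = ""
    data: dict[str, Any] = field(default_factory=dict)


def emit_events() -> Iterator[AgentEvent]:
    """Toy orchestrator run: yields events the way real orchestrators stream them."""
    yield AgentEvent("started", "Research started")
    yield AgentEvent("searching", "Querying sources")
    yield AgentEvent("complete", "Done", {"report": "..."})
```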

See Also


See Also

\ No newline at end of file +

See Also

\ No newline at end of file diff --git a/site/architecture/tools/index.html b/site/architecture/tools/index.html index e535c5c..afaefe0 100644 --- a/site/architecture/tools/index.html +++ b/site/architecture/tools/index.html @@ -16,4 +16,4 @@ # Execute search result = await search_handler.execute("query", max_results_per_tool=10) -

See Also

diff --git a/site/architecture/workflow-diagrams/index.html b/site/architecture/workflow-diagrams/index.html
index e9729db..5f53016 100644
--- a/site/architecture/workflow-diagrams/index.html
+++ b/site/architecture/workflow-diagrams/index.html
@@ -485,4 +485,4 @@
Formatting :r3, after r2, 10s
section Manager Synthesis
Final synthesis :f1, after r3, 10s

Key Differences from Original Design

| Aspect | Original (Judge-in-Loop) | New (Magentic) |
| --- | --- | --- |
| Control Flow | Fixed sequential phases | Dynamic agent selection |
| Quality Control | Separate Judge Agent | Manager assessment built-in |
| Retry Logic | Phase-level with feedback | Agent-level with adaptation |
| Flexibility | Rigid 4-phase pipeline | Adaptive workflow |
| Complexity | 5 agents (including Judge) | 4 agents (no Judge) |
| Progress Tracking | Manual state management | Built-in round/stall detection |
| Agent Coordination | Sequential handoff | Manager-driven dynamic selection |
| Error Recovery | Retry same phase | Try different agent or replan |

Simplified Design Principles

  1. Manager is Intelligent: LLM-powered manager handles planning, selection, and quality assessment
  2. No Separate Judge: Manager's assessment phase replaces dedicated Judge Agent
  3. Dynamic Workflow: Agents can be called multiple times in any order based on need
  4. Built-in Safety: max_round_count (15) and max_stall_count (3) prevent infinite loops
  5. Event-Driven UI: Real-time streaming updates to Gradio interface
  6. MCP-Powered Tools: All external capabilities via Model Context Protocol
  7. Shared Context: Centralized state accessible to all agents
  8. Progress Awareness: Manager tracks what's been done and what's needed

Legend


Implementation Highlights

Simple 4-Agent Setup:
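As a sketch only: the real setup wires the participants together with MagenticBuilder from agent-framework-core, whose API is not reproduced here. The toy `manager_pick` below, with entirely hypothetical names, just illustrates the selection-plus-safety-limit idea (max_round_count of 15, max_stall_count of 3):

```python
# Hypothetical names throughout; the real wiring uses MagenticBuilder
# with participants searcher / hypothesizer / judge / reporter.
AGENTS = {
    "hypothesizer": "Generate testable, novel, clearly stated hypotheses.",
    "searcher": "Retrieve relevant, authoritative, recent evidence.",
    "analyzer": "Assess methodology, evidence strength, and conclusions.",
    "reporter": "Write a complete report with validated citations.",
}

def manager_pick(round_no: int, stall_count: int):
    """Toy policy: respect safety limits, otherwise rotate through agents."""
    MAX_ROUNDS, MAX_STALLS = 15, 3
    if round_no >= MAX_ROUNDS or stall_count >= MAX_STALLS:
        return None  # terminate instead of looping forever
    order = list(AGENTS)
    return order[round_no % len(order)]
```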

Manager handles quality assessment in its instructions:

  • Checks hypothesis quality (testable, novel, clear)
  • Validates search results (relevant, authoritative, recent)
  • Assesses analysis soundness (methodology, evidence, conclusions)
  • Ensures report completeness (all sections, proper citations)

No separate Judge Agent needed - manager does it all!


Document Version: 2.0 (Magentic Simplified)
Last Updated: 2025-11-24
Architecture: Microsoft Magentic Orchestration Pattern
Agents: 4 (Hypothesis, Search, Analysis, Report) + 1 Manager
License: MIT

See Also

diff --git a/site/configuration/index.html b/site/configuration/index.html
index bbcb61c..a6801b4 100644
--- a/site/configuration/index.html
+++ b/site/configuration/index.html
@@ -121,4 +121,4 @@
# Web search is configured
pass

API Key Retrieval

Get the API key for the configured provider:

For OpenAI-specific operations (e.g., Magentic mode):

Configuration Usage in Codebase

The configuration system is used throughout the codebase:

LLM Factory

The LLM factory uses settings to create appropriate models:
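A hedged sketch of that provider dispatch; `SettingsSketch` and the model-name strings are placeholders, and the real factory returns model objects rather than strings:

```python
from dataclasses import dataclass

@dataclass
class SettingsSketch:
    llm_provider: str = "openai"

def create_model(settings: SettingsSketch) -> str:
    """Dispatch on provider; the real factory returns a model object, not a string."""
    models = {"openai": "openai-default-model", "anthropic": "anthropic-default-model"}
    try:
        return models[settings.llm_provider]
    except KeyError as e:
        # Chain the original exception, per the project's error-handling rules
        raise ValueError(f"Unsupported provider: {settings.llm_provider}") from e
```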

Embedding Service

The embedding service uses local embedding model configuration:

Orchestrator Factory

The orchestrator factory uses settings to determine mode:

Environment Variables Reference

Required (at least one LLM)

LLM Configuration Variables

Embedding Configuration Variables

Web Search Configuration Variables

PubMed Configuration Variables

Agent Configuration Variables

Budget Configuration Variables

RAG Configuration Variables

ChromaDB Configuration Variables

External Services Variables

Logging Configuration Variables

Validation

Settings are validated on load using Pydantic validation:

Validation Examples

The max_iterations field has range validation:

The llm_provider field has literal validation:
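A stdlib stand-in for both checks above; the real model expresses them as Pydantic `Field(ge=..., le=...)` constraints and a `Literal` type, and the exact bounds and provider set shown here are assumptions:

```python
from dataclasses import dataclass

VALID_PROVIDERS = {"openai", "anthropic", "huggingface"}  # assumed set

@dataclass
class ValidatedSettings:
    max_iterations: int = 10
    llm_provider: str = "openai"

    def __post_init__(self) -> None:
        # Range validation, as Field(ge=1, le=50) would enforce (bounds assumed)
        if not 1 <= self.max_iterations <= 50:
            raise ValueError("max_iterations must be between 1 and 50")
        # Literal validation, as Literal["openai", ...] would enforce
        if self.llm_provider not in VALID_PROVIDERS:
            raise ValueError(f"unsupported provider: {self.llm_provider}")
```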

Error Handling

Configuration errors raise ConfigurationError from src/utils/exceptions.py:

```python
# src/utils/exceptions.py, lines 22-25
class ConfigurationError(DeepCriticalError):
    """Raised when configuration is invalid."""

    pass
```

Error Handling Example

```python
from src.utils.config import settings
from src.utils.exceptions import ConfigurationError

try:
    api_key = settings.get_api_key()
except ConfigurationError as e:
    print(f"Configuration error: {e}")
```

Common Configuration Errors

  1. Missing API Key: When get_api_key() is called but the required API key is not set
  2. Invalid Provider: When llm_provider is set to an unsupported value
  3. Out of Range: When numeric values exceed their min/max constraints
  4. Invalid Literal: When enum fields receive unsupported values

Configuration Best Practices

  1. Use .env File: Store sensitive keys in .env file (add to .gitignore)
  2. Check Availability: Use properties like has_openai_key before accessing API keys
  3. Handle Errors: Always catch ConfigurationError when calling get_api_key()
  4. Validate Early: Configuration is validated on import, so errors surface immediately
  5. Use Defaults: Leverage sensible defaults for optional configuration

Future Enhancements

The following configurations are planned for future phases:

  1. Additional LLM Providers: DeepSeek, OpenRouter, Gemini, Perplexity, Azure OpenAI, Local models
  2. Model Selection: Reasoning/main/fast model configuration
  3. Service Integration: Additional service integrations and configurations
diff --git a/site/contributing/code-quality/index.html b/site/contributing/code-quality/index.html
index a7d5d11..932d4fe 100644
--- a/site/contributing/code-quality/index.html
+++ b/site/contributing/code-quality/index.html
@@ -9,4 +9,4 @@
# Serve documentation locally (http://127.0.0.1:8000)
uv run mkdocs serve

The documentation site is published at: https://deepcritical.github.io/GradioDemo/

Docstrings

Example:
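A hypothetical function illustrating the expected docstring style (Google-style sections with Args and Returns); the function itself is not part of the codebase:

```python
def judge_evidence(evidence: list[str], query: str) -> float:
    """Score how well the gathered evidence covers the query.

    Args:
        evidence: Text snippets produced by the search slice.
        query: The user's research question.

    Returns:
        A coverage score in [0.0, 1.0].
    """
    if not evidence:
        return 0.0
    hits = sum(1 for item in evidence if query.lower() in item.lower())
    return hits / len(evidence)
```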

Code Comments

See Also

diff --git a/site/contributing/code-style/index.html b/site/contributing/code-style/index.html
index 8b09f8f..df91363 100644
--- a/site/contributing/code-style/index.html
+++ b/site/contributing/code-style/index.html
@@ -20,4 +20,4 @@
uv run mypy src

This ensures commands run in the correct virtual environment managed by uv.

Type Safety

Pydantic Models

Async Patterns

loop = asyncio.get_running_loop()
result = await loop.run_in_executor(None, cpu_bound_function, args)

Common Pitfalls

  1. Blocking the event loop: Never use sync I/O in async functions
  2. Missing type hints: All functions must have complete type annotations
  3. Global mutable state: Use ContextVar or pass via parameters
  4. Import errors: Lazy-load optional dependencies (magentic, modal, embeddings)

See Also

diff --git a/site/contributing/error-handling/index.html b/site/contributing/error-handling/index.html
index b5e359e..e94f2a2 100644
--- a/site/contributing/error-handling/index.html
+++ b/site/contributing/error-handling/index.html
@@ -6,4 +6,4 @@
result = await api_call()
except httpx.HTTPError as e:
    raise SearchError(f"API call failed: {e}") from e

See Also

diff --git a/site/contributing/implementation-patterns/index.html b/site/contributing/implementation-patterns/index.html
index eb36b7a..59825fb 100644
--- a/site/contributing/implementation-patterns/index.html
+++ b/site/contributing/implementation-patterns/index.html
@@ -7,4 +7,4 @@
async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
    # Implementation
    return evidence_list

Judge Handlers

Agent Factory Pattern

State Management

Singleton Pattern

Use @lru_cache(maxsize=1) for singletons:
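For example (`EmbeddingService` here is a stand-in; the caching pattern is what matters):

```python
from functools import lru_cache

class EmbeddingService:
    """Stand-in for an expensive-to-construct service."""
    def __init__(self) -> None:
        self.model_name = "all-MiniLM-L6-v2"  # illustrative default

@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # First call constructs the service; later calls return the cached instance.
    return EmbeddingService()
```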

See Also

diff --git a/site/contributing/index.html b/site/contributing/index.html
index 3d16a3f..d84a115 100644
--- a/site/contributing/index.html
+++ b/site/contributing/index.html
@@ -49,4 +49,4 @@
uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
  1. Commit and push:
git commit -m "Description of changes"
 git push origin yourname-feature-name
  1. Create a pull request on GitHub

Development Guidelines

Code Style

Error Handling

Testing

Implementation Patterns

Prompt Engineering

Code Quality

MCP Integration

MCP Tools

Gradio MCP Server

Common Pitfalls

  1. Blocking the event loop: Never use sync I/O in async functions
  2. Missing type hints: All functions must have complete type annotations
  3. Hallucinated citations: Always validate references
  4. Global mutable state: Use ContextVar or pass via parameters
  5. Import errors: Lazy-load optional dependencies (magentic, modal, embeddings)
  6. Rate limiting: Always implement for external APIs
  7. Error chaining: Always use from e when raising exceptions

Key Principles

  1. Type Safety First: All code must pass mypy --strict
  2. Async Everything: All I/O must be async
  3. Test-Driven: Write tests before implementation
  4. No Hallucinations: Validate all citations
  5. Graceful Degradation: Support free tier (HF Inference) when no API keys
  6. Lazy Loading: Don't require optional dependencies at import time
  7. Structured Logging: Use structlog, never print()
  8. Error Chaining: Always preserve exception context

Pull Request Process

  1. Ensure all checks pass: uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
  2. Update documentation if needed
  3. Add tests for new features
  4. Update CHANGELOG if applicable
  5. Request review from maintainers
  6. Address review feedback
  7. Wait for approval before merging

Project Structure

Questions?

Thank you for contributing to The DETERMINATOR!

diff --git a/site/contributing/prompt-engineering/index.html b/site/contributing/prompt-engineering/index.html
index b1298ff..96e541a 100644
--- a/site/contributing/prompt-engineering/index.html
+++ b/site/contributing/prompt-engineering/index.html
@@ -1 +1 @@
Prompt Engineering & Citation Validation - The DETERMINATOR

Prompt Engineering & Citation Validation

This document outlines prompt engineering guidelines and citation validation rules.

Judge Prompts

  • System prompt in src/prompts/judge.py
  • Format evidence with truncation (1500 chars per item)
  • Handle empty evidence case separately
  • Always request structured JSON output
  • Use format_user_prompt() and format_empty_evidence_prompt() helpers

Hypothesis Prompts

  • Use diverse evidence selection (MMR algorithm)
  • Sentence-aware truncation (truncate_at_sentence())
  • Format: Drug → Target → Pathway → Effect
  • System prompt emphasizes mechanistic reasoning
  • Use format_hypothesis_prompt() with embeddings for diversity

Report Prompts

  • Include full citation details for validation
  • Use diverse evidence selection (n=20)
  • CRITICAL: Emphasize citation validation rules
  • Format hypotheses with support/contradiction counts
  • System prompt includes explicit JSON structure requirements

Citation Validation

  • ALWAYS validate references before returning reports
  • Use validate_references() from src/utils/citation_validator.py
  • Remove hallucinated citations (URLs not in evidence)
  • Log warnings for removed citations
  • Never trust LLM-generated citations without validation

Citation Validation Rules

  1. Every reference URL must EXACTLY match a provided evidence URL
  2. Do NOT invent, fabricate, or hallucinate any references
  3. Do NOT modify paper titles, authors, dates, or URLs
  4. If unsure about a citation, OMIT it rather than guess
  5. Copy URLs exactly as provided - do not create similar-looking URLs

Evidence Selection

  • Use select_diverse_evidence() for MMR-based selection
  • Balance relevance vs diversity (lambda=0.7 default)
  • Sentence-aware truncation preserves meaning
  • Limit evidence per prompt to avoid context overflow
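A minimal MMR sketch under assumed callables `relevance` and `similarity`; the real `select_diverse_evidence()` operates on embeddings, but the scoring rule has the same shape, with `lam` trading relevance against redundancy (the lambda=0.7 default above):

```python
def mmr_select(candidates, relevance, similarity, k=5, lam=0.7):
    """Greedy maximal-marginal-relevance selection."""
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        def score(c):
            # Penalize similarity to anything already chosen
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance(c) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Note how a slightly less relevant but novel item can beat a near-duplicate of an already selected one.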

See Also

diff --git a/site/contributing/testing/index.html b/site/contributing/testing/index.html
index b840011..077d9ed 100644
--- a/site/contributing/testing/index.html
+++ b/site/contributing/testing/index.html
@@ -34,4 +34,4 @@
assert len(results) <= 3

Test Coverage

Terminal Coverage Report

uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
 

This shows coverage with missing lines highlighted in the terminal output.

HTML Coverage Report

uv run pytest --cov=src --cov-report=html -p no:logfire

This generates an HTML coverage report in htmlcov/index.html. Open this file in your browser to see detailed coverage information.

Coverage Goals

See Also

diff --git a/site/getting-started/examples/index.html b/site/getting-started/examples/index.html
index 88d4916..8f577a4 100644
--- a/site/getting-started/examples/index.html
+++ b/site/getting-started/examples/index.html
@@ -46,4 +46,4 @@
MAX_ITERATIONS=20
DEFAULT_TOKEN_LIMIT=200000
USE_GRAPH_EXECUTION=true

Next Steps

diff --git a/site/getting-started/installation/index.html b/site/getting-started/installation/index.html
index c702d58..d3673d6 100644
--- a/site/getting-started/installation/index.html
+++ b/site/getting-started/installation/index.html
@@ -24,4 +24,4 @@

See the Configuration Guide for all available options.

6. Verify Installation

Run the application:

uv run gradio run src/app.py
 

Open your browser to http://localhost:7860 to verify the installation.

Development Setup

For development, install dev dependencies:

uv sync --all-extras --dev
 

Install pre-commit hooks:

uv run pre-commit install

Troubleshooting

Common Issues

Import Errors:

  • Ensure you've installed all required dependencies
  • Check that Python 3.11+ is being used

API Key Errors:

  • Verify your .env file is in the project root
  • Check that API keys are correctly formatted
  • Ensure at least one LLM provider is configured

Module Not Found:

  • Run uv sync or pip install -e . again
  • Check that you're in the correct virtual environment

Port Already in Use:

  • Change the port in src/app.py or use an environment variable
  • Kill the process using port 7860

Next Steps

diff --git a/site/getting-started/mcp-integration/index.html b/site/getting-started/mcp-integration/index.html
index 9c98dcd..f0c0db2 100644
--- a/site/getting-started/mcp-integration/index.html
+++ b/site/getting-started/mcp-integration/index.html
@@ -32,4 +32,4 @@
    }
  }
}

Next Steps

diff --git a/site/getting-started/quick-start/index.html b/site/getting-started/quick-start/index.html
index f3fadf4..2f6bd2d 100644
--- a/site/getting-started/quick-start/index.html
+++ b/site/getting-started/quick-start/index.html
@@ -41,4 +41,4 @@

Complex Query

Review the evidence for using metformin as an anti-aging intervention, 
 including clinical trials, mechanisms of action, and safety profile.
 

Clinical Trial Query

What are the active clinical trials investigating Alzheimer's disease treatments?

Next Steps

diff --git a/site/index.html b/site/index.html
index 5e2f78d..8d8741f 100644
--- a/site/index.html
+++ b/site/index.html
@@ -13,4 +13,4 @@
# Start the Gradio app
uv run gradio run src/app.py

Open your browser to http://localhost:7860.

For detailed installation and setup instructions, see the Getting Started Guide.

Architecture

The DETERMINATOR uses a Vertical Slice Architecture:

  1. Search Slice: Retrieving evidence from multiple sources (web, PubMed, ClinicalTrials.gov, Europe PMC, RAG) based on query analysis
  2. Judge Slice: Evaluating evidence quality using LLMs
  3. Orchestrator Slice: Managing the research loop and UI

The system supports three main research patterns:

Learn more about the Architecture.

Documentation

diff --git a/site/license/index.html b/site/license/index.html
index a352fec..74f9f7f 100644
--- a/site/license/index.html
+++ b/site/license/index.html
@@ -1 +1 @@
License - The DETERMINATOR

License

DeepCritical is licensed under the MIT License.

MIT License

Copyright (c) 2024 DeepCritical Team

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

diff --git a/site/overview/architecture/index.html b/site/overview/architecture/index.html
index b7a56c1..d2b148b 100644
--- a/site/overview/architecture/index.html
+++ b/site/overview/architecture/index.html
@@ -1 +1 @@
Architecture Overview - The DETERMINATOR

Architecture Overview

The DETERMINATOR is a generalist deep research agent system that uses iterative search-and-judge loops to investigate any research question comprehensively. It keeps refining its answer until it reaches a precise result, stopping only at configured limits (budget, time, iterations). The system automatically determines whether medical knowledge sources are needed and adapts its search strategy accordingly. It supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.

Core Architecture

Orchestration Patterns

  1. Graph Orchestrator (src/orchestrator/graph_orchestrator.py):
     • Graph-based execution using Pydantic AI agents as nodes
     • Supports both iterative and deep research patterns
     • Node types: Agent, State, Decision, Parallel
     • Edge types: Sequential, Conditional, Parallel
     • Conditional routing based on knowledge gaps, budget, and iterations
     • Parallel execution for concurrent research loops
     • Event streaming via AsyncGenerator[AgentEvent] for real-time UI updates
     • Fallback to agent chains when graph execution is disabled

  2. Deep Research Flow (src/orchestrator/research_flow.py):
     • Pattern: Planner → Parallel Iterative Loops (one per section) → Synthesis
     • Uses PlannerAgent to break query into report sections
     • Runs IterativeResearchFlow instances in parallel per section via WorkflowManager
     • Synthesizes results using LongWriterAgent or ProofreaderAgent
     • Supports both graph execution (use_graph=True) and agent chains (use_graph=False)
     • Budget tracking per section and globally
     • State synchronization across parallel loops

  3. Iterative Research Flow (src/orchestrator/research_flow.py):
     • Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete
     • Uses KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent, WriterAgent
     • JudgeHandler assesses evidence sufficiency
     • Iterates until research complete or constraints met (iterations, time, tokens)
     • Supports graph execution and agent chains

  4. Magentic Orchestrator (src/orchestrator_magentic.py):
     • Multi-agent coordination using agent-framework-core
     • ChatAgent pattern with internal LLMs per agent
     • Uses MagenticBuilder with participants: searcher, hypothesizer, judge, reporter
     • Manager orchestrates agents via OpenAIChatClient
     • Requires OpenAI API key (function calling support)
     • Event-driven: converts Magentic events to AgentEvent for UI streaming
     • Supports long-running workflows with max rounds and stall/reset handling

  5. Hierarchical Orchestrator (src/orchestrator_hierarchical.py):
     • Uses SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
     • Adapts Magentic ChatAgent to SubIterationTeam protocol
     • Event-driven via asyncio.Queue for coordination
     • Supports sub-iteration patterns for complex research tasks

  6. Legacy Simple Mode (src/legacy_orchestrator.py):
     • Linear search-judge-synthesize loop
     • Uses SearchHandlerProtocol and JudgeHandlerProtocol
     • Generator-based design yielding AgentEvent objects
     • Backward compatibility for simple use cases

Long-Running Task Support

The system is designed for long-running research tasks with comprehensive state management and streaming:

  1. Event Streaming:
     • All orchestrators yield AgentEvent objects via AsyncGenerator
     • Real-time UI updates through Gradio chat interface
     • Event types: started, searching, search_complete, judging, judge_complete, looping, synthesizing, hypothesizing, complete, error
     • Metadata includes iteration numbers, tool names, result counts, durations

  2. Budget Tracking (src/middleware/budget_tracker.py):
     • Per-loop and global budget management
     • Tracks: tokens, time (seconds), iterations
     • Budget enforcement at decision nodes
     • Token estimation (~4 chars per token)
     • Early termination when budgets exceeded
     • Budget summaries for monitoring

  3. Workflow Manager (src/middleware/workflow_manager.py):
     • Coordinates parallel research loops
     • Tracks loop status: pending, running, completed, failed, cancelled
     • Synchronizes evidence between loops and global state
     • Handles errors per loop (doesn't fail all if one fails)
     • Supports loop cancellation and timeout handling
     • Evidence deduplication across parallel loops

  4. State Management (src/middleware/state_machine.py):
     • Thread-safe isolation using ContextVar for concurrent requests
     • WorkflowState tracks: evidence, conversation history, embedding service
     • Evidence deduplication by URL
     • Semantic search via embedding service
     • State persistence across long-running workflows
     • Supports both iterative and deep research patterns

  5. Gradio UI (src/app.py):
     • Real-time streaming of research progress
     • Accordion-based UI for pending/done operations
     • OAuth integration (HuggingFace)
     • Multiple backend support (API keys, free tier)
     • Handles long-running tasks with progress indicators
     • Event accumulation for pending operations

Graph Architecture

The graph orchestrator (src/orchestrator/graph_orchestrator.py) implements a flexible graph-based execution model:

Node Types:

  • Agent Nodes: Execute Pydantic AI agents (e.g., KnowledgeGapAgent, ToolSelectorAgent)
  • State Nodes: Update or read workflow state (evidence, conversation)
  • Decision Nodes: Make routing decisions (research complete?, budget exceeded?)
  • Parallel Nodes: Execute multiple nodes concurrently (parallel research loops)

Edge Types:

  • Sequential Edges: Always traversed (no condition)
  • Conditional Edges: Traversed based on condition (e.g., if research complete → writer, else → tool selector)
  • Parallel Edges: Used for parallel execution branches

Graph Patterns:

  • Iterative Graph: [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?] → [Tool Selector] or [Writer]
  • Deep Research Graph: [Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]

Execution Flow:

  1. Graph construction from nodes and edges
  2. Graph validation (no cycles, all nodes reachable)
  3. Graph execution from entry node
  4. Node execution based on type
  5. Edge evaluation for next node(s)
  6. Parallel execution via asyncio.gather()
  7. State updates at state nodes
  8. Event streaming for UI

Key Components

  • Orchestrators: Multiple orchestration patterns (src/orchestrator/, src/orchestrator_*.py)
  • Research Flows: Iterative and deep research patterns (src/orchestrator/research_flow.py)
  • Graph Builder: Graph construction utilities (src/agent_factory/graph_builder.py)
  • Agents: Pydantic AI agents (src/agents/, src/agent_factory/agents.py)
  • Search Tools: Neo4j knowledge graph, PubMed, ClinicalTrials.gov, Europe PMC, Web search, RAG (src/tools/)
  • Judge Handler: LLM-based evidence assessment (src/agent_factory/judges.py)
  • Embeddings: Semantic search & deduplication (src/services/embeddings.py)
  • Statistical Analyzer: Modal sandbox execution (src/services/statistical_analyzer.py)
  • Multimodal Processing: Image OCR and audio STT/TTS services (src/services/multimodal_processing.py, src/services/audio_processing.py)
  • Middleware: State management, budget tracking, workflow coordination (src/middleware/)
  • MCP Tools: Claude Desktop integration (src/mcp_tools.py)
  • Gradio UI: Web interface with MCP server and streaming (src/app.py)

Research Team & Parallel Execution

The system supports complex research workflows through:

  1. WorkflowManager: Coordinates multiple parallel research loops
  2. Creates and tracks ResearchLoop instances
  3. Runs loops in parallel via asyncio.gather()
  4. Synchronizes evidence to global state
  5. Handles loop failures gracefully

  6. Deep Research Pattern: Breaks complex queries into sections

  7. Planner creates report outline with sections
  8. Each section runs as independent iterative research loop
  9. Loops execute in parallel
  10. Evidence shared across loops via global state
  11. Final synthesis combines all section results

  12. State Synchronization: Thread-safe evidence sharing

  13. Evidence deduplication by URL
  14. Global state accessible to all loops
  15. Semantic search across all collected evidence
  16. Conversation history tracking per iteration

Configuration & Modes

  • Orchestrator Factory (src/orchestrator_factory.py):
  • Auto-detects mode: "advanced" if OpenAI key available, else "simple"
  • Supports explicit mode selection: "simple", "magentic" (alias for "advanced"), "advanced", "iterative", "deep", "auto"
  • Lazy imports for optional dependencies

  • Orchestrator Modes (selected in UI or via factory):

  • simple: Legacy linear search-judge loop (Free Tier)
  • advanced or magentic: Multi-agent coordination using Microsoft Agent Framework (requires OpenAI API key)
  • iterative: Knowledge-gap-driven research with single loop (Free Tier)
  • deep: Parallel section-based research with planning (Free Tier)
  • auto: Intelligent mode detection based on query complexity (Free Tier)

  • Graph Research Modes (used within graph orchestrator, separate from orchestrator mode):

  • iterative: Single research loop pattern
  • deep: Multi-section parallel research pattern
  • auto: Auto-detect pattern based on query complexity

  • Execution Modes:

  • use_graph=True: Graph-based execution (parallel, conditional routing)
  • use_graph=False: Agent chains (sequential, backward compatible)

Note: The UI provides separate controls for orchestrator mode and graph research mode. When using graph-based orchestrators (iterative/deep/auto), the graph research mode determines the specific pattern used within the graph execution.


Architecture Overview

The DETERMINATOR is a generalist deep research agent system that uses iterative search-and-judge loops to investigate any research question comprehensively. It stops at nothing short of a precise answer, halting only at configured limits (budget, time, iterations). The system automatically determines whether medical knowledge sources are needed and adapts its search strategy accordingly. It supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.

Core Architecture

Orchestration Patterns

  1. Graph Orchestrator (src/orchestrator/graph_orchestrator.py):
      • Graph-based execution using Pydantic AI agents as nodes
      • Supports both iterative and deep research patterns
      • Node types: Agent, State, Decision, Parallel
      • Edge types: Sequential, Conditional, Parallel
      • Conditional routing based on knowledge gaps, budget, and iterations
      • Parallel execution for concurrent research loops
      • Event streaming via AsyncGenerator[AgentEvent] for real-time UI updates
      • Fallback to agent chains when graph execution is disabled

  2. Deep Research Flow (src/orchestrator/research_flow.py):
      • Pattern: Planner → Parallel Iterative Loops (one per section) → Synthesis
      • Uses PlannerAgent to break the query into report sections
      • Runs IterativeResearchFlow instances in parallel per section via WorkflowManager
      • Synthesizes results using LongWriterAgent or ProofreaderAgent
      • Supports both graph execution (use_graph=True) and agent chains (use_graph=False)
      • Budget tracking per section and globally
      • State synchronization across parallel loops

  3. Iterative Research Flow (src/orchestrator/research_flow.py):
      • Pattern: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete
      • Uses KnowledgeGapAgent, ToolSelectorAgent, ThinkingAgent, WriterAgent
      • JudgeHandler assesses evidence sufficiency
      • Iterates until research is complete or constraints are met (iterations, time, tokens)
      • Supports graph execution and agent chains

  4. Magentic Orchestrator (src/orchestrator_magentic.py):
      • Multi-agent coordination using agent-framework-core
      • ChatAgent pattern with internal LLMs per agent
      • Uses MagenticBuilder with participants: searcher, hypothesizer, judge, reporter
      • Manager orchestrates agents via OpenAIChatClient
      • Requires an OpenAI API key (function calling support)
      • Event-driven: converts Magentic events to AgentEvent for UI streaming
      • Supports long-running workflows with max rounds and stall/reset handling

  5. Hierarchical Orchestrator (src/orchestrator_hierarchical.py):
      • Uses SubIterationMiddleware with ResearchTeam and LLMSubIterationJudge
      • Adapts Magentic ChatAgent to the SubIterationTeam protocol
      • Event-driven via asyncio.Queue for coordination
      • Supports sub-iteration patterns for complex research tasks

  6. Legacy Simple Mode (src/legacy_orchestrator.py):
      • Linear search-judge-synthesize loop
      • Uses SearchHandlerProtocol and JudgeHandlerProtocol
      • Generator-based design yielding AgentEvent objects
      • Backward compatibility for simple use cases

Long-Running Task Support

The system is designed for long-running research tasks with comprehensive state management and streaming:

  • Event Streaming:
      • All orchestrators yield AgentEvent objects via AsyncGenerator
      • Real-time UI updates through the Gradio chat interface
      • Event types: started, searching, search_complete, judging, judge_complete, looping, synthesizing, hypothesizing, complete, error
      • Metadata includes iteration numbers, tool names, result counts, durations
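
The streaming contract above can be sketched as follows; the AgentEvent fields and the exact event sequence here are illustrative assumptions, not the definitions from src:

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncGenerator


@dataclass
class AgentEvent:
    type: str                  # e.g. "started", "searching", "complete"
    message: str = ""
    metadata: dict = field(default_factory=dict)


async def orchestrate(query: str) -> AsyncGenerator[AgentEvent, None]:
    # A toy orchestrator: two search/judge rounds, then completion.
    yield AgentEvent("started", f"Researching: {query}")
    for iteration in (1, 2):
        yield AgentEvent("searching", metadata={"iteration": iteration})
        await asyncio.sleep(0)  # stand-in for a real search call
        yield AgentEvent("judging", metadata={"iteration": iteration})
    yield AgentEvent("complete", "Report ready")


async def main() -> list[str]:
    # A UI would render each event as it arrives; here we only collect types.
    return [event.type async for event in orchestrate("example query")]

print(asyncio.run(main()))
```

Because events are yielded as soon as they occur, the UI layer can update per event instead of waiting for the final report.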

  • Budget Tracking (src/middleware/budget_tracker.py):
      • Per-loop and global budget management
      • Tracks tokens, time (seconds), and iterations
      • Budget enforcement at decision nodes
      • Token estimation (~4 chars per token)
      • Early termination when budgets are exceeded
      • Budget summaries for monitoring
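
A minimal sketch of the budget rules above; the ~4-characters-per-token estimate is from the text, while the class and field names are assumptions rather than the budget_tracker.py API:

```python
import time
from dataclasses import dataclass, field


@dataclass
class LoopBudget:
    max_tokens: int = 50_000
    max_seconds: float = 600.0
    max_iterations: int = 10
    tokens_used: int = 0
    iterations: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def add_text(self, text: str) -> None:
        # Cheap estimate used instead of a tokenizer: ~4 characters per token.
        self.tokens_used += max(1, len(text) // 4)

    def exceeded(self) -> bool:
        # Checked at decision nodes; any exhausted budget terminates the loop.
        return (
            self.tokens_used >= self.max_tokens
            or self.iterations >= self.max_iterations
            or time.monotonic() - self.started_at >= self.max_seconds
        )


budget = LoopBudget(max_tokens=100)
budget.add_text("x" * 200)   # ~50 tokens
print(budget.exceeded())     # False: still under every limit
budget.add_text("x" * 300)   # ~75 more tokens, 125 total
print(budget.exceeded())     # True: token budget exhausted
```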

  • Workflow Manager (src/middleware/workflow_manager.py):
      • Coordinates parallel research loops
      • Tracks loop status: pending, running, completed, failed, cancelled
      • Synchronizes evidence between loops and global state
      • Handles errors per loop (one failure doesn't fail the others)
      • Supports loop cancellation and timeout handling
      • Evidence deduplication across parallel loops
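
The per-loop error isolation can be sketched with asyncio.gather(return_exceptions=True); the real WorkflowManager also tracks status and deduplicates evidence, and the names below are assumptions:

```python
import asyncio


async def run_loop(section: str) -> list[str]:
    # Stand-in for one iterative research loop.
    if section == "bad":
        raise RuntimeError("search backend down")
    await asyncio.sleep(0)
    return [f"evidence for {section}"]


async def run_all(sections: list[str]) -> dict[str, object]:
    # return_exceptions=True isolates failures: one failed loop does not
    # cancel its siblings, mirroring the per-loop error handling above.
    results = await asyncio.gather(
        *(run_loop(s) for s in sections), return_exceptions=True
    )
    return {
        s: ("failed" if isinstance(r, Exception) else r)
        for s, r in zip(sections, results)
    }

print(asyncio.run(run_all(["intro", "bad", "methods"])))
```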

  • State Management (src/middleware/state_machine.py):
      • Thread-safe isolation using ContextVar for concurrent requests
      • WorkflowState tracks evidence, conversation history, and the embedding service
      • Evidence deduplication by URL
      • Semantic search via the embedding service
      • State persistence across long-running workflows
      • Supports both iterative and deep research patterns
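
The ContextVar isolation pattern can be sketched as below; WorkflowState here carries only an evidence list, a deliberate simplification of the fields listed above:

```python
import asyncio
from contextvars import ContextVar
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    evidence: list[str] = field(default_factory=list)


_state: ContextVar[WorkflowState] = ContextVar("workflow_state")


async def handle_request(name: str) -> int:
    # Each asyncio task runs in its own copy of the context, so concurrent
    # requests never see each other's evidence.
    _state.set(WorkflowState())
    for i in range(2 if name == "a" else 3):
        _state.get().evidence.append(f"{name}-{i}")
        await asyncio.sleep(0)   # interleave with the other request
    return len(_state.get().evidence)


async def main() -> list[int]:
    return await asyncio.gather(handle_request("a"), handle_request("b"))

print(asyncio.run(main()))  # [2, 3]: each request saw only its own evidence
```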

  • Gradio UI (src/app.py):
      • Real-time streaming of research progress
      • Accordion-based UI for pending/done operations
      • OAuth integration (HuggingFace)
      • Multiple backend support (API keys, free tier)
      • Handles long-running tasks with progress indicators
      • Event accumulation for pending operations

Graph Architecture

The graph orchestrator (src/orchestrator/graph_orchestrator.py) implements a flexible graph-based execution model:

Node Types:

  • Agent Nodes: Execute Pydantic AI agents (e.g., KnowledgeGapAgent, ToolSelectorAgent)
  • State Nodes: Update or read workflow state (evidence, conversation)
  • Decision Nodes: Make routing decisions (research complete?, budget exceeded?)
  • Parallel Nodes: Execute multiple nodes concurrently (parallel research loops)

Edge Types:

  • Sequential Edges: Always traversed (no condition)
  • Conditional Edges: Traversed based on condition (e.g., if research complete → writer, else → tool selector)
  • Parallel Edges: Used for parallel execution branches

Graph Patterns:

  • Iterative Graph: [Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?] → [Tool Selector] or [Writer]
  • Deep Research Graph: [Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]

Execution Flow:

  1. Graph construction from nodes and edges
  2. Graph validation (no cycles, all nodes reachable)
  3. Graph execution from entry node
  4. Node execution based on type
  5. Edge evaluation for next node(s)
  6. Parallel execution via asyncio.gather()
  7. State updates at state nodes
  8. Event streaming for UI
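
The flow above (construct, execute from the entry node, evaluate edges) reduces to a small interpreter. This sketch skips validation and parallel nodes, and the node/edge shapes are assumptions rather than the graph_orchestrator.py classes:

```python
from typing import Callable

Node = Callable[[dict], dict]                    # a node reads and updates state
Edge = tuple[str, str, Callable[[dict], bool]]   # (source, target, condition)


def execute(nodes: dict[str, Node], edges: list[Edge], entry: str) -> dict:
    state: dict = {}
    current: str | None = entry
    while current is not None:
        state = nodes[current](state)            # node execution
        nxt = None
        for src, dst, cond in edges:             # edge evaluation
            if src == current and cond(state):
                nxt = dst
                break
        current = nxt                            # no matching edge: stop
    return state


# A toy iterative graph: gap check -> search until complete -> writer.
nodes = {
    "gap": lambda s: {**s, "complete": s.get("round", 0) >= 2},
    "search": lambda s: {**s, "round": s.get("round", 0) + 1},
    "writer": lambda s: {**s, "report": f"done in {s['round']} rounds"},
}
edges = [
    ("gap", "writer", lambda s: s["complete"]),      # conditional edge
    ("gap", "search", lambda s: not s["complete"]),  # conditional edge
    ("search", "gap", lambda s: True),               # sequential edge
]
print(execute(nodes, edges, "gap")["report"])  # done in 2 rounds
```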

Key Components

  • Orchestrators: Multiple orchestration patterns (src/orchestrator/, src/orchestrator_*.py)
  • Research Flows: Iterative and deep research patterns (src/orchestrator/research_flow.py)
  • Graph Builder: Graph construction utilities (src/agent_factory/graph_builder.py)
  • Agents: Pydantic AI agents (src/agents/, src/agent_factory/agents.py)
  • Search Tools: Neo4j knowledge graph, PubMed, ClinicalTrials.gov, Europe PMC, Web search, RAG (src/tools/)
  • Judge Handler: LLM-based evidence assessment (src/agent_factory/judges.py)
  • Embeddings: Semantic search & deduplication (src/services/embeddings.py)
  • Statistical Analyzer: Modal sandbox execution (src/services/statistical_analyzer.py)
  • Multimodal Processing: Image OCR and audio STT/TTS services (src/services/multimodal_processing.py, src/services/audio_processing.py)
  • Middleware: State management, budget tracking, workflow coordination (src/middleware/)
  • MCP Tools: Claude Desktop integration (src/mcp_tools.py)
  • Gradio UI: Web interface with MCP server and streaming (src/app.py)

Research Team & Parallel Execution

The system supports complex research workflows through:

  • WorkflowManager: Coordinates multiple parallel research loops
      • Creates and tracks ResearchLoop instances
      • Runs loops in parallel via asyncio.gather()
      • Synchronizes evidence to global state
      • Handles loop failures gracefully

  • Deep Research Pattern: Breaks complex queries into sections
      • Planner creates a report outline with sections
      • Each section runs as an independent iterative research loop
      • Loops execute in parallel
      • Evidence is shared across loops via global state
      • Final synthesis combines all section results

  • State Synchronization: Thread-safe evidence sharing
      • Evidence deduplication by URL
      • Global state accessible to all loops
      • Semantic search across all collected evidence
      • Conversation history tracking per iteration

Configuration & Modes

  • Orchestrator Factory (src/orchestrator_factory.py):
      • Auto-detects mode: "advanced" if an OpenAI key is available, else "simple"
      • Supports explicit mode selection: "simple", "magentic" (alias for "advanced"), "advanced", "iterative", "deep", "auto"
      • Lazy imports for optional dependencies

  • Orchestrator Modes (selected in the UI or via the factory):
      • simple: Legacy linear search-judge loop (Free Tier)
      • advanced or magentic: Multi-agent coordination using Microsoft Agent Framework (requires OpenAI API key)
      • iterative: Knowledge-gap-driven research with a single loop (Free Tier)
      • deep: Parallel section-based research with planning (Free Tier)
      • auto: Intelligent mode detection based on query complexity (Free Tier)

  • Graph Research Modes (used within the graph orchestrator, separate from the orchestrator mode):
      • iterative: Single research loop pattern
      • deep: Multi-section parallel research pattern
      • auto: Auto-detect pattern based on query complexity

  • Execution Modes:
      • use_graph=True: Graph-based execution (parallel, conditional routing)
      • use_graph=False: Agent chains (sequential, backward compatible)

Note: The UI provides separate controls for orchestrator mode and graph research mode. When using graph-based orchestrators (iterative/deep/auto), the graph research mode determines the specific pattern used within the graph execution.


Features

The DETERMINATOR provides a comprehensive set of features for AI-assisted research:

Core Features

  • General Web Search: Search general knowledge sources for any domain
  • Neo4j Knowledge Graph: Search structured knowledge graph for papers and disease relationships
  • PubMed: Search peer-reviewed biomedical literature via NCBI E-utilities (automatically used when medical knowledge needed)
  • ClinicalTrials.gov: Search interventional clinical trials (automatically used when medical knowledge needed)
  • Europe PMC: Search preprints and peer-reviewed articles (includes bioRxiv/medRxiv)
  • RAG: Semantic search within collected evidence using LlamaIndex
  • Automatic Source Selection: Automatically determines which sources are needed based on query analysis

MCP Integration

  • Model Context Protocol: Expose search tools via MCP server
  • Claude Desktop: Use The DETERMINATOR tools directly from Claude Desktop
  • MCP Clients: Compatible with any MCP-compatible client

Authentication

  • REQUIRED: Authentication is mandatory before using the application
  • HuggingFace OAuth: Sign in with HuggingFace account for automatic API token usage (recommended)
  • Manual API Keys: Support for HuggingFace API keys via environment variables (HF_TOKEN or HUGGINGFACE_API_KEY)
  • Free Tier Support: Automatic fallback to HuggingFace Inference API (public models) when no API key is available
  • Authentication Check: The application will display an error message if authentication is not provided

Secure Code Execution

  • Modal Sandbox: Secure execution of AI-generated statistical code
  • Isolated Environment: Network isolation and package version pinning
  • Safe Execution: Prevents malicious code execution

Semantic Search & RAG

  • LlamaIndex Integration: Advanced RAG capabilities
  • Vector Storage: ChromaDB for embedding storage
  • Semantic Deduplication: Automatic detection of similar evidence
  • Embedding Service: Local sentence-transformers (no API key required)

Orchestration Patterns

  • Graph-Based Execution: Flexible graph orchestration with conditional routing
  • Parallel Research Loops: Run multiple research tasks concurrently
  • Iterative Research: Single-loop research with search-judge-synthesize cycles that continues until precise answers are found
  • Deep Research: Multi-section parallel research with planning and synthesis
  • Magentic Orchestration: Multi-agent coordination using Microsoft Agent Framework (alias: "advanced" mode)
  • Stops at Nothing: Continues until precise answers are found, stopping only at configured limits (budget, time, iterations)

Orchestrator Modes:

  • simple: Legacy linear search-judge loop
  • advanced (or magentic): Multi-agent coordination (requires OpenAI API key)
  • iterative: Knowledge-gap-driven research with a single loop
  • deep: Parallel section-based research with planning
  • auto: Intelligent mode detection based on query complexity

Graph Research Modes (used within the graph orchestrator):

  • iterative: Single research loop pattern
  • deep: Multi-section parallel research pattern
  • auto: Auto-detect pattern based on query complexity

Execution Modes:

  • use_graph=True: Graph-based execution with parallel and conditional routing
  • use_graph=False: Agent chains with sequential execution (backward compatible)

Real-Time Streaming

  • Event Streaming: Real-time updates via AsyncGenerator[AgentEvent]
  • Progress Tracking: Monitor research progress with detailed event metadata
  • UI Integration: Seamless integration with Gradio chat interface

Budget Management

  • Token Budget: Track and limit LLM token usage
  • Time Budget: Enforce time limits per research loop
  • Iteration Budget: Limit maximum iterations
  • Per-Loop Budgets: Independent budgets for parallel research loops

State Management

  • Thread-Safe Isolation: ContextVar-based state management
  • Evidence Deduplication: Automatic URL-based deduplication
  • Conversation History: Track iteration history and agent interactions
  • State Synchronization: Share evidence across parallel loops
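
URL-based deduplication reduces to a seen-set keyed on the URL; the Evidence shape here is an illustrative assumption, not the project's actual model:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    url: str
    snippet: str


def dedupe(items: list[Evidence]) -> list[Evidence]:
    seen: set[str] = set()
    unique: list[Evidence] = []
    for item in items:
        if item.url not in seen:   # first occurrence wins
            seen.add(item.url)
            unique.append(item)
    return unique


batch = [
    Evidence("https://pubmed.ncbi.nlm.nih.gov/1", "a"),
    Evidence("https://pubmed.ncbi.nlm.nih.gov/1", "a again"),
    Evidence("https://europepmc.org/2", "b"),
]
print([e.url for e in dedupe(batch)])  # the repeated PubMed URL is dropped
```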

Multimodal Input & Output

  • Image Input (OCR): Upload images and extract text using optical character recognition
  • Audio Input (STT): Record or upload audio files and transcribe to text using speech-to-text
  • Audio Output (TTS): Generate audio responses with text-to-speech synthesis
  • Configurable Settings: Enable/disable multimodal features via sidebar settings
  • Voice Selection: Choose from multiple TTS voices (American English: af_, am_)
  • Speech Speed Control: Adjust TTS speech speed (0.5x to 2.0x)
  • Multimodal Processing Service: Integrated service for processing images and audio files

Advanced Features

Agent System

  • Pydantic AI Agents: Type-safe agent implementation
  • Structured Output: Pydantic models for agent responses
  • Agent Factory: Centralized agent creation with fallback support
  • Specialized Agents: Knowledge gap, tool selector, writer, proofreader, and more

Search Tools

  • Rate Limiting: Built-in rate limiting for external APIs
  • Retry Logic: Automatic retry with exponential backoff
  • Query Preprocessing: Automatic query enhancement and synonym expansion
  • Evidence Conversion: Automatic conversion to structured Evidence objects
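
Retry with exponential backoff can be sketched as a small wrapper; the delays, attempt count, and names here are illustrative, not the values used by src/tools/:

```python
import asyncio


async def with_retries(call, attempts: int = 3, base_delay: float = 0.01):
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise                      # out of retries: surface the error
            # Exponential backoff: 1x, 2x, 4x, ... the base delay.
            await asyncio.sleep(base_delay * (2 ** attempt))


failures = {"left": 2}   # the first two calls fail, the third succeeds

async def flaky_search() -> str:
    if failures["left"] > 0:
        failures["left"] -= 1
        raise ConnectionError("rate limited")
    return "results"

print(asyncio.run(with_retries(flaky_search)))  # "results" on the 3rd attempt
```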

Error Handling

  • Custom Exceptions: Hierarchical exception system
  • Error Chaining: Preserve exception context
  • Structured Logging: Comprehensive logging with structlog
  • Graceful Degradation: Fallback handlers for missing dependencies

Configuration

  • Pydantic Settings: Type-safe configuration management
  • Environment Variables: Support for .env files
  • Validation: Automatic configuration validation
  • Flexible Providers: Support for multiple LLM and embedding providers

Testing

  • Unit Tests: Comprehensive unit test coverage
  • Integration Tests: Real API integration tests
  • Mock Support: Extensive mocking utilities
  • Coverage Reports: Code coverage tracking

UI Features

Gradio Interface

  • Real-Time Chat: Interactive chat interface with multimodal support
  • Streaming Updates: Live progress updates
  • Accordion UI: Organized display of pending/done operations
  • OAuth Integration: Seamless HuggingFace authentication
  • Multimodal Input: Support for text, images, and audio input in the same interface
  • Sidebar Settings: Configuration accordions for research, multimodal, and audio settings

MCP Server

  • RESTful API: HTTP-based MCP server
  • Tool Discovery: Automatic tool registration
  • Request Handling: Async request processing
  • Error Responses: Structured error responses

Development Features

Code Quality

  • Type Safety: Full type hints with mypy strict mode
  • Linting: Ruff for code quality
  • Formatting: Automatic code formatting
  • Pre-commit Hooks: Automated quality checks

Documentation

  • Comprehensive Docs: Detailed documentation for all components
  • Code Examples: Extensive code examples
  • Architecture Diagrams: Visual architecture documentation
  • API Reference: Complete API documentation
Quick Start

  • Restart Claude Desktop

  • Use DeepCritical tools directly from Claude Desktop
  • Available Tools

    Note: The application automatically uses all available search tools (Neo4j, PubMed, ClinicalTrials.gov, Europe PMC, Web search, RAG) based on query analysis. Neo4j knowledge graph search is included by default for biomedical queries.

    Next Steps


    Team

    DeepCritical is developed by a team of researchers and developers working on AI-assisted research.

    Team Members

    ZJ

    Mario Aderman

    Joseph Pollack

    Virat Chauran

    Anna Bossler

    About

    The DeepCritical team met online in the Alzheimer's Critical Literature Review Group in the Hugging Science initiative. We're building the agent framework we want to use for AI-assisted research to turn the vast amounts of clinical data into cures.

    Contributing

    We welcome contributions! See the Contributing Guide for details.
