[Issue]: Unable to get DocAgent working with Azure OpenAI endpoints #2174

@Vkodthiv

Describe the issue

Our corporate network does not allow access to OpenAI endpoints, so we use Azure OpenAI models instead. During ingestion, the DocAgent sample code tries to connect to huggingface.co, which is also blocked on the corporate network. How do I get full control over the document ingestion and embedding process for DocAgent?

EXECUTED FUNCTION data_ingest_task...
Call ID: call_mr5PXRJcAbxeqLDFf5eK6nIn
Input arguments: {}
Output:
Data Ingestion Task Failed, Error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/ds4sd/docling-layout-heron/revision/main (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))"), '(Request ID: 80ae4cfa-9f07-49e0-ba54-96c2fe9746b8)'): 'UMNwriteup.pdf'
_Group_Tool_Executor (to chat_manager):

***** Response from calling tool (call_mr5PXRJcAbxeqLDFf5eK6nIn) *****
Data Ingestion Task Failed, Error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/ds4sd/docling-layout-heron/revision/main (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))"), '(Request ID: 80ae4cfa-9f07-49e0-ba54-96c2fe9746b8)'): 'UMNwriteup.pdf'
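
One possible direction, sketched under assumptions and not verified against DocAgent: the failure above is a certificate verification error on a model download from huggingface.co, so the two obvious levers are trusting the corporate root CA and avoiding the download altogether by using pre-fetched models. `REQUESTS_CA_BUNDLE`, `SSL_CERT_FILE`, and `HF_HUB_OFFLINE` are standard environment variables for `requests`, Python's `ssl`, and `huggingface_hub`. `DOCLING_ARTIFACTS_PATH` is taken from the docling documentation, and whether DocAgent's ingestion path honors it is an assumption. The helper name `restricted_network_env` and both paths are hypothetical placeholders.

```python
import os
from typing import Dict, Optional


def restricted_network_env(
    ca_bundle: Optional[str] = None,
    artifacts_dir: Optional[str] = None,
) -> Dict[str, str]:
    """Build environment overrides for a network that blocks or intercepts huggingface.co.

    Hypothetical helper for illustration; it only assembles a dict of variables.
    """
    env: Dict[str, str] = {}
    if ca_bundle:
        # Point HTTP clients at the corporate root CA so TLS verification can succeed.
        env["REQUESTS_CA_BUNDLE"] = ca_bundle
        env["SSL_CERT_FILE"] = ca_bundle
    if artifacts_dir:
        # Assumption: docling reads pre-downloaded models from this directory.
        env["DOCLING_ARTIFACTS_PATH"] = artifacts_dir
        # Ask huggingface_hub not to make network calls at all.
        env["HF_HUB_OFFLINE"] = "1"
    return env


# Placeholder paths; apply the overrides before importing autogen or docling.
overrides = restricted_network_env(
    ca_bundle="/path/to/corporate-ca.pem",
    artifacts_dir="/path/to/docling-models",
)
os.environ.update(overrides)
```

The variables need to be set before the libraries are imported, since some of them are read at import time. As far as I understand, the models themselves would have to be fetched once on a machine that can reach huggingface.co (the docling documentation describes a `docling-tools models download` command for this) and then copied to the artifacts directory.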


Steps to reproduce

Run the sample code at https://docs.ag2.ai/0.10.0/docs/use-cases/notebooks/notebooks/agents_docagent/, but with the following Azure LLM config:

import os

import autogen

llm_config = autogen.LLMConfig(
    api_type="azure",
    api_version="2025-01-01-preview",
    base_url="<<azure_oai_endpoint>>",  # placeholder for the Azure OpenAI endpoint
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    model="gpt-4o",
    cache_seed=None,
)

Screenshots and logs

Can you ingest UMNwriteup.pdf and tell me the name of the patient?


_User (to chat_manager):

Can you ingest UMNwriteup.pdf and tell me the name of the patient?


Next speaker: DocumentTriageAgent

DocumentTriageAgent (to chat_manager):

{"ingestions":[{"path_or_url":"UMNwriteup.pdf"}],"queries":[{"query_type":"RAG_QUERY","query":"What is the name of the patient mentioned in UMNwriteup.pdf?"}]}


Next speaker: TaskManagerAgent

USING AUTO REPLY...
TaskManagerAgent (to chat_manager):

***** Suggested tool call (call_N0NGqbphUD0O30zCD2EhFe3H): initiate_tasks *****
Arguments:
{"task_init_info":{"ingestions":[{"path_or_url":"UMNwriteup.pdf"}],"queries":[{"query_type":"RAG_QUERY","query":"What is the name of the patient mentioned in UMNwriteup.pdf?"}]}}



Next speaker: _Group_Tool_Executor

EXECUTING FUNCTION initiate_tasks...
Call ID: call_N0NGqbphUD0O30zCD2EhFe3H
Input arguments: {'task_init_info': {'ingestions': [{'path_or_url': 'UMNwriteup.pdf'}], 'queries': [{'query_type': 'RAG_QUERY', 'query': 'What is the name of the patient mentioned in UMNwriteup.pdf?'}]}}

EXECUTED FUNCTION initiate_tasks...
Call ID: call_N0NGqbphUD0O30zCD2EhFe3H
Input arguments: {'task_init_info': {'ingestions': [{'path_or_url': 'UMNwriteup.pdf'}], 'queries': [{'query_type': 'RAG_QUERY', 'query': 'What is the name of the patient mentioned in UMNwriteup.pdf?'}]}}
Output:
Updated context variables with task decisions
_Group_Tool_Executor (to chat_manager):

***** Response from calling tool (call_N0NGqbphUD0O30zCD2EhFe3H) *****
Updated context variables with task decisions



Next speaker: TaskManagerAgent

TaskManagerAgent (to chat_manager):

[Handing off to DoclingDocIngestAgent]


Next speaker: DoclingDocIngestAgent

USING AUTO REPLY...
DoclingDocIngestAgent (to chat_manager):

Ingesting the document "UMNwriteup.pdf" to extract the required information.
***** Suggested tool call (call_mr5PXRJcAbxeqLDFf5eK6nIn): data_ingest_task *****
Arguments:
{}



Next speaker: _Group_Tool_Executor

EXECUTING FUNCTION data_ingest_task...
Call ID: call_mr5PXRJcAbxeqLDFf5eK6nIn
Input arguments: {}
INFO [autogen.agents.experimental.document_agent.document_utils] Detected file. Returning file path...
INFO [docling.datamodel.document] detected formats: [<InputFormat.PDF: 'pdf'>]
INFO [docling.document_converter] Going to convert document batch...
INFO [docling.document_converter] Initializing pipeline for StandardPdfPipeline with options hash 154138fdf3f99bf9804fd62accd0504f
INFO [docling.models.factories.base_factory] Loading plugin 'docling_defaults'
INFO [docling.models.factories] Registered picture descriptions: ['vlm', 'api']
INFO [docling.models.factories.base_factory] Loading plugin 'docling_defaults'
INFO [docling.models.factories] Registered ocr engines: ['auto', 'easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
INFO [docling.models.auto_ocr_model] Auto OCR model selected ocrmac.
INFO [docling.utils.accelerator_utils] Accelerator device: 'mps'

EXECUTED FUNCTION data_ingest_task...
Call ID: call_mr5PXRJcAbxeqLDFf5eK6nIn
Input arguments: {}
Output:
Data Ingestion Task Failed, Error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/ds4sd/docling-layout-heron/revision/main (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))"), '(Request ID: 80ae4cfa-9f07-49e0-ba54-96c2fe9746b8)'): 'UMNwriteup.pdf'
_Group_Tool_Executor (to chat_manager):

***** Response from calling tool (call_mr5PXRJcAbxeqLDFf5eK6nIn) *****
Data Ingestion Task Failed, Error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/ds4sd/docling-layout-heron/revision/main (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))"), '(Request ID: 80ae4cfa-9f07-49e0-ba54-96c2fe9746b8)'): 'UMNwriteup.pdf'



Next speaker: ErrorManagerAgent

USING AUTO REPLY...
ErrorManagerAgent (to chat_manager):

The following error(s) have occurred:

  • Data Ingestion Task Failed, Error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/ds4sd/docling-layout-heron/revision/main (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))"), '(Request ID: 80ae4cfa-9f07-49e0-ba54-96c2fe9746b8)')

TERMINATING RUN (a3d05752-a775-461d-9700-cd08e9fee659): No next speaker selected
doc_agent (to user):

The following error(s) have occurred:

  • Data Ingestion Task Failed, Error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/ds4sd/docling-layout-heron/revision/main (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))"), '(Request ID: 80ae4cfa-9f07-49e0-ba54-96c2fe9746b8)')

TERMINATING RUN (182a18b9-a24f-40bf-9161-893111804b78): Maximum turns (1) reached
-> Cannot close object, library is destroyed. This may cause a memory leak!

Additional Information

No response