Feature Request
docling.datamodel.settings.settings is a process-wide singleton (AppSettings) and the only way to apply non-default perf / debug / inference configuration is to mutate the singleton's typed sub-models in place. There is no per-converter override hook, no scoped-settings context manager, and no documented reset path.
This forces every embedder of Docling that needs anything other than library defaults to:
- Mutate docling.datamodel.settings.settings.perf / .debug directly (or via model_validate(model_dump(mode="json")) on the typed sub-model classes),
- Live with the fact that the next batch — or any downstream code that reads the singleton — sees that mutation,
- Hand-roll snapshot / restore around each conversion run if process-state hygiene matters.
Concrete impact
- Per-batch / per-converter configuration. A DocumentConverter constructed with one set of perf/debug intents cannot coexist in the same process with another that wants different intents: the second DocumentConverter build wins because it overwrites the singleton.
- Process hygiene after errors. When a batch errors, the next batch (or any subsequent code path) sees the previous batch's mutation. There is no reset() or defaults() accessor.
- Test isolation. A test harness that runs Docling alongside an isolated reference converter cannot rely on pristine library defaults between cases; every test that touches perf / debug must save and restore manually.
- Multi-tenant embedding. A web service that converts documents on behalf of multiple tenants with different policies (e.g. different page_batch_size, different profile_pipeline_timings) cannot serve them concurrently from one Python process.
- Any embedder that wants to surface its own config UI. The shape of BatchConcurrencySettings / DebugSettings is fine; it's the lifecycle that is missing.
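The "second build wins" behaviour from the first bullet can be reproduced without Docling installed. The sketch below is illustrative only: PerfSettings, AppSettings and FakeConverter are hypothetical stand-ins built on stdlib dataclasses, not Docling classes, and it assumes an embedder that applies its intent by overwriting the singleton at construction time.

```python
from dataclasses import dataclass, field


@dataclass
class PerfSettings:
    """Hypothetical stand-in for a typed perf sub-model."""
    page_batch_size: int = 4


@dataclass
class AppSettings:
    """Hypothetical stand-in for the process-wide settings object."""
    perf: PerfSettings = field(default_factory=PerfSettings)


# Module-level singleton, mirroring the pattern described in this issue.
settings = AppSettings()


class FakeConverter:
    """Hypothetical embedder wrapper that applies its intent via the singleton."""

    def __init__(self, page_batch_size: int) -> None:
        self.intended = page_batch_size
        settings.perf = PerfSettings(page_batch_size=page_batch_size)

    def effective_batch_size(self) -> int:
        # Conversion code reads the singleton at call time, not the intent
        # captured when this wrapper was constructed.
        return settings.perf.page_batch_size


tenant_a = FakeConverter(page_batch_size=2)
tenant_b = FakeConverter(page_batch_size=16)

# tenant_a asked for 2 but observes the value written for tenant_b.
print(tenant_a.intended, tenant_a.effective_batch_size())  # 2 16
```

Nothing here depends on threads: two sequential constructions in one process are enough to lose the first intent.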
Proposed API
A scoped context manager — analogous to decimal.localcontext() or numpy.errstate() — would be the smallest, most idiomatic fix:
from docling.datamodel import settings as docling_settings
from docling.datamodel.settings import BatchConcurrencySettings, DebugSettings

# `converter` and `input_path` are assumed to be defined by the caller.
with docling_settings.scoped(
    perf=BatchConcurrencySettings(page_batch_size=8),
    debug=DebugSettings(profile_pipeline_timings=True),
):
    result = converter.convert(input_path)

# After the with-block: docling_settings.settings.perf and .debug
# are exactly what they were before.
Sketch implementation:
from contextlib import contextmanager

# Intended to live in docling.datamodel.settings, next to the `settings` singleton.
@contextmanager
def scoped(
    *,
    perf: BatchConcurrencySettings | None = None,
    debug: DebugSettings | None = None,
    inference: InferenceSettings | None = None,
):
    # Keep references to the current sub-models so they can be restored on exit.
    saved = {
        "perf": settings.perf,
        "debug": settings.debug,
        "inference": settings.inference,
    }
    try:
        if perf is not None:
            settings.perf = perf
        if debug is not None:
            settings.debug = debug
        if inference is not None:
            settings.inference = inference
        yield settings
    finally:
        # Runs on normal exit and when the body raises.
        settings.perf = saved["perf"]
        settings.debug = saved["debug"]
        settings.inference = saved["inference"]
Optional but useful: a settings.defaults() accessor that returns a fresh AppSettings() for callers that want to reset rather than scope.
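As a rough illustration of the reset path, the sketch below shows one way a defaults() accessor could behave. It uses stdlib dataclasses as stand-ins for the Pydantic models, so the class bodies are assumptions, and the reset() helper is a hypothetical addition that is not part of the proposal itself.

```python
from dataclasses import dataclass, field


@dataclass
class PerfSettings:
    """Stand-in for the perf sub-model (not the Docling class)."""
    page_batch_size: int = 4


@dataclass
class DebugSettings:
    """Stand-in for the debug sub-model (not the Docling class)."""
    profile_pipeline_timings: bool = False


@dataclass
class AppSettings:
    perf: PerfSettings = field(default_factory=PerfSettings)
    debug: DebugSettings = field(default_factory=DebugSettings)

    @classmethod
    def defaults(cls) -> "AppSettings":
        """Return a fresh instance that holds only default values."""
        return cls()

    def reset(self) -> None:
        """Hypothetical helper: overwrite every sub-model with its default."""
        fresh = self.defaults()
        self.perf = fresh.perf
        self.debug = fresh.debug


settings = AppSettings()
settings.perf.page_batch_size = 99
settings.debug.profile_pipeline_timings = True

settings.reset()
print(settings.perf.page_batch_size, settings.debug.profile_pipeline_timings)  # 4 False
```

Because defaults() builds a new object on every call, callers can mutate the result without affecting the singleton or each other.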
Why current state is a footgun
The model classes are correctly typed and Pydantic-validated, which means embedders trust them. The footgun is the lifecycle: there is no signal from the API that a mutation persists for the rest of the process. A reasonable embedder reading AppSettings' attribute names assumes there is a per-instance way to apply them, since that is the dominant pattern with Pydantic settings classes.
We currently work around this in docling-machine with a hand-rolled isolated_docling_globals() context manager that does exactly what the sketch above does — snapshot via model_dump(mode="json"), apply, revert. Every embedder will need to write the same code.
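The snapshot, apply, revert pattern described above can be sketched as follows. This is not the docling-machine implementation: isolated_settings is a hypothetical name, the settings classes are stdlib dataclass stand-ins, and dataclasses.asdict plays the role that model_dump(mode="json") plays for the Pydantic models.

```python
from contextlib import contextmanager
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PerfSettings:
    """Stand-in for the perf sub-model (not the Docling class)."""
    page_batch_size: int = 4


@dataclass
class AppSettings:
    perf: PerfSettings = field(default_factory=PerfSettings)


settings = AppSettings()


@contextmanager
def isolated_settings(*, perf: Optional[PerfSettings] = None):
    # Snapshot by value, so in-place mutation inside the block cannot
    # leak into the saved state.
    snapshot = asdict(settings.perf)
    try:
        if perf is not None:
            settings.perf = perf
        yield settings
    finally:
        # Rebuild from the snapshot on normal exit and on exceptions.
        settings.perf = PerfSettings(**snapshot)


try:
    with isolated_settings(perf=PerfSettings(page_batch_size=8)):
        settings.perf.page_batch_size = 32  # in-place mutation inside the scope
        raise RuntimeError("simulated batch failure")
except RuntimeError:
    pass

print(settings.perf.page_batch_size)  # 4
```

Snapshotting by value rather than by reference matters when code inside the block mutates the existing sub-model instead of replacing it; a saved reference would carry that mutation back out of the scope.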
Reproduction
from docling.datamodel.settings import settings, BatchConcurrencySettings
original_size = settings.perf.page_batch_size
print(f"Default page_batch_size: {original_size}") # → 4
# Mutation persists for the rest of the process:
settings.perf = BatchConcurrencySettings(page_batch_size=99)
print(f"After mutation: {settings.perf.page_batch_size}") # → 99
# There is no public API to revert this.
# Every subsequent DocumentConverter and every downstream reader of
# settings.perf sees 99, even after the original embedder has finished.
Suggested rollout
- Add the scoped() context manager to docling.datamodel.settings.
- Document it in the user guide as the recommended way to apply non-default settings.
- Keep the existing direct-mutation API working — it stays useful for "set once at process start" callers (CLI / scripts).
The change is purely additive: no breaking changes, no migration burden.
Environment
- docling version: 2.93.0
- Python: 3.14 (Apple Silicon / macOS)
Related embedder workaround
For context, our anchor + ADR for this workaround:
WORKAROUND-DOCLING-GLOBAL-SETTINGS in docling-machine ADR 0002. The ADR's "Removal criteria" lists "docling-core ships a context-managed scoped-settings API" as condition #1 for retiring our workaround.