Requested feature
While working with PPTX files, I came across a formatting issue that could use some enhancement. Specifically, when a slide contains multiple subheadings, each with its own bullet points, the parsed output does not keep the bullet points grouped under their respective subheadings.
Current version: Docling 2.28.4
For example, consider a slide like this:
Currently, the extracted output looks like this:
As shown in the attached screenshot, all bullet points are grouped under the first subheading, and the second subheading appears without its associated content.
Suggested Enhancement:
It would be helpful to enhance the PPTX parsing logic to:
- Maintain bullet point association with the correct subheading
- Possibly use text box position, text style, or slide structure hierarchy to infer grouping
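To illustrate the position-based idea, here is a minimal sketch in plain Python. It is not Docling's API or its actual PPTX backend: `Para`, `Group`, and `group_by_reading_order` are hypothetical names, and the `is_bullet` flag and `top`/`left` offsets are assumed to have been extracted from the slide already. The sketch sorts paragraphs into reading order and attaches each bullet to the nearest preceding subheading.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Para:
    """Hypothetical paragraph record extracted from a slide."""
    text: str
    top: int         # vertical offset of the enclosing text box (assumed)
    left: int        # horizontal offset of the enclosing text box (assumed)
    is_bullet: bool  # assumed to be known from the paragraph's style


@dataclass
class Group:
    """A subheading together with the bullets that belong to it."""
    heading: str
    bullets: List[str] = field(default_factory=list)


def group_by_reading_order(paras: List[Para]) -> List[Group]:
    """Attach each bullet to the nearest preceding subheading."""
    groups: List[Group] = []
    # sorted() is stable, so paragraphs sharing one text box (same top/left)
    # keep their original order.
    for p in sorted(paras, key=lambda p: (p.top, p.left)):
        if not p.is_bullet:
            groups.append(Group(p.text))        # a new subheading starts a group
        elif groups:
            groups[-1].bullets.append(p.text)   # bullet joins the latest group
        else:
            groups.append(Group("", [p.text]))  # bullet with no subheading above
    return groups


# Paragraphs deliberately listed out of visual order.
paras = [
    Para("Subheading B", 300, 50, False),
    Para("B1", 350, 50, True),
    Para("Subheading A", 100, 50, False),
    Para("A1", 150, 50, True),
    Para("A2", 150, 50, True),
]
groups = group_by_reading_order(paras)
for g in groups:
    print(g.heading, g.bullets)
```

Sorting by `(top, left)` only works for single-column slides. A multi-column layout would need the columns separated first, which is where text style or the slide structure hierarchy could help.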