Hello! I'm converting PDF files through the DocumentConverter interface and would like to get bounding boxes for detected tables. Digging through the source code and documentation, I can see that row and column bounding boxes are available from the TableData class in docling_core, but I would like to get the full table bounding box without the faff of combining these row and column bboxes.
Replies: 2 comments 1 reply
Hi @alex-gdv! I'm Dosu and I'm helping the docling team.

There isn't a direct method in TableData or TableItem for getting the full bounding box of a table. The recommended approach is to collect all non-null cell bounding boxes from your TableData instance and use the BoundingBox.enclosing_bbox method to compute the minimal rectangle that contains the entire table. For example:

```python
from docling_core.types.doc.base import BoundingBox

cell_bboxes = [cell.bbox for cell in table_data.table_cells if cell.bbox is not None]
if cell_bboxes:
    table_bbox = BoundingBox.enclosing_bbox(cell_bboxes)
```

This is currently the best practice; there's no higher-level API for this yet in docling-core. The BoundingBox.enclosing_bbox method is designed for exactly this use case and keeps things simple and efficient.
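For anyone wondering what an enclosing box boils down to, here is a minimal pure-Python sketch of the idea. The `Box` class and `enclosing_box` function are hypothetical stand-ins written for illustration, not docling-core's `BoundingBox`, and the sketch assumes a top-left coordinate origin (so `t <= b`); with a bottom-left origin the top/bottom comparisons would flip.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a bounding box; top-left origin assumed."""
    l: float  # left edge
    t: float  # top edge
    r: float  # right edge
    b: float  # bottom edge


def enclosing_box(boxes: Iterable[Box]) -> Box:
    """Return the smallest box that contains every box in `boxes`."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("need at least one box")
    return Box(
        l=min(box.l for box in boxes),  # leftmost left edge
        t=min(box.t for box in boxes),  # topmost top edge (top-left origin)
        r=max(box.r for box in boxes),  # rightmost right edge
        b=max(box.b for box in boxes),  # lowest bottom edge
    )


# Three cells of a small two-row table: two on the first row, one spanning the second.
cells = [Box(10, 20, 50, 40), Box(50, 20, 90, 40), Box(10, 40, 90, 60)]
table_box = enclosing_box(cells)
print(table_box)
```

The empty-list check mirrors the `if cell_bboxes:` guard in the snippet above: a table with no cell boxes has no meaningful enclosing rectangle.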
The bot is not wrong, but there is a simpler solution if you're using ConversionResult: you can get a list of TableItem, which has a ProvenanceItem with both bbox and page number:
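```python
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert(source)  # source: path or URL of the PDF
for table in result.document.tables:
    # prov[0] is the table's first provenance entry
    page_num = table.prov[0].page_no
    bbox = table.prov[0].bbox
```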
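Since `prov` is indexed like a list in the snippet above, it may be worth guarding against a table with no provenance entries, or with more than one. The sketch below is one way to do that. It relies only on the `prov`, `page_no`, and `bbox` attributes already used above; the `iter_table_boxes` helper is hypothetical, and the demo uses `SimpleNamespace` stand-ins with made-up coordinates instead of real docling objects, so it runs without docling installed.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Iterator, Tuple


def iter_table_boxes(tables: Iterable[Any]) -> Iterator[Tuple[int, int, Any]]:
    """Yield (table_index, page_no, bbox) for every provenance entry.

    Tables whose `prov` is missing or empty are skipped instead of
    raising IndexError on `prov[0]`.
    """
    for index, table in enumerate(tables):
        for prov in getattr(table, "prov", None) or []:
            yield index, prov.page_no, prov.bbox


# Stand-in objects for illustration only (not real docling types).
fake_tables = [
    SimpleNamespace(prov=[SimpleNamespace(page_no=1, bbox=(72.0, 100.0, 540.0, 300.0))]),
    SimpleNamespace(prov=[]),  # no provenance: skipped
    SimpleNamespace(prov=[
        SimpleNamespace(page_no=2, bbox=(72.0, 500.0, 540.0, 700.0)),
        SimpleNamespace(page_no=3, bbox=(72.0, 80.0, 540.0, 200.0)),
    ]),
]

rows = list(iter_table_boxes(fake_tables))
for table_index, page_no, bbox in rows:
    print(table_index, page_no, bbox)
```

With a real conversion, the same helper would presumably be called as `iter_table_boxes(result.document.tables)`.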