Welcome to OCRmyPDF Discussions! #1160
Replies: 2 comments
I’m working with a scanned PDF that contains a table with two columns, where each column has two lines of text. When I convert the scanned PDF using OCRmyPDF, I run into a problem with the resulting content. Tesseract processes the text line by line, so OCRmyPDF generates a separate span for each fragment: one span for row 1, cell 1, then another for row 1, cell 2, followed by separate spans for row 2, cell 1, and row 2, cell 2. This causes accessibility problems, because screen readers receive the content without its table structure. Is there any way to resolve this and ensure the table is interpreted correctly by screen readers?
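To make the ordering described above concrete, here is a small illustrative sketch. It is not OCRmyPDF's or Tesseract's actual code; the `Span` type, the sample text, and both helper functions are hypothetical and only model the two reading orders: line by line across the page, versus one column at a time.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    line: int    # visual text line on the page, top to bottom (hypothetical field)
    column: int  # table column, left to right (hypothetical field)
    text: str


# Made-up sample: two columns, each holding two lines of text.
SPANS = [
    Span(1, 1, "Left column, line 1"),
    Span(1, 2, "Right column, line 1"),
    Span(2, 1, "Left column, line 2"),
    Span(2, 2, "Right column, line 2"),
]


def line_by_line_order(spans: List[Span]) -> List[str]:
    """Order described in the post: each page line is read across both columns."""
    return [s.text for s in sorted(spans, key=lambda s: (s.line, s.column))]


def column_by_column_order(spans: List[Span]) -> List[str]:
    """Order that keeps each column's lines together."""
    return [s.text for s in sorted(spans, key=lambda s: (s.column, s.line))]


if __name__ == "__main__":
    print(line_by_line_order(SPANS))
    print(column_by_column_order(SPANS))
```

In the line-by-line order, the two lines belonging to the left column are separated by text from the right column, which is the interleaving a screen reader would then read aloud.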
I'm trying to convert some scanned files to searchable PDFs. After trying a lot of things, including asking ChatGPT and downloading Ghostscript, pdftoocr, and Tesseract, I am still unable to convert them. I would really appreciate help with this; if someone can assist, that would be very kind of you.
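For readers with the same question, a minimal sketch of a basic conversion follows. It assumes the `ocrmypdf` command is installed and on the PATH together with Tesseract and Ghostscript. The helper name `build_ocrmypdf_command` and the file names are hypothetical, and the options shown (`--language`, `--deskew`, `--skip-text`) are only one reasonable starting point, not a recommended configuration.

```python
import subprocess
import sys
from typing import List


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    language: str = "eng",
    deskew: bool = True,
    skip_text: bool = False,
) -> List[str]:
    """Build the argument list for one ocrmypdf run (hypothetical helper)."""
    cmd = ["ocrmypdf", "--language", language]
    if deskew:
        # Straighten pages that were scanned at a slight angle.
        cmd.append("--deskew")
    if skip_text:
        # Leave pages that already contain text untouched.
        cmd.append("--skip-text")
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are placeholders): python convert.py scanned.pdf searchable.pdf
    subprocess.run(build_ocrmypdf_command(sys.argv[1], sys.argv[2]), check=True)
```

If the run fails, the error message printed by `ocrmypdf` is usually the most useful thing to include when asking for help.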
👋 Welcome!
We’re using Discussions as a place to connect with other members of our community. We hope that you:
build together 💪.
To get started, comment below with an introduction of yourself and tell us about what you do with this community.