We would like to use mupdf.js to make PDF files accessible. While mupdf.js lets us create and update the PDF objects that provide the structure for screen readers, there is (afaik) currently no API to access the MCIDs of the marked content in the content stream.
In the content stream, marked content is wrapped in BDC / EMC operators and identified by an /MCID entry in the property dictionary passed to BDC, e.g.
/H1 << /Lang (en-US) /MCID 100 >> BDC
(Heading Text) Tj
EMC
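To illustrate what kind of access we mean, here is a minimal sketch that pulls the tag and MCID out of an already decoded content stream passed in as a plain string. It does not use any mupdf.js API: `findMarkedContentIds` and `MarkedContent` are hypothetical names made up for this example, and the regex only handles inline, non-nested property dictionaries directly in front of BDC (no named property lists from /Properties, no real tokenizing of strings or arrays).

```typescript
// Hypothetical helper for illustration only; not part of mupdf.js.
interface MarkedContent {
  tag: string;  // tag operand of BDC, e.g. "H1"
  mcid: number; // value of the /MCID entry in the property dictionary
}

function findMarkedContentIds(contentStream: string): MarkedContent[] {
  // Matches "/Tag << ... >> BDC" with an inline property dictionary.
  // Assumption: the dictionary is not nested and is directly followed by BDC.
  const pattern = /\/([^\s\/<>\[\]()]+)\s*<<([\s\S]*?)>>\s*BDC\b/g;
  const results: MarkedContent[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(contentStream)) !== null) {
    // Only sequences whose dictionary carries an /MCID entry are reported.
    const mcid = /\/MCID\s+(\d+)/.exec(match[2]);
    if (mcid !== null) {
      results.push({ tag: match[1], mcid: Number(mcid[1]) });
    }
  }
  return results;
}

const sample = [
  "/H1 << /Lang (en-US) /MCID 100 >> BDC",
  "(Heading Text) Tj",
  "EMC",
].join("\n");

// Logs a single entry with tag "H1" and mcid 100.
console.log(findMarkedContentIds(sample));
```

A regex scan like this is fragile on real-world files, which is why we would prefer to get the MCID from the library's own content stream interpreter.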
Ideally, mupdf would support basic tag creation, so that the elements of an untagged PDF can be addressed via MCID numbers.
More importantly, though, mupdf.js should expose the MCID numbers through the StructuredTextWalker API. Currently, there only seems to be very low-level API support in the PDFProcessor (https://mupdf.readthedocs.io/en/latest/reference/javascript/types/PDFProcessor.html#marked-content).
In the future, we would also welcome a high-level API for working with the structure tree, i.e. to create/modify the /StructTreeRoot and its entries /RoleMap, /ClassMap and /IDTree. That said, these functions can already be implemented with mupdf.js, whereas the MCID tagging of the content streams remains inaccessible with the current APIs.