I am exploring the capabilities of PDFTron for a project and need to understand how it can interact with imported PDFs in the following ways:
Element Selection: Does PDFTron allow for the selection of individual elements within a PDF? For example, can I select specific shapes, text, or other objects within a PDF?
Element Identification: Once an element is selected, does PDFTron provide functionality to identify and retrieve information about that element? For instance, can it provide details such as object type, coordinates, or other metadata?
Custom Information Display: Is it possible to display custom information when an element is selected? For example, if I have a 2D drawing of a BIM (Building Information Modeling) project, can PDFTron show specific details or annotations related to the selected part of the drawing?
Additionally, I have encountered an issue with parsing a PDF file. The PDF contains vectors and text, but when I try to process it using PDFTron, I only get an image output instead of the parsed vectors and text elements. I have attached the PDF I am working with for reference.
Could you please provide insights on how to correctly parse vectors and text from a PDF using PDFTron? Any specific APIs or examples would be very helpful.
Your insights on these capabilities would be greatly appreciated. Additionally, if there are any specific APIs or examples that demonstrate these features, kindly point me to them.
I understand that shapes are not selectable unless they are annotations. So do you render each shape as an individual annotation? In that case, would I get a quad for each annotation representing a particular shape from the input PDF?
What are your thoughts on using CV (computer vision) techniques on the rasterized PDF to identify elements? Do you have any experience with this approach?
I removed the document screenshots from my last post.
Shapes are generally represented as annotations so that they can be edited and used with the annotation functionality.
For text-based annotations, we use the quad property, whereas other annotations will use the Rect property:
But please note that this will require you to either save the annotations (externally as XFDF data) or have it embedded within the input PDF.
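To illustrate the point about XFDF data, here is a minimal sketch of reading annotation geometry from an XFDF export using only the Python standard library. It does not use any PDFTron API. It assumes the standard XFDF layout, where every annotation carries a `rect` attribute and text markup annotations such as highlights also carry a `coords` attribute holding eight numbers per quad. The sample document, the annotation names and the `read_annotations` helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by XFDF documents.
XFDF_NS = "{http://ns.adobe.com/xfdf/}"

# Hypothetical XFDF export: one shape annotation and one text highlight.
SAMPLE_XFDF = """<xfdf xmlns="http://ns.adobe.com/xfdf/">
  <annots>
    <square page="0" rect="100,100,200,180" name="wall-01" />
    <highlight page="0" rect="50,700,150,712" name="note-01"
               coords="50,712,150,712,50,700,150,700" />
  </annots>
</xfdf>"""


def parse_floats(value):
    """Turn a comma-separated attribute value into a list of floats."""
    return [float(v) for v in value.split(",")]


def read_annotations(xfdf_text):
    """Return one dict per annotation with its type, page, rect and quads."""
    root = ET.fromstring(xfdf_text)
    annots = root.find(f"{XFDF_NS}annots")
    results = []
    if annots is None:
        return results
    for node in annots:
        rect = node.get("rect")
        coords = node.get("coords")
        quads = []
        if coords:
            nums = parse_floats(coords)
            # Each quad is described by 4 points, i.e. 8 numbers.
            quads = [nums[i:i + 8] for i in range(0, len(nums), 8)]
        results.append({
            "type": node.tag.replace(XFDF_NS, ""),
            "page": int(node.get("page", "0")),
            "name": node.get("name"),
            "rect": parse_floats(rect) if rect else None,
            "quads": quads,
        })
    return results


if __name__ == "__main__":
    for annotation in read_annotations(SAMPLE_XFDF):
        print(annotation)
```

The `name` attribute could then serve as a key into your own data (for example BIM details for that part of the drawing), but that mapping is something you would maintain yourself.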
WebViewer cannot extract shapes from a document unless they are annotations. This is possible with our Core SDKs; please see this forum post for more information: