Capabilities of PDFTron for Interacting with Imported PDFs

Hello,

I am exploring the capabilities of PDFTron for a project and need to understand how it can interact with imported PDFs in the following ways:

  1. Element Selection: Does PDFTron allow for the selection of individual elements within a PDF? For example, can I select specific shapes, text, or other objects within a PDF?
  2. Element Identification: Once an element is selected, does PDFTron provide functionality to identify and retrieve information about that element? For instance, can it provide details such as object type, coordinates, or other metadata?
  3. Custom Information Display: Is it possible to display custom information when an element is selected? For example, if I have a 2D drawing of a BIM (Building Information Modeling) project, can PDFTron show specific details or annotations related to the selected part of the drawing?

Additionally, I have encountered an issue with parsing a PDF file. The PDF contains vectors and text, but when I try to process it using PDFTron, I only get an image output instead of the parsed vectors and text elements. I have attached the PDF I am working with for reference.

Could you please provide insights on how to correctly parse vectors and text from a PDF using PDFTron? Any specific APIs or examples would be very helpful.

Your insights on these capabilities would be greatly appreciated.

Thank you!

Hi there,

WebViewer supports text selection. You can get specific quads from the text that is selected:
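As a rough sketch of what that can look like: WebViewer's documentViewer emits a textSelected event that passes the selected quads, the text, and the page number. The snippet below assumes each quad exposes x1..x4 and y1..y4 corner coordinates; the interfaces and the helper names (quadToBox, watchTextSelection) are illustrative and not part of the SDK, so please check them against the API docs for your version.

```typescript
// Assumed shape of a quad delivered by the textSelected event (four corners).
interface Quad {
  x1: number; y1: number; x2: number; y2: number;
  x3: number; y3: number; x4: number; y4: number;
}

// Axis-aligned box derived from a quad (illustrative type, not an SDK type).
interface Box { x: number; y: number; width: number; height: number; }

// Minimal structural type for the part of documentViewer used here.
interface TextSelectionSource {
  addEventListener(
    event: 'textSelected',
    handler: (quads: Quad[] | null, text: string, pageNumber: number) => void
  ): void;
}

// Collapse the four corners of a quad into its bounding box.
function quadToBox(q: Quad): Box {
  const x = Math.min(q.x1, q.x2, q.x3, q.x4);
  const y = Math.min(q.y1, q.y2, q.y3, q.y4);
  return {
    x,
    y,
    width: Math.max(q.x1, q.x2, q.x3, q.x4) - x,
    height: Math.max(q.y1, q.y2, q.y3, q.y4) - y,
  };
}

// Forward each non-empty text selection to a caller-supplied callback.
function watchTextSelection(
  viewer: TextSelectionSource,
  onSelect: (text: string, pageNumber: number, boxes: Box[]) => void
): void {
  viewer.addEventListener('textSelected', (quads, text, pageNumber) => {
    if (!quads || quads.length === 0) {
      return; // selection was cleared
    }
    onSelect(text, pageNumber, quads.map(quadToBox));
  });
}
```

In an app this would be wired up as watchTextSelection(instance.Core.documentViewer, callback), where instance is the object WebViewer resolves with.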

Symbols and characters, such as math symbols, also work with the above text selection.

Shapes are generally annotations; if they are embedded/flattened into the document, they cannot be selected.

To show custom information, if the element is an annotation, we have the annotationSelected event that you can leverage to show custom information:
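A minimal sketch of that idea, assuming the custom information was stored on each annotation as custom data: the annotationManager fires annotationSelected with the affected annotations and an action string, and the handler reads a value back with getCustomData. The custom data key, the interfaces, and the helper names here are hypothetical and only illustrate the pattern.

```typescript
// Minimal structural type for the annotation members used here.
interface SelectableAnnotation {
  Subject: string;
  getCustomData(key: string): string;
}

// Minimal structural type for the part of annotationManager used here.
interface AnnotationSelectionSource {
  addEventListener(
    event: 'annotationSelected',
    handler: (annotations: SelectableAnnotation[] | null, action: string) => void
  ): void;
}

// Build one display line per annotation from its custom data.
function describeAnnotation(annotation: SelectableAnnotation, key: string): string {
  const info = annotation.getCustomData(key);
  return info
    ? `${annotation.Subject}: ${info}`
    : `${annotation.Subject}: (no custom data)`;
}

// Call `show` with display lines on selection, and with [] on deselection.
function watchAnnotationSelection(
  manager: AnnotationSelectionSource,
  key: string,
  show: (lines: string[]) => void
): void {
  manager.addEventListener('annotationSelected', (annotations, action) => {
    if (action !== 'selected' || !annotations) {
      show([]); // nothing selected any more, so clear the display
      return;
    }
    show(annotations.map((a) => describeAnnotation(a, key)));
  });
}
```

The show callback is where a side panel or tooltip would be updated, for example with part details for a region of a 2D drawing. The value would need to be attached beforehand, for instance with annotation.setCustomData(key, value).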

Regarding the specific parsing issue you mentioned, I am unable to reproduce it on our showcase or in the Chrome browser (screenshots removed).

You can check our API for additional rasterizer options you can apply:
https://docs.apryse.com/api/web/Core.Document.html#updateRasterizerOptions

If you are working with CAD files, you can check our CAD support showcase and the BIM showcase (3D only).

Best regards,
Kevin Kim

Hi Kevin,

Two questions about your comment above:

  1. I understand that shapes are selectable only when they are annotations. So do you render each shape as an individual annotation? In that case, would I get a quad for each annotation representing a particular shape from the input PDF?
  2. What are your thoughts on using CV (computer vision) techniques on the rasterized PDF to identify elements? Do you have any learnings?

Atul

@kkim Could you please remove this PDF, as it contains sensitive information we don’t want displayed on a public forum? Thanks!

Hi there,

I removed the document screenshots from my last post.

All shapes are generally represented as annotations so that you can make changes or use the annotation functionality.
For text-based annotations, we use the quad property, whereas other annotations will use the Rect property:
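For illustration, a small sketch of reading either one: it assumes text markup annotations expose getQuads() and that every annotation exposes getRect() returning x1, y1, x2, y2. The interfaces and the annotationRegions helper are made up for this example, so please verify the member names against the API docs.

```typescript
// Assumed shape of a quad on a text markup annotation (four corners).
interface QuadPoints {
  x1: number; y1: number; x2: number; y2: number;
  x3: number; y3: number; x4: number; y4: number;
}

// Assumed shape of the rectangle returned by getRect().
interface RectLike { x1: number; y1: number; x2: number; y2: number; }

// Minimal structural type: getQuads is only present on text-based annotations.
interface AnnotationLike {
  getRect(): RectLike;
  getQuads?(): QuadPoints[];
}

// Illustrative output type, not an SDK type.
interface Region { x: number; y: number; width: number; height: number; }

function quadToRegion(q: QuadPoints): Region {
  const x = Math.min(q.x1, q.x2, q.x3, q.x4);
  const y = Math.min(q.y1, q.y2, q.y3, q.y4);
  return {
    x,
    y,
    width: Math.max(q.x1, q.x2, q.x3, q.x4) - x,
    height: Math.max(q.y1, q.y2, q.y3, q.y4) - y,
  };
}

// One region per quad for text-based annotations, otherwise a single
// region taken from the annotation's rectangle.
function annotationRegions(annotation: AnnotationLike): Region[] {
  const quads = annotation.getQuads ? annotation.getQuads() : [];
  if (quads.length > 0) {
    return quads.map(quadToRegion);
  }
  const r = annotation.getRect();
  return [{
    x: Math.min(r.x1, r.x2),
    y: Math.min(r.y1, r.y2),
    width: Math.abs(r.x2 - r.x1),
    height: Math.abs(r.y2 - r.y1),
  }];
}
```

A highlight spanning two lines of text would give two regions, while a rectangle or ellipse annotation would give one.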

But please note that this will require the annotations to be either saved externally (as XFDF data) or embedded within the input PDF.

WebViewer does not have capabilities for extracting shapes from a document without annotations. This may be possible with our Core SDKs; please see the related forum posts for more information.

Best regards,
Kevin Kim

Thanks @kkim. How do I render the vector data extracted from the PDF to a custom canvas and make each element selectable? This is our requirement.

Could you please clarify your use case?
Could you please provide a sample input PDF and the expected results you are trying to achieve?

Best regards,
Kevin Kim