WebViewer - select data in PDF and pass to python script

Product: WebViewer

Product Version: trial

Please give a brief summary of your issue:
WebViewer: select data in a PDF and pass it to a Python script.

Please describe your issue and provide steps to reproduce it:
Could you please let me know whether WebViewer can meet the requirements of the application I’m building?

Text Extraction: I need to let users select text in a PDF document, extract the selected text, and pass it to a custom Python function for further processing. After processing, I need to give users feedback based on the outcome of the Python script’s analysis. Does your tool support this kind of interaction?

Table Extraction: Another aspect of my application involves extracting tables from PDF documents.
I understand that WebViewer can’t extract tables from a PDF, but is it possible to let my Python script do the table extraction?
The extracted tables would then be passed to a Python script for analysis. Is it possible to integrate this with custom Python scripts?

Could you give me some guidance on how this could be implemented (passing the above data into my custom Python script)?
I learned from your customer support on Discord that Django is not a common solution here. Could you please give me an idea of what other tools could help?

Thank you in advance for your help!

Hello,

Thank you for reaching out to us about WebViewer. For your requirements,

Text Extraction:
WebViewer has APIs for extracting text from individual pages and for extracting selected text. You can find out more in the following guide

Table Extraction:
This is a bit tricky to do for PDF documents because PDFs don’t have a concept of a table (they only store the locations of text). However, we do have a sample that uses OCR to detect tables and extract data from them. You can find more in the links below
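
To illustrate why this is heuristic, here is a minimal Python sketch of how rows might be rebuilt from positioned text alone. The word dictionaries with `text`, `x` and `y` keys, the `group_into_rows` helper, and the `tolerance` value are all assumptions made for illustration; they are not a WebViewer API or the method used by the OCR sample.

```python
def group_into_rows(words, tolerance=2.0):
    """Group positioned words into rows by similar vertical position.

    Each word is a dict with hypothetical keys "text", "x" and "y".
    Words whose y value is within `tolerance` of the first word in the
    current row are treated as belonging to that row.
    """
    rows = []
    for word in sorted(words, key=lambda w: (w["y"], w["x"])):
        if rows and abs(word["y"] - rows[-1]["y"]) <= tolerance:
            rows[-1]["words"].append(word)
        else:
            rows.append({"y": word["y"], "words": [word]})
    # Within each row, order the cells left to right.
    return [
        [w["text"] for w in sorted(row["words"], key=lambda w: w["x"])]
        for row in rows
    ]


# Made-up sample data: two rows of two cells each.
sample_words = [
    {"text": "Name", "x": 10, "y": 100.0},
    {"text": "Qty", "x": 80, "y": 100.5},
    {"text": "Apple", "x": 10, "y": 115.0},
    {"text": "3", "x": 80, "y": 114.8},
]
table_rows = group_into_rows(sample_words)
```

A real document would also need column detection and handling of wrapped cells, which is why a dedicated extraction tool is usually a better fit.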

Please let me know if the above helped or if you would like me to clarify anything.

Best Regards,
Andrew Yip
Web Developer
Apryse

Hi Andrew.

Thanks for the reply. I am aware that there is an API for text extraction, but how do I pass the extracted text to a Python script for processing?

Regarding tables, as written above, my Python script is able to extract tables from the particular PDFs that will be viewed in WebViewer. The question is whether I can incorporate a custom table extraction tool into WebViewer that uses my Python script (pdfplumber).

Thank you in advance for your help.

Hello mgrz,

We currently do not have the ability to add custom table extraction into WebViewer. This is a complex integration, so I can provide some helpful links:

You could add an endpoint to a web server that calls the Python script as a sub-process. However, anything past this point would not be WebViewer-related.
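
As a rough illustration of that idea, here is a minimal sketch that uses only the Python standard library. It assumes the page hosting WebViewer sends the selected text as JSON (`{"text": "..."}`) in a POST request. The `/analyze` route, the `analyze.py` script name, and the `run_analysis` and `serve` helpers are hypothetical names chosen for this example, and the script is assumed to read the text from stdin and print its result to stdout.

```python
import json
import subprocess
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical path to your own analysis script (an assumption).
SCRIPT_PATH = "analyze.py"


def run_analysis(text, script_path=SCRIPT_PATH, timeout=30):
    """Run the analysis script as a sub-process and return its stdout."""
    completed = subprocess.run(
        [sys.executable, script_path],
        input=text,            # the selected text is passed on stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,            # raise if the script exits with an error
    )
    return completed.stdout.strip()


class AnalyzeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/analyze":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            result = run_analysis(payload.get("text", ""))
            status, body = 200, {"result": result}
        except (ValueError, AttributeError, subprocess.SubprocessError) as exc:
            status, body = 500, {"error": str(exc)}
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Start the endpoint on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), AnalyzeHandler).serve_forever()
```

Calling `serve()` starts the endpoint. The JSON response can then be shown to the user as feedback. If the page is served from a different origin than this endpoint, the browser will also require CORS headers, which this sketch leaves out.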

Best regards,
Tyler
