Vertical Text Annotation Misalignment in WebViewer + Missing Text Highlights

WebViewer Version: 11.4.0

Do you have an issue with a specific file(s)? No
Can you reproduce using one of our samples or online demos? Yes
Are you using the WebViewer server? No
Does the issue only happen on certain browsers? No
Is your issue related to a front-end framework? Yes/maybe - ReactJS
Is your issue related to annotations? Yes

Please give a brief summary of your issue: Vertical text annotations are misaligned, and some words are not recognized by text highlights/annotations

Please describe your issue and provide steps to reproduce it:
When trying to highlight/annotate vertical text that is italic, the highlight drawn over the text is noticeably misaligned. In addition, when text is fairly small, closely spaced, and vertical, some of it gets missed when trying to automatically create an annotation on it. Would this be an issue inside WebViewer, or an issue with the PDF viewer it’s built upon?

Steps to reproduce:

  1. Open your measurement sample, the JavaScript PDF Viewer Measurement Demo.
  2. Upload the document 00000001.pdf (65.7 KB), which includes small vertical text along the right side that spans multiple lines with a low line-height.
  3. Try to highlight that vertical text.
  4. Observe that the highlight runs horizontally across multiple lines of vertical text, rather than vertically along the lines.

I’d like to think this may just be the underlying PDF viewer interpreting the closely spaced vertical lines not as vertical text but as a series of closely spaced horizontal lines, since the highlighting is consistent with that idea. It may also just be the fuzziness of the text in general.
That would also explain why, when I try to programmatically highlight words within those vertical lines, it’s not able to find them in the document.
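
To illustrate that hypothesis, here is a toy model in TypeScript. It is not WebViewer code and does not claim to reflect how the viewer actually extracts text; the function names and the sample labels are made up for illustration. It only shows why a word on a vertical line would stop being searchable if tightly packed vertical lines were read as horizontal rows:

```typescript
// Toy model only: each string is one vertical line of text, listed left to
// right; the characters within a string run along that line.

// Intended reading order: follow each vertical line in turn.
function readAlongLines(verticalLines: string[]): string {
  return verticalLines.join(" ");
}

// Hypothesised misreading: every horizontal slice through the packed lines is
// treated as one row of text, which transposes the character grid.
function readAcrossLines(verticalLines: string[]): string {
  const longest = Math.max(0, ...verticalLines.map((line) => line.length));
  const rows: string[] = [];
  for (let i = 0; i < longest; i++) {
    // charAt returns "" past the end of a shorter line.
    rows.push(verticalLines.map((line) => line.charAt(i)).join(""));
  }
  return rows.join(" ");
}

// Hypothetical coordinate labels standing in for the text in the drawing.
const labelLines = ["N100", "E200", "Z300"];
const alongText = readAlongLines(labelLines);   // "N100 E200 Z300"
const acrossText = readAcrossLines(labelLines); // "NEZ 123 000 000"

console.log(alongText.indexOf("E200") !== -1);  // true: the word is searchable
console.log(acrossText.indexOf("E200") !== -1); // false: the word is scattered
```

If the extracted page text looks like the second string rather than the first, a search for any of the original words would come back empty, which matches what I am seeing.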

Missing words not highlighting:

  1. Take the PDF found here
  2. Put it into the measurement tool sample linked above
  3. Observe that a lot of the text isn’t highlightable. In particular, many of the coordinate texts on the second page aren’t highlightable.

Would this be some structure missing from the PDF data?

With these issues, I am mostly looking for some kind of direction on whether the problem stems from the PDF data, WebViewer, or the underlying PDF viewer in general. If you have any direction or input, that would be great. Thank you!

Please provide a link to a minimal sample where the issue is reproducible:

Hello, I’m AiRon, an automated tech support bot :robot:
While you wait for one of our customer support representatives, I can help you with your question.

From your description, it seems like there might be a few factors influencing the behavior you’re observing with vertical text annotations in WebViewer. The issues with text misalignment and missing highlights often relate to how the PDF content is structured and how the WebViewer interprets it.

Some things you can try:


  • Check the PDF text structure: Ensure that the PDF text is properly recognized as text objects. Sometimes, text that appears visually aligned might not be structured correctly in the PDF.

  • Use OCR: If the text is not recognized, consider applying OCR (Optical Character Recognition) to convert the text into a searchable and highlightable format.

  • Adjust the annotation alignment: Customize the rendering of annotations by adjusting properties like vertical alignment or rotation to better fit the text orientation.

  • Debug with WebViewer APIs: Use WebViewer’s APIs to programmatically inspect the text content and annotations to understand how they are being interpreted.
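
As a rough sketch of that last suggestion, the TypeScript below checks whether text quads are taller than they are wide. It assumes the quads have already been fetched, for example with `documentViewer.getDocument().getTextPosition(pageNumber, startIndex, endIndex)`, and that each quad exposes `x1` through `y4` corner fields; please verify both against the API reference for your WebViewer version. The `QuadLike` type, the helper names, and the sample coordinates are hypothetical.

```typescript
// Assumed quad shape: four corner points in page coordinates.
interface QuadLike {
  x1: number; y1: number;
  x2: number; y2: number;
  x3: number; y3: number;
  x4: number; y4: number;
}

type Orientation = "wide" | "tall";

// Classify by bounding box so the corner order does not matter.
// This is a heuristic: it is only meaningful for runs of several characters.
function quadOrientation(q: QuadLike): Orientation {
  const xs = [q.x1, q.x2, q.x3, q.x4];
  const ys = [q.y1, q.y2, q.y3, q.y4];
  const width = Math.max(...xs) - Math.min(...xs);
  const height = Math.max(...ys) - Math.min(...ys);
  return height > width ? "tall" : "wide";
}

// Tally how many quads fall into each orientation.
function countOrientations(quads: QuadLike[]): Record<Orientation, number> {
  const counts: Record<Orientation, number> = { wide: 0, tall: 0 };
  for (const q of quads) {
    counts[quadOrientation(q)] += 1;
  }
  return counts;
}

// Hypothetical data: a word on a vertical line should produce a tall quad...
const expectedQuad: QuadLike = { x1: 500, y1: 100, x2: 508, y2: 100, x3: 508, y3: 160, x4: 500, y4: 160 };
// ...while a wide quad over the same region hints at a horizontal reading.
const suspectQuad: QuadLike = { x1: 480, y1: 100, x2: 540, y2: 100, x3: 540, y3: 108, x4: 480, y4: 108 };

console.log(countOrientations([expectedQuad, suspectQuad])); // { wide: 1, tall: 1 }
```

If a region that visibly contains vertical text yields mostly wide quads, that suggests the text positions stored in the PDF do not match what is drawn on the page.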




Here are some resources that might help you:

  • FreeText Annotation Documentation - Learn about the different properties and alignment options available for text annotations.

  • OCR Guide - Understand how to apply OCR to PDFs to improve text recognition and highlight capabilities.

  • Annotation Guide - Explore how annotations work in WebViewer and how you can customize them.

  • Text Extraction - Learn how to extract and interact with text programmatically.




Hopefully, this helps you solve your problem while you wait for a human to get back to you.

Hello nealet,

Thank you for reaching out. It looks like the document you provided has incorrect text quads, so text selection on it is incorrect as well. Adobe shows similar text selection behaviour.

I suggest running this document through OCR to try to get more accurate text quads.

Best regards,
Tyler
