Text highlight is not consistent for office document files.

Product: @pdftron/webviewer

Product Version: 11.8.0

Please give a brief summary of your issue:
Text selection inconsistency in office document files (tested on .docx) - need alternative to loadAsPDF due to licensing requirements

Please describe your issue and provide steps to reproduce it:
Issue:

Text selection in office document files is inconsistent. When selecting text, some characters are not highlighted, and the same characters are missing from the string returned by getSelectedText().

Current Workaround:

Setting loadAsPDF: true fixes the selection issue, but it requires additional licensing (the Office add-on); without that add-on, an Apryse watermark is shown. We need a solution that works with our current WebViewer license.

Environment:

  • WebViewer version: 11.8.0

  • Framework: Angular 20.2.2

  • Browser: Chrome/Edge (tested)

  • File types affected: DOCX, DOC, XLSX, XLS, PPTX, PPT

Steps to Reproduce:

  1. Initialize WebViewer with a standard license key (without Office add-on)
  2. Load a DOCX file without loadAsPDF: true
  3. Enable text selection mode
  4. Select text in the document, especially words with multiple characters
  5. Observe: some characters are not highlighted visually
  6. Check documentViewer.getSelectedText(): some characters are missing from the returned string
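
To make step 6 easier to verify, here is a minimal sketch of how the returned string could be compared with the text that was meant to be selected. The names SelectedTextSource, diffSelection and checkSelection are hypothetical helpers written for this report, not part of the WebViewer API. The only WebViewer call assumed is getSelectedText(). Whitespace is collapsed before comparing, on the assumption that line breaks in the selection are not the problem being reported.

```typescript
// Minimal structural type: only the method this sketch relies on.
// In WebViewer, instance.Core.documentViewer provides getSelectedText().
interface SelectedTextSource {
  getSelectedText(): string;
}

interface SelectionDiff {
  selected: string;   // normalized text returned by the viewer
  missing: string[];  // expected characters not found in the selection, in order
  complete: boolean;  // true when nothing is missing and nothing extra was returned
}

// Collapse runs of whitespace so line breaks do not count as differences
// (an assumption of this sketch, not documented viewer behavior).
function normalize(text: string): string {
  return text.replace(/\s+/g, ' ').trim();
}

// Hypothetical helper: walks the expected text and matches it, in order,
// against the selected text, recording each expected character that is absent.
function diffSelection(expected: string, selected: string): SelectionDiff {
  const exp = Array.from(normalize(expected));
  const sel = Array.from(normalize(selected));
  const missing: string[] = [];
  let j = 0;
  for (const ch of exp) {
    if (j < sel.length && sel[j] === ch) {
      j++;
    } else {
      missing.push(ch);
    }
  }
  return {
    selected: sel.join(''),
    missing,
    complete: missing.length === 0 && j === sel.length,
  };
}

// Hypothetical wrapper: reads the current selection from the viewer and diffs it.
function checkSelection(viewer: SelectedTextSource, expected: string): SelectionDiff {
  return diffSelection(expected, viewer.getSelectedText());
}
```

In the application, after selecting a word in the DOCX, something like checkSelection(instance.Core.documentViewer, 'the word that was selected') could be logged to the console to list the characters that were dropped.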

Please provide a link to a minimal sample where the issue is reproducible:
Highlight_Issue.docx (24.5 KB)

Hello, I’m AiRon, an automated tech support bot :robot: While you wait for one of our customer support representatives, I can help you with your question.

It seems you’re experiencing issues with text selection in office documents using WebViewer without the loadAsPDF option. This is a common issue when the viewer is not able to correctly interpret the text layout of certain document formats, leading to inconsistent text highlighting and selection.

Some things you can try:

  • Consider using the latest version of WebViewer as improvements and bug fixes are continuously being made.

  • Ensure that the document is in a standard format and not corrupted, as non-standard formatting can affect text selection.

  • Check if there are specific fonts or styles used in the document that might not be fully supported by the viewer.

  • Explore the use of custom text extraction methods or modify the way text selection is handled through the API.



Hopefully, this helps you solve your problem while you wait for a human to get back to you.

I’m already using the latest version. The issue I’m facing was reported long ago in this ticket: Text selection highlight is broken up - #3 by mark.aziz. However, the workaround suggested there (loadAsPDF: true) requires extra licensing, which I want to avoid. I need a proper fix rather than a workaround.
