Good day. What is the proper way to extract plain text from a highlight annotation object or from its XFDF?
Hi there,
Thanks for reaching out to WebViewer forums,
To get the text under a highlight annotation, you can typically read the annotation's quads, select the region each quad covers, and then call the documentViewer.getSelectedText() method:
const { annotationManager, documentViewer } = instance.Core;
// Assuming you have an annotation object
const highlightAnnotation = annotationManager.getSelectedAnnotations()[0];
const quads = highlightAnnotation.getQuads();
const textsUnderHighlight = quads.map(quad => {
  const selectionStartPoint = { x: quad.x1, y: quad.y1, pageNumber: highlightAnnotation.PageNumber };
  const selectionEndPoint = { x: quad.x3, y: quad.y3, pageNumber: highlightAnnotation.PageNumber };
  documentViewer.select(selectionStartPoint, selectionEndPoint);
  return documentViewer.getSelectedText();
});
console.log(textsUnderHighlight);
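To get the text from the highlight annotation's XFDF, look inside the <highlight> tag: a preview of the text is stored under the trn-annot-preview key in the bytes attribute of the <trn-custom-data> element: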
<?xml version="1.0" encoding="UTF-8" ?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve"><pdf-info xmlns="http://www.pdftron.com/pdfinfo"
version="2" import-version="4" />
<fields />
<annots>
<highlight page="0" rect="123.566,239.305,392.054,310.801" color="#FFCD45" flags="print"
name="f51aff45-5560-77bf-e8c0-c637b0e91522" title="Guest" subject="Highlight" date="D:20231124140457-05'00'"
creationdate="D:20231124140457-05'00'"
coords="123.566,310.8009,392.05400000000003,310.8009,123.566,239.3049000000001,392.05400000000003,239.3049000000001">
<trn-custom-data
bytes="{&quot;trn-annot-preview&quot;:&quot;mportant F&quot;,&quot;trn-associated-number&quot;:&quot;1&quot;}" />
</highlight>
</annots>
<pages>
<defmtx matrix="1,0,0,-1,0,792" />
</pages>
</xfdf>
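As a rough sketch of reading that value programmatically, the snippet below pulls the trn-annot-preview text out of every highlight in an XFDF string. The helper names (decodeXmlEntities, getHighlightPreviews) are made up for illustration and are not part of WebViewer. It assumes the bytes attribute holds entity-encoded JSON as in the sample above, and it uses regular expressions only to stay dependency-free; in a browser, a real XML parser such as DOMParser would likely be more robust.

```javascript
// Hypothetical helper: decode the XML entities that can appear in an attribute value.
function decodeXmlEntities(value) {
  return value
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&'); // decoded last so "&amp;quot;" is not decoded twice
}

// Hypothetical helper: return the "trn-annot-preview" string of each highlight.
// Assumption: the preview lives in <trn-custom-data bytes="..."> as JSON,
// which is what the sample XFDF in this thread shows.
function getHighlightPreviews(xfdfString) {
  const previews = [];
  const highlightPattern = /<highlight\b[\s\S]*?<\/highlight>/g;
  const bytesPattern = /<trn-custom-data\s+bytes="([^"]*)"/;

  for (const [highlightXml] of xfdfString.matchAll(highlightPattern)) {
    const match = bytesPattern.exec(highlightXml);
    if (!match) continue; // highlight without custom data

    try {
      const customData = JSON.parse(decodeXmlEntities(match[1]));
      const preview = customData['trn-annot-preview'];
      if (typeof preview === 'string') previews.push(preview);
    } catch (error) {
      // Custom data was not valid JSON; skip this highlight.
    }
  }
  return previews;
}
```

Note that this value is a preview, so it may not always match the full text under the highlight; selecting the text through the viewer, as shown earlier, is the safer route when the document is loaded.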
Best regards,
Kevin Kim
Also, don't forget to call documentViewer.clearSelection() after the text is extracted; otherwise the text will stay selected in the viewer.
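Putting the two suggestions together, here is a minimal sketch that extracts the text and always clears the selection afterwards. The function name getTextUnderHighlight is hypothetical; it only uses the calls already mentioned in this thread (getQuads, PageNumber, select, getSelectedText, clearSelection) and takes the viewer and annotation as parameters so it does not depend on a global instance.

```javascript
// Hypothetical wrapper: returns one string per quad of the highlight,
// then clears the selection even if something throws along the way.
function getTextUnderHighlight(documentViewer, highlightAnnotation) {
  const pageNumber = highlightAnnotation.PageNumber;
  try {
    return highlightAnnotation.getQuads().map((quad) => {
      // Select from the quad's first corner to its third corner, as in the answer above.
      documentViewer.select(
        { x: quad.x1, y: quad.y1, pageNumber },
        { x: quad.x3, y: quad.y3, pageNumber }
      );
      return documentViewer.getSelectedText();
    });
  } finally {
    // Leave the viewer without a lingering text selection.
    documentViewer.clearSelection();
  }
}
```

Usage would look something like `getTextUnderHighlight(documentViewer, annotationManager.getSelectedAnnotations()[0])`, assuming a highlight is currently selected.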