How to extract content tags / marked content?

Aaron_Gravesdale · November 14, 2006, 3:59am

Q:
PDFNet is an excellent product; I haven't seen anything like it before.

But I have one problem: how can I capture the tags of a PDF document
in my console application? I'm using C# on .NET 2.0. Could you please
tell me how I can capture the tags of the content?
---

A:
I assume that you need to extract marked content within PDF page
content?

You can use ElementReader
(http://www.pdftron.com/net/samplecode.html#ElementReader) to extract
the various PDF elements from a page. In particular, you will be
interested in the following element types:

e_marked_content_begin - marks the beginning of a marked-content
sequence (BMC, BDC)
e_marked_content_end - marks the end of a marked-content sequence (EMC)
e_marked_content_point - designates a marked-content point (MP, DP)

If you encounter an e_marked_content_begin element, you can obtain the
marked-content property dictionary (the one supplied with BDC) using
element.GetMCPropertyDict(). There is also an element.GetMCTag()
method, in case you encounter a marked content point.
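
As a rough illustration, a console application along these lines could walk each page and print the marked-content elements it finds. This is only a sketch, not tested against any particular PDFNet release: it follows the style of the more recent PDFNet .NET samples (PDFNet.Initialize, doc.GetPageIterator, reader.Next), so some names may differ in the version you have. The file name "tagged.pdf", the class name and the TagName helper are hypothetical, and looking up the MCID key is just one example of what you might read from the property dictionary.

```csharp
using System;
using pdftron;
using pdftron.PDF;
using pdftron.SDF;

class MarkedContentDump
{
    // Hypothetical helper: returns the marked-content tag as text, if any.
    static string TagName(Element element)
    {
        Obj tag = element.GetMCTag();
        return (tag != null && tag.IsName()) ? tag.GetName() : "(no tag)";
    }

    static void Main(string[] args)
    {
        // Assumption: a parameterless Initialize() is available;
        // newer releases may expect a license key argument instead.
        PDFNet.Initialize();

        // "tagged.pdf" is a placeholder input file.
        using (PDFDoc doc = new PDFDoc("tagged.pdf"))
        using (ElementReader reader = new ElementReader())
        {
            doc.InitSecurityHandler();

            for (PageIterator itr = doc.GetPageIterator(); itr.HasNext(); itr.Next())
            {
                reader.Begin(itr.Current());

                Element element;
                while ((element = reader.Next()) != null)
                {
                    switch (element.GetType())
                    {
                        case Element.Type.e_marked_content_begin:
                        {
                            Console.WriteLine("begin: " + TagName(element));

                            // Only BDC carries a property dictionary, so this may be null.
                            Obj props = element.GetMCPropertyDict();
                            if (props != null && props.IsDict())
                            {
                                Obj mcid = props.FindObj("MCID");
                                if (mcid != null && mcid.IsNumber())
                                    Console.WriteLine("  MCID = " + mcid.GetNumber());
                            }
                            break;
                        }
                        case Element.Type.e_marked_content_end:
                            Console.WriteLine("end");
                            break;
                        case Element.Type.e_marked_content_point:
                            Console.WriteLine("point: " + TagName(element));
                            break;
                    }
                }

                reader.End();
            }
        }
    }
}
```

Since begin and end elements arrive in order, keeping a stack of open tags inside the loop would let you tell which marked-content sequence each text or path element belongs to.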