Q:
I have a question on the new improvements:
ContentReplacer can search and replace strings on a PDF page with user defined patterns.
With this change, will it be possible to search and replace on a wildcard string [e.g., a social security number or phone number pattern such as nnn-nnn-nnnn or (nnn) nnn-nnnn]? We often have to redact this type of personally identifiable information (PII) from documents.
A:
ContentReplacer is the wrong tool for this task: it can't match regular expressions and is not intended for redaction.
A better solution is to run a text search with a regular expression and collect the bounding boxes of the matching text, as shown in the TextSearch sample code:
http://www.pdftron.com/pdfnet/samplecode.html#TextSearch
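To make the pattern side of this concrete, here is a minimal sketch of a regular expression for the two formats in the question, nnn-nnn-nnnn and (nnn) nnn-nnnn. It uses only Python's standard re module, not the PDFNet API; the names PHONE_PATTERN and find_pii are hypothetical, and the regex syntax accepted by the text search may differ from Python's, so treat the pattern as a starting point to adapt.

```python
import re

# Hypothetical pattern for the two formats named in the question:
#   nnn-nnn-nnnn   and   (nnn) nnn-nnnn
# The lookarounds keep the match from starting or ending inside a longer digit run.
PHONE_PATTERN = re.compile(r"(?<!\d)(?:\(\d{3}\) |\d{3}-)\d{3}-\d{4}(?!\d)")


def find_pii(text):
    """Return a (start, end, matched_text) tuple for every match in text."""
    return [(m.start(), m.end(), m.group()) for m in PHONE_PATTERN.finditer(text)]


sample = "Call 555-123-4567 or (555) 987-6543; order no. 1234567 is unrelated."
for start, end, value in find_pii(sample):
    print(start, end, value)
```

In the real workflow the character offsets would be replaced by the bounding boxes that the text search reports for each hit.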
Then, to correctly redact the text, use the PDF Redactor add-on. The following sample code shows how: