How to delete a specific digital signature field?

Ryan · June 4, 2019, 4:58pm

Question:
How can I completely remove a single digital signature field while leaving the other existing digital signature fields alone?

Answer:

Note: if any signature fields other than the target one are already signed, this action will invalidate those signatures. If the other signature fields are not signed, this action should be fine.

This is a four-step process.

1. Find the field/annotation.
2. Remove the Widget annotation from the page, which is the visual appearance of the field.
3. Remove the Field from the list of fields.
4. Save the PDF.

Find the field.
Assuming you have a name, you can do the following.

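Field field = pdfdoc.GetField(fieldName);
// check the field before dereferencing it
if (field == null) throw new Exception("Cannot delete field");
Obj fieldObj = field.GetSDFObj();
// the field must be an indirect object, otherwise it has no object number to match against
if (fieldObj == null || !fieldObj.IsIndirect()) throw new Exception("Cannot delete field");
Widget widget = new Widget(fieldObj);
int fieldObjNum = fieldObj.GetObjNum();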

Remove the widget.

// first check if we have valid page reference.
Page page = widget.GetPage();
int widgetIndex = -1;
if (page.IsValid()) {
  // double check back link is correct, often it is not.
  widgetIndex = GetIndexOfAnnotOnPage(page, fieldObjNum);
}
if (widgetIndex < 0) {
  // either there was no back link, or it was a bad one
  for (PageIterator itr = pdfdoc.GetPageIterator(); itr.HasNext(); itr.Next()) {
    widgetIndex = GetIndexOfAnnotOnPage(itr.Current(), fieldObjNum);
    if (widgetIndex >= 0) {
      page = itr.Current();
      break;
    }
  }
}
if (widgetIndex >= 0 && page.IsValid()) {
  page.AnnotRemove(widgetIndex); // if we don't get here, then there is no visual representation of the field
}

Remove the field

Obj formsObj = null;
Obj acroformsObj = pdfdoc.GetAcroForm();
if (acroformsObj != null && acroformsObj.IsDict()) {
  formsObj = acroformsObj.FindObj("Fields");
}
if (formsObj != null && formsObj.IsArray()) {
  for (int i = 0; i < formsObj.Size(); ++i) {
    Obj formObj = formsObj.GetAt(i);
    if (formObj.GetObjNum() == fieldObjNum) {
      formsObj.EraseAt(i);
      break;
    }
  }
}

Save to commit change

pdfdoc.Save(filePath, SDFDoc.SaveOptions.e_linearized);

Utility function

// return -1 on not found
int GetIndexOfAnnotOnPage(Page page, int objNum) {
  int num_annots = page.GetNumAnnots();
  for (int i = 0; i < num_annots; ++i) {
    Annot annot = page.GetAnnot(i);
    if (annot.GetSDFObj().GetObjNum() == objNum) return i;
  }
  return -1;
}
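
The loop in the "Remove the field" step only looks at the top-level entries of the Fields array. If the target field is nested under a parent field, it would sit in that parent's Kids array instead. As a rough sketch of the recursive search, here is a plain JavaScript model that uses ordinary objects rather than the PDFNet API. The names removeFieldByObjNum, objNum, and kids are hypothetical and only illustrate the idea.

```javascript
// Plain-object model of an AcroForm "Fields" tree (NOT the PDFNet API).
// Each node has an object number and, for non-terminal fields, a kids array.
function removeFieldByObjNum(fields, objNum) {
  for (let i = 0; i < fields.length; ++i) {
    const node = fields[i];
    if (node.objNum === objNum) {
      fields.splice(i, 1); // plays the role of EraseAt(i) in the C# code
      return true;
    }
    // Not found at this level, so descend into the children, if any.
    if (Array.isArray(node.kids) && removeFieldByObjNum(node.kids, objNum)) {
      return true;
    }
  }
  return false;
}

// Hypothetical tree: one top-level field and one parent with two children.
const acroFields = [
  { objNum: 10, name: "sig1" },
  { objNum: 20, name: "group", kids: [
    { objNum: 21, name: "sig2" },
    { objNum: 22, name: "sig3" },
  ] },
];

const removedNested = removeFieldByObjNum(acroFields, 21);  // nested target
const removedMissing = removeFieldByObjNum(acroFields, 99); // no such object
console.log(removedNested, removedMissing, acroFields[1].kids.length);
```

With the real SDK the same shape would apply to the Kids entry of each field dictionary, but check the object types before recursing, since Kids can also hold Widget annotations.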

mglimm · November 25, 2022, 12:46am

There is no entry “Forms” in the AcroForm dict… do you mean “Fields”?

Ryan · November 25, 2022, 5:55pm

Thank you for reporting the typo. It is fixed now.

mglimm · November 25, 2022, 8:17pm

No problem, but I’ve been having trouble deleting problem/corrupt fields with this method…
(using the javascript, pdfnet-node)

Can you explain why when I iterate with doc.getFieldItrBegin, and call field.isValid and keep count, I get:
fieldItrCounts: { valid: 792, invalid: 5 }

Versus iterating over the acroForm “Fields” object, create a field with Field.create(fieldObj) and call field.isValid I get:
acroFormCounts: { valid: 34, invalid: 54 }

Why would these numbers be so different?

I have an open ticket regarding this issue, where the pdf is attached, number: #39692

Really all I’m trying to do is detect bad fields and delete them.

Thanks

Ryan · November 25, 2022, 10:51pm

I would need the specific PDF file in question to answer properly.

But here are some general observations.

a) It is useful to think of FieldIterator as actually a Widget Annotation iterator. That is, if a Field is represented in the PDF pages by 2+ Widget Annotations, then the FieldIterator will hit the same “field” 2+ times (equal to the Widget count).

b) The Fields object is actually a tree structure (in the form of nested arrays). You may or may not be parsing it properly.

c) As for Field.create(fieldObj), that also depends on where you are getting this object from. Again, the Fields nested arrays are probably a bit more complicated than you expect (there is inheritance also). So it is better to stick with the higher-level FieldIterator API rather than the lower-level SDF.Obj API if you can avoid it.
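
To make points a) and b) concrete, here is a small plain JavaScript model (ordinary objects, not the PDFNet API) showing how three different counts can come out of the same form: the length of the top-level Fields array, the number of terminal fields in the tree, and the number of widgets. The tree, the widgets property, and the countTree helper are all made up for illustration, and the numbers say nothing about the specific file discussed here.

```javascript
// Plain-object model of a nested "Fields" tree (NOT the PDFNet API).
// A node with kids is a parent field; a node without kids is a terminal
// field that is shown on the pages by `widgets` Widget annotations.
function countTree(fields) {
  let terminalFields = 0;
  let widgets = 0;
  for (const node of fields) {
    if (Array.isArray(node.kids) && node.kids.length > 0) {
      const sub = countTree(node.kids); // recurse into the nested array
      terminalFields += sub.terminalFields;
      widgets += sub.widgets;
    } else {
      terminalFields += 1;
      widgets += node.widgets || 0;
    }
  }
  return { terminalFields, widgets };
}

// Hypothetical form: two top-level entries, three terminal fields, five widgets.
const fieldsArray = [
  { name: "applicant", kids: [
    { name: "first", widgets: 1 },
    { name: "last", widgets: 3 }, // the same field shown in three places
  ] },
  { name: "signature", widgets: 1 },
];

const topLevelCount = fieldsArray.length; // what a flat loop over "Fields" sees
const counts = countTree(fieldsArray);    // what a full tree walk sees
console.log(topLevelCount, counts.terminalFields, counts.widgets);
```

A flat loop over the top-level array and an iterator that visits once per widget would report different totals for this tree, even though both are looking at the same form.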

mglimm · December 5, 2022, 6:00pm

Thanks for your input Ryan, I’ll keep that in mind moving forward. I definitely thought FieldIterator was just going over the fields, not widgets as well.

The issue on the form in question was actually created by Adobe DC. In some circumstances, Adobe itself will generate ‘bad’ fields when creating forms: ones that partially appear on the page when in “Prepare Form” mode, but don’t actually make it to the list of fields on the side (i.e. they are not properly created within the AcroFields array, I assume). My team’s solution (within DC) was to completely remake the page that had the bad fields, so as a general solution to these bad fields, I will probably do something to that effect with PDFTron as well.

mglimm · December 5, 2022, 11:54pm

Still working on this… but I have more info:

If I find an invalid field by iterating with the FieldIterator API, and call Field.isValid, I can get an object number from that invalid object… but when I iterate over pages’ widgets (1) and the acroform fields (2) like you do above, I can’t find any object number that matches the invalid object:

(1) Because the field is invalid, a widget created from its object doesn’t have a valid page.
(2) I don’t come across the invalid object number in the ‘Fields’ array.

This is a tricky one…

Is there a way to delete an object from a document using just an object number?

Ryan · December 6, 2022, 7:30pm

No, not directly. When you save the PDF file using e_linearized or e_remove_unused, then the SDK will delete any unused objects.
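
As a rough illustration of that idea, here is a plain JavaScript model (not the PDFNet API) of what removing "unused" objects could look like, assuming that "unused" means "not reachable from the document root". That definition is an assumption on my part, and the object graph, reachableFrom, and sweepUnused are hypothetical names for the sketch.

```javascript
// Model: a Map from object number to the object numbers it references.
// ASSUMPTION: an object is "unused" when nothing reachable from the root
// refers to it. This is a sketch, not how the SDK is known to work.
function reachableFrom(objects, roots) {
  const seen = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const num = stack.pop();
    if (seen.has(num) || !objects.has(num)) continue;
    seen.add(num);
    for (const ref of objects.get(num)) stack.push(ref);
  }
  return seen;
}

// Returns a new Map that keeps only the reachable objects.
function sweepUnused(objects, roots) {
  const live = reachableFrom(objects, roots);
  return new Map([...objects].filter(([num]) => live.has(num)));
}

const objects = new Map([
  [1, [2, 3]], // root -> page tree, form dictionary
  [2, [4]],    // page tree -> page
  [3, [5]],    // form dictionary -> field 5
  [4, [5]],    // page annotations -> widget 5
  [5, []],     // the field/widget object
  [6, []],     // orphan that nothing refers to
]);

const before = sweepUnused(objects, [1]); // 5 is still referenced, 6 is not
objects.set(3, []); // drop the reference from the form dictionary
objects.set(4, []); // drop the reference from the page annotations
const after = sweepUnused(objects, [1]);  // now 5 is unreferenced too
console.log([...before.keys()], [...after.keys()]);
```

Under that assumption, an object only goes away once every reference to it has been removed, so an object that survives a save is presumably still referenced from somewhere.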

You might find this tool useful in visualizing the actual SDF objects in a PDF file.

mglimm · December 8, 2022, 10:23pm

I tried both of those save options, but refreshFieldAppearances still segfaults.

I had a look at the tool and found my object numbers, but I’m not sure how they ‘should’ look in that table, i.e. what makes Field.isValid() false. I think it’s probably because they’re dangling (no ‘Parent’ entry).

In the end, my team solved this issue by recreating the form manually, but the mystery remains…how do we delete an invalid field that was created by Adobe DC?

Thanks anyway @Ryan , I might come back to this if we encounter it again