Why do some annotations report they are on page zero?

Question:

I am using the following code to parse the annotations.

for (PageIterator itr = pdfDoc.getPageIterator(); itr.hasNext(); ) {
    Page page = (Page) (itr.next());
    for (int i = 0; i < page.getNumAnnots(); i++) {
        Annot annot = page.getAnnot(i);
        // Page number taken from the annotation's back-reference to its page
        int page_num = annot.getPage().getIndex();
    }
}

Sometimes page_num is zero instead of the correct page number.

Answer:

Page.GetIndex returns zero here because the annotation is missing a reference back to the page that contains it, so Annot.GetPage returns an invalid Page object. GetIndex returns zero if the Page object is invalid, or if the page has not yet been inserted into the page tree.

You can detect this case with the following code.

annot.getPage().isValid(); // returns false if the annotation doesn't have a valid page reference

Even this test is not fully reliable, because some annotations reference a valid but wrong page. In short, Annot.GetPage cannot be trusted, and other parts of PDFNet, such as XFDF export, do not rely on it.

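The correct way to determine which page an annotation is on is to use the Page object whose annotation list contains it. In the loop above, take the page number from page (for example with its GetIndex method) rather than from the Page returned by Annot.GetPage.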
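
As a rough illustration of that idea, here is a minimal, library-free sketch. The Page and Annot classes below are hypothetical stand-ins, not the PDFNet types, and backRefPage only mimics the unreliable value an annotation's page back-reference might report. The page number is taken from the page being iterated, never from the annotation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AnnotPageNumbers {
    // Hypothetical stand-in for an annotation. backRefPage mimics the page number
    // reported through the annotation's back-reference (0 when it is missing).
    public static final class Annot {
        final String name;
        final int backRefPage;
        public Annot(String name, int backRefPage) {
            this.name = name;
            this.backRefPage = backRefPage;
        }
    }

    // Hypothetical stand-in for a page: only the list of annotations it contains.
    public static final class Page {
        final List<Annot> annots;
        public Page(List<Annot> annots) {
            this.annots = annots;
        }
    }

    // Maps each annotation name to the 1-based number of the page that lists it.
    public static Map<String, Integer> pageNumbersByOwner(List<Page> pages) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int pageNum = 0;
        for (Page page : pages) {
            pageNum++; // PDF page numbers start at 1
            for (Annot annot : page.annots) {
                // The owning page decides the number; backRefPage is ignored.
                result.put(annot.name, pageNum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(
            new Page(Arrays.asList(new Annot("note", 1))),
            // "link" has a missing back-reference, "stamp" a wrong one.
            new Page(Arrays.asList(new Annot("link", 0), new Annot("stamp", 1))));
        System.out.println(pageNumbersByOwner(pages)); // prints {note=1, link=2, stamp=2}
    }
}
```

With the real library, the same pattern means using the page variable from the outer loop wherever the sketch uses pageNum.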

Note: if you are exporting XFDF, XFDF page numbers start at 0, while PDF page numbers start at 1.
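
The off-by-one conversion can be sketched as follows. The class and method names are hypothetical helpers, not part of PDFNet. Rejecting values below 1 is a design choice of this sketch: a zero coming from an invalid page then fails loudly instead of turning into an XFDF page of -1.

```java
public class XfdfPageIndex {
    // Converts a 1-based PDF page number to the 0-based page number used in XFDF.
    public static int toXfdfPage(int pdfPageNum) {
        if (pdfPageNum < 1) {
            throw new IllegalArgumentException(
                "PDF page numbers start at 1, got " + pdfPageNum);
        }
        return pdfPageNum - 1;
    }

    // Converts a 0-based XFDF page number back to a 1-based PDF page number.
    public static int toPdfPage(int xfdfPage) {
        if (xfdfPage < 0) {
            throw new IllegalArgumentException(
                "XFDF page numbers start at 0, got " + xfdfPage);
        }
        return xfdfPage + 1;
    }

    public static void main(String[] args) {
        System.out.println(toXfdfPage(1)); // prints 0
        System.out.println(toPdfPage(0));  // prints 1
    }
}
```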