Using JBIG2 Threshold and SharePages parameters for PDF Compression

Q: I have been playing with these two features in the PDFNet JBIG2 compression filter and have a couple of questions:

  1. JBIG2 Threshold – I tried to set the threshold, but did not observe any change in the generated file size. I am not sure whether I am setting it the right way. Here is the code that I use:

SDF::ObjSet hint_set;
SDF::Obj enc = hint_set.CreateArray(); // Encoder hint array: [/JBIG2 /Threshold 0.4]
enc.PushBackName("JBIG2");
enc.PushBackName("Threshold");
enc.PushBackNumber(0.4);

PDF::Page page = doc.PageCreate(PDF::Rect(0, 0, imgWidth, imgHeight)); // Start a new page
elemWrtr.Begin(page); // Begin writing to this page

PDF::Image img = PDF::Image::Create(doc, (imgBufsVec[indx].get()), (unsigned int)imgBufSize, imgWidth, imgHeight, 1, PDF::ColorSpace::CreateDeviceGray(), enc);
PDF::Element element = elemBldr.CreateImage(img, Common::Matrix2D(imgWidth, 0, 0, imgHeight, 0, 0));
elemWrtr.WritePlacedElement(element);
elemWrtr.End();

doc.PagePushBack(page);

  2. SharePages – is there a way to specify the shared pages for a set of pages? I mean, if I am putting 4 images into a PDF using JBIG2 compression, I would like the compression algorithm to use the same symbol table for pages ‘2 and 3’ and not for pages ‘1 & 4’. Is the symbol table global for all pages in the PDF?

A:

Regarding the Threshold parameter, the supported values are in the range 0.4-0.9. They control the amount of acceptable ‘data loss’. Because of the way JBIG2 compression works (with symbol dictionaries etc.), using higher compression (i.e. a lower threshold) can lead to surprising problems (see http://www.xerox.com/assets/pdf/ScanningQAincludingAppendixA.pdf).
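
Since only values between 0.4 and 0.9 are supported, one defensive option is to clamp a requested value before pushing it into the hint array. The following is a minimal sketch; ClampJbig2Threshold is a hypothetical helper, not part of PDFNet, and how PDFNet itself treats out-of-range values is not stated here.

```cpp
#include <algorithm>

// Supported Threshold range, as quoted in the answer above.
constexpr double kMinJbig2Threshold = 0.4;
constexpr double kMaxJbig2Threshold = 0.9;

// Hypothetical helper (not a PDFNet API): force a requested threshold
// into the supported range so the hint always carries a valid value.
double ClampJbig2Threshold(double requested) {
    return std::max(kMinJbig2Threshold,
                    std::min(requested, kMaxJbig2Threshold));
}
```

The result could then be passed to enc.PushBackNumber(...) in place of the literal used in the question's code.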

For a relevant discussion (related to PDFNet, posted before the Xerox scandal), see: https://groups.google.com/d/msg/pdfnet-sdk/KIr_61lP8fw/w5oDs9JuzrkJ

    2. SharePages – is there a way to specify the shared pages for a set of pages? I mean, if I am putting 4 images into a PDF using JBIG2 compression, I would like the compression algorithm to use the same symbol table for pages ‘2 and 3’ and not for pages ‘1 & 4’.

You can achieve this by creating temporary PDFDoc objects and then merging all pages into the final PDF (with pdfdoc.InsertPages(), as shown in the PDFPage sample: http://www.pdftron.com/pdfnet/samplecode.html#PDFPage).
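
Before creating the temporary documents, it helps to decide which consecutive pages belong together. The sketch below only plans the page ranges and deliberately leaves out all PDFNet calls so that it compiles on its own; PlanTemporaryDocs and the per-page group labels are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a PDFNet API).
// group_of_page[i] is a caller-chosen label for page i + 1. Consecutive
// pages with the same label are meant to share one symbol table, so they
// go into the same temporary document.
// Returns 1-based, inclusive (first_page, last_page) ranges.
std::vector<std::pair<int, int>> PlanTemporaryDocs(
        const std::vector<int>& group_of_page) {
    std::vector<std::pair<int, int>> ranges;
    for (std::size_t i = 0; i < group_of_page.size(); ++i) {
        const int page = static_cast<int>(i) + 1;
        if (!ranges.empty() && group_of_page[i] == group_of_page[i - 1]) {
            ranges.back().second = page;     // extend the current run
        } else {
            ranges.emplace_back(page, page); // start a new run
        }
    }
    return ranges;
}
```

For the four images in the question, labels {0, 1, 1, 2} give the ranges (1,1), (2,3) and (4,4): three temporary documents, with only pages 2 and 3 written into the same one. Each temporary document would then be appended to the final PDF with InsertPages().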

  • Is the symbol table global for all pages in the PDF?

PDFNet creates a shared symbol table for a set of pages (but not necessarily for all pages in the document). The size of this set can be controlled with the /SharePages hint parameter, which specifies the maximum number of pages sharing a common ‘JBIG2Globals’ segment stream. For systems with a small amount of memory you will probably want to keep SharePages to a minimum (e.g. 1-5).
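
Because SharePages is an upper limit on the pages per shared stream, it also implies a lower bound on how many ‘JBIG2Globals’ streams a document needs. A rough sketch of that arithmetic follows; MinSharedSymbolTables is a hypothetical helper, it assumes every page holds a JBIG2 image that uses a shared table, and the actual grouping PDFNet chooses may produce more streams than this.

```cpp
#include <cassert>

// Hypothetical helper (not a PDFNet API): the smallest number of shared
// symbol tables possible when at most share_pages pages may share one.
int MinSharedSymbolTables(int page_count, int share_pages) {
    assert(page_count >= 0 && share_pages > 0);
    // Ceiling division: a partially filled last group still needs a table.
    return (page_count + share_pages - 1) / share_pages;
}
```

For example, 11 pages with SharePages set to 5 need at least 3 shared tables, while SharePages set to 1 means one table per page.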