How do I extract images from PDF with the alpha channel?

Q:

How do I extract images from PDF with the alpha channel?

A:

Images in PDF do not have an explicit alpha channel. Instead, a soft mask or an image mask may be associated with the base image (the mask may or may not have the same dimensions as the base image; for more information, please see Section 8.9.5 in the PDF Reference: http://xodo.com/view/#/c0c11968-ee14-478e-9b09-6dc5635c0915). You can learn more about soft masks from the PDFNet KB: https://groups.google.com/forum/#!searchin/pdfnet-sdk/SMask

If you want to extract an image with an alpha channel, you may need to extract both the base image and the soft mask (just a single-channel gray image) and then merge them into one image. To get the image data you can use image.GetBitmap(). This will work in most cases, but you will need to make some assumptions if the image dimensions do not match (e.g. upscale the smaller image to the size of the larger one, then merge the channels).
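The merge step can be sketched roughly as follows. This is plain Python working on raw pixel buffers, not PDFNet API: the function names `resize_nearest` and `merge_rgb_and_mask` are hypothetical, and the sketch assumes you have already obtained the base image as 8-bit RGB bytes and the soft mask as 8-bit gray bytes (row-major, no padding), with mask samples treated directly as opacity. Nearest-neighbor upscaling is just one possible choice when the dimensions differ.

```python
def resize_nearest(pixels, width, height, channels, new_width, new_height):
    """Nearest-neighbor resample of a flat, row-major 8-bit pixel buffer."""
    out = bytearray(new_width * new_height * channels)
    for y in range(new_height):
        src_y = y * height // new_height
        for x in range(new_width):
            src_x = x * width // new_width
            src = (src_y * width + src_x) * channels
            dst = (y * new_width + x) * channels
            out[dst:dst + channels] = pixels[src:src + channels]
    return bytes(out)


def merge_rgb_and_mask(rgb, rgb_size, mask, mask_size):
    """Combine RGB data and a gray soft mask into one RGBA buffer.

    Assumption: if the sizes differ, both inputs are scaled to the larger
    width and the larger height before the channels are merged.
    """
    rgb_size, mask_size = tuple(rgb_size), tuple(mask_size)
    width = max(rgb_size[0], mask_size[0])
    height = max(rgb_size[1], mask_size[1])
    if rgb_size != (width, height):
        rgb = resize_nearest(rgb, rgb_size[0], rgb_size[1], 3, width, height)
    if mask_size != (width, height):
        mask = resize_nearest(mask, mask_size[0], mask_size[1], 1, width, height)

    rgba = bytearray(width * height * 4)
    for i in range(width * height):
        rgba[4 * i:4 * i + 3] = rgb[3 * i:3 * i + 3]  # copy R, G, B
        rgba[4 * i + 3] = mask[i]                      # mask sample becomes alpha
    return bytes(rgba), (width, height)


if __name__ == "__main__":
    # A 1x1 base image combined with a 2x2 soft mask.
    base = bytes([10, 20, 30])
    soft_mask = bytes([0, 64, 128, 255])
    rgba, size = merge_rgb_and_mask(base, (1, 1), soft_mask, (2, 2))
    print(size, list(rgba))
```

The resulting RGBA buffer can then be handed to whatever image library you use for saving (for example as a PNG). If quality matters, a bilinear or bicubic resampler from such a library may be a better choice than nearest-neighbor.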

If you want to rasterize a PDF page with a transparent background (rather than a solid white paper background), please see https://groups.google.com/d/msg/pdfnet-sdk/GFqayLaJdSU/J_0gqGvcAmAJ