How do I extract all paths of a given color?

Q:

I want to extract all paths of a specific color. How can I do so?

A:

For a general introduction to editing PDF documents, see the ElementEdit code sample at http://www.pdftron.com/pdfnet/samplecode.html#ElementEdit.

In your case, you can keep only the paths whose fill or stroke color is the Separation colorant “foo” by replacing the sample’s “ProcessElements” function with the following:

static void ProcessElements(ElementReader& reader, ElementWriter& writer, XObjSet& visited)
{
    Element element;
    while (element = reader.Next()) // Read page contents
    {
        switch (element.GetType())
        {
        case Element::e_path:
            {
                GState gs = element.GetGState();

                // Keep the path only if its fill or stroke color space is a
                // Separation space whose colorant name is "foo".
                if ((gs.GetFillColorSpace().GetType() == ColorSpace::Type::e_separation &&
                     std::string(gs.GetFillColorSpace().GetSDFObj().GetAt(1).GetName()) == "foo") ||
                    (gs.GetStrokeColorSpace().GetType() == ColorSpace::Type::e_separation &&
                     std::string(gs.GetStrokeColorSpace().GetSDFObj().GetAt(1).GetName()) == "foo"))
                {
                    writer.WriteElement(element);
                }

                break;
            }
        case Element::e_form:
            {
                writer.WriteElement(element); // write Form XObject reference to current stream

                Obj form_obj = element.GetXObject();
                if (visited.find(form_obj.GetObjNum()) == visited.end()) // if this XObject has not been processed
                {
                    // recursively process the Form XObject
                    visited.insert(form_obj.GetObjNum());
                    ElementWriter new_writer;

                    reader.FormBegin();
                    new_writer.Begin(form_obj);
                    ProcessElements(reader, new_writer, visited);
                    new_writer.End();
                    reader.End();
                }
                break;
            }
        default:
            // All other element types (text, images, and so on) are not written out.
            break;
        }
    }
}

See the attached PDF for the output this function produces.

Note that this extracts only paths whose Separation colorant is “foo”; it does not extract paths that use the special colorant “All”. You can extend the check in this sample to accept “All” if desired. You may also want to check “DeviceN” color spaces, since they can list “foo” among their colorants.
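
As a rough illustration of that extended check, here is a minimal sketch of the matching logic on its own, written in plain C++ without any PDFNet calls. The names SpaceKind, SpaceInfo, UsesColorant, and KeepPath are hypothetical and not part of PDFNet; you would fill a SpaceInfo yourself from each color space (for Separation, the name read via GetSDFObj().GetAt(1) as in the function above; for DeviceN, the array of names stored in the same position of the color space array, per the PDF specification). Treat it as a starting point rather than a definitive implementation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the color space types that matter here.
enum class SpaceKind { Separation, DeviceN, Other };

// Hypothetical summary of one color space: its kind plus its colorant names.
// A Separation space has exactly one name; a DeviceN space has one per component.
struct SpaceInfo {
    SpaceKind kind;
    std::vector<std::string> colorants;
};

// Returns true if the color space uses the target colorant.
// When accept_all is true, a Separation space named "All" also matches.
// ("All" is only checked for Separation; it is a Separation-specific name.)
bool UsesColorant(const SpaceInfo& space, const std::string& target, bool accept_all)
{
    switch (space.kind) {
    case SpaceKind::Separation:
        if (space.colorants.empty()) return false;
        return space.colorants.front() == target ||
               (accept_all && space.colorants.front() == "All");
    case SpaceKind::DeviceN:
        // Match if the target appears anywhere in the DeviceN name list.
        return std::find(space.colorants.begin(), space.colorants.end(), target) !=
               space.colorants.end();
    case SpaceKind::Other:
        break;
    }
    return false;
}

// A path is kept if either its fill or its stroke color space matches.
bool KeepPath(const SpaceInfo& fill, const SpaceInfo& stroke,
              const std::string& target, bool accept_all)
{
    return UsesColorant(fill, target, accept_all) ||
           UsesColorant(stroke, target, accept_all);
}
```

With something like this in place, the condition in the e_path case could be reduced to a single KeepPath call on the fill and stroke color spaces, with accept_all controlling whether “All” paths are extracted as well.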