Q:
These are my first attempts at editing existing PDF pages. I am trying to move/shift all text and images found within a rectangle by some DeltaX and DeltaY.
My code is based on the ElementEdit sample that comes as part of the SDK:
http://www.pdftron.com/pdfnet/samplecode.html#ElementEdit
In the top half of the page, text is left untouched. Between the midpoint and 3.47 inches from the bottom of the page, text is changed to blue (carried over from the original example).
Everything below 3.47 inches from the bottom is shifted.
The shift is 1/4" to the right (DeltaX) and down 3/4" (DeltaY).
To do this, your ProcessElements Sub is recoded as:
Private DeltaX As Double = 0.25 * 72.0
Private DeltaY As Double = -0.75 * 72.0
Sub ProcessElements(ByVal reader As ElementReader, ByVal writer As ElementWriter)
    Dim element As Element = reader.Next()
    While Not IsNothing(element)
        If element.GetType() = element.Type.e_text Then
            Dim bbox As New Rect
            element.GetBBox(bbox)
            If bbox.y2 < 250 Then
                Dim mtx As Matrix2D = element.GetGState.GetTransform
                mtx.Concat(1, 0, 0, 1, DeltaX, DeltaY)
                element.GetGState.SetTransform(mtx)
                writer.WritePlacedElement(element)
            ElseIf bbox.y2 < 396 Then
                Dim gs As GState = element.GetGState()
                gs.SetFillColorSpace(ColorSpace.CreateDeviceRGB)
                gs.SetFillColor(New ColorPt(0, 0, 1))
                writer.WriteElement(element)
            Else
                writer.WriteElement(element)
            End If
            element = reader.Next()
        ElseIf element.GetType() = element.Type.e_form Then
            reader.FormBegin()
            ProcessElements(reader, writer)
            reader.End()
            writer.WriteElement(element)
            element = reader.Next()
        ElseIf element.GetType() = element.Type.e_image Then
            element = reader.Next()
        ElseIf element.GetType() = element.Type.e_inline_image Then
            element = reader.Next()
        Else
            writer.WriteElement(element)
            element = reader.Next()
        End If
    End While
End Sub
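For reference, the constants in the code above are in PDF user-space units, where by default 72 units correspond to one inch. The following is a minimal sketch, separate from the VB code, that checks the arithmetic. `inches_to_points` is a hypothetical helper, and the 11-inch page height is an assumption inferred from the 396 midpoint, not something taken from the attached file.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space unit


def inches_to_points(inches):
    """Convert a length in inches to PDF points (hypothetical helper)."""
    return inches * POINTS_PER_INCH


delta_x = inches_to_points(0.25)        # 18.0, shift to the right
delta_y = inches_to_points(-0.75)       # -54.0, negative because PDF y grows upward
shift_limit = inches_to_points(3.47)    # about 249.84, rounded to 250 in the code
midpoint = inches_to_points(11.0) / 2   # 396.0 on an assumed 11-inch-tall page
```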
The input and output PDF are attached.
Although they are only slightly visible, you can see all the elements bunched up in the lower left corner of the page, similar to the placement I am seeing.
I hope that starting from your sample and providing the input and output files will make it easier for you to reproduce and visualize the result.
A:
For your application, WriteElement will give the wrong result, since each element will add an additional translation, resulting in a ‘staircase’ effect.
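The staircase effect can be illustrated with a minimal sketch. This is a simplified model, not PDFNet code: it assumes that every written transform is concatenated onto a running page transform, and it tracks only the translation part. `staircase_offsets` is a hypothetical name.

```python
def staircase_offsets(n_elements, dx, dy):
    """Return the effective offset of each element when the same translation
    is concatenated once per element and never undone (simplified model)."""
    tx, ty = 0.0, 0.0
    offsets = []
    for _ in range(n_elements):
        tx += dx  # the new translation stacks on top of the previous ones
        ty += dy
        offsets.append((tx, ty))
    return offsets
```

With the shift from the question (18 points right, 54 points down), `staircase_offsets(3, 18.0, -54.0)` gives a different offset for every element, each one step further than the last, instead of one uniform shift.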
WritePlacedElement also gives the wrong result, because you end up discarding relevant GState information, such as selected font and text matrix. Instead, what you need is to only reset the GState transform, not the entire GState.
I wrote a test implementation in Python that does just that. It should be relatively straightforward to map the implementation to VB:
# Assumes the PDFNet Python bindings (Matrix2D, Element, ...) are already imported.
def ProcessElements(reader, writer):
    element = reader.Next()  # Read page contents
    # Store the inverse of our translation so we can undo it on the next element
    inverse_transform = Matrix2D(1, 0, 0, 1, 0, 0)
    while element is not None:
        # Apply the inverse transform to undo the previous translation
        mtx = element.GetGState().GetTransform()
        mtx = inverse_transform * mtx
        element.GetGState().SetTransform(mtx)
        # Nothing is left to undo, so reset inverse_transform to identity
        inverse_transform = Matrix2D(1, 0, 0, 1, 0, 0)
        elem_type = element.GetType()
        if elem_type == Element.e_text or elem_type == Element.e_image:
            # We want to translate text and images, so here we go:
            mtx.Concat(1, 0, 0, 1, 0, -150)
            element.GetGState().SetTransform(mtx)
            writer.WriteElement(element)
            # Remember the inverse so the next element can undo this shift
            inverse_transform = Matrix2D(1, 0, 0, 1, 0, 150)
        elif elem_type == Element.e_form:  # Recursively process form XObjects
            writer.WriteElement(element)
            reader.FormBegin()
            ProcessElements(reader, writer)
            reader.End()
        else:
            writer.WriteElement(element)
        element = reader.Next()
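The bookkeeping in the loop above can be checked without PDFNet. The following is a rough sketch under stated assumptions: each written transform is concatenated onto a running page transform, every element's own transform starts as identity, and matrices are plain `(a, b, c, d, h, v)` tuples. `multiply`, `translation`, and `simulate` are hypothetical helpers, not SDK calls.

```python
IDENTITY = (1, 0, 0, 1, 0, 0)


def multiply(m, n):
    """Affine product of two (a, b, c, d, h, v) tuples, row-vector convention."""
    a1, b1, c1, d1, h1, v1 = m
    a2, b2, c2, d2, h2, v2 = n
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            h1 * a2 + v1 * c2 + h2, h1 * b2 + v1 * d2 + v2)


def translation(dx, dy):
    return (1, 0, 0, 1, dx, dy)


def simulate(element_kinds, dx, dy):
    """Replay the undo-then-shift loop and report each element's net offset."""
    ctm = IDENTITY       # running transform as seen by the page (model only)
    inverse = IDENTITY   # undoes the shift applied to the previous element
    placed = []
    for kind in element_kinds:
        emitted = inverse            # start by undoing the previous shift
        inverse = IDENTITY           # nothing left to undo
        if kind in ("text", "image"):
            emitted = multiply(emitted, translation(dx, dy))
            inverse = translation(-dx, -dy)
        ctm = multiply(emitted, ctm)
        placed.append((kind, ctm[4], ctm[5]))
    return placed


result = simulate(["text", "path", "text", "image"], 0, -150)
```

In this model, text and images all end up with the same offset of 150 points down, while the path in between returns to its original position, so no staircase builds up.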
To help you with debugging, I recommend using a tool such as CosEdit, which allows you to browse the internal structure of a PDF document: http://www.pdftron.com/pdfcosedit
With this tool, and a good understanding of the PDF specification, you can easily visualize how different routines are modifying the PDF, which should make your development process more productive.