I have two questions:
- I am currently trying to work out x/y/width/height/rotation of a
text block, how would I go about doing that?
- On an e_quadto path segment, I couldn't find in the documentation which
point is the end point (x1/y1, x2/y2, or x3/y3).
A: You could use PDFNet to implement a custom PDF to XML converter
(along the lines of the ElementReaderAdv or TextExtract samples). PDF2SVG
itself is built using the PDFNet content extraction API.
The position of a text element is defined by the Current
Transformation Matrix (CTM) and the Text Matrix (element.GetCTM() *
element.GetTextMatrix()). You could also use CharIterator (as shown in
ElementReaderAdv) to obtain the positioning information (in text
space) for each character in the text element.
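As a rough illustration of how position, rotation, and scale can be read off the combined matrix, here is a minimal Python sketch. It does not use PDFNet. The helper names (concat, decompose) are hypothetical, and the six-number layout (a, b, c, d, h, v) with x' = a*x + c*y + h and y' = b*x + d*y + v is assumed from the usual PDF matrix convention. The shear value follows one common rotation/shear/scale factorization; other conventions exist.

```python
import math

# A 2D affine matrix stored as (a, b, c, d, h, v), assumed to map
#   x' = a*x + c*y + h
#   y' = b*x + d*y + v
# These helpers are illustrative only and are not part of PDFNet.

def concat(outer, inner):
    """Return the matrix that applies `inner` first, then `outer`."""
    ao, bo, co, do, ho, vo = outer
    ai, bi, ci, di, hi, vi = inner
    return (
        ao * ai + co * bi,
        bo * ai + do * bi,
        ao * ci + co * di,
        bo * ci + do * di,
        ao * hi + co * vi + ho,
        bo * hi + do * vi + vo,
    )

def decompose(m):
    """Split a matrix into translation, rotation (degrees), scale, shear."""
    a, b, c, d, h, v = m
    scale_x = math.hypot(a, b)
    det = a * d - b * c
    scale_y = det / scale_x
    # Shear factor in a rotation * shear * scale factorization.
    shear = ((a * c + b * d) / scale_x) / scale_y
    return {
        "x": h,
        "y": v,
        "rotation_deg": math.degrees(math.atan2(b, a)),
        "scale_x": scale_x,
        "scale_y": scale_y,
        "shear": shear,
    }

# Example: a CTM that rotates by 90 degrees and moves the origin to
# (100, 200), combined with a text matrix that scales by 12.
ctm = (0.0, 1.0, -1.0, 0.0, 100.0, 200.0)
text_matrix = (12.0, 0.0, 0.0, 12.0, 0.0, 0.0)
info = decompose(concat(ctm, text_matrix))
print(info)
```

With the sample values above, the text origin lands at (100, 200) with a 90 degree rotation and a uniform scale of 12. The same decomposition applies to the CTM alone for non-text elements.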
You can alternatively call element.GetBBox(ref rect) to obtain the
bounding box for a given element (in PDF user space), but this will not
give you rotation, shear factor, etc. (which can be obtained from the
matrices described above).
The positioning of other elements on the page (e.g. images, paths,
form XObjects) is completely defined by the CTM (element.GetCTM());
you can also use element.GetBBox() to obtain their bounding boxes.
- On e_quadto pathsegment, I couldn't find in the documentation which point is the end point (x1/y1 or x2/y2 or x3/y3)
The end point of the segment is always its last point
(x3,y3). The other points are control points for the curve (PDF
only supports cubic Bezier curves).
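To make the role of the points concrete, here is a small Python sketch that evaluates a cubic Bezier curve. It is independent of PDFNet. The function name cubic_bezier is hypothetical, and it assumes the curve starts at the current point (p0), uses (x1,y1) and (x2,y2) as control points, and ends at (x3,y3).

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    p0 is the start (current) point, p1 and p2 are control points,
    and p3 is the end point.
    """
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return (x, y)

start = (0.0, 0.0)   # current point before the segment
c1 = (1.0, 2.0)      # (x1, y1): first control point
c2 = (3.0, 2.0)      # (x2, y2): second control point
end = (4.0, 0.0)     # (x3, y3): end point

print(cubic_bezier(start, c1, c2, end, 0.0))  # the start point
print(cubic_bezier(start, c1, c2, end, 1.0))  # the end point (x3, y3)
```

The curve passes through the start and end points only; the control points shape it but generally do not lie on it.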