Accurate position info when scraping vertical lines

I am attempting to accurately scrape all horizontal and vertical lines from a PDF document. This is part of a larger endeavour, but it is a fundamental step needed to accomplish the larger goal. I can get position information for horizontal lines fine, and the height and width of vertical lines are correct, but the vertical position of vertical lines is consistently off. Is there some transform I need to take into account?

To demonstrate, I boiled this down to a very simple application that attempts to draw the lines of page 23 of the source document on the screen, making the issue visually obvious. The code for the main form is below. For my purposes I convert all rectangles to a coordinate system with the origin at the upper left corner of the page, ordinates increasing downward and to the right, scaled into inches.

The source document can be downloaded from here:
Any pointers you can provide would be greatly appreciated.

Imports pdftron
Imports pdftron.Common
Imports pdftron.PDF

Public Class Form1

    Private PageBoundaries As RectangleF

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click

        Dim gr As Graphics = pb1.CreateGraphics ' pb1 is a PictureBox on Form1, width=1109, height=908

        gr.ScaleTransform(100, 100) ' coordinates are in inches, so draw at 100 pixels per inch

        Dim PathToDoc As String = "C:\YourFolderName\MD_14_2_FS_v14.2.1.3.pdf"
        Dim pdfdoc As PDFDoc
        Dim page As Page
        Dim reader As New ElementReader
        Dim element As Element

        pdfdoc = New PDFDoc(PathToDoc)
        page = pdfdoc.GetPage(23)

        PageBoundaries = PageInches(page.GetCropBox)

        reader.Begin(page) ' start reading the page's content before calling Next

        element = reader.Next
        While element IsNot Nothing
            If element.GetType = PDF.Element.Type.e_path Then

                If element.IsFilled Then
                    Dim elementRct As New Rect
                    element.GetBBox(elementRct) ' fill elementRct with the path's bounding box, in points
                    Dim elemInches As RectangleF = PageInches(elementRct)

                    Debug.WriteLine("Height=" & elemInches.Height.ToString & " Width=" & elemInches.Width.ToString & " Top=" & elemInches.Top.ToString & " Bot=" & elemInches.Bottom.ToString & " Lft=" & elemInches.Left.ToString & " Rgt=" & elemInches.Right.ToString)

                    gr.FillRectangle(Brushes.Black, elemInches.X, elemInches.Y, elemInches.Width, elemInches.Height)

                End If

            End If
            element = reader.Next
        End While

        reader.End() ' finish reading the page

        reader = Nothing
        page = Nothing
        pdfdoc = Nothing
        gr = Nothing

    End Sub

    ' Converts a rectangle in PDF points to inches with a top-left origin.
    Public Function PageInches(ByVal srcRect As Rect) As RectangleF
        Dim PageRect As New RectangleF(CSng(srcRect.x1 / 72.0), CSng(PageBoundaries.Bottom - (srcRect.y1 / 72.0)), CSng(srcRect.Width / 72.0), CSng(srcRect.Height / 72.0))
        Return PageRect
    End Function

End Class

Lee Gillie, CCP
Online Data Processing, Inc.

I was able to resolve this. My flaw was in the PageInches function. Can you spot it?
Hint: Reversal of the Y ordinates.
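
For anyone following along, here is a minimal sketch of one reading of that hint. It is not necessarily the exact fix applied above. The assumption is that when the Y axis is flipped, the top of the converted rectangle has to be measured from the page top down to the box's upper edge (y2), not its lower edge (y1). Using y1 shifts every rectangle down by its own height, which is barely visible for thin horizontal rules but obvious for tall vertical ones. The names ToTopLeftInches, pageLeftPts and pageTopPts are hypothetical, and the function takes plain numbers so it runs without PDFNet.

```vbnet
Imports System
Imports System.Drawing

Module FlipYSketch

    ' Convert a box in PDF points (origin bottom-left, y increasing upward)
    ' into inches with the origin at the top-left and y increasing downward.
    ' pageLeftPts and pageTopPts are the crop box's left edge and upper edge
    ' in points (hypothetical parameters, not part of any library API).
    Public Function ToTopLeftInches(x1 As Double, y1 As Double,
                                    x2 As Double, y2 As Double,
                                    pageLeftPts As Double,
                                    pageTopPts As Double) As RectangleF
        ' Normalize so the result does not depend on corner order.
        Dim leftPts As Double = Math.Min(x1, x2)
        Dim upperPts As Double = Math.Max(y1, y2)
        Dim widthPts As Double = Math.Abs(x2 - x1)
        Dim heightPts As Double = Math.Abs(y2 - y1)

        ' The flipped top edge is the distance from the page top down to
        ' the box's UPPER edge.
        Return New RectangleF(CSng((leftPts - pageLeftPts) / 72.0),
                              CSng((pageTopPts - upperPts) / 72.0),
                              CSng(widthPts / 72.0),
                              CSng(heightPts / 72.0))
    End Function

    Sub Main()
        ' A vertical rule 1 pt wide running from y=72 to y=720
        ' on a 612 x 792 pt page.
        Dim r As RectangleF = ToTopLeftInches(72, 72, 73, 720, 0, 792)
        Console.WriteLine("Top=" & r.Top & " Height=" & r.Height)
    End Sub

End Module
```

In this example the rule's upper edge sits at (792 - 720) / 72 = 1 inch from the top of the page and it is 648 / 72 = 9 inches tall. Measuring from y1 instead would put the top at (792 - 72) / 72 = 10 inches, which is the kind of vertical offset described in the question.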

Glad to hear you were able to solve the issue on your own.

Also, be aware that the coordinates you get from Element.GetBBox() do not always match what you would get “on-screen”.

In particular, you should call Page.GetDefaultMatrix() and pass the bbox coordinates through it. You see this most commonly with rotated pages, that is, pages that appear landscape in a PDF viewer but have portrait dimensions in the PDF itself.
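
As a rough illustration of what passing a bbox through a matrix involves, here is a minimal sketch in plain VB.NET with no PDFNet calls. It assumes the usual PDF affine form (a b c d h v) and transforms all four corners, because after a rotation the original x1,y1 and x2,y2 are no longer the minimum and maximum corners. The helper names TransformPoint and TransformBox and the sample matrix are hypothetical. In a real application the matrix would come from Page.GetDefaultMatrix() as described above, and its values may differ from the ones used here.

```vbnet
Imports System

Module DefaultMatrixSketch

    ' Apply a PDF-style affine matrix {a, b, c, d, h, v} to one point:
    '   x' = a*x + c*y + h
    '   y' = b*x + d*y + v
    Public Sub TransformPoint(m As Double(), ByRef x As Double, ByRef y As Double)
        Dim tx As Double = m(0) * x + m(2) * y + m(4)
        Dim ty As Double = m(1) * x + m(3) * y + m(5)
        x = tx
        y = ty
    End Sub

    ' Transform all four corners of a box and return the axis-aligned
    ' bounds as {minX, minY, maxX, maxY}.
    Public Function TransformBox(m As Double(),
                                 x1 As Double, y1 As Double,
                                 x2 As Double, y2 As Double) As Double()
        Dim xs As Double() = {x1, x2, x2, x1}
        Dim ys As Double() = {y1, y1, y2, y2}
        Dim minX As Double = Double.MaxValue
        Dim minY As Double = Double.MaxValue
        Dim maxX As Double = Double.MinValue
        Dim maxY As Double = Double.MinValue

        For i As Integer = 0 To 3
            TransformPoint(m, xs(i), ys(i))
            minX = Math.Min(minX, xs(i))
            minY = Math.Min(minY, ys(i))
            maxX = Math.Max(maxX, xs(i))
            maxY = Math.Max(maxY, ys(i))
        Next

        Return New Double() {minX, minY, maxX, maxY}
    End Function

    Sub Main()
        ' Hypothetical matrix for a quarter-turn rotation of a
        ' 612 x 792 pt page: x' = y, y' = 612 - x.
        Dim m As Double() = {0, -1, 1, 0, 0, 612}

        ' The same 1 pt wide vertical rule, from y=72 to y=720.
        Dim b As Double() = TransformBox(m, 72, 72, 73, 720)
        Console.WriteLine(String.Join(", ", b))
    End Sub

End Module
```

With this sample matrix the tall, thin box becomes a wide, flat one, which is why a rule that is vertical in the page's own coordinates can show up as horizontal in the viewer. The Y flip and the conversion to inches would then be applied to the transformed bounds, not to the raw bbox.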