Hey Jason,
Thanks for keeping me informed! Very interesting indeed. I’m wondering how it would look if you put those largest glyphs above each other; they would need to overlap. I did a small test but couldn’t make it happen. These glyphs are also smaller than the bbox in PDFTron, so they’re probably not the ones stretching it:

I found it interesting, though, how Adobe seems to line it up pixel-perfect.
In our case specifically, it wouldn’t really matter how tight the bbox is vertically. Even if it doesn’t contain all the glyphs and some letters or symbols stick out of the bbox, the text extraction would still count that as an intersection and extract the right results. Personally, I think that offers a better user experience than accidentally selecting multiple lines.
Another way you might solve it is by using the coordinates of the next line. If you know how far below the next line sits, you might be able to calculate how much space there is in between and cap the bboxes so they don’t overlap.
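To make that idea a bit more concrete, here is a rough sketch in Python. It is only an illustration, not how PDFTron or Adobe actually computes boxes: the `BBox` type and `cap_line_boxes` function are hypothetical names, and I'm assuming PDF-style coordinates (origin at the bottom left, y growing upward) and that no line box fully contains the next one.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class BBox:
    # Hypothetical box type; PDF-style coordinates with y growing upward.
    x0: float
    y0: float  # bottom edge
    x1: float
    y1: float  # top edge


def cap_line_boxes(lines: List[BBox]) -> List[BBox]:
    """Trim vertically overlapping line boxes so neighbours only touch."""
    # Order the lines from the top of the page to the bottom.
    capped = sorted(lines, key=lambda b: b.y1, reverse=True)
    for i in range(len(capped) - 1):
        upper, lower = capped[i], capped[i + 1]
        if upper.y0 < lower.y1:  # upper box reaches into the line below
            # Split the overlapping band at its midpoint (an arbitrary choice).
            cut = (upper.y0 + lower.y1) / 2
            capped[i] = replace(upper, y0=cut)
            capped[i + 1] = replace(lower, y1=cut)
    return capped


if __name__ == "__main__":
    boxes = [BBox(0, 90, 100, 112), BBox(0, 70, 100, 94)]
    for box in cap_line_boxes(boxes):
        print(box)
```

Splitting at the midpoint is just one option; capping at the next line's top edge, or at its baseline plus some fraction of the font size, would work along the same lines.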
Is this something the core team will be working on, or is it not on a timeline yet?
Best regards,
Hugo Pasman