I want to add text to a PDF page using ElementBuilder, but when I add non-ASCII (Unicode) text, it comes out as garbage characters.
Answer:
The issue is that the character codes are not encoded in a way that matches the Font being used. The following API should work regardless of which language you are using (C#, Java, Swift, etc.).
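var text = "ﭳﭵﭷﭺﮒ"; // Example Unicode text.
// The third argument tells PDFNet which characters the font must cover.
var font = Font.Create(pdfdoc, name_of_font, text); // name_of_font would be something like "RobotoSans" or "Arial".
// Open a text block with the chosen font and size.
var element = elementBuilder.CreateTextBegin(font, font_size);
elementWriter.WriteElement(element);
// Write the text as a Unicode run so the character codes match the font's encoding.
element = elementBuilder.CreateUnicodeTextRun(text);
elementWriter.WriteElement(element);
// Close the text block.
elementWriter.WriteElement(elementBuilder.CreateTextEnd());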
Exact code will depend on the exact language you use, but this is the essential pattern.
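As a rough illustration of the kind of mismatch described above, and not of PDFNet's internals, the plain-Python sketch below shows what happens when bytes written under one encoding are read back under another. The choice of UTF-8 and Latin-1 is an assumption made purely for illustration.

```python
# Illustration only: plain Python, no PDFNet involved.
text = "é"

# Writer side: the character is stored as its UTF-8 bytes (two bytes here).
raw = text.encode("utf-8")

# Reader side: each byte is wrongly interpreted as one Latin-1 character.
garbled = raw.decode("latin-1")

# Reading the same bytes with the matching encoding recovers the text.
recovered = raw.decode("utf-8")

print(len(raw), repr(garbled), repr(recovered))
```

The same idea applies to page content: the character codes written to the page only make sense when they are interpreted with the encoding the font expects.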
The third argument to Font.Create (char_set) only needs to contain the Unicode characters you are going to use, that is, the set of characters required. PDFNet uses this information to pick the best font matching both the family name you gave and the Unicode values you want to use. So PDFNet might pick its second or third choice of font if the first choice is missing some of those characters.
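To make the fallback idea above concrete, here is a minimal plain-Python sketch of choosing a font by character coverage. It is not PDFNet's actual selection logic, which is not described in this thread; the function name pick_font, the font names, and the coverage data are all hypothetical.

```python
# Hypothetical coverage data: font name -> characters it has glyphs for.
# Real coverage would come from the font files; these values are made up.
COVERAGE = {
    "FontA": set("abc"),
    "FontB": set("abcé"),
    "FontC": set("abcéω"),
}


def pick_font(preferred, char_set, coverage=COVERAGE):
    """Return the first font in `preferred` that covers every character.

    If no font covers everything, fall back to the one covering the most
    characters (earlier entries win ties). `preferred` must be non-empty.
    """
    needed = set(char_set)
    for name in preferred:
        if needed <= coverage.get(name, set()):
            return name
    return max(preferred, key=lambda name: len(needed & coverage.get(name, set())))


print(pick_font(["FontA", "FontB", "FontC"], "aé"))
```

In this sketch the first choice is skipped whenever it lacks a required character, which mirrors the "second or third choice" behavior described above.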
Look at section 9.7.5 of PDF 32000-1:2008, CMaps. char_set isn’t one of the names in that table?
Also:
How big can this string be?
What happens when the characters in this string are not in a single font? Are characters gathered from multiple fonts into this single font?
>Look at section 9.7.5 of PDF 32000-1:2008, CMaps. char_set isn’t one of the names in that table?
Font.Create and ElementBuilder.CreateUnicodeTextRun are high-level methods that hide PDF details, such as CID CMaps, so you can concentrate on creating page content. PDFNet takes care of generating the required data structures in the output PDF.
We also offer lots of other ways to generate PDF page content, such as GDI+, XAML, XPS, HTML, DOCX, and more.
>How big can this string be?
I am not sure what the implementation limit is. I believe passing 65,535 characters would be no problem, which I imagine is enough to cover everything needed from a single font.
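If the string length is a concern, one option is to pass only the distinct characters. The plain-Python sketch below assumes that Font.Create cares only about which characters are present and not about their order or repetition, which this thread suggests but does not confirm; unique_chars is a hypothetical helper, not part of PDFNet.

```python
def unique_chars(text):
    """Return the distinct characters of `text`, in first-seen order."""
    # dict.fromkeys keeps insertion order and drops repeated keys.
    return "".join(dict.fromkeys(text))


print(unique_chars("banana"))
```

The result could then be passed as the character set in place of the full text.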
>What happens when the characters in this string are not in a single font? Are characters gathered from multiple fonts into this single font?
Characters are not gathered from multiple fonts. The method returns only one font, which in this case would be a reference to a single font from the OS/filesystem. In the end the single “best” font will be picked.
Ideally you would have an idea of what is covered by the font that you want to use. You can always use multiple fonts on a page.
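When no single font covers the whole string, the text can be split into runs and each run written with its own font. The plain-Python sketch below shows one way to do the splitting, assuming you already know each font's coverage; split_into_runs, the font names, and the coverage sets are hypothetical and not part of PDFNet.

```python
def split_into_runs(text, fonts, coverage):
    """Split `text` into (font_name, substring) runs.

    Each character is assigned to the first font in `fonts` that covers it.
    Characters that no font covers are assigned to None.
    """
    runs = []
    for ch in text:
        font = next((f for f in fonts if ch in coverage[f]), None)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# Made-up coverage data for illustration.
coverage = {"LatinFont": set("abc "), "GreekFont": set("αβ")}
print(split_into_runs("ab αβc", ["LatinFont", "GreekFont"], coverage))
```

Each resulting run could then be written with its own font using the same CreateTextBegin / CreateUnicodeTextRun / CreateTextEnd pattern shown in the answer.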
Thank you again for the feedback; we will look to improve the documentation/API.