Q:
We are using pdftron.PDF.Convert.ToHtml to convert a PDF to HTML. The conversion quality is good, but it writes each PDF page to a separate HTML file. I am using the code below:
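// Read the input PDF path and the output location from the application config.
// ConfigurationManager replaces the obsolete ConfigurationSettings class.
string strSourcePath = System.Configuration.ConfigurationManager.AppSettings["SourcePath"];
string strDestinationPath = System.Configuration.ConfigurationManager.AppSettings["DestinationPath"];
// Convert with the default HTML output options.
pdftron.PDF.Convert.HTMLOutputOptions o = new pdftron.PDF.Convert.HTMLOutputOptions();
pdftron.PDF.Convert.ToHtml(strSourcePath, strDestinationPath, o);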
Is there any way to get a single HTML file using this tool?
--------
A:
The PDFNet conversion module provides no built-in way to merge the pages into a single HTML file. However, you can do so yourself: the output is valid XML, so any XML library can be used to combine the per-page files.
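As a rough sketch of the XML approach, the helper below uses System.Xml.Linq to append the body content of each later page to the body of the first page. The class and method names (HtmlPageMerger, Merge) are hypothetical and not part of PDFNet. It assumes that every page parses as well-formed XML and contains a body element, and it does not reconcile per-page styles or duplicate ids, which real converter output may require.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class HtmlPageMerger
{
    // Hypothetical helper (not a PDFNet API): merges per-page XHTML strings
    // into one document. Returns null when no pages are supplied.
    public static XDocument Merge(IEnumerable<string> pageXml)
    {
        XDocument merged = null;
        XElement mergedBody = null;

        foreach (string xml in pageXml)
        {
            XDocument doc = XDocument.Parse(xml);

            // Match on the local name so the lookup works with or without
            // an XHTML namespace on the elements.
            XElement body = doc.Descendants().First(e => e.Name.LocalName == "body");

            if (merged == null)
            {
                // The first page becomes the container, keeping its <head>.
                merged = doc;
                mergedBody = body;
            }
            else
            {
                // Nodes that already have a parent are cloned when added.
                mergedBody.Add(body.Nodes());
            }
        }

        return merged;
    }
}
```

Read each per-page file into a string (for example with File.ReadAllText), pass the strings in page order, and save the returned document with XDocument.Save.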
Another option may be to use a master HTML page and iframes to inject the individual HTML pages, as shown in the following samples:
http://www.pdftron.com/pdfnet/pdf2html/demo/viewer.html?d=/pdf2html/177.progworld&pages=3
http://www.pdftron.com/pdfnet/pdf2html/demo.html
You can generate the master HTML page (the one that references the other HTML pages via iframes) using PDFNet, e.g. use PDFDoc.GetPageCount() to get the number of pages.
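A minimal sketch of generating such a master page follows. MasterPageBuilder and Build are hypothetical names, and the file-name pattern passed in is an assumption: check the names of the files the converter actually wrote and adjust the pattern to match. The page count would come from PDFDoc.GetPageCount() as mentioned above; the builder itself takes it as a plain integer so that it has no PDFNet dependency.

```csharp
using System.Net;
using System.Text;

static class MasterPageBuilder
{
    // Hypothetical helper (not a PDFNet API): returns the markup of a master
    // page holding one iframe per converted page.
    // pageFileFormat is a format string such as "page_{0}.html" (assumed
    // naming, not confirmed converter output); {0} is the 1-based page number.
    public static string Build(int pageCount, string pageFileFormat)
    {
        var sb = new StringBuilder();
        sb.AppendLine("<!DOCTYPE html>");
        sb.AppendLine("<html><head><meta charset=\"utf-8\"><title>Document</title></head><body>");

        for (int page = 1; page <= pageCount; page++)
        {
            // Encode the file name so it is safe inside the attribute value.
            string src = WebUtility.HtmlEncode(string.Format(pageFileFormat, page));
            // The fixed height is a placeholder; size the frames to suit the pages.
            sb.AppendLine("<iframe src=\"" + src + "\" style=\"width:100%;height:1100px;border:0\"></iframe>");
        }

        sb.AppendLine("</body></html>");
        return sb.ToString();
    }
}
```

Write the returned string to a file in the same folder as the per-page files (for example with File.WriteAllText) so that the relative iframe paths resolve.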