PDF to DOCX conversion with font embedding

Product:
Apryse SDK with Structured Output module

Please give a brief summary of your issue:
When converting a PDF to DOCX, the fonts in the DOCX do not match the fonts in the PDF

Please describe your issue and provide steps to reproduce it:
When converting a PDF to DOCX using Apryse SDK Structured Output module, the fonts in the resulting DOCX document do not match the fonts used in the original PDF, even though those fonts are installed on both the conversion machine and the machine used to open the DOCX.

Here is a Discord thread that contains sample documents and a script to reproduce the issue: Discord

Hello,

Thanks for reaching out! I am unable to access the files in that link. Are you able to upload the source document, resulting document, and code here? If not, could you create a ticket at https://support.apryse.com and add them to the ticket?

Thanks for your patience and understanding.

Hi Nicholas,
here are the files:
Conversion_Test (1).pdf (60.6 KB)
Conversion_Test (1).docx (35.6 KB)

And the script (looks like I can only add 2 files):

#---------------------------------------------------------------------------------------
# Copyright (c) 2001-2023 by Apryse Software Inc. All Rights Reserved.
# Consult LICENSE.txt regarding license information.
#---------------------------------------------------------------------------------------
require 'date'
require './app/sdks/apryse/arm64/PDFNetC/Lib/PDFNetRuby'
require 'etc'
include PDFNetRuby
require 'open-uri'
require 'fileutils'

$stdout.sync = true
# Relative path to the folder containing the test files.
$inputPath = "../../Downloads/"
$outputPath = "../../Downloads/"

def font_installed?(font_name)
  `fc-list`.include?(font_name)
end

def main()
  font_name = 'Inter'
  if font_installed?(font_name)
    puts "Font '#{font_name}' is installed."
  else
    puts "Font '#{font_name}' is not installed."
  end
  # The first step in every application using PDFNet is to initialize the
  # library. The library is usually initialized only once, but calling
  # Initialize() multiple times is also fine.
  username = Etc.getlogin
  puts "The script is being run by: #{username}"

  PDFNet.Initialize("key")
  PDFNet.AddResourceSearchPath("app/sdks/apryse/arm64/PDFNetC/Lib/")

  if !StructuredOutputModule.IsModuleAvailable() then
    puts ""
    puts "Unable to run the sample: PDFTron SDK Structured Output module not available."
    puts "-----------------------------------------------------------------------------"
    puts "The Structured Output module is an optional add-on, available for download"
    puts "at https://docs.apryse.com/documentation/core/info/modules/. If you have already"
    puts "downloaded this module, ensure that the SDK is able to find the required files"
    puts "using the PDFNet::AddResourceSearchPath() function."
    puts ""
    return
  end

  begin
    # Convert PDF document to Word
    puts "Converting PDF to Word"

    $outputFile = $outputPath + "Conversion_Test.docx"

    Convert.ToWord($inputPath + "Conversion_Test.pdf", $outputFile)

    puts "Result saved in " + $outputFile
  rescue => error
    puts "Unable to convert PDF document to Word, error: " + error.message
  end

  PDFNet.Terminate
  puts "Done."
end

main()
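
A side note on the font check in the script: `fc-list`.include?(font_name) is a substring match against the whole fc-list output, so 'Inter' would also match a family such as 'Inter Tight' or a file path that happens to contain that text. A stricter check might look like the sketch below. It assumes that `fc-list : family` prints one line per font with comma-separated family names; parse_font_families and font_family_installed? are hypothetical helper names for illustration, not part of the Apryse SDK.

```ruby
# Split `fc-list : family` style output into a list of exact family names.
# Assumption: each line holds one or more comma-separated names for a font.
def parse_font_families(fc_list_output)
  fc_list_output.each_line
                .flat_map { |line| line.strip.split(',') }
                .map(&:strip)
                .reject(&:empty?)
                .uniq
end

# Hypothetical stricter variant of font_installed?: exact, case-insensitive
# match on the family name instead of a substring match on the raw output.
# The default argument shells out to fc-list only when no output is passed in.
def font_family_installed?(font_name, fc_list_output = `fc-list : family`)
  parse_font_families(fc_list_output).any? { |family| family.casecmp?(font_name) }
end
```

Calling font_family_installed?('Inter') in place of font_installed?('Inter') would be the only change to main. This only confirms that fontconfig can see the family; it says nothing about which fonts the conversion writes into the DOCX.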

Please let me know if there’s anything else I can help with.
Thank you!
Inga