PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

Generate pdf from images, word, xml...?
https://forum.pdfsharp.net/viewtopic.php?f=2&t=3478

Author:  Daler [ Tue Oct 25, 2016 11:33 am ]
Post subject:  Generate pdf from images, word, xml...?

Hello,

First of all, I apologize if this has been asked before. I did search, but I couldn't find anything that settles my question. Is it possible to build a PDF converter with MigraDoc/PDFsharp? By PDF converter I mean converting image types (JPEG, PNG, etc.) to PDF and vice versa, and DOCX, DOC, and XLSX to PDF and vice versa.

From reading the FAQ and the general information on the web site, I understand that generating a PDF from images should be possible. Converting Microsoft Office documents to PDF is not as easy, though. I read that these products (MigraDoc or PDFsharp, I'm not sure which one) support converting XML to PDF. That would mean I could use the Open XML library to convert Office documents to XML and then to PDF.

Is it possible to do that? I don't want to juggle a lot of different open source libraries and would rather stick with these products. The support on this forum also seems really good, which is very important to me.

The target platform is .NET Core.

Best regards,
Daler

Author:  Thomas Hoevel [ Wed Oct 26, 2016 11:40 am ]
Post subject:  Re: Generate pdf from images, word, xml...?

Hi!
Daler wrote:
I read that these products (MigraDoc or PDFsharp, I'm not sure which one) support converting XML to PDF.
Where did you read that?

Daler wrote:
That would mean I could use the Open XML library to convert Office documents to XML and then to PDF.
With MigraDoc you add text with attributes (font name, font size, color, ...) to a document. If you know how to extract this information from a DOCX file or any XML file or any database then generating a PDF won't be a problem.
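
As a rough illustration of that approach, here is a minimal sketch in C#. It assumes the MigraDoc 1.50-style API (`Document`, `AddFormattedText`, `PdfDocumentRenderer`). The `TextRun` class and the hard-coded runs are hypothetical stand-ins for whatever your own DOCX/XML parser produces; MigraDoc does not provide that extraction step.

```csharp
using MigraDoc.DocumentObjectModel;
using MigraDoc.Rendering;

// Hypothetical container for one run of text plus its attributes,
// as it might come out of your own DOCX/XML/database reader.
class TextRun
{
    public string Text = "";
    public string FontName = "Arial";
    public double FontSizeInPoints = 11;
    public bool Bold;
}

static class Program
{
    static void Main()
    {
        // Assumption: these runs were already extracted from the source file.
        var runs = new[]
        {
            new TextRun { Text = "Hello, ", FontSizeInPoints = 12 },
            new TextRun { Text = "MigraDoc", FontSizeInPoints = 12, Bold = true },
        };

        // Build the MigraDoc document model: one section, one paragraph.
        var document = new Document();
        Section section = document.AddSection();
        Paragraph paragraph = section.AddParagraph();

        // Add each run as formatted text and apply its attributes.
        foreach (TextRun run in runs)
        {
            FormattedText text = paragraph.AddFormattedText(run.Text);
            text.Font.Name = run.FontName;
            text.Font.Size = Unit.FromPoint(run.FontSizeInPoints);
            text.Bold = run.Bold;
        }

        // Render the document model to PDF and save it.
        var renderer = new PdfDocumentRenderer(true) { Document = document };
        renderer.RenderDocument();
        renderer.PdfDocument.Save("output.pdf");
    }
}
```

The hard part of a converter is filling `runs` (and tables, images, page setup, and so on) from the source format. Once you have that information, handing it to MigraDoc looks much like the loop above.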

Author:  Daler [ Thu Oct 27, 2016 11:00 am ]
Post subject:  Re: Generate pdf from images, word, xml...?

Thanks for the reply.

Thomas Hoevel wrote:
Where did you read that?


From here: http://www.pdfsharp.net/Features.ashx

Quote:
Import data from various sources via XML files or direct interfaces (any data source that can be used with .NET)


And... I think I misunderstood the part about XML. It is about parsing XML into C# objects and then creating the PDF document from them, right?

Creating the PDF that way does take some work, but what about the reverse? Can either of these products parse a PDF document?

Quote:
Modify, merge, and split existing PDF files


Does that mean the answer to my question is yes?

Author:  Thomas Hoevel [ Thu Oct 27, 2016 11:37 am ]
Post subject:  Re: Generate pdf from images, word, xml...?

PDFsharp does not parse the code that draws a page.
There is some third-party code for extracting text from PDF files. https://www.nuget.org/packages/PdfTextract/
You can also find sample code on this forum.
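
To show what "modify, merge, and split" means at the page level, here is a minimal sketch that concatenates two PDF files, assuming the PDFsharp 1.50-style API. The file names are hypothetical. It copies whole pages from one document to another and never interprets the text or drawing operations on those pages.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

static class Program
{
    static void Main()
    {
        // Hypothetical input files; replace with your own paths.
        string[] inputPaths = { "first.pdf", "second.pdf" };

        var output = new PdfDocument();

        foreach (string path in inputPaths)
        {
            // Import mode opens the file for copying its pages into another document.
            PdfDocument input = PdfReader.Open(path, PdfDocumentOpenMode.Import);

            // Append every page of the input file to the output document.
            for (int i = 0; i < input.PageCount; i++)
            {
                output.AddPage(input.Pages[i]);
            }
        }

        output.Save("merged.pdf");
    }
}
```

Splitting works the same way in reverse: open one file in Import mode and add selected pages to separate output documents.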

All times are UTC