PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

HTML TO PDF Conversion
https://forum.pdfsharp.net/viewtopic.php?f=2&t=1135

Author:  uday [ Thu Apr 08, 2010 2:44 pm ]
Post subject:  HTML TO PDF Conversion

Hi, I would like to know the best strategy for converting an ASP.NET server control's rendered HTML to PDF using PDFsharp. The PDF should look exactly the same as the control looks in the browser.
I have been banging my head against this for the last 3 weeks :(

Author:  DaleStan [ Thu Apr 08, 2010 5:21 pm ]
Post subject:  Re: HTML TO PDF Conversion

You have set yourself an impossible task. PDF is designed for print output and, as such, explicitly specifies physical things like line breaks, page breaks, word spacing, and the like. HTML is designed for video display, and generally specifies "put line breaks wherever you [the browser, acting on the user's instructions] think best", "Page breaks? What are page breaks?", "kern words however you [the browser again] see fit", and the like.

That said, the best strategy is probably to start with whatever input the ASP.NET server control uses, not the HTML it generates. Parsing Strict HTML is a pain; parsing JavaScript is a terror.

Author:  uday [ Fri Apr 09, 2010 4:43 am ]
Post subject:  Re: HTML TO PDF Conversion

Thanks DaleStan,
But is it possible to get a look and feel as close as possible?
Ideally, I would like to write a method:
SaveControlAsPdf(System.Web.UI.Control control)
This method would save the control to PDF.
I have already written some code to get the HTML of the control.
Is there anything like the HTMLParser of iTextSharp?
Thanks in advance

Author:  olavxxx [ Fri Apr 09, 2010 10:20 pm ]
Post subject:  Re: HTML TO PDF Conversion

Hello,
I too have been wondering how to archive web pages.
However, I found a book on archiving web pages.

I don't know what your reason for doing this is, but if you want PDF/A, you cannot link to other directories from within the PDF (it all has to be embedded).

What you could do is take a screenshot of the web page (you can use the WebBrowser control in .NET) and then draw the screenshot into the PDF.

The drawback is that the result is a picture, which kind of sucks, but you could add metadata from the HTML file. If that is not good enough, I guess you will have to write some custom code to "translate" or "parse" the HTML into elements in the PDF. It should be doable, but I would then base it on a framework that supports PDF/A, as it would be a product you could sell.

I don't know of any frameworks that support PDF/A, though.
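
For illustration, the screenshot idea above might look roughly like the following with PDFsharp. This is only a sketch under some assumptions: the screenshot has already been captured and saved as an image file (capturing it is not shown), and the names ScreenshotToPdf, FitInside and Save are hypothetical helpers, not part of PDFsharp. The image is scaled down to fit a single default-size page, so a very tall web page will come out small.

```csharp
using System;
using PdfSharp.Drawing;
using PdfSharp.Pdf;

public static class ScreenshotToPdf
{
    // Hypothetical helper: shrinks (width, height) to fit inside
    // (maxWidth, maxHeight) while keeping the aspect ratio.
    // Images that already fit are not enlarged.
    public static void FitInside(double width, double height,
                                 double maxWidth, double maxHeight,
                                 out double fitWidth, out double fitHeight)
    {
        double scale = Math.Min(1.0, Math.Min(maxWidth / width, maxHeight / height));
        fitWidth = width * scale;
        fitHeight = height * scale;
    }

    // Hypothetical helper: draws an existing screenshot file onto one PDF page.
    public static void Save(string imagePath, string pdfPath)
    {
        var document = new PdfDocument();
        PdfPage page = document.AddPage();

        using (XGraphics gfx = XGraphics.FromPdfPage(page))
        using (XImage image = XImage.FromFile(imagePath))
        {
            double w, h;
            FitInside(image.PointWidth, image.PointHeight,
                      page.Width.Point, page.Height.Point, out w, out h);

            // Top-left corner of the page; all units are points.
            gfx.DrawImage(image, 0, 0, w, h);
        }

        document.Save(pdfPath);
    }
}
```

A call such as ScreenshotToPdf.Save("control.png", "control.pdf") would then produce a one-page PDF containing the picture. The text in it is not selectable or searchable, which is the drawback mentioned above.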

Author:  uday [ Mon Apr 12, 2010 9:27 am ]
Post subject:  Re: HTML TO PDF Conversion

Hi,
I do not need any form of linking.
Is there a way to achieve this, then?
Can you please share the book if possible?

Author:  olavxxx [ Tue Apr 20, 2010 6:12 am ]
Post subject:  Re: HTML TO PDF Conversion

uday wrote:
Hi,
I do not need any form of linking.
Is there a way to achieve this, then?
Can you please share the book if possible?

Sorry, I haven't had time to read it yet, even though it is not a very large book.
If you still want to know the title, please PM me :-)

All times are UTC