PDFsharp & MigraDoc Foundation https://forum.pdfsharp.net/

Detect headings and create bookmarks from existing pdf https://forum.pdfsharp.net/viewtopic.php?f=2&t=3643

Author:	Asugusto [ Sun Aug 20, 2017 3:26 am ]
Post subject:	Detect headings and create bookmarks from existing pdf
Hi, maybe you can help me with the following problem... I receive a very simple PDF file, read it, and set some security settings, and everyone is happy so far. But now I have to add bookmarks. I was thinking of reading the HTML code, trying to work out what is a heading, and building the tree myself. Do you know a better way? Regards

Author:	TH-Soft [ Sun Aug 20, 2017 5:24 pm ]
Post subject:	Re: Detect headings and create bookmarks from existing pdf
Hi! Asugusto wrote: "I was thinking of reading the HTML code [...]" PDF is close to PostScript and far from HTML. Extracting text from PDF is not trivial. Extracting text with font attributes is a bit more challenging.

Author:	Asugusto [ Sun Aug 20, 2017 8:19 pm ]
Post subject:	Re: Detect headings and create bookmarks from existing pdf
Hi, thanks for the answer. I know it is challenging, but I think I have no other option... Let me explain the situation: all I have is the HTML and a base URL. So I'm rendering the HTML in a browser engine that fetches the styles and scripts; once they are rendered, I print a simple PDF. Now I have to add bookmarks to all the headings (H1...H9), but that is complicated. I discarded the option of creating the PDF myself and then injecting the images, graphics, etc., because they are continuously changing, they are large, and I guess the result would not match the original HTML... Any suggestions?
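
Since the HTML is available before printing, the heading tree itself can be built without touching the PDF. The following is a minimal sketch in Python using only the standard library; the names HeadingCollector, build_outline and outline_from_html are made up for illustration and are not part of PDFsharp or CefSharp. HTML defines only h1 through h6, so the sketch stops there. It assumes headings are not nested inside each other.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for h1..h6 in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of whitespace so the title is a single clean line.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def build_outline(headings):
    """Turn a flat list of (level, title) into nested dicts with 'children'."""
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for level, title in headings:
        node = {"title": title, "level": level, "children": []}
        # Climb up until the top of the stack is a shallower heading.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


def outline_from_html(html):
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return build_outline(collector.headings)
```

This only yields the titles and their nesting. A bookmark also needs a target page, so each heading still has to be located in the printed PDF, which is the hard part discussed in this thread.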

Author:	TH-Soft [ Sun Aug 20, 2017 9:13 pm ]
Post subject:	Re: Detect headings and create bookmarks from existing pdf
Asugusto wrote: "All I have is the HTML and a base URL. So I'm rendering the HTML in a browser engine that fetches the styles and scripts; once they are rendered, I print a simple PDF." Open source? Any chance to intercept the process? Do you have control over the CSS? This could make things easier - e.g. by having distinct font sizes or by having minimal (invisible) variations of the text colour. With all PDFs coming from the same PDF generator, things will be a bit simpler.
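
One way to read the colour suggestion: inject a stylesheet that gives each heading level its own near-black colour, then look for those exact colours when scanning the PDF. Below is a rough sketch in Python. The function names and the choice of encoding the level in the blue channel are assumptions made for illustration. It also assumes the renderer writes the CSS colour unchanged as an RGB fill colour with components in the range 0 to 1, which would need checking against real output from the PDF generator in use.

```python
def marker_color(level):
    """Near-black RGB triple (0..255) that encodes the heading level in blue."""
    if not 1 <= level <= 6:
        raise ValueError("HTML defines heading levels 1 to 6")
    return (0, 0, level)


def marker_css():
    """CSS rules that tag h1..h6 with visually indistinguishable colours."""
    rules = []
    for level in range(1, 7):
        r, g, b = marker_color(level)
        rules.append(f"h{level} {{ color: #{r:02x}{g:02x}{b:02x}; }}")
    return "\n".join(rules)


def level_from_pdf_rgb(r, g, b, tolerance=0.5 / 255):
    """Map a PDF fill colour (components 0..1) to a heading level, or None.

    The tolerance absorbs small rounding in how the colour was written out.
    """
    for level in range(1, 7):
        wanted = marker_color(level)
        if all(abs(got - want / 255) <= tolerance
               for got, want in zip((r, g, b), wanted)):
            return level
    return None
```

Plain black body text maps to None, so only text drawn in a marker colour is treated as a heading. The page on which such text is found then gives the bookmark target.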

Author:	Asugusto [ Sun Aug 20, 2017 9:28 pm ]
Post subject:	Re: Detect headings and create bookmarks from existing pdf
Yes, I'm using ChromiumWebBrowser from CefSharp (https://github.com/cefsharp/CefSharp). I have to look into it, but I think I can intercept the process. As for control of the CSS: yes, I could have it. But I don't see how it will help me; can you explain? Regards

All times are UTC