PDFsharp & MigraDoc Foundation https://forum.pdfsharp.net/ |
Detect headings and create bookmarks from existing pdf https://forum.pdfsharp.net/viewtopic.php?f=2&t=3643 |
Author: | Asugusto [ Sun Aug 20, 2017 3:26 am ] |
Post subject: | Detect headings and create bookmarks from existing pdf |
Hi, maybe you can help me with the following problem... I receive a very simple PDF file, read it, and set some security settings, and everyone is happy so far. But now I have to add bookmarks. I was thinking of reading the HTML code, trying to interpret what is a heading, and building the tree myself. Do you know a better way? Regards |
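The "build the tree myself" step can be sketched independently of any PDF library. The following is a minimal Python sketch, assuming the headings are already available as a flat, document-ordered list of (level, title, page) tuples; the `Bookmark` class and `build_outline` function are hypothetical names used for illustration, and finding the page number of each heading is a separate problem that this sketch does not solve.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bookmark:
    title: str
    level: int
    page: int
    children: List["Bookmark"] = field(default_factory=list)


def build_outline(headings):
    """Nest a flat, document-ordered list of (level, title, page) tuples."""
    root = Bookmark(title="root", level=0, page=0)
    stack = [root]  # stack[-1] is the most recently opened bookmark
    for level, title, page in headings:
        # Pop every open bookmark that cannot be a parent of this heading.
        # The root has level 0, so it is never popped for levels >= 1.
        while stack[-1].level >= level:
            stack.pop()
        node = Bookmark(title, level, page)
        stack[-1].children.append(node)
        stack.append(node)
    return root.children
```

A heading that skips a level (for example an H3 directly under an H1) is simply attached to the nearest shallower heading, which is usually what a bookmark panel should show.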
Author: | TH-Soft [ Sun Aug 20, 2017 5:24 pm ] |
Post subject: | Re: Detect headings and create bookmarks from existing pdf |
Hi! Asugusto wrote: "I was thinking of reading the HTML code [...]" PDF is close to PostScript and far from HTML. Extracting text from PDF is not trivial. Extracting text with font attributes is a bit more challenging. |
Author: | Asugusto [ Sun Aug 20, 2017 8:19 pm ] |
Post subject: | Re: Detect headings and create bookmarks from existing pdf |
Hi, thanks for the answer. I know that it is challenging, but I think I have no other option... Let me explain the situation: all I have is the HTML and a base URL. So I'm rendering the HTML in a browser engine that fetches the styles and scripts; once the page is rendered, I print a simple PDF. Now I have to add bookmarks to all the headings (H1...H9), but that is complicated. I discarded the option of creating the PDF myself and then injecting the images, graphics, etc., because they are continuously changing, are large, and I guess the result would not match the original HTML... Any suggestions? |
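Since the HTML is available, the heading texts and levels can be read from it directly rather than recovered from the PDF. Below is a minimal Python sketch using only the standard library `html.parser` module. It assumes the final markup is available as a string (if scripts change the page, that would have to be the rendered DOM serialized back to HTML), and it only looks for the standard `h1` to `h6` elements. `HeadingCollector` and `collect_headings` are hypothetical names. The sketch does not tell you on which PDF page each heading ends up.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1..h6 element, in document order."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._parts = []    # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            # Join fragments (e.g. around <em>) and collapse whitespace.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def collect_headings(html):
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings
```

The resulting list still has to be matched against the printed PDF to find page numbers, for example by searching for the heading text page by page.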
Author: | TH-Soft [ Sun Aug 20, 2017 9:13 pm ] |
Post subject: | Re: Detect headings and create bookmarks from existing pdf |
Asugusto wrote: "All I have is the HTML and a base URL." Open source? Any chance to intercept the process? Asugusto wrote: "So I'm rendering the HTML in a browser engine that fetches the styles and scripts; once the page is rendered, I print a simple PDF." Do you have control over the CSS? This could make things easier - e.g. by having distinct font sizes or by having minimal (invisible) variations of the text colour. With all PDFs coming from the same PDF generator, things will be a bit simpler. |
Author: | Asugusto [ Sun Aug 20, 2017 9:28 pm ] |
Post subject: | Re: Detect headings and create bookmarks from existing pdf |
Yes, I'm using ChromiumWebBrowser from CefSharp (https://github.com/cefsharp/CefSharp). I'd have to check, but I think I can intercept the process. As for control of the CSS: yes, I could have it. But I don't see how it will help me. Can you explain? Regards |
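One possible reading of the CSS suggestion above: give every heading level a font size that nothing else on the page uses, so that a text run of exactly that size in the PDF can be recognised as a heading of that level. The Python sketch below illustrates only this idea. The marker sizes, the `marker_css` and `classify_runs` names, and the (text, font size, page) input format are all assumptions; the runs would have to come from whatever PDF text extractor is used, and the sketch assumes the font sizes survive printing unscaled.

```python
# Hypothetical marker sizes in points, chosen so that no body text uses them.
HEADING_SIZES_PT = {1: 24.1, 2: 20.1, 3: 16.1, 4: 14.1, 5: 12.1, 6: 11.1}


def marker_css(sizes=HEADING_SIZES_PT):
    """Return CSS rules that give each heading level its own font size."""
    return "\n".join(
        f"h{level} {{ font-size: {size}pt; }}"
        for level, size in sorted(sizes.items())
    )


def classify_runs(runs, sizes=HEADING_SIZES_PT, tolerance=0.05):
    """Map extracted (text, font_size_pt, page) runs to (level, text, page).

    Runs whose size matches no marker size are treated as body text and
    skipped.
    """
    headings = []
    for text, size, page in runs:
        for level, marker in sizes.items():
            if abs(size - marker) <= tolerance:
                headings.append((level, text, page))
                break
    return headings
```

The output has the same (level, title, page) shape a bookmark tree needs, so the remaining work would be nesting the entries by level and writing them to the PDF.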
All times are UTC