Hi,
I don't know if my idea is a good or bad one - or if there are applications that do this already.
Here is the issue:
Some old web systems have been deposited with us.
We have to preserve the web pages for future compatibility.
Since browsers change, support for older web standards may well decline in the future.
Some people at my work want to convert the entire directory structure to PDF.
The different sites each have their own www folder, so I would be inclined to have one PDF for each web folder.
The only problem is that I would then have to:
1) read all web pages and parse them (I think this is not so difficult in C#).
2) put the web content, images, etc. into the PDF files
3) add metadata as required by the PDF/A standard
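As a rough sketch of step 1 only (written in Python purely for brevity; the same approach should carry over to C#), assuming the deposited sites are plain .htm/.html files on disk. The names ResourceCollector and scan_site are made up for illustration, and nothing here touches PDF generation or PDF/A metadata:

```python
import os
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collects link targets and image sources from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []   # values of <a href="...">
        self.images = []  # values of <img src="...">

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])


def scan_site(root):
    """Return {folder: {page: (links, images)}} for every HTML file under root.

    Folders are given relative to root, so each key could later become one PDF.
    """
    result = {}
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.lower().endswith((".htm", ".html")):
                continue
            path = os.path.join(folder, name)
            # Assumption: old pages may not be UTF-8; latin-1 never fails to decode.
            with open(path, encoding="latin-1") as fh:
                collector = ResourceCollector()
                collector.feed(fh.read())
                collector.close()
            rel = os.path.relpath(folder, root)
            result.setdefault(rel, {})[name] = (collector.links, collector.images)
    return result
```

Calling scan_site on the top-level deposit folder would give, per web folder, the pages plus the images and links each one refers to, which is the inventory steps 2 and 3 would then work from.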
I have yet to try PDFsharp for converting a web page or stream. Has anyone else tested this?
Also, is this a silly application to make? I know the PDF variants will be much harder to navigate, as you can't simply use href as in HTML, even though you can link to other documents.
I think there are rules about links in PDF/A too, but I have to read up more on the standard.
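Whatever PDF/A turns out to allow, a converter would first need to know which links stay inside one PDF and which point at another. A minimal sketch in Python, assuming the one-PDF-per-web-folder layout described above; classify_link and the labels "same-pdf", "other-pdf" and "external" are hypothetical names, not part of any library:

```python
import posixpath
from urllib.parse import urlsplit


def classify_link(href, page_folder):
    """Classify an href found on a page that lives in page_folder.

    page_folder is relative to the archive root and uses "/" separators.
    Returns (kind, target), where target is the folder (and so the PDF)
    the link points into, or the original href for external links.
    """
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        # http:, ftp:, mailto: and similar all leave the archived site.
        return ("external", href)
    if not parts.path:
        # A bare "#fragment" points within the current page.
        return ("same-pdf", page_folder)
    if parts.path.startswith("/"):
        # Assumption: a leading "/" refers to the archive root.
        target = posixpath.normpath(parts.path.lstrip("/"))
    else:
        target = posixpath.normpath(posixpath.join(page_folder, parts.path))
    target_folder = posixpath.dirname(target)
    kind = "same-pdf" if target_folder == page_folder else "other-pdf"
    return (kind, target_folder)
```

For example, classify_link("../site2/index.html", "site1") reports a link into the PDF for site2, which is the case that would need a document-to-document link rather than an internal one.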
So, I have 2 ideas as of now:
1) Make the C# webPDF converter.
2) Make a "web tunnel" in PHP, which would convert the images to TIFF files (for archival storage) but still display the JPEG, GIF, etc. images as they were (via a stream and HTML output, I guess).
So my "support" case is more a request for general ideas, tips and feedback. I didn't find any "general discussion" section on this forum, which would have been a better place for this post.
If anyone else is developing something similar, e.g. web->PDF (PDF/A), I would be interested in a dialogue.
P.S. It's non-commercial too; I work in a city archive.