PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


 Post subject: OCR PDF
Posted: Thu Mar 28, 2013 2:08 pm

Joined: Mon Mar 25, 2013 5:07 pm
Posts: 6
Hi PDFsharp Team,

Does PDFsharp have the capability to OCR a PDF file? If yes, how? If not, is there any plan to release that in the near future?

Thanks


 Post subject: Re: OCR PDF
Posted: Thu Mar 28, 2013 4:44 pm
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3100
Location: Cologne, Germany
Hi!

There are no plans to add OCR support (for PDF files that contain scanned pages).

_________________
Regards
Thomas Hoevel
PDFsharp Team


 Post subject: Re: OCR PDF
Posted: Thu Mar 28, 2013 4:48 pm

Joined: Mon Mar 25, 2013 5:07 pm
Posts: 6
Thomas Hoevel wrote:
Hi!

There are no plans to add OCR support (for PDF files that contain scanned pages).



Thanks for your reply.

I have a scenario in which I have a TIFF image that has already been OCR-processed. How can I use PDFsharp to create a PDF file that embeds the TIFF image without losing the OCR content?


 Post subject: Re: OCR PDF
Posted: Tue Apr 02, 2013 3:26 pm

Joined: Mon Mar 25, 2013 5:07 pm
Posts: 6
himanshu wrote:
I have a scenario in which I have a TIFF image that has already been OCR-processed. How can I use PDFsharp to create a PDF file that embeds the TIFF image without losing the OCR content?


Hi Thomas,

Any updates?


 Post subject: Re: OCR PDF
Posted: Wed Apr 03, 2013 7:55 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3100
Location: Cologne, Germany
himanshu wrote:
How can I use PDFsharp to create a PDF file that embeds the TIFF image without losing the OCR content?
PDFsharp only adds the image data (in PDF bitmap format).
PDFsharp uses OS functions to read TIFF files. I presume OCR content gets lost at that stage already. I would not know how to add OCR content to the PDF file (but the internals of PDF files are not my area of expertise).

I'm afraid retrieving the OCR content from the TIFF will require major changes to the current code. Making use of this OCR content will require new code.
Long story short: I don't know whether this can be done. If it can, then (I'm afraid) it will be low on our wish list for future extensions.

_________________
Regards
Thomas Hoevel
PDFsharp Team
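
As background on what "OCR content" in a PDF usually means: searchable scanned PDFs generally keep the page image and draw the recognized words on top of it in text rendering mode 3 (the `3 Tr` operator), which makes the text invisible but still selectable and searchable. The sketch below is not PDFsharp code and does not reflect anything the PDFsharp team has confirmed; it only illustrates, in Python, what such a text layer might look like as raw PDF content-stream operators. The `OcrWord` type, the `invisible_text_ops` function, and the font resource name `F1` are all hypothetical, and the sketch assumes the word boxes have already been extracted from the OCR output in points, measured from the top-left corner of the page.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    """Hypothetical OCR result: one recognized word and its box, in points."""
    text: str
    x: float       # left edge, measured from the left of the page
    top: float     # top edge, measured from the top of the page
    height: float  # box height, used here as an approximate font size


def escape_pdf_string(s: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(words, page_height, font_name="F1"):
    """Build content-stream operators that place each word invisibly.

    PDF user space has its origin at the bottom-left corner, so the
    top-based OCR coordinates are flipped against the page height.
    """
    lines = ["BT", "3 Tr"]  # begin text object; mode 3 = neither fill nor stroke
    for w in words:
        baseline = page_height - w.top - w.height
        lines.append(f"/{font_name} {w.height:g} Tf")        # select font and size
        lines.append(f"1 0 0 1 {w.x:g} {baseline:g} Tm")     # position the word
        lines.append(f"({escape_pdf_string(w.text)}) Tj")    # show the text
    lines.append("ET")  # end text object
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [OcrWord("Invoice", 72, 100, 12), OcrWord("(copy)", 130, 100, 12)]
    print(invisible_text_ops(sample, page_height=792))
```

The resulting operators would still have to be written into the page's content stream after the image, with a font registered under the assumed name in the page resources. Text outside that font's encoding needs extra handling that this sketch omits.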

