PDFsharp & MigraDoc Forum

PDFsharp - A .NET library for processing PDF & MigraDoc - Creating documents on the fly

All times are UTC


Posted: Thu Oct 16, 2014 1:54 am

Joined: Thu Oct 16, 2014 1:48 am
Posts: 2
I'm new to this API and haven't found much documentation or other topics on basic manipulation of an existing file, given the multitude of classes and objects provided.

I simply need to load an existing PDF (I've figured that much out) and search for one or more words, eventually returning the sentence or sentences containing them.

I've played around with the classes I suspect are involved, such as Pdf.Content.Objects, but haven't gotten anywhere meaningful.

If I could read a stream of text from a PDF and work with it in my own program, I would be happy. Any pointers appreciated.

Thanks
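
Once plain text has been obtained from the PDF by some other tool, the search step described above is ordinary string processing. A minimal sketch in Python, assuming the text is already available as a single string; the function name `sentences_containing` and the sample text are hypothetical, and the sentence splitting is deliberately naive (it will mis-split abbreviations such as "e.g."):

```python
import re

def sentences_containing(text, word):
    """Return the sentences in `text` that contain `word` as a whole word."""
    # Collapse line breaks and repeated spaces left over from extraction.
    normalized = " ".join(text.split())
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", normalized)
    # Case-insensitive whole-word match; re.escape guards special characters.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Hypothetical sample standing in for text extracted from a PDF.
sample = "PDF files store glyphs.\nText order is not guaranteed. Search needs plain text!"
matches = sentences_containing(sample, "text")
```

Here `matches` holds the second and third sentences of the sample. The same logic translates directly to VB.NET using `System.Text.RegularExpressions.Regex`.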


Posted: Thu Oct 16, 2014 7:33 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3144
Location: Cologne, Germany
PDFsharp was not designed to extract text.

Related posts you can look at:
viewtopic.php?p=1603#p1603
viewtopic.php?p=4010#p4010

_________________
Regards
Thomas Hoevel
PDFsharp Team


Posted: Thu Oct 16, 2014 2:56 pm

Joined: Tue Jul 15, 2014 8:56 pm
Posts: 13
I use the PDFium library for text extraction. It takes some time and effort to set up: you get a C++ DLL that needs some work to call from .NET code, but once the interface is in place it works very well. PDFsharp is meant for creating, editing, splitting, and saving documents, not for viewing or, unfortunately, text extraction. The raw data is there, but you need to be able to render the document to work out the positions of the characters and then extract them in the correct order.
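
To illustrate why positions matter, here is a rough Python sketch of the reordering step only. It assumes the glyphs have already been decoded and positioned by a renderer and are given as hypothetical `(x, y, char)` tuples with y increasing towards the top of the page; it is not how PDFium or any particular library does it, and it ignores word spacing, rotation, and multi-column layouts.

```python
def glyphs_to_text(glyphs, line_tolerance=2.0):
    """Join (x, y, char) glyphs into text in reading order."""
    lines = []  # each entry: [y of the line's first glyph, [(x, char), ...]]
    # Visit glyphs from the top of the page down (larger y first).
    for x, y, ch in sorted(glyphs, key=lambda g: -g[1]):
        # Glyphs whose y is within the tolerance share the current line.
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, ch))
        else:
            lines.append([y, [(x, ch)]])
    # Within each line, read left to right.
    return "\n".join("".join(ch for _, ch in sorted(line)) for _, line in lines)

# Hypothetical glyphs listed out of reading order, as they might appear in a content stream.
glyphs = [(10, 700, "H"), (30, 688, "o"), (20, 700.5, "i"), (10, 688, "n")]
text = glyphs_to_text(glyphs)
```

For the sample glyphs, `text` is "Hi" on the first line and "no" on the second, even though the input lists the characters in a different order.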


Posted: Thu Oct 16, 2014 3:10 pm

Joined: Thu Oct 16, 2014 1:48 am
Posts: 2
OK, thanks for the replies.

Is anybody aware of any other open-source libraries that are compatible with VB.NET? I will look into PDFium. I have found a couple of other libraries that require licenses; I would like to avoid that if I can.

Thanks

