PDFsharp & MigraDoc Foundation • View topic - replace a string by another on the PDF


All times are UTC


replace a string by another on the PDF

Moderator: Stefan Lange

Page 1 of 1

[ 8 posts ]


abbd

Post subject: replace a string by another on the PDF

Posted: Tue Dec 07, 2010 3:09 pm

Joined: Mon Mar 09, 2009 11:19 am
Posts: 12

Hello,

I would like to replace one string with another in a PDF. Is that possible? Thank you very much.


abbd

Post subject: Re: replace a string by another on the PDF

Posted: Wed Dec 08, 2010 1:28 pm

Joined: Mon Mar 09, 2009 11:19 am
Posts: 12

Hello,

I haven't received an answer to my question. Is it possible?
Thank you very much.


Thomas Hoevel

Post subject: Re: replace a string by another on the PDF

Posted: Wed Dec 08, 2010 3:26 pm

PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3109
Location: Cologne, Germany

Possible? Yes.
Complicated? Yes.

See here:
viewtopic.php?p=3816#p3816

_________________
Regards
Thomas Hoevel
PDFsharp Team


abbd

Post subject: Re: replace a string by another on the PDF

Posted: Wed Dec 08, 2010 3:41 pm

Joined: Mon Mar 09, 2009 11:19 am
Posts: 12

Thomas Hoevel wrote:

Possible? Yes.
Complicated? Yes.

See here:
viewtopic.php?p=3816#p3816

Hello,
Thank you very much for your answer. I have no problem extracting text from a PDF; my problem is replacing one string with another. For example, I would like to replace every M. with Mme. in my PDF. Is that possible?
Thank you.


jeffhare

Post subject: Re: replace a string by another on the PDF

Posted: Wed Dec 08, 2010 4:44 pm

Supporter

Joined: Thu May 27, 2010 7:40 pm
Posts: 59
Location: New Hampshire, USA

Hello,

I'm no PDF Expert... but I think...

In pdf, each letter in a word could be written with a different font or style and would become its own 'element' in the document stream even though it looks like a single word when rendered.
This could make it difficult to search for a specific word. In this case I think you might have to assemble the document to know which characters are rendered next to each other, and in which order to do this search/replace accurately. (especially if the document were edited more than once given the way the catalogs and edits get applied by Acrobat.) My guess here...

If you control how the document was created, I would think you could be reasonably successful doing string replacements by opening it with the PdfReader class and iterating over the document's contents, as long as all the search terms were written using the same font and style and were not manually edited afterwards.
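
To illustrate why the search can miss a word, here is a minimal sketch in plain C# with no PDFsharp calls. The two content-stream fragments are made up for the example and simplified, and ContainsLiteral is a hypothetical helper, not part of any library. Both fragments would show the text "Hello", but only the first stores it as a single string.

```csharp
using System;

static class SplitTextDemo
{
    // Hypothetical helper: looks for the word as one PDF string literal, e.g. "(Hello)".
    public static bool ContainsLiteral(string contentStream, string word)
    {
        return contentStream.Contains("(" + word + ")");
    }

    static void Main()
    {
        // Made-up fragment: the word is written as a single string with the Tj operator.
        string oneRun = "BT /F1 12 Tf (Hello) Tj ET";

        // Made-up fragment: the same word split in two pieces with a kerning
        // adjustment in between, written with the TJ operator.
        string kerned = "BT /F1 12 Tf [(He) -20 (llo)] TJ ET";

        Console.WriteLine(ContainsLiteral(oneRun, "Hello")); // True
        Console.WriteLine(ContainsLiteral(kerned, "Hello")); // False
    }
}
```

A plain search and replace only has a chance in the first case, which is why it matters how the document was produced.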

I do something similar, but I'm processing Annotations (hyperlinks) which are a bit easier to find and update. I wish I had some code to pass along, but I don't.

Perhaps you should look at the document explorer example in the Samples and see how it parses out the contents of a word document for display and see if you can use these techniques to solve your problem. I'm not sure offhand what the name of that project file is, but if you load the Samples master solution, you should be able to find it.

Let us know what you find out!

-Jeff


Marthalion

Post subject: Re: replace a string by another on the PDF

Posted: Tue Aug 09, 2011 2:24 pm

Joined: Tue Aug 09, 2011 2:00 pm
Posts: 1

Bit of a necro post coming up, but I never found the answer to this question on this forum myself.

I was stuck on this problem for a while but finally solved it using PDFsharp. The trick is to read the stream from page.Contents.Elements.GetDictionary().Stream, convert it into a string, perform string.Replace() on all the parts you need, then convert the new string back into bytes and save them to page.Contents.Elements.GetDictionary().Stream.Value.

Code:

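using System.Text;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// importDoc and exportDoc are not shown in the original post. Assumed setup:
// PdfDocument importDoc = PdfReader.Open("Input.pdf", PdfDocumentOpenMode.Import);
// PdfDocument exportDoc = new PdfDocument();

for (int i = 0; i < importDoc.PageCount; i++)
{
    PdfPage page = importDoc.Pages[i];

    for (int j = 0; j < page.Contents.Elements.Count; j++)
    {
        PdfDictionary.PdfStream stream = page.Contents.Elements.GetDictionary(j).Stream;

        // Turn the stream bytes into a string, one char per byte.
        // This only gives readable text if the stream is not compressed.
        byte[] inStream = stream.Value;
        StringBuilder text = new StringBuilder(inStream.Length);
        foreach (byte b in inStream)
            text.Append((char)b);

        // "tag" is the text to find; "replacement" is a placeholder for the new text.
        string replaced = text.ToString().Replace("tag", "replacement");

        // Write the modified content back into the same stream.
        stream.Value = PdfDictionary.PdfStream.RawEncode(replaced);
    }

    exportDoc.AddPage(page);
}

exportDoc.Save("Path.pdf");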
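
The byte-to-string round trip can also be tried on its own, without a PDF file. This is only a sketch under assumptions: the content fragment is made up, ReplaceInBytes is a hypothetical helper, and ISO-8859-1 is used because it maps every byte to the char with the same code and back. It only works when the stream is uncompressed and the text is stored as plain single-byte strings.

```csharp
using System;
using System.Text;

static class StreamTextDemo
{
    // ISO-8859-1 (Latin-1) maps bytes 0..255 one-to-one to chars U+0000..U+00FF.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Hypothetical helper: decode the bytes, replace the text, encode again.
    public static byte[] ReplaceInBytes(byte[] content, string find, string replaceWith)
    {
        string text = Latin1.GetString(content);
        return Latin1.GetBytes(text.Replace(find, replaceWith));
    }

    static void Main()
    {
        // Made-up, uncompressed content-stream fragment.
        byte[] input = Latin1.GetBytes("BT /F1 12 Tf (Dear M. Dupont) Tj ET");

        byte[] output = ReplaceInBytes(input, "M.", "Mme.");

        // Prints: BT /F1 12 Tf (Dear Mme. Dupont) Tj ET
        Console.WriteLine(Latin1.GetString(output));
    }
}
```

Note that string.Replace() changes every occurrence of the search text in the stream, including any that are not visible text, so a distinctive placeholder is safer than a short string.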

Enjoy!


RalphKoerber

Post subject: Re: replace a string by another on the PDF

Posted: Wed Oct 02, 2013 1:14 pm

Joined: Wed Oct 02, 2013 1:08 pm
Posts: 1

Hello

I am completely new to this topic and have to replace a placeholder in a PDF file.
Could you please make a sample project available for me (C# or VB.NET)?

Kind regards,
Ralph Koerber


quantumkev

Post subject: Re: replace a string by another on the PDF

Posted: Fri Jan 03, 2014 8:07 pm

Joined: Thu Jan 02, 2014 5:51 pm
Posts: 3

Ralph - I am looking to do this very same thing... anyone have any ideas..?

q-kev
