PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

replace a string by another on the PDF
https://forum.pdfsharp.net/viewtopic.php?f=2&t=1464

Author:  abbd [ Tue Dec 07, 2010 3:09 pm ]
Post subject:  replace a string by another on the PDF

Hello,

I would like to replace one string with another in a PDF. Is that possible? Thank you very much.

Author:  abbd [ Wed Dec 08, 2010 1:28 pm ]
Post subject:  Re: replace a string by another on the PDF

Hello,

I haven't received an answer to my question. Is it possible?
Thank you very much.

Author:  Thomas Hoevel [ Wed Dec 08, 2010 3:26 pm ]
Post subject:  Re: replace a string by another on the PDF

Possible? Yes.
Complicated? Yes.

See here:
viewtopic.php?p=3816#p3816

Author:  abbd [ Wed Dec 08, 2010 3:41 pm ]
Post subject:  Re: replace a string by another on the PDF

Thomas Hoevel wrote:
Possible? Yes.
Complicated? Yes.

See here:
viewtopic.php?p=3816#p3816


Hello,
Thank you very much for your answer. I don't have a problem extracting text from a PDF; my problem is replacing one string with another. For example, I would like to replace all "M." with "Mme." in my PDF. Is that possible?
Thank you.

Author:  jeffhare [ Wed Dec 08, 2010 4:44 pm ]
Post subject:  Re: replace a string by another on the PDF

Hello,

I'm no PDF Expert... but I think...

In pdf, each letter in a word could be written with a different font or style and would become its own 'element' in the document stream even though it looks like a single word when rendered.
This could make it difficult to search for a specific word. In this case I think you might have to assemble the document to know which characters are rendered next to each other, and in which order to do this search/replace accurately. (especially if the document were edited more than once given the way the catalogs and edits get applied by Acrobat.) My guess here...
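
The splitting problem described above can be illustrated without any PDF library. The following is a minimal sketch in plain C#; the two content-stream fragments are hand-written, hypothetical examples (not taken from a real file), and the helper names NaiveContains and VisibleText are made up for this illustration. Both fragments would render the same visible text, but only one of them contains the search term as a contiguous run of bytes.

```csharp
using System;
using System.Text;

static class SplitTextDemo
{
    // Hypothetical fragments: both draw the visible text "M. Dupont".
    // The first uses one literal string, the second a TJ array whose
    // pieces are separated by kerning adjustments.
    public const string SingleString = "BT /F1 12 Tf 72 720 Td (M. Dupont) Tj ET";
    public const string KernedArray = "BT /F1 12 Tf 72 720 Td [(M) -20 (. Dup) 15 (ont)] TJ ET";

    // Naive search: looks for the term as a contiguous substring.
    public static bool NaiveContains(string contentStream, string term)
    {
        return contentStream.Contains(term);
    }

    // Concatenates everything between '(' and ')'.
    // Simplification: ignores escape sequences, nested parentheses,
    // hex strings and font encodings, all of which real PDFs may use.
    public static string VisibleText(string contentStream)
    {
        var sb = new StringBuilder();
        bool inside = false;
        foreach (char c in contentStream)
        {
            if (!inside && c == '(') inside = true;
            else if (inside && c == ')') inside = false;
            else if (inside) sb.Append(c);
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(NaiveContains(SingleString, "M. ")); // True
        Console.WriteLine(NaiveContains(KernedArray, "M. "));  // False
        Console.WriteLine(VisibleText(KernedArray));           // M. Dupont
    }
}
```

In the second fragment the term only appears once the pieces are reassembled, which is why a plain search/replace on the raw stream can miss text that is clearly visible on the page.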

If you control how the document was created, you could probably do string replacements reasonably successfully by opening it with the PdfReader class and iterating over the document's contents, as long as all the search terms were written using the same font and style and were not manually edited afterwards.

I do something similar, but I'm processing Annotations (hyperlinks) which are a bit easier to find and update. I wish I had some code to pass along, but I don't.

Perhaps you should look at the document explorer example in the Samples and see how it parses out the contents of a word document for display and see if you can use these techniques to solve your problem. I'm not sure offhand what the name of that project file is, but if you load the Samples master solution, you should be able to find it.

Let us know what you find out!

-Jeff

Author:  Marthalion [ Tue Aug 09, 2011 2:24 pm ]
Post subject:  Re: replace a string by another on the PDF

Bit of a necro post coming up, but I never found the answer to this question on this forum myself.

I was stuck on this problem for a while but finally solved it using PDFsharp. The trick is to read the stream from page.Contents.Elements.GetDictionary().Stream, convert it into a string, perform string.Replace() on all the parts you need, then convert the new string back into bytes and store them in page.Contents.Elements.GetDictionary().Stream.Value.

Code:

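using System.Text;
using PdfSharp.Pdf;

// Assumption: importDoc was opened in import mode and exportDoc is a new,
// empty PdfDocument. This approach presumably only works when the content
// streams are uncompressed and the text is stored as plain literal strings.
for (int i = 0; i < importDoc.PageCount; i++)
{
    PdfPage newPage = importDoc.Pages[i];

    for (int j = 0; j < newPage.Contents.Elements.Count; j++)
    {
        PdfDictionary.PdfStream stream = newPage.Contents.Elements.GetDictionary(j).Stream;

        // Convert the raw bytes into a string, one char per byte.
        // A fresh builder per stream keeps earlier streams from leaking in.
        byte[] inStream = stream.Value;
        StringBuilder builder = new StringBuilder(inStream.Length);
        foreach (byte b in inStream)
            builder.Append((char)b);

        // Replace the placeholder; "replacement" stands for your new text.
        string stringStream = builder.ToString().Replace("tag", "replacement");

        // Encode the modified string and write it back into the stream.
        byte[] outStream = PdfDictionary.PdfStream.RawEncode(stringStream);
        stream.Value = outStream;
    }

    newPage = exportDoc.AddPage(newPage);
}

exportDoc.Save("Path.pdf");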

Enjoy!
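
The replace step itself can be tried in isolation, without PDFsharp. The following is a minimal sketch under stated assumptions: the bytes are already uncompressed, the text is stored as a single literal string, and the sample content stream is a hand-written, hypothetical fragment. The helper name ReplaceInStream is made up for this illustration. ISO-8859-1 is used because it maps each byte to the char with the same code, which has the same effect as the (char)b cast in the code above.

```csharp
using System;
using System.Text;

static class ContentStreamReplace
{
    // ISO-8859-1 round-trips every byte value 0..255 unchanged.
    public static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Decodes the raw bytes, replaces the search term, and re-encodes.
    public static byte[] ReplaceInStream(byte[] raw, string search, string replacement)
    {
        string text = Latin1.GetString(raw);
        string replaced = text.Replace(search, replacement);
        return Latin1.GetBytes(replaced);
    }

    static void Main()
    {
        // Hypothetical uncompressed content stream with one literal string.
        byte[] raw = Latin1.GetBytes("BT /F1 12 Tf 72 720 Td (Dear M. Dupont) Tj ET");

        byte[] result = ReplaceInStream(raw, "M. ", "Mme. ");

        // Prints: BT /F1 12 Tf 72 720 Td (Dear Mme. Dupont) Tj ET
        Console.WriteLine(Latin1.GetString(result));
    }
}
```

Two cautions apply to this kind of replacement. The search runs over operators as well as text, so a short search term can accidentally match something like an operator or font name. And because a PDF page has a fixed layout, a longer replacement string does not reflow: text that follows on the same line stays where it was positioned.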

Author:  RalphKoerber [ Wed Oct 02, 2013 1:14 pm ]
Post subject:  Re: replace a string by another on the PDF

Hello

I am completely new to this topic and have to replace a placeholder in a PDF file.
Please could you make a sample project for me available (C# or VB.NET)?

Kind regards,
Ralph Koerber

Author:  quantumkev [ Fri Jan 03, 2014 8:07 pm ]
Post subject:  Re: replace a string by another on the PDF

Ralph - I am looking to do this very same thing... anyone have any ideas..?

q-kev

All times are UTC