PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

Get Text from a specific position
https://forum.pdfsharp.net/viewtopic.php?f=2&t=3738

Author:  sri79 [ Thu Mar 08, 2018 12:26 pm ]
Post subject:  Get Text from a specific position

Hi,

How can I get the text at a specific XRect position?

So far I have the code below, which adds text at a specific XRect position, and it works well. However, I would like to read the existing text at that position and choose the new text based on it.

Code:
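// Namespaces used: PdfSharp.Drawing, PdfSharp.Drawing.Layout, PdfSharp.Pdf, PdfSharp.Pdf.IO
using (PdfDocument InputDocument = PdfReader.Open(filePath, PdfDocumentOpenMode.Modify))
{
    // The font and the target rectangle are the same on every page, so create them once.
    XFont font = new XFont("Courier", 10, XFontStyle.Regular);
    XRect rect = new XRect(505, 38.5, 76, 10); // x, y, width, height

    for (int i = 0; i < InputDocument.Pages.Count; i++)
    {
        PdfPage page = InputDocument.Pages[i];

        // Dispose the XGraphics object once drawing on this page is finished.
        using (XGraphics gfx = XGraphics.FromPdfPage(page))
        {
            XTextFormatter tf = new XTextFormatter(gfx);
            tf.Alignment = XParagraphAlignment.Left;

            // Cover the old text with a white box, then draw the new text on top of it.
            gfx.DrawRectangle(XBrushes.White, rect);
            tf.DrawString("some text based upon existing text", font, XBrushes.Black, rect, XStringFormats.TopLeft);
        }
    }

    InputDocument.Save("out.pdf");
}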


Please help.

Thanks

Author:  Thomas Hoevel [ Thu Mar 08, 2018 1:42 pm ]
Post subject:  Re: Get Text from a specific position

Hi!

PDFsharp was not designed to extract text.

You can search this site or the web for "text extract" and maybe look at this package:
https://www.nuget.org/packages/PdfTextract/

The task is simpler if you deal with PDF files coming from just one application.

Author:  sri79 [ Fri Mar 09, 2018 4:11 am ]
Post subject:  Re: Get Text from a specific position

OK, thanks. Yes, the PDF files we are trying to edit all come from one source, and the position is the same in every file. Basically, I am trying to update the page numbers after multiple PDF files have been appended. The page numbers may have a prefix or suffix, so I need to read the existing page number and update it accordingly.
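
To illustrate the renumbering step only, here is a minimal sketch. It assumes the existing label has already been obtained as a string by some text-extraction tool (PDFsharp does not do that part), that the page number is the last run of digits in the label, and that each appended file just needs its numbers shifted by a fixed offset. The names PageLabel and Renumber are hypothetical and not part of PDFsharp.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of PDFsharp: shifts the page number inside a label
// while keeping whatever prefix and suffix surround it.
public static class PageLabel
{
    // Assumption: the page number is the LAST run of ASCII digits in the label,
    // e.g. "Doc2-17" -> prefix "Doc2-", number 17, empty suffix.
    private static readonly Regex LastNumber = new Regex(
        @"^(?<prefix>.*?)(?<number>[0-9]+)(?<suffix>[^0-9]*)$",
        RegexOptions.Singleline);

    // Returns the label with its number increased by offset.
    // Labels without any digits are returned unchanged.
    // Note: leading zeros are not preserved ("007" is treated as 7).
    public static string Renumber(string existingLabel, int offset)
    {
        Match m = LastNumber.Match(existingLabel);
        if (!m.Success)
        {
            return existingLabel;
        }

        int oldNumber = int.Parse(m.Groups["number"].Value, CultureInfo.InvariantCulture);
        string newNumber = (oldNumber + offset).ToString(CultureInfo.InvariantCulture);
        return m.Groups["prefix"].Value + newNumber + m.Groups["suffix"].Value;
    }
}
```

The returned string could then be passed to tf.DrawString in place of the fixed text in the code from the first post. A label such as "Page 3 of 7" would need a different pattern, because this sketch would change the 7 and not the 3.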

All times are UTC