Author |
Message |
Forum: Support Topic: find text and get its coordinates |
TH-Soft |
Posted: Wed Nov 06, 2024 5:15 pm
|
|
Replies: 2 Views: 2571
|
... same application, but can become very difficult if it should work with PDFs from any application. https://forum.pdfsharp.net/search.php?keywords=extracttext&terms=all&author=&sc=1&sf=all&sk=t&sd=d&sr=posts&st=0&ch=300&t=0&submit=Search |
|
|
Forum: Support Topic: ContentReader.ReadContent not working for special signs |
peter.pazurik |
Posted: Wed Dec 16, 2020 12:47 pm
|
|
Replies: 12 Views: 18484
|
PDFSharp was the only tool that gave me a way to distinguish between table cells. At least in my PDF documents... In my ExtractText method I just replaced all Tj operators with '|', and when I get, for example, " |xxx| ", I know that I have the text from one cell. Of course ... |
|
|
Forum: Support Topic: ContentReader.ReadContent not working for special signs |
peter.pazurik |
Posted: Wed Dec 16, 2020 12:27 pm
|
|
Replies: 12 Views: 18484
|
... = PdfReader.Open(filePath, PdfDocumentOpenMode.ReadOnly)) { var result = new StringBuilder(); foreach (PdfPage page in _document.Pages) { ExtractText(ContentReader.ReadContent(page), result); result.AppendLine(); } } Within the ExtractText method I am just parsing through elements in CSequence ... |
|
|
Forum: Support Topic: Please some help to extract text from PDF page. |
Tassadar |
Posted: Tue Jun 25, 2019 7:46 am
|
|
Replies: 7 Views: 39307
|
... stream = SamplePage.Contents.Elements.GetDictionary(0).Stream; var content = ContentReader.ReadContent(SamplePage); var text = PdfSharpExtensions.ExtractText(content); And also this one: PdfDocument SamplePdf = PdfReader.Open(@"T:\samplepdf.pdf", PdfDocumentOpenMode.ReadOnly); PdfPage ... |
|
|
Forum: Support Topic: PDF pages are blank after splitting with PDFSharp |
AgileDotnetter |
Posted: Fri Feb 05, 2016 3:26 pm
|
|
Replies: 7 Views: 11094
|
... GetEobFirstPageIndicesFromCombinedPdf(PdfDocument combinedPdf) { for (int i = 0; i < combinedPdf.PageCount; i++) { var text = combinedPdf.Pages[i].ExtractText(); if (text.Contains(newPageToken)) yield return i; } } private IEnumerable<EobPdf> SplitIntoEobPackets(PdfDocument combinedPdf, int[] firstPageIndices) ... |
|
|