PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

Trouble reading PDF properties
https://forum.pdfsharp.net/viewtopic.php?f=2&t=3574

Author:  swifty [ Tue Apr 18, 2017 12:48 pm ]
Post subject:  Trouble reading PDF properties

Hi All,

Apologies, complete newbie here. I'm trying to read the Author property of a PDF file, which contains a semicolon-separated list, but I'm having trouble getting all the values and just wondered if I'm doing something wrong? (more than likely :) )

Example:
PDF Author property contains: UserA; UserB; UserC; UserD;

Using C#:
PdfDocument pdfDoc = PdfReader.Open(path);
string pdfAuthor = pdfDoc.Info.Author;
pdfAuthor returns only UserA

Any help would be greatly appreciated.

Author:  Thomas Hoevel [ Tue Apr 18, 2017 1:30 pm ]
Post subject:  Re: Trouble reading PDF properties

Hi!

It could be that the PDF file stores a short and a long author list at different locations and that you see the long list in Adobe Reader while PDFsharp retrieves the short version.

Just speculating as I don't have a PDF file to look at ...

Author:  swifty [ Tue Apr 18, 2017 2:46 pm ]
Post subject:  Re: Trouble reading PDF properties

Hi,

Thanks for the quick response.

Unfortunately I can't send the original file due to its content, but I do know it was created by Acrobat Distiller 9.0.

Also, I can't seem to replicate the Author property using BullZip, as this creates the author list with a set of double quotes around it, which my code reads straight away without issue.

Would you know the location of the short and long author list?

Thanks

Author:  Thomas Hoevel [ Wed Apr 19, 2017 7:56 am ]
Post subject:  Re: Trouble reading PDF properties

swifty wrote:
Would you know the location of the short and long author list?

One field is part of the normal PDF structure. In your case it contains the short list. Adobe Reader shows quotes around this data if certain delimiters are contained in that field.

The other field is part of an XML file with metadata that is embedded in the PDF file (XMP). Adobe Reader does not add quotes when displaying this field. In your case it contains the long list.

At some stage those two fields got out of sync.

PDFsharp does not support XMP yet.

Open the PDF in WordPad or another editor and search for "UserA" to see where it occurs.
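
If you want to do the same search from code, here is a rough sketch in C#. It does not use PDFsharp. It scans the raw file for the embedded XMP packet and reads the dc:creator entries, which is where XMP keeps the author list. The class and method names (XmpAuthorSketch, ExtractXmpPacket, ReadCreators, ReadCreatorsFromFile) are made up for illustration. It assumes the metadata stream is stored uncompressed and UTF-8 encoded, and it simply takes the first packet it finds, so treat it as a starting point rather than a robust parser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of PDFsharp.
public static class XmpAuthorSketch
{
    static readonly XNamespace Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";

    // Returns the first <x:xmpmeta>...</x:xmpmeta> block in the raw file,
    // or null if none is found (e.g. no XMP, or the stream is compressed).
    public static string ExtractXmpPacket(byte[] pdfBytes)
    {
        // Latin-1 maps each byte to exactly one char, so the binary parts
        // of the PDF cannot break the text search.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        string raw = latin1.GetString(pdfBytes);

        const string endTag = "</x:xmpmeta>";
        int start = raw.IndexOf("<x:xmpmeta", StringComparison.Ordinal);
        if (start < 0) return null;
        int end = raw.IndexOf(endTag, start, StringComparison.Ordinal);
        if (end < 0) return null;

        string packet = raw.Substring(start, end - start + endTag.Length);
        // Assumption: the packet is UTF-8, so re-decode the same bytes.
        return Encoding.UTF8.GetString(latin1.GetBytes(packet));
    }

    // Reads every rdf:li entry below dc:creator, in document order.
    public static List<string> ReadCreators(string xmpPacket)
    {
        return XDocument.Parse(xmpPacket)
            .Descendants(Dc + "creator")
            .Descendants(Rdf + "li")
            .Select(li => li.Value.Trim())
            .ToList();
    }

    // Convenience wrapper: empty list when the file has no readable XMP packet.
    public static List<string> ReadCreatorsFromFile(string path)
    {
        string xmp = ExtractXmpPacket(File.ReadAllBytes(path));
        return xmp == null ? new List<string>() : ReadCreators(xmp);
    }
}
```

Calling XmpAuthorSketch.ReadCreatorsFromFile(path) and comparing the result with pdfDoc.Info.Author should show whether the two fields really are out of sync in your file.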

All times are UTC