PDFsharp & MigraDoc Foundation :: View topic - Need help with Hexadecimal strings

PDFsharp & MigraDoc Foundation https://forum.pdfsharp.net/

Need help with Hexadecimal strings https://forum.pdfsharp.net/viewtopic.php?f=2&t=3519

Author:	jma [ Wed Dec 28, 2016 3:15 pm ]
Post subject:	Need help with Hexadecimal strings
Hi! PDFsharp is a very cool library, but I have a problem extracting text. The PDF reference explains: "Strings may also be written in hexadecimal form". I get hexadecimal strings when extracting text, but the data differs from the original text. For example, my PDF contains "Ajout d’une langue à un projet.", and when I extract the text with PDFsharp I get this: "00040169017D01B5019A0003011A035B01B50176011E0003016F01020176015001B5011E00030103000301B5017600030189018C017D0169011E019A0358". I can't recover the original text. When I convert this to ASCII, it gives me "i}µ[µvovPµµv}iX", which is totally different from the original text. Does anyone have a clue what the issue is and how to fix it? Thanks in advance for your reply!

Author:	TH-Soft [ Wed Dec 28, 2016 4:13 pm ]
Post subject:	Re: Need help with Hexadecimal strings
Hi! PDF files often contain only a subset of a Unicode font, and there should be a mapping table that lets you translate the indexes in the hex string to Unicode values. Can you copy the text to the clipboard using Adobe Reader?

Author:	jma [ Thu Dec 29, 2016 3:40 pm ]
Post subject:	Re: Need help with Hexadecimal strings
Yes, I can copy the text to the clipboard using Adobe Reader. I finally found the mapping table! I just have to use it to convert the hexadecimal form. Thanks for your answer, it helped me.
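
For anyone landing here with the same problem, here is a rough sketch of the conversion in Python (not PDFsharp code). It assumes the hex string holds 2-byte character codes, so it has to be read four hex digits at a time rather than byte by byte. The `CODE_TO_UNICODE` table is not taken from the PDF: it is a hypothetical stand-in, inferred by lining up the 31 codes in the posted hex string with the 31 characters of the known original text. In a real file the table would come from the font's mapping table (typically a ToUnicode CMap), and the names `split_codes` and `decode` are made up for this example.

```python
# Hex string exactly as posted in the question, split over lines for readability.
HEX_TEXT = (
    "00040169017D01B5019A0003011A035B"
    "01B50176011E0003016F010201760150"
    "01B5011E00030103000301B501760003"
    "0189018C017D0169011E019A0358"
)

# ASSUMPTION: illustrative table inferred by aligning the posted codes with the
# known original text. A real table must be read from the font's mapping table.
CODE_TO_UNICODE = {
    0x0003: " ",
    0x0004: "A",
    0x0102: "a",
    0x0103: "\u00e0",  # a with grave accent
    0x011A: "d",
    0x011E: "e",
    0x0150: "g",
    0x0169: "j",
    0x016F: "l",
    0x0176: "n",
    0x017D: "o",
    0x0189: "p",
    0x018C: "r",
    0x019A: "t",
    0x01B5: "u",
    0x0358: ".",
    0x035B: "\u2019",  # right single quotation mark
}


def split_codes(hex_text, width=4):
    """Split a hex string into integer codes of `width` hex digits each."""
    if len(hex_text) % width != 0:
        raise ValueError("hex string length is not a multiple of the code width")
    return [int(hex_text[i:i + width], 16) for i in range(0, len(hex_text), width)]


def decode(hex_text, table):
    """Translate each code through the table; unknown codes become U+FFFD."""
    return "".join(table.get(code, "\ufffd") for code in split_codes(hex_text))


if __name__ == "__main__":
    print(decode(HEX_TEXT, CODE_TO_UNICODE))
```

Reading the string one byte at a time is what produces the garbage in the first post: every code is two bytes long, so a byte-wise ASCII conversion only sees the low byte of each code plus unprintable high bytes.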

All times are UTC