PDFsharp & MigraDoc Foundation https://forum.pdfsharp.net/ |
Need help with Hexadecimal strings https://forum.pdfsharp.net/viewtopic.php?f=2&t=3519 |
Author: | jma [ Wed Dec 28, 2016 3:15 pm ] |
Post subject: | Need help with Hexadecimal strings |
Hi! PDFsharp is a very cool library, but I have a problem with extracting text. The PDF reference explains: "Strings may also be written in hexadecimal form". I get hexadecimal strings after extracting text, but the data differs from the original text. For example: my PDF contains "Ajout d’une langue à un projet." and when I extract the text with PDFsharp, I get this: "00040169017D01B5019A0003011A035B01B50176011E0003016F01020176015001B5011E00030103000301B5017600030189018C017D0169011E019A0358". I can't recover the original text from it. Indeed, when I convert this to ASCII, it gives me "i}µ[µvovPµµv}iX", which is totally different from the original text. Does anyone have a clue what the issue is and how to fix it? Thanks in advance for your reply! |
Author: | TH-Soft [ Wed Dec 28, 2016 4:13 pm ] |
Post subject: | Re: Need help with Hexadecimal strings |
Hi! PDF files often contain only a subset of a Unicode font, and there should be a mapping table that allows you to translate the indexes from the hex string to Unicode values. Can you copy the text to the clipboard using Adobe Reader? |
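For illustration: in a PDF, a mapping table of this kind is usually stored in the font's ToUnicode CMap stream. The Python sketch below parses the simple `bfchar` entries of such a stream into a lookup table. It assumes the stream has already been decompressed to text. `SAMPLE_CMAP` and `parse_bfchar` are hypothetical names made up for this example, the sample entries are not taken from the poster's file, and `bfrange` sections are deliberately not handled.

```python
import re

# Hypothetical, hand-written CMap fragment (not from the poster's PDF).
# Each bfchar line maps a source code to a UTF-16BE encoded Unicode string.
SAMPLE_CMAP = """
3 beginbfchar
<0003> <0020>
<0004> <0041>
<0169> <006A>
endbfchar
"""

def parse_bfchar(cmap_text):
    """Return {source code: Unicode string} for all bfchar entries."""
    table = {}
    # A CMap may contain several beginbfchar ... endbfchar sections.
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", section):
            table[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return table

print(parse_bfchar(SAMPLE_CMAP))
```

A complete reader would also need to handle `bfrange` entries and locate the right font for each text run; this sketch only shows the shape of the table.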
Author: | jma [ Thu Dec 29, 2016 3:40 pm ] |
Post subject: | Re: Need help with Hexadecimal strings |
Yes, I can copy the text to the clipboard using Adobe Reader. I finally found the mapping table! I just have to use it to convert the hex form. Thanks for your answer, it helped me. |
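As a rough sketch of that conversion in Python: the hex string is read as a sequence of 2-byte codes, and each code is looked up in the mapping table. The table below is an assumption, inferred only by lining up the posted hex string with the known text; it is not the actual table from the PDF. The names `split_codes` and `decode` are made up for this example.

```python
# Hex string exactly as posted; every 4 hex digits form one 2-byte code.
HEX_TEXT = (
    "00040169017D01B5019A0003011A035B01B50176011E0003016F01020176015001B5011E"
    "00030103000301B5017600030189018C017D0169011E019A0358"
)

# Assumption: inferred by aligning the codes with "Ajout d’une langue à un projet."
# A real extractor would read this table from the PDF instead of hard-coding it.
GLYPH_TO_UNICODE = {
    0x0003: " ", 0x0004: "A", 0x0102: "a", 0x0103: "\u00e0",  # a with grave accent
    0x011A: "d", 0x011E: "e", 0x0150: "g", 0x0169: "j", 0x016F: "l",
    0x0176: "n", 0x017D: "o", 0x0189: "p", 0x018C: "r", 0x019A: "t",
    0x01B5: "u", 0x0358: ".", 0x035B: "\u2019",  # right single quotation mark
}

def split_codes(hex_text, code_bytes=2):
    """Split a PDF hex string into big-endian integer codes."""
    raw = bytes.fromhex(hex_text)
    return [int.from_bytes(raw[i:i + code_bytes], "big")
            for i in range(0, len(raw), code_bytes)]

def decode(hex_text, table):
    """Translate each code through the table; unknown codes become U+FFFD."""
    return "".join(table.get(code, "\ufffd") for code in split_codes(hex_text))

print(decode(HEX_TEXT, GLYPH_TO_UNICODE))
```

Reading the same string one byte at a time, as in the ASCII attempt, only shows the low byte of each code, which is why the result looked unrelated to the original text.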
All times are UTC |