PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


Posted: Mon Sep 05, 2016 3:00 pm

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
Hi there, I am currently using the following code to get the X and Y coordinates of a specific AcroField in a PDF document:

Code:
PdfTextField imageField = (PdfTextField)inForm.Fields[elementName];
PdfRectangle rect = imageField.Elements.GetRectangle(PdfAnnotation.Keys.Rect);


This works fine if only one field with that name is present in the PDF document. However, if two fields are both named "FirstName", even on separate pages, the "/Rect" and "/P" entries seem to disappear from the field, so I cannot use them to find the field's position or page.

Is there any other way to get the position of a field in the PDF, or any way to get at the "/Rect" and "/P" entries?

Thanks, Conor

I am using version 1.50.4000-Beta3B of the PDFsharp NuGet package.


Posted: Mon Sep 05, 2016 5:54 pm

Joined: Tue Sep 30, 2014 12:29 pm
Posts: 17
Make sure to check the complete field hierarchy (up and down) when searching for the /Rect and /P entries.
A single field may have multiple annotations.
For example, you fill a field on one page and another field (an annotation in this case) on a different page gets "automagically" filled, too.
That would be a single field with two (or more) annotations as children.
These annotations "should" have /Rect and /P entries.

HTH
Thomas


Posted: Tue Sep 06, 2016 8:06 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
(void) wrote:
Make sure to check the complete field hierarchy (up and down) when searching for the /Rect and /P entries.
A single field may have multiple annotations.
For example, you fill a field on one page and another field (an annotation in this case) on a different page gets "automagically" filled, too.
That would be a single field with two (or more) annotations as children.
These annotations "should" have /Rect and /P entries.

HTH
Thomas


Ahh, that would make sense then. I assume they get grouped under one entry, and each child has a different position. I'm not quite sure how to get access to the child elements, though. I currently have:
Code:
PdfTextField imageField = (PdfTextField)inForm.Fields[elementName];

And I can see from the debugger that this field has two children (as there are two instances of it with the same name in the PDF). Do I use the "/Kids" entry in some way?

Thanks, RBrNx


EDIT:
So I ended up managing to access the Kids elements with the following code:
Code:
PdfArray kids = (PdfArray)imageField.Elements["/Kids"];
PdfReference childRef = (PdfReference)kids.Elements[0];
PdfDictionary childDict = (PdfDictionary)childRef.Value;
rect = childDict.Elements.GetRectangle(PdfAnnotation.Keys.Rect);


But this seems quite long-winded. Is there a better way to get access to the elements of the children?


Posted: Wed Sep 07, 2016 7:45 pm

Joined: Tue Sep 30, 2014 12:29 pm
Posts: 17
Did myField.Fields not work for you?

That should give you the children of myField.

In a different thread I posted a patch that partly does (read: "should do") what you're looking for.
(viewtopic.php?f=2&t=3369)

Excerpt from the code (PdfAcroField.cs):
Code:
        /// <summary>
        /// Gets a reference to the Page object this field belongs to
        /// </summary>
        public PdfReference PageReference
        {
            get
            {
                if (pageReference == null)
                    DeterminePage();
                return pageReference;
            }
        }
        private PdfReference pageReference;

        /// <summary>
        /// Gets the Page this Field is a member of
        /// </summary>
        public PdfPage Page
        {
            get { return PageReference != null ? (PdfPage)PageReference.Value : null; }
        }

        /// <summary>
        /// Tries to find the page reference object for this field
        /// </summary>
        protected internal void DeterminePage()
        {
            if (pageReference == null)
            {
                var pageRef = Elements.GetReference(Keys.Page);   // "/P" entry
                if (pageRef == null)
                {
                    var curField = Parent;
                    // first scan up in the hierarchy
                    while (curField != null && pageRef == null)
                    {
                        pageRef = curField.Elements.GetReference(Keys.Page);
                        if (pageRef == null)
                            curField = curField.Parent;
                    }
                    if (pageRef == null)
                    {
                        // now scan down the hierarchy
                        for (var i = 0; i < Fields.Names.Length; i++)
                        {
                            // always index this field's own kids; reassigning curField
                            // inside the loop would index a child's collection next time
                            pageRef = FindPageRefInChilds(Fields[i]);
                            if (pageRef != null)
                                break;
                        }
                    }
                }
                if (pageRef != null)
                {
                    for (var i = 0; i < _document.PageCount; i++)
                    {
                        var page = _document.Pages[i];
                        if (page.ObjectID == pageRef.ObjectID)
                        {
                            pageRef = page.Reference;
                            break;
                        }
                    }
                }
                pageReference = pageRef;
            }
        }

        private PdfReference FindPageRefInChilds(PdfAcroField startField)
        {
            var pageRef = startField.Elements.GetReference(Keys.Page);
            if (pageRef != null)
                return pageRef;
            for (var i = 0; i < startField.Fields.Names.Length; i++)
            {
                var child = startField.Fields[i];
                pageRef = child.Elements.GetReference(Keys.Page);
                if (pageRef != null)
                    return pageRef;
                pageRef = FindPageRefInChilds(child);
                if (pageRef != null)
                    return pageRef;
            }
            return null;
        }

Note that the code above finds only a single annotation of a field (the first one it encounters).
A more correct way would be to traverse the whole hierarchy and collect all /P and /Rect entries found.
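
A minimal sketch of such a full traversal, kept independent of PDFsharp: plain Dictionary<string, object> values stand in for field dictionaries (a hypothetical model, with /P simplified to a page number rather than a page reference), and CollectWidgets is an illustrative name, not a library method. It walks /Kids recursively and records every node that carries both /Rect and /P.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for PDF field dictionaries (not PDFsharp types):
// "/Rect" holds double[4], "/P" a page number, "/Kids" a list of child dictionaries.
static void CollectWidgets(Dictionary<string, object> node,
                           List<(double[] Rect, int Page)> found)
{
    // A node that carries /Rect and /P is a widget annotation: record where it is drawn.
    if (node.TryGetValue("/Rect", out var rect) && node.TryGetValue("/P", out var page))
        found.Add(((double[])rect, (int)page));

    // Descend into /Kids, if present, so that no widget is missed.
    if (node.TryGetValue("/Kids", out var kids))
        foreach (var kid in (List<Dictionary<string, object>>)kids)
            CollectWidgets(kid, found);
}

// "FirstName" appears on pages 1 and 2: one parent field, two widget kids.
var firstName = new Dictionary<string, object>
{
    ["/T"] = "FirstName",
    ["/Kids"] = new List<Dictionary<string, object>>
    {
        new Dictionary<string, object> { ["/Rect"] = new double[] { 50, 700, 250, 720 }, ["/P"] = 1 },
        new Dictionary<string, object> { ["/Rect"] = new double[] { 50, 400, 250, 420 }, ["/P"] = 2 },
    },
};

var widgets = new List<(double[] Rect, int Page)>();
CollectWidgets(firstName, widgets);
foreach (var w in widgets)
    Console.WriteLine($"page {w.Page}: [{string.Join(", ", w.Rect)}]");
```

With real PDFsharp objects the same shape should carry over: check the field's own Elements first, then recurse into its Fields collection when HasKids is true.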

I may be able to provide a more elaborate solution, if you could post a PDF having multiple Annotations per Field.

Regards,
Thomas


Posted: Thu Sep 08, 2016 9:38 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
Thank you for your help! I can get the rectangle of the kids much more easily now!
Code:
for (int i = 0; i < currField.Fields.Count; ++i)
{
    rect = currField.Fields[i].Rectangle;
    // do stuff with rect
}


Regarding the patch you posted: I cloned the Git repo and applied the patch as you described. I tried to use the Flatten() function you had written, but the program crashes. All I do is call the following code:
Code:
if (Flatten) {
  inPDF.AcroForm.Flatten();
}

inPDF.Save(outFilePath);
inPDF.Close();


I've attached a copy of the error message that I get. I built the Debug version of PdfSharp.dll so that I could see what's happening, and the problem seems to be 'Fields.Elements.Count' on line 634 of PdfAcroField.cs: the Fields object is null.


Attachments:
NullFields.PNG (shows the null field issue)
FlattenError.PNG (error shown when calling 'Flatten')
Posted: Thu Sep 08, 2016 6:11 pm

Joined: Tue Sep 30, 2014 12:29 pm
Posts: 17
Thanks for the info!
Are you able to post or send me the PDF you used in your test?
Maybe there is something in the document that the code does not handle yet.

Regards,
Thomas


Posted: Fri Sep 09, 2016 8:03 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
(void) wrote:
Thanks for the info!
Are you able to post or send me the PDF you used in your test?
Maybe there is something in the document that the code does not handle yet.

Regards,
Thomas


I'll PM you a couple of PDFs. I've tried three different PDFs (including one I created in Adobe Acrobat myself, which only had one form in it) and they all produced the error.

Thanks, RBrNx


Posted: Fri Sep 09, 2016 7:06 pm

Joined: Tue Sep 30, 2014 12:29 pm
Posts: 17
I found two ways to fix the exception.

1.
In PdfAcroField.cs, at the end of the Flatten() method, there is this code:
Code:
for (var i = 0; i < Fields.Elements.Count; i++)
{
    var field = Fields[i];
    field.Flatten();
}

Replace it with this code to fix the Exception:
Code:
if (HasKids)
{
    for (var i = 0; i < Fields.Elements.Count; i++)
    {
        var field = Fields[i];
        field.Flatten();
    }
}


2.
Also in PdfAcroField.cs, add the following constructor to the class "PdfAcroFieldCollection":
Code:
PdfAcroFieldCollection(PdfDocument document)
    : base(document)
{ }


I personally prefer method 2, because it also prevents exceptions if you accidentally access the Fields property without checking HasKids beforehand.


Posted: Mon Sep 12, 2016 9:55 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
Brilliant, thank you for your help! I implemented your fix today and it works *almost* perfectly. There is still a bug in the Flatten function: it doesn't seem to handle children properly. If there is only one field with a specific name, Flatten works as intended. However, if more than one field has the same name, Flatten correctly deletes the fields but does not seem to write the values in those fields to the PDF, so they just become blank.

I have attached a link to a test PDF that reproduces the problem, along with two filled versions, one flattened and the other not. The fields in "PDFSharp-UnFlat" are filled correctly, but once the document has been flattened ("PDFSharp-Flat"), 'TestBox1' and 'TestBox2' disappear, as they exist on both page 1 and page 2, while 'TextBox3' does not disappear, as there is only one copy of it.

http://www.mediafire.com/download/48vb6dpztoy6yuu/Flatten-Bug.zip

P.S. I only implemented Method 2 of your Flatten fix. I'll test whether the bug still appears with Method 1 or with Methods 1 and 2 together.

EDIT
Tried Method 1 on its own, along with Methods 1 and 2 together, but neither fixed the issue.


Posted: Tue Sep 13, 2016 8:36 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
So I dug further into the PDFsharp source and I think I found the problem, in the Flatten function of PdfTextField.cs.

Code:
base.Flatten();

var rect = Rectangle;
if (!rect.IsEmpty)
{
   //Some more code here
}


This works fine for individual fields, as they have the "/Rect" entry. However, when a field has children, the children have the "/Rect" entry instead of the parent, and this code does not account for that. I haven't come up with a fix yet, but I'll try my best. Hopefully you manage to fix it in the meantime.

RBrNx


Posted: Tue Sep 13, 2016 11:19 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
Right! Finally getting there. I have managed to add some code so that child fields are now drawn onto the PDF. I have also added some code to use the correct font styling based on the field properties. Previously, even if the field had 'Arial Italic' as its font, the Flatten function would draw the string with regular 'Arial' and ignore the styling.

Here is my fix:

Add the following two functions to PdfTextField.cs:
Code:
internal void DrawToPDF(PdfRectangle rect, PdfPage elementPage, XFont font)
{
    if (!rect.IsEmpty)
    {
        using (var gfx = XGraphics.FromPdfPage(elementPage))
        {
            // Note: Page origin [0,0] is bottom left!
            var text = Text;
            if (text.Length > 0)
            {
                var xRect = new XRect(rect.X1, elementPage.Height.Point - rect.Y2, rect.Width, rect.Height);
                if ((Flags & PdfAcroFieldFlags.Comb) != 0 && MaxLength > 0)
                {
                    var combWidth = xRect.Width / MaxLength;
                    var format = XStringFormats.TopLeft;
                    format.Comb = true;
                    format.CombWidth = combWidth;
                    gfx.Save();
                    gfx.IntersectClip(xRect);
                    gfx.DrawString(text, font, new XSolidBrush(ForeColor), xRect + new XPoint(0, 1.5), format);
                    gfx.Restore();
                }
                else
                {
                    gfx.Save();
                    gfx.IntersectClip(xRect);
                    gfx.DrawString(text, font, new XSolidBrush(ForeColor), xRect + new XPoint(2, 2), XStringFormats.TopLeft);
                    gfx.Restore();
                }
            }
        }
    }
}

internal XFont GetFontFromElement(PdfAcroField element)
{
    // The family name may carry a style suffix, e.g. "Arial,Italic".
    string[] name = element.Font.FamilyName.Split(',');
    double size = element.Font.Size;
    XFontStyle style = XFontStyle.Regular;

    if (name.Length > 1)
    {
        switch (name[1])
        {
            case "Bold":
                style = XFontStyle.Bold;
                break;
            case "Italic":
                style = XFontStyle.Italic;
                break;
            case "BoldItalic":
                style = XFontStyle.BoldItalic;
                break;
        }
    }

    return new XFont(name[0], size, style);
}


and then replace the Flatten() function with the following function:


Code:
internal override void Flatten()
{
    base.Flatten();

    if (HasKids)
    {
        for (int i = 0; i < Fields.Elements.Count; i++)
        {
            var rect = Fields[i].Rectangle;
            var page = Fields[i].Page;
            var font = GetFontFromElement(Fields[i]);
            DrawToPDF(rect, page, font);
        }
    }
    else
    {
        var rect = Rectangle;
        var page = Page;
        DrawToPDF(rect, page, Font);
    }
}
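
The y-coordinate arithmetic inside DrawToPDF can be checked on its own. The sketch below is standalone C# with no PDFsharp dependency; ToTopLeft and the 842 pt page height are illustrative assumptions, not part of the library. A /Rect is stored as [x1 y1 x2 y2] measured from the bottom-left corner of the page, while the drawing surface measures from the top-left, so the top edge y2 maps to pageHeight - y2.

```csharp
using System;

// Converts a PDF rectangle (origin bottom-left, y grows upward) into a
// rectangle with a top-left origin (y grows downward). All values are in points.
static (double X, double Y, double Width, double Height) ToTopLeft(
    double x1, double y1, double x2, double y2, double pageHeight)
{
    return (x1, pageHeight - y2, x2 - x1, y2 - y1);
}

// Assumed example: a field whose top edge is at y2 = 720 on a page 842 pt high.
var r = ToTopLeft(50, 700, 250, 720, 842);

// The new Y is 842 - 720 = 122: the distance from the top of the page to the top of the field.
Console.WriteLine($"x={r.X}, y={r.Y}, width={r.Width}, height={r.Height}");
```

This is the same computation as the XRect constructed in DrawToPDF, where elementPage.Height.Point supplies the page height.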


There may be a more efficient or better way to draw the child elements, but this worked so I went with it; feel free to modify it. The only thing this code does not take into account is the alignment (justification) of the text field. For example, if the text field is center-justified, the Flatten function ignores that and draws the text as if it were left-justified. I am stumped at the moment on how to get the justification from the text field, so if you can help with that I would really appreciate it!

Thanks, RBrNx


Posted: Wed Sep 14, 2016 9:45 am

Joined: Mon Sep 05, 2016 7:54 am
Posts: 22
Woohoo! I think I've finally managed to read the alignment from the text fields. It might not be the most efficient or best way, but it has worked in all my tests; let me know if it doesn't work for you. After you have made the changes in my post above, you will need to make the following changes too:

Step 1:
Change DrawToPDF function declaration from

Code:
internal void DrawToPDF(PdfRectangle rect, PdfPage elementPage, XFont font){ ... }

To
Code:
internal void DrawToPDF(PdfRectangle rect, PdfPage elementPage, XFont font, XStringFormat format){ ... }



Step 2:
Create a GetAlignment function like so:

Code:
internal XStringFormat GetAlignment(DictionaryElements dict)
{
    // /Q is the quadding (justification) entry: 0 = left, 1 = centered, 2 = right.
    PdfItem item = dict.GetValue("/Q");
    if (item != null)
    {
        int alignment = Int32.Parse(item.ToString());

        switch (alignment)
        {
            case 1:
                return XStringFormats.TopCenter;
            case 2:
                return XStringFormats.TopRight;
        }
    }

    // Missing /Q, 0, or an unknown value: draw left-justified.
    return XStringFormats.TopLeft;
}



Step 3:
Call GetAlignment in Flatten() and pass the result to DrawToPDF, so that the function reads:

Code:
internal override void Flatten()
{
    base.Flatten();

    if (HasKids)
    {
        for (int i = 0; i < Fields.Elements.Count; i++)
        {
            var rect = Fields[i].Rectangle;
            var page = Fields[i].Page;
            var font = GetFontFromElement(Fields[i]);
            XStringFormat format = GetAlignment(Fields[i].Elements);    // get the alignment of the child
            DrawToPDF(rect, page, font, format);                        // pass format to DrawToPDF
        }
    }
    else
    {
        var rect = Rectangle;
        var page = Page;
        XStringFormat format = GetAlignment(Elements);    // get the alignment of this field
        DrawToPDF(rect, page, Font, format);              // pass format to DrawToPDF
    }
}


Step 4:
Remove this line from DrawToPDF

Code:
var format = XStringFormats.TopLeft;

And modify the following two lines to use the new parameter
Code:
gfx.DrawString(text, font, new XSolidBrush(ForeColor), xRect + new XPoint(0, 1.5), format);    //Pass format Parameter to DrawString

Code:
gfx.DrawString(text, font, new XSolidBrush(ForeColor), xRect + new XPoint(2, 2), format);     //Pass format Parameter to DrawString
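
One case GetAlignment from Step 2 does not cover: the PDF specification lists /Q as an inheritable field attribute, so a child widget may omit it and take the value from its parent. Whether the PDFs discussed in this thread store /Q on the kids or on the parent was not checked here. A minimal sketch of the lookup, using plain dictionaries as a hypothetical stand-in for field dictionaries (ResolveQuadding is an illustrative name, not a PDFsharp method):

```csharp
using System;
using System.Collections.Generic;

// Walks from a field up through its "/Parent" chain and returns the first /Q found.
// Plain dictionaries model the field dictionaries here (hypothetical, not PDFsharp types).
static int ResolveQuadding(Dictionary<string, object> field)
{
    var node = field;
    while (node != null)
    {
        if (node.TryGetValue("/Q", out var q))
            return (int)q;

        // No /Q on this node: continue with the parent, if there is one.
        node = node.TryGetValue("/Parent", out var up)
            ? (Dictionary<string, object>)up
            : null;
    }
    return 0; // no /Q anywhere in the chain: treat as left-justified
}

// The parent field is centered (/Q 1); the kid carries no /Q of its own.
var parentField = new Dictionary<string, object> { ["/T"] = "TestBox1", ["/Q"] = 1 };
var kid = new Dictionary<string, object> { ["/Parent"] = parentField };

Console.WriteLine(ResolveQuadding(kid)); // the kid resolves to the parent's value, 1
```

In the patched PdfTextField the equivalent would be to fall back to the parent's Elements when a child has no /Q entry before defaulting to left alignment.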


Feel free to implement this in the PDFsharp source!

RBrNx


Posted: Wed Mar 29, 2017 7:43 pm

Joined: Wed Mar 29, 2017 7:38 pm
Posts: 1
Can confirm that this solution worked for me, thanks guys. I was able to fill form fields and then merge AcroForms by flattening.

(void) and RBrNx, perhaps it would be a good idea to make a pull request for this feature on the GitHub page? https://github.com/empira/PDFsharp/pulls

It would be great to see this feature added to the PDFsharp library. I'm sure many people are looking for it!

