PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


Posted: Wed Sep 30, 2009 4:35 pm

Joined: Wed Sep 30, 2009 4:29 pm
Posts: 6
We have a PDF with form fields that was created with Acrobat. We use PDFsharp to fill the fields then save the PDF out.

On my machine, where I use PDF-XChange as my PDF viewer, the PDF opens and the fields are filled as expected.

When the same PDF is opened with Acrobat Reader, the fields appear empty. If you click inside one, a box appears that reads "Please Note: You cannot save a completed copy of this form on your computer." However, if a field has focus, the value we inserted shows up. Once that field loses focus, the value disappears again.

What am I doing wrong?

Code:
Dim d As PdfSharp.Pdf.PdfDocument = PdfSharp.Pdf.IO.PdfReader.Open(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) & "\test.pdf", PdfSharp.Pdf.IO.PdfDocumentOpenMode.Modify)

d.AcroForm.Fields.Item("FirstName").ReadOnly = False
d.AcroForm.Fields.Item("LastName").ReadOnly = False

d.AcroForm.Fields.Item("FirstName").Value = New PdfSharp.Pdf.PdfString(FirstName)
d.AcroForm.Fields.Item("LastName").Value = New PdfSharp.Pdf.PdfString(LastName)
           
d.Save(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) & "\test-filled.pdf")


Posted: Fri Oct 02, 2009 6:17 pm

Joined: Wed Sep 30, 2009 4:29 pm
Posts: 6
It seems that it should be something simple that I'm missing. Anyone have any sort of ideas?


Posted: Tue Oct 06, 2009 9:13 pm

Joined: Wed Sep 30, 2009 4:29 pm
Posts: 6
Does anyone have any suggestions please?


Posted: Mon Oct 12, 2009 7:40 pm

Joined: Mon Oct 12, 2009 7:37 pm
Posts: 6
I am having exactly the same problem, does anyone have some ideas?


Posted: Tue Oct 13, 2009 7:37 pm

Joined: Mon Oct 12, 2009 7:37 pm
Posts: 6
Just as an update, I made some progress on identifying why this is happening (I don't have a fix yet, though). All of the PDF viewers I've tried seem to share this behavior: a text box always displays its default text, unless the text box has focus, until the value is changed in the viewer (that seems to make it stick, at least in the Acrobat readers). What sets the text you see upon first loading the document is *not* the default value property. I found this out by making my C# code set the /DV element of one of my text boxes; to my surprise, that had no effect. I then resorted to opening the PDF that PDFsharp generates in a hex editor, and found that what actually sets the text you see upon opening the PDF in Acrobat Reader is a reference, made within each text box itself, to another indirect object that holds a stream containing the text to be shown. I am able to hack the document in a hex editor to get the results I require, and I am now trying to come up with a way to access this through the PDFsharp libraries. So far, it's not looking good.

As additional background info, I have a PDF document with a form and a couple of textboxes that was created in Acrobat Professional. I am then loading this document using the pdfsharp libs, changing the values of the text boxes (based on an external datasource that will be unavailable to the end consumer of this document), then saving the document to disk. I can see that once the PDF is created in memory, the textbox widgets correctly have references to the indirect objects containing the streams with the actual default display text ... the problem is, the stream data does not seem to be accessible through the libraries at all. As an example, one of the indirect objects containing one of these streams is defined in my PDF as:
Code:
283 0 obj
<<
/Subtype/Form
/Length 129
/Matrix[1 0 0 1 0 0]
/Resources
<<
/Font
<<
/FranklinGothicHeavy 284 0 R
>>
/ProcSet[/PDF/Text]
>>
/Type/XObject
/BBox[0 0 408.438 35.233]
/FormType 1
>>
stream
/Tx BMC
q
1 1 406.438 33.233 re
W
n
BT
/FranklinGothicHeavy 18 Tf
0 g
2 11.5332 Td
(date ) Tj
42.84 0 Td
(prepared) Tj
ET
Q
EMC
endstream
endobj

And I can see everything represented in the corresponding PDFSharp object up to, but not including, the actual stream data.

My guess is that this is just not implemented? If anyone has light to shed, it would be greatly appreciated. In the meantime I'll keep hacking away to see if I can come up with a halfway robust solution (I do not consider altering the contents of the file itself, after it is saved to disk, to be the least bit robust).

-Max


Posted: Tue Oct 13, 2009 7:45 pm

Joined: Mon Oct 12, 2009 7:37 pm
Posts: 6
Ah! Wouldn't you know it: as soon as I posted that, I went back to my code and found that if I take the iref to the object referenced within one of my text boxes, then use its object number to look up the object as a PdfDictionary within the document's Internals.AllObjects collection, the Stream property contains all of the stream data. This might be my ticket out. I will post back within 24 hours to let you all know the outcome. If I get it to work I'll post a code example.


Posted: Wed Oct 14, 2009 3:20 pm

Joined: Mon Oct 12, 2009 7:37 pm
Posts: 6
I have this working now, as described above. It is not the prettiest solution, but it is probably the best one that can be had for now.

I learned a little bit about stream operator notation before putting together that "streamTemplate" const. If you are familiar with the notation, it could be modified to suit specific needs as necessary.

Code:
// Namespaces this snippet relies on
using System;
using System.Globalization;
using System.Text;
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;
using PdfSharp.Pdf.Advanced;
using PdfSharp.Pdf.IO;

PdfDocument coverSheetDoc = PdfReader.Open(@"c:\path\to\some\document.pdf", PdfDocumentOpenMode.Import);

//template for setting font, text position, and text
const string streamTemplate = "BT\n" +
    "/{1} {2} Tf\n" +
    "{3} {4} Td\n" +
    "({0}) Tj\n" +
    "ET";
const string fontName = "FranklinGothicHeavy";
const int fontSize = 18;

PdfTextField currentField = (PdfTextField)(coverSheetDoc.AcroForm.Fields["CustomerName"]);
const string customerName = "Bill McTest";
PdfString customerNamePdfStr = new PdfString(customerName);
//set the value of this field
currentField.Value = customerNamePdfStr;
//set the default value of this field
currentField.Elements["/DV"] = customerNamePdfStr;

//construct stream data from the template above, using our customer name, font name,
//font size, and rendering position; InvariantCulture keeps "." as the decimal separator
string customerNameStream =
    String.Format(CultureInfo.InvariantCulture, streamTemplate, customerName, fontName, fontSize, 2, 9.5752);

//retrieve the object number of the iref used as an appearance stream for the text box
PdfDictionary apDict = (PdfDictionary)currentField.Elements["/AP"];
PdfReference normalRef = (PdfReference)apDict.Elements["/N"];
int objectNumber = normalRef.ObjectNumber;
//now set the stream data directly for the indirect object we just looked up
//(indexing with objectNumber - 1 assumes AllObjects is ordered by object number, as it was for this document)
((PdfDictionary)(coverSheetDoc.Internals.AllObjects[objectNumber - 1])).Stream.Value = Encoding.ASCII.GetBytes(customerNameStream);

//outputDoc was never declared in the original snippet
PdfDocument outputDoc = new PdfDocument();
outputDoc.AddPage(coverSheetDoc.Pages[0]);
outputDoc.Save(@"c:\path\to\some\newdocument.pdf");
outputDoc.Close();  //unnecessary??
coverSheetDoc.Close();
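
One thing the template above does not handle is a value that contains parentheses or backslashes, which would end the `( ... )` literal string early and corrupt the stream. The following is a minimal sketch of building the same BT ... ET block with escaping, using only the .NET base library. The class and method names (`AppearanceStreamText`, `EscapeLiteral`, `Build`) are made up for illustration and are not part of PDFsharp.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not a PDFsharp type.
static class AppearanceStreamText
{
    // Escape the characters that have special meaning inside a PDF literal string "( ... )".
    public static string EscapeLiteral(string text)
    {
        var sb = new StringBuilder(text.Length + 8);
        foreach (char c in text)
        {
            if (c == '\\' || c == '(' || c == ')')
                sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Build the BT ... ET block; InvariantCulture keeps "." as the decimal separator
    // even on machines whose regional settings use a comma.
    public static string Build(string text, string fontName, double fontSize, double x, double y)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "BT\n/{0} {1} Tf\n{2} {3} Td\n({4}) Tj\nET",
            fontName, fontSize, x, y, EscapeLiteral(text));
    }
}
```

The result can be passed to Encoding.ASCII.GetBytes exactly as in the snippet above. Note that ASCII encoding replaces any non-ASCII character with "?", so this sketch only suits plain ASCII values.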


The only thing I really dislike about this solution (other than even needing to do this at all) is the need to set the text rendering position in the stream data. If you do not set it, the text will render below the actual text box. I am going to add some more code to my own project to attempt to grab the text rendering position from the stream data of the source document (the template, if you will) that I'm using, as those seem to be the values that produce correct results. That should just be a matter of applying a small regular expression to the string representation of the stream data, so it should be easy.
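
A rough sketch of that regular-expression step, under the assumption that the appearance stream is stored uncompressed (as in the object dump earlier in this thread) and has already been decoded to a string, for example with Encoding.ASCII.GetString on the stream bytes. `TextPositionReader` and `TryGetFirstTd` are illustrative names, not PDFsharp API.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not a PDFsharp type.
static class TextPositionReader
{
    // Matches "<x> <y> Td", e.g. "2 11.5332 Td".
    private static readonly Regex TdPattern = new Regex(
        @"(-?\d*\.?\d+)\s+(-?\d*\.?\d+)\s+Td\b", RegexOptions.CultureInvariant);

    // Returns the operands of the first Td operator found in the stream text.
    public static bool TryGetFirstTd(string streamText, out double x, out double y)
    {
        x = 0;
        y = 0;
        Match m = TdPattern.Match(streamText);
        if (!m.Success)
            return false;
        x = double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
        y = double.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture);
        return true;
    }
}
```

This is a plain text scan, not a content-stream parser: it would also match a "1 2 Td" sequence that happens to appear inside a text string, and it will find nothing if the stream carries a compression filter.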

Anyway, I hope this has helped the original poster and maybe some others out there! I have literally been working with PDFsharp for only a few days, so there could well be a cleaner solution available by utilizing portions of the PDFsharp library that I am unaware of, but I tried to search the libraries thoroughly beforehand to make sure I wasn't missing anything.

-Max


Posted: Wed Oct 14, 2009 4:37 pm

Joined: Wed Sep 30, 2009 4:29 pm
Posts: 6
I'm not sure if it is the fault of the original PDF I'm trying to fill in, but the /AP element is missing from all text fields. I tried adding the /AP dictionary with the /N stream directly to the field with no results.

Also, the Stream property of almost every object (I say almost, because I haven't tried everything in the PDF document) returns Nothing. I've also tried creating a stream to assign to it, with no results.

It is extremely frustrating that Adobe's product seems to be the only one that works (or doesn't work depending on how you look at it) properly with these documents.


Posted: Wed Oct 14, 2009 5:51 pm

Joined: Mon Oct 12, 2009 7:37 pm
Posts: 6
mdurant wrote:
I'm not sure if it is the fault of the original PDF I'm trying to fill in, but the /AP element is missing from all text fields. I tried adding the /AP dictionary with the /N stream directly to the field with no results.

Also, the Stream property of almost every object (I say almost, because I haven't tried everything in the PDF document) returns Nothing. I've also tried creating a stream to assign to it, with no results.

It is extremely frustrating that Adobe's product seems to be the only one that works (or doesn't work depending on how you look at it) properly with these documents.


That is frustrating. I created my document using Acrobat Professional 8, and I tried saving as versions 6 through 8, with consistent behavior. Acrobat Professional 8 seems to always write out an /AP dict containing an /N element referencing an indirect stream object (which then contains whatever you put for a default value in Acrobat Professional for the text field). I suggest taking a look at your input PDF in a text editor: find where one of your text fields is defined and see what it has for properties. If you see an /AP dictionary there, then your PDFsharp libs behave differently from mine (I am using version 1.30, and I am guessing you are too?). If the appearance stream (/AP with the /N iref) is not defined within your actual input document (viewing it in a text editor), then you should still be able to add it in code, I would think. When you try adding it in code, see what you are actually getting in the output doc (using a text editor, again).
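
If opening the file in a text editor is awkward, the same check can be roughed out in code. This is only a sketch with made-up names (`PdfKeyScanner`, `CountName`, `ReadAsText`); it scans the raw file text, so it assumes the field dictionaries are stored uncompressed and will miss anything held in compressed object streams.

```csharp
using System;
using System.Text;

// Hypothetical helper, not a PDFsharp type.
static class PdfKeyScanner
{
    // Counts occurrences of a PDF name such as "/AP" that are not the prefix of a longer name.
    public static int CountName(string pdfText, string name)
    {
        int count = 0;
        int i = pdfText.IndexOf(name, StringComparison.Ordinal);
        while (i >= 0)
        {
            int end = i + name.Length;
            // A name ends at whitespace or a delimiter; skip hits like "/APPL" when looking for "/AP".
            if (end == pdfText.Length || !IsRegularChar(pdfText[end]))
                count++;
            i = pdfText.IndexOf(name, end, StringComparison.Ordinal);
        }
        return count;
    }

    private static bool IsRegularChar(char c)
    {
        return !char.IsWhiteSpace(c) && "()<>[]{}/%".IndexOf(c) < 0;
    }

    // Latin-1 maps every byte to exactly one char, so binary stream data cannot break decoding.
    public static string ReadAsText(string path)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(System.IO.File.ReadAllBytes(path));
    }
}
```

For example, calling CountName(PdfKeyScanner.ReadAsText(path), "/AP") on the input and on the output document shows whether appearance dictionaries exist before and after PDFsharp has saved the file.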

Here is some example data from what I'm working on - these are the definitions of one of my text fields, directly followed by the definition of the appearance stream object that the text field references:
Code:
282 0 obj
<<
/Rect[167.681 276.559 576.119 311.792]
/Subtype/Widget
/F 132
/P 288 0 R
/T(DatePrepared)
/V(October 14, 2009)
/DA(/FranklinGothicHeavy 18 Tf 0 g)
/DV(date prepared)
/FT/Tx
/Type/Annot
/Ff 4194305
/MK
<<
>>
/AP
<<
/N 283 0 R
>>
>>
endobj
283 0 obj
<<
/Subtype/Form
/Length 67
/Matrix[1 0 0 1 0 0]
/Resources
<<
/Font
<<
/FranklinGothicHeavy 284 0 R
>>
/ProcSet[/PDF/Text]
>>
/Type/XObject
/BBox[0 0 408.438 35.233]
/FormType 1
>>
stream
BT
/FranklinGothicHeavy 18 Tf
2 11.5332 Td
(October 14, 2009) Tj
ET
endstream
endobj


I am an experienced developer, but still very new to dealing with the guts of PDFs like this. If any PDF gurus out there are watching, we could probably use your 2 cents. :)


Posted: Thu Nov 12, 2009 11:58 pm

Joined: Wed Sep 30, 2009 4:29 pm
Posts: 6
I still haven't been able to figure this out. Anyone else have any luck with this?


Posted: Tue Dec 15, 2009 6:30 pm

Joined: Wed Sep 30, 2009 4:29 pm
Posts: 6
I still have been unable to find a solution to this problem. In case anyone else has the same need, here is the workaround I'm using:

I assemble an FDF and save it out to disk. I Process.Start() the PDF Toolkit to merge and flatten the FDF to the PDF.
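
For anyone wanting to try the same workaround, here is a minimal sketch of the FDF-assembly step. It is not the poster's actual code: `FdfWriter` is a made-up name, the field names are whatever your form uses, and the sketch only covers plain ASCII values.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not the poster's code and not a PDFsharp type.
static class FdfWriter
{
    // Escape backslashes and parentheses for use inside a PDF literal string.
    private static string Escape(string s)
    {
        return s.Replace("\\", "\\\\").Replace("(", "\\(").Replace(")", "\\)");
    }

    // Builds a minimal FDF document carrying one /T (field name) and /V (value) pair per field.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        sb.Append("%FDF-1.2\n1 0 obj\n<< /FDF << /Fields [\n");
        foreach (var field in fields)
            sb.AppendFormat("<< /T ({0}) /V ({1}) >>\n", Escape(field.Key), Escape(field.Value));
        sb.Append("] >> >>\nendobj\ntrailer\n<< /Root 1 0 R >>\n%%EOF\n");
        return sb.ToString();
    }
}
```

Assuming "the PDF Toolkit" here means pdftk, the merge-and-flatten step would then be a command along the lines of `pdftk test.pdf fill_form data.fdf output test-filled.pdf flatten` (file names are placeholders), launched through Process.Start as described above.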

It appears that I'm unable to use PDFsharp for my needs in this case.


Posted: Tue Nov 16, 2010 7:36 pm

Joined: Tue Nov 16, 2010 7:17 pm
Posts: 1
As opposed to filling out the fields and trying to get them to display, I wrote the following to replace the fields with text in a new document. It's easily extensible to use the PdfTextFields as rectangles for images as well by simply testing for other object types in DrawItemForField() and calling gfx.DrawImage().

Code:
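// Namespaces this listing relies on
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using PdfSharp.Drawing;
using PdfSharp.Drawing.Layout;
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;
using PdfSharp.Pdf.Advanced;
using PdfSharp.Pdf.Annotations;
using PdfSharp.Pdf.IO;

class Program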
{
  static void Main()
  {
    var document = PdfReader.Open(@"C:/Form.pdf", PdfDocumentOpenMode.Import);

    var outputDoc = new PdfDocument();
    outputDoc.AddPage(document.Pages[0]);
    outputDoc.Version = 14;

    var page = outputDoc.Pages[0];

    var fields = new List<FormField>();

    // Remove all the form fields from the copied page
    foreach (var annotation in page.Annotations.ToList().OfType<PdfAnnotation>())
    {
      var annotParent = annotation.Elements[PdfAcroField.Keys.Parent] as PdfReference;
      var title = (annotParent != null)
        ? ((PdfDictionary)annotParent.Value).Elements[PdfAcroField.Keys.T].ToString()
        : annotation.Title;

      if (document.AcroForm.Fields.Names.Contains(title))
      {
        fields.Add(new FormField(title, annotation, page));
        page.Annotations.Remove(annotation);
      }
    }

    var gfx = XGraphics.FromPdfPage(page);

    foreach (var field in fields)
    {
      field.Field = document.AcroForm.Fields[field._title];
      DrawItemForField(gfx, field, "-" + field._title + "-");
    }

    const string filename = @"C:/Form2.pdf";
    outputDoc.Save(filename);
    outputDoc.Close();
    document.Dispose();

    Process.Start(filename);
  }

  private static void DrawItemForField(XGraphics gfx, FormField field, object obj)
  {
    var font = FontFromPdfFont(field._font);
    var brush = BrushFromPdfFont(field._font);

    var format = XStringFormats.TopLeft;
    // Flags is a bit mask, so test the Multiline bit rather than comparing for equality
    var isMultiline = (field.Field.Flags & PdfAcroFieldFlags.Multiline) != 0;
    if (!isMultiline)
    {
      format.LineAlignment = XLineAlignment.Center;
      format = AdjustFormat(format, field._alignment);
    }

    if (obj is string)
    {
      if (isMultiline)
      {
        var tf = new XTextFormatter(gfx);
        tf.DrawString(obj as string, font, brush, field._rect, XStringFormats.TopLeft);
      }
      else
        gfx.DrawString(obj as string, font, brush, field._rect, format);
    }
  }

  private static XStringFormat AdjustFormat(XStringFormat format, PdfItem alignment)
  {
    if (alignment is PdfInteger)
    {
      switch (((PdfInteger)alignment).Value)
      {
        case 0: format.Alignment = XStringAlignment.Near; break;
        case 1: format.Alignment = XStringAlignment.Center; break;
        case 2: format.Alignment = XStringAlignment.Far; break;
      }
    }
    return format;
  }

  private static XFont FontFromPdfFont(PdfString font)
  {
    const string boldItalic = "Bold,Italic";
    const string bold = "Bold";
    const string italic = "Italic";
    const string fontHelv = "Helvetica";
    const string fontTiRo = "Times";
    const string fontCour = "Courier New";
    const string fontSymb = "Symbol";

    var options = new XPdfFontOptions(PdfFontEncoding.WinAnsi, PdfFontEmbedding.Default);

    if (font == null) return null;

    var pieces = font.Value.Split(' ');

    var family = pieces[0].Replace("/", string.Empty);
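    double size;
    // PDF numbers always use "." as the decimal separator, so parse culture-independently
    if (pieces.Length < 2 ||
        !double.TryParse(pieces[1], System.Globalization.NumberStyles.Float,
                         System.Globalization.CultureInfo.InvariantCulture, out size))
      return new XFont("Arial", 10);

    var split = family.Split(',');
    var searchfam = split[0];
    // Everything after the first comma is the style modifier, e.g. "Bold" in "Arial,Bold"
    var modifier = string.Join(",", split.Skip(1));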
    switch (family)
    {
      case "Helv": searchfam = fontHelv; break;
      case "HeBO": searchfam = fontHelv; modifier = boldItalic; break;
      case "HeBo": searchfam = fontHelv; modifier = bold; break;
      case "HeOb": searchfam = fontHelv; modifier = italic; break;
      case "TiRo": searchfam = fontTiRo; break;
      case "TiBI": searchfam = fontTiRo; modifier = boldItalic; break;
      case "TiIt": searchfam = fontTiRo; modifier = italic; break;
      case "TiBo": searchfam = fontTiRo; modifier = bold; break;
      case "Cour": searchfam = fontCour; break;
      case "CoBo": searchfam = fontCour; modifier = bold; break;
      case "CoOb": searchfam = fontCour; modifier = italic; break;
      case "CoBO": searchfam = fontCour; modifier = boldItalic; break;
      case "Symb": searchfam = fontSymb; break;
      case "CourierNew": searchfam = fontCour; break;
    }
    // Express bold/italic through XFontStyle; XFont expects a plain family name
    var style = XFontStyle.Regular;
    if (modifier.Contains(bold)) style |= XFontStyle.Bold;
    if (modifier.Contains(italic)) style |= XFontStyle.Italic;

    try
    {
      return new XFont(searchfam, size, style, options);
    }
    catch
    {
      return new XFont(fontHelv, size, style, options);
    }
  }

  private static XBrush BrushFromPdfFont(PdfString font)
  {
    if (font != null)
    {
      // PDF numbers always use "." as the decimal separator, so parse culture-independently
      var culture = System.Globalization.CultureInfo.InvariantCulture;
      var styles = System.Globalization.NumberStyles.Float;
      var pieces = font.Value.Split(' ');
      switch (pieces.Last())
      {
        case "g": // grayscale
          double gray;
          if (pieces.Length >= 2 &&
              double.TryParse(pieces[pieces.Length - 2], styles, culture, out gray))
            return new XSolidBrush(XColor.FromGrayScale(gray));
          break;
        case "rg": // rgb
          double red, green, blue;
          if (pieces.Length >= 4 &&
              double.TryParse(pieces[pieces.Length - 4], styles, culture, out red) &&
              double.TryParse(pieces[pieces.Length - 3], styles, culture, out green) &&
              double.TryParse(pieces[pieces.Length - 2], styles, culture, out blue))
            return new XSolidBrush(XColor.FromArgb((int)(255 * red), (int)(255 * green), (int)(255 * blue)));
          break;
      }
    }

    return XBrushes.Black;
  }
}

class FormField
{
  public XRect _rect;
  public string _title;
  public PdfString _font;
  public PdfInteger _alignment;

  private PdfAcroField _field;
  public PdfAcroField Field
  {
    get { return _field; }
    set
    {
      _field = value;
      if (_field != null)
        SetFontAndAlignment(_field);
    }
  }

  public FormField(string title, PdfAnnotation annotation, PdfPage page)
  {
    _title = title;

    var rect = annotation.Rectangle;
    _rect = new XRect(rect.X1, rect.Y1, rect.Width, rect.Height);
    Console.WriteLine(title + " " + _rect.TopLeft);
    _rect.Y = page.Height - _rect.Y - rect.Height;

    SetFontAndAlignment(annotation);
  }

  private void SetFontAndAlignment(PdfDictionary item)
  {
    if (_font == null)
      _font = item.Elements[PdfAcroField.Keys.DA] as PdfString;
    if (_alignment == null)
      _alignment = item.Elements[PdfAcroField.Keys.Q] as PdfInteger;
  }
}
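
The listing above reads the /DA (default appearance) string by position, taking the first piece as the font name and the second as the size. As a possible alternative, here is a small sketch that keys on the operators instead (Tf, g, rg), so it also copes with a /DA string whose color comes before the font. `DefaultAppearance` is a made-up class, not a PDFsharp type, and it throws a FormatException if an operand is not a number.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not a PDFsharp type.
sealed class DefaultAppearance
{
    public string FontName;   // e.g. "FranklinGothicHeavy" (without the leading slash)
    public double FontSize;   // size operand of Tf
    public double[] Color;    // operands of g (1 value) or rg (3 values); null if absent

    public static DefaultAppearance Parse(string da)
    {
        var result = new DefaultAppearance();
        string[] t = da.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < t.Length; i++)
        {
            if (t[i] == "Tf" && i >= 2)
            {
                // Operands precede the operator: "/Name size Tf"
                result.FontName = t[i - 2].TrimStart('/');
                result.FontSize = double.Parse(t[i - 1], CultureInfo.InvariantCulture);
            }
            else if (t[i] == "g" && i >= 1)
            {
                result.Color = Operands(t, i, 1);
            }
            else if (t[i] == "rg" && i >= 3)
            {
                result.Color = Operands(t, i, 3);
            }
        }
        return result;
    }

    // Reads the 'count' numeric operands that come directly before the operator.
    private static double[] Operands(string[] tokens, int operatorIndex, int count)
    {
        var values = new double[count];
        for (int k = 0; k < count; k++)
            values[k] = double.Parse(tokens[operatorIndex - count + k], CultureInfo.InvariantCulture);
        return values;
    }
}
```

With the /DA value from the text field dump earlier in the thread, DefaultAppearance.Parse("/FranklinGothicHeavy 18 Tf 0 g") yields the font name FranklinGothicHeavy, size 18, and a single gray component of 0.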


Posted: Tue Aug 28, 2012 11:50 am

Joined: Tue Aug 28, 2012 11:49 am
Posts: 1
If you are attempting to populate PDF form fields, you also need to set the NeedAppearances element to true. Otherwise the PDF will "hide" the values on the form. Here is the VB code.


Code:
If objPdfSharpDocument.AcroForm.Elements.ContainsKey("/NeedAppearances") = False Then
    objPdfSharpDocument.AcroForm.Elements.Add("/NeedAppearances", New PdfSharp.Pdf.PdfBoolean(True))
Else
    objPdfSharpDocument.AcroForm.Elements("/NeedAppearances") = New PdfSharp.Pdf.PdfBoolean(True)
End If


Posted: Thu Mar 28, 2013 4:09 pm

Joined: Thu Mar 28, 2013 3:50 pm
Posts: 1
starscream74, thanks so much! This solved the problem for me!

You don't even need to use the PdfString! Here is what I did, and it worked beautifully (thanks to starscream74!):

Code:
        string tempDocPath = Server.MapPath("~/Pdf Files/InformationSheet.pdf");
        PdfDocument myTemplate = PdfReader.Open(tempDocPath, PdfDocumentOpenMode.Modify);
        PdfAcroForm myForm = myTemplate.AcroForm;

        //This section makes the text visible after passing a long text and will wrap it!
        if (myForm.Elements.ContainsKey("/NeedAppearances"))
        {
            myForm.Elements["/NeedAppearances"] = new PdfBoolean(true);
        }
        else
        {
            myForm.Elements.Add("/NeedAppearances", new PdfBoolean(true));
        }

        //Getting fields on the pdf form       
        PdfTextField Title = (PdfTextField)(myForm.Fields["Title"]);
        PdfTextField FirstName = (PdfTextField)(myForm.Fields["FirstName"]);
        PdfTextField LastName = (PdfTextField)(myForm.Fields["LastName"]);       
       
        PdfCheckBoxField NewPerson = (PdfCheckBoxField)(myForm.Fields["NewPerson"]);

        //Filling in the form               
        Title.Text = txtTitle.Text;
        Title.ReadOnly = true;
        FirstName.Text = txtFirstName.Text;
        LastName.Text = txtLastName.Text;
       
        NewPerson.Checked = chkNewPerson.Checked;
       

        myTemplate.Save(Server.MapPath("~/Pdf Files/InformationSheet.pdf"));
        Response.Redirect("~/Pdf Files/InformationSheet.pdf");   


Note that I am passing the values from my .aspx form controls such as textboxes and checkboxes!

The problem I had was that the Title field could get very long, and Title.Text would show the text but not wrap it to the next line. I then used a PdfString and assigned it to Title.Value instead of the Text property. That solved the wrapping problem, but the text was invisible when I opened the file; clicking on the field showed that the text was there, just not visible. Then I saw what starscream74 suggested, and it solves the problem whether you use the Text property, as I do, or a PdfString with the Value property. I tested both and both work great: the text is wrapped correctly and it is visible.
I decided not to use the PdfString and .Value, as it is much easier to just pass text to the Text property!

I thought I would share this here so that maybe someone else can use my two cents!

