Introduction
This 
article describes a quick and simple approach to programmatically completing a 
PDF document through the use of the iTextSharp DLL. The article also discusses 
how one might go about using the iTextSharp DLL to discover and map the fields 
available within an existing PDF if the programmer has only the PDF but does not 
have Adobe Designer or even a list of the names of the fields present in the 
PDF.
![Image1.jpg]()
Figure 1: Resulting PDF after Filling in Fields Programmatically.
iTextSharp is a C# port of a Java library written to support the creation and 
manipulation of PDF documents; the project is available for download through 
SourceForge.net here:  
http://sourceforge.net/projects/itextsharp/
With the 
iTextSharp DLL, it is possible to not only populate fields in an existing PDF 
document but also to dynamically create PDFs. The examples here are limited to a 
description of the procedures associated with completion of a PDF; the download 
will contain examples of PDF creation in both Visual Basic and C#.
The 
examples contained herein are dependent upon the availability of the iTextSharp 
DLL; use the link provided previously in order to download the DLL locally to 
your development machine.
In order 
to demonstrate filling out a PDF using the iTextSharp DLL, I downloaded a copy 
of the W-4 PDF form from the IRS website. The form contains controls and may be 
filled out programmatically so it serves as a good example.  
PDF documents that do not contain controls, such as those meant to be printed 
and filled in by hand, cannot be completed using this approach. Of course, if 
you have access to the Adobe tools (Adobe Professional, Adobe Designer), you can 
always create your own PDFs with controls, or add controls to existing PDFs. 
Further, though not demonstrated here, one can also use iTextSharp to create a 
PDF document with embedded controls.
Getting Started:
In order 
to get started, fire up the Visual Studio 2005 IDE and open the attached 
solution. The solution consists of a single Win Forms project with a single 
form.  
I have 
also included a PDF that will be used for demonstration purposes; this form is 
the IRS W-4 form completed by US taxpayers; however, any PDF with embedded 
controls (text boxes, check boxes, etc.) is fair game for this approach. Note 
that a reference to the iTextSharp DLL has been included in the project.
All of the project code is contained within the single Windows form. The form itself 
contains only a docked textbox used to display all of the field names from an 
existing PDF document. The completed PDF is generated and stored in the local 
file system; the PDF is not opened for display by the application.  
The application uses the existing PDF as a template; from that template, it 
creates and populates the new PDF. The template PDF itself is never populated; 
it is used only to define the format and contents of the completed PDF.
![Image2.jpg]()
Figure 2: Solution Explorer.
The Code: Main Form
As was 
previously mentioned, all of the code used in the demonstration application is 
contained entirely in the project's single Windows form.  The following section 
will describe the contents of the code file.
The file begins with the library imports needed to support the code. Note 
that the iTextSharp namespaces are imported along with the standard ones. The 
class declaration is in the default configuration.
Imports System
Imports System.Collections
Imports System.ComponentModel
Imports System.Data
Imports System.Drawing
Imports System.Text
Imports System.Windows.Forms
Imports iTextSharp
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.xml
Imports System.IO

Public Class Form1
 
The next section of code contains the Form1 Load event handler. During form 
load, two functions are called; those functions are used to display all of the 
fields present in the template PDF and to create a new PDF populated with a set 
of field values. 
 
''' <summary>
''' Application main form Load event handler
''' </summary>
''' <param name="sender"></param>
''' <param name="e"></param>
''' <remarks></remarks>
Private Sub Form1_Load(ByVal sender As Object, _
                       ByVal e As System.EventArgs) Handles Me.Load

    ' Load all field names from template PDF
    ListFieldNames()

    ' Fill the target PDF form with canned values
    FillForm()

End Sub
 
The next section of code 
contained in the demo application defines a function used to collect the names 
of all of the fields from the target PDF.  The field names are displayed in a 
text box contained in the application's form. 
 
''' <summary>
''' List all of the form fields into a textbox.  The
''' form fields identified can be used to map each of the
''' fields in a PDF.
''' </summary>
Private Sub ListFieldNames()
    Dim pdfTemplate As String = "c:\Temp\PDF\fw4.pdf"

    ' title the form
    Me.Text += " - " + pdfTemplate

    ' create a new PDF reader based on the PDF template document
    Dim pdfReader As PdfReader = New PdfReader(pdfTemplate)

    ' create and populate a string builder with each of the
    ' field names available in the subject PDF; iterating the
    ' keys avoids depending on the collection's entry type
    Dim sb As New StringBuilder()
    For Each fieldName As Object In pdfReader.AcroFields.Fields.Keys
        sb.Append(fieldName.ToString() + Environment.NewLine)
    Next

    ' the reader is no longer needed once the names are collected
    pdfReader.Close()

    ' Write the string builder's content to the form's textbox
    textBox1.Text = sb.ToString()
    textBox1.SelectionStart = 0
End Sub
 
Figure 3 shows the field 
names collected from the target PDF using the ListFieldNames function 
call. In order to map these fields to specific fields in the PDF, one need only 
copy this list and pass values to each of the fields to identify them. For 
example, if the form contains ten fields, setting the value (shown next) to a 
sequential number will result in the display of the numbers 1 to 10 in each of 
the fields. One can then track that field value back to the field name using 
this list as the basis for the map.  Once the fields have been identified, the 
application can be written to pass the correct values to the related field.
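
The numbering step described above can also be automated rather than typed out field by field. What follows is only a minimal sketch under stated assumptions: the module name FieldMapper, the function name NumberAllFields, and both path parameters are hypothetical and are not part of the attached solution. It relies only on the PdfReader, PdfStamper, and AcroFields.SetField calls already used in this article.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text.pdf

' Hypothetical helper module; not part of the demo project.
Module FieldMapper

    ' Writes a copy of templatePath to outputPath with a running number
    ' stamped into every field, and returns the number-to-name map.
    Public Function NumberAllFields(ByVal templatePath As String, _
                                    ByVal outputPath As String) _
                                    As Dictionary(Of Integer, String)

        Dim map As New Dictionary(Of Integer, String)
        Dim reader As New PdfReader(templatePath)
        Dim stamper As New PdfStamper(reader, _
            New FileStream(outputPath, FileMode.Create))

        ' Copy the names first so the field collection is not being
        ' enumerated while values are being set.
        Dim fieldNames As New List(Of String)
        For Each key As Object In stamper.AcroFields.Fields.Keys
            fieldNames.Add(key.ToString())
        Next

        Dim counter As Integer = 1
        For Each fieldName As String In fieldNames
            ' SetField returns False when the value could not be applied.
            If stamper.AcroFields.SetField(fieldName, counter.ToString()) Then
                map.Add(counter, fieldName)
            End If
            counter += 1
        Next

        stamper.Close()
        reader.Close()
        Return map
    End Function

End Module
```

Opening the numbered copy and reading a number off the page then leads straight back to the field name through the returned map. Only text fields will display their number; a checkbox does not show arbitrary text, so checkboxes have to be identified separately.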
 
Checkbox controls may be a little more challenging to figure out. I tried 
passing several values to the checkbox controls before lining up a winner. In 
this example, I tried passing zero, one, true, false, etc. to the field before 
figuring out that 'Yes' sets the check.
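
Instead of guessing, it may be possible to ask the library which values a checkbox accepts. The sketch below assumes that the iTextSharp version in use exposes AcroFields.GetFieldType, the AcroFields.FIELD_TYPE_CHECKBOX constant, and AcroFields.GetAppearanceStates; the module and function names are hypothetical and are not part of the attached solution.

```vb
Imports System
Imports System.Text
Imports iTextSharp.text.pdf

' Hypothetical helper module; not part of the demo project.
Module CheckboxStates

    ' Returns one line per checkbox listing the appearance states
    ' (for example "Off" and "Yes") that the field reports.
    Public Function DescribeCheckboxes(ByVal pdfPath As String) As String
        Dim reader As New PdfReader(pdfPath)
        Dim fields As AcroFields = reader.AcroFields
        Dim sb As New StringBuilder()

        For Each key As Object In fields.Fields.Keys
            Dim fieldName As String = key.ToString()

            ' Only checkboxes are of interest here.
            If fields.GetFieldType(fieldName) = AcroFields.FIELD_TYPE_CHECKBOX Then
                Dim states() As String = fields.GetAppearanceStates(fieldName)
                sb.Append(fieldName & ": " & String.Join(", ", states))
                sb.Append(Environment.NewLine)
            End If
        Next

        reader.Close()
        Return sb.ToString()
    End Function

End Module
```

The returned text could be shown in the form's textbox alongside the field names. Whichever state is not "Off" is normally the value to pass to SetField in order to set the check.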
 
Figure 3: The Available PDF Fields.
The next section of code in the demo project is used to fill in the mapped field 
values. The process is simple enough. First, the template file and new file 
locations are defined and assigned to string variables. Once the paths are 
defined, the code creates an instance of the PDF reader, which is used to read 
the template file, and a PDF stamper, which is used to fill in the form fields 
in the new file. Once the template and target files are set up, the last thing 
to do is to obtain the stamper's AcroFields object, which is populated with all 
of the fields contained in the target PDF. After the form fields have been 
captured, the rest of the code fills in each field using the AcroFields 
SetField function.
 
In this example, the first 
worksheet and the W-4 itself are populated with meaningful values whilst the 
second worksheet is populated with sequential numbers which are then used to map 
those fields to their location on the PDF.
 
After the PDF has been 
filled out, the application reads values from the PDF (the first and last names) 
in order to generate a message indicating that the W-4 for this person was 
completed and stored. 
 
Private Sub FillForm()

    Dim pdfTemplate As String = "c:\Temp\PDF\fw4.pdf"
    Dim newFile As String = "c:\Temp\PDF\Final_fw4.pdf"

    ' the reader loads the template; the stamper writes the new file
    Dim pdfReader As New PdfReader(pdfTemplate)
    Dim pdfStamper As New PdfStamper(pdfReader, _
        New FileStream(newFile, FileMode.Create))

    Dim pdfFormFields As AcroFields = pdfStamper.AcroFields

    ' set form pdfFormFields
    ' The first worksheet and W-4 form
    pdfFormFields.SetField("f1_01(0)", "1")
    pdfFormFields.SetField("f1_02(0)", "1")
    pdfFormFields.SetField("f1_03(0)", "1")
    pdfFormFields.SetField("f1_04(0)", "8")
    pdfFormFields.SetField("f1_05(0)", "0")
    pdfFormFields.SetField("f1_06(0)", "1")
    pdfFormFields.SetField("f1_07(0)", "16")
    pdfFormFields.SetField("f1_08(0)", "28")
    pdfFormFields.SetField("f1_09(0)", "Franklin A.")
    pdfFormFields.SetField("f1_10(0)", "Benefield")
    pdfFormFields.SetField("f1_11(0)", "532")
    pdfFormFields.SetField("f1_12(0)", "12")
    pdfFormFields.SetField("f1_13(0)", "1234")

    ' The form's checkboxes
    pdfFormFields.SetField("c1_01(0)", "0")
    pdfFormFields.SetField("c1_02(0)", "Yes")
    pdfFormFields.SetField("c1_03(0)", "0")
    pdfFormFields.SetField("c1_04(0)", "Yes")

    ' The rest of the form pdfFormFields
    pdfFormFields.SetField("f1_14(0)", "100 North Cujo Street")
    pdfFormFields.SetField("f1_15(0)", "Nome, AK  67201")
    pdfFormFields.SetField("f1_16(0)", "9")
    pdfFormFields.SetField("f1_17(0)", "10")
    pdfFormFields.SetField("f1_18(0)", "11")
    pdfFormFields.SetField("f1_19(0)", "Walmart, Nome, AK")
    pdfFormFields.SetField("f1_20(0)", "WAL666")
    pdfFormFields.SetField("f1_21(0)", "AB")
    pdfFormFields.SetField("f1_22(0)", "4321")

    ' Second Worksheets pdfFormFields
    ' In order to map the fields, I just pass them a sequential
    ' number to mark them; once I know which field is which, I
    ' can pass the appropriate value
    pdfFormFields.SetField("f2_01(0)", "1")
    pdfFormFields.SetField("f2_02(0)", "2")
    pdfFormFields.SetField("f2_03(0)", "3")
    pdfFormFields.SetField("f2_04(0)", "4")
    pdfFormFields.SetField("f2_05(0)", "5")
    pdfFormFields.SetField("f2_06(0)", "6")
    pdfFormFields.SetField("f2_07(0)", "7")
    pdfFormFields.SetField("f2_08(0)", "8")
    pdfFormFields.SetField("f2_09(0)", "9")
    pdfFormFields.SetField("f2_10(0)", "10")
    pdfFormFields.SetField("f2_11(0)", "11")
    pdfFormFields.SetField("f2_12(0)", "12")
    pdfFormFields.SetField("f2_13(0)", "13")
    pdfFormFields.SetField("f2_14(0)", "14")
    pdfFormFields.SetField("f2_15(0)", "15")
    pdfFormFields.SetField("f2_16(0)", "16")
    pdfFormFields.SetField("f2_17(0)", "17")
    pdfFormFields.SetField("f2_18(0)", "18")
    pdfFormFields.SetField("f2_19(0)", "19")

    ' report by reading values from completed PDF
    Dim sTmp As String = "W-4 Completed for " + _
        pdfFormFields.GetField("f1_09(0)") + " " + _
        pdfFormFields.GetField("f1_10(0)")
    MessageBox.Show(sTmp, "Finished")

    ' flatten the form to remove editing options, set it to false
    ' to leave the form open to subsequent manual edits
    pdfStamper.FormFlattening = True

    ' close the pdf; closing the stamper writes the new file
    pdfStamper.Close()
    pdfReader.Close()

End Sub

End Class
To finish up the PDF, it is necessary to determine whether or not 
additional edits will be permitted to the PDF after it has been programmatically 
completed. This task is accomplished by setting the FormFlattening value 
to true or false before the stamper is closed. If the value is set to false, the 
resulting PDF remains available for edits; if the value is set to true, the 
field values are flattened into the page content and the PDF is locked against 
further edits.
 
Once the form has been 
completed, the PDF stamper is closed and the function terminated. That wraps up 
the discussion of the form based demo project.
 
Summary
This 
article described an approach to populating a PDF document with values 
programmatically; this functionality was accomplished using the iTextSharp DLL.
Further, the article described an approach for mapping the fields contained in 
a PDF, which may be useful if one is dealing with a PDF authored elsewhere and 
the programmer does not have access to Adobe Professional or Adobe Designer. The 
iTextSharp library is a powerful DLL that supports authoring PDFs as well as 
completing them in the manner described in this document; however, when authoring a PDF, 
it seems that it would be far easier to produce a nice document using the visual 
environment made available through the use of the Adobe tools.  Having said 
that, if one is dynamically creating PDFs with variable content, the iTextSharp 
library does provide the tools necessary to support such an effort; with the 
library, one can create and populate a PDF on the fly.