Submitting PDF Forms with PHP

January 1st, 2008 by Bobby Whitman

Requirements: Adobe Acrobat Professional, PHP 5+.

In an effort to go paperless, many companies are taking existing paper forms and making them available for completion online. However, if the online form has to mimic the layout of the original paper version, HTML forms may not be the best choice. Reproducing a paper layout with traditional HTML forms is time-consuming to program. And let’s not forget that whether you go with a complex table or pure CSS, the resulting code will likely be rather clunky.

So, we turn to Adobe Acrobat’s interactive PDF forms. We can create and lay out form fields on top of an existing PDF with ease. Simply open up Acrobat Professional, find the form tools, and have at it.

Constructing the form is pretty simple stuff. The only real difficulty comes when trying to submit the form to a web server.

Start by creating a button element in the PDF to act as your submit button. Add a button action to ‘Submit the form’ on ‘Mouse Up’. You will see a field for the URL of the script to which the data will be submitted. Let’s call it pdfsubmit.php. Once this is done you are faced with four options for how the data should be submitted.

Submit Form Selections

Let’s start simple and select HTML. This will send the data using POST and you can access it in PHP just as you would if it came from any HTML form, that is, using the $_POST superglobal. You have all of the fields from the PDF available to you and may do with them as you wish (throw them into a database, output them to a data file, send them in an e-mail, etc).
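
As a rough illustration of the HTML option, the handler might pull the expected fields out of $_POST like this. The field names name and email and the helper collect_fields() are assumptions made for the example; use whatever names you gave the fields in your PDF.

```php
<?php
// Hypothetical helper: copy only the fields we expect out of a POST array.
// Field names are assumptions; match them to the names in your PDF form.
function collect_fields(array $post, array $expected)
{
    $data = array();
    foreach ($expected as $fieldname) {
        // Missing or non-string fields become empty strings,
        // so later code can rely on every key being present.
        $present = isset($post[$fieldname]) && is_string($post[$fieldname]);
        $data[$fieldname] = $present ? trim($post[$fieldname]) : '';
    }
    return $data;
}

// In pdfsubmit.php the PDF fields arrive exactly like HTML form fields.
$data = collect_fields($_POST, array('name', 'email'));
```

Listing the expected fields up front keeps stray or unexpected POST keys out of whatever you do next with the data.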

But suppose we want to take it a step further and actually store the completed form on the server. This way an admin can read the form in the exact layout in which it was filled out, making it a truer substitute for the original paper form. For this we turn to Forms Data Format (FDF) or XML Forms Data Format (XFDF).

Both FDF and XFDF serve the same purpose: to represent and store the data of a PDF form. PHP has a function set specifically for working with FDF (http://us.php.net/manual/en/ref.fdf.php). However, in order to use these functions your server must have the FDF Toolkit provided by Adobe installed and configured. Rather than bother with server configuration, I recommend using XFDF, as PHP has functions for parsing XML readily available.

Step 1: Getting the XFDF data from the PDF to our PHP script.

First, go back to that submit button and, instead of HTML, select ‘XFDF Include’ as the method. Now the individual fields will NOT be available in POST; instead, an XML document representing the form data is sent as the body of the request. The tricky part is figuring out how to find this XML with PHP. You will not find it in any of the common places: it is not in $_POST, $_GET, $_FILES, or $_SERVER. Rather, the XFDF appears in the often-overlooked HTTP_RAW_POST_DATA. You can get to it by using the following global:

$xfdf = $GLOBALS['HTTP_RAW_POST_DATA'];
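Note: depending on your configuration, you may need to set the PHP directive always_populate_raw_post_data to ‘On’ in php.ini. Also be aware that HTTP_RAW_POST_DATA was deprecated in PHP 5.6 and removed in PHP 7.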
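
Another way to reach the same bytes is PHP’s php://input stream, which exposes the raw request body and does not depend on always_populate_raw_post_data. A minimal sketch; the helper name read_request_body() and its parameter are assumptions made for the example:

```php
<?php
// Hypothetical helper: return the raw request body as a string.
// The $source parameter exists only so the function can be pointed
// at an ordinary file when trying it outside a web request.
function read_request_body($source = 'php://input')
{
    $body = file_get_contents($source);
    // file_get_contents() returns false on failure; normalise that to ''.
    return ($body === false) ? '' : $body;
}

// In pdfsubmit.php you would then write:
//     $xfdf = read_request_body();
```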

Step 2: Storing the XFDF.

Before we even think about dropping that data in a safe place, we need some way to associate the end user with the XFDF they have sent. We certainly do not want our admin user to have to open every XFDF file in order to see to whom it belongs. This means we will need to extract some data from the XFDF before we do anything else.

In our example, we will grab the name and e-mail address of our user. I will do this using PHP 5’s SimpleXML extension to traverse the XML.

## load our xml
$xfdfdata = simplexml_load_string($xfdf);
if ($xfdfdata === false) {
    die('Could not parse the submitted XFDF.');
}

## Traverse the XML to build our data array
## of all fields.
$data = array();
foreach ($xfdfdata->fields->field as $f) {
    $a = $f->attributes();
    $fieldname = (string)$a['name'];
    $data[$fieldname] = (string)$f->value;
}
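
For reference, here is roughly what a submitted XFDF document looks like and how the traversal flattens it. The sample field values and the PDF address are made up for the example, and the function name xfdf_to_array() is hypothetical.

```php
<?php
// Made-up sample of an XFDF submission; real field names come from the PDF.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <f href="http://www.example.com/form.pdf"/>
  <fields>
    <field name="name"><value>Jane Doe</value></field>
    <field name="email"><value>jane@example.com</value></field>
  </fields>
</xfdf>
XML;

// Hypothetical wrapper around the traversal: returns field name => value,
// or false when the string is not well-formed XML.
function xfdf_to_array($xfdf)
{
    $xml = simplexml_load_string($xfdf);
    if ($xml === false) {
        return false;
    }
    $data = array();
    foreach ($xml->fields->field as $f) {
        $data[(string)$f['name']] = (string)$f->value;
    }
    return $data;
}

$data = xfdf_to_array($sample);
// $data is array('name' => 'Jane Doe', 'email' => 'jane@example.com')
```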

Now, there are a number of methods you can use to store XFDF for later retrieval. You may want to think twice about writing a simple data.xml file to the server, as a file that is reachable from the web risks making all of the submitted data public. A better choice would be to store the entire XFDF in a column of a database table. The database option also gives us a way to associate the extracted data (name and e-mail) with the rest of the form data.

CREATE TABLE formdata (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(100) NOT NULL,
    xfdf LONGTEXT NOT NULL,
    stamp INT NOT NULL,
    PRIMARY KEY(id)
);

## Escape every value that came from the user before
## building the query; id is filled in by AUTO_INCREMENT.
$sql = "INSERT INTO formdata (
    name,
    email,
    xfdf,
    stamp
) VALUES (
    '".mysql_real_escape_string($data['name'], $dbh)."',
    '".mysql_real_escape_string($data['email'], $dbh)."',
    '".mysql_real_escape_string($xfdf, $dbh)."',
    ".time()."
)";

mysql_query($sql, $dbh);
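
The mysql_* functions were removed in PHP 7. On newer installations the same insert can be sketched with PDO and a prepared statement, which also takes care of quoting. The function name store_submission() is hypothetical, and the PDO connection is assumed to be opened elsewhere with your own credentials:

```php
<?php
// Hypothetical helper: store one submission and return the new row id.
// $pdo is assumed to be an open PDO connection to the database
// that holds the formdata table.
function store_submission(PDO $pdo, array $data, $xfdf)
{
    $stmt = $pdo->prepare(
        'INSERT INTO formdata (name, email, xfdf, stamp)
         VALUES (:name, :email, :xfdf, :stamp)'
    );
    // Bound parameters are sent separately from the SQL text,
    // so no manual escaping is needed.
    $stmt->execute(array(
        ':name'  => $data['name'],
        ':email' => $data['email'],
        ':xfdf'  => $xfdf,
        ':stamp' => time(),
    ));
    return (int)$pdo->lastInsertId();
}
```

Returning the new id makes it easy to hand the admin a link to that particular entry.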

And there you go: the data has been submitted from the PDF and is now stored safely on the server. Now all we need to worry about is how to get this data back into the original PDF.

Step 3: Completed PDF Form retrieval.

Let’s suppose we are looking to download the entry with id 5.

$sql = "SELECT xfdf FROM formdata WHERE id = 5";
$result = mysql_query($sql, $dbh);

list($xfdf) = mysql_fetch_row($result);

header("Content-type: application/vnd.adobe.xfdf");
echo $xfdf;

By changing the content type of the response, we are telling the browser to treat this server-generated page as an XFDF file. Embedded in the XFDF is the address of the original PDF. So, when your browser opens the XFDF file, it will retrieve the specified PDF and load all of the form data into it automatically. Now the admin user, using Acrobat Reader, can view the PDF form as completed by the end user and has the option to save the PDF file.
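
To see which PDF a stored submission points at, the embedded address can be read back out of the XFDF. A small sketch, assuming the address sits in the href attribute of the f element; the function name xfdf_pdf_href() and the sample address are made up for the example:

```php
<?php
// Hypothetical helper: return the PDF address embedded in an XFDF string,
// or null when the XML is malformed or has no <f href="..."> element.
function xfdf_pdf_href($xfdf)
{
    $xml = simplexml_load_string($xfdf);
    if ($xml === false || !isset($xml->f['href'])) {
        return null;
    }
    return (string)$xml->f['href'];
}

// Made-up sample; a real submission carries the address of your own form.
$sample = '<xfdf xmlns="http://ns.adobe.com/xfdf/">'
        . '<f href="http://www.example.com/form.pdf"/><fields/></xfdf>';
echo xfdf_pdf_href($sample), "\n";
```

If the PDF is later moved, this is the value that would have to be updated in stored submissions for them to keep opening.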

Note: Retrieving completed PDF files only works in Internet Explorer. I have tested this both in IE 6 with Acrobat 6 and in IE 7 with Acrobat 8 and was successful.