DocRaptor

API

Creating documents using DocRaptor is extremely simple. All you have to do is make a POST request with some parameters!

URL

You can post against either:

  • http://docraptor.com/docs
  • https://docraptor.com/docs (if you feel the need for ssl)

Authentication

Authentication is done with your API key, which can be found on your account dashboard. You can use the API key in one of two ways: either as the value of the query parameter "user_credentials", or as the username for HTTP basic auth. You can see examples of both of these methods in our coding examples.
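
As a rough illustration of the two authentication methods, here is a minimal Python sketch that only builds the requests and does not send them. The helper names are hypothetical, and leaving the basic auth password empty is an assumption, since only the username is specified above.

```python
import base64
import urllib.parse
import urllib.request

DOCS_URL = "https://docraptor.com/docs"


def request_with_query_key(api_key: str, body: bytes) -> urllib.request.Request:
    """Pass the API key as the user_credentials query parameter."""
    query = urllib.parse.urlencode({"user_credentials": api_key})
    return urllib.request.Request(f"{DOCS_URL}?{query}", data=body, method="POST")


def request_with_basic_auth(api_key: str, body: bytes) -> urllib.request.Request:
    """Pass the API key as the basic auth username (empty password is an assumption)."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(DOCS_URL, data=body, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

Either request object could then be passed to urllib.request.urlopen once the POST body has been filled in.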

HTTP Status Codes

This is a list of the HTTP status codes DocRaptor returns and what they mean for your document. Understanding the response code DocRaptor returns for a failed document makes it much easier to troubleshoot problems and get the document you expected.

200 - OK

Your request was made successfully, and DocRaptor has returned a document. We will also return a 200 code when an asynchronous document has successfully generated.

400 - Bad Request

This error code means your request cannot be completed as expected. DocRaptor will return this code if the download key is not valid, if the requested document has not completed, if there is an error in your generated document, or if there are errors in your HTTP POST request.

401 - Unauthorized

This status code means authorization is required, but has either not been provided or is incorrect. DocRaptor will return this status code if the API key provided is incorrect.

403 - Forbidden

A 403 status code means the request was made correctly, but the server is refusing to respond. We will return this status code if you do not have permission to view the status of that document, or if you are making too many simultaneous document generation requests.

422 - Unprocessable Entity

This error means your document has syntax errors and DocRaptor cannot process it as expected. DocRaptor validates HTML prior to document generation by default. If you are confident your HTML will produce the document you want, you can turn off HTML validation. In many cases, turning off HTML validation will solve this issue.
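
One way to use the status codes above when troubleshooting is a small lookup table in client code. This Python sketch is only an illustration: the table and function names are hypothetical, and the messages paraphrase the descriptions in this section.

```python
# Short summaries of the status codes described in this section.
STATUS_MEANINGS = {
    200: "ok: document returned, or asynchronous document generated",
    400: "bad request: invalid download key, unfinished document, "
         "document error, or malformed POST",
    401: "unauthorized: API key missing or incorrect",
    403: "forbidden: no permission for that document, or too many "
         "simultaneous requests",
    422: "unprocessable: the document failed HTML validation",
}


def describe_status(code: int) -> str:
    """Return a short explanation for a status code, or a fallback message."""
    return STATUS_MEANINGS.get(code, f"unexpected status {code}")
```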

Parameters

The following are the parameters that DocRaptor expects when you do a post to create a document. You can see these values in use in our coding examples.

Document Type

Specifies what type of document DocRaptor should try to create from the provided content.

Name: doc[document_type]

Options:

  • xls
  • xlsx
  • pdf

Document Content

The content that DocRaptor should use to create the document.
e.g. “<table><tr><td>Example!</td></tr></table>”

Name: doc[document_content]

You must supply either this or the following parameter, document_url.

Document URL

The URL that DocRaptor should request the content from to create the document. e.g. “http://www.docraptor.com/documentation”

Name: doc[document_url]

You must supply either this or the previous parameter, document_content.

Name

A name for the document. This can be any string that you find meaningful to describe this document - it is just used for identification purposes on the account dashboard.

Parameter Name: doc[name]

Test

Specifies if this document should be created using test mode. Test mode documents do not count against your monthly document quota - this way you can play with styles until you get a good looking document without wasting any of your allotted documents.

Parameter Name: doc[test]

Default: false

Options:

  • true
  • false

When test mode is on, generated PDFs will be watermarked, and generated Excel documents will be cut off after 20 rows. This parameter is optional.
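
To show how the parameters described so far fit together, here is a minimal Python sketch that assembles and encodes a POST body. The function names are hypothetical, and treating document_content and document_url as mutually exclusive is an assumption based on the "either this or" wording above.

```python
import urllib.parse
from typing import Dict, Optional

VALID_TYPES = ("xls", "xlsx", "pdf")


def build_doc_params(document_type: str,
                     content: Optional[str] = None,
                     url: Optional[str] = None,
                     name: Optional[str] = None,
                     test: bool = False) -> Dict[str, str]:
    """Collect doc[...] parameters for a document creation request."""
    if document_type not in VALID_TYPES:
        raise ValueError(f"document_type must be one of {VALID_TYPES}")
    # Assumption: exactly one of content / url should be supplied.
    if (content is None) == (url is None):
        raise ValueError("supply either content or url")
    params = {"doc[document_type]": document_type}
    if content is not None:
        params["doc[document_content]"] = content
    else:
        params["doc[document_url]"] = url
    if name is not None:
        params["doc[name]"] = name
    params["doc[test]"] = "true" if test else "false"
    return params


def encode_doc_body(params: Dict[str, str]) -> bytes:
    """Form-encode the parameters for use as a POST body."""
    return urllib.parse.urlencode(params).encode("ascii")
```

For example, build_doc_params("pdf", content="<p>Hi</p>", test=True) produces a test-mode PDF request body once passed through encode_doc_body.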

Help

Specifies that you need support with your document. This will trigger an email to you and to support.

Parameter Name: doc[help]

Default: false

Options:

  • true
  • false

When a document is in help mode, we'll store your document contents for review until it's resolved. You can have up to 5 active help requests at any given time.

Tag

An arbitrary tag string for the document. Useful if you have multiple applications using DocRaptor under the same DocRaptor account, and you want to differentiate in the logs between each app.

Parameter Name: doc[tag]

Strict

Specifies if DocRaptor should try to validate the html being sent. For PDFs, by default, we do not validate the html you are sending - generally even invalid html will make a valid PDF file, and if the resulting PDF does not look the way you expect, then you know you have some html work to do. For XLS, on the other hand, we always validate the html that you are sending - this is because unlike PDF, XLS files are not freeform, and so elements need to map to XLS cells clearly and exactly. When html validation is on, DocRaptor will fail the document and report any html errors when the html is invalid.


Parameter Name: doc[strict]

Default (PDF): none
Default (XLS): html

Options:

  • none   (PDF only)
  • html

Javascript

If this parameter is set to true, DocRaptor will try to run any JavaScript in your HTML before rendering it into a document. This parameter is false by default because loading and running external scripts adds a significant amount of time to document processing. If there are any errors running your JavaScript, document creation will fail, and the errors will be returned so you can see what went wrong. We currently have a 30 second timeout for loading all the assets and running the JavaScript.


Parameter Name: doc[javascript]

Default: false

Options:

  • true
  • false

Asynchronous Job

If this parameter is set to true, DocRaptor will queue your doc for background creation and send back JSON with a "status_id" key set. e.g.

{"status_id":"123454321"}

Making an authenticated request against http://docraptor.com/status/{status_id} will give you the status of your document job. The returned JSON from that call should look something like:

{"status":"completed", "download_url":"http://docraptor.com/download/12345asdf", "message":"Completed at Mon Jun 06 18:33:17 +0000 2011", "number_of_pages":2}

When the job is complete, DocRaptor will call the specified callback_url if one was provided, via a POST request.
Querying the status URL after the doc has been successfully created will provide a download_url in the returned JSON. The value associated with that key is a 2-time use URL from which you can download your doc. This download URL will expire 1 hour after successful asynchronous job creation.
If DocRaptor encounters an error generating your document, the status value will be "failed". A key "validation_errors" will be set with a value corresponding to the reason for the failure. An example of this is:

{"status":"failed", "validation_errors":"Name can't be blank\nName is too long (maximum is 200 characters)"}

If your document has been queued but processing has not yet begun, it will have a status of "queued". If your document is currently being processed, it will have a status of "working".


Parameter Name: doc[async]

Default: false

Options:

  • true
  • false

If there is an error creating your document, the callback_url will never be called. The status page will explain the error.
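
The status values described above could be handled in client code along these lines. This is a sketch only: the function name is hypothetical, and it works on the JSON text of a status response rather than making the request itself.

```python
import json
from typing import Optional, Tuple


def interpret_status(payload: str) -> Tuple[str, Optional[str]]:
    """Return (status, detail) for the JSON body of a status response.

    detail is the download_url for a completed job, the validation_errors
    text for a failed job, and None while the job is queued or working.
    """
    data = json.loads(payload)
    status = data.get("status")
    if status == "completed":
        return status, data.get("download_url")
    if status == "failed":
        return status, data.get("validation_errors")
    if status in ("queued", "working"):
        return status, None
    raise ValueError(f"unrecognised status: {status!r}")
```

A polling loop would call this on each status response and stop once the status is "completed" or "failed".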

Asynchronous Callback URL

If this parameter is provided and the async parameter is set to true, DocRaptor will send a POST request to this URL after successfully completing an asynchronous job. The POST will contain the parameter "download_url" with the value being a url where your document can be downloaded.


Parameter Name: doc[callback_url]


If there is an error creating your document, the callback_url will never be called. The status page will explain the error.
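
On the receiving side, a callback handler needs to pull "download_url" out of the POST body. The Python sketch below assumes the body is form-encoded, which is not stated above, and the function name is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs


def download_url_from_callback(body: bytes) -> Optional[str]:
    """Extract download_url from a callback POST body.

    Assumption: the body is application/x-www-form-urlencoded.
    """
    fields = parse_qs(body.decode("utf-8"))
    values = fields.get("download_url")
    return values[0] if values else None
```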

Response Headers

For PDF documents, the response headers will contain the number of pages contained in the document.

The response header is "X-DocRaptor-Num-Pages"
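
Reading that header in client code might look like the following Python sketch. The function name is hypothetical, and it accepts any mapping of header names to values so it does not depend on a particular HTTP library.

```python
from typing import Mapping, Optional


def page_count(headers: Mapping[str, str]) -> Optional[int]:
    """Return the PDF page count from response headers, if present."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-docraptor-num-pages":
            return int(value)
    return None
```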


Coding Examples

In this section, you'll find examples for making requests to our servers. We provide full documentation for making HTTP POST requests using C#, Curl, jQuery, Node.js, PHP, Prototype.js, Python, Ruby, and Rails.

You can read through our documentation here, or check out our repositories located at Github. If you have an example you'd like to share let us know and we'll share it with the world.

If you would like to see a more complex example you can check out this Using Doc Raptor to create Excel Spreadsheets tutorial by the guys at Switch on the Code!

C Sharp Examples

We provide a simple CLI program example (as opposed to a full blown GUI solution) as well as a WPF example. You can find the entire solution for the WPF example at Github.

CLI Example

WPF Example

Curl Example

Java CLI Example

Jquery Examples

Here you'll find several examples for creating PDF and Excel files with jQuery. We provide documentation for creating documents from forms and URLs, as well as using a jQuery plugin written by one of our users.

jQuery Form Based

jQuery Url Based

jQuery plugin

David Baldwin made a nice DocRaptor jQuery plugin. Examples and usage details can be found on that page.

Node.js

This is a Restler-based example for creating documents with Node.js.

PHP Examples

We provide two PHP examples: the first requires pecl_http to be installed on your server, and the other is a PHP wrapper written by one of our users.

PHP Example using pecl_http

PHP Wrapper Example

One of our users wrote a nice PHP wrapper for DocRaptor as well: https://github.com/krewenki/php-docraptor

Prototype.js Example

Python Examples

Thanks to John Keyes, there is a python wrapper that closely matches the functionality of the official ruby gem. Below are a couple of examples of that wrapper in action.

Python Example

Python Async Example

Ruby Examples

We provide Ruby examples using the official DocRaptor gem, as well as using the HTTParty gem.

DocRaptor Gem Example

DocRaptor Gem using Async Functionality

HTTParty Example

Rails Example

We've provided a couple of excerpts from a complete Ruby on Rails example that you can get at Github.

Note: You can define a PDF layout to be used for all of your PDFs by creating an application.pdf.haml (or .erb) file. This technique can also be used to create an XLS layout.

Note: Rails runs development mode through webrick by default. Webrick is single-threaded, so if your app is hitting its other endpoints in order to generate PDFs, you'll want to use something that supports concurrent requests in development like unicorn, passenger, or thin. You could also run multiple webrick instances on different ports.

API Key Setup

Add this to config/environment.rb to define your API key:

Controller Example

Some sample code from a controller:

HAML Example

Sample HAML code to generate an Excel file:


PDF Styles

You should be able to send normal html and css to DocRaptor. We suggest you put any css in a style block in the head of your document to reduce external connections. We will download external resources (images, css, etc.) to produce your PDF. It will go faster the fewer resources we have to look up.

Images can be embedded directly as data URIs in the html document you send to DocRaptor.

It is important to use the correct character encoding and locale in your document if you are sending non-ASCII/Unicode characters in the html document.
More information about encoding.

  • <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  • <html lang="pt-BR">

If you are attempting to use page breaks, make sure the element that the css page-break property is on is not floated or within a floated element, or the page break will not function.
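
The advice above on style blocks, data URIs, and encoding can be combined when assembling the HTML you send. This Python sketch is one possible way to do it; the function name and its arguments are hypothetical.

```python
import base64
import html
from typing import Optional


def build_pdf_html(title: str, body_html: str, css: str,
                   image_bytes: Optional[bytes] = None,
                   image_mime: str = "image/png",
                   lang: str = "en") -> str:
    """Build an HTML document with inline CSS and an optional embedded image."""
    img = ""
    if image_bytes is not None:
        # Embed the image as a data URI so no external request is needed.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        img = f'<img src="data:{image_mime};base64,{encoded}" alt=""/>'
    return (
        f'<html lang="{lang}">\n'
        "<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>\n'
        f"<title>{html.escape(title)}</title>\n"
        f"<style>{css}</style>\n"
        "</head>\n"
        f"<body>{img}{body_html}</body>\n"
        "</html>"
    )
```

The resulting string would be sent as the document content, encoded as UTF-8 to match the meta tag.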

Running javascript before the html is converted to a pdf is supported - see the javascript parameter. You can also read about prince-pdf-script, which allows js execution in the PDF reader at the time a PDF is opened.

DocRaptor uses Prince XML to generate PDFs, and we are currently running Prince 9.0 in production. You can check out Prince's documentation here.

PDF Options

We expose a number of extra options for Prince through our API that you can set. These correspond to the Prince command line options that you can see documented here.

Base URL

Specify the base URL of the input document.

Parameter Name: doc[prince_options][baseurl]

This can also be accomplished by using the HTML Base tag.

No XInclude

Disable XInclude processing.

Parameter Name: doc[prince_options][no_xinclude]

Default: false

Options:

  • true
  • false

No Network

Disable network access (prevents HTTP downloads).

Parameter Name: doc[prince_options][no_network]

Default: false

Options:

  • true
  • false

HTTP User

Specify the username for HTTP authentication.

Parameter Name: doc[prince_options][http_user]

HTTP Password

Specify the password for HTTP authentication.

Parameter Name: doc[prince_options][http_password]

HTTP Proxy

Specify the HTTP proxy server.

Parameter Name: doc[prince_options][http_proxy]

HTTP Timeout

Specify how long to wait, in seconds, when requesting an external resource while generating your document.

Parameter Name: doc[prince_options][http_timeout]

Insecure

Disable SSL verification (not recommended).

Parameter Name: doc[prince_options][insecure]

Default: false

Options:

  • true
  • false

Media

Specify the media type (e.g. print, screen).

Parameter Name: doc[prince_options][media]

No Author Stylesheets

Ignore author style sheets.

Parameter Name: doc[prince_options][no_author_style]

Default: false

Options:

  • true
  • false

No Default Stylesheets

Ignore default style sheets.

Parameter Name: doc[prince_options][no_default_style]

Default: false

Options:

  • true
  • false

No Embedded Fonts

Disable font embedding in PDF output.

Parameter Name: doc[prince_options][no_embed_fonts]

Default: false

Options:

  • true
  • false

No Subset Fonts

Disable font subsetting in PDF output.

Parameter Name: doc[prince_options][no_subset_fonts]

Default: false

Options:

  • true
  • false

No Compression

Disable compression of PDF output.

Parameter Name: doc[prince_options][no_compress]

Default: false

Options:

  • true
  • false

Encryption

Encrypt PDF output.

Parameter Name: doc[prince_options][encrypt]

Default: false

Options:

  • true
  • false

Key Bits

Set encryption key size.

Parameter Name: doc[prince_options][key_bits]

Options:

  • 40
  • 128

User Password

Set PDF user password.

Parameter Name: doc[prince_options][user_password]

Owner Password

Set PDF owner password.

Parameter Name: doc[prince_options][owner_password]

Disallow Print

Disallow printing of PDF output.

Parameter Name: doc[prince_options][disallow_print]

Default: false

Options:

  • true
  • false

Disallow Copy

Disallow copying from PDF output.

Parameter Name: doc[prince_options][disallow_copy]

Default: false

Options:

  • true
  • false

Disallow Annotate

Disallow annotation of PDF output.

Parameter Name: doc[prince_options][disallow_annotate]

Default: false

Options:

  • true
  • false

Disallow Modify

Disallow modification of PDF output.

Parameter Name: doc[prince_options][disallow_modify]

Default: false

Options:

  • true
  • false

Input Type

Specify the input type of the document to be used by prince during processing.

Parameter Name: doc[prince_options][input]

Default: html

Options:

  • html
  • xml
  • auto

Prince Version

Specify the version of Prince to use.

Parameter Name: doc[prince_options][version]

Options:

  • 7.1
  • 8.1
  • 9.0

If no parameter is supplied, this is based on your user account setting. Anyone who signed up for DocRaptor prior to August 1, 2012 will default to 7.1. Anyone who signed up between that date and July 10, 2013 will default to 8.1. New users will default to 9.0. This is adjustable via the “Edit Profile” link on your user dashboard.

Prince Javascript

Use the built-in Prince javascript engine.

Parameter Name: doc[prince_options][javascript]

Default: false

Options:

  • true
  • false

Prince CSS dpi

By default, Prince sets the page dpi for generated PDFs to 96dpi. However, when using Prince 9.0, you can override this setting to use the dpi you prefer.

Parameter Name: doc[prince_options][css_dpi]

Default: 96

Options:

  • 72
  • 200
  • etc.
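
Because every Prince option shares the doc[prince_options][...] naming pattern, client code can generate the parameter names from a plain dictionary. The Python sketch below shows one way to do that; the function name is hypothetical, and sending booleans as the strings "true" and "false" is an assumption based on the option lists above.

```python
from typing import Any, Dict


def flatten_prince_options(options: Dict[str, Any]) -> Dict[str, str]:
    """Map option names to doc[prince_options][name] form parameters."""
    flat = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # Assumption: booleans are sent as "true" / "false".
            value = "true" if value else "false"
        flat[f"doc[prince_options][{key}]"] = str(value)
    return flat
```

For example, flatten_prince_options({"media": "print", "css_dpi": 72}) yields two parameters that can be merged into the rest of the POST body.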


Excel Styles

The general technique for producing an XLS file is to send us some html in the form of one table per worksheet, with the tables' rows and cells corresponding to the same in Excel. Below is a picture of a simple example transformation which also demonstrates the use of named worksheets (via the name attribute).
[Image: Simple_dinosaur_table]

You can style cells, rows, and the entire table using style attributes, and those attributes cascade. We don't yet support writing arbitrary css style blocks. Soon, though. Below is a picture of a simple background-color example transformation.
[Image: Simple_bg_table]

Read through our coding examples for larger examples.

Excel XLS Version Support

We currently produce Excel '97 compatible XLS files. As such, features added to excel later than that are not currently supported.

Special Table & Cell Attributes

Several element attributes have special meaning in DocRaptor. Below is a picture of those in action.
[Image: Special_attributes]

table:name

Setting the name attribute on a table element will name the sheet produced by the table.

table:password

Setting the password attribute on a table element will password protect the sheet produced by the table with the given password. By default this means that all cells in the sheet will be readonly, unless the password is entered. You can control what cells will be readonly using the -xls-locked style.

td:colspan

Setting the colspan on a table cell will create a merged cell.

td:rowspan

Setting the rowspan on a table cell will create a merged cell from cells below the current cell.

Multiple Worksheets

Creating multiple worksheets is easy. Just send more than one table in your request, wrapped inside a "tables" tag.
[Image: Multiple_worksheets]
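
As a rough sketch of what such a request body might contain, the Python below builds one named table per worksheet and wraps them in a "tables" tag. The helper names are hypothetical, and applying a style attribute only to the first row is just an illustration of row-level styling.

```python
from html import escape
from typing import List, Sequence


def worksheet(name: str, rows: List[Sequence[object]],
              header_style: str = "") -> str:
    """Build one table element; its name attribute names the worksheet."""
    parts = [f'<table name="{escape(name, quote=True)}">']
    for index, row in enumerate(rows):
        style = ""
        if index == 0 and header_style:
            style = f' style="{escape(header_style, quote=True)}"'
        cells = "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
        parts.append(f"<tr{style}>{cells}</tr>")
    parts.append("</table>")
    return "".join(parts)


def workbook(*sheets: str) -> str:
    """Wrap several tables in a tables tag, one worksheet per table."""
    return "<tables>" + "".join(sheets) + "</tables>"
```

The string returned by workbook would be sent as the document content with the document type set to xls.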

Specific Styles

What follows is a list of styles we support as part of a style attribute's value and the options they take. Excel-specific styles have been prefixed with ‘-xls-’. The options should more or less correspond to the options found via “Format Cell” in Excel.

-xls-content-type

The content type for the cell in Excel.

Default: auto

  • auto - will try and determine the excel cell type automatically based off the cell contents
  • number
  • formula
  • datetime
  • boolean
  • blank

text-align

The horizontal alignment for cell content.

Default: general

  • general
  • left
  • right
  • center
  • justify
  • fill

vertical-align

The vertical alignment for cell content.

Default: bottom

  • bottom
  • top
  • center
  • justify

text-indent

Amount of indentation of the cell content. Integer value from 0 to 14.

Default: 0

white-space

Cell content wrapping. If set to wrap, then Excel will wrap data in cells with this format so that it fits within the cell boundaries.

Default: nowrap

  • nowrap
  • wrap

-xls-text-orientation

Sets the text orientation for this cell.

Default: horizontal

  • horizontal (0)
  • vertical (90)
  • stacked
  • 0
  • 45
  • 90
  • 270
  • 315
  • 360

Arbitrary amounts are not allowed. The option closest to what you pass us will be chosen.
360 is equivalent to 0.

-xls-background-pattern

Sets the background pattern.

Default: none (solid if background-color is set)

  • none
  • 6.25%
  • 12.5%
  • 25%
  • 50%
  • 75%
  • solid
  • horizontal stripe
  • vertical stripe
  • reverse diagonal stripe
  • diagonal stripe
  • diagonal crosshatch
  • thick diagonal crosshatch
  • thin horizontal stripe
  • thin vertical stripe
  • thin reverse diagonal stripe
  • thin diagonal stripe
  • thin horizontal crosshatch
  • thin diagonal crosshatch

background-color

Sets the background color for the cell. Can take any named web color or hex value.

Default: transparent (grey if -xls-background-pattern is set)

These colors will be translated to the closest of the ~64 valid colors for Excel. Certain colors (such as black) will cause Excel to ignore the background pattern you set.

    border-top-color, border-bottom-color, border-left-color, border-right-color

    Sets the border color for the cell. Can take any named web color or hex value.

    Default: transparent (if a border color is set, the default will be black)

    These colors will be translated to the closest of the ~64 valid colors for Excel.

      border-top-style, border-bottom-style, border-left-style, border-right-style

      Sets the border style (lines that appear around a cell)

      Default: none (if color is set, the default is thin).

      • none
      • thin
      • medium
      • dashed
      • dotted
      • thick
      • double
      • hair
      • medium dashed
      • dash dot
      • medium dash dot
      • dash dot dot
      • medium dash dot dot
      • slanted dash dot

      compact border syntax

      Coming soon, but not supported yet: a compact border syntax like standard CSS.

      font-family

      Sets the font family.

      Default: Arial

        font-size

        Sets the font size in points.

        Default: 10pt

          font-style

          Sets if the font should be italic or not.

          Default: normal

          • normal
          • italic

          font-weight

          Sets the weight of the font.

          Default: normal

          • normal (400)
          • bold (700)
          • bolder (900)
          • lighter (200)
          • 100
          • 200
          • 300
          • 400
          • 500
          • 600
          • 700
          • 800
          • 900

          text-decoration

          Sets the text decoration.

          Default: none

          • none
          • line-through
          • underline

          color

          Sets the text color for the cell. Can take any named web color or hex value.

          Default: black

          These colors will be translated to the closest of the ~64 valid colors for Excel.

          -xls-format

          Sets the number/date format for the cell. There are many possible options for this. A few of the important ones are below, with more documentation to come in the future. As a warning, if you use a number format on a text or date cell, the results may be unpredictable.

          Formatting Numbers and Text

          Default: default

          • default
          • text
          • integer
          • float
          • percent float
          • percent integer
          • accounting float
          • accounting integer
          • accounting red float
          • accounting red integer
          • exponential
          • fraction_one_digit
          • fraction_two_digits
          • thousands_float
          • thousands_integer

          Formatting Currency

          • currency_dollar
          • currency_euro_prefix
          • currency_euro_suffix
          • currency_japanese_yen
          • currency_pound

          Formatting Dates and Times

          • date_format1 (m/d/yy)
          • date_format2 (d-mmm-yy)
          • date_format3 (d-mmm)
          • date_format4 (mmm-yy)
          • date_format5 (h:mm AM/PM)
          • date_format6 (h:mm:ss AM/PM)
          • date_format7 (h:mm)
          • date_format8 (h:mm:ss)
          • date_format9 (m/d/yy h:mm)
          • date_format10 (d/m/yy)
          • date_format11 (d/m/yy h:mm:ss)

          height

          Sets the height of a row. Only valid on tr elements.

          Default: auto

          width

          Sets the width of a column. The last width specified for a column wins (i.e., if you specify the width for a column in both row 1 and row 2, the width specified in row 2 is used).

          Default: auto

          -xls-locked

          Sets if this cell is locked. Only has meaning if a password has been set for the sheet that will contain this cell.

          Default: true

          • true
          • false

          -xls-thousands-delimiter

          When reading values for cells, what character delimits large numbers (i.e. 1 million written as ‘1,000,000’ is delimited by the comma character). If you are using this, you probably want to set ‘-xls-decimal-delimiter’, too.

          Default: ,

          -xls-decimal-delimiter

          When reading values for cells, what character delimits the beginning of the decimal portion of numbers (i.e. 11/10 written as 1.1 is delimited by the period character). If you are using this, you probably want to set ‘-xls-thousands-delimiter’, too.

          Default: .


          Referrer-based Document Generation

          DocRaptor makes it easy to convert any webpage you have control over into a document using a simple anchor tag. On your account management page, you can add domains you would like to link to DocRaptor. Requests to DocRaptor to create docs that have one of those domains as part of their HTTP_REFERER HTTP header will be generated using your account without the need for an API key. Click “domains” after logging in to manage your domains!

          URL

          Once you've set up your domains, you can make a GET request against either:

          • http://docraptor.com/docs/from_site
          • https://docraptor.com/docs/from_site

          Example Code

          Live Example

          Documentation PDF

          Doc Listing

          You can also get a list of previously created documents through the API. This is just information like the name, the date, and if it was a test document. Since we don't actually store the created document, we can't return that. Info about the documents is returned as xml in a paginated list, ordered by date of creation (most recent first).

          URL

          You can make a GET request against either:

          • http://docraptor.com/docs
          • https://docraptor.com/docs (if you feel the need for ssl)

          Parameters

          The following are the parameters that DocRaptor expects when you request the document listing.

          Page

          Specifies the page (in terms of pagination) of documents to return

          Name: page

          Default: 1

          Per Page

          Specifies the number of documents per page (in terms of pagination) to return

          Name: per_page

          Default: 100
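
A listing request URL could be assembled as in the Python sketch below. The function name is hypothetical, and authenticating with the "user_credentials" query parameter is an assumption carried over from the Authentication section.

```python
import urllib.parse


def listing_url(api_key: str, page: int = 1, per_page: int = 100) -> str:
    """Build the GET URL for one page of the document listing."""
    query = urllib.parse.urlencode({
        "user_credentials": api_key,  # assumption: same auth as for POSTs
        "page": page,
        "per_page": per_page,
    })
    return f"https://docraptor.com/docs?{query}"
```

The response to a GET on this URL would then be parsed as xml, one page of documents at a time.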