API
Creating documents using DocRaptor is extremely simple. All you have to do is make a POST request with some parameters!
URL
You can post against either:
- http://docraptor.com/docs
- https://docraptor.com/docs (if you feel the need for ssl)
Authentication
Authentication is done with your API key, which can be found on your account dashboard. You can supply the API key in one of two ways: either as the value of the query parameter "user_credentials", or as the username for HTTP basic auth. You can see examples of both of these methods in our coding examples.
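As a rough sketch of the two authentication methods described above, the Python below builds (but does not send) the pieces of a request. The function names are illustrative, and the empty basic-auth password is an assumption, since only the username is specified here.

```python
import base64
from urllib.parse import urlencode

DOCS_URL = "https://docraptor.com/docs"  # endpoint listed in the URL section


def url_with_query_key(api_key: str) -> str:
    """Method 1: pass the API key as the user_credentials query parameter."""
    return DOCS_URL + "?" + urlencode({"user_credentials": api_key})


def basic_auth_header(api_key: str) -> dict:
    """Method 2: send the API key as the basic-auth username.

    The password is left empty here, which is an assumption.
    """
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + token}
```

Either result can be handed to whichever HTTP client you already use.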
HTTP Status Codes
This is a list of the HTTP status codes DocRaptor returns, and what they mean for your document. Understanding the response code DocRaptor returns for failed documents makes it much easier to troubleshoot problems and get the document you expected.
200 - OK
Your request was made successfully, and DocRaptor has returned a document. We will also return a 200 code when an asynchronous document has successfully generated.
400 - Bad Request
This error code means your request can not be completed as expected. DocRaptor will return this code if the download key is not valid, if the requested document has not completed, if there is an error in your generated document, or if there are errors in your HTTP POST request.
401 - Unauthorized
This status code means authorization is required, but has either not been provided or is incorrect. DocRaptor will return this status code if the API key provided is incorrect.
403 - Forbidden
A 403 status code means the request was made correctly, but the server is refusing to respond. We will return this status code if you do not have permission to view the status of that document, or if you are making too many simultaneous document generation requests.
422 - Unprocessable Entity
This error means your document has syntax errors and DocRaptor can not process it as expected. DocRaptor validates HTML prior to document generation by default. If you are confident your HTML will produce the document you want, you can turn off HTML validation. In many cases, turning off HTML validation will solve this issue.
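One way to use the list above in client code is a small lookup table. This is only a sketch: the wording of each entry is paraphrased from the descriptions above, and the helper names are made up for illustration.

```python
# Meanings paraphrased from the status code list above.
STATUS_MEANINGS = {
    200: "OK: a document was returned, or an asynchronous document was generated",
    400: "Bad Request: invalid download key, unfinished document, document error, or bad POST",
    401: "Unauthorized: the API key is missing or incorrect",
    403: "Forbidden: no permission for that document status, or too many simultaneous requests",
    422: "Unprocessable Entity: the document failed HTML validation",
}


def describe_status(code: int) -> str:
    """Return a short explanation for a documented status code."""
    return STATUS_MEANINGS.get(code, "Status code not covered by this list")


def might_pass_without_validation(code: int) -> bool:
    """True for 422, where turning off HTML validation is the suggested next step."""
    return code == 422
```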
Parameters
The following are the parameters that DocRaptor expects when you do a post to create a document. You can see these values in use in our coding examples.
Document Type
Specifies what type of document DocRaptor should try to create from the provided content.
Name: doc[document_type]
Options:
- pdf
- xls
- xlsx
Document Content
The content that DocRaptor should use to create the document.
e.g. “<table><tr><td>Example!</td></tr></table>”
Name: doc[document_content]
You must supply either this or the following parameter, document_url.
Document URL
The URL that DocRaptor should request the content from to create the document. e.g. “http://www.docraptor.com/documentation”
Name: doc[document_url]
You must supply either this or the previous parameter, document_content.
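The rule that a request carries either document_content or document_url can be checked before anything is sent. The sketch below assumes that exactly one of the two should be present (the text does not say what happens if both are supplied), and build_doc_fields is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_doc_fields(document_type, document_content=None, document_url=None):
    """Build the form fields for a document request.

    Assumption: exactly one of document_content / document_url is given.
    """
    if (document_content is None) == (document_url is None):
        raise ValueError("supply either document_content or document_url")
    fields = {"doc[document_type]": document_type}
    if document_content is not None:
        fields["doc[document_content]"] = document_content
    else:
        fields["doc[document_url]"] = document_url
    return fields


def encode_body(fields: dict) -> bytes:
    """Form-encode the fields for use as a POST body."""
    return urlencode(fields).encode("ascii")
```

For example, build_doc_fields("xls", document_content="<table><tr><td>Example!</td></tr></table>") yields the two fields needed for a minimal Excel request.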
Name
A name for the document. This can be any string that you find meaningful to describe this document - it is just used for identification purposes on the account dashboard.
Parameter Name: doc[name]
Test
Specifies if this document should be created using test mode. Test mode documents do not count against your monthly document quota - this way you can play with styles until you get a good looking document without wasting any of your allotted documents.
Parameter Name: doc[test]
Default: false
Options:
- true
- false
When test mode is on, generated PDFs will be watermarked, and generated Excel documents will be cut off after 20 rows. This parameter is optional.
Tag
An arbitrary tag string for the document. Useful if you have multiple applications using DocRaptor under the same DocRaptor account, and you want to differentiate in the logs between each app.
Parameter Name: doc[tag]
Strict
Specifies if DocRaptor should try to validate the html being sent. By default, we do validate the html and report any errors - more useful than getting back a malformed pdf file and wondering why. But sometimes you know that the html is valid enough to produce the pdf you want, and you want DocRaptor to stop complaining about errors in the document. In that case, set this parameter to 'none', and DocRaptor will try to create the document no matter how malformed the input html is.
Parameter Name: doc[strict]
Default: html
Options:
- html
- none
This parameter is only used when creating PDF files. When creating XLS files, DocRaptor can't do anything useful with malformed html, and will always attempt to validate the input for XLS files.
Javascript
If this parameter is set to true, DocRaptor will try to run any javascript in your html before we render it into a document. This parameter is false by default because loading and running external scripts adds a significant amount of time to document processing. If there are any errors running your javascript, the document creation process will fail (and the errors will be returned so you can see what went wrong). We currently have a 30 second timeout for loading all the assets and running the javascript.
Parameter Name: doc[javascript]
Default: false
Options:
- true
- false
Asynchronous Job
If this parameter is set to true, DocRaptor will queue your doc for background creation and send back JSON with a "status_id" key set. e.g.
{"status_id":"123454321"}
Making an authenticated request against http://docraptor.com/status/{status_id} will give you the status of your document job. The returned JSON from that call should look something like:
{"status":"completed", "download_url":"http://docraptor.com/download/12345asdf", "message":"Completed at Mon Jun 06 18:33:17 +0000 2011", "number_of_pages":2}
When the job is complete, DocRaptor will call the specified callback_url, if one was provided, via a POST request.
Querying the status URL after the doc has been successfully created will provide a download_url in the returned JSON. The value associated with that key is a 2-time use URL from which you can download your doc.
If DocRaptor encounters an error generating your document, the status value will be "failed". A key "validation_errors" will be set with a value corresponding to the reason for the failure. An example of this is:
{"status":"failed", "validation_errors":"Name can't be blank\nName is too long (maximum is 200 characters)"}
If your document has been queued but processing has not yet begun, it will have a status of "queued". If your document is currently being processed, it will have a status of "working".
Parameter Name: doc[async]
Default: false
Options:
- true
- false
If there is an error creating your document, the callback_url will never be called. The status page will explain the error.
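The status values described in this section ("queued", "working", "completed", "failed") can be handled with a small dispatcher over the returned JSON. This is a sketch only: interpret_status and its return shape are illustrative, and fetching the status URL is left to your HTTP client.

```python
import json


def interpret_status(body: str):
    """Turn a status response body into an (action, detail) pair."""
    data = json.loads(body)
    status = data.get("status")
    if status == "completed":
        # download_url is the limited-use URL for fetching the document.
        return ("download", data["download_url"])
    if status == "failed":
        # validation_errors is a newline-separated string in the example above.
        return ("error", data.get("validation_errors", "").split("\n"))
    if status in ("queued", "working"):
        return ("wait", status)
    return ("unknown", status)
```

A polling loop would call this after each status request and stop on "download" or "error".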
Asynchronous Callback URL
If this parameter is provided and the async parameter is set to true, DocRaptor will send a POST request to this URL after successfully completing an asynchronous job. The POST will contain the parameter "download_url" with the value being a url where your document can be downloaded.
Parameter Name: doc[callback_url]
If there is an error creating your document, the callback_url will never be called. The status page will explain the error.
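On the receiving side, a callback handler only needs to pull "download_url" out of the POST. The sketch below assumes the callback body is form-encoded, which is not stated above; download_url_from_callback is a hypothetical helper name.

```python
from urllib.parse import parse_qs


def download_url_from_callback(body: str):
    """Extract download_url from a form-encoded callback body, or None if absent."""
    values = parse_qs(body).get("download_url")
    return values[0] if values else None
```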
Response Headers
For PDF documents, the response headers will contain the number of pages contained in the document.
The response header is "X-DocRaptor-Num-Pages"
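Reading the page count is a matter of looking up that header. A minimal sketch, assuming your HTTP client exposes response headers as a mapping of names to string values; the lookup is case-insensitive because HTTP header names are.

```python
def page_count(headers):
    """Return the X-DocRaptor-Num-Pages value as an int, or None if missing or invalid."""
    for name, value in headers.items():
        if name.lower() == "x-docraptor-num-pages":
            try:
                return int(value)
            except ValueError:
                return None
    return None
```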
Coding Examples
In this section, you'll find examples for making requests to our servers. We provide full documentation for making HTTP POST requests using C#, Curl, jQuery, Node.js, PHP, Prototype.js, Python, Ruby, and Rails.
You can read through our documentation here, or check out our repositories located at Github. If you have an example you'd like to share let us know and we'll share it with the world.
If you would like to see a more complex example you can check out this Using Doc Raptor to create Excel Spreadsheets tutorial by the guys at Switch on the Code!
C Sharp Examples
We provide a simple CLI program example (as opposed to a full blown GUI solution) as well as a WPF example. You can find the entire solution for the WPF example at Github.
CLI Example
WPF Example
Curl Example
Java CLI Example
Jquery Examples
Here you'll find several examples for creating PDF and Excel files with jQuery. We provide documentation for creating documents from forms and URLs, as well as using a jQuery plugin written by one of our users.
jQuery Form Based
jQuery Url Based
jQuery plugin
David Baldwin made a nice DocRaptor jQuery plugin. Examples and usage details can be found on that page.
Node.js
This is a Restler-based example for creating documents with Node.js.
PHP Examples
We provide two PHP examples: the first requires pecl_http to be installed on your server, and the other is a PHP wrapper written by one of our users.
PHP Example using pecl_http
PHP Wrapper Example
One of our users wrote a nice PHP wrapper for DocRaptor as well: https://github.com/krewenki/php-docraptor
Prototype.js Example
Python Examples
Thanks to John Keyes, there is a python wrapper that closely matches the functionality of the official ruby gem. Below are a couple of examples of that wrapper in action.
Python Example
Python Async Example
Ruby Examples
We provide Ruby examples using the official DocRaptor gem, as well as using the HTTParty gem.
DocRaptor Gem Example
DocRaptor Gem using Async Functionality
HTTParty Example
Rails Example
We've provided a couple of excerpts from a complete Ruby on Rails example that you can get at Github.
Note: You can define a PDF layout to be used for all of your PDFs by creating an application.pdf.haml (or .erb) file. This technique can also be used to create an XLS layout.
Note: Rails runs development mode through webrick by default. Webrick is single-threaded, so if your app is hitting its other endpoints in order to generate PDFs, you'll want to use something that supports concurrent requests in development like unicorn, passenger, or thin. You could also run multiple webrick instances on different ports.
API Key Setup
Add this to config/environment.rb to define your API key:
Controller Example
Some sample code from a controller:
HAML Example
Sample HAML code to generate an Excel file:
PDF Styles
You should be able to send normal html and css to DocRaptor. We suggest you put any css in a style block in the head of your document to reduce external connections. We will download external resources (images, css, etc.) to produce your PDF. It will go faster the fewer resources we have to look up.
Images can be embedded directly as data URIs in the html document you send to DocRaptor.
It is important to use the correct character encoding and locale in your document if you are sending non-ASCII/Unicode characters in the html document.
More information about encoding.
- <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
- <html lang="pt-BR">
If you are attempting to use page breaks, make sure the element that the css page-break property is on is not floated or within a floated element, or the page break will not function.
Running javascript before the html is converted to a pdf is supported - see the javascript parameter. You can also read about prince-pdf-script, which allows js execution in the PDF reader at the time a PDF is opened.
We use Prince to generate PDFs. You can check out their documentation here.
PDF Options
We expose a number of extra options for Prince through our API that you can set. These correspond to the Prince command line options that you can see documented here.
Base URL
Specify the base URL of the input document.
Parameter Name: doc[prince_options][baseurl]
This can also be accomplished by using the HTML Base tag.
No XInclude
Disable XInclude processing.
Parameter Name: doc[prince_options][no_xinclude]
Default: false
Options:
- true
- false
No Network
Disable network access (prevents HTTP downloads).
Parameter Name: doc[prince_options][no_network]
Default: false
Options:
- true
- false
HTTP User
Specify the username for HTTP authentication.
Parameter Name: doc[prince_options][http_user]
HTTP Password
Specify the password for HTTP authentication.
Parameter Name: doc[prince_options][http_password]
HTTP Proxy
Specify the HTTP proxy server.
Parameter Name: doc[prince_options][http_proxy]
HTTP Timeout
Specify how long to wait, in seconds, when requesting an external resource while generating your document.
Parameter Name: doc[prince_options][http_timeout]
Insecure
Disable SSL verification (not recommended).
Parameter Name: doc[prince_options][insecure]
Default: false
Options:
- true
- false
Media
Specify the media type (e.g. print, screen).
Parameter Name: doc[prince_options][media]
No Author Stylesheets
Ignore author style sheets.
Parameter Name: doc[prince_options][no_author_style]
Default: false
Options:
- true
- false
No Default Stylesheets
Ignore default style sheets.
Parameter Name: doc[prince_options][no_default_style]
Default: false
Options:
- true
- false
No Embedded Fonts
Disable font embedding in PDF output.
Parameter Name: doc[prince_options][no_embed_fonts]
Default: false
Options:
- true
- false
No Subset Fonts
Disable font subsetting in PDF output.
Parameter Name: doc[prince_options][no_subset_fonts]
Default: false
Options:
- true
- false
No Compression
Disable compression of PDF output.
Parameter Name: doc[prince_options][no_compress]
Default: false
Options:
- true
- false
Encryption
Encrypt PDF output.
Parameter Name: doc[prince_options][encrypt]
Default: false
Options:
- true
- false
Key Bits
Set encryption key size.
Parameter Name: doc[prince_options][key_bits]
Options:
- 40
- 128
User Password
Set PDF user password.
Parameter Name: doc[prince_options][user_password]
Owner Password
Set PDF owner password.
Parameter Name: doc[prince_options][owner_password]
Disallow Print
Disallow printing of PDF output.
Parameter Name: doc[prince_options][disallow_print]
Default: false
Options:
- true
- false
Disallow Copy
Disallow copying from PDF output.
Parameter Name: doc[prince_options][disallow_copy]
Default: false
Options:
- true
- false
Disallow Annotate
Disallow annotation of PDF output.
Parameter Name: doc[prince_options][disallow_annotate]
Default: false
Options:
- true
- false
Disallow Modify
Disallow modification of PDF output.
Parameter Name: doc[prince_options][disallow_modify]
Default: false
Options:
- true
- false
Input Type
Specify the input type of the document to be used by prince during processing.
Parameter Name: doc[prince_options][input]
Default: html
Options:
- html
- xml
- auto
Prince Version
Specify the version of Prince to use.
Parameter Name: doc[prince_options][version]
Options:
- 7.1
- 8.1
If no parameter is supplied, this is based on your user account setting. Anyone who signed up for DocRaptor prior to August 1, 2012 will default to 7.1. This is adjustable via the “Edit Profile” link on your user dashboard.
Prince Javascript
Use the built-in Prince javascript engine.
Parameter Name: doc[prince_options][javascript]
Default: false
Options:
- true
- false
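All of the option names in this section share the doc[prince_options][...] pattern, so they can be generated from a plain dictionary. A sketch under assumptions: prince_option_fields is a made-up helper, and sending booleans as the strings "true" and "false" is a guess based on how the options are listed here.

```python
def prince_option_fields(options: dict) -> dict:
    """Map {"media": "print"} style options to doc[prince_options][...] form fields."""
    fields = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # Assumption: booleans travel as the literal strings "true" / "false".
            value = "true" if value else "false"
        fields[f"doc[prince_options][{key}]"] = str(value)
    return fields
```

The resulting fields can be merged into the same form body as the other doc[...] parameters.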
Excel Styles
The general technique for producing an XLS file is to send us some html in the form of a table per worksheet, with the table's rows and cells corresponding to the same in Excel. Below is a picture of a simple example transformation which also demonstrates the use of named worksheets (via the name attribute).
You can style cells, rows, and the entire table using style attributes, and those attributes
cascade. We don't yet support writing arbitrary css style blocks. Soon, though. Below is a
picture of a simple background-color example transformation.
Read through our coding examples for larger examples.
Excel XLS Version Support
We currently produce Excel '97 compatible XLS files. As such, features added to excel later than that are not currently supported.
Special Table & Cell Attributes
Several element attributes have special meaning in DocRaptor. Below is a picture of those in action.
table:name
Setting the name attribute on a table element will name the sheet produced by the table.
table:password
Setting the password attribute on a table element will password protect the sheet produced by the table with the given password. By default this means that all cells in the sheet will be readonly, unless the password is entered. You can control what cells will be readonly using the -xls-locked style.
td:colspan
Setting the colspan on a table cell will create a merged cell from cells to the right of the current cell.
td:rowspan
Setting the rowspan on a table cell will create a merged cell from cells below the current cell.
Multiple Worksheets
Creating multiple worksheets is easy. Just send more than one table in your request,
wrapped inside a "tables" tag.
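As an illustration of the structure just described, the sketch below assembles named tables inside a "tables" element. The helper names are hypothetical, and the output is only the document content string, not a full request.

```python
from html import escape


def worksheet(name, rows):
    """Render one worksheet as a table whose name attribute names the sheet."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f'<table name="{escape(name)}">{body}</table>'


def workbook(sheets):
    """Wrap several worksheets in a single tables element."""
    return "<tables>" + "".join(worksheet(name, rows) for name, rows in sheets) + "</tables>"
```

Calling workbook([("Sales", [[1, 2]]), ("Costs", [["x"]])]) produces one table per sheet, in order.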
Specific Styles
What follows is a list of styles we support as part of a style attribute's value and the options they take. Excel-specific styles have been prefixed with ‘-xls-’. The options should more or less correspond to the options found via “Format Cell” in Excel.
-xls-content-type
The content type for the cell in Excel.
Default: auto
text-align
The horizontal alignment for cell content.
Default: general
vertical-align
The vertical alignment for cell content.
Default: bottom
text-indent
Amount of indentation of the cell content. Integer value from 0 to 14.
Default: 0
white-space
Cell content wrapping. If set to wrap, then Excel will wrap data in cells with this format so that it fits within the cell boundaries.
Default: nowrap
- nowrap
- wrap
-xls-text-orientation
Sets the text orientation for this cell.
Default: horizontal
- horizontal (0)
- vertical (90)
- stacked
- 0
- 45
- 90
- 270
- 315
- 360
Arbitrary amounts are not allowed. The option closest to what you pass us will be chosen.
360 is equivalent to 0.
-xls-background-pattern
Sets the background pattern.
Default: none (solid if background-color is set)
- none
- 6.25%
- 12.5%
- 25%
- 50%
- 75%
- solid
- horizontal stripe
- vertical stripe
- reverse diagonal stripe
- diagonal stripe
- diagonal crosshatch
- thick diagonal crosshatch
- thin horizontal stripe
- thin vertical stripe
- thin reverse diagonal stripe
- thin diagonal stripe
- thin horizontal crosshatch
- thin diagonal crosshatch
background-color
Sets the background color for the cell. Can take any named web color or hex value.
Default: transparent (grey if -xls-background-pattern is set)
These colors will be translated to the closest of the ~64 valid colors for Excel. Certain colors (such as black) will cause Excel to ignore the background pattern you set.
border-top-color, border-bottom-color, border-left-color, border-right-color
Sets the border color for the cell. Can take any named web color or hex value.
Default: transparent (if a border color is set, the default will be black)
These colors will be translated to the closest of the ~64 valid colors for Excel.
border-top-style, border-bottom-style, border-left-style, border-right-style
Sets the border style (lines that appear around a cell)
Default: none (if color is set, the default is thin).
- none
- thin
- medium
- dashed
- dotted
- thick
- double
- hair
- medium dashed
- dash dot
- medium dash dot
- dash dot dot
- medium dash dot dot
- slanted dash dot
compact border syntax
Coming Soon (but not here yet) (like your standard CSS)!
font-family
Sets the font family.
Default: Arial
font-size
Sets the font size in points.
Default: 10pt
font-style
Sets if the font should be italic or not.
Default: normal
- normal
- italic
font-weight
Sets the weight of the font.
Default: normal
- normal (400)
- bold (700)
- bolder (900)
- lighter (200)
- 100
- 200
- 300
- 400
- 500
- 600
- 700
- 800
- 900
text-decoration
Sets the text decoration.
Default: none
- none
- line-through
- underline
color
Sets the text color for the cell. Can take any named web color or hex value.
Default: black
These colors will be translated to the closest of the ~64 valid colors for Excel.
-xls-format
Sets the number/date format for the cell. There are many possible options for this. A few of the important ones are below, with more documentation to come in the future. As a warning, if you use a number format on a text or date cell, the results may be unpredictable.
Default: default
- default
- text
- integer
- float
- percent float
- percent integer
- accounting float
- accounting integer
- accounting red float
- accounting red integer
- exponential
height
Sets the height of a row. Only valid on tr elements.
Default: auto
width
Sets the width of a column. The last width specified for a column wins (i.e., if you specify the width for a column in both row 1 and row 2, the width specified in row 2 is used).
Default: auto
-xls-locked
Sets if this cell is locked. Only has meaning if a password has been set for the sheet that will contain this cell.
Default: true
- true
- false
-xls-thousands-delimiter
When reading values for cells, what character delimits large numbers (i.e. 1 million written as ‘1,000,000’ is delimited by the comma character). If you are using this, you probably want to set ‘-xls-decimal-delimiter’, too.
Default: ,
-xls-decimal-delimiter
When reading values for cells, what character delimits the beginning of the decimal portion of numbers (i.e. 11/10 written as 1.1 is delimited by the period character). If you are using this, you probably want to set ‘-xls-thousands-delimiter’, too.
Default: .
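Since these styles are all set through a cell's style attribute, a small helper can keep them readable in code. A sketch, assuming the usual "property: value" pairs separated by semicolons; styled_cell is an illustrative name.

```python
from html import escape


def style_attr(styles: dict) -> str:
    """Join style properties into a single style attribute value."""
    return "; ".join(f"{prop}: {value}" for prop, value in styles.items())


def styled_cell(content, styles: dict) -> str:
    """Render a td element carrying the given styles."""
    return f'<td style="{escape(style_attr(styles))}">{escape(str(content))}</td>'
```

For instance, styled_cell("Total", {"font-weight": "bold", "background-color": "yellow"}) gives a bold cell with a yellow background.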
Referrer-based Document Generation
DocRaptor makes it easy to convert any webpage you have control over into a document using a simple anchor tag. On your account management page, you can add domains you would like to link to DocRaptor. Requests to create docs that have one of those domains as part of their HTTP_REFERER HTTP header will be generated using your account without the need for an API key. Click “manage domains” after logging in to manage your domains!
URL
Once you've set up your domains, you can make a GET request against either:
- http://docraptor.com/docs/from_site
- https://docraptor.com/docs/from_site
Example Code
Live Example
Doc Listing
You can also get a list of previously created documents through the API. This is just information like the name, the date, and if it was a test document. Since we don't actually store the created document, we can't return that. Info about the documents is returned as xml in a paginated list, ordered by date of creation (most recent first).
URL
You can make a GET request against either:
- http://docraptor.com/docs
- https://docraptor.com/docs (if you feel the need for ssl)
Parameters
The following are the parameters that DocRaptor expects when you request the document listing.
Page
Specifies the page (in terms of pagination) of documents to return
Name: page
Default: 1
Per Page
Specifies the number of documents per page (in terms of pagination) to return
Name: per_page
Default: 100
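A listing request is a GET against the docs URL with the two pagination parameters. The sketch below only builds the URL; passing the API key as user_credentials here is an assumption carried over from the Authentication section, and listing_url is a hypothetical helper.

```python
from urllib.parse import urlencode

LISTING_URL = "https://docraptor.com/docs"  # same path as document creation, requested with GET


def listing_url(api_key: str, page: int = 1, per_page: int = 100) -> str:
    """Build the URL for one page of the document listing (defaults match those above)."""
    query = urlencode({"user_credentials": api_key, "page": page, "per_page": per_page})
    return LISTING_URL + "?" + query
```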
