API
Introduction
Creating documents using DocRaptor is extremely simple. All you have to do is make a POST request with some parameters!
URL
You can post against either:
- http://docraptor.com/docs
- https://docraptor.com/docs (if you need SSL)
Authentication
Authentication uses your API key, which can be found on your account dashboard. You can supply the API key in one of two ways: as the value of the query parameter "user_credentials", or as the username for HTTP Basic authentication. The examples page shows both of these methods.
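The two authentication methods can be sketched in Python roughly as follows. This is an illustration rather than official client code: the helper names and the placeholder key are hypothetical, and both the form-encoded request body and the empty Basic auth password are assumptions that this page does not state.

```python
import base64
import urllib.parse
import urllib.request

DOCS_URL = "https://docraptor.com/docs"
API_KEY = "YOUR_API_KEY_HERE"  # placeholder; use the key from your account dashboard


def request_with_query_key(form_fields):
    """Method 1: pass the API key as the user_credentials query parameter."""
    url = DOCS_URL + "?" + urllib.parse.urlencode({"user_credentials": API_KEY})
    body = urllib.parse.urlencode(form_fields).encode("utf-8")  # form encoding assumed
    return urllib.request.Request(url, data=body, method="POST")


def request_with_basic_auth(form_fields):
    """Method 2: pass the API key as the HTTP Basic auth username."""
    # An empty password is an assumption; only the username is documented.
    token = base64.b64encode((API_KEY + ":").encode("utf-8")).decode("ascii")
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    request = urllib.request.Request(DOCS_URL, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    return request
```

Either request object can then be passed to urllib.request.urlopen to perform the actual POST.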
Parameters
The following are the parameters that DocRaptor expects when you make a POST request to create a document. To see all of these values in use, check out the examples page.
Document Type
Specifies what type of document DocRaptor should try to create from the provided content.
Name: doc[document_type]
Options:
- pdf
- xls
Document Content
The content that DocRaptor should use to create the document.
e.g. “<table><tr><td>Example!</td></tr></table>”
Name: doc[document_content]
You must supply either this or the following parameter, document_url.
Document URL
The URL that DocRaptor should request the content from to create the document.
e.g. “http://www.docraptor.com/documentation”
Name: doc[document_url]
You must supply either this or the previous parameter, document_content.
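As a rough sketch of the either/or rule above, a small Python helper could assemble the form fields like this. The function name is hypothetical, and rejecting a request that sets both fields is a conservative assumption, since this page does not say what happens in that case.

```python
import urllib.parse


def build_doc_fields(document_type, content=None, url=None):
    """Return the doc[...] form fields for a create-document POST."""
    if content is None and url is None:
        raise ValueError("supply document_content or document_url")
    if content is not None and url is not None:
        # Behaviour with both set is undocumented, so this sketch refuses it.
        raise ValueError("supply only one of document_content or document_url")
    fields = {"doc[document_type]": document_type}
    if content is not None:
        fields["doc[document_content]"] = content
    else:
        fields["doc[document_url]"] = url
    return fields


fields = build_doc_fields("xls", content="<table><tr><td>Example!</td></tr></table>")
body = urllib.parse.urlencode(fields)  # form-encoded body, ready to POST
```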
Name
A name for the document. This can be any string that you find meaningful to describe this document - it is just used for identification purposes on the account dashboard.
Parameter Name: doc[name]
Test
Specifies whether this document should be created in test mode. Test mode documents do not count against your monthly document quota, so you can experiment with styles until the document looks right without using up any of your allotted documents.
Parameter Name: doc[test]
Default: false
Options:
- true
- false
In test mode, PDF files are watermarked and Excel files are cut off after 20 rows. This parameter is optional.
Tag
An arbitrary tag string for the document. Useful if you have multiple applications using DocRaptor under the same DocRaptor account, and you want to differentiate in the logs between each app.
Parameter Name: doc[tag]
Strict
Specifies whether DocRaptor should validate the HTML being sent. By default, we validate the HTML and report any errors, which is more useful than getting back a malformed PDF file and wondering why. But sometimes you know that the HTML is valid enough to produce the PDF you want, and you want DocRaptor to stop complaining about errors in the document. In that case, set this parameter to 'none', and DocRaptor will try to create the document no matter how malformed the input HTML is.
Parameter Name: doc[strict]
Default: html
Options:
- html
- none
This parameter is only used when creating PDF files. When creating XLS files, DocRaptor can't do anything useful with malformed HTML, so it will always validate the input for XLS files.
Javascript
If this parameter is set to true, DocRaptor will try to run any JavaScript in your HTML before rendering it into a document. This parameter is false by default because loading and running external scripts adds a significant amount of time to document processing. If there are any errors running your JavaScript, document creation will fail (and the errors will be returned so you can see what went wrong). We currently have a 10 second timeout for loading all the assets and running the JavaScript, so make sure you don't write any infinite loops :)
Parameter Name: doc[javascript]
Default: false
Options:
- true
- false
This parameter is only used when creating PDF files. We don't accept any JavaScript when creating XLS files at the moment. If you need it, let us know, and we might expand this feature.
Asynchronous Job
If this parameter is set to true, DocRaptor will queue your doc for background creation and send back JSON with a "status_id" key set. e.g.
{"status_id":"123454321"}
Making an authenticated request against http://docraptor.com/status/{status_id} will give you the status of your document job. The returned JSON from that call should look something like:
{"status":"completed", "download_url":"http://docraptor.com/download/12345asdf", "message":"Completed at Mon Jun 06 18:33:17 +0000 2011", "number_of_pages":2}
When the job is complete, DocRaptor will call the specified callback_url, if one was provided, via a POST request.
Querying the status URL after the doc has been successfully created will provide a download_url in the returned JSON. The value associated with that key is a URL from which you can download your doc; it can be used at most two times.
If DocRaptor encounters an error generating your document, the status value will be "failed". A key "validation_errors" will be set with a value corresponding to the reason for the failure. An example of this is:
{"status":"failed", "validation_errors":"Name can't be blank\nName is too long (maximum is 200 characters)"}
If your document has been queued but processing has not yet begun, it will have a status of "queued". If your document is currently being processed, it will have a status of "working".
Parameter Name: doc[async]
Default: false
Options:
- true
- false
If there is an error creating your document, the callback_url will never be called. The status page will explain the error.
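The status values described in this section suggest a simple polling loop. The Python sketch below is one possible shape, not DocRaptor's own client: the function name, the polling interval, and the poll limit are hypothetical, and fetch_status stands in for whatever code you write to make the authenticated request against the status URL and decode its JSON.

```python
import time


def wait_for_document(fetch_status, status_id, poll_seconds=2.0, max_polls=30):
    """Poll the status of an async job until it completes or fails.

    fetch_status is a caller-supplied function (hypothetical) that performs the
    authenticated GET for the given status_id and returns the JSON as a dict.
    """
    for _ in range(max_polls):
        status = fetch_status(status_id)
        state = status.get("status")
        if state == "completed":
            return status["download_url"]
        if state == "failed":
            raise RuntimeError(status.get("validation_errors", "document failed"))
        # "queued" and "working" both mean the job is not finished yet.
        time.sleep(poll_seconds)
    raise TimeoutError("document not ready after %d polls" % max_polls)
```

Because the download URL has a limited number of uses, it is probably best to download the file once and store it yourself rather than fetching it repeatedly.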
Asynchronous Callback URL
If this parameter is provided and the async parameter is set to true, DocRaptor will send a POST request to this URL after successfully completing an asynchronous job. The POST will contain the parameter "download_url" with the value being a url where your document can be downloaded.
Parameter Name: doc[callback_url]
If there is an error creating your document, the callback_url will never be called. The status page will explain the error.
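On the receiving side, your callback endpoint needs to pull download_url out of the POST. A minimal Python sketch follows, assuming the callback body is form-encoded (this page only says the POST "will contain the parameter"); the function name is hypothetical.

```python
import urllib.parse


def download_url_from_callback(raw_body):
    """Extract download_url from the raw callback POST body (bytes)."""
    # Form encoding is an assumption; adjust if the callback arrives differently.
    params = urllib.parse.parse_qs(raw_body.decode("utf-8"))
    values = params.get("download_url")
    if not values:
        raise ValueError("callback body has no download_url parameter")
    return values[0]
```

Since the callback is never sent for failed jobs, a handler like this is best paired with a fallback check of the status URL.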
Response Headers
For PDF documents, the response headers will contain the number of pages contained in the document.
The response header is "X-DocRaptor-Num-Pages".
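Reading the page count from a response might look like the Python sketch below. The helper name is hypothetical, and the header value is assumed to be a plain decimal integer.

```python
def page_count(headers):
    """Return the PDF page count from a mapping of response headers, or None."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("x-docraptor-num-pages")
    return int(value) if value is not None else None
```

The None result covers XLS documents, for which no page count header is described.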
PDF Styles
Introduction
You should be able to send normal HTML and CSS to DocRaptor. We suggest you put any CSS in a style block in the head of your document to reduce external connections. We will download external resources (images, CSS, etc.) to produce your PDF, but the fewer resources we have to fetch, the faster your document will be generated.
Images can be embedded directly as data URIs in the html document you send to DocRaptor.
It is important to declare the correct character encoding and locale in your document if the HTML you send contains non-ASCII characters. For example:
- <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
- <html lang="pt-BR">
If you are attempting to use page breaks, make sure the element that the CSS page-break property is on is not floated or within a floated element, or the page break will not function.
Running JavaScript before the HTML is converted to a PDF is supported - see the javascript parameter. You can also read about prince-pdf-script, which allows JavaScript execution in the PDF reader at the time a PDF is opened.
We use Prince to generate PDFs. See the Prince documentation for details.
PDF Options
Base URL
Specify the base URL of the input document.
Parameter Name: doc[prince_options][baseurl]
This can also be accomplished by using the HTML Base tag.
No XInclude
Disable XInclude processing.
Parameter Name: doc[prince_options][no_xinclude]
Default: false
Options:
- true
- false
No Network
Disable network access (prevents HTTP downloads).
Parameter Name: doc[prince_options][no_network]
Default: false
Options:
- true
- false
HTTP User
Specify the username for HTTP authentication.
Parameter Name: doc[prince_options][http_user]
HTTP Password
Specify the password for HTTP authentication.
Parameter Name: doc[prince_options][http_password]
HTTP Proxy
Specify the HTTP proxy server.
Parameter Name: doc[prince_options][http_proxy]
Insecure
Disable SSL verification (not recommended).
Parameter Name: doc[prince_options][insecure]
Default: false
Options:
- true
- false
Media
Specify the media type (e.g. print, screen).
Parameter Name: doc[prince_options][media]
No Author Stylesheets
Ignore author style sheets.
Parameter Name: doc[prince_options][no_author_style]
Default: false
Options:
- true
- false
No Default Stylesheets
Ignore default style sheets.
Parameter Name: doc[prince_options][no_default_style]
Default: false
Options:
- true
- false
No Embed Fonts
Disable font embedding in PDF output.
Parameter Name: doc[prince_options][no_embed_fonts]
Default: false
Options:
- true
- false
No Subset Fonts
Disable font subsetting in PDF output.
Parameter Name: doc[prince_options][no_subset_fonts]
Default: false
Options:
- true
- false
No Compression
Disable compression of PDF output.
Parameter Name: doc[prince_options][no_compress]
Default: false
Options:
- true
- false
Encrypt
Encrypt PDF output.
Parameter Name: doc[prince_options][encrypt]
Default: false
Options:
- true
- false
Key Bits
Set encryption key size.
Parameter Name: doc[prince_options][key_bits]
Options:
- 40
- 128
User Password
Set PDF user password.
Parameter Name: doc[prince_options][user_password]
Owner Password
Set PDF owner password.
Parameter Name: doc[prince_options][owner_password]
Disallow Print
Disallow printing of PDF output.
Parameter Name: doc[prince_options][disallow_print]
Default: false
Options:
- true
- false
Disallow Copy
Disallow copying from PDF output.
Parameter Name: doc[prince_options][disallow_copy]
Default: false
Options:
- true
- false
Disallow Annotate
Disallow annotation of PDF output.
Parameter Name: doc[prince_options][disallow_annotate]
Default: false
Options:
- true
- false
Disallow Modify
Disallow modification of PDF output.
Parameter Name: doc[prince_options][disallow_modify]
Default: false
Options:
- true
- false
Input Type
Specify the input type of the document to be used by Prince during processing.
Parameter Name: doc[prince_options][input]
Default: html
Options:
- html
- xml
- auto
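All of the PDF options above share the doc[prince_options][...] naming pattern, so they can be generated from a plain dictionary. The Python sketch below shows one way to do that; the helper name is hypothetical, and sending booleans as the strings "true" and "false" is an assumption based on the option values listed above.

```python
def prince_option_fields(options):
    """Map {"media": "print"} to {"doc[prince_options][media]": "print"}."""
    fields = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # Booleans are written as the documented option strings.
            value = "true" if value else "false"
        fields["doc[prince_options][%s]" % key] = str(value)
    return fields


pdf_options = prince_option_fields({"media": "print", "encrypt": True, "key_bits": 128})
```

The resulting dictionary can be merged with the other doc[...] fields before the request body is encoded.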
Excel XLS Styles
Introduction
The general technique for producing an XLS file is to send us some HTML in the form of one table per worksheet, with the tables' rows and cells corresponding to rows and cells in Excel. Below is a picture of a simple example transformation which also demonstrates the use of named worksheets (via the name attribute).
You can style cells, rows, and the entire table using style attributes, and those attributes cascade. We don't yet support arbitrary CSS style blocks. Soon, though. Below is a picture of a simple background-color example transformation.
See the examples page for larger examples.
Excel XLS Version Support
We currently produce Excel 97 compatible XLS files. As such, features added to Excel after that version are not currently supported.
Special Table & Cell Attributes
Several element attributes have special meaning in DocRaptor. Below is a picture of those in action.
table:name
Setting the name attribute on a table element will name the sheet produced by the table.
table:password
Setting the password attribute on a table element will password protect the sheet produced by the table with the given password. By default this means that all cells in the sheet will be read-only unless the password is entered. You can control which cells are read-only using the -xls-locked style.
td:colspan
Setting the colspan on a table cell will create a merged cell from cells to the right of the current cell.
td:rowspan
Setting the rowspan on a table cell will create a merged cell from cells below the current cell.
Multiple Worksheets
Creating multiple worksheets is easy. Just send more than one table in your request,
wrapped inside a "tables" tag.
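As an illustration of the multiple-worksheet layout, the Python sketch below generates one named table per sheet and wraps them in a tables element. The helper names and the sample data are hypothetical.

```python
from html import escape


def worksheet(name, rows):
    """Render one worksheet as an HTML table named via the name attribute."""
    body = "".join(
        "<tr>" + "".join("<td>%s</td>" % escape(str(cell)) for cell in row) + "</tr>"
        for row in rows
    )
    return '<table name="%s">%s</table>' % (escape(name, quote=True), body)


def workbook(sheets):
    """Wrap several worksheet tables in a single tables element."""
    return "<tables>" + "".join(worksheet(name, rows) for name, rows in sheets) + "</tables>"


html_doc = workbook([("Sales", [["Q1", 10]]), ("Costs", [["Q1", 7]])])
```

The resulting string is what you would send as doc[document_content] with the xls document type.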
Specific Styles
What follows is a list of styles we support as part of a style attribute's value and the options they take. Excel-specific styles have been prefixed with ‘-xls-’. The options should more or less correspond to the options found via “Format Cell” in Excel.
-xls-content-type
The content type for the cell in Excel.
Default: auto
text-align
The horizontal alignment for cell content.
Default: general
vertical-align
The vertical alignment for cell content.
Default: bottom
text-indent
Amount of indentation of the cell content. Integer value from 0 to 14.
Default: 0
white-space
Cell content wrapping. If set to wrap, then Excel will wrap data in cells with this format so that it fits within the cell boundaries.
Default: nowrap
Options:
- nowrap
- wrap
-xls-text-orientation
Sets the text orientation for this cell.
Default: horizontal
Options:
- horizontal (0)
- vertical (90)
- stacked
- 0
- 45
- 90
- 270
- 315
- 360
Arbitrary amounts are not allowed. The option closest to what you pass us will be chosen.
360 is equivalent to 0.
-xls-background-pattern
Sets the background pattern.
Default: none (solid if background-color is set)
Options:
- none
- 6.25%
- 12.5%
- 25%
- 50%
- 75%
- solid
- horizontal_stripe
- vertical_stripe
- reverse_diagonal_stripe
- diagonal_stripe
- diagonal_crosshatch
- thick_diagonal_crosshatch
- thin_horizontal_stripe
- thin_vertical_stripe
- thin_reverse_diagonal_stripe
- thin_diagonal_stripe
- thin_horizontal_crosshatch
- thin_diagonal_crosshatch
background-color
Sets the background color for the cell. Can take any named web color or hex value.
Default: transparent (grey if -xls-background-pattern is set)
These colors will be translated to the closest of the ~64 valid colors for Excel. Certain colors (such as black) will cause Excel to ignore the background pattern you set.
border-color, border-top-color, border-bottom-color, border-left-color, border-right-color
Sets the border color for the cell. Can take any named web color or hex value.
Default: transparent (if a border color is set, the default will be black)
These colors will be translated to the closest of the ~64 valid colors for Excel.
border-style, border-top-style, border-bottom-style, border-left-style, border-right-style
Sets the border style (lines that appear around a cell)
Default: none (if color is set, the default is thin).
Options:
- none
- thin
- medium
- dashed
- dotted
- thick
- double
- hair
- medium_dashed
- dash_dot
- medium_dash_dot
- dash_dot_dot
- medium_dash_dot_dot
- slanted_dash_dot
compact border syntax
A shorter way of setting all the borders at once. e.g. border: styles
Default: none
Example styles:
- medium #3366FF
- dash_dot_dot pink
- etc.
font-family
Sets the font family.
Default: Arial
font-size
Sets the font size in points.
Default: 10pt
font-style
Sets if the font should be italic or not.
Default: normal
Options:
- normal
- italic
font-weight
Sets the weight of the font.
Default: normal
Options:
- normal (400)
- bold (700)
- bolder (900)
- lighter (200)
- 100
- 200
- 300
- 400
- 500
- 600
- 700
- 800
- 900
text-decoration
Sets the text decoration.
Default: none
Options:
- none
- line-through
- underline
color
Sets the text color for the cell. Can take any named web color or hex value.
Default: black
These colors will be translated to the closest of the ~64 valid colors for Excel.
-xls-format
Sets the number/date format for the cell. There are many possible options for this. A few of the important ones are below, with more documentation to come in the future. As a warning, if you use a number format on a text or date cell, the results may be unpredictable.
Default: default
Options:
- default
- text
- integer
- float
- percent float
- percent integer
- accounting float
- accounting integer
- accounting red float
- accounting red integer
- exponential
height
Sets the height of a row. Only valid on tr elements.
Default: auto
width
Sets the width of a column. The last width specified for a column wins (i.e., if you specify the width for a column in both row 1 and row 2, the width specified in row 2 is used).
Default: auto
-xls-locked
Sets if this cell is locked. Only has meaning if a password has been set for the sheet that will contain this cell.
Default: true
Options:
- true
- false
-xls-thousands-delimiter
When reading values for cells, the character that delimits groups of thousands in large numbers (e.g. 1 million written as ‘1,000,000’ is delimited by the comma character). If you are using this, you probably want to set ‘-xls-decimal-delimiter’, too.
Default: ,
-xls-decimal-delimiter
When reading values for cells, the character that marks the beginning of the decimal portion of numbers (e.g. 11/10 written as 1.1 is delimited by the period character). If you are using this, you probably want to set ‘-xls-thousands-delimiter’, too.
Default: .
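To show how the styles above combine in a style attribute, here is a small Python sketch that renders a styled cell. The helper name is hypothetical, and the usual CSS "property: value" syntax separated by semicolons is assumed for the attribute value.

```python
from html import escape


def styled_cell(text, styles):
    """Render a td whose style attribute carries the given declarations."""
    declarations = "; ".join("%s: %s" % (prop, value) for prop, value in styles.items())
    return '<td style="%s">%s</td>' % (escape(declarations, quote=True), escape(str(text)))


cell = styled_cell(
    "Total",
    {"font-weight": "bold", "background-color": "#3366FF", "border": "medium #3366FF"},
)
```

Keeping the declarations in a dictionary makes it easy to reuse one set of styles across many cells or rows.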
Referrer-based Document Generation
Introduction
DocRaptor makes it easy to convert any webpage you control into a document using a simple anchor tag. On your account management page, you can add the domains you would like to link to DocRaptor. Requests to create docs that have one of those domains as part of their HTTP_REFERER HTTP header will be generated using your account, without the need for an API key. Click “manage domains” after logging in to manage your domains!
URL
Once you've set up your domains, you can make a GET request against either:
- http://docraptor.com/docs/from_site
- https://docraptor.com/docs/from_site
Example Code
Live Example
Documentation PDF
Doc Listing
Introduction
You can also get a list of previously created documents through the API. The list contains only information such as the name, the date, and whether it was a test document. Since we don't store the created document itself, we can't return it. Info about the documents is returned as XML in a paginated list, ordered by date of creation (most recent first).
URL
You can make a GET request against either:
- http://docraptor.com/docs
- https://docraptor.com/docs (if you need SSL)
Parameters
The following are the parameters that DocRaptor expects when you request the document listing.
Page
Specifies the page (in terms of pagination) of documents to return
Name: page
Default: 1
Per Page
Specifies the number of documents per page (in terms of pagination) to return
Name: per_page
Default: 100
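Building the listing request URL with the pagination parameters could look like the Python sketch below. The function name is hypothetical, and passing the API key as user_credentials on this endpoint is an assumption carried over from the Authentication section.

```python
import urllib.parse


def listing_url(api_key, page=1, per_page=100):
    """Build the GET URL for one page of the document listing."""
    query = urllib.parse.urlencode(
        {"user_credentials": api_key, "page": page, "per_page": per_page}
    )
    return "https://docraptor.com/docs?" + query


second_page = listing_url("YOUR_API_KEY_HERE", page=2, per_page=50)
```

The response is XML, so a parser such as xml.etree.ElementTree would be a natural next step; the element names are not described on this page, so none are assumed here.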