DocRaptor Blog

Tips and tricks for our HTML to Excel and PDF API

Prince 10 Is Now Available

We are pleased to announce that Prince 10 is now available for all DocRaptor PDF customers. You can use the version attribute in your API calls, or simply change your default version to 10 to take advantage of new features and better performance today!

You can find your current default Prince version on your settings page, right under your API key.

With Prince 10 you can take advantage of:

  • Better SVG support
  • Parallelized asset fetching for faster PDF generation with many images or embedded elements
  • Several improvements to Prince’s JavaScript engine
  • Better CSS support for landscape layouts
  • Many other CSS options and improvements
  • Font improvements, including fixes for some international fonts
  • Support for the HTML5 template element
  • Support for UTF16 encoded HTML documents
  • And more!

All of the new features are listed in PrinceXML’s release notes. Try it out today and let us know what you think about the latest PDF rendering updates from Prince! If there are new features you need that require additional support from us, don’t hesitate to ask!

Adding PDF Generation to Your Rails App

One of our new users recently added DocRaptor to his Rails app. Ryan was kind enough to document his process, and he thought a step-by-step guide for Ruby on Rails would be awesome content for our blog.

We agreed, and we fired up a fresh Rails 4 application to show you just how easy it is to add DocRaptor to your own Rails app. Many of our users rely on DocRaptor to create customized reports, which sounds like a great idea for a practical example. Good news - getting started is super easy!

Step One: Add the doc_raptor gem to your gem file

Including the doc_raptor gem is the easiest way to add DocRaptor to your Rails app.

Step Two: Add your API key to an initializer

You’ll need to add your DocRaptor API key to your project, and we recommend adding it to an initializer. This will ensure your API key is initialized when Rails loads, allowing you to easily add document generation to other models. No DocRaptor account? No problem - you can get your API key by signing up here.

Step Three: Define a model for PDF generation

You’ll define a Printable class with a sales method, which generates some dummy data to create a table of sales figures. You can call this method in your controller, then render the result in your view template.
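As a rough illustration, here is a minimal sketch of what such a model might look like. It is written as a plain Ruby class so it runs outside Rails, and the product names, field names, and figures are made up for this example rather than taken from the original tutorial.

```ruby
# Hypothetical stand-in for the Printable model described above.
# Plain Ruby (no ActiveRecord) so the dummy data is easy to inspect.
class Printable
  PRODUCTS = ['Widgets', 'Gadgets', 'Gizmos'].freeze # made-up product names

  # Returns one row of dummy sales figures per product.
  def self.sales
    PRODUCTS.each_with_index.map do |product, index|
      units = (index + 1) * 10
      unit_price = 25.0
      { product: product, units: units, revenue: units * unit_price }
    end
  end
end

# Print the rows the way a view template might loop over them.
Printable.sales.each do |row|
  puts format('%-10s %4d units  $%.2f', row[:product], row[:units], row[:revenue])
end
```

In a controller, you would assign the result of `Printable.sales` to an instance variable and loop over it in the view to build the table.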

Step Four: Create a controller action to create your PDF

You do this by adding the following code to your controller. Let’s create a PDF using the index action of your PrintablesController. The doc_raptor_send method defined below handles the interaction with DocRaptor to convert the template HTML to PDF.

Step Five: Update your routes

You’ll want to update your routes.rb file so you can properly load the data pulled from your model into your view before converting it into a PDF.

Step Six: Add a template for your PDF

Here’s a sample template that pulls information from your Printable model. This template also adds a button that can be clicked to download a generated PDF of the sales report.

Here’s what our index looks like:

DocRaptor index view

Now you can just click that “Download Report” link, and you’re done! I bet you can’t believe it could be so easy. Fire up a server, head back to your view and click the link in your printables view to generate your PDF. Here’s what this sample generated sales report looks like. Please note: we’re aware it’s woefully understyled.

From here, it’s simply a matter of tweaking your input HTML until you get the PDF you want. You can check out our samples page to get ideas for customizing your PDFs, with features like running headers and footers, custom font support, and dynamically generated tables of contents.

Do you have some crazy use case and you’d like a hand? Do you have a regular use case and you’d still like a hand? Send us an email - we’d love to help!

Implementing Chargify Webhooks

If you are a heavy user of Chargify (as we are), you probably need to get information into your system when an event like payment or product change happens on Chargify’s end. Hopefully, you are aware that postbacks will no longer be available starting July 1st 2015, in favor of their webhook system. If you are not using the postback system, don’t worry, I’m only going to mention it one more time.

Postbacks!

For every piece of information that lives in both places, you need to decide which end will be the Source of Authority for that information. It’s so important I capitalized it!

For data like your user’s name, address (email or otherwise), and application-specific data (for DocRaptor, things like what version of Prince to use as your default), your application should be authoritative. You can (and in fact, must) push some user data to Chargify in order for the system to work, but there is no reason your system should accept changes from Chargify for a change that originated inside your system.

If your application and Chargify disagree about the value for any given piece of data, when do you want to use Chargify’s version? For us, it’s a very small amount of data:

  • Current subscription period started at
  • Subscription state (active or disabled)
  • Which product is being used (e.g. Professional, Max, Silver)

If something is wrong with one of those pieces of information in either Chargify or your application, something Very Bad is going to happen. Any time your application sends that data (as in a DocRaptor customer upgrading their plan), you should carefully monitor for problems. If you can’t trust Chargify’s information on your customers’ product levels, you also can’t trust Chargify to bill them properly.

Because of the highly custom nature of this code, we use Instrumental for this monitoring, but there are plenty of good solutions available.

So, once you’ve properly limited the pieces of information you care about, let’s figure out which webhooks you actually need to receive. Chargify provides a nice table of all webhooks in their documentation. There’s no direct link to the table, but trust me: it’s there. For the purposes of the three pieces of data above, we care about these events:

  • subscription_state_change - maybe someone let their card expire
  • billing_date_change - usually due to support activity
  • subscription_product_change - someone changed which product they use
  • upgrade_downgrade_success - someone changed which product they use
  • expiration_date_change - card information was updated
  • payment_success - $$$

Other webhooks like customer_updated may be interesting based on your requirements, but the six I’ve listed above basically define anything that can impact the billing cycle. These are things you need to know so you can properly control your product’s premium features.

Now it’s time for the bad news: all the webhooks you’ve enabled will be sent to all the endpoints you’ve defined. You aren’t going to define something like webhook_events#payment_success, but rather, one endpoint to rule them all. If you aren’t going to handle an event, you should just return 200 as quickly as possible.

Not all webhooks have the same payload - which makes sense, and also makes this more complicated. Generally, each hook contains all the information you need to handle the event. A customer_updated payload does not include any subscription information - it only deals with basic customer information. Test each event to see exactly what the shape of the payload is. Sometimes the documentation lags behind the actual implementation.

Fortunately, every webhook body contains two pieces of information outside of the payload itself: a unique ID for that event (which you want to log) and the name of the event (e.g. payment_success). From these, it’s a simple matter to log the webhook and route it to an appropriate handler.
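To make the "one endpoint, many events" idea concrete, here is a small sketch of routing by event name. It assumes your framework has already parsed the request into a hash with "id", "event", and "payload" keys. Those key names, the "subscription_id" field, and the handler bodies are assumptions for illustration, so check them against a real webhook before relying on them.

```ruby
# Hypothetical handlers keyed by event name. The event names come from the
# list above; the handler bodies and the 'subscription_id' key are placeholders.
HANDLERS = {
  'payment_success' => lambda { |payload|
    "payment recorded for subscription #{payload['subscription_id']}"
  },
  'subscription_state_change' => lambda { |payload|
    "state refreshed for subscription #{payload['subscription_id']}"
  }
}.freeze

# Routes an already-parsed webhook to its handler.
# Assumes params has 'id', 'event', and 'payload' keys.
def handle_webhook(params)
  event_id = params['id']   # log this so you can trace the event later
  handler = HANDLERS[params['event']]

  # Events we do not handle still get a quick 200, as recommended above.
  return { status: 200, event_id: event_id, result: 'ignored' } if handler.nil?

  { status: 200, event_id: event_id, result: handler.call(params['payload'] || {}) }
end

p handle_webhook({ 'id' => 'evt_1', 'event' => 'payment_success',
                   'payload' => { 'subscription_id' => 42 } })
p handle_webhook({ 'id' => 'evt_2', 'event' => 'customer_updated', 'payload' => {} })
```

Keeping the handlers in a lookup table means adding a new event is a one-line change, and anything not in the table falls through to the fast "ignored" path.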

While the webhook payload should contain enough information for you to take an appropriate action, you may find it easier to use a sort of brute force approach. Any time an event occurs, just fetch all the pieces of information that MAY have changed from Chargify’s API.

Okay, now I’ve written an awful lot about how to slice and dice your customers’ data over the internet, and I haven’t mentioned security once. Don’t worry, it’s pretty easy.

Each payload includes a header, X-Chargify-Webhook-Signature-Hmac-Sha-256, which contains an HMAC-SHA-256 signature of the payload, signed with your site’s shared secret key. You can also receive the signature as a parameter on the call, but that requires putting some magic values in your webhook configuration. I’ll always err on the side of less configuration, so we’ll stick to the header. Verifying this signature confirms that the information you are receiving came from Chargify and was not altered along the way.

Refer to the Chargify docs to find your secret key. Verifying that requests are authentic is then as easy as this:
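The original snippet is not reproduced here, so what follows is a sketch of one way to do the check using only Ruby's standard library. It assumes the header carries the hex-encoded HMAC-SHA-256 of the raw request body, keyed with your shared key. The method names are our own, so confirm the encoding against Chargify's documentation.

```ruby
require 'openssl'

# Computes the hex-encoded HMAC-SHA-256 of the raw request body.
# Assumption: this matches the format of the signature header.
def webhook_signature(shared_key, raw_body)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha256'), shared_key, raw_body)
end

# Compares two strings without stopping at the first mismatch, so the
# comparison time does not reveal how many leading characters matched.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Returns true only when the header matches the signature we compute ourselves.
def valid_webhook?(shared_key, raw_body, signature_header)
  return false if signature_header.nil? || signature_header.empty?

  secure_compare(webhook_signature(shared_key, raw_body), signature_header)
end

body = 'id=1&event=payment_success'
good = webhook_signature('my-shared-key', body)
puts valid_webhook?('my-shared-key', body, good)          # true
puts valid_webhook?('my-shared-key', body + 'x', good)    # false
```

In a Rails controller, the raw body is typically available as `request.raw_post` and the header through `request.headers`. Sign the unparsed body, since re-serializing parsed params can change the bytes and break the match.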

Please, do not ever accept unverified data over your webhook. If you do, Very Bad People will steal your internet money and your customers will have a bad time. You have been warned!

So, there’s a little insight into how we think about and use webhooks at DocRaptor, and a few things we learned along the way. If you perform any support activities like refunds or adjustments for your customers, you absolutely need to make sure that Chargify and your application are in sync at all times. Internet!

DocRaptor vs wkhtmltopdf (and Snappy)

From our research, wkhtmltopdf is the best open-source HTML-to-PDF tool. It is one of the few open-source projects built solely for HTML-to-PDF generation and uses a specifically modified version of the WebKit browser engine.

wkhtmltopdf is used inside many popular language-specific tools such as PHP’s Snappy, Ruby’s wicked_pdf and pdfkit. Both wkhtmltopdf and phantomjs, another open-source tool, use the WebKit browser engine to create PDFs, but unlike wkhtmltopdf, phantomjs’s WebKit version is not customized for PDF generation.

wkhtmltopdf is an excellent solution for many, but not all, projects. Here’s what you should consider when choosing between DocRaptor and wkhtmltopdf:

  1. Is document size or complexity important? Generally speaking, we’ve found wkhtmltopdf generates documents 15-40% larger than the same document created by DocRaptor. In rare cases, especially with more complex documents, wkhtmltopdf produces documents that are significantly larger. As an example, wkhtmltopdf turns http://intertwingly.net/blog/2008/02/01/SVG-Tidy into a 4 MB, CPU-hogging PDF file whereas DocRaptor outputs a 160 KB file. Winner: DocRaptor

  2. Are you sending the document to a printer? We make it easy to send your PDFs right to your printer. DocRaptor has support for CSS3 page media queries and PDF/A-1B ISO standards. It’s easy to add features like page numbers, headers and footers, and customized page sizes with a simple change to your CSS styling. wkhtmltopdf offers similar features, but requires using configuration settings. This means making style definitions in both your backend code and your CSS, as opposed to just in your CSS. If you’re intending to print your documents, we suggest closely comparing the documents generated by the various services. Winner: DocRaptor, but it’s close

  3. Do you want PDF forms or document thumbnails? Both of these features are supported by wkhtmltopdf, but not DocRaptor. Winner: wkhtmltopdf

  4. Do you also need to create XLS files? DocRaptor handles both PDFs and XLS files. Winner: DocRaptor

  5. Do you want to maintain your own architecture? This is the largest difference between wkhtmltopdf and DocRaptor. With wkhtmltopdf, each document takes up far more server capacity than a normal web page load. DocRaptor provides proven architecture capable of handling millions of documents a month without requiring additional server capacity.

    When you choose DocRaptor, you get direct access to a highly experienced support team to assist with your implementation and help debug any problems with your generated documents. We’ve created a PDF Generator Total Cost of Ownership calculator to help compare the architecture setup and support costs. The results might surprise you! Winner: DocRaptor (in most cases, but not always)

We hope this document helps. If you have further questions, please leave a comment or email Support@DocRaptor.com!

DocRaptor vs PrinceXML

PrinceXML is part of the backend processing engine for DocRaptor. All PDFs created by DocRaptor use PrinceXML as the rendering engine. We love the PrinceXML product and their team. They recommend DocRaptor as a cloud solution and we recommend Prince for all self-hosted PDF generation.

So what’s the difference between DocRaptor and Prince? We have some big differences in our pricing and delivery models, as well as output documents. Here’s what you should keep in mind if you’re choosing between DocRaptor and Prince, or planning to switch from one service to the other.

  1. Instant high-volume scalability
    With DocRaptor, you receive immediate access to nearly infinite PDF generation across multiple datacenters.

  2. Infrastructure setup and server maintenance
    With any non-trivial volume of PDF generation, most organizations require multiple PDF generation servers. Obviously, this becomes costly and time-consuming to set up and maintain over time. With DocRaptor, we handle all this work on your behalf. With PrinceXML, you’ll have complete control over your environment. This can be important if you’re working with special regulations or business requirements.

  3. Internet access
    As a cloud-based service, DocRaptor requires internet access. PrinceXML does not. If latency (the time to send a document to DocRaptor and download the created PDF) is important, using PrinceXML locally may be a good choice.

  4. Excel Documents
    In addition to PDF files, DocRaptor can convert HTML to XLS/XLSX.

  5. Fixed-cost vs value-based pricing
    DocRaptor pricing scales up or down based on your PDF creation needs. It costs very little to start and test documents are always free. However, the costs are on-going and can change as volume increases. With PrinceXML, the costs (excluding upgrades) are fixed and won’t change. Fixed costs are sometimes advantageous for client projects that require a definitive hand-off.

  6. Access to Upgrades and Support
    All DocRaptor accounts come with email and live chat support, as well as access to the latest production version of Prince. Every commercial Prince license comes with 12 months of support and upgrades included. After that, you may purchase additional support and upgrades.

We hope this document helps. If you have further questions, please leave a comment or email Support@DocRaptor.com!

DocRaptor vs PhantomJS

PhantomJS is a “headless browser” (a web browser, just for programming, without a visible interface). Exporting screen captures to PDF, PNG, and JPG is just one of many PhantomJS features. DocRaptor uses PrinceXML, an engine designed specifically for creating PDFs. PhantomJS is built on the WebKit browser engine, as is the other major open-source PDF generator, wkhtmltopdf.

PhantomJS is an excellent screen capture tool, but there are some major differences you should be aware of when choosing between DocRaptor and PhantomJS.

  1. Are you OK with larger documents? Simply put, PhantomJS is not as good at PDF generation as DocRaptor. It is a screen capture tool, not a dedicated PDF generator. In general, we found documents generated by PhantomJS tend to be 15-40% larger than the same document generated by DocRaptor. PhantomJS also has particular difficulty with complex documents. Winner: DocRaptor

  2. Are you (or your customers) printing the documents? For printed documents, CSS3 page media queries enable important features such as page numbers and document sizing. DocRaptor supports these queries, letting you make printer-ready documents using just CSS styles. PhantomJS does not support these queries at all. The other major open-source alternative to DocRaptor, wkhtmltopdf, does support many of these options, albeit in a roundabout fashion. Additionally, DocRaptor offers PDF/A-1B ISO standard support for improved archiving and printing. Winner: DocRaptor

  3. Do you want document thumbnails? PhantomJS can generate a PNG snapshot of your document. You’ll need to use another tool like ImageMagick to resize the PNG to thumbnail size, but image export functionality is a must have for many projects. Winner: PhantomJS (with ImageMagick)

  4. Would you like to generate Excel files too? DocRaptor can convert HTML to PDF and xls/xlsx, making it a one-stop shop for report generation. Winner: DocRaptor

  5. Do you want to manage and support PDF generation? Infrastructure and support costs are the big differences between DocRaptor and PhantomJS. Debugging failed documents can be painful without a detailed document generation log like DocRaptor’s. Generating PDFs takes up more server capacity than typical web page requests and often requires dedicated servers to handle the load.

    DocRaptor gives you reliable infrastructure, capacity for millions of documents and access to world-class support. You’ll never have to worry about getting too many requests, or whether you can handle a very complicated PDF document. You can use our Total Cost of Ownership calculator to compare setup and maintenance costs. It’s important to consider every factor when choosing a PDF generation solution, and the results may shock you. Winner: DocRaptor (in most cases, but not always)

We hope this document helps. If you have further questions, please leave a comment or email Support@DocRaptor.com!

DocRaptor vs PDFreactor

PDFreactor is a fantastic HTML to PDF engine, and provides several features that DocRaptor does not yet support. However, DocRaptor has an entirely different pricing and delivery model that many organizations find a better fit. Here’s a summary of the differences between DocRaptor and PDFreactor:

  1. Do you want PDF forms, thumbnails, digital signatures, or 508 accessibility compliance? If so, PDFreactor is your best option. DocRaptor does not yet support these features. Winner: PDFreactor

  2. How important is high-availability or high-volume generation? Can you risk slowing down your entire application because of a few PDFs? PDF generation is the equivalent of running a full HTML browser engine on your server, requiring significantly more resources than a standard web application request. This can be especially troublesome when handling multiple simultaneous PDF creation requests. Many applications require dedicated PDF generation servers and infrastructure. When you choose DocRaptor, you’re getting access to reliable PDF creation infrastructure without slowing down your application. Winner: DocRaptor

  3. Do you want assistance with debugging documents? Creating PDFs, especially those destined for printing, is quite different from creating a web page. DocRaptor’s support includes document debugging for those moments when you need an extra hand. We’ve helped thousands of businesses generate high quality documents, and we’d love to help you too. Winner: DocRaptor

  4. Would you also like Excel documents? In addition to PDFs, DocRaptor can also create XLS or XLSX files from HTML. Winner: DocRaptor

  5. Have you considered the long-term, total cost of ownership? Our Total Cost of Ownership calculator can help you compare the cost of support, setup and maintenance of using PDFreactor vs DocRaptor. The results may surprise you, especially when you need high-availability. Winner: DocRaptor (in most cases, but not always)

We hope this document helps. If you have further questions, please leave a comment or email Support@DocRaptor.com!

Happy Birthday, DocRaptor!

Your favorite PDF and XLS generation service turned five years old today! Five years is a LONG time on the internet, and we realized we’d never told DocRaptor’s origin story.

UNTIL NOW.

Take a trip with us, into the past. The year was 2010, and Expected Behavior was a younger company. We were busy with consulting work, and we’d added HTML to PDF creation to several applications for our clients. If you’ve ever tried to add PDF generation to an application yourself, you have some idea of how painful this can be.

“Would this be a useful service?” we asked. “If we made an API for PDF generation, would people pay for it?”

We set out to answer these questions, and we built the first version of DocRaptor in a weekend. It wasn’t pretty. In fact, V. 1 DocRaptor looked something like this:

DocRaptor's home page in 2010

But it worked, and people liked it. Our first paying customer, Tinderbox, signed up less than a month after we launched. They’re still with us, and they’re still creating great looking proposals, contracts, and sales documents for their own clients.

We’ve learned a lot in the last 5 years. We’ve helped thousands of companies add reliable PDF and XLS creation to their own applications, reduce their infrastructure costs, and deliver beautiful documents to their customers. We’ve made a lot of good friends, all over the world.

So far this year, thousands of businesses have relied on DocRaptor to generate millions of PDF and XLS documents. The infrastructure work we did last summer improved our generation time per document, and made it a lot easier for us to scale to meet growing demand for document creation.

Five years is a long time on the internet, and we can’t wait to see what the next five years bring.

EmWeb: Why we've been happy DocRaptor customers for 3 years

The case study below was written by EmWeb, an online educational platform that uses DocRaptor to convert complex course descriptions and heavily formatted tables into ready-to-print PDF documents.

Example EmWeb PDF

Download a sample study program PDF generated by EmWeb

We provide a suite of study program management tools for higher education called EmWeb. It consists of three main parts:

  1. An overview of the study program, typically divided into 15 chapters detailing the contents of the program, prerequisites, and what kind of job you could get when you graduate.
  2. One or more heavily formatted tables with all courses offered during the study. These tables display which semester the course is offered, along with important information like credits and exam dates. This is what students see and the design is crucial for readability as it can be information-dense.
  3. Course descriptions containing perhaps 40 different bits and pieces of information detailing each course.

The plan was for EmWeb to be a true online tool. However, our customers thought differently, and one morning our biggest client called us and asked how they could produce a printable document. They wanted to ship a well-formatted PDF to a printing press, and the deadline was very close. They thought “We already have all the data. How hard can it be?” Anyone that has tried making true reproductions from screen to paper knows it’s hard, very hard. Especially when you’re talking about documents with potentially hundreds of pages.

My first thought was “How on Earth can we produce a high quality printing press-ready PDF with consistent design across all browsers?” I had worked on printing issues before, but not PDF. I turned to Google and learned about the several libraries that exist to solve this exact problem.

If you want perfect PDFs, you need to reproduce design, structure and color. We couldn’t check off all those boxes until we found DocRaptor, a PDF generation service that uses PrinceXML.

I have always been a huge fan of the Opera browser. When I read that Håkon Wium Lie, father of CSS and co-founder of the Opera browser, was on the development team behind PrinceXML, I knew this was as solid as it could be.

DocRaptor offers a PDF-generating web service that removes all need for local installations of any kind. The documentation and code examples in several different programming languages are clear and concise.

DocRaptor takes all the hassle out of perfect PDF generation, and it really is perfect. All our complex layouts, tables, colours and fonts are preserved. I don’t think our clients appreciate how hard this is, but I certainly do.

Also, I have contacted support two times since we started using DocRaptor. The response is quick, professional, and courteous. After longer sessions with emails flying back and forth, pitching ideas on how to debug the problem, the tone of the emails started including a bit more friendly banter. This will happen between developers working together, even halfway across the globe via email. This reduced some of my stress and gave me the impression that this guy must enjoy working with what he does. This is always great news for someone in need of support, and I felt well taken care of.

If you need the best, the quality of the PDFs is reason enough to use DocRaptor. Great support also makes it easy to recommend DocRaptor to anyone looking for a hassle-free top-quality PDF-generating service.

Agnar Ødegård
CTO
Norweb AS
Norway

DocRaptor Does Typekit Fonts, Too!

We wrote a series of posts on using Google fonts with DocRaptor a while back. This has proven to be a popular series for us, and we’ve often heard from users who want to add Typekit fonts to their DocRaptor PDFs. This is easy enough to do, provided you have a DocRaptor account, a Typekit account, and the proper license for the font you want to use.

First up, you’ll need to add any domain you want to use to create a PDF with Typekit fonts to your kit settings. This is the referrer that Typekit expects to find to apply the fonts from your kit correctly. For example, here’s how we’ve set this to use Myriad Pro on DocRaptor:

Adding your domain to your Typekit settings

Next, you’ll need to set “javascript” to “true” when making your POST request. This will force DocRaptor to wait until your JavaScripts have finished running on your page before attempting to turn it into a PDF. Since your Typekit fonts are loaded with JavaScript, DocRaptor can’t actually apply them correctly without setting this parameter.

Running JavaScripts in DocRaptor

Finally, you can use Typekit with DocRaptor using both document_content and document_url. Both methods work as expected, with some very minor implementation differences. Let’s check it out!

Typekit’s JavaScript depends on your domain URL being set correctly. Typically, this will be the URL of the page your Typekit fonts are added to, but if you’re sending us raw HTML, there is no page URL to act as the referrer. Never fear - you can simply set this URL when making your POST request. Here’s how it works:

Setting referrer when making a POST request

This parameter must pass a correct URL, otherwise your Typekit fonts won’t actually be applied.
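As a rough sketch, the options for a raw-HTML request might be assembled like this. The "javascript", "referrer", and "document_content" names come from this post. The nesting under "doc", the "document_type" and "test" keys, and the helper name are assumptions for illustration, so check the API documentation for the exact request shape.

```ruby
require 'json'

# Builds the options for a raw-HTML document that uses Typekit fonts.
# Only 'javascript', 'referrer', and 'document_content' are named in this
# post; the 'doc' wrapper and the remaining keys are assumptions.
def typekit_document_params(html, referrer_url)
  {
    'doc' => {
      'document_type'    => 'pdf',
      'document_content' => html,
      'javascript'       => true,          # wait for Typekit's script to finish
      'referrer'         => referrer_url,  # must be a domain listed in your kit settings
      'test'             => true           # assumed flag for a test document
    }
  }
end

params = typekit_document_params('<html><body>Hello</body></html>', 'http://example.com')
puts JSON.pretty_generate(params)
```

When you send a `document_url` instead of `document_content`, the same hash applies without the `referrer` entry.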

Your other option is to send us an existing URL via document_url. No need to pass referrer in this case, as the URL you’re sending us will already be referenced in your Typekit settings.

And you’re done! Your PDFs never looked so good.

Disclaimer: DocRaptor is not affiliated with Adobe Products in any way, and you must have the proper license to use any Typekit font with DocRaptor. Here’s some more information about Typekit font licensing: Typekit font licensing

Using Highcharts with DocRaptor

Adding Highcharts to DocRaptor PDFs

Spicing up your PDFs with the Highcharts charting API is really easy. All you need to do is disable animations in the JavaScript that creates your Highcharts graph and set our JavaScript parameter to “true” when making your POST request.

You can grab the input HTML here, and the resulting PDF here.

Let’s take a closer look at how this works!

First, you’ll want to disable animation for any Highcharts graph you want to add to your PDF. Animations look really cool in a browser, but DocRaptor expects a static resource to convert into a PDF, and animations aren’t exactly static. You’ll just get a blank graph when attempting to use Highcharts unless you disable animation. Luckily enough, disabling animations is also easy to do:

Disabling animation in Highcharts

In our example code, we’ve added this to the plotOptions of the Javascript that renders each of these charts:

animation: false,

Next, you’ll need to set “javascript” to “true” when you make your POST request. This parameter uses DocRaptor’s Javascript engine when rendering your document, which will force DocRaptor to wait until your Highcharts graphs are actually loaded before attempting to create your document.

Using DocRaptor’s JavaScript engine

That’s all there is to it! You can see the code that generates the charts used in this example PDF on the Highcharts example page.

Setting Image Opacity with CSS

We’ve just deployed an update to DocRaptor that adds support for the opacity property on image elements, support for 16-bit TIFF images and a bug fix for SVG image elements.

Hungry for more details? Keep reading!

Image Opacity Support

You can now set opacity levels on images in your generated PDFs using CSS. No more relying on PNG images with an alpha channel set to your desired transparency level!

Some sample CSS might look like:

.myimage { opacity: 0.5; }

Here’s a screenshot of a generated PDF showing this in action:

Image opacity using CSS

You can check out the input HTML for this PDF here, as well.

Support for 16-bit TIFF Files

You can now add 16-bit TIFF files to your input HTML document. An 8-bit TIFF file supports up to 256 distinct tones per color channel at each pixel position, whereas a 16-bit TIFF file supports up to 65,536 tones.
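Those two figures are just powers of two: each extra bit doubles the number of values that can be stored. A quick sketch of the arithmetic (the `tones` helper is our own name):

```ruby
# Number of distinct values representable with a given bit depth: 2**bits.
def tones(bits)
  2**bits
end

puts tones(8)              # 256
puts tones(16)             # 65536
puts tones(16) / tones(8)  # 256 times as many steps between darkest and lightest
```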

This difference in tonal ranges is most obvious in large areas with gradually changing tones, such as blue skies. You’ll notice banding issues more quickly when working with 8-bit TIFF files, but the greater tonal range of 16-bit TIFFs makes these transitions much smoother.

16-bit TIFFs are larger files, but also print better than 8-bit images. Pretty handy if you’re creating PDFs containing images with a lot of subtle gradations.

SVG Fill-Opacity Bug Fix

Last but not least, this update also fixes a small bug with SVG images. The fill-opacity property on paths was being incorrectly applied to subsequent images, but no longer! Fill-opacity will now respect your wishes.

We’ve got a lot of cool updates on the way. Some of our users have contacted us with interesting use cases, and we’re working on some neat tutorial posts for further customizing your generated PDFs using DocRaptor with third party libraries and tools. Stay tuned!

Gone in 60 Seconds: How We Moved From Linode to AWS With Less Than a Minute of Downtime

Recently we moved DocRaptor from Linode to AWS, increasing average document creation speed by 6.9%, reducing network errors by an incredible 84.1%, and saving ourselves a ton of devops time related to scaling. Our move to AWS also allowed us to double our simultaneous request limit, scale up to meet customer demand in a matter of minutes instead of hours, and gave us more time to help you create the documents you need.

DocRaptor is an infrastructure service, so our customers expect us to be up and creating PDFs and Excel docs at all hours. When we had datacenter issues 2 weeks in a row with Linode, we knew it was time to switch. We thought through all the details that needed to be accounted for, and started working on a migration plan. Over the course of a few weeks this plan solidified, and we scripted portions that could be automated.

The week before the migration we ran through the whole plan a few times, only excluding steps that would actually break production. This revealed a few small things that we missed, and highlighted a few more things we could automate, but mostly gave us confidence that our plan was going to work.

Even though we were migrating from Linode to AWS, most of our plan is data center agnostic, and we hope anyone who is thinking about migrating data centers will benefit. Here’s a step by step account of what our data center migration from Linode to AWS looked like.

Minimizing Downtime

One of our core goals for this migration was keeping customer impact low, and we spent a bunch of time planning to make our window of downtime as small as possible. We achieved this by breaking the migration down into several smaller, easily reversible steps, as well as having many tests along the way to verify that we were able to move from one step to the next with confidence. This careful approach allowed us to fully migrate DocRaptor from Linode to AWS with less than 60 seconds of downtime.

Testing AWS

We needed to verify that DocRaptor was set up and configured correctly in the new data center. We began by making a ton of documents on AWS and verifying these documents came out perfectly. We take customer privacy very seriously, so it was important to automate document verification so that we didn’t need to manually inspect every document to verify it looked good.

Luckily most of the documents we generate produce the same binary file when generated multiple times (excluding a randomly generated id field in the PDF). This allowed us to verify AWS was working correctly on most documents, but there were a handful with different binary files. We needed to know why these documents were different, so we did some science.
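
As a rough illustration of that binary comparison (not the actual script from the migration), here is a minimal Ruby sketch. It assumes the randomly generated field is the /ID array in the PDF trailer, which the text above does not spell out, and the method names are made up for this example.

```ruby
require "digest"

# Matches a trailer entry such as: /ID [<9F2C...> <A41B...>]
# Assumption: the random field is the trailer /ID array of two hex strings.
ID_ENTRY = %r{/ID\s*\[\s*<\h+>\s*<\h+>\s*\]}

# Digest of the PDF bytes with every /ID entry blanked out.
def normalized_digest(pdf_bytes)
  Digest::SHA256.hexdigest(pdf_bytes.b.gsub(ID_ENTRY, "/ID []"))
end

# True when two PDFs differ only in their /ID entries (or not at all).
def same_document?(pdf_a, pdf_b)
  normalized_digest(pdf_a) == normalized_digest(pdf_b)
end
```

Documents that pass this check need no further inspection; only the ones that fail have to go through a slower visual comparison.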

Verifying Document Creation

DocRaptor stores a log of your document generation so you can get more detail if you run into issues. The logs showed that the few identical inputs producing differing binary files came down to minute differences in asset loading. Sometimes a customer has network or asset loading issues, and obviously if an image or a remote font fails to load, the resulting documents are going to differ. This meant that we actually needed to test the variation within both systems, and compare those in order to know that AWS was performing at least as well as Linode.

Fortunately for us there are automated visual comparison tools. ImageMagick has a great tool called Compare that accepts two input images and returns a number indicating how similar the two images are (PHASH). Here you can see an example of an image diff using different font options from our example invoice on our try it out page.

Image diff using Compare

This technique allowed us to set a threshold, automatically run PDFs through Linode and AWS, convert them to images, and compare the PHASH values to our threshold. We sampled a huge number of customer documents and ran this comparison process on all of them. Because customer assets sometimes vary from request to request, we made our comparison more statistical by running the same document through both Linode and AWS multiple times, then checking the variation within each group.
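
To make the comparison concrete, here is a hedged Ruby sketch. The phash_distance method shells out to ImageMagick's compare with the PHASH metric; clusters_agree? is one possible reading of "checking the variation within each group", not the exact statistic used in the migration. The method names and the tolerance parameter are assumptions for this example, and converting PDF pages to images is left out.

```ruby
require "open3"

# Perceptual-hash distance between two page images, via ImageMagick's
# `compare -metric PHASH`. compare writes the metric to stderr; 0 means identical.
def phash_distance(image_a, image_b)
  _stdout, stderr, _status = Open3.capture3(
    "compare", "-metric", "PHASH", image_a, image_b, "null:"
  )
  Float(stderr.split.first)
end

# Arithmetic mean of a non-empty list of numbers.
def mean(values)
  values.sum / values.size.to_f
end

# within_old / within_new: distances between repeated runs on the same cluster.
# across: distances between a run on one cluster and a run on the other.
# The clusters "agree" when cross-cluster differences are no larger than the
# noisier cluster's own run-to-run variation, plus a tolerance.
def clusters_agree?(within_old:, within_new:, across:, tolerance:)
  baseline = [mean(within_old), mean(within_new)].max
  mean(across) <= baseline + tolerance
end
```

Comparing against the within-cluster baseline, rather than against zero, keeps a flaky customer asset from being mistaken for a difference between data centers.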

Lo and behold, this finally gave us the confidence we were looking for that AWS was generating great looking documents, but what about when DocRaptor doesn’t produce a PDF or Excel document?

We still needed to verify that invalid input was handled the same way by the AWS and Linode clusters. DocRaptor produces helpful error messages on a variety of invalid inputs such as document timeouts, invalid HTML, and Javascript timeouts. While running documents through both services, we compared the error messages returned and verified that they were the same.

This was a great opportunity to gather statistics on how long it took to generate these documents, as well as any network errors that occurred. This experiment proved that on average we were producing documents 6.9% faster on AWS, with a whopping 84.1% fewer network errors!

6.9% faster

84.1% fewer network errors

Automating Our Infrastructure

DocRaptor is a complex system with a ton of moving parts. Different tools are required to do HTML parsing, Javascript execution, font handling, and PDF or Excel generation itself. We’ve been successful until now by cloning servers with the same configuration, but this method has quite a few downsides. Each new server still required some manual setup. We also didn’t clearly document and codify our setup procedure for each new server. As anyone who’s written software before knows, manual steps are a recipe for problems. It was time to fully automate our app setup!

Automating from scratch can definitely be a hassle, but we had already automated many of the big components thanks to our recent infrastructure work on Gauges and Instrumental. We had automated the setup and deployment of those apps using a combination of self-written scripts and Puppet. We just had to add DocRaptor-specific setup to our automation.

With very little effort we were able to spin up a whole new cluster, including web servers and background worker servers. Using a combination of Puppet scripts and Capistrano tasks we were able to reduce the spinup of a new server (or even a completely new cluster) to a matter of minutes. This work has already helped us test new code internally by making it incredibly easy to spin up a new server to use for testing.

Migration details

Below we’ve broken down our migration plan into sections with some explanation of each step. Want to check out our entire migration plan? We’ve added a link to a gist at the bottom of this post.

Estimated total time to execute: 2hr not including pre-replication

Ensure performance test make_doc script has production endpoint
Comment out deploys tests, DO NOT COMMIT

To ensure updates that we make to DocRaptor don’t cause issues we have a set of tests that run against production after every deploy. For this migration, however, we tested out of band in order to make deploys faster and limit any downtime.

Setup SSH tunnel between AWS background-001 to Linode MySQL
Replicate Linode MySQL to AWS (3hr+) as docraptor-production
One-time copy non-queue Redis data
Verify AWS endpoint working
  	  ./script/service_test http://aws.docraptor.com && ./script/service_test https://aws.docraptor.com

We’re using automated tests here to verify everything is working correctly.

-- Wait Till Day of Maintenance --

Tweet: Reminder: we'll be doing maintenance from 2-3pm EDT today.

We definitely wanted to keep all of our customers informed and post notice in case there were issues.

Connect AWS to Linode MySQL
  Fallback: (AWS BRANCH) cap production resque:stop
Verify AWS endpoint working
  ./script/service_test http://aws.docraptor.com && ./script/service_test https://aws.docraptor.com

Here we pointed both AWS and Linode servers to our Linode MySQL instance so everything was using the same persistent data store. Note that our fallback is one simple command that runs very quickly. If the verification test had failed, there would only have been a few seconds of downtime at this point and we could then investigate and resolve at our leisure.

Mysql switch

Switch to old Instrumental API key in app and automation
  be sure to restart instrument_server

Up until this point we had AWS connected to a test application monitoring project so it didn’t conflict with the DocRaptor production monitoring data. Since AWS was now using the main persistent store, we wanted all our data in one place.

Start continuous testing against production endpoint
Direct Linode LB to AWS ELB for 10 seconds
  Verify production traffic hits Linode
  Update Linode LB config to point to AWS
  Verify production traffic hits AWS
  Verify queued jobs in AWS
  Update Linode LB config to point back to Linode
  Verify production traffic hits Linode
  Verify queue draining on AWS

  Fallback: Manually move some jobs from AWS -> Linode Redis
Stop continuous testing (TODO: COULD BE MOVED LATER)

Here we’re running a small amount of production traffic through AWS using nginx as a proxy. This allows us to test AWS connectivity between all servers other than MySQL and verify documents continue to be processed. In our configuration we had two different queues: one in Linode and one in AWS, so enqueued documents would automatically drain out of the old queue as they were worked. We also switched from Passenger to Unicorn as part of the migration, so we used that fact as a quick additional check to verify which service was actually handling requests. Again, fallback is easy, and in fact this entire step is basically a practice fallback for the next step.
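
The Passenger check appears in the plan as curl -I piped through grep Pass. The same idea can be sketched in Ruby as below; the method names are hypothetical, and the header values in the usage note are illustrative rather than captured from our servers.

```ruby
require "net/http"
require "uri"

# True when any response header line mentions "Pass", mirroring `grep Pass`.
# Accepts a hash of header name => value (or list of values).
def passenger_header?(headers)
  headers.any? do |name, value|
    "#{name}: #{Array(value).join(', ')}".match?(/Pass/)
  end
end

# Fetch only the response headers for a URL, like `curl -I`.
def response_headers(url)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.head(uri.path.empty? ? "/" : uri.path).to_hash
  end
end
```

Calling passenger_header?(response_headers("http://docraptor.com")) would then return false once requests are served by Unicorn, while a Server header such as "nginx + Phusion Passenger" would make it return true.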

Direct Linode LB to AWS ELB permanently
  cp /opt/nginx/conf/nginx.production.lb.conf.new /opt/nginx/conf/nginx.production.lb.conf
  /etc/init.d/lb_nginx configtest

  /etc/init.d/lb_nginx reload
  curl -I http://docraptor.com | grep Pass  # should be empty
  curl -I https://docraptor.com | grep Pass # should be empty and no ssl error
  Fallback: Same as the 10 second one above
Verify production endpoint working
  ./script/service_test http://docraptor.com && ./script/service_test https://docraptor.com
Verify no traffic hitting Linode app servers
Verify Linode Resque is drained
Make a new branch of aws_migration
  with AWS MySQL + Port
Deploy out-of-band AWS app instance pointed at AWS MySQL
  # uncomment, DO NOT COMMIT web-oob enabled!!!
  cap production deploy HOSTFILTER=web-oob.docraptor.com
  # recomment web-oob so we do not deploy it in the next steps
  # verify connection to correct MySQL:
  eb ssh web-oob.docraptor.com
  /data/docraptor/current/script/rails runner 'User.last; puts `lsof -i -p #{Process.pid}`'
Deploy AWS MySQL with pause to AWS instances
  cap production deploy deploy:web:enable
Clear failed jobs on Linode

We’ve verified that production traffic is handled correctly through AWS, so we’ll start forwarding all traffic through AWS. In order to minimize downtime we spun up a new out-of-band server that ran the new AWS-only code (web-oob), but didn’t add it to our AWS load balancer. This will allow us to later run tests against AWS without waiting for new code to be deployed to the AWS web servers. Another optimization to note here is the deploy of the new code to the in-band AWS servers has code to pause right before restarting the DocRaptor services. This is another way we kept downtime low, by essentially decreasing the “deploy time” to only as long as it takes to restart Unicorn and Resque.

Stage and switch using out-of-band server

-- Must Be In Maintenance Window --

Enable maintenance mode in PagerDuty
  Pingdom
Tweet: We're starting maintenance now, you may see intermittent errors over the next hour.
Run dc_switch cap task
  cap production dc_switch
  Fallback: is automatic
Continue paused deploy
  Issue?: cap production deploy:restart
Verify no passenger
  curl -I http://docraptor.com | grep Pass   # should be empty
  curl -I https://docraptor.com | grep Pass # should be empty and no ssl error
Verify production endpoint
  ./script/service_test http://docraptor.com && ./script/service_test https://docraptor.com
Requeue failed jobs
Stress test (10min)
  ./script/performance.rb old 1000000 pdf small | tee -a tmp/performance-old-pdf-small-final-switch.log
  Fallback: switch Linode LB to point to Linode apps (will lose data)
Wait a safe period (1hr?)

-- Maintenance Complete --

This is the only part of the migration where there is downtime. We need to switch which MySQL instance we’re using, and we want to ensure replication has caught up when we do so. We inform customers that the maintenance window is beginning. We codified the majority of the actual switching procedure in a Capistrano task called dc_switch, which allowed us to fail back to Linode very quickly in the case that anything went wrong.
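
The real dc_switch task isn't reproduced in this post, but the automatic fallback idea can be sketched in plain Ruby: run each step in order, and if one fails, undo the completed steps in reverse. The Step struct, the method name, and the step names in the usage note are all hypothetical.

```ruby
# One migration step: how to apply it, and how to undo it.
Step = Struct.new(:name, :apply, :undo)

# Runs steps in order. If any step raises, undoes the completed steps in
# reverse order and returns false; returns true when every step succeeds.
def run_with_fallback(steps, log: [])
  completed = []
  steps.each do |step|
    step.apply.call
    completed << step
    log << "applied #{step.name}"
  end
  true
rescue StandardError => e
  log << "failed: #{e.message}"
  completed.reverse_each do |step|
    step.undo.call
    log << "undid #{step.name}"
  end
  false
end
```

A run might pass steps like "stop workers" and "switch database", each with a lambda that does the work and a lambda that reverses it. Keeping every step small and reversible is what makes the fallback fast.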

Disable maintenance mode in PagerDuty
Tweet: Maintenance complete! Please enjoy your regularly scheduled document service :)

Scheduled maintenance is complete! Party Time! Everything below this point doesn’t have to be done immediately, so the pressure is off. Whew!

Enable cloudfront
Move cron jobs from Linode to AWS
Remove out-of-band server and cleanup cap tasks
Remove deploy pauser
Enable gitflow
Switch DNS
Wait at least 48 hours
Move any outstanding temporary storage files
Shutdown linode non load balancer boxes
Possibly wait more
Verify no traffic hitting Linode load balancers for 1 day
Shutdown Linode
Remove Linode boxes from AWS security groups

And there you have it, a sub-60-second downtime data center migration!

Want to check out our full migration plan? Here’s a gist we put together while working on our data center migration.

Data center migration is always a nerve-racking process, but by planning carefully and limiting your opportunities for downtime you can have a much less stressful time.

Our New Help Request Feature

Let’s talk about our new help request feature!

Here’s what a typical support request chain looks like: a user has a problem, and emails support. We receive the email, then request extra information, usually a doc log ID and the input HTML. The user provides this information, and then we can start solving the problem. This means a lot of back and forth emails between a user and a friendly dinosaur.

We wanted to simplify this support process, and make it easier for you to send us the information we need to help troubleshoot a problem document.

Setting “help” to “true” when you make your POST request automatically opens a ticket and sends us your document log, input HTML and generated document. You’ll get a confirmation email, and we’ll be able to start working on your ticket a lot more quickly.
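
Here is a minimal sketch of what such a request body could look like when built in Ruby. It assumes the help flag sits inside the doc hash alongside the other document options; check the documentation for the exact placement. The method name is made up for this example.

```ruby
require "json"

# Builds the body for a document request that also opens a support ticket.
# Assumption: "help" is a document-level option, like "test".
def help_request_payload(api_key:, name:, html:)
  {
    "user_credentials" => api_key,
    "doc" => {
      "name" => name,
      "document_type" => "pdf",
      "document_content" => html,
      "help" => true
    }
  }
end
```

Serialize the hash with JSON.generate and POST it with a Content-Type of application/json.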

Here’s our documentation for making a help request.

Don’t worry - you can still contact us the old fashioned way if you prefer. We love talking to you guys!

Retinafy Your Olark Attention Grabber Image

DocRaptor added Olark as a means of support a couple of months back, and it has been really great for talking to both new and existing customers. With the onset of mass availability of retina devices, we also went through our entire site and upgraded all our assets to look good on the devices of today. If you’re interested in how to do that, check out Thomas Fuchs’s Retinafy book. Unfortunately, one thing that didn’t look that great was the Olark attention grabber (the little DocRaptor icon you see in the lower right of your screen, unless you closed it).

So if you’re in the same boat as us and want your customer support to look as good as the rest of your site, here are the steps you might take. Let’s start with an image. Let’s say I have a 100px x 127px version of that uploaded to Olark and my attention grabber settings are set correctly. By default I will get something that looks like this:

small alien image

You can immediately see the aliasing difference between the alien image and the text and chatbox element. We want to fix that. The relevant details are explained a little more on this page about attention grabber customization. We have to combine that with a little custom CSS to override the default background styles. Note this will also allow you to set different images for when you are available for chat and when you are not.

// some olark stuff ^^
var retina_image_path = "/app/assets/green_alien@2x.png";
olark.configure('CalloutBubble.bubble_image_url', retina_image_path);
olark.configure('CalloutBubble.offline_bubble_image_url', retina_image_path);
olark.configure('CalloutBubble.bubble_width', 100);
olark.configure('CalloutBubble.bubble_height', 127);
// maybe some other olark stuff
olark.identify('0000-000-00-0000');

/* custom CSS to scale the 2x image down to the displayed size */
#olark-callout-bubble, #olark-callout-bubble-offline {
  background-size: 100px 127px !important;
}

In the end, we get something that looks much nicer: small alien image

Caveats: This will not work on older browsers (IE8 and below, for example) due to no support for background-size.

Making Printable PDFs Just Got Easier

We’ve recently made some changes to DocRaptor that make it much easier to generate easily printed PDFs: fully customizable crop marks and transparency support for TIFF images!

You can now completely customize the length and offset of your crop marks, allowing you to create a trim area on each PDF that will match printer specifications exactly. Custom dimensions for crop marks are set using two new CSS properties: prince-mark-length and prince-mark-offset.

TIFF files with transparency are supported now, in both CMYKA and RGBA color.

Let’s take a closer look at how you can use these features when generating PDFs!

A PDF with customized crop marks

The default length of crop marks is set to 24pt, and customizing this value with CSS is easy. The syntax looks like:

prince-mark-length: length

Here’s a sample rule, setting the length of crop marks to 30pt:

@page { prince-mark-length: 30pt; }

Not surprisingly, customizing the distance between your crop marks and your trim box is easy, too:

prince-mark-offset: auto | [ length ] {1..4}

A sample integration might look like this, setting the offset between crop marks and the page to 6pt:

@page { prince-mark-offset: 6pt; }
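
If you build your document HTML in Ruby, a small helper can keep both settings together. This is only a sketch: the helper name is made up, it assumes crop marks are switched on with the standard marks: crop declaration, and the "auto" default for the offset is an assumption taken from the syntax above.

```ruby
# Builds an @page rule with crop marks of the given length and offset.
# Defaults: 24pt length (the documented default); "auto" offset is assumed.
def crop_mark_css(length: "24pt", offset: "auto")
  <<~CSS
    @page {
      marks: crop;
      prince-mark-length: #{length};
      prince-mark-offset: #{offset};
    }
  CSS
end
```

For example, crop_mark_css(length: "30pt", offset: "6pt") returns a rule combining the two samples above, ready to drop into a style tag.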

You don’t have to do anything fancy with CSS to add transparency support for TIFF files. Transparency is set when you create the TIFF file, and DocRaptor will respect your wishes when you send the HTML document our way.

We created a sample PDF with customized crop marks to show you how this looks. This PDF also shows off transparency support for a TIFF file, and you can grab the source HTML for this generated PDF.

We’re always working to improve DocRaptor, so keep your eyes peeled for more feature updates!

A Better Dashboard, Part I - Your Document Usage

Tracking how many documents you’re making just got a lot easier! We’ve been overhauling the design of our user dashboard, and we’re excited to talk about how these changes can help you. Let’s get started by taking a look at your usage page!

More useful document generation stats

If you guys like meters and charts, I’ve got great news for you. In the upper right you’ll see a bar showing how many documents you’ve made this month. Below that, we’ve got more meters tracking how many production and test documents you’ve created, and how close you are to your monthly capacity. Don’t worry - you’ve still got unlimited test documents.

Track your monthly documents

We wanted to make it easier for you to track how many documents you create in a month, as well as exposing how many test and production documents you make in a month. This graph lives below your monthly tracking meters, showing what your daily document creation looks like.

We hope you guys love the changes we’ve made to your dashboard as much as we do! What could we add to make your dashboard more useful? Let us know!

This is the first in a series of blog posts that will dig into our user interface changes and how you can use them to generate better looking documents with less trouble than ever before. Same great PDF and Excel generation service, fresh new coat of paint.

Stay tuned for the next post in this series, when we talk about the improvements we’ve made to your document logs page!

Convert HTML5 to PDF with DocRaptor!

DocRaptor is now running Prince 9.0 in production, which adds full HTML5 support, better CSS3 support and faster Javascript. Many of our users have asked about support for HTML5 (the canvas element in particular) and we’re excited about this update.

All new accounts will default to using Prince 9.0. If you’re an existing user and you’d like to add HTML5 elements to your PDFs, you’ll need to log in and update your default Prince version setting. You can find this option under “Edit Profile” in your Account Dashboard.

Selecting Prince 9 as your default

Be aware that updating your Prince version may break your existing styles, and you may need to do a bit of tweaking to get HTML5 elements working with templates created prior to this update. Don’t worry! You can test Prince 9.0 by setting this optional parameter when creating a PDF: Prince Version

While support for HTML5 and the canvas element are the most exciting parts of this deploy, Prince 9.0 is a big update and brings a lot of new features and bug fixes to our service. You can read through the entire change set here:

Prince 9.0 change set

Want to see what kind of output you can expect with DocRaptor? Take a look at some sample documents on Prince’s website:

Sample generated PDFs

As always, let us know if you run into any problems, or need a hand!

Running Javascripts Before Generating PDFs

So maybe you want to use Javascript to spice up your PDFs prior to generating them. Many of our users need to generate PDF documents with more complicated output than HTML and CSS alone can support. DocRaptor’s Javascript support to the rescue! Running arbitrary Javascript in your document prior to creating it is incredibly simple.

Take a look at this file, for example. You’ll notice it’s mostly Javascript (over 1600 lines of Javascript, in fact) and it produces a QR code you can scan to be taken to DocRaptor’s site.

The bulk of the Javascript allows for the creation of the code, and this line near the bottom of the script determines where the code sends a user:

qr.addData("http://docraptor.com");

You can change this to be any site you wish, and the QR code will automatically update.

Getting DocRaptor to run any Javascripts prior to generating your PDF is easy. Simply pass this optional parameter prior to creation:

Running Javascript in DocRaptor

By setting another parameter, you can use Prince’s built-in Javascript engine. Prince provides documentation for several useful Javascripts, such as getting a total count of pages in a PDF, or generating tables of contents. You can see an entire list of these Javascripts here:

List of Prince Javascripts

To use Prince’s built-in Javascript engine, you’ll need to pass this parameter when creating a PDF using DocRaptor:

Prince Javascripts in DocRaptor

RAZUR Case Study

We’ve recently been working with a client who needed to implement dynamic PDF generation across multiple marketing pages. This was an interesting use for DocRaptor, and we thought it was a great opportunity for a case study. Chris Merkle, CEO of RAZUR, was kind enough to share his story with us.


Client: RAZUR

www.razuragency.com

Background

RAZUR was in the market for an HTML to PDF solution for their Client, an industry-leading managed security service provider. After reviewing multiple options, including developing their own PHP-based web solution, RAZUR chose DocRaptor.

“Our development team chose DocRaptor because it was cost-effective and allowed us to rapidly roll-out a solution for our Client,” says Chris Merkle, CEO of RAZUR.

The goal was to dynamically generate marketing collateral from content that was already stored in a content management system, without having to change static PDFs every few weeks.

Solution

By using DocRaptor’s URL-based solution and a highly-customized CSS3 stylesheet, RAZUR was able to create a flexible template that would scale across 25+ marketing web pages. Hidden HTML “div” tags were used as wrappers to dynamically include “About Us” content, custom call-outs and page titles while adhering to the Client’s strict brand guidelines.

Results

RAZUR’s Client is now able to provide their sales team and website visitors with elegant, always up-to-date PDF brochures with the click of a button. By using DocRaptor to generate PDF files, RAZUR was able to create a solution that saved hundreds of hours of content design; the Client no longer had to make changes to the slew of marketing collateral every time the web content changed.


We’re always happy to help developers implement DocRaptor. Our goal is to make it simple to convert HTML to PDF and Excel, and it’s great to hear from satisfied clients like RAZUR.

Do you have an interesting DocRaptor implementation story? Contact us if you’d like to share!

What if Debugging was Easier?

We just made it a lot easier to prototype with DocRaptor! We’ve implemented a new feature that will email errors to you when a PDF or XLS file fails to generate as expected. This feature makes it simpler for new users to implement high quality PDF and Excel generation, as well as helping our established users deal with unexpected problems.

Here’s an example of what these emails look like:

Sample Error Email

We include the error DocRaptor returned as well as a link to the document log for the failed document. By clicking through, you’ll see all the information we have for that document, including any problems or errors in DocRaptor’s attempt to create it.

All new users will automatically be opted in to receive error emails, but existing users will have to opt in. Or maybe you already receive a ton of email, and this feature isn’t super helpful to you. It’s easy enough to turn off. You can make your selection by clicking the “Edit Profile” link in your account dashboard. Behold:

My Profile Page With Error Email Opt In

That’s all there is to it! This feature will notify you when something breaks in production, reduce the time spent on debugging, and make it easier for our support team to assist you if you get stuck.

Generating Landscape Format PDFs

DocRaptor makes it easy to generate PDF files with customized sizes and orientations. Our service uses Prince XML for PDF generation, and setting customized sizes is simply a matter of using optional CSS rules.

Here’s some sample code that will generate a US Letter sized PDF in landscape format:

<html>
  <head>
    <style type="text/css">
      @page { margin: 1em; }
      @page { size: US-Letter landscape; }
      body { font-size: 200%; }
    </style>
  </head>
  <body>
    Behold: a landscape oriented PDF!
  </body>
</html>

Landscape format PDF

You can use optional Prince CSS parameters to create PDF files with a wide variety of standardized sizes and formats, as well as setting your own custom dimensions.

Here’s a link to Prince’s documentation, with a full list of their optional page size CSS parameters:

Optional PDF page sizes

HTTP Timeout Option

Most DocRaptor customers have complete control over the assets they use in their documents, but one of our newer customers is aggregating content from a variety of marketing material. The assets from that material may or may not still be available. Bloomingdale’s is not going to keep their May 2012 email marketing content online forever.

Enter http_timeout, the latest feature available for DocRaptor. By default, we try to fetch an external resource for up to 60 seconds. With this option, you can set the timeout to whatever you want. As an example, here’s a contrived document that takes a long time to generate because it waits on a slow asset:

curl -H "Content-Type:application/json" -d '{"user_credentials":"YOUR_API_KEY_HERE", "doc":{"name":"docraptor_sample.pdf", "document_type":"pdf", "test":"true", "document_content":"<html><body><img src=\"http://docraptor-callbacks.herokuapp.com/slow\"></body></html>"}}' http://docraptor.com/docs > docraptor_sample.pdf

If you run that command, it will sit there for quite a while waiting for a timeout. Let’s try setting the timeout to 5 seconds:

curl -H "Content-Type:application/json" -d '{"user_credentials":"YOUR_API_KEY_HERE", "doc":{"name":"docraptor_sample.pdf", "prince_options":{"http_timeout":"5", "version":"8.1"}, "document_type":"pdf", "test":"true", "document_content":"<html><body><img src=\"http://docraptor-callbacks.herokuapp.com/slow\"></body></html>"}}' http://docraptor.com/docs > docraptor_sample.pdf

Much better.

The important part is

"prince_options":{"http_timeout":"5", "version":"8.1"}

This option only works if you’re using Prince 8 (or newer). If you’re using an older version, the option is ignored. It’s not necessary to specify the Prince version if you’re already using 8.1 as your default. If you’re not sure, check your profile page. It’s easy to switch!
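
If you’d rather skip the shell quoting, the same request can be sketched in Ruby with the standard library. The method names here are made up for the example; the endpoint, field names, and values are the ones from the curl command above.

```ruby
require "json"
require "net/http"
require "uri"

# Builds the request body, with http_timeout set under prince_options.
def timeout_payload(api_key:, html:, timeout_seconds:, prince_version: "8.1")
  {
    "user_credentials" => api_key,
    "doc" => {
      "name" => "docraptor_sample.pdf",
      "document_type" => "pdf",
      "test" => "true",
      "prince_options" => {
        "http_timeout" => timeout_seconds.to_s,
        "version" => prince_version
      },
      "document_content" => html
    }
  }
end

# POSTs the payload as JSON and returns the HTTP response.
def create_document(payload, endpoint: "http://docraptor.com/docs")
  uri = URI(endpoint)
  request = Net::HTTP::Post.new(uri.path, "Content-Type" => "application/json")
  request.body = JSON.generate(payload)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

Write the body of the response returned by create_document to a file opened in binary mode to save the PDF.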

Prince 8 - 8 Times More Princely!

Good news, everyone! DocRaptor has been upgraded to Prince 8.1.

The latest version of Prince XML provides better CSS support, which means DocRaptor produces even better PDFs. Prince 8.1 is faster than our current installation of Prince 7.1, supports CSS3 transforms, and has experimental support for HTML5.

For a full list of the changes, check out the change log on Prince XML’s home page:

Prince 8.1 change set

A bit worried about Prince 8.1 breaking your Prince 7.1 layouts? Never fear! We are running both Prince 7.1 and Prince 8.1 side by side, which means you can still use Prince 7.1 to generate documents, if you prefer.

All new DocRaptor accounts will default to Prince 8.1, while all existing accounts will default to Prince 7.1. You can change this setting for all documents you create, or set it per document. Here’s a screenshot showing how to set this parameter globally:

My Profile Page With Default Prince Options

You can use our new Prince version API parameter to select which version of Prince you want to use for each document.

In the near future, we’ll provide improved support for Javascript DOM properties, but this feature has not been implemented with this update.

Don't get caught in the Jurassic Period. Upgrade to DocRaptor Gem 0.2.0

Good news, friend! DocRaptor Gem v0.2.0 was released today. Check out the new gem, with a test suite and ARec-style ! methods. Here’s the repo. You can see code changes here.

Gem change log for 0.2.0:

  • tests!
  • added a create! method which will raise an exception when doc creation fails
  • added a list_docs! method which will raise an exception when doc listing fails
  • added a status! method which will raise an exception when getting an async status fails
  • added a download! method which will raise an exception when getting an async status fails

Have your own incredible idea? Send an email with feedback to the DocRaptor dinosaurs or fork it and send us a pull request.

DocRaptor Does Google Webfonts Part 3: Rails Edition

Mark from Carriemail asked about using Google Webfonts in the DocRaptor Rails Example (clone it from github).

Now that the link-fest is over, let’s talk about how to do that (it’s really easy).

In app/views/index.pdf.haml (original), I add the following to the <head> of the haml file used to make the PDFs:

%link{:href => "http://fonts.googleapis.com/css?family=Cantarell|Gravitas+One&v2", :rel => "stylesheet", :type => "text/css"}
%style{:type => "text/css"}
  = "h1 { font-family: 'Gravitas One', cursive; font-weight: bold; }"
  = "th { font-family: 'Cantarell', serif; font-size: 16px; }"

I also altered the table to use th and thead tags. The changes are on the google-web-fonts branch at github.

I added a couple of items to the sample app, and here’s what it looks like after the change (non-test document and font-size bumped).

Listing Examples

Note: If you’re planning to use this in production, I highly suggest using Rails Layouts and external stylesheets to manage your common code and style settings.

Thanks for the question, Mark!

Links:

  • The PDF produced
  • The github branch
  • The updated view

DocRaptor Does Google Web Fonts Part 2 - Electric Boogaloo

In my original Google Web Fonts post, I gave a simple example of using Google Web Fonts via their CSS downloads. This post deals with using Google Web Fonts via javascript. Yay!

I’m going to pull the js example from the Webfont Loader page. It is reproduced below. Note: I have altered the sample to remove some CSS rules to make it easier to follow.

<html>
  <head>
    <script type="text/javascript">
      WebFontConfig = { google: { families: [ 'Tangerine', 'Cantarell' ] } };
      (function() { 
        var wf = document.createElement('script');
        wf.src = ('https:' == document.location.protocol ? 'https' : 'http') + '://ajax.googleapis.com/ajax/libs/webfont/1/webfont.js';
        wf.type = 'text/javascript';
        wf.async = 'true';
        var s = document.getElementsByTagName('script')[0];
        s.parentNode.insertBefore(wf, s);
      })();
    </script>
    <style type="text/css">
      .wf-active p { font-family: 'Tangerine', serif }
      .wf-active h1 { font-family: 'Cantarell', serif; font-size: 16px; }
    </style>
  </head>
  <body>
    <h1>This is using Cantarell</h1>
    <p>This is using Tangerine!</p>
  </body>
</html>

If you’re using the DocRaptor gem example from the DocRaptor Examples repo, you could change the PDF block to look like this (assuming you had saved the above HTML to a file named google-fonts-js.html in the directory with doc_raptor_gem_example.rb):

# "wb" opens the file in binary mode so the PDF bytes are written unchanged
File.open("google-fonts-js.pdf", "wb") do |f|
  f.write DocRaptor.create(:document_content => File.read("google-fonts-js.html"),
                           :name => "google-fonts-js.pdf",
                           :document_type => "pdf",
                           :test => true,
                           :javascript => true)
end

After running the file through Ruby, you should end up with a PDF named google-fonts-js.pdf that looks like this (caveat: I bumped up the font size, centered the text, and turned off test mode for this picture):

Generated PDF Image

More beautiful fonts in your PDFs with ease!
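
The thing that sets this request apart from a CSS-only one is the :javascript => true option in the DocRaptor.create call above. If you generate both kinds of documents, one way to avoid forgetting it is to build the options hash in one place. This is just a sketch: pdf_options is a hypothetical helper, and treating any script tag as "needs JavaScript" is my assumption, not a DocRaptor rule.

```ruby
# Hypothetical helper: builds the options hash for DocRaptor.create using
# the same option names as the call above.
def pdf_options(html, name, test = true)
  options = {
    :document_content => html,
    :name             => name,
    :document_type    => "pdf",
    :test             => test
  }
  # Assumption: a <script> tag means the page needs JavaScript to render.
  options[:javascript] = true if html =~ /<script\b/i
  options
end

loader_page = "<html><head><script>WebFontConfig = {};</script></head><body></body></html>"
plain_page  = "<html><body><h1>No scripts here</h1></body></html>"

p pdf_options(loader_page, "google-fonts-js.pdf")[:javascript] # prints true
p pdf_options(plain_page, "plain.pdf").key?(:javascript)       # prints false
```

The returned hash can be handed to DocRaptor.create in place of the inline options.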

Downloads:

  • The HTML
  • The PDF produced

No Margin PDFs with DocRaptor

We had a support ticket come in this morning asking how to make a PDF with no margins using DocRaptor. What follows is a little background information and the simplest way to achieve the result.

Page Styles in Prince XML and DocRaptor

To understand how to do this, you need to know DocRaptor uses Prince XML to produce PDFs. Prince has several interesting additions to standard DOM/CSS, the most important of which for this tutorial is page styles.

Prince allows us to set styles in an @page rule. That rule applies to the page itself, which you can think of as a box that wraps everything else that will appear in your PDF. An analog of this you are probably familiar with is the page settings you can choose when physically printing documents. I encourage you to poke around the @page-related documentation on Prince’s site. Really cool stuff!

Setting Your Page to Have No Margins

Anyway, back to the tutorial. What we want to do is set our page to have no margins, as Prince’s default works out to around an inch/2.5cm. Here’s some HTML and a picture of the document that it generates.

<html>
  <head>
    <style type="text/css">
      @page { margin: 0; }
      body { font-size: 200%; }
    </style>
  </head>
  <body>
    This content is right up against the edge of the page!
  </body>
</html>

What it looks like:

no margin example

It’s as easy as that. Check out the downloads below to see the actual documents.
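
If your HTML is assembled in Ruby before it goes to DocRaptor, the same rule can be templated. Here's a rough sketch. html_with_page_margin is a hypothetical helper name, and the 0.5cm value in the second call is only an illustration of passing a different CSS length.

```ruby
# Hypothetical helper: wraps body content in a document whose @page margin
# is set explicitly. The default of "0" matches the no-margin example above.
def html_with_page_margin(body, margin = "0")
  <<-HTML
<html>
  <head>
    <style type="text/css">
      @page { margin: #{margin}; }
    </style>
  </head>
  <body>
    #{body}
  </body>
</html>
  HTML
end

puts html_with_page_margin("This content is right up against the edge of the page!")
puts html_with_page_margin("This content sits inside a narrow border.", "0.5cm")
```

The returned string can be passed as :document_content, just like the contents of a saved HTML file.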

Downloads:

  • The HTML
  • The PDF produced

DocRaptor Does Google Web Fonts

Google Web Fonts are an easy way to add some panache to a website. With DocRaptor’s JavaScript and font-face support, they’re an easy way to add some style to a PDF.

Google’s Getting Started page has a very simple example (reproduced below) that can be used to generate a simple PDF.

<html>
  <head>
    <link rel="stylesheet" type="text/css" href="http://fonts.googleapis.com/css?family=Tangerine">
    <style>
      body { font-family: 'Tangerine', serif; font-size: 48px; }
    </style>
  </head>
  <body>
   <h1>Making the Web Beautiful!</h1>
  </body>
</html>

If you’re using the Ruby Example from the DocRaptor Examples repo, you could change the PDF block to look like this (assuming you had saved the above HTML to a file named font-sample.html in the directory with straight_ruby_example.rb):

# "wb" opens the file in binary mode so the PDF bytes are written unchanged
File.open("google-fonts-ftw.pdf", "wb") do |f|
  f.write DocRaptor.create(:document_content => File.read("font-sample.html"),
                           :name             => "google-fonts-ftw.pdf",
                           :document_type    => "pdf",
                           :test             => true)
end

After running the file through Ruby, you should end up with a PDF named google-fonts-ftw.pdf that looks like this (caveat: I bumped up the font size, centered the text, and turned off test mode for this picture):

PDF generated image

And that’s it. Beautiful fonts in your PDFs with ease!

Headers and Footers in PDFs with DocRaptor

DocRaptor Support gets asked a lot if there’s support for headers and footers. The answer is a most definite YES!

We use Prince XML to produce PDFs. As such, there are some specialty CSS rules that can be used to create consistent headers and footers for your PDFs. Here’s the Prince XML page header and footer documentation.

A simple example and output are below.

<!DOCTYPE html>
<html lang="en-US">
  <head>
    <meta charset="utf-8">
    <style type="text/css">
      @page {
        font-size: 200%;
        @top {
          content: "header";
          background-color: #eee;
        }
        @bottom {
          content: "footer";
          background-color: #eee;
        }
      }
    </style>
  </head>
  <body>
  </body>
</html>

After running it against DocRaptor, you should get a doc that looks like this:

header and footer sample image
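
Real headers and footers usually carry text that changes per document, so you may want to generate that @page rule instead of hard-coding it. A minimal sketch follows. header_footer_css is a hypothetical helper, it only covers the @top and @bottom boxes used in the example, and the sample strings are invented.

```ruby
# Hypothetical helper: renders an @page rule with @top and @bottom margin
# boxes, mirroring the stylesheet in the example above.
def header_footer_css(header, footer)
  # The text goes inside a double-quoted CSS string, so escape any
  # backslashes and double quotes it contains.
  quote = lambda { |text| '"' + text.gsub(/["\\]/) { |ch| "\\" + ch } + '"' }
  <<-CSS
@page {
  @top { content: #{quote.call(header)}; }
  @bottom { content: #{quote.call(footer)}; }
}
  CSS
end

puts header_footer_css("Quarterly Report", 'Printed by "DocRaptor"')
```

Put the output inside the style tag of the document you send to DocRaptor.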

Downloads