
Chapter 1: Introduction

ReportLab's toolkit addresses several central problems that e-businesses face in creating publication-quality reports that are customized, produced in real time and in volume, and platform independent. Existing reporting tools are limited to database reports, are typically Windows-based, place awkward restrictions on layout and graphic design, and send output straight to a printer. More complex publishing systems involve pipelines of applications that are too unwieldy for real-time use in large-scale environments.

ReportLab's product suite allows direct creation of rich PDF reports on web or application servers in real time. The tools run on any platform, can actively acquire data from any source (XML, flat files, databases, COM/CORBA/Java), place no limits on the output, and facilitate electronic delivery and archival. The ReportLab suite lets you define your own business rules to automatically create custom online reports, catalogs, business forms, and other documents.

RML2PDF is a central component of the toolkit: a translator which converts high level XML markup into PDF documents. Report Markup Language describes the precise layout of a printed document, and RML2PDF converts this to a finished document in one step. In a dedicated reporting application, other components of our toolkit handle data acquisition and preparation of the RML document.

RML2PDF on its own also fills a key technology gap. Our full toolkit relies heavily on the Python scripting language. Nevertheless we recognize that IT departments and software houses have their own distinct skill sets and development tools. A company may already have developed a rich 3-tier architecture with the key business data in Java or COM objects on an application server. All they need is the formatting component. They can use exactly the same techniques they use to generate HTML (XSLT, JSP, ASP or anything else) to generate an RML file, and the program turns this into a finished document. Fast.

Unlike a number of other formatting languages, RML aims squarely at corporate needs. Paragraph, table and page styles are kept in independent 'stylesheets', allowing reuse and global changes across a family of documents. The table model has been designed for efficient rendering of business data. And a plug-in architecture lets you easily develop and add in custom vector graphics or page templates within the same tool set.

RML2PDF can also work in tandem with our PageCatcher product. PageCatcher is a support tool which extracts graphical elements from PDF files for inclusion in documents generated by RML2PDF or the ReportLab core API. Since any external program with the ability to print can produce PDF files, this means that a ReportLab document can include graphical elements created by virtually any program. These imported elements can be combined freely with text or graphics drawn directly into the document. For example an application can import pages from a government tax form and draw text in the spaces provided to fill in the form. The resulting document can then be combined with a cover letter at the beginning and supporting tabular data at the end -- all in a single PDF document.

1.2. Installation and Use

To avoid duplication, the full installation instructions are always on ReportLab's web site at this address:

http://www.reportlab.com/software/installation/

RML2PDF is a compiled Python module. It can be run with options from the command line, and it also exposes a programmable API so that it can be used as a component of a larger Python application. Since Python integrates with a wide variety of other languages, it is also possible to access RML2PDF from C and C++ programs, COM and many other environments.

RML2PDF is delivered as part of ReportLab's rlextra package and licensed under the name ReportLab PLUS. This package depends on our 'reportlab' package and some other open source libraries, all detailed on the above installation page.

RML2PDF requires a license key file to work in production mode. Without the license key each page produced by RML2PDF will be visibly marked as an "evaluation" copy, and the file will be annotated invisibly as produced for evaluation purposes as well. With a valid license key file present, RML2PDF will run in production mode and the PDF file generated will contain the licensing information. You can purchase a ReportLab PLUS license using your user account on our website http://www.reportlab.com. Once we issue you a '.pyc' license file you will need to install it somewhere on your PYTHONPATH so that rml2pdf can find it.

Running RML2PDF from the command line

RML2PDF can be run from the command line, provided that you place it on your path. We normally ship this module in compiled (.pyc) format, so you need a Python interpreter of the correct version to run it, and need to know where it was installed. The installation process does not currently register a script for you. On Unix, you may wish to add the directory to your path, or create a wrapper script in your bin directory.

python /path/to/rlextra/rml2pdf/rml2pdf.pyc filename.rml

On Windows, .pyc files are normally associated with the most recently installed Python interpreter, so you could run the module directly:

c:\temp> c:\python26\lib\site-packages\rlextra\rml2pdf\rml2pdf.pyc filename.rml

After completing successfully, the rml2pdf program returns to the command prompt. The output PDF file should be created in the current working directory.

Calling RML2PDF from Python

RML2PDF can also be called directly from your own Python program using the rml2pdf.go(...) entry point.

There are two main ways the 'go' function can be used - either to generate the resulting PDF file on disk in the file system, or to generate it in memory (useful for web applications returning the PDF directly to the user).

This example uses the 'go' function to create the output PDF file on disk:

from rlextra.rml2pdf import rml2pdf

rml = getRML()  # Use your favourite templating language here to create the RML string
output = '/tmp/output.pdf'

rml2pdf.go(rml, outputFileName=output)

This is an example Django web application view generating a PDF in memory and returning it as the result of an HTTP request:

from django.http import HttpResponse
from rlextra.rml2pdf import rml2pdf
import io

def getPDF(request):
    """Returns PDF as a binary stream."""

    # Use your favourite templating language here to create the RML string.
    # The generated document might depend on the web request parameters,
    # database lookups and so on - we'll leave that up to you.
    rml = getRML(request)

    # PDF data is binary, so collect it in an in-memory bytes buffer.
    buf = io.BytesIO()

    rml2pdf.go(rml, outputFileName=buf)
    pdfData = buf.getvalue()

    response = HttpResponse(content_type='application/pdf')
    response.write(pdfData)
    response['Content-Disposition'] = 'attachment; filename=output.pdf'
    return response

The 'go' function has the following interface:

def go(xmlInputText, outputFileName=None, outDir=None, dtdDir=None,
       passLimit=2, permitEvaluations=1, ignoreDefaults=0,
       pageCallBack=None,
       progressCallBack=None,
       preppyDictionary=None, preppyIterations=1,
       dynamicRml=0, dynamicRmlNameSpace={},
       encryption=None,
       saveRml=None,
       parseOnly=False,
       ):

  • xmlInputText must be a string which contains the RML specification for the PDF document to be generated.

  • outputFileName when specified overrides any output file name specified in the xml input text. You may also pass in a file-like object (e.g. a StringIO, file object or web request buffer), in which case nothing is written to disk.

  • outDir (output directory) parameter when present specifies the directory in which to place the output file.

  • dtdDir is an optional DTD directory parameter which specifies the directory containing the DTD for the current version of RML.

  • passLimit sets the maximum number of formatting passes: None means "keep trying until done", 3 means "try 3 times then quit".

  • permitEvaluations when false disallows the evalString tag for security (e.g. web apps).

  • ignoreDefaults 1 means "do one pass and use the default values where values are not found".

  • pageCallBack is a callback to execute on final formatting of each page - used for counting number of pages.

  • progressCallBack is a cleverer callback; see the progressCB function in reportlab/platypus/doctemplate.

  • preppyDictionary if set to a dictionary indicates that the xmlInputText should be preprocessed using preppy with the preppyDictionary as argument. If preppyDictionary is not None and preppyIterations is >1 then the preppy preprocessing will be repeated preppyIterations times (max of 3) with the same dict, to generate, e.g., table of contents.

  • preppyIterations - see preppyDictionary.

  • dynamicRml is an optional boolean field for whether the RML can be dynamically altered.

  • dynamicRmlNameSpace is for use with dynamicRml. It's a dictionary which you can add variables to for processing.

  • encryption, if set, must be an encryption object, for example: rlextra.utils.pdfencrypt.StandardEncryption("User", "Owner", canPrint=0, canModify=0, canCopy=0, canAnnotate=0).

  • saveRml is useful for debugging dynamically generated RML. Specify a filename where the RML should be saved.

  • parseOnly if set to True, will only parse the RML and not generate a PDF.
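As a sketch of how several of these parameters combine, the helper below renders to memory and turns off evalString by default. The name render_pdf_bytes is hypothetical, and the go callable is passed in as an argument (rather than imported) purely so the helper can be tried without rlextra installed. It assumes a Python version where binary output is collected in io.BytesIO.

```python
import io


def render_pdf_bytes(rml_text, go, **options):
    """Render RML to PDF bytes in memory (hypothetical helper).

    `go` is expected to be the rml2pdf.go callable. Extra keyword
    arguments are forwarded to it unchanged.
    """
    # Disallow the evalString tag unless the caller explicitly opts in;
    # a cautious default when the RML is built from web request data.
    options.setdefault("permitEvaluations", 0)

    buf = io.BytesIO()
    # Passing a file-like object means nothing is written to disk.
    go(rml_text, outputFileName=buf, **options)
    return buf.getvalue()
```

With rlextra installed, a call might look like render_pdf_bytes(rml, rml2pdf.go, saveRml='/tmp/debug.rml'), which would also keep a copy of the RML for debugging. An encryption object such as the StandardEncryption example above could be forwarded the same way through the encryption keyword.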

It is also possible to call rml2pdf from other programming languages (such as C++) by using the standard methods for calling a Python callable. See the Python Language Embedding and Extension manuals.

NB it is also possible to use the userPass, ownerPass, permissions & encryptionStrength attributes of the document tag to make rml2pdf create an encrypted PDF.

For further information regarding the installation of your version of RML2PDF please see the release notes and READMEs that come with the package.

1.3. What is RML?

RML is the Report Markup Language - a member of the XML family of languages, and the XML dialect used by rml2pdf to produce documents in Adobe's Portable Document Format (PDF).

RML documents can be written automatically by a program or manually using any word processor that can output text files (e.g. using a "Save as Text" option from the save menu). Since RML documents are basic text files, they can be created on the fly by scripts in Python, Perl, or almost any other language.
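To illustrate generating RML from a script, here is a minimal sketch that builds the document as an XML tree with Python's standard xml.etree.ElementTree module, which takes care of escaping. The function name build_rml and the page geometry are assumptions made for the example. The header lines follow the usual form of RML sample files.

```python
import xml.etree.ElementTree as ET

# Header lines that ElementTree does not emit itself.
RML_HEADER = (
    '<?xml version="1.0" encoding="utf-8" standalone="no"?>\n'
    '<!DOCTYPE document SYSTEM "rml.dtd">\n'
)


def build_rml(paragraphs, filename="output.pdf"):
    """Return an RML document string with one <para> per input string."""
    document = ET.Element("document", filename=filename)

    # One page template holding a single frame; the geometry is illustrative.
    template = ET.SubElement(document, "template")
    page = ET.SubElement(template, "pageTemplate", id="main")
    ET.SubElement(page, "frame", id="body",
                  x1="72", y1="72", width="451", height="698")

    ET.SubElement(document, "stylesheet")

    # ElementTree escapes &, < and > in text when serializing.
    story = ET.SubElement(document, "story")
    for text in paragraphs:
        ET.SubElement(story, "para").text = text

    return RML_HEADER + ET.tostring(document, encoding="unicode")
```

Calling build_rml(["First paragraph", "Fish & chips"]) returns a string that can be saved as a .rml file or handed straight to RML2PDF. Building a tree is a little more verbose than filling in a text template, but it cannot produce ill-formed XML.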

RML makes creating documents in PDF as simple as creating a basic web page - RML is as easy to write as HTML, and uses "tags" just like HTML. It is much easier than trying to write PDF programmatically.

1.4. What is this document?

This document is a user guide and tutorial for RML. It deals with RML as specified in the RML DTD - . If your installation of RML uses a later version, you will need a later version of the DTD and of this tutorial. Look on the ReportLab website (http://www.reportlab.com) for more details.

This document has been generated from RML. If you need another example of RML in action, look at the file "rml_user_guide.rml" to see how this file was produced.

1.5. Who is this document aimed at?

This document is aimed at anyone who needs to write RML. It assumes that you have some experience with some form of programming or scripting. Basic HTML is fine.

You do not have to be employed as a programmer or have extensive programming skills for this guide to make sense. We have tried to keep it as simple as possible and to minimise confusion.

1.6. Conventions used in this document

It is more technically correct to call the various items in RML "elements", as you do in XML. However, since we're assuming that more people know basic HTML than XML, we'll call them "tags" rather than elements in this guide.

There are also a couple of typographical conventions we'll be using:

constant width

Throughout this User Guide, we'll be using a constant width typeface to highlight any literal element of RML (such as tag names or attributes for tags) when they appear in the text.

A smaller constant width font is used for code snippets (short one or two line examples of what RML commands look like) and code examples (longer examples of RML which usually have an illustration of the output they produce).