Chapter 7: Advanced Text
The <title> tag sets the title for a document, or a section of a document, and displays it on the page. By default, this is set in a larger typeface than the body text (in much the same way as headings are). You can change the way a title is set by defining a style called style.Title (in the stylesheet section of your document).
[Note: This tag does not affect what is displayed in the "title bar" at the top of a document.]
Example:
<stylesheet>
<paraStyle name="style.Title"
fontName="Courier-Bold"
fontSize="36"
leading="44"
/>
</stylesheet>
<story>
<title>This is the Title</title>
<para>
And it should be set in 36 pt Courier Bold.
</para>
</story>
7.2. Headings -- h1, h2, h3
Headings are also handled in the same way as in HTML. The most important heading level has its text enclosed by <h1> and </h1> tags, and less important sub-headings use the <h2> </h2> and <h3> </h3> tags in the same way.
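As a minimal sketch (the heading text is made up for illustration), the three heading levels can be used together in a story like this:

```xml
<story>
<h1>Annual Report</h1>
<h2>Financial Summary</h2>
<h3>Revenue by Region</h3>
<para>
Body text follows the headings as usual.
</para>
</story>
```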
7.3. Paragraphs and paragraph styles
As well as explicitly placing a piece of text into a certain position on a page using the drawString commands, RML also allows you to use paragraphs of text. Paragraphs are flowables. This means that you don't need to tell RML exactly where every line is going to go on the page - you let rml2pdf worry about that.
To do this you place your text inside the story section of an RML document, and use the <para> and </para> tags to tell the parser where each paragraph starts and ends.
As well as delineating where paragraphs begin and end, the <para> tag can also take a number of attributes:
style:
If you have set up a style in the stylesheet section of a document, you can refer to it by name using the style attribute. For example, if you have defined a style called Normal, you can have your paragraph appear in that style by using <para style="Normal">.
alignment:
How the text is aligned within the paragraph. It can be LEFT, RIGHT, CENTER (or CENTRE) or JUSTIFY.
fontName, fontSize:
fontName and fontSize set the name and size of the font that you want this paragraph displayed in. (This can often be better done by defining a <paraStyle> inside a <stylesheet>, and then using the style attribute to apply it to that paragraph.) Example: <para fontName="Helvetica" fontSize="12">
leading:
leading is used to alter the space between lines. In RML, it is expressed as the height of a line plus the space between lines. So if you are using a 10 point font, a leading of 15 will give you a space between lines of 5 points. If you use a number that is smaller than the size of the font you are using, the lines will overlap.
leftIndent, rightIndent:
leftIndent and rightIndent apply space to the left or right of a paragraph which is in addition to any margin you have set.
firstLineIndent:
firstLineIndent is used when you want your paragraph to have an additional indent on the first line - on top of anything set with leftIndent.
spaceBefore, spaceAfter:
spaceBefore and spaceAfter, as you would expect, set the spacing before a paragraph or after it.
textColor:
This sets the color to be used in displaying a paragraph.
bulletText, bulletColor, bulletFontName, bulletFontSize, bulletIndent:
These are all used to set the characteristics for any bullets to be used in the paragraph.
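The attributes above can be combined on a single paragraph. The following is an illustrative sketch only: the values are arbitrary, the wording of the paragraph is invented, and in a real document most of these settings would more usually live in a <paraStyle>.

```xml
<para fontName="Helvetica" fontSize="10" leading="15"
      alignment="JUSTIFY" leftIndent="20" rightIndent="20"
      firstLineIndent="12" spaceBefore="6" spaceAfter="6"
      textColor="blue">
This paragraph is set in 10 point Helvetica with a leading of 15,
so there are 5 points of space between lines. It is indented by
20 points on each side, with a further 12 points on the first line.
</para>
```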
Inside the story, you can also do a number of things that you can't do with the drawString commands. For a start, you can use bold, italics and underlining. If you are familiar with HTML, you will recognize these tags: <i> and </i> start and stop italics, <b> and </b> start and stop the text being set as bold, and <u> and </u> start and stop underlining.
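A short sketch of the three inline tags in use (the sentence itself is invented for illustration):

```xml
<para>
This sentence has <b>bold</b>, <i>italic</i> and <u>underlined</u> words,
and the tags can be <b><i>nested</i></b>.
</para>
```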
7.4. The font tag
You can also explicitly set the font using the <font> tag. This has the optional attributes face, color, and size, which are all pretty self-explanatory. You need to use a </font> tag to close this before the end of the paragraph. Example:
<font face="Courier" color="crimson">This is courier in crimson!</font>
That example produces this line of text:
This is courier in crimson!
7.5. Superscripts and subscripts
Another thing you can do inside the story is use superscripts and subscripts. You do this with the <super> </super> and <sub> </sub> tags. (Superscript is where the text is raised above the line, as in the mathematical symbols for squared or cubed, and subscript is where it is lowered relative to the rest of the line in the same way.) The <super> tag can also be written <sup>. These tags have the optional attributes rise, which sets the baseline shift (applied negatively for <sub>), and size, which sets the font size to use in the tag. The default rise is 50% of the font size, and the default size is the existing font size minus min(2, 20% of the existing font size). Example:
<sub>This is subscript.</sub>
This is normal text.
<super>This is superscript.</super>
That example produces this:
This is subscript. This is normal text. This is superscript.
whereas this example:
<sub size="6" rise="5">This is subscript.</sub>
This is normal text.
<super size="6" rise="5">This is superscript.</super>
produces this:
This is subscript. This is normal text. This is superscript.
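To make the default values concrete: assuming a paragraph set in a 10 point font, the default rise is 50% of 10, which is 5 points, and the default size is 10 minus min(2, 2), which is 8 points. Under that assumption, spelling the defaults out explicitly should give the same result as leaving the attributes off. This is a sketch based on the rule stated above, not a tested file:

```xml
<para fontSize="10">
E = mc<super>2</super> relies on the defaults, while
E = mc<super rise="5" size="8">2</super> states them explicitly.
</para>
```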
7.6. Lists
RML supports ordered and unordered lists, using the tags <ol>, <ul> and <li>. They work in a similar way to their HTML equivalents. A list item can be any normal flowable element, but there can be only one such item within a pair of list item tags. Lists can be nested.
WARNING: The contents of a list are flowable objects, and the list itself does not know what font sizes or spacing you will use in the enclosed paragraphs. Therefore, if you want to get normal typography, it's very important to define a <listStyle> with font names, size and spacing matching that of the <paraStyle> you use for the enclosed text.
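The matching described in the warning can be sketched as follows. For simplicity this sketch sets the bullet font attributes directly on the <ol> tag rather than in a <listStyle>; the style name listBody is hypothetical, and the values are only examples.

```xml
<stylesheet>
<paraStyle name="listBody"
    fontName="Helvetica"
    fontSize="10"
    leading="12"
/>
</stylesheet>
<story>
<!-- bullet font and size chosen to match the listBody paragraph style -->
<ol bulletFontName="Helvetica" bulletFontSize="10">
<li>
<para style="listBody">First item, with the number set to match the text.</para>
</li>
<li>
<para style="listBody">Second item.</para>
</li>
</ol>
</story>
```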
You should also be aware that RML's <para> tag already has a flexible `bullet` feature which can provide bulleted, numbered and definition lists that match the corresponding text. In general, lists should only be used when you are converting from HTML, or when you need to place arbitrary flowables such as tables or images in the body of a list.
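As a rough sketch of the paragraph-based alternative, the bullet attributes listed in section 7.3 can be used to build a simple list without any list tags. The style name bulleted and the indent values are assumptions chosen for illustration.

```xml
<stylesheet>
<paraStyle name="bulleted"
    fontName="Helvetica"
    fontSize="10"
    leftIndent="18"
    bulletIndent="6"
    bulletFontName="Helvetica"
    bulletFontSize="10"
/>
</stylesheet>
<story>
<!-- bulletText supplies the marker drawn to the left of each paragraph -->
<para style="bulleted" bulletText="1.">A numbered item.</para>
<para style="bulleted" bulletText="2.">Another numbered item.</para>
<para style="bulleted" bulletText="-">A plain bulleted item.</para>
</story>
```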
Lists and list items can be styled using tag attributes or with <listStyle> tags in the stylesheet section. See LIST_MAIN_ATTRS in rml.dtd for the full list of attributes on the <ul>, <ol> and <li> tags.
In ordered lists, you can use the following types of enumeration in the bulletType or start attributes:
'I': Roman numerals (capitals)
'i': Roman numerals (lower case)
'1': Arabic numerals
'A': Capital letters
'a': Lowercase letters
For unordered lists, bulletType must be set to 'bullet'.
Unordered lists can use bullet types of the following shapes by setting the 'start' attribute in <ul> or the 'value' attribute in <li> tags:
bulletchar
square
disc
diamond
diamondwx
circle
blackstar
squarers
arrowhead
Alternatively, any non-whitespace character can be used as a bullet.
As a final possibility, a blank-separated string of the possible starts can be used to indicate that automatic depth changes should be attempted.
The size, colour and position (indenting, space before/after etc.) of bullets and enumerations can be adjusted with the relevant tag attributes. List item attributes override the attributes on <ol> or <ul> tags.
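The enumeration and bullet settings described above might be combined as in the following sketch. It is based only on the attribute descriptions in this section and has not been checked against rml.dtd; the item text is invented.

```xml
<story>
<!-- ordered list numbered with lower case Roman numerals -->
<ol bulletType="i">
<li><para>first numbered item</para></li>
<li><para>second numbered item</para></li>
</ol>
<!-- unordered list: square bullets by default, one item overridden -->
<ul bulletType="bullet" start="square">
<li><para>marked with a square</para></li>
<li value="disc"><para>marked with a disc</para></li>
</ul>
</story>
```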
Definition lists are not yet implemented.
A simple example of nested ordered/unordered lists:
<story>
<ol bulletColor="red" bulletFontName="Times-Roman">
<li bulletColor="blue" bulletFontName="Helvetica">
<para>
Welcome to RML 1
</para>
</li>
<li>
<ul bulletColor="red" bulletFontName="Times-Roman" bulletFontSize="5" rightIndent="10">
<li bulletColor="blue" bulletFontName="Helvetica">
<para>
unordered 1
</para>
</li>
<li>
<para>
unordered 2
</para>
</li>
</ul>
</li>
</ol>
</story>
For more examples of how to use lists see 'test_046_lists.rml' in '/rlextra/rml2pdf/test/'.
7.7. Using multiple frames
If you have split your page into more than one frame, you can flow text between frames. To do this you use the <nextFrame/> tag. This is an "empty" or "singleton" tag - it doesn't take any content. Put in <nextFrame/> and your text will continue into the next frame. It should appear outside your paragraphs - between one </para> and the next <para> tag. An optional name attribute can be used to specify the name or index of the frame which you wish to switch to.
You can control the automatic switch of frames by using the <setNextFrame/> tag. The required name attribute specifies the name or index of the frame which you wish to switch to. The <setNextFrame/> tag is an "empty" or "singleton" tag - it doesn't take any content. Put in <setNextFrame name="F5"/> and your text will flow into the frame specified. It should appear outside your paragraphs - between one </para> and the next <para> tag.
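A sketch of both tags in a story follows. It assumes a page template that defines frames called columnTwo and F5 (both names are only examples), and it reflects one reading of the description above: the first tag switches frames immediately, while the second only chooses where the text goes when the current frame is full.

```xml
<story>
<para>This paragraph goes into the first frame.</para>
<!-- switch immediately to the named frame -->
<nextFrame name="columnTwo"/>
<para>This paragraph starts in the frame called columnTwo.</para>
<!-- choose the frame that the text will flow into next -->
<setNextFrame name="F5"/>
<para>
When the current frame fills up, the text continues in frame F5.
</para>
</story>
```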
If you have defined more than one kind of template (by using <pageTemplate> in the template section at the head of the RML document), you can also force RML into using a new template for the next page. You do this by using the <setNextTemplate> tag. This tag has only one attribute - the mandatory one of name, which tells RML which template it should use.
In practice, you would usually set the next template and then use a nextFrame:
<setNextTemplate name="yetAnotherTemplate"/>
<nextFrame/>
7.8. Preformatted text -- pre and xpre
One tag that is also a flowable, but that can't be used inside the <para> </para> tags, is <pre>. Just as in HTML, the <pre> tag denotes pre-formatted text. It displays text exactly as you typed it, with the line breaks exactly where you put them and no line-wrapping. If you want to keep any formatting in your text (such as tabs and extra whitespace), enclose it in <pre> tags rather than <para> tags.
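For instance, a short program listing (a made-up Python fragment, used here only as sample content) can be placed in the story with its indentation and line breaks intact:

```xml
<story>
<pre>
def greet(name):
    if name:
        print("Hello, " + name)
</pre>
</story>
```

Because the content is still XML, characters such as the less-than sign and the ampersand would need to be escaped inside the <pre> tag.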
You can also pass a style to the <pre> tag. If you don't use the optional style attribute, anything between the <pre> tag and the </pre> tag will appear in the default style for pre-formatted text. This uses a fixed width "typewriter" font (such as Courier), and is useful for things such as program listings, but may not be what you want for your quotation or whatever. If you have already defined a style (in the stylesheet section of your RML document), then you can make the <pre> tag use this for your pre-formatted text.
Example:
<xpre style="myStyle">
this is pre-formatted text.
</xpre>
The xpre is similar to the pre tag in that it preserves line breaks where they are placed in the text, but xpre also permits paragraph annotations such as bold face and italics and font changes. For example, the following mark-up
<xpre>
this is an <i>xpre</i> <b>example</b>
<font color="red">including red text!</font>
</xpre>
generates the following text
7.9. Greek letters
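The <greek> tag sets the letters it encloses as their Greek equivalents, so that <greek>l</greek> is displayed as λ and <greek>p</greek> as π.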
Example:
In physics, Planck's formula for black body radiation can be expressed as:
\[ R\lambda = (c/4)\,(8\pi/\lambda^{4})\,[\,(hc/\lambda)\;1/e^{hc/\lambda kT}-1\,] \]
In RML, this is expressed as:
R<greek>l</greek>=(c/4) (8<greek>p</greek>/<greek>l</greek><super>4</super>)
[ (hc/<greek>l</greek>) 1/e<super>hc/<greek>l</greek>kT</super>-1 ]
For a table of the Greek letters used by the
This next example shows features from several of the commands described in the previous sections, such as the use of frames, the options to the template tag, stylesheets, and so on. See the next section for information on using the
<?xml version="1.0" encoding="iso-8859-1" standalone="no" ?>
<!DOCTYPE document SYSTEM "../rml.dtd">
<document filename="example_5.pdf">
<template pageSize="(21cm, 29.7cm)"
leftMargin="2.5cm"
rightMargin="2.5cm"
topMargin="2.5cm"
bottomMargin="2.5cm"
title="Example 5 - templates and pageTemplates"
author="Reportlab Inc (Documentation Team)"
showBoundary = "1"
allowSplitting = "20"
>
<!-- showBoundary means that we will be able to see the limits of frames -->
<pageTemplate id="main">
<pageGraphics>
</pageGraphics>
<frame id="titleBox" x1="2.5cm" y1="27.7cm" width="16cm" height="1cm"/>
<frame id="columnOne" x1="2.5cm" y1="2.5cm" width="7.5cm" height="24.7cm"/>
<frame id="columnTwo" x1="11cm" y1="2.5cm" width="7.5cm" height="24.7cm"/>
</pageTemplate>
</template>
<stylesheet>
<initialize>
<name id="FileTitle" value="Example 5 - templates and pageTemplates"/>
<name id="ColumnOneHeader" value="This is Column One"/>
<name id="ColumnTwoHeader" value="This is Column Two"/>
</initialize>
<paraStyle name="titleBox"
fontName="Helvetica-Bold"
fontSize="18"
spaceBefore = "0.4 cm"
alignment = "CENTER"
/>
<paraStyle name="body"
fontName="Helvetica"
fontSize="10"
leftIndent = "5"
spaceAfter = "5"
/>
</stylesheet>
<story>
<para style = "titleBox">
<b><getName id="FileTitle"/></b>
</para>
<nextFrame/>
<h2>
<getName id="ColumnOneHeader"/>
</h2>
<para>
This is the contents for <b>column one</b>.
</para>
<para>
It uses the default style for paragraph.
</para>
<para>
Does it come out OK?
</para>
<para>
There now follows some random text to see how these paragraphs look with longer content:
</para>
<para>
Blah blah morale blah benchmark blah blah blah blah blah blah communication blah
blah blah blah blah blah blah blah blah blah stretch the envelope blah blah blah.
Blah blah quality vector blah blah blah blah blah gap analysis blah blah blah.
Blah blah blah blah blah implement blah blah blah blah blah blah blah synergize
blah blah blah blah phase blah blah blah blah blah blah blah. Blah blah blah
blah world class blah blah blah blah blah experiencing slippage blah blah blah
blah blah networking communication. Blah blah blah blah blah blah blah blah blah
blah blah blah blah blah blah.
</para>
<para>
Blah blah blah blah blah blah blah blah blah blah blah blah architect blah inter
active backward-compatible blah blah blah blah blah. Blah blah blah blah value-a
dded blah go the extra mile blah blah solutioning recognition blah phase blah cr
edibility. Blah networking blah blah blah blah market segment blah blah blah har
dball blah networking blah blah blah blah blah implement blah blah blah.
</para>
<para>
Blah blah blah blah blah blah blah blah blah re-factoring phase blah knowledge
management blah blah. Blah blah blah blah interactive blah vision statement blah
blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah
blah blah blah blah blah blah blah. Blah blah blah empowering blah blah
interactive blah empowerment blah blah blah blah blah backward-compatible blah
downsize quality blah blah blah blah synergy blah blah blah.
</para>
<para>
Blah blah blah blah blah blah conceptualize blah downsize blah blah blah blah.
Blah blah blah blah blah blah blah blah blah blah blah blah synergy client-
centered vision statement. Blah appropriate blah synergize regroup blah blah blah blah
blah synergy blah blah blah blah blah blah blah blah blah vision statement down
size goal-setting.
</para>
<para>
Blah blah dysfunctional blah blah blah blah blah blah blah appropriate blah blah
blah blah blah blah blah blah re-factoring go the extra mile blah blah blah blah.
Blah implement blah blah blah blah streamline blah quarterly blah blah blah blah
blah blah goal-setting blah blah blah real estate.
</para>
<nextFrame/>
<h2>
<getName id="ColumnTwoHeader"/>
</h2>
<para style = "body">
This is the contents for <i>column two</i>.
</para>
<para style = "body">
It uses the paragraph style we have called "body".
</para>
<para style = "body">
Does it come out OK?
</para>
<para style = "body">
There now follows some random text to see how these paragraphs look with longer content:
</para>
<para style = "body">
Blah OS/2 blah blah blah blah coffee blah blah blah blah Windows blah blah blah
blah blah blah blah. Blah blah blah blah blah blah blah Modula-3 blah blah blah
blah blah blah blah blah. Blah blah bug report blah blah blah blah blah memory
blah blah TeX TCP/IP SMTP blah blah. Blah blah blah Multics blah blah blah blah
blah blah blah blah blah Modula-2 blah blah blah blah blah XML blah blah blah
blah Perl blah. Blah blah blah blah blah blah format your hard drive blah blah blah
Sun Microsystems blah blah blah.
</para>
<para style = "body">
Blah blah blah blah blah Em blah letterform blah blah blah blah blah blah blah
blah blah letterform blah blah. Blah blah blah blah leader blah blah blah blah
frame blah blah blah. Blah blah blah blah blah Pantone[TM] ligature blah blah
flush left blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah
blah blah blah colour separations rule blah blah blah blah blah. Blah blah blah
blah blah blah blah blah letterform blah blah type foundry blah blah flush-right
blah prepress blah blah blah blah flush-right blah blah.
</para>
<para style = "body">
Blah blah blah blah blah uppercase blah blah right justified blah blah blah
flush-right blah blah blah. Blah blah blah blah blah blah spot-colour blah Em
ligature blah blah blah Em.
</para>
<para style = "body">
Blah dingbat blah blah blah blah blah blah blah blah blah blah blah blah blah
blah blah. Blah blah blah blah blah drop-cap blah blah blah blah blah blah blah
blah blah. Blah blah blah blah blah blah gutter right justified blah blah blah
blah blah blah blah Pantone[TM].
</para>
</story>
</document>
Output from EXAMPLE 5
7.10. Asian Fonts
RML supports all of Adobe's Asian Font Packs. You can display text in Japanese, Traditional and Simplified Chinese and Korean using two different techniques.
The most robust technique is to include the standard Asian fonts Adobe specifies for use with Acrobat Reader. These will already be installed on the end user's machine if they have a localized copy of Acrobat Reader, or may be downloaded in the free "Asian Font Packs" from Adobe's site. In these cases there is no need to embed any fonts or to have any special software on the server. The first stage is to declare the fonts you need in the optional 'docinit' tag at the beginning of the document as follows:
<document filename="test_015_japanese.pdf">
<docinit>
<registerCidFont faceName="HeiseiMin-W3"/>
</docinit>
<template ...>
etc.
Note: The encName attribute of registerCidFont is deprecated: you should not use it with new documents.
You may then declare paragraph styles, use string-drawing operations or blockTable fonts referring to the font you have defined:
<paraStyle name="jtext"
fontName="HeiseiMin-W3"
fontSize="12"
leading="14"
spaceBefore="12" />
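A paragraph using this style could then look like the sketch below. It assumes the RML file is saved in an encoding that matches its XML declaration (UTF-8 is assumed here); the Japanese sentence simply means "This is a Japanese paragraph".

```xml
<story>
<para style="jtext">
これは日本語の段落です。
</para>
</story>
```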
The test directory includes a file test_015_japanese.rml containing a working simplified example in Japanese.
Warning: You will need to have a number of CMap files available on your system. These are files provided by Adobe which contain information on the encodings of all the glyphs in the font. RML2PDF looks for these in locations defined in the CMapSearchPath variable in the file reportlab/rl_config.py, which knows where to find Acrobat Reader on most Windows and Unix systems. If you wish to use Asian fonts on another system, you can copy these files (which may be redistributed freely) from a machine with Acrobat Reader on to your server.
Editor's note at 28/12/2002 - there is a great deal of information on fonts which needs adding to this manual including embedded Type 1 fonts and encodings and use of embedded subsetted TrueType fonts
Hyphenation (requires the Pyphen package to be installed): Pyphen is a pure Python module that hyphenates text using included or external Hunspell hyphenation dictionaries.
Usage: set the hyphenationLang attribute in the paraStyle and the content will be split according to the language used.
You can also exclude any word or part of a sentence from hyphenation by wrapping the content in a nobr tag.
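The usage described above can be sketched as follows. The style name hyphenated and the sample sentence are invented for illustration, and the example assumes Pyphen is installed with a dictionary for en_GB.

```xml
<stylesheet>
<paraStyle name="hyphenated"
    fontName="Helvetica"
    fontSize="10"
    hyphenationLang="en_GB"
/>
</stylesheet>
<story>
<para style="hyphenated">
Long words such as internationalisation may be split at the end of a line,
but <nobr>ReportLab RML2PDF</nobr> is kept together.
</para>
</story>
```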
If the pyphen Python module is installed, the hyphenationLang attribute controls which language will be used to hyphenate words without explicit embedded hyphens.
If embeddedHyphenation is set, attempts will be made to split words with embedded hyphens. The uriWasteReduce attribute controls how we attempt to split long URIs. It is the fraction of a line that we regard as too much waste. The default in module reportlab.rl_settings is 0.5, which means that we will try to split a word that looks like a URI if we would waste at least half of the line.
Currently hyphenation and URI splitting are turned off by default. You need to modify the default settings by using the file `~/.rl_settings` or adding a module reportlab_settings.py to the Python path.
Suitable values are
hyphenationLang='en_GB'
embeddedHyphenation=1
uriWasteReduce=0.3
More examples: test_001_hello.rml and test_001_hello.pdf