
Paragraphs

The reportlab.platypus.Paragraph class is one of the most useful of the Platypus Flowables; it can format fairly arbitrary text and provides for inline font style and colour changes using an XML style markup. The overall shape of the formatted text can be justified, right or left ragged or centered. The XML markup can even be used to insert greek characters or to do subscripts.

The following text creates an instance of the Paragraph class:

Paragraph(text, style, bulletText=None)

The text argument contains the text of the paragraph; excess white space is removed from the text at the ends and internally after linefeeds.

This allows easy use of indented triple quoted text in Python scripts. The bulletText argument provides the text of a default bullet for the paragraph. The font and other properties for the paragraph text and bullet are set using the style argument.

The style argument should be an instance of class ParagraphStyle obtained typically using

from reportlab.lib.styles import ParagraphStyle

This container class provides for the setting of multiple default paragraph attributes in a structured way. The styles are arranged in a dictionary style object called a stylesheet which allows for the styles to be accessed as stylesheet['BodyText']. A sample style sheet is provided.

from reportlab.lib.styles import getSampleStyleSheet
stylesheet=getSampleStyleSheet()
normalStyle = stylesheet['Normal']

The options which can be set for a Paragraph can be seen from the ParagraphStyle defaults. The values with leading underscore ('_') are derived from the defaults in module reportlab.rl_config which are derived from module reportlab.rl_settings.

class ParagraphStyle

class ParagraphStyle(PropertySet):
    defaults = {
        'fontName':_baseFontName,
        'fontSize':10,
        'leading':12,
        'leftIndent':0,
        'rightIndent':0,
        'firstLineIndent':0,
        'alignment':TA_LEFT,
        'spaceBefore':0,
        'spaceAfter':0,
        'bulletFontName':_baseFontName,
        'bulletFontSize':10,
        'bulletIndent':0,
        'textColor': black,
        'backColor':None,
        'wordWrap':None,
        'borderWidth': 0,
        'borderPadding': 0,
        'borderColor': None,
        'borderRadius': None,
        'allowWidows': 1,
        'allowOrphans': 0,
        'textTransform':None,
        'endDots':None,
        'splitLongWords':1,
        'underlineWidth': _baseUnderlineWidth,
        'bulletAnchor': 'start',
        'justifyLastLine': 0,
        'justifyBreaks': 0,
        'spaceShrinkage': _spaceShrinkage,
        'strikeWidth': _baseStrikeWidth,    #stroke width
        'underlineOffset': _baseUnderlineOffset,    #fraction of fontsize to offset underlines
        'underlineGap': _baseUnderlineGap,      #gap for double/triple underline
        'strikeOffset': _baseStrikeOffset,  #fraction of fontsize to offset strikethrough
        'strikeGap': _baseStrikeGap,        #gap for double/triple strike
        'linkUnderline': _platypus_link_underline,
        #'underlineColor':  None,
        #'strikeColor': None,
        'hyphenationLang': _hyphenationLang,
        'uriWasteReduce': _uriWasteReduce,
        'embeddedHyphenation': _embeddedHyphenation,
        }

Using Paragraph Styles

The sample text below is used in the examples that follow.

sample = """You are hereby charged that on the 28th day of May, 1970, you did
willfully, unlawfully, and with malice of forethought, publish an
alleged English-Hungarian phrase book with intent to cause a breach
of the peace.  How do you plead?"""

The Paragraph and ParagraphStyle classes together handle most common formatting needs. The following examples draw paragraphs in various styles, and add a bounding box so that you can see exactly what space is taken up.

Image

The two attributes spaceBefore and spaceAfter do what they say, except at the top or bottom of a frame. At the top of a frame, spaceBefore is ignored, and at the bottom, spaceAfter is ignored. This means that you could specify that a 'Heading2' style had two inches of space before when it occurs in mid-page, but will not get acres of whitespace at the top of a page. These two attributes should be thought of as 'requests' to the Frame and are not part of the space occupied by the Paragraph itself.

The fontSize and fontName attributes are self-explanatory, but it is important to set the leading as well. This is the spacing between adjacent lines of text; a good rule of thumb is to make this 20% larger than the point size. To get double-spaced text, use a leading of about twice that value.

If you set autoLeading (default "off") to "min" (use the observed leading even if smaller than specified) or "max" (use the larger of observed and specified), then an attempt is made to determine the leading on a line-by-line basis. This may be useful if the lines contain different font sizes.

The figure below shows space before and after and an increased leading:

Image

The attribute borderPadding adjusts the padding between the paragraph and the border of its background. This can either be a single value or a tuple containing 2 to 4 values. These values are applied the same way as in Cascading Style Sheets (CSS). If a single value is given, that value is applied to all four sides. If more than one value is given, they are applied in clockwise order to the sides starting at the top. If two or three values are given, the missing values are taken from the opposite side(s). Note that in the following example the yellow box is drawn by the paragraph itself.

Image
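The CSS-style expansion rule for borderPadding can be written out as a small helper. This is only an illustration of the rule as described above, not ReportLab's own code; the function name expand_padding is hypothetical.

```python
def expand_padding(padding):
    """Expand a CSS-style padding value into (top, right, bottom, left)."""
    if isinstance(padding, (int, float)):
        return (padding, padding, padding, padding)

    values = tuple(padding)
    if len(values) == 1:
        top = right = bottom = left = values[0]
    elif len(values) == 2:
        # top/bottom share the first value, right/left share the second
        top = bottom = values[0]
        right = left = values[1]
    elif len(values) == 3:
        # the missing left value is taken from the opposite side (right)
        top, right, bottom = values
        left = right
    elif len(values) == 4:
        top, right, bottom, left = values
    else:
        raise ValueError("padding needs 1 to 4 values")
    return (top, right, bottom, left)


print(expand_padding(6))          # (6, 6, 6, 6)
print(expand_padding((2, 4)))     # (2, 4, 2, 4)
print(expand_padding((1, 2, 3)))  # (1, 2, 3, 2)
```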

The leftIndent and rightIndent attributes do exactly what you would expect; firstLineIndent is added to the leftIndent of the first line. If you want a straight left edge, remember to set firstLineIndent equal to 0.

Image

Setting firstLineIndent equal to a negative number, leftIndent much higher, and using a different font (we'll show you how later!) can give you a definition list:

Image

There are four possible values of alignment, defined as constants in the module reportlab.lib.enums. These are TA_LEFT, TA_CENTER or TA_CENTRE, TA_RIGHT and TA_JUSTIFY, with values of 0, 1, 2 and 4 respectively. These do exactly what you would expect.

Set wordWrap to 'CJK' to get Asian language line wrapping. For normal western text you can change the way the line breaking algorithm handles widows and orphans with the allowWidows and allowOrphans values. Both should normally be set to 0, but for historical reasons we have allowed widows. The default colour of the text can be set with textColor and the paragraph background colour can be set with backColor. The paragraph's border properties may be changed using borderWidth, borderPadding, borderColor and borderRadius.

The textTransform attribute can be None, uppercase or lowercase to get the obvious result and capitalize to get initial letter capitalization.

Attribute endDots can be None, a string, or an object with attributes text and optional fontName, fontSize, textColor, backColor and dy (y offset) to specify trailing matter on the last line of left/right justified paragraphs.

The splitLongWords attribute can be set to a false value to avoid splitting very long words.

Attribute bulletAnchor can be start, middle, end or numeric to control where the bullet is anchored.

The justifyBreaks attribute controls whether lines deliberately broken with a <br/> tag should be justified.

Attribute spaceShrinkage is a fractional number specifying by how much the space of a paragraph line may be shrunk in order to make it fit; typically it is something like 0.05.

The underlineWidth, underlineOffset, underlineGap and underlineColor attributes control the underline behaviour when the <u> tag or a linking tag is used. Those tags can have override values of these attributes. The attribute value for width and offset is a fraction * Letter, where Letter can be one of P, L, f or F representing fontSize proportions. P uses the font size at the tag, F is the maximum font size in the tag, and f is the initial font size inside the tag. L means the global (paragraph style) font size. The strikeWidth, strikeOffset, strikeGap and strikeColor attributes do the same for strikethrough lines.

Attribute linkUnderline controls whether link tags are automatically underlined.

If the pyphen Python module is installed, attribute hyphenationLang controls which language will be used to hyphenate words without explicit embedded hyphens. If embeddedHyphenation is set, then attempts will be made to split words with embedded hyphens.

Attribute uriWasteReduce controls how we attempt to split long URIs. It is the fraction of a line that we regard as too much waste. The default in module reportlab.rl_settings is 0.5, which means that we will try to split a word that looks like a URI if we would waste at least half of the line. Currently the hyphenation and URI splitting are turned off by default. You need to modify the default settings by using the file ~/.reportlab_settings or adding a module reportlab_settings.py to the Python path. Suitable values are

    hyphenationLanguage='en_GB'
    embeddedHyphenation=1
    uriWasteReduce=0.3

Paragraph XML Markup Tags

XML markup can be used to modify or specify the overall paragraph style, and also to specify intra-paragraph markup.

The outermost <para> tag

The paragraph text may optionally be surrounded by <para attributes....> </para> tags. The attributes, if any, of the opening <para> tag affect the style that is used with the Paragraph text and/or bulletText.

from reportlab.platypus import SimpleDocTemplate, Table, TableStyle
from reportlab.platypus.paraparser import _addAttributeNames, _paraAttrMap, _bulletAttrMap
from reportlab.lib.colors import black


def getAttrs(A):
    _addAttributeNames(A)
    S={}
    for k, v in A.items():
        a = v[0]
        if a not in S:
            S[a] = [k]
        else:
            S[a].append(k)

    K = sorted(S)  # attribute names in alphabetical order
    D=[('Attribute','Synonyms')]
    for k in K:
        D.append((k,", ".join(sorted(S[k]))))
    cols=2*[None]
    rows=len(D)*[None]
    return D,cols,rows

story = []

t=Table(*getAttrs(_paraAttrMap))
t.setStyle(TableStyle([
            ('FONT',(0,0),(-1,1),'Times-Bold',10,12),
            ('FONT',(0,1),(-1,-1),'Courier',8,8),
            ('VALIGN',(0,0),(-1,-1),'MIDDLE'),
            ('INNERGRID', (0,0), (-1,-1), 0.25, black),
            ('BOX', (0,0), (-1,-1), 0.25, black),
            ]))
story.append(t)
Table: Synonyms for style attributes

Some useful synonyms have been provided for our Python attribute names, including lowercase versions, and the equivalent properties from the HTML standard where they exist. These additions make it much easier to build XML-printing applications, since much intra-paragraph markup may not need translating. The table produced by the code above shows the allowed attributes and synonyms in the outermost paragraph tag.

Intra-paragraph markup

Within each paragraph, we use a basic set of XML tags to provide markup. The most basic of these are bold (<b>...</b>), italic (<i>...</i>) and underline (<u>...</u>).

Other tags which are allowed are strong (<strong>...</strong>), and strike through (<strike>...</strike>). The <link> and <a> tags may be used to refer to URIs, documents or bookmarks in the current document. The name variant of the <a> tag (<a name="..."/>) can be used to mark a position in a document. A break (<br/>) tag is also allowed.

<b>You are hereby charged</b> that on the 28th day of May, 1970, you did
willfully, unlawfully, and <i>with malice of forethought</i>, publish an
alleged English-Hungarian phrase book with intent to cause a breach
of the peace.  <u>How do you plead</u>?

Image: Simple bold and italic tags

This <a href="#MYANCHOR" color="blue">is a link to</a> an anchor tag ie <a name="MYANCHOR"/><font color="green">here</font>. This <link href="#MYANCHOR" color="blue" fontName="Helvetica">is another link to</link> the same anchor tag.

Image: Anchors and links

The <link> tag can be used as a reference, but not as an anchor. The <a> and <link> hyperlink tags have additional attributes fontName, fontSize, color and backColor. The hyperlink reference can have a scheme of http: (external webpage), pdf: (different PDF document) or document: (same PDF document); a missing scheme is treated as document, as is the case when the reference starts with # (in which case the anchor should omit it). Any other scheme is treated as some kind of URI.

<strong>You are hereby charged</strong> that on the 28th day of May, 1970, you did
willfully, unlawfully, <strike>and with malice of forethought</strike>, <br/>publish an
alleged English-Hungarian phrase book with intent to cause a breach
of the peace. How do you plead?

Image: Strong, strike, and break tags

The <font> tag

The <font> tag can be used to change the font name, size and text color for any substring within the paragraph. Legal attributes are size, face, name (which is the same as face), color, and fg (which is the same as color). The name is the font family name, without any 'bold' or 'italic' suffixes. Colors may be HTML color names or a hex string encoded in a variety of ways; see reportlab.lib.colors for the formats allowed.

<font face="times" color="red">You are hereby charged that on the 28th day of May, 1970, you did willfully, unlawfully, and <font size="14">with malice of forethought</font>, publish an alleged English-Hungarian phrase book with intent to cause a breach of the peace. How do you plead?</font>

Image

Superscripts and Subscripts

Superscripts and subscripts are supported with the <super>/<sup> and <sub> tags, which work exactly as you might expect. Additionally these three tags have attributes rise and size to optionally set the rise/descent and font size for the superscript/subscript text. In addition, most greek letters can be accessed by using the <greek></greek> tag, or with mathML entity names.

<greek>epsilon</greek><super><greek>iota</greek>
<greek>pi</greek></super> = -1

Equation (&alpha;): <greek>e</greek> <super rise=9 size=6><greek>ip</greek></super> = -1
Image

Inline Images

We can embed images in a paragraph with the <img/> tag, which has attributes src, width and height whose meanings are obvious. The valign attribute may be set to a CSS-like value from baseline, sub, super, top, text-top, middle, bottom, text-bottom; the value may also be a numeric percentage or an absolute value.

This <img/> <img src="../images/testimg.gif" valign="top"/> is aligned top.
This <img/> <img src="../images/testimg.gif" valign="bottom"/> is aligned bottom.
This <img/> <img src="../images/testimg.gif" valign="middle"/> is aligned middle.
This <img/> <img src="../images/testimg.gif" valign="-4"/> is aligned -4.
This <img/> <img src="../images/testimg.gif" valign="+4"/> is aligned +4.
This <img/> <img src="../images/testimg.gif" width="10"/> has width 10.
Image

The src attribute can refer to a remote location, e.g. src="https://www.reportlab.com/images/logo.gif". By default we set rl_config.trustedSchemes to ['https','http', 'file', 'data', 'ftp'] and rl_config.trustedHosts=None, the latter meaning no restriction. You can modify these variables using one of the override files, e.g. reportlab_settings.py or ~/.reportlab_settings, or as comma-separated strings in the environment variables RL_trustedSchemes and RL_trustedHosts. Note that the trustedHosts values may contain glob wildcards, so *.reportlab.com will match the obvious domains.

NB: use of trustedHosts and/or trustedSchemes may not control behaviour and actions when URI patterns are detected by the viewer application.

The <u> & <strike> tags

These tags can be used to carry out explicit underlining or strikethroughs. These tags have attributes width, offset, color, gap and kind. The kind attribute controls how many lines will be drawn (default kind=1), and when kind>1 the gap attribute controls the distance between lines.

The <nobr> tag

If hyphenation is in operation, the <nobr> tag suppresses it, so <nobr>averylongwordthatwontbebroken</nobr> won't be broken.

Numbering Paragraphs and Lists

The <seq> tag provides comprehensive support for numbering lists, chapter headings and so on. It acts as an interface to the Sequencer class in reportlab.lib.sequencer. These are used to number headings and figures throughout this document. You may create as many separate counters as you wish, accessed with the id attribute; these will be incremented by one each time they are accessed. The <seqreset> tag resets a counter. If you want it to resume from a number other than 1, use the syntax <seqreset id="mycounter" base="42"/>. Let's have a go:

<seq id="spam"/>, <seq id="spam"/>, <seq id="spam"/>.

Reset <seqreset id="spam"/>. <seq id="spam"/>, <seq id="spam"/>, <seq id="spam"/>.

Image

You can save specifying an ID by designating a counter ID as the default using the <seqdefault> tag; it will then be used whenever a counter ID is not specified. This saves some typing, especially when doing multi-level lists; you just change counter ID when stepping in or out a level.

<seqdefault id="spam"/>Continued... <seq/>, <seq/>, <seq/>, <seq/>, <seq/>, <seq/>, <seq/>.

Image

Finally, one can access multi-level sequences using a variation of Python string formatting and the template attribute in a <seq> tag. This is used to do the captions in all of the figures, as well as the level two headings. The substring %(counter)s extracts the current value of a counter without incrementing it; appending a plus sign as in %(counter+)s increments the counter. The figure captions use a pattern like the one below:

Figure <seq template="%(Chapter)s-%(FigureNo+)s"/> - Multi-level templates

Image

We cheated a little - the real document used Figure, but the text above uses FigureNo - otherwise we would have messed up our numbering!

Bullets and Paragraph Numbering

In addition to the three indent properties, some other parameters are needed to correctly handle bulleted and numbered lists. We discuss this here because you have now seen how to handle numbering.

A paragraph may have an optional bulletText argument passed to its constructor; alternatively, bullet text may be placed in a <bullet>..</bullet> tag at its head.

This text will be drawn on the first line of the paragraph, with its x origin determined by the bulletIndent attribute of the style, and in the font given in the bulletFontName attribute.

The bullet may be a single character such as (doh!) a bullet, or a fragment of text such as a number in some numbering sequence, or even a short title as used in a definition list.

Fonts may offer various bullet characters but we suggest first trying the Unicode bullet (•), which may be written as &bull;, &#x2022; or (in UTF-8) \xe2\x80\xa2.

# Reuses getAttrs, TableStyle, black and story from the <para> attribute example above.
t=Table(*getAttrs(_bulletAttrMap))
t.setStyle(TableStyle([
            ('FONT',(0,0),(-1,1),'Times-Bold',10,12),
            ('FONT',(0,1),(-1,-1),'Courier',8,8),
            ('VALIGN',(0,0),(-1,-1),'MIDDLE'),
            ('INNERGRID', (0,0), (-1,-1), 0.25, black),
            ('BOX', (0,0), (-1,-1), 0.25, black),
            ]))
story.append(t)
Table: <bullet> attributes and synonyms

The <bullet> tag is only allowed once in a given paragraph and its use overrides the implied bullet style and bulletText specified in the Paragraph creation.

<bullet>&bull;</bullet>this is a bullet point.  Spam
spam spam spam spam spam spam spam spam spam spam spam
spam spam spam spam spam spam spam spam spam spam
Image

Exactly the same technique is used for numbers, except that a sequence tag is used. It is also possible to put a multi-character string in the bullet; with a deep indent and bold bullet font, you can make a compact definition list.