PDF Accessibility
ReportLab PLUS 4 adds automatic PDF tagging that follows industry accessibility standards, making your documents more accessible and easier to use for everyone.
Introduction
In recent years, the accessibility of PDFs for disabled users has become a big topic. Governments in particular may mandate compliance with accessibility standards. The key point is that screen readers, which read electronic documents aloud, should be able to make sense of a PDF document.
The relevant accessibility guidelines are the W3C PDF Techniques for WCAG 2.0. In practice, there are many tools to check a file for compliance, such as this PDF checker, used by the European Union.
A number of governments and organisations state that they "require" perfect compliance (it basically costs nothing to decree that), but to our knowledge there are no report generators that achieve it, and certainly not automatically. We believe ReportLab does very well on accessibility, and many years ago we dealt with the main issues that accessibility checkers report (such as adding metadata to describe the language the output is in).
ReportLab PLUS is a fairly low-level way of constructing PDF files, so achieving good accessibility is up to the developer using it. Here are some guidelines on testing and improving accessibility.
A perfect accessibility score, while often mandated in software requirements, is not always possible or economical, especially for a complex graphical representation of data that is hard to explain in spoken words. In practice, we've often found that disabled readers are much happier with an alternative form of the same information, for example a web page. If that alternative is always available, it might allow you to agree an exemption for the PDF file.
How to Test
There are many testing tools and your organisation may have its own preferences.
The most popular desktop tool is probably Acrobat Pro.
There are many web services. The UK Government has some accessibility guidance that refers to Tingtun Checker for checking PDF files. This is a freely available site with no subscription needed, so it's our first port of call. Users can check PDFs via a URL or the upload facility.
We've also used PAC3 (PDF Accessibility Checker) to help us test and develop automated PDF tagging in ReportLab PLUS. PAC was released in 2010 and was the first automated PDF/UA compliance validation tool. It is designed to identify all of the machine-verifiable success criteria of ISO 14289-1 (PDF/UA) and WCAG (Web Content Accessibility Guidelines).
The site currently tests for the following:
- Structure Elements (tags)
- Document Permissions
- Scanned Document
- Alternative Text for Images
- Bookmarks
- Correct Tab and Reading Order
- Decorative Images
- Table Elements
- Heading Levels
- Form Fields
- Running Headers and Footers
- Submit Buttons
- Natural Language
- Page Numbering
- Document Title
- Link Text for External Links
Improving Accessibility in your ReportLab and Rlextra output
There are optional properties we let you specify, such as the language being used and alt descriptions for images within the PDF, which help your PDF to score well. A large part of the accessibility score depends on the scripts you use to generate your PDFs and the content you put in. For further detail, tips and background reading on accessibility, please see W3C PDF Techniques for WCAG 2.0 and Checker PDF test.
1. Setting a Document Title
In ReportLab's Open Source framework (henceforth ReportLab), it depends on which parts of the toolkit you are using. With a canvas, use the setTitle method:
canvas.setTitle("My PDF Title")
Or, if you are using SimpleDocTemplate:
SimpleDocTemplate('my-file.pdf', title="My Title")
In our commercial Report Markup Language (henceforth RML), set the title attribute in the template tag:
<template title="My Title" etc >
2. Setting a Default Language
In RML, set the lang attribute in the template tag:
<template lang="en-GB" etc >
In ReportLab, pass the lang argument:
SimpleDocTemplate('my-file.pdf', lang='en-gb')
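As a rough sketch of sections 1 and 2 combined in open-source ReportLab, the following builds a one-paragraph PDF in memory with both a title and a default language set. The helper name build_titled_pdf and the sample text are made up for illustration; title and lang are the SimpleDocTemplate arguments described above.

```python
import io


def build_titled_pdf(title, lang, body_text):
    """Return PDF bytes for a one-paragraph document with title and language set.

    build_titled_pdf is a hypothetical helper name, not part of ReportLab.
    """
    # Imported here so that this module still loads where ReportLab is absent.
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate

    buffer = io.BytesIO()  # SimpleDocTemplate accepts a file-like object
    doc = SimpleDocTemplate(buffer, title=title, lang=lang)
    styles = getSampleStyleSheet()
    doc.build([Paragraph(body_text, styles["Normal"])])
    return buffer.getvalue()
```

Calling build_titled_pdf("My Title", "en-GB", "Hello") and writing the returned bytes to a file gives a document you can run through one of the checkers listed under How to Test.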
3. Setting Bookmarks
Our system does not automatically generate bookmarks according to heading levels. There is a fairly common convention of marking headings with <para style="h1">, but nothing forces you to do that, and you might not be creating a long flowing document anyway. So it's up to you to create bookmarks for the main sections, which will appear in the left sidebar.
In ReportLab, use the canvas bookmarkPage method (see section 4.2 in the ReportLab userguide):
canvas.bookmarkPage("My bookmarks")
In RML, see the bookmark and bookmarkPage tags.
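A minimal sketch of creating sidebar bookmarks with the open-source canvas follows. It assumes one page per section; the helper name build_bookmarked_pdf and the section-N key scheme are illustrative only. Note that bookmarkPage only registers a named destination, and the visible sidebar entry comes from the separate canvas method addOutlineEntry.

```python
import io


def build_bookmarked_pdf(section_titles):
    """Return PDF bytes with one page and one sidebar bookmark per section.

    build_bookmarked_pdf is a hypothetical helper name, not part of ReportLab.
    """
    # Imported here so that this module still loads where ReportLab is absent.
    from reportlab.pdfgen import canvas

    buffer = io.BytesIO()
    c = canvas.Canvas(buffer)
    for index, heading in enumerate(section_titles):
        key = f"section-{index}"  # internal destination name (our own scheme)
        c.setFont("Helvetica-Bold", 18)
        c.drawString(72, 770, heading)
        c.bookmarkPage(key)  # register the current page as a destination
        c.addOutlineEntry(heading, key, level=0)  # entry shown in the sidebar
        c.showPage()
    c.save()
    return buffer.getvalue()
```

For example, build_bookmarked_pdf(["Introduction", "Results"]) should give a two-page file with two top-level outline entries.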
4. Providing labels for interactive form controls in PDF documents
Interactive forms are rarely used in PDF; since PDF was launched, the world has moved to people being online almost everywhere, and in the vast majority of cases, any organisation capturing information has a web form. However they may still be used in specialised situations, and we support their creation.
In RML, use the tooltip tags with the associated form fields, e.g. textField, checkboxField, radioField, choiceField, listboxField.
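The text above only covers RML. As a hedged sketch of the equivalent in open-source ReportLab, the canvas acroForm methods textfield and checkbox accept a tooltip argument, which can carry the label for the control. The field names, coordinates and the helper name build_form_pdf are made up for illustration.

```python
import io


def build_form_pdf():
    """Return PDF bytes containing two labelled interactive form fields.

    build_form_pdf is a hypothetical helper name, not part of ReportLab.
    """
    # Imported here so that this module still loads where ReportLab is absent.
    from reportlab.pdfgen import canvas

    buffer = io.BytesIO()
    c = canvas.Canvas(buffer)
    form = c.acroForm

    # Visible caption plus a tooltip describing the same field.
    c.drawString(72, 700, "First name:")
    form.textfield(name="first_name", tooltip="First name",
                   x=160, y=692, width=200, height=20)

    c.drawString(72, 660, "Subscribe:")
    form.checkbox(name="subscribe", tooltip="Subscribe to the newsletter",
                  x=160, y=655, size=16)

    c.save()
    return buffer.getvalue()
```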
5. Specifying consistent page numbering for PDF documents
This is up to you, the author, but it's very easy to number pages.
In RML, see Page Numbering Sample and associated source
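In open-source ReportLab, one common way to number pages is a page callback passed to SimpleDocTemplate.build. The sketch below assumes a simple "Page N" footer; the names page_label, draw_page_number and build_numbered_pdf, and the footer position, are illustrative choices.

```python
import io


def page_label(number):
    """Return the text printed in the footer for a given page number."""
    return f"Page {number}"


def draw_page_number(canv, doc):
    """Page callback: draw a centred page number near the bottom edge."""
    canv.saveState()
    canv.setFont("Helvetica", 9)
    canv.drawCentredString(doc.pagesize[0] / 2, 30,
                           page_label(canv.getPageNumber()))
    canv.restoreState()


def build_numbered_pdf(paragraph_count=60):
    """Return PDF bytes for a flowing document with a number on every page."""
    # Imported here so that this module still loads where ReportLab is absent.
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate

    buffer = io.BytesIO()
    doc = SimpleDocTemplate(buffer, title="Numbered pages")
    style = getSampleStyleSheet()["Normal"]
    story = [Paragraph(f"Paragraph {i + 1} of filler text.", style)
             for i in range(paragraph_count)]
    # The same callback runs on the first page and on every later page.
    doc.build(story, onFirstPage=draw_page_number,
              onLaterPages=draw_page_number)
    return buffer.getvalue()
```

Keeping the label text in its own function makes it easy to keep the printed numbers consistent across a document.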
Tagged PDF
Tagged PDFs contain structured information that allows assistive technologies such as screen readers to navigate and interpret the content of the PDF document more effectively. This helps to make the content accessible to people with disabilities such as visual impairments.
Tagging can be done automatically.
Properly tagging a PDF involves adding tags to the document's content that indicate the logical reading order of the content, the structure of the document (such as headings, paragraphs, and lists), and the alternate text for images and other non-text elements. This information is then used by assistive technologies to present the content in a way that is more understandable and navigable for people with disabilities.
In addition, tagged PDFs can also be designed to reflow their content to adapt to different screen sizes and orientations. This is particularly important for mobile devices, which have smaller screens and may be held in different orientations depending on the user's preferences.
Overall, proper tagging of PDFs is essential for ensuring that all users, regardless of their abilities or devices, can access and understand the content of the document.
Availability
Tagging is only available in the upcoming commercial Reportlab Plus version 4 and above.
We've used PAC3 to help us test/develop automated PDF Tagging in Reportlab Plus. Download PAC3
Summary report
Here is a summary report for a PDF generated with ReportLab PLUS, with input RML file test_057_taggedparas.rml and PDF output test_057_taggedparas.pdf (see samples: test_057_taggedparas.rml).
Check Points
Structure detail
How to tag
Add tagged="1" as an attribute to the document tag to automatically tag paragraphs:
<document filename="test_057_taggedparas.pdf" tagged="1">
Setting tagged="1" should be good enough. Optionally, you can set an explicit tagType, as in these inline tagging examples:
Tag H1
<para style="h1" tagType="H1">This is in h1 style</para>
Tag P
<para style="normal" tagType="P">This is an ordinary paragraph</para>
Tag Artifact
<para style="normal" tagType="Artifact">This is an artifact</para>
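If your RML is generated by a script, explicit tagType attributes can be added in a post-processing step. The sketch below uses only the Python standard library and assumes a mapping from paragraph style names to tag types. The mapping and the function name add_tag_types are hypothetical; only the H1, P and Artifact values come from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from RML paragraph style names to tagType values.
STYLE_TO_TAG = {"h1": "H1", "normal": "P"}


def add_tag_types(rml_fragment, style_to_tag=None):
    """Return the RML fragment with tagType set on each para that lacks one."""
    mapping = STYLE_TO_TAG if style_to_tag is None else style_to_tag
    root = ET.fromstring(rml_fragment)
    for para in root.iter("para"):
        tag = mapping.get(para.get("style"))
        # Leave explicit tags (for example Artifact) exactly as the author set them.
        if tag is not None and para.get("tagType") is None:
            para.set("tagType", tag)
    return ET.tostring(root, encoding="unicode")


fragment = (
    '<story>'
    '<para style="h1">This is in h1 style</para>'
    '<para style="normal">This is an ordinary paragraph</para>'
    '<para style="normal" tagType="Artifact">This is an artifact</para>'
    '</story>'
)
tagged_fragment = add_tag_types(fragment)
```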
Tag Table
<blockTable style="basic" align="LEFT" tagType="Table" altText="Region versus Profit Table">
<tr><td/><td tagType="th scope=column" altText="South">South</td><td tagType="th scope=column" altText="East">East</td><td tagType="th scope=column" altText="West">West</td></tr>
<tr><td tagType="th scope=row" altText="Income">Income</td><td altText="100">100</td><td altText="120">120</td><td altText="140">140</td></tr>
<tr><td tagType="th scope=row" altText="Expenditure">Expenditure</td><td altText="90">90</td><td altText="150">150</td><td altText="115">115</td></tr>
<tr><td tagType="th scope=row" altText="Profit">Profit</td><td altText="10">10</td><td altText="-30">-30</td><td altText="25">25</td></tr>
<tr tagType="artifact"><td>unwanted</td><td>0</td><td>1</td><td>2</td></tr>
</blockTable>
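Writing the header and altText attributes by hand for every cell is error-prone, so a table like the one above could be generated from data instead. This is a minimal standard-library sketch. The function name tagged_table is hypothetical, the style and align attributes are omitted for brevity, and the artifact row is left out.

```python
import xml.etree.ElementTree as ET


def tagged_table(columns, rows, alt_text):
    """Return blockTable RML with column/row header tags and per-cell altText.

    columns: list of column header strings
    rows: list of (row_label, values) pairs
    """
    table = ET.Element("blockTable", {"tagType": "Table", "altText": alt_text})

    header = ET.SubElement(table, "tr")
    ET.SubElement(header, "td")  # empty top-left corner cell
    for name in columns:
        cell = ET.SubElement(
            header, "td", {"tagType": "th scope=column", "altText": name})
        cell.text = name

    for label, values in rows:
        tr = ET.SubElement(table, "tr")
        head = ET.SubElement(
            tr, "td", {"tagType": "th scope=row", "altText": label})
        head.text = label
        for value in values:
            cell = ET.SubElement(tr, "td", {"altText": str(value)})
            cell.text = str(value)

    return ET.tostring(table, encoding="unicode")


table_rml = tagged_table(
    ["South", "East", "West"],
    [("Income", [100, 120, 140]),
     ("Expenditure", [90, 150, 115]),
     ("Profit", [10, -30, 25])],
    "Region versus Profit Table",
)
```

Because each cell's text and altText come from the same value, the two cannot drift apart.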