PDF Files - Getting The Page Numbers Right
Major Keary

An irritating feature of PDF files is that when a printed text - such as a book - is converted to PDF, the pagination usually gets out of kilter. By default PDF files number pages from `0', even when the original document has `front matter', some of it without page numbers and some numbered with small roman numerals. Navigating to a particular page, especially by using the page number from a document's index, requires some mental arithmetic to determine the offset. For example, there may be ten pages of front matter before reaching page 1 of the main text. If an index entry refers to page 241, and one selects `241' in Acrobat Reader's go-to-page command, the result is page 231 of the original document; the page wanted is the 251st in the file.
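The offset arithmetic amounts to a single addition, sketched here in Python. It is only an illustration: the function name is invented for the purpose, and it assumes every page of front matter occupies exactly one page in the file and that the viewer counts pages from 1.

```python
def file_position(book_page, front_matter_pages):
    """Return where a numbered book page sits in the PDF file, counting from 1.

    Assumes (for illustration) that each front matter page takes up
    exactly one page in the file, ahead of the book's own page 1.
    """
    return book_page + front_matter_pages


# The example from the text: ten pages of front matter, index entry for page 241.
print(file_position(241, 10))
```

In other words, to reach the book's page 241 one has to ask the Reader for page 251.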

It is important to distinguish between the index created by Acrobat when it writes a PDF file, and the index that may be part of the original material (such as a book). Pages in a PDF file are numbered from 0 (zero), but the Reader adjusts the count to begin at 1 (one).

In the course of reviewing a book with a companion CD that contains the full text in a PDF file I was surprised to see the front matter had been given different forms of pagination and, when Acrobat Reader was asked to go to page 241, it went to the page numbered 241 in the book. In the pagination display it showed "241 (261 of 978)".

How did they do that? Curiosity aroused, I went in search of how one implements such a useful option. One book on Acrobat 4 didn't say anything; another mentioned the feature, but offered no further information. The Adobe PDF Reference did have information, under the heading 'page labels'.

To cut a long story short, since Acrobat 4.0 (PDF 1.3) there has been a 'page label' facility that enables a user to specify page numbers to suit a particular document. Parallel with the PDF internal index (not to be confused with the document's own index) the user can specify labels for front matter and back matter.

However, to 'see' the custom pagination in Acrobat Reader it is necessary to check the 'use logical page numbers' box in 'preferences'. That seems to be the default when the Reader is installed, but people who distribute PDF documents should advise end-users to install a current copy of the Reader and ensure that the 'use logical page numbers' box in 'preferences' is checked.

An example of the PDF code for the logical page numbers feature is to be found at page 482 of the PDF Reference (a copy is in the library). There is little point in reproducing it here; apart from its arcane nature, a high degree of PDF programming expertise would be essential for implementation. As a matter of interest, I looked inside the PDF file that contains the complete text of the Adobe PDF Reference, and the pagination code bears little resemblance to their own example. There is a much easier way than trying to manhandle a PDF file.

Easy, but expensive: one needs the full Acrobat package. The Distiller module that comes with a number of applications does not enable custom pagination - at least, not the versions I have seen. What used to be called Acrobat Exchange, and (since version 4) is now called just Acrobat, is the key to fine-tuning PDF files.

The procedure in Acrobat is to select the Document tab, and - from the drop-down menu - select 'number pages' to activate a dialog box. The Acrobat help pages contain information about using the feature. With a little experimentation it is not difficult to get the hang of the system.

Suppose you are converting a book with covers, and you want to include the front cover and front matter. Inside there may be separate pages for: half-title, title, copyright page, dedication, preface, foreword, table of contents, and introduction. Using the 'number pages' feature you might label the cover and any subsequent unnumbered pages fc 1, fc 2, fc 3, and so on. Pages with lower case roman numerals could be labelled i, ii, iii, and so on. The first page of the main text (the book's own page 1) then begins the sequence 1, 2, 3, and so on. It is not necessary to label each page separately; labelling ranges are specified using the PDF index page numbers.
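How labelling ranges turn into page labels can be modelled in a short Python sketch. It does not read or write a PDF file and is not how Acrobat itself does the job; it only mimics the idea that each range starts at a given PDF index page number (counting from 0), has a numbering style and an optional prefix, and runs until the next range begins. The function names, the page counts and the assumption that numbering restarts at 1 in every range are all invented for illustration.

```python
def to_roman(n):
    """Convert a positive integer to lower case roman numerals."""
    pairs = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
             (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
             (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, numeral in pairs:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


def page_labels(total_pages, ranges):
    """Return one label per page of the file, in file order.

    ranges is a list of (first_index, style, prefix) tuples:
      first_index - PDF index number (from 0) of the first page in the range
      style       - "decimal" or "roman"
      prefix      - text placed before the number, possibly empty
    The first range is assumed to start at index 0.
    """
    ranges = sorted(ranges)
    labels = []
    for position, (first, style, prefix) in enumerate(ranges):
        # A range ends where the next one begins, or at the end of the file.
        if position + 1 < len(ranges):
            end = ranges[position + 1][0]
        else:
            end = total_pages
        for index in range(first, end):
            number = index - first + 1  # numbering restarts in each range
            text = to_roman(number) if style == "roman" else str(number)
            labels.append(prefix + text)
    return labels


# Hypothetical book: 4 unnumbered pages, 6 roman-numbered pages, then main text.
labels = page_labels(20, [(0, "decimal", "fc "),
                          (4, "roman", ""),
                          (10, "decimal", "")])
print(labels[0], "|", labels[4], "|", labels[9], "|", labels[10])
```

Three range entries are enough to label all twenty pages of this imaginary file, which is the point of specifying ranges rather than individual pages.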


