Some Theoretical Background: CUPS, IPP, PostScript® and GhostScript

Chapter 4. Some Theoretical Background: CUPS, IPP, PostScript® and GhostScript

This chapter aims to give a bit of theoretical background to printing in general, and to CUPS in particular. If you do not need this, you might like to skip ahead to the next chapter. I am betting you will come back to this chapter at some point anyway, because sometimes one needs extra theory to solve a practical problem.

Basics About Printing

Printing is one of the more complicated chapters in IT.

Early on, every developer of a program that could produce printable output also had to write the printer drivers for it. That was quite complicated, because different programs have different file formats. Even programs with the same purpose, for example word processors, often do not understand each other's formats. There was no common interface to all printers, so programmers often supported only a few selected models.

A new device appearing on the market required the program authors to write a new driver if they wanted their program to support it. For manufacturers, too, it was impossible to make sure their device was supported by every program known to the world (although there were far fewer than today).

Having to support ten application programs and a dozen printers meant a system administrator had to deal with 120 drivers. So the development of unified interfaces between programs and printers became an urgent need.

The appearance of “Page Description Languages” (PDLs), which describe the graphical representation of ink and toner on sheets of paper (or on other output devices, like monitors, photo typesetters, etc.) in a common way, was a move that filled a big gap.

One such development was PostScript® by Adobe. It meant that an application programmer could concentrate on making his program give out a PostScript® language description of his printable page, while printing device developers could focus on making their devices PostScript® literate.

Of course, other description methods were developed over time. The most important competitors to PostScript® were PCL (“Printer Command Language”, from Hewlett-Packard®), “ESC/P” (from Epson) and GDI (“Graphics Device Interface”, from Microsoft®).

The appearance of these page description languages made life easier and facilitated further development for everybody. Yet the fact that different, incompatible, and competing page description languages remain still makes life difficult enough for users, administrators, developers and manufacturers.

PostScript® in Memory - Bitmaps on Paper

PostScript® is most heavily used in professional printing environments such as PrePress and the printing service industries. In the domains of UNIX® and Linux®, PostScript® is the predominant standard as a PDL. Here, nearly every program generates a PostScript® representation of its pages once you push the ‘Print’ button. Let us look at a simple example of (hand-made) PostScript® code. The following listing describes two simple drawings:

Example 4.0. PostScript® Code

%!PS
100 100 moveto
0 50 rlineto
50 0 rlineto
0 -50 rlineto
closepath
.7 setgray fill
% first box over; next
160 100 moveto
0 60 rlineto
45 10 rlineto
0 -40 rlineto
closepath
.2 setgray fill
% send the finished page to the output device
showpage

This tells the imaginary PostScript® ‘pen’ to draw a path of a certain shape, and then fill it with a shade of gray; it does so twice. The first part translates into more comprehensible English as ‘Go to coordinate (100,100) and draw a line of length 50 upward; then one of the same length to the right, then one down again, and finally close the path. Now take a gray level of 0.7 (where 0 is black and 1 is white) and use it to fill the drawn shape.’
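To make the ‘pen’ movements concrete, here is a minimal Python sketch that follows the path operators of the first box and works out the absolute corner coordinates. It is not a PostScript® interpreter: it only understands `moveto`, `rlineto` and `closepath`, and the function name `trace_path` is invented for this illustration.

```python
def trace_path(program):
    """Return the closed sub-paths of a tiny PostScript-like program.

    Each sub-path is a list of absolute (x, y) points. Only the operators
    moveto, rlineto and closepath are understood (an assumption made to
    keep the sketch short); every other token must be a number.
    """
    stack, paths, current = [], [], []
    for token in program.split():
        if token == "moveto":
            y, x = stack.pop(), stack.pop()      # operands are "x y moveto"
            current = [(x, y)]
        elif token == "rlineto":
            dy, dx = stack.pop(), stack.pop()    # offsets relative to the last point
            x, y = current[-1]
            current.append((x + dx, y + dy))
        elif token == "closepath":
            paths.append(current)                # implicit line back to the start
            current = []
        else:
            stack.append(float(token))           # numbers are pushed on the stack
    return paths


first_box = "100 100 moveto 0 50 rlineto 50 0 rlineto 0 -50 rlineto closepath"
corners = trace_path(first_box)[0]
print(corners)
```

The four corners come out as (100,100), (100,150), (150,150) and (150,100): a square with a side length of 50 units.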

Example 4.1. Rendered PostScript®

[Figure: Example 4.0 rendered as an image.]

Of course, PostScript® can be much more complicated than this simplistic example. It is a fully fledged programming language with many different operators and functions. You may even write PostScript® programs to compute the value of Pi, format a hard disk or write to a file. The main value and strength of PostScript®, however, lies in describing the layout of graphical objects on a page: it can also scale, mirror, translate, transform, rotate and distort everything you can imagine on a piece of paper, such as letters in different font representations, figures, shapes, shades, colors, lines, dots, raster...

A PostScript® file is a representation of one or more to-be-printed pages in a relatively abstract way. Ideally, it describes the pages in a device-independent way. PostScript® is not directly ‘visible’; it only lives on hard disks and in RAM as a coded representation of future printouts.

Raster Images on Paper Sheets

What you see on a piece of paper is nearly always a ‘raster image’. Even if your brain suggests to you that your eyes see a line: take a good magnifying glass and you will discover lots of small dots... (One example to the contrary is a sheet that has been drawn by a ‘pen plotter’.) And that is the only thing that the ‘marking engines’ of today's printers can put on paper: simple dots of different colors, sizes and resolutions that make up a complete ‘page image’ composed of different bitmap patterns.

Different printers need the raster image prepared in different ways. Think about an inkjet device: the final raster format and the transfer order to the marking engine depend heavily on the exact model used. They depend on its resolution, the number of inks used (the very good ones need 7 different inks, while a cheaper one might use only 3), the number of available jets spitting out ink simultaneously (some print heads have more than 100!), the ‘dithering algorithm’ used, and many other things.
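To give an idea of what a ‘dithering algorithm’ does, here is a small Python sketch of ordered dithering with a 2x2 threshold matrix. It is only one of many possible schemes and is not taken from any particular printer driver. The name `dither` and the value `ink` (0.0 for bare paper, 1.0 for solid ink) are assumptions made for this illustration. With black ink on white paper, the box filled with `.7 setgray` in the listing above would need an ink coverage of about 0.3.

```python
# 2x2 Bayer matrix: the order in which the dots of a cell are switched on.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def dither(ink, width, height):
    """Turn a uniform ink coverage into rows of '#' (dot) and '.' (no dot).

    ink is a fraction between 0.0 (bare paper) and 1.0 (solid ink).
    A device that can only print or not print a dot approximates the
    coverage by switching on the matching share of dots in each 2x2 cell.
    """
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4
            row += "#" if ink > threshold else "."
        rows.append(row)
    return rows


for line in dither(0.5, 8, 4):
    print(line)
```

At a coverage of 0.5 every second dot is set, which gives a checkerboard pattern. Real drivers use larger matrices or error diffusion, and they repeat the process for every ink.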

Back in the early life of the ‘Line Printer Daemon’, printers were machines that hammered rows of ASCII text mechanically onto long media, folded as a zig-zag paper snake, drawn from cardboard boxes beneath the table... What a difference from today!

RIP: From PostScript® to Raster

Before the final raster images are put on cut sheets of paper, they have to be calculated somehow from their abstract PostScript® representation. This is a very compute-intensive process. It is called the ‘Raster Imaging Process’ (more commonly ‘RIP’).
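The core of this calculation can be sketched in a few lines of Python: take the outline of a shape in PostScript® units (1/72 inch by default), choose a device resolution, and decide for every pixel whether its center lies inside the outline. This is a deliberately naive sketch, not how a production RIP works. The names `inside` and `rasterize` are invented here, and the even-odd test used below gives the same result as the non-zero winding rule of the PostScript® `fill` operator only for simple shapes such as the box from Example 4.0.

```python
def inside(px, py, polygon):
    """Even-odd ray-casting test: is the point (px, py) inside the polygon?"""
    hit = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):                        # edge crosses the ray's height
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:                              # crossing lies to the right
                hit = not hit
    return hit


def rasterize(polygon, dpi, width_px, height_px):
    """Return a bitmap (list of rows, row 0 at the bottom) with 1 = dot set."""
    points_per_pixel = 72 / dpi                           # 72 PostScript units per inch
    bitmap = []
    for row in range(height_px):
        line = []
        for col in range(width_px):
            cx = (col + 0.5) * points_per_pixel           # pixel center in page units
            cy = (row + 0.5) * points_per_pixel
            line.append(1 if inside(cx, cy, polygon) else 0)
        bitmap.append(line)
    return bitmap


box = [(100, 100), (100, 150), (150, 150), (150, 100)]    # first box of Example 4.0
bitmap = rasterize(box, dpi=36, width_px=100, height_px=100)
print(sum(map(sum, bitmap)), "dots set")
```

Even at the very low resolution of 36 dpi the small box already needs 625 dots. At 600 dpi a full page runs into tens of millions of pixels, which is why this step is so expensive.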

With PostScript® printers the RIP-ing is taken care of by the device itself. You just send it the PostScript® file. The ‘Raster Imaging Processor’ (also called the RIP) inside the printer is specialized in this task and handles it quite well: it interprets the PostScript® page descriptions and puts the raster image on paper.

Smaller PostScript® devices have a hardware RIP built in; it is cast in silicon, on a special chip. Big professional printers often have their RIP implemented as a software RIP inside a dedicated, fast computer running UNIX®, often a Sun SPARC Solaris or an SGI™ IRIX® machine.

GhostScript as a Software RIP

But what happens if you are not lucky enough to have a PostScript® printer available?

You need to do the RIP-ing before you send the print data to the marking engine. You need to digest the PostScript® generated by your application on the host machine (the print client) itself. You need to know exactly how the raster format for the target printer's marking engine must be composed.

In other words, as you can't rely on the printer to understand and interpret the PostScript® itself, the issue becomes quite a bit more complicated. You need software that tries to solve these issues for you.

This is exactly what the omnipresent ghostscript package does for many Linux®, *BSD and other UNIX® boxes that need to print to non-PostScript® printers: ghostscript is a PostScript® interpreter, a software RIP capable of driving a lot of different devices.
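As a rough illustration of ghostscript acting as a software RIP, the following Python sketch builds a `gs` command line that renders a PostScript® file into a monochrome bitmap. The file names `boxes.ps` and `boxes.pbm`, the choice of the `pbmraw` device and the 300 dpi resolution are assumptions made for this example. A real print system would select a device that matches the target printer.

```python
import subprocess


def ghostscript_command(ps_file, out_file, device="pbmraw", dpi=300):
    """Build a ghostscript call that rasterizes ps_file into out_file."""
    return [
        "gs",
        "-dSAFER",                     # restrict file access from within PostScript
        "-dBATCH", "-dNOPAUSE",        # run without prompting, exit when done
        f"-sDEVICE={device}",          # output device, i.e. the raster format
        f"-r{dpi}",                    # resolution in dots per inch
        f"-sOutputFile={out_file}",
        ps_file,
    ]


def rip(ps_file, out_file, **options):
    """Run ghostscript; requires the gs program and an existing input file."""
    subprocess.run(ghostscript_command(ps_file, out_file, **options), check=True)


cmd = ghostscript_command("boxes.ps", "boxes.pbm")
print(" ".join(cmd))
```

The sketch only prints the command. Calling `rip("boxes.ps", "boxes.pbm")` would actually run it, provided ghostscript is installed and the input file exists.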

‘Drivers’ and ‘Filters’ in General

To produce rasterized bitmaps from PostScript® input, ghostscript uses the concept of ‘filters’. There are many different filters in ghostscript, some of them specialized for a certain model of printer. ghostscript filters specialized in particular devices have often been developed without the consent or support of the manufacturer concerned. Without access to the specifications and documentation, reverse engineering the protocols and data formats was a very painstaking process.

Not all ghostscript filters work equally well for their printers. Yet some of the newer ones, like the stp filter of the Gimp Print project, produce excellent results, leading to photographic quality on a par with, or even superior to, their Microsoft® Windows® driver counterparts.

PostScript® is what most application programs produce for printing in UNIX® and Linux®. Filters are the true workhorses of any printing system there. Essentially they produce the right bitmaps from any PostScript® input for non-PostScript® target engines.

Drivers and Filters and Backends in CUPS

CUPS uses its own filters, though the filtering system is based on ghostscript. In particular, the pstoraster and imagetoraster filters are directly derived from ghostscript code. CUPS has streamlined the whole mechanics of this legacy code and reorganized it into a few clear and distinct modules.

This next drawing (done with the help of Kivio) gives an overview of the filters and backends inside CUPS and how they fit together. The ‘flow’ is from top to bottom. Backends are special filters: they don't convert data to a different format, but send the ready files to the printer. There are different backends for different transfer protocols.

[Figure: kprinter dialogue started (Kivio draft drawing)]
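The division of labor between filters and backends can be mimicked with a toy Python model: each filter converts a job from one format to the next, and the backend at the end of the chain only delivers the result. This is a conceptual sketch and not the CUPS programming interface. The job dictionaries, the format labels, and the names `rastertoprinter`, `make_backend` and `print_job` are all invented for this illustration; only pstoraster is a filter name mentioned above.

```python
def convert(job, expected, produced):
    """Helper: check the input format and relabel the job with the new one."""
    if job["format"] != expected:
        raise ValueError(f"expected {expected}, got {job['format']}")
    return {"format": produced, "data": job["data"]}


def pstoraster(job):
    """Stand-in for the RIP step: PostScript in, generic raster out."""
    return convert(job, "postscript", "raster")


def rastertoprinter(job):
    """Hypothetical model-specific step: generic raster to the printer's format."""
    return convert(job, "raster", "printer-native")


def make_backend(wire):
    """A backend does not convert anything; it only sends the ready data."""
    def backend(job):
        wire.append(job)               # 'wire' stands in for USB, IPP, LPD, ...
        return job
    return backend


def print_job(job, filters, backend):
    """Run the job through the filters in order, then hand it to the backend."""
    for apply_filter in filters:
        job = apply_filter(job)
    return backend(job)


wire = []
result = print_job({"format": "postscript", "data": "%!PS ..."},
                   [pstoraster, rastertoprinter], make_backend(wire))
print(result["format"], len(wire))
```

The point of the model is that the chain only works when the output format of each step matches the input format of the next one, while the choice of backend is independent of the conversion steps.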

Spoolers and Printing Daemons

Besides the heavy filtering task of generating a print-ready bitmap, any printing software needs a SPOOLing mechanism: it lines up different jobs from different users for different printers and different filters, and sends them to their destinations accordingly. The printing daemon takes care of all this.

This daemon keeps the house in order. It is also responsible for job control: users should be allowed to cancel, stop, restart, etc. their own jobs (but not other people's jobs).
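The two duties described here, queuing and job control, can be sketched in a few lines of Python. This is a toy model under simplifying assumptions (a single in-memory queue, no priorities, no administrator account), and the class `Spooler` with its methods is invented for this illustration rather than taken from any real printing daemon.

```python
class Spooler:
    """Lines up jobs per printer and lets users cancel only their own jobs."""

    def __init__(self):
        self.queue = []                # jobs in order of arrival
        self.next_id = 1

    def submit(self, user, printer, document):
        """Queue a document and return the new job's id."""
        job = {"id": self.next_id, "user": user,
               "printer": printer, "document": document}
        self.next_id += 1
        self.queue.append(job)
        return job["id"]

    def cancel(self, user, job_id):
        """Remove a queued job; refuse if it belongs to somebody else."""
        for job in self.queue:
            if job["id"] == job_id:
                if job["user"] != user:
                    raise PermissionError("not your job")
                self.queue.remove(job)
                return True
        return False                   # no such job (or already printed)

    def next_job(self, printer):
        """Hand the oldest waiting job for this printer to the filter chain."""
        for job in self.queue:
            if job["printer"] == printer:
                self.queue.remove(job)
                return job
        return None


spooler = Spooler()
first = spooler.submit("alice", "laser", "report.ps")
second = spooler.submit("bob", "inkjet", "photo.ps")
third = spooler.submit("bob", "laser", "letter.ps")
spooler.cancel("bob", third)
print(spooler.next_job("laser")["document"])
```

Jobs for the same printer leave the queue in the order in which they arrived, and an attempt by one user to cancel another user's job raises an error instead of removing it.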
