Geruva Publications - Software Dept.


cs2050 Software for Creation and Manipulation of PDF files - Windows, Linux/UNIX, Macintosh, etc.

December 2005 Edition. Copyright Arnold Kochman. Other copyrights also apply, including but not limited to the GNU General Public License.

This product includes a number of different packages. These are primarily source distributions for Linux/UNIX, and binary for Windows users. Some are relatively simple and some are quite complex.

I have included some documentation, where possible. For some packages you will need one of the commonly available unzip utilities, such as PKUNZIP or WinZip or tar, depending upon your system environment. In addition, packages contain documentation at varying levels of thoroughness, which you will see when you expand the archives in which they are stored. Note that there are various archive formats, which generally are appropriate to the area of applicability of the contents.

Broadly speaking, the packages on this CD fall into two categories:
1. Applications and Utility programs that can be installed and used, perhaps with some configuration. (User oriented.)
2. Libraries, classes such as Java classes or PHP classes, PERL modules, etc., which are intended for developing applications with PDF functionality or for integrating PDF functionality into existing applications. (Developer oriented.)

The CD includes the following:

PDF Software
Applications / Utilities
Cupsys - Cupsys PDF backend - A PDF backend for CUPS, written by Ola Lundqvist. It is distributed under the GNU General Public License. It allows users to create PDF files via the print server. For further information, refer to http://freshmeat.net/redir/cupsys-pdf-backend/54036/url_homepage/cupsys-pdf-backend
pdftk - A simple command line tool for manipulating PDF documents, written in C++ by Sid Steward. It can be used to merge PDF documents, split PDF pages into a new document, decrypt input, encrypt output, fill PDF forms with FDF data and/or flatten forms, apply a background watermark, report on PDF metrics, update PDF metadata, attach files to PDF pages or the PDF document, unpack PDF attachments, burst a PDF document into single pages, decompress and re-compress page streams, and perform some repairs on corrupted PDF files. The pdftk tool runs under Linux/UNIX type systems, Windows, and Mac OS X. http://freshmeat.net/redir/pdftk/52709/url_homepage/pdftk
PandaScript - A scriptable PDF generation frontend by Michael Still. PandaScript is a frontend to the Panda PDF generation API that reads from standard input and generates a PDF document as instructed. This makes it much easier to generate PDF documents from shell and Perl scripts. PandaScript is a UNIX shell script and is distributed under the GNU General Public License. For further details, see http://www.stillhq.com/extracted/pandascript/
PDFMap - A command line tool and Python library which eases the creation of very high quality clickable maps in PDF format. You can place objects on the maps. Each object can be represented by either a shape or an image, and is scaled and rotated to reflect its original dimensions and orientation. You can make each object clickable in Adobe Acrobat Reader (e.g., to access a Web-enabled application from which you've extracted the data used to create the map). PDFMap was written by Jerome Alet, and is distributed under the GNU General Public License. Further information can be had from http://www.librelogiciel.com/software/PDFMap/action_Presentation
HermesPost - A network Postscript to PDF converter, distributed under the GNU General Public License. HermesPost is an interactive interface for PDF conversion. You can use it as a printer and receive back a PDF. This is a simple way to have a network PDF printer. The usage is intended within a network with many W2* clients and one Linux-based server. You may find further details here: http://www.openit.it/index.php/openit_en/layout/set/print/content/view/full/255
Multivalent - Multivalent PDF Tools - A suite of tools for manipulating PDF documents by Tom Phelps. It includes tools for compressing, uncompressing (for hand editing), obtaining metadata, splitting and merging, encrypting and decrypting, validating, imposition (aka n-up), making page images, extracting text, and full-text indexing (with Lucene). The compress tool shrinks the PDF 1.5 Reference from 13.5MB to 8MB in PDF 1.5/Acrobat 6 format and down to 5.1MB in a new proposed "Compact" format. It is written in Java and is suitable for MacOS X, Microsoft Windows 95/98/ME/NT/2000/XP, and Linux/UNIX type systems.
mbtPdfAsm - Mad Builder PDF Assembler - A utility that provides an easy way to merge PDF files, written by Thierry Schmit. Mad Builder PDF Assembler allows you to merge several PDF files selected by a selection mask. The merging may be ordered and may specify particular pages of the original files. It is written in C/C++ and is suitable for Windows NT/2000/XP and Linux/UNIX systems. Operates in "assemble," "extract," and "update" modes.
YAFPC - Yet Another Freeware PDF-Composer - A PDF composer for network shared PDF printers, written by Wolfgang Ullrich. It can compose PDF documents from picture files and other PDF files, encrypt the created document, and send it to a given email address. YAFPC is designed to act as a command-line tool that combines with GhostScript to provide a network shared PDF printer which automatically adds letterheads, company logos, watermarks, and Terms & Conditions pages to the printed documents, and then mails the document to the user who initiated the print job. It has a graphical user interface (GUI) for easy configuration and testing. Easy-to-use sample scripts for setting up a PDF printer on Windows or Linux servers are included. YAFPC requires Java 1.4.1 or above. The PDF files it produces are compatible with Adobe Reader version 5.0 and later. It is distributed under the GNU General Public License.
pdfTeX - An extended version of TeX that can produce PDF documents, written by Han The Thanh. When PDF output is not selected, pdfTeX produces normal DVI output; otherwise, it produces PDF output that looks identical to the DVI output. An important aspect of this project is to investigate alternative justification algorithms, optionally making use of multiple master fonts. pdfTeX is distributed under the GNU General Public License. Further information can be found at http://www.tug.org/applications/pdftex/
PDFTrans - A utility to add metadata, protect, and encrypt PDF documents. PDFTrans is a small front-end to the iText PDF library. It's a simple utility written in Java, and distributed under the GNU Lesser General Public License. It requires JRE 1.2 or later and the iText package. For further information refer to http://maddingue.free.fr/softwares/pdftrans.html
extendedPDF - An OpenOffice.org macro by Martin Brown. It converts an OpenOffice document into a PDF document. The output includes the original document's headings as PDF bookmarks, and includes the original hyperlinks as PDF hyperlinks. Document meta-information (such as title, author, and keywords) is also added. extendedPDF has many features, including the possibility of converting OpenOffice.org Notes to PDF Note Annotations.
    The Professional Edition is designed for Windows users who need an easy-to-use product; the Universal Edition is designed for users and administrators on all OpenOffice.org and StarOffice platforms; and the Open Edition is designed for users who need the features offered by extendedPDF and who are prepared to install the software manually. The Open Edition does not support PDF security. Further details are available at http://www.jdisoftware.co.uk/pages/epdf-home.php.
Xpdf - A viewer for PDF files, written by Derek Noonburg. The Xpdf project also includes a PDF text extractor, PDF-to-PostScript converter, and various other utilities. It runs under the X Window System on UNIX, VMS, and OS/2. The non-X components (pdftops, pdftotext, etc.) also run on Win32 systems, and should run on pretty much any system with a decent C++ compiler. Xpdf is designed to be small and efficient. It can use Type 1 and TrueType fonts. Further information can be found at http://www.foolabs.com/xpdf/
OpenReport - A fast, flexible application for rendering professional PDF documents. It is a set of two embedded or standalone components: Tiny RML2PDF and the Open Report Server. It is designed to easily produce quality documents in real time from multiple data sources. It can use and render vector graphics, bitmapped pictures, high-level formatting elements, low-level formatting objects, business templates, and XML data, and put all these primitives together to produce a PDF file. It runs under Windows 95/98/NT/2000/XP and Linux/UNIX systems. Detailed information is available at http://openreport.org/
iPDF - Indexed PDF Creator - An application to create indexed PDF documents from text files, written by Steve Slaven. It was designed to aid in creating an electronic distribution method for legacy system reports, since many mainframe type print spools are plain text. It allows indexing, customizing page settings, font size, font face, and superimposing text over an image in the case of using pre-printed forms. It supports unlimited levels of indexing bookmarks in documents and system/user configuration files. It is distributed under the GNU General Public License. Further details can be found at http://hoopajoo.net/projects/ipdf.html
HTML_ToPDF - A PHP class, written by Jason Rust, that facilitates conversion of HTML documents to PDF files on the fly. HTML_ToPDF grew out of the need to convert HTML files (which are easy to create) to PDF files (which are not so easy to create) quickly and easily. HTML_ToPDF operates on Linux and Windows, and is distributed under the PHP License. It has many useful features; more details can be found at http://www.rustyparts.com/pdf.php
ascii2pdf - A simple text to simple PDF converter. ascii2pdf translates simple text documents to PDF format. It has options for changing font, font size, and landscape vs. portrait mode, but not much else. It is written in perl, and requires the PDF::Create module for perl. It is distributed under the GNU General Public License. http://bulldog.tzo.org/ascii2pdf/ascii2pdf.html
slidesystem - Slide System is a PHP-based Web application to create and manage slide presentations in the PDF format. It is distributed under the GNU General Public License. http://slidesystem.migratis.net/
KeyJnote - A program that displays presentation slides. KeyJnote offers some unique tools that are really useful for presentations. In addition, KeyJnote has facilities for certain visual effects in the frame transitions. KeyJnote is written in Python so it can run on many platforms, including Windows, Mac OS X, and Linux/UNIX. Python 2.3+, PyOpenGL, PyGame, and PIL are required; pdftk and GhostScript are recommended. KeyJnote is distributed pursuant to the GNU General Public License. Further information can be obtained from http://keyjnote.sourceforge.net/
joinPDF - A command line tool to join and split PDF files. When you have to join several files, it is much faster than using Adobe Acrobat. However, it will not join PDF files which have security enabled; it generates an error instead. joinPDF requires a Java Runtime Environment. It is distributed under the terms of the GNU General Public License.
xtopdf - A converter to PDF file format from plain text and .DBF (XBase) files, by Vasudev Ram. It is written in Python and distributed under the BSD License.
PDFWebX - A PDF to Web site converter based on the programs pdftohtml and ghostscript, written by Alberti Corbi. It features a drag-and-drop interface, and supports the creation of Web sites that include frames, links, images, and more. Images may be exported as JPEG or PNG. A selected range of pages from the PDF or the whole PDF may be converted. Hidden text can be exported. The use of frames in the output is optional. The default font size can be increased in the Web output. The overall process can be inspected more closely through a built-in mini-terminal. PDFWebX runs on MacOS X and is distributed under the terms of the GNU General Public License.
PyCalendarGen - A program that generates customizable calendar pages in PDF format for use, for example, in photo calendars. It supports custom holidays, birthdays, and other special days and has decent internationalization. It is written in Python by Johan Warlander, and is distributed under the GNU General Public License. English and Swedish.
slide2handout - A TeX script to create printable versions of PDF slides that look similar to handouts generated with MS PowerPoint. It was written by Krzysztof Leszczynski and is distributed under the GNU General Public License.
Dvipdfm - A DVI to PDF translator, written by Mark A. Wicks. Its features include TeX specials that approximate the functionality of the PostScript pdfmarks used by Adobe Acrobat Distiller, the ability to include PDF files and JPEG files as embedded images, support for both Type1 and PK fonts, support for arbitrary linear graphics transformations, a color stack accessible via specials, partial font embedding and stream compression for reduced output file size, native, portable graphics via TPIC specials, and balanced page and destination trees for improved reader access on very large document files. Dvipdfm is made available under the GNU General Public License.
pdf2html - A web application to convert one PDF into a series of HTML pages with PNG images of logical pages. It runs GhostScript at high resolution and processes the output into low-res, 8-bit grayscale PNGs, using 17x15 times subsampling to achieve exactly 256 levels of gray. It is intended to convert PDF for online Web browsing where time is not critical but quality is desired. A special utility to print the content on an Epson-compatible 9-pin printer is also included. pdf2html was written by Karel Kulhavy and is distributed under the GNU General Public License.
JadeTex - A TeX macro package for processing the output from Jade/OpenJade in TeX (-t) mode. It is part of the standard DSSSL Jade-based toolchain to produce PostScript or PDF from SGML/XML documents. It produces output either as a DVI file, or directly to PDF when using pdfjadetex.
PdfRipImage - A utility to extract images from PDF files and convert them to the format of your choice. It is really a front-end for 'pdfimages' from Xpdf and 'pnmto*' from netpbm. The program is available as a command line shell script and a GNOME graphical version. PdfRipImage was written by Neil McPhail and is distributed under the GNU General Public License.
HTMLDOC - A tool to convert HTML files and Web pages into indexed HTML, PostScript, and PDF files suitable for online viewing and printing. It can be used as a standalone GUI application, in a batch document processing environment, as a Web-based report generation application, or in embedded environments to support printing of HTML content. It runs on all Unix platforms as well as Mac OS X and Windows 2000 and higher. HTMLDOC was written by Michael Sweet and is released under the GNU General Public License. Further details can be found at http://www.easysw.com/htmldoc/
Linbox - Linbox-Converter - A Python application that converts MS Office documents to PDF, PS, text, HTML, or TIFF. It is a client/server application in which the client transfers the MS Office documents to convert to the server. The server does the conversion into the requested file type and sends back the result to the client. Linbox is distributed under the GNU General Public License. Detailed information can be found at http://linbox.com/en/converter.
sam2p - A Unix command line utility that converts many raster (bitmap) image formats into Adobe PostScript or PDF files. The images are not vectorized. It gives full control to the user to specify standards compliance, compression, and bit depths. It is common for sam2p to compress an image down to a 50kB Level1 PostScript file without quality loss, while other popular converters produce multi-megabyte output. It provides some Level3 compression filters even on Level1 devices. sam2p runs in an X11 environment. It is released under the GNU General Public License. For further information refer to http://www.inf.bme.hu/~pts/sam2p/
fig2ps - A script for converting xfig files to PS or PDF, processing text using LaTeX. It is intended to help typeset good quality documents, where the font on the pictures is exactly the same as the font in the text. The advantage it has over some other xfig exporters such as eepic is that you compile the picture only once and not every time you compile your LaTeX file, giving a great gain in speed with complex pictures. It should work with LyX. It was written by Vincent Fourmond and is released under the GNU General Public License.
form_tools - A collection of CGI scripts and command line tools that can pre-populate Web forms (PDF and HTML) with data from a MySQL database as well as dump data submitted by a Web form into a MySQL database. Each Web form has a corresponding configuration file that controls how the CGI scripts and tools behave. Adding new forms that use other databases for input and output is simply a matter of creating a new configuration file for the new form. The form_tools package requires CGI.pm, HTML-Parser and LWP.pm. It was written by Joshua Starmer, and is released under the GNU General Public License. Find further details at http://www.itlab.musc.edu/form_tools/form_tools_readme.html.
Scribus - A desktop page layout program with the aim of producing commercial grade output in PDF and Postscript. Scribus is written in C++ for a Linux X11 environment. A version exists for OS X, but is not included on this CD. Scribus is distributed under the GNU General Public License. Further details can be found at http://www.scribus.org.uk/.
filegarden - Redstone FileGarden - A Linux application package with an editor, a built-in file manager, and a built-in email client. It can convert text to PostScript and PDF, perform image conversion, and supports custom scripting, batch processing, and custom component plugins. It can act as a front-end for page formatting with enscript with page preview. Batch processing of files is supported. It can format booklet style PostScript for book binding. You can create color graphical LaTeX files with XFG and export them to PostScript or PDF. There are some simple builtin LaTeX tags and functions. From the menu, users can also perform sound and video decoding and encoding. It supports Flac, OGG, and MP3 for audio, and AVI, DivX, WM8/9, and MPEG 1/2 for video. It is distributed under the GNU General Public License.
Ghostscript - AFPL Ghostscript - A processor for PostScript and PDF files, written by Raph Levien. It can rasterize these files to a wide variety of printers, devices for screen preview, and image file formats. Since applications tend to prepare pages for printing in a high-level format such as PostScript, most Unix users with low-level bitmap printers, such as inkjets, use GhostScript as part of the printing process. Ghostscript is also capable of converting PostScript files to PDF, with functionality comparable to Adobe Acrobat Distiller, but on the command line. In addition, Ghostscript is used for file import and viewing by a great many other applications, including xv, ImageMagick, gimp, and xdvi. AFPL Ghostscript can be run on Linux/UNIX, MacOS, MacOS X, Windows 95/98/ME, OS/2, BeOS, and other operating systems. Detailed information can be found at http://www.ghostscript.com/
Xprint - An advanced printing API based on the X11 protocol which enables applications to use devices like printers, and to fax or create documents in formats like PostScript, PCL, PDF, SVGprint, etc. Xprint was written by Roland Mainz and is distributed under the MIT/X Consortium License. It is suitable to run on Linux/UNIX type systems. Details can be found at http://xprint.mozdev.org/.
quaneko - A file indexing and search application, written in C++. It can index files in pdf, doc, html, xml, mp3, etc., as well as plain text. quaneko runs under MacOS X, Windows 95/98/ME/NT/2000/XP, and Linux/UNIX systems. The package is distributed under the GNU General Public License.
ReportLab - A package for easily generating high-quality, dynamic, personalized PDF documents in real time and at high volume from any data source. EPS and bitmap formats can also be created. It has many graphics libraries, e.g. for business charts. It is ideal for automated reporting needs. ReportLab is written in C, Java, and Python and uses JDBC database functionality. The authors feel that it is relatively system independent. It is available under the terms of the GNU General Public License. Further particulars can be found at http://www.reportlab.com/.
PDFCreator - A tool to create PDF documents easily from nearly any application. With the PDFCreator printer driver, any program that produces printable output can send that output to a PDF queue. PDFCreator is a Visual Basic application that runs under Windows 95/98/NT/2000/XP. It is quite a nice program. It does have some flaws, but nothing to bother the average user. It can conveniently be configured as a network printer, and you can equally well use it from Linux (or other) clients on your network. The program is distributed under the GNU General Public License. Additional information can be found at http://sector7g.wurzel6.de/pdfcreator/index_en.htm
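A note on the arithmetic in the pdf2html entry above: a 17x15 block contains 255 pixels, so counting the black pixels in one block gives 256 possible values (0 through 255), which fits an 8-bit gray scale exactly. The following is a minimal sketch of that idea only. It assumes the high-resolution GhostScript output is a 1-bit bitmap, and the function names are hypothetical; none of this is taken from the pdf2html source.

```python
# Assumption (not from pdf2html itself): the high-resolution render is a
# 1-bit bitmap stored as rows of 0 (white) and 1 (black).
BLOCK_W, BLOCK_H = 17, 15  # subsampling factors quoted in the pdf2html entry


def gray_levels():
    """Number of distinct black-pixel counts in one 17x15 block (0..255)."""
    return BLOCK_W * BLOCK_H + 1


def downsample(bitmap):
    """Collapse each full 17x15 block of a 1-bit bitmap into one gray value.

    Returns rows of integers in 0..255, where 255 is white and 0 is black.
    Partial blocks at the right and bottom edges are ignored.
    """
    rows, cols = len(bitmap), len(bitmap[0])
    out = []
    for top in range(0, rows - BLOCK_H + 1, BLOCK_H):
        line = []
        for left in range(0, cols - BLOCK_W + 1, BLOCK_W):
            black = sum(
                bitmap[y][x]
                for y in range(top, top + BLOCK_H)
                for x in range(left, left + BLOCK_W)
            )
            line.append(255 - black)  # more black pixels gives a darker gray
        out.append(line)
    return out
```

An all-white 17x15 block maps to 255 and an all-black block maps to 0, with every intermediate count in between available as its own gray level.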
Libraries / Java Classes / PHP Classes / PERL Modules
CL-PDF - A cross-platform Common Lisp library for generating PDF files. It does not need any third-party tools from Adobe or others. It is used by cl-typesetting to provide a complete typesetting system. It is released with a BSD license. Further information can be found at http://www.fractalconcept.com/asp/oot/sdataQIW1nISK97c8DM==/potsdataQucgleAq9b==
AxisPHP - A library that provides PHP objects for the creation of PostScript and PDF documents, objects for (relatively) seamless session and user account handling, and various other utility objects and functions. The Session and User objects store their data in a MySQL database via a database abstraction layer, allowing them to be quickly adapted for use with other SQL database engines. AxisPHP runs in a web environment. It was written by Edward Almasy and is distributed pursuant to the GNU General Public License. Further details can be found at http://www.axisdata.com/AxisPHP/
pdfDOM - A PHP library for creating PDF documents via a document object model, written by Timo A. Hummel. Its features include automatic text wrapping with containers, image placement, and border, margin, and padding settings for containers. pdfDOM is distributed pursuant to the GNU General Public License. For further details, refer to http://pdfdom.sourceforge.net/
EASY pdflib - A set of functions and procedures which allow easy generation of PDF files from within tdbengine-based Web applications. It generates Adobe Acrobat Reader compliant output. There are common functions for drawing simple texts, tables, or graphics, as well as table access functions to directly output database contents. The pdflib is distributed under the terms of the GNU General Public License. Further details are available from http://tools.tdbengine.org/pdflib/
EPDF - An Eiffel library that allows the generation of PDF documents. It supports PNG images, 14 base fonts, graphic operations, document compression, viewer preferences, document properties, and document outline. It was written by Paul G. Crismer and is distributed under the Eiffel Forum License.
Jpedal - Java PDF Extraction and Decoding Library - A Java library to extract content from Adobe's PDF file format and rasterize it. Text fragments are extracted as XML elements with font and location information. Images are extracted in both their raw formats and their clipped and scaled formats as TIFF, PNG, or JPEG files. Jpedal is used in both commercial and open source projects and continues to be actively enhanced. It is distributed under the GNU General Public License.
PDFBox - A Java PDF library written by Ben Litchfield. PDFBox represents a web application and is distributed pursuant to the BSD License (revised). It facilitates creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. PDFBox also includes several command line utilities. Features include: PDF to text extraction; document merge; encryption/decryption; Lucene search engine integration; fill in form data; PDF creation from text files; images from PDF pages; and PDF printing. Further details can be found at http://www.pdfbox.org.
Panda - Panda PDF Generator - A C library for PDF generation, by Michael Still. The Panda PDF Generator is intended to make PDFs on the fly for the Web, but may be used to generate any PDF document you want. It is distributed under the GNU Lesser General Public License. http://www.stillhq.com//panda/
PDF::ReportWriter - A PERL module to produce high-quality reports and output directly in PDF format, written by Daniel Kasak. It supports text formatting and alignment, unlimited grouping with group functions, intelligent page breaking, color support, image support, shaped cell backgrounds, and numeric formatting. It is distributed under the GNU Lesser General Public License.
phppdflib - A PHP class that allows dynamic generation of PDF files, written by Bill Moran. It focuses on being easy to use, without requiring special Web server configuration. It currently supports automatic JPEG and PNG image embedding, as well as manual embedding of bitmapped image formats. Several methods of text placement are supported. phppdflib operates in a web environment. It is distributed under the GNU General Public License. Further information can be found at http://www.potentialtech.com/ppl.php
PDF::API2 - A PERL module-chain for creation and modification of PDFs. It is distributed under the Artistic License. It features support for the 14 base PDF Core Fonts, TrueType fonts, and Adobe-Type1, with unicode mappings, embedding of bitmap images, compression via zlib, and a rich object-oriented API. PDF::API2 requires PERL 5.8.x and PERL 5.8.4 or later is recommended.
iText - Java classes to generate documents in PDF format, by Bruno Lowagie. iText is a library that contains classes to generate documents in the PDF, XML, HTML, and RTF formats. It can also parse XML documents and convert them into any of these formats. Pages of existing PDF files can be imported and copied to new PDF documents. iText is intended to permit you to generate PDF files on the fly. It requires at least JDK 1.2. iText is distributed under the Mozilla Public License.
PyChart - A Python library for creating high-quality Postscript, PDF, or PNG scientific charts ready for publishing. It supports line charts, bar charts, range-fill charts, and pie charts. Written by Yasushi Saito and distributed under the terms of the GNU General Public License. It runs under Windows, and Linux/UNIX systems. Further details can be found at http://home.gna.org/pychart.
Connla - A Java library for creating data collections which can be exported to TXT, CSV, HTML, XHTML, XML, PDF, and XLS formats. Runs in a web environment and is distributed under the GNU Lesser General Public License.
PC4P - PDF Class for PHP - A library of PHP classes to speed up the creation of reports and letters, with support for such things as tables, images, background images, word wrap, page breaks, all kinds of text manipulation, and editing different PDF pages simultaneously. Written by Alexander Wirtz and distributed under the Apache License. Find further details at http://www.pc4p.net/