PDF: introductory/historical notes (all you need to crack it)
(A Response to +ORC's Message Regarding reversing PDF)

by Ragica

(1 November 1997)


Courtesy of fravia's searchlores.org

Well, read what Ragica wrote to me:
Hi Fravia+. I wrote yesterday (or whenever) and said I'd provide
some more PDF information in regards to +ORC's message about it.

I have whipped up a long, mostly useless rant on the subject; however,
it contains what I think is the biggest collection ever (in fact
the only collection ever so far!) of information regarding hacking
PDF and links to relevant information.

I don't know if it's anything you can use for your site, but it is
perhaps a starting point, and a good reference for anyone who
is interested in tackling PDF in a more meaningful way.

It includes links to detailed information on PDF encryption,
other hack attempts, easy methods to defeat the OWNER (security)
password protected options, and so on.

A Response to +ORC's Message Regarding Hacking PDF - by Ragica
--------------------------------------------------

While I appreciate (very much) +ORC's perspective on the Adobe PDF
format, and wish that tools for creating PDF were more accessible
so that PDF would be more widespread than it already is, I would
like to perhaps add some information, clear up some misconceptions,
and even say a few things in Adobe's favour.


SOME INTRODUCTORY/HISTORICAL NOTES

It is true that Adobe is horribly protectionist in their
attitudes. I believe this is a historical artifact of their
organisation. They developed PostScript and Type 1 fonts, both
formats still industry standards after all this time.

Adobe did not, it seems to me, realise what Microsoft would do to
them when Microsoft exploited its OS virtual monopoly to push
TrueType fonts. The superior (in most cases) Adobe Type 1 fonts
have all but dried up and disappeared as far as "regular" users are
concerned. Only professional typesetters and publishers continue to
use them for the most part -- although they are still used on
non-windows platforms to greater and lesser extents. Soon Type 1
fonts will be incorporated into the OpenType standard which is
supposed to be a joint project between Adobe and Microsoft, but if
you read the Microsoft Web pages (URL down below) concerning it 
you will be left with the distinct impression that it's all 
Microsoft's doing, their benevolent design to help us all...

But back to topic of PDF. I mention all this font/postscript stuff
just to point out that Adobe does have some historical experience
in these areas -- the areas of controlling and maintaining an
industry standard format, and also in being rolled over mercilessly
by the Microsoft Beast.


A HALF HEARTED DEFENCE OF ADOBE AND PDF

Now let's examine how they are handling their Portable Document
Format. It would be nice if they were out just to make the world a
better place for everyone to live in, as +ORC (as well as myself)
would like to see, but at the bottom line they are just
attempting to make money. This is not necessarily
completely intrinsically evil, however.

I would like to point out that while Adobe controls the PDF format
they have published the specifications and made them freely
available (I will list all my URL references at the end of this
message). Furthermore, while it costs 500 bucks to join their
Developer club, they also have released the Adobe Acrobat SDK for
free -- so theoretically, anyone else, any company or individual,
could create their own PDF creator/filter without even the need to
reverse engineer, or otherwise hack the format.

In fact we are seeing this happen already (not as quickly as we
might like, but it is happening). Many other major applications are
finally picking up the PDF format, Quark Xpress and Corel for
example. Of course other Adobe products such as Illustrator and
Photoshop have PDF support, and PageMaker comes bundled with
Acrobat Distiller and is ideal for creating PDF documents.

The thing which Adobe protects by keeping and developing PDF as
proprietary is purity. We have all seen (mostly to our horror) how
the HTML format has been completely mangled by competing commercial
forces. The standard has barely been adhered to, and the bullies of
the market have brutally forced (sometimes contradictory)
extensions into the format.

I would like to suggest that Adobe is doing us a favour to some
extent in developing PDF commercially and keeping the format under
their control. I would not be so quick to say this if they had not
released the format specifications and the SDK publicly. And I
would not say this if they had released the specifications
publicly and yet kept many secret undocumented functions back to
exploit like a certain large company beginning with the letter M is
famous for. It is true perhaps that, in keeping the format
proprietary, the general public is not free to help develop it
(officially) in the future, but the trade off is that we are
ensured an excellent format, freely documented, which will not be
abused and corrupted the way HTML (a weak format to begin with)
has.


FREEWARE & PUBLIC DOMAIN PDF SOLUTIONS

These are all commercial products, however, and there is even
already a freeware solution: the PD PostScript interpreter
"GhostScript", which, while a Unix program, has been ported to other
platforms such as Win32, OS/2, and even DOS. This program is,
granted, somewhat tricky to learn to use, and is not user friendly
(to say the least!) but it does work and will produce PDF files
from PostScript, and is very powerful besides. Furthermore, all of
the PostScript utilities which come with it, such as Text-to-PS
(+ORC mentioned he'd like a Text-to-PDF converter), can be used to
produce PDF files. It also now is bundled with many PDF-specific
utilities.
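
As a rough illustration of the PostScript-to-PDF step described
above, here is a minimal Python sketch that builds and runs a
GhostScript command line. The binary name "gs" and the file names are
assumptions (Win32 ports use a different executable name), and the
switches are those documented for current GhostScript releases, so an
older build may want something different.

```python
import shutil
import subprocess


def gs_pdf_command(ps_path, pdf_path, gs_binary="gs"):
    """Build the GhostScript argument list that converts PostScript to PDF."""
    return [
        gs_binary,                   # assumed executable name; adjust per platform
        "-dBATCH",                   # exit once the input file has been processed
        "-dNOPAUSE",                 # do not wait for a keypress between pages
        "-sDEVICE=pdfwrite",         # select the PDF output device
        "-sOutputFile=" + pdf_path,  # where the PDF is written
        ps_path,                     # the PostScript input file
    ]


def ps_to_pdf(ps_path, pdf_path, gs_binary="gs"):
    """Run the conversion; raises if GhostScript is missing or reports failure."""
    if shutil.which(gs_binary) is None:
        raise FileNotFoundError(gs_binary + " not found on PATH")
    subprocess.run(gs_pdf_command(ps_path, pdf_path, gs_binary), check=True)
```

Calling ps_to_pdf("report.ps", "report.pdf") would then leave the PDF
next to the PostScript file, provided GhostScript is installed.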

Besides the above mentioned method of converting ASCII Text to PDF
there are at least two little stand alone programs which will do
the deed. The first is freeware, portable, with source code, and
command line operated:
http://www.ep.cs.nott.ac.uk:80/~pns/pdfcorner/text2pdf/

The second is free non-expiring demo-ware, windows only, and a 
VB (bleah! keep it away from me!) app:
http://www.emrg.com/download/gym101.zip
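
To give a feel for how little a Text-to-PDF converter has to do, here
is a minimal Python sketch that writes a one-page PDF by hand. It is
not how either program above works (that is not documented here); the
function name, the Courier font, the Letter page size and the margins
are all assumptions chosen for illustration, and text longer than one
page simply runs off the bottom.

```python
def text_to_pdf(text, font_size=10, left=72, top=720):
    """Return the bytes of a one-page PDF showing `text` in Courier."""
    def esc(s):
        # Backslash and parentheses are special inside PDF string literals.
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Page content: begin text, pick font, set line spacing, move to start.
    ops = ["BT", "/F1 %d Tf" % font_size, "%d TL" % (font_size + 2),
           "%d %d Td" % (left, top)]
    for line in text.splitlines():
        ops.append("(%s) Tj T*" % esc(line))  # show the line, go to the next
    ops.append("ET")
    stream = "\n".join(ops).encode("latin-1", "replace")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Courier >>",
    ]

    out = bytearray(b"%PDF-1.2\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one 20-byte entry per object, plus the free head.
    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_at
    return bytes(out)
```

Writing the result of text_to_pdf(open("notes.txt").read()) to a file
opened in binary mode should give something a PDF viewer can open.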

GhostScript is a command line oriented program, with a GUI viewer.
There is an add-on viewer for it however called GSview -- it can
read and display most PDF files. There is not necessarily any need
for Adobe's Acrobat Reader if it is not wanted for whatever
reason.

Finally, if anyone out there just wants to make a quick PDF there
is a free service offered on the Net (as of the time of writing
this it has been running for a year or more) to create PDF files
for anyone. All you need to do is upload a PostScript file to the
operator's FTP site and it is automatically run through Acrobat
Distiller and placed in the outgoing directory, usually within a few
minutes. I will give the URL below.


HACKING PDF ENCRYPTION AND PDF PASSWORDS

I am not a very technical person I'm afraid, so I can not write a
lot of technical details about this aspect of PDF. However, I can
give some general information which may be helpful, and point to
some more technical sources for those interested in following up.

The locked PDF document is not the most secure thing on Earth. (-:

There are two types of passwords associated with a PDF file: an
OWNER password and USER password. The OWNER password controls the
security options, but does not prevent a PDF file from being loaded
and viewed. The USER password prevents a PDF from being decrypted
and loaded at all.

There are not yet any known cracks for the USER password, although
much about the encryption scheme is known. At the very least brute
force crackers should be fairly easy to create for those into that
sort of thing.

If the PDF can be viewed, whether it has an OWNER password or not it
is completely vulnerable. Security options to disable printing,
marking/copying, adding notes, and so on are useless.
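
As far as the published PDF specification describes it, these
security options are stored as a single integer (the /P entry of the
file's encryption dictionary) in which each option is one bit, and it
is left to the viewing program to honour them. The Python sketch
below only decodes such a value; the function and dictionary names
are made up for illustration, and the bit positions are taken from
the specification rather than from anything stated in this text.

```python
# Permission bits of the /P entry (1-based bit numbers in the specification).
# A set bit means the action is allowed.
PERMISSIONS = {
    "print": 1 << 2,     # bit 3: print the document
    "modify": 1 << 3,    # bit 4: modify the contents
    "copy": 1 << 4,      # bit 5: copy or extract text and graphics
    "annotate": 1 << 5,  # bit 6: add or modify notes
}


def decode_permissions(p):
    """Map a signed /P integer to a dict of permission name -> bool."""
    bits = p & 0xFFFFFFFF  # view the signed value as a 32-bit pattern
    return {name: bool(bits & mask) for name, mask in PERMISSIONS.items()}
```

For example, decode_permissions(-44) reports printing and copying as
allowed, and modifying and adding notes as disallowed.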

There are several approaches to stripping a viewable PDF file of
other security options. Here are three:

1.

The first method we will call the "Twiddle Method". This apparently
involves directly manipulating/editing the raw PDF file to modify
security options. I can't tell you how to do this, but can only
report that it has been done. To find out information about this go
to www.dejanews.com and enter the following power search on the OLD
news database: ~g comp.text.pdf & ~a laird & password

The person who has evidently done this hacking has the following
web page where he discusses it, however he does not share his tools,
code, or specific information. I do not know whether he is willing
to give this information out or not. Kevin Laird can be reached at:
http://www.ecn.purdue.edu/~laird/PDF/

Kevin Laird even has rigged up a CGI on his site where you can
submit the URL of a PDF file (along with the USER password if
needed) and his CGI will fetch the document and regardless of
any OWNER security settings convert the thing to a plain
PostScript file and send it back to your browser.

2.

The relatively painless way anyone can defeat the OWNER password is
by using GhostScript. Older versions of GhostScript required a
special source code patch which enabled bypassing the OWNER
password. With newer versions things are even simpler than that.
You can use the standard GhostScript distribution and just replace
the pdf_sec.ps file with a special one which gets you past the
OWNER password. Information about this GhostScript hack and where
to find it is here: http://www.ozemail.com.au/~geoffk/pdfencrypt/

3.

Finally, if you have Acrobat Distiller and the "print" option has
not been disabled, you can simply print the PDF file you are
viewing to a new PostScript file and run it through Distiller,
effectively creating a new PDF file stripped of all security
settings. This method however will lose any SPECIAL PDF attributes
such as thumbnails, bookmarks, notes, or hyperlinks, but it's very
effective on basic PDF files.

For those who would like to go further and try their hand at
breaking the PDF encryption there is an excellent page which
details (in more detail than the PDF specifications) some of the
technical aspects of the encryption method PDF employs. It is here:
http://www.hedgie.com/passwords/acrobat2.html

I believe people should have the right to encrypt whatever they
want and give the password or not give the password to whoever they
want and not have their privacy violated. However, people should
also understand the limitations of encryption methods and not be
fooled into a false sense of security when something is not secure.
I also find the tendency seems to be for people to needlessly
encrypt PDF files simply because it is easy to do and the function
is there. If they had written their information to a text file
would they encrypt that with PGP and distribute it that way? No!
But for some reason they think it's "cool" to encrypt a PDF file.
People who misuse password protection, and those who are ignorant
of its weaknesses, deserve to be shown the error of their ways!

I'm sorry I'm not advanced enough to get into more technical
PDF cracking attempts myself, but hopefully the information in this
file will be of help to someone who is. I believe it is the most
comprehensive collection of PDF hacking resources and references to
information yet assembled.

Happy hacking.


A BIT MORE ABOUT THE PDF CONCEPT

There are a lot of misconceptions about the PDF format out there.
Most people don't know how it's created, and it is endlessly
frustrating how "Adobe Acrobat" (the entire "Pro" package used to
create/edit/publish PDF documents) and the mere "Adobe Acrobat
Reader" are constantly thought to be the same thing.

The best way to think about PDF is as "electronic paper". It is not
meant to be edited, it is basically, like a printed page, a
read-only format. PDF is based on the Adobe PostScript
format/language. PDF in fact basically *is* PostScript, but with
some extensions and modifications.

It is not strictly true that PDF cannot be edited -- anything in
electronic form can be edited. PostScript files even can be edited
if you have the right tools. It is just that PDF, like PostScript,
is not designed to be edited; it is designed primarily for
display/printing.

PDF can be modified and edited with Adobe Illustrator quite easily
(if it's not password protected in any way). There are also third
party plug-ins available for Acrobat Exchange which allow some text
editing. (Adobe Acrobat Exchange is part of the Adobe Acrobat full
package. It is like the Acrobat Reader, except it can edit
security options, open options, create hyperlinks and bookmarks, 
and other PDF touch-up related functions. When Acrobat Exchange 3 
was in beta it was released to the public as a time-limited demo/beta. 
This is long gone, but copies can still be found some places under the
filename, and a crack by MJ13 is available to remove the time limit).


PDF AND HTML OR TEXT

Some people complain that PDF is hard to handle, and hard to
convert to other formats. Of course, this is intentional! It is
intended as a primarily read-only professional document type-set
format!

However Adobe has released a plugin for the Acrobat Reader which
will export any PDF file to HTML or Plain Text... the results are
not always the best, but are generally readable.

This service is also available via the internet on the fly. You can
go to http://access.adobe.com and enter the name of any PDF file on
the net. Adobe will fetch it, convert it to HTML and send the HTML
to your browser. Any (non-password protected) PDF file on the net
can be viewed (although the results aren't necessarily pretty) in
any web browser (even Lynx) without needing the Acrobat Reader or
Plug-in. This was primarily designed so that people with visual
disabilities could access PDF documents more easily, but it can be
useful for anybody.

There are also a lot of commercial 3rd Party tools cropping up
these days. There is a plug-in called "Compose" for example which
will export PDF files to RTF and possibly other formats. If you
are interested in 3rd party tools for Acrobat check the links
from one of these pages:
http://www.tinaja.com/acrob01.html
http://www.pdf.org


ACROBAT PRO AS WAREZ

If you ever look around on "warez" pages you will see links to
"Adobe Acrobat" on most of them! These always link to the FREEWARE
Acrobat Reader. If you needed any more evidence on how lame nearly
all warez web pages are, there you go. The actual full Acrobat
package is next to impossible to find as 'warez' on the net. In
fact I've looked long and hard and have never yet found it in a
complete and uncorrupted format. Of course it's out there somewhere,
and I'm not exactly elite, but all I'm saying is it's damn hard to
find!

Anyone interested in a commercial package for creating PDF should
look into PageMaker 6.5 or FrameMaker 5.5, they are much easier to
find on the net and both have integrated PDF creation support. In
fact you can create PDF from any PostScript document with either of
these packages because they install full versions of Acrobat
Distiller. All you need to do is print your document from any
application to a PostScript file, then open that file in Acrobat
Distiller and a PDF file will be created.

Adobe has now publicly released a beta of a Word'97 macro for
creating complex Acrobat documents from inside Microsoft Word 97 as
well. But you still need Acrobat Distiller to actually produce the
PDF file.


ALL THE LINKS!

The official Acrobat PDF 1.2 Specification and SDK:
--------------------------------------------------
http://www.adobe.com/supportservice/devrelations/PDFS/TN/PDFSPEC.PDF
http://www.adobe.com/supportservice/devrelations/PDFS/TN/PDFSPEC.TXT
ftp://ftp.adobe.com/pub/adobe/devrelations/devtechnotes/pdffiles/PDFSPEC.PDF
http://www.adobe.com/supportservice/devrelations/sdks.html

Microsoft & Adobe OpenType Information:
--------------------------------------
http://www.adobe.com/supportservice/devrelations/opentype/main.htm
http://www.microsoft.com/typography/users.htm


GhostScript/GSview, freeware PDF/PS tools & password patch:
----------------------------------------------------------
http://www.cs.wisc.edu/~ghost/
http://www.ozemail.com.au/~geoffk/pdfencrypt/
http://www.tinaja.com/post01.html


PDF Hacking/unprotecting:
------------------------
http://www.tinaja.com/text/insecure.html
http://www.ecn.purdue.edu/~laird/PDF/
http://www.hedgie.com/passwords/acrobat2.html


General PDF information, tools, and links:
-----------------------------------------
http://www.ep.cs.nott.ac.uk:80/~pns/pdfcorner/text2pdf/
http://www.tinaja.com/acrob01.html
http://www.pdf.org
http://access.adobe.com
http://www.adobe.com/prodindex/acrobat/main.thml
http://www.emrg.com/download/gym101.zip 


Acrobat Distiller PDF Net Service:
--------------------------------
http://www.babinszki.com/distiller.htm

(c) Ragica 1997. All rights reserved