Decompiling Java
GODFREY NOLAN
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, recording, or by any information storage
or retrieval system, without the prior written permission of the copyright owner and the publisher.
Printed and bound in the United States of America 9 8 7 6 5 4 3 2 1
Trademarked names may appear in this book. Rather than use a trademark symbol with every
occurrence of a trademarked name, we use the names only in an editorial fashion and to the
benefit of the trademark owner, with no intention of infringement of the trademark.
The information in this book is distributed on an "as is" basis, without warranty. Although every
precaution has been taken in the preparation of this work, neither the author(s) nor Apress shall
have any liability to any person or entity with respect to any loss or damage caused or alleged to
be caused directly or indirectly by the information contained in this work.
In memory of Hanpeter Van Vliet
Contents at a Glance
About the Author
About the Technical Reviewer
Acknowledgments
About the Author
Godfrey Nolan is President of RIIS LLC, where he specializes in web site
optimization. He has written numerous articles for different magazines and
newspapers in the US, the UK, and Ireland. Godfrey has had a healthy obsession
with reverse engineering bytecode ever since he wrote "Decompile Once, Run
Anywhere," which first appeared in Web Techniques in September 1997.
About the Technical Reviewer
John Zukowski is a freelance writer and strategic Java consultant for JZ Ventures,
Inc. His latest endeavor is to create a next-generation mobile phone platform
with SavaJe Technologies. Look for the 1.5 edition of his Definitive Guide to
Swing for Java 2 in the fall of 2004 (also published by Apress).
Acknowledgments
THERE ARE COUNTLESS PEOPLE I have to thank in some small way for helping me
with this book. Apologies if I've forgotten anyone.
• My wife, Nancy, and also my children, Rory and Dayna, for putting up with
all the times I've missed a family outing while writing this book. And we're
talking lots and lots of missed outings.
• Jonathon Kade, for all your hard work helping with the decompiler and
Chapter 6 in general.
• Gary Cornell, without whom this book would never have seen the light
of day.
• Tracy Brown Collins and Rebecca Rider at Apress, for putting up with my
countless missed deadlines. Do I need to say lots and lots again?
• John Zukowski, for all the helpful comments. And yes, I'm still ignoring the
one about having a comma in Hello World.
• Dave and Michelle Kowalske and all my other in-laws, for knowing when
not to ask, "Is that book finished yet?"
• Finally, to my parents, who have always taught me to aim high and who
have supported me when, more often than not, I fell flat on my face.
CHAPTER 1
Introduction
WHEN COREL BOUGHT WordPerfect for almost $200 million from the Novell
Corporation in the mid 1990s, nobody would have thought that in a matter of
months they would have been giving away the source code free. However, when
Corel ported WordPerfect to Java and released it as a beta product, a simple
program called Mocha¹ could quickly and easily reverse engineer, or decompile,
significant portions of Corel's Office for Java back into source code.
Decompilation is the process that transforms machine-readable code into
a human-readable format. When an executable, a Java class file, or a DLL is
decompiled, you don't quite get the original format; instead you get a type of
pseudo source code, often incomplete and almost always without the comments.
But often what you get is more than enough to understand the original code.
The purpose of this book is to address an unmet need in the programming
community. For some reason, the ability to decompile Java has been largely
ignored even though it is relatively easy for anyone with the appropriate mindset
to do. In this book, I would like to redress the balance by looking at what tools
and tricks of the trade are currently being employed by people who are trying to
recover source code and those who are trying to protect it using, for example,
obfuscation.
This book is for those who want to learn Java by decompilation, those who
simply want to learn how to decompile Java into source code, those who want to protect
their code, and finally those who want to better understand Java bytecodes and the
Java Virtual Machine (JVM) by building a Java decompiler.
This book takes your understanding of decompilers and obfuscators to the
next level by
• Using examples that show you what to do when an applet only partially
decompiles.
• Providing you with simple strategies you can use to show users how to
protect their code.
1. Mocha was one of the early Java decompilers. You'll see more on Mocha later in this chapter.
Take the analogy of idioms in human languages, which are often the most
difficult part of a sentence or phrase to translate. My favorite idiom is L'esprit d'escalier,
which literally translates as the wit of the staircase. But what it really means is that
perfect witty comment or comeback that pops into your head half an hour too late.
Similarly (and I know I'm stretching it a bit here) source code can often be translated
into machine code in more than one way. Java source code is designed for humans
and not computers, and often some steps may be redundant or can be performed
more quickly in a slightly different order. Because of these lost elements, few (if any)
decompilations result in the original source.
Why Java?
The original JVM was designed to run on a TV cable set-top box. As such, it was
a very small stack machine that pushed and popped its instructions on and off
a stack using only a limited instruction set. This made the instructions very easy
to understand with relatively little practice. Because the compilation process
was a two-stage process, the JVM also required the compiler to pass on a lot of
information, such as variable and method names, that would not otherwise be
available. These names could be almost as helpful as comments when you were
trying to understand decompiled source code.
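To make the push-and-pop model concrete, here is a minimal sketch of a stack machine in Java. The class name MiniStackMachine and the opcodes PUSH, ADD, and MUL are hypothetical and far simpler than real JVM bytecode; they only illustrate how each instruction takes its operands from a stack and leaves its result there.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MiniStackMachine {
    // Hypothetical opcodes for illustration; these are not real JVM opcode values.
    public static final int PUSH = 0; // followed by one operand: the value to push
    public static final int ADD = 1;  // pops two values, pushes their sum
    public static final int MUL = 2;  // pops two values, pushes their product

    /** Runs the program and returns the value left on top of the stack. */
    public static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // index of the next instruction
        while (pc < program.length) {
            int opcode = program[pc++];
            switch (opcode) {
                case PUSH:
                    stack.push(program[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (2 + 3) * 4 expressed as stack operations
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};
        System.out.println(run(program)); // prints 20
    }
}
```

Because every instruction states exactly what it consumes and produces, reading such a program back into an expression like (2 + 3) * 4 is mostly a matter of replaying the stack.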
The current design of the JVM is independent of the Java 2 Software Development
Kit (SDK). In other words, the language and libraries may change, but the JVM
and the opcodes are fixed. This means that if Java is prone to decompilation now,
then it is always likely to be prone to decompilation. In many cases, as you shall
see, decompiling a Java class is as easy as running a simple DOS or Unix command.
3. dec comes from cc, which used to be the standard command-line command for compiling
C programs, and still is, if like me you're IDE impaired.
In the future, the JVM may very well be changed to stop decompilation,
but this would break any backward compatibility and all current Java code
would have to be recompiled. And although this has happened before in the
Microsoft world with different versions of Visual Basic, a lot more companies
than Sun develop virtual machines.
JVMs are now available for almost every operating system and web browser. In
fact, Java applets and applications can run on any computer or chip from a
mainframe right down to a handheld or a smartcard as long as a JVM and appropriate
class libraries exist for that platform. So it's no longer as simple as changing one
JVM.
What makes this situation even more interesting is that companies that want
to Java-enable their operating system or browser usually create their own JVMs.
Sun is now only really responsible for the JVM specification. It seems that things
have now progressed so far that any fundamental changes to the JVM specification
would have to be backward compatible. Modifying the JVM to prevent decompila-
tion would require significant surgery, and in all probability, it would break this
backward compatibility, thus ensuring that Java classes will decompile for the fore-
seeable future.
It's true that no such compatibility restrictions exist on the Java SDK, where
more and more functionality is added almost daily. And the first crop of decom-
pilers did dramatically fail when inner classes were first introduced in the Java
Development Kit (JDK) 1.1. However, this isn't really a surprise because Mocha
was already a year out of date when 1.1 was released and other decompilers were
quickly modified to recognize inner classes.
So unlike other Java books, I don't expect that this book will go out of date
with the next release of the JDK. Sure, some extra features may be added, but the
underlying architecture will remain the same. Let's begin with a simple example
in Listing 1-1.
Listing 1-2 shows the output for a simple class file whose source is shown in
Listing 1-1 using javap, Sun's class file disassembler that came with the original
versions of Sun's JDK. You can decompile Java so easily because, as you'll see
later in the book, the JVM is a simple stack machine with no registers and a
limited number of high-level instructions or opcodes.
Method Casting()
0 aload_0
1 invokespecial #8 <Method java.lang.Object()>
4 return
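The three instructions above have the shape a Java compiler emits for a default constructor: load the this reference, call the superclass constructor, and return. As a minimal sketch, the hypothetical class below (a stand-in written for illustration, not the book's own listing) declares no constructor at all. Compiling it and running javap -c on the result should show a constructor with the same aload_0, invokespecial, return pattern, although the constant pool index (#8 in the listing) will likely differ.

```java
public class CastingSketch {
    // No constructor is declared here; the compiler synthesizes one that
    // simply calls java.lang.Object's constructor and returns.

    public static void main(String[] args) {
        // Reflection confirms that the synthesized constructor is present.
        System.out.println(CastingSketch.class.getDeclaredConstructors().length); // prints 1
    }
}
```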
have to go all the way back to ALGOL to find the earliest example of a
decompiler. Donnelly and Englander wrote D-Neliac at the Naval Electronic Labs (NEL)
in 1960. Its primary function was to convert non-Neliac compiled programs into
Neliac compatible binaries. Neliac was an ALGOL-type language that stood for
the Navy Electronics Laboratory International ALGOL Compiler.
Over the years, there have been other decompilers for COBOL, Ada, Fortran,
and many other esoteric as well as mainstream languages running on IBM
mainframes, PDP-11s, and Univacs, among others. Probably the main reason for these
early developments was to translate software or convert binaries to run on dif-
ferent hardware.
More recently, reverse engineering and the Y2K problem have become the
acceptable face of decompilation. Converting legacy code to get around the Y2K
problem often required disassembly or full decompilation. Reverse engineering
is a huge growth area that has not disappeared since the turn of the millennium.
Problems caused by the Dow Jones hitting the 10-thousand mark (ah, such fond
memories) and the introduction of the Euro have all caused financial programs
to fall over.
Even without these developments, reverse engineering techniques are being
used to analyze old code, which typically has thousands of incremental changes,
in order to remove any redundancies and convert these legacy systems into much
more efficient animals.
At a much more basic level, hexadecimal dumps of PC machine code have
always given programmers extra insight into how something is achieved or into
how to break any artificial restrictions placed on the software. Magazine CDs
were either time-bombed or had restricted copies of games; these could be patched
to change demonstration copies into full versions of the software using primitive
disassemblers such as the DOS debug command.
Anyone well versed in Assembler can learn to quickly spot patterns in code
and bypass the appropriate source code fragments. Pirate software is a huge
problem for the software industry; disassembling the code is just one technique
employed by the professional or amateur bootlegger. Hence the downfall of many
an arcane copy protection technique.
However, the DOS debug command and hexadecimal editors are primitive tools
and it would probably be quicker to write the code from scratch than to try to re-
create the source code from Assembler. For many years now, traditional software
companies have also been involved in reverse engineering software. They have
studied new techniques, and their competition has copied these techniques all
over the world using reverse engineering and decompilation tools. Generally, this
is accomplished using in-house decompilers, which are not for public consump-
tion and are definitely not going to be sold over the counter.
It's likely that the first real Java decompiler was actually written at IBM and
not by Hanpeter Van Vliet, author of Mocha. Daniel Ford's whitepaper Jive: A Java
Decompiler, dated May 1996, appears in IBM Research's search engines. This
whitepaper just beat Mocha, which wasn't announced until July 1996.
4. http://www.threedee.com/jcm/psystem/
Oddly enough for a technical subject, this book also has a very human element.
Hanpeter Van Vliet wrote the first public domain decompiler, Mocha, while
recovering from a cancer operation in the Netherlands. He also wrote an obfuscator
called Crema that attempted to protect an applet's source code. If Mocha was the
Uzi machine gun, then Crema was the bulletproof jacket. In a now classic Internet
marketing strategy, Mocha was free, whereas there was a small charge for Crema.
The beta version of Mocha caused a huge controversy when it was first made
available on Hanpeter's web site, especially after it was featured in a c|net article.
Because of the controversy, Hanpeter took the very honorable step of removing
Mocha from his web site. He then held a vote about whether or not Mocha should
once again be made available. The vote was ten to one in favor of Mocha, and
soon after it reappeared on Hanpeter's web site.
However, Mocha never made it out of beta, and while I was conducting some
research for a Web Techniques article on this very subject, I learned from his wife
that Hanpeter's throat cancer finally got him. He died at the age of 34 on New
Year's Eve, 1996.
The source code for both Crema and Mocha was sold to Borland shortly
before Hanpeter's death, with all proceeds going to Hanpeter's wife, Ingrid. Some
early versions of JBuilder shipped with an obfuscator, which was probably Crema.
This attempted to protect Java code from decompilation by replacing ASCII variable
names with control characters.
I'll talk more about the host of other Java decompilers and obfuscators later
in the book.
Legal Issues
Before you start building your own decompiler, why don't you take this opportunity
to consider the legal implications of decompiling someone else's code for your own
enjoyment or benefit? Just because Java has taken decompiling technology out of
some very serious propeller-head territory and into more mainstream computing
doesn't make it any less likely that you or your company will get sued. It may make
it more fun, but you really should be careful.
To start with, why don't you try following this small set of ground rules:
• Do not decompile an applet, recompile it, and then pass it off as your own.
• Don't even think of trying to sell a recompiled applet to any third parties.
Over the past few years, big business has tilted the law firmly in its favor
when it comes to decompiling software. Companies can use a number of legal
mechanisms to stop you from decompiling their software; these would leave you
with little or no legal defense if you ever had to appear in a court of law if
someone discovered that you had decompiled their programs. Patent law, copyright
law, anti-reverse engineering clauses in shrink-wrap licenses, as well as a number
of laws such as the Digital Millennium Copyright Act (DMCA) may all be used
against you. Different laws may apply in different countries or states; for
example, a "no reverse engineering" clause in a software license is null and void
in the European Union (EU). But the basic concepts are the same: decompile
a program for the purpose of cloning the code into another competitive product
and you're probably breaking the law.
The secret here is that you shouldn't be standing, kneeling, or pressing down
very hard on the legitimate rights (that is, the copyright rights) of the original
author. That's not to say that it is never OK to decompile.
Certain limited conditions do exist where the law actually favors
decompilation or reverse engineering through a concept known as fair use. From almost
the dawning of time, and certainly from the beginning of the industrial age, many
of humankind's greatest inventions have come from an individual who has created
something special while standing on the shoulders of giants. For example, both the
invention of the steam train and the common light bulb were relatively modest
incremental steps in technology. The underlying concepts were provided by other
people, and it was up to Stephenson or Edison to create the final object. You can
see an excellent example of Stephenson's debt to many other inventors such as
James Watt in the following timeline of the invention of Stephenson's Rocket at
http://www.usgennet.org/usa/topic/steam/timeline.html. This concept of standing
on the shoulders of giants is one of the reasons why patents first appeared: to
allow people to build on other creations while still giving the original inventor
some compensation for their initial idea for a period of, say, 20 years.
In the software arena, trade secrets are typically protected by copyright law
rather than through any patents. Sure, patents can protect certain elements of
a program, but it is highly unlikely that a complete program will be protected by
a patent or a series of patents. Software companies want to protect their invest-
ment, so they typically turn to copyright law or software licenses to prevent
people from essentially stealing their research and development efforts.
Copyright law is not rock solid; if it was, there would be no inducement to
patent an idea and the patent office would quickly go out of business. Copyright
protection does not extend to interfaces of computer programs, and a developer
can use the fair use defense if they can prove that they decompiled the program
to see how they could interoperate with any unpublished application
programming interfaces (APIs) in the program.
If you are living in the EU, then more than likely you work under the EU
Directive on Legal Protection of Computer Programs. This states that you can
decompile programs under certain restrictive circumstances, for example,
when you are trying to understand the functional requirements you need to
create a compatible interface to your own program. To put it another way,
if you need access to the internal calls of a third-party program and the authors
refuse to divulge the APIs at any price, then, under the EU directive, you could
decompile the code to discover the APIs. However, you'd have to make sure that
you were only going to use this information to create an interface to your own
program rather than create a competitive product. You also cannot reverse
engineer any areas that have been protected in any way.
For many years Microsoft's applications have allegedly gained unfair
advantage from underlying unpublished API calls to Windows 3.1 and Windows 95
that are orders of magnitude quicker than the published APIs. The Electronic
Frontier Foundation (EFF) has come up with a useful road map analogy to help
explain this. Say you are trying to travel from Detroit to New York, but your map
doesn't show any interstate routes. Sure, you'd eventually get there traveling on
the back roads, but the trip would be a lot shorter if you had the Microsoft map,
complete with interstates. If these conditions were true, the EU directive would
be grounds for disassembling Windows 2000 or Microsoft Office (MSOffice), but
you'd better hire a good lawyer before you try it. Personally, I don't buy it as I can't
believe MSOffice could possibly be any slower than it currently is, so if there are
any hidden APIs, they certainly don't seem to be causing any impact on the
speed of any of the MSOffice applications.
There are precedents that allow legal decompilation in the US too. The most
famous case to date is Sega v. Accolade.⁵ In 1992, Accolade won a case against
Sega in which the court ruled that Accolade's unauthorized disassembly of the
Sega object code was not copyright infringement. Accolade reverse engineered
Sega's binaries into an intermediate code that allowed them to extract a software
key. This key allowed Accolade's games to interact with Sega Genesis video
consoles. Obviously Sega was not going to give Accolade access to APIs, or in this
case, code, to unlock the Sega game platform. The court ruled in favor of
Accolade, judging that the reverse engineering constituted fair use. But before
you think this gives you carte blanche to decompile code, you might like to know
that Atari v. Nintendo⁶ went against Atari under very similar circumstances.
5. http://www.eff.org/Legal/Cases/sega_v_accolade_977f2d1510_decision.html
In conclusion (see, you can tell this is the legal section), the court cases in the
US and the EU directive stress that under certain circumstances reverse
engineering can be used to understand the interoperability and create a program interface.
It cannot be used to create a copy to sell as a competitive product. Most Java
decompilation will not fall into this interoperability category. It is far more likely
that the decompiler wants to pirate the code, or at best, understand the underlying
ideas and techniques behind the software.
It is not very clear if reverse engineering to discover how an applet was written
would constitute fair use. The US Copyright Act of 1976's exclusion of "any idea,
procedure, process, system, method of operation, concept, principle or discovery,
regardless of the form in which it is described" makes it sound like the beginning
of a defense for decompilation, and fear of the fair use clause is one of the reasons
why more and more software patents are being issued. Decompilation to pirate or
illegally sell the software cannot be defended.
However, from a developer's point of view, the situation looks bleak. The only
protection, in the form of a user's license, is about as useful as the laws against
copying music CDs or audiocassettes. It won't physically stop anyone from
making illegal copies and it doesn't act as any real deterrent for the home user. No
legal recourse will protect your code from a hacker, and it sometimes seems that
the people trying to create many of today's secure systems must feel like they are
standing on the shoulders of morons. You only have to look at the recent
investigation into eBook protection schemes⁷ and the whole DeCSS fiasco⁸ to see how
paper-thin a lot of the recent so-called secure systems really are.
Moral Issues
Decompiling Java is an excellent way to learn both the Java language and how
the JVM works. It helps people climb up the Java learning curve because they
learn by seeing other people's programming techniques. The ability to decompile
applets or applications can make the difference between a basic understanding of
Java and an in-depth knowledge. Learning by example is one of the most power-
ful tools. It helps even more if you can pick your own examples and modify them
to your own needs.
6. http://cyber.law.harvard.edu/openlaw/DVD/cases/atarivnintendo.html
7. http://slashdot.org/article.pl?sid=01/07/17/130226
8. http://cyber.law.harvard.edu/openlaw/DVD/resources.html
Protecting Yourself
Pirated software is a big headache for many software companies and big business
for others. At the very least, software pirates could use decompilers to remove any
licensing restrictions, but imagine the consequences if the technology was available
to decompile Office 2000, recompile it, and sell it as a new competitive product. To
a certain extent, that could easily have happened when Corel released the beta ver-
sion of Corel's Office for Java.
Perhaps this realization is starting to dawn on Java software houses. We are
beginning to see two price scales on Java components: one for the classes and
one for the source code. This is entirely speculative, but it seems that companies
such as Sitraka (now Quest) realized that a certain percentage of their users
would decompile their classes, and as a result, a few years ago Sitraka chose to
sell the source code for JClass as well as other components. This makes any
decompilation redundant as the code is provided along with the classes and it
also makes some money for the developer by charging a little extra for the
source code.
But is it all doom and gloom? Should you just resign yourself to the fact that
Java code can be decompiled, or is there anything you can do to protect your code?
Here are some options:
• License agreements
• Protection schemes in your code
• Code fingerprinting
• Obfuscation
• Intellectual Property Rights (IPR) protection schemes
• Executable applications
• Server-side code
• Encryption
Although you'll look at all these in more detail later, you should know that
the first four only act as deterrents and the last four are effective, but have other
implications. Let me explain.
License agreements don't offer any real protection from a programmer who
wants to decompile your code.
Spreading protection schemes throughout your code, such as by using combi-
nations of getCodeBase and getDocumentBase or server authentication, is useless
because they can be simply commented out of the decompiled code.
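As a rough sketch of why such checks are weak, consider a host check of the kind an applet might perform on the value returned by getCodeBase. The class name HostCheck, the method isAllowedHost, and the host www.example.com are all hypothetical. The whole scheme comes down to one boolean test, and anyone holding the decompiled source can delete that test, or make it return true, before recompiling.

```java
import java.net.URI;

public class HostCheck {
    // Hypothetical host the code is licensed to run from.
    private static final String LICENSED_HOST = "www.example.com";

    /** Returns true only when the code base URL points at the licensed host. */
    public static boolean isAllowedHost(String codeBaseUrl) {
        String host = URI.create(codeBaseUrl).getHost();
        return LICENSED_HOST.equalsIgnoreCase(host);
    }

    public static void main(String[] args) {
        // In an applet, the argument would come from getCodeBase().toString().
        System.out.println(isAllowedHost("http://www.example.com/applets/"));    // prints true
        System.out.println(isAllowedHost("http://mirror.example.org/applets/")); // prints false
    }
}
```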
Code fingerprinting is what happens when spurious code is used to watermark
or fingerprint source code, and it can be used in conjunction with license agree-
ments, but it is only really useful in a court of law. Better decompilation tools will
profile the code and remove any extra dummy code.
Obfuscation replaces the method names and variable names in a class file
with weird and wonderful names. This is an excellent deterrent, but the source
code is still visible, albeit with obfuscated names, when the better
decompilers are used, so often this is not much better than compiling without
the debug flag. HoseMocha, another obfuscator, works by adding a spurious
pop bytecode after every return; it does nothing to the code but it does kill the
decompiler. However, developers can quickly modify their decompiler once
this becomes apparent, assuming they're still around to make the changes.
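The renaming step can be sketched as a simple mapping from meaningful identifiers to short, meaningless ones. This is an illustrative toy under stated assumptions: the class RenamingSketch and its naming scheme (a, b, c, and so on) are hypothetical, and a real obfuscator rewrites the names stored inside the class file rather than a list of strings.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RenamingSketch {
    /** Maps each distinct identifier to a short generated name, in order of first appearance. */
    public static Map<String, String> buildRenameMap(List<String> identifiers) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (String identifier : identifiers) {
            if (!renames.containsKey(identifier)) {
                renames.put(identifier, shortName(renames.size()));
            }
        }
        return renames;
    }

    // Generates a, b, ..., z, aa, ab, ... from a zero-based index.
    private static String shortName(int index) {
        StringBuilder name = new StringBuilder();
        do {
            name.insert(0, (char) ('a' + index % 26));
            index = index / 26 - 1;
        } while (index >= 0);
        return name.toString();
    }

    public static void main(String[] args) {
        List<String> names = List.of("accountBalance", "computeInterest", "accountBalance");
        System.out.println(buildRenameMap(names)); // prints {accountBalance=a, computeInterest=b}
    }
}
```

The logic of the program is untouched by such a mapping; only the hints that the original names gave to a human reader are lost.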
IPR protection schemes such as IBM's Cryptolope Live!, InterTrust's DigiBox,
and Breaker Technologies' SoftSEAL are normally used to sell HTML documents
or audio files on some pay-per-view basis or pay-per-group scheme. However,
because they typically have built-in trusted HTML viewers, they allow Java applets
to be seen but not copied. Unfortunately IPR protection schemes are not cheap.
Worse still, some of the clients are written in 100 percent pure Java and can
therefore be decompiled.
The safest protection for Java applications is to compile them into executables.
This is an option on many Java compilers (SuperCede, for example). Your code will
now be as safe as any C or C++ executable (read: a lot safer), but it will no longer
be portable because it no longer uses the JVM.
The safest protection for applets is to hide all the interesting code on the
web server and only use the applet as a thin, front-end graphical user interface
(GUI). This has a downside; it may increase your web server load to unaccept-
able levels.
Several attempts have been made to encrypt a classfile's content and then
decrypt it in the classloader. Although at first glance this seems like an excellent
approach, sooner or later the classfile's bytecode has to be decrypted in order to
be executed by the JVM, at which point it can be intercepted and decompiled.
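A minimal sketch of this approach, and of its weak point, might look like the following. The class XorClassLoader is hypothetical, and the single-byte XOR stands in for a real cipher purely to keep the example short. Whatever cipher is used, the loader must hand plain bytecode to defineClass, and that is the moment at which the bytes can be captured.

```java
import java.util.Arrays;
import java.util.Map;

public class XorClassLoader extends ClassLoader {
    private final Map<String, byte[]> encryptedClasses; // class name -> encrypted bytes
    private final byte key;

    public XorClassLoader(Map<String, byte[]> encryptedClasses, byte key) {
        this.encryptedClasses = encryptedClasses;
        this.key = key;
    }

    /** XOR is its own inverse, so the same method encrypts and decrypts. */
    public static byte[] xor(byte[] data, byte key) {
        byte[] result = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = (byte) (data[i] ^ key);
        }
        return result;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] encrypted = encryptedClasses.get(name);
        if (encrypted == null) {
            throw new ClassNotFoundException(name);
        }
        // The plain bytecode exists in memory here, just before defineClass;
        // this is the point at which it can be intercepted and saved.
        byte[] plain = xor(encrypted, key);
        return defineClass(name, plain, 0, plain.length);
    }

    public static void main(String[] args) {
        byte[] magic = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        byte[] roundTrip = xor(xor(magic, (byte) 0x5A), (byte) 0x5A);
        System.out.println(Arrays.equals(magic, roundTrip)); // prints true
    }
}
```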
Book Outline
Decompiling Java is not a normal Java language book. In fact, it is the complete
opposite of a standard Java textbook where the author teaches you how to trans-
late ideas and concepts into Java. You're interested in turning the partially compiled
Java bytecodes back into source code so that you can see what the original pro-
grammer was thinking. I won't be covering the language structure in depth, except
where it relates to bytecodes and the JVM. All emphasis will be on Java's low-level
design rather than on the language syntax.
In the first part of this book, Chapters 2 through 4, I'll unravel the Java classfile
format and show you how your Java code is stored as bytecode and subsequently
executed by the JVM. You'll also look at the theory and practice of decompilation
and obfuscation. I'll present some of the decompiler's tricks of the trade and explain
how to unravel the Java bytecode of even the most awkward class. You'll look at the
different ways people try to protect the source code and, when appropriate, learn to
expose any flaws or underlying problems with the different techniques so that you'll
be suitably informed before you purchase any source code protection tools.
In the second part of this book, Chapters 5 and 6, I will primarily focus on how
to write your own Java decompiler. You'll build an extendable Java bytecode
decompiler. You'll do this for two reasons. First, although the JVM design is fixed,
the language is not. Many of the early decompilers cannot handle Java
constructs that appeared in the JDK 1.1, such as inner classes. Second, one of my
own personal pet peeves is reading a technical computer book that stops when
things are just getting interesting. The really difficult problems are then left to
the reader as an exercise. For some unknown reason, this seems to be particu-
larly true of Internet-related books. Partly as a reaction against that mentality,
I'm going to go into decompilers in some detail with plenty of practical examples
in hopefully as approachable a manner as possible.
And while we're on the subject of pet peeves-sorry, I'll try to keep them to
a minimum-I won't be covering a potted history of the Internet or indeed Java.
This has been covered too many times before. If you want to know about the
ARPANET and Oak, then I'm afraid you're going to have to look elsewhere.[9]
Conclusion
Java decompilation is one of the best learning tools for new Java programmers.
What better way to find out how to write code than by taking an example off the
Internet and decompiling it into source code? It's also a necessary tool when
some dotcom web developers have gone belly up and the only way to fix their
code is to decompile it yourself. But it's also a menace if you're trying to protect
the investment of countless hours of design and development.
The aim of this book is to create some dialog about decompilation and
source code protection. I also want to separate the fact from fiction and show
you how easy it is to decompile code and what measures you can take to protect
it. Both Sun and Microsoft will tell you that decompilation isn't an issue and that
a developer can always be trained to read a competitor's Assembler, but separate
the data from the instructions and this task becomes orders of magnitude easier.
Don't believe it? Then read on and decide for yourself.
9. Such as Core Java 2, 6th edition, by Cay S. Horstmann and Gary Cornell (Prentice Hall PTR,
2002).
CHAPTER 2
Ghost in the Machine
If you're trying to understand just how good an obfuscator or decompiler really
is, then it helps to be able to see what's going on inside a classfile. Otherwise you're
relying on the word of a third-party vendor or, at best, a knowledgeable reviewer.
For most people, that's not good enough when you're trying to protect
mission-critical code. At the very least, you should be able to talk intelligently about the
area and ask the obvious questions to understand just what's happening.
At this moment, all sorts of noises are coming out of Microsoft in Redmond
saying that there really isn't anything to worry about when it comes to decompil-
ing .NET code. Sure, hasn't everyone been doing it for years at the Assembly level?
Similar noises were made when Java was in its infancy.
So, in this chapter, you'll be pulling apart a Java classfile to lay the founda-
tion for the following chapters on obfuscation theory and to help you during
the design of your decompiler. In order to get to that stage, you need to under-
stand bytecodes, opcodes, classfiles, and how they relate to the Java Virtual
Machine (JVM).
Several very good books are on the market about the JVM. The best is Bill
Venners's Inside the Java Virtual Machine (McGraw-Hill, 1998). Some of the
book's chapters are available online at http://www.artima.com/insidejvm/ed2/.
If you can't find this book, then check out Venners's equally excellent "Under
the Hood" articles in JavaWorld. This series of articles was the original material
on which the book was based. Sun's Java Virtual Machine Specification
(2nd Edition), written by Tim Lindholm and Frank Yellin, is both comprehen-
sive and very informative for would-be decompiler writers. But because it is
a specification, it is not what you would call a good read. This book is available
online at http://java.sun.com/docs/books/vmspec or you can purchase it
(Addison-Wesley, 1999).
Oddly enough, I've yet to see a book that covers how to build a JVM; every
book published so far focuses on the abstract JVM rather than how someone
would implement one. With the rise of alternative JVMs from IBM and others,
I really expected to see at least one JVM book full of C code for converting byte-
code to executable native code, but it never came. Perhaps this is because it
would have a very limited audience and its sales would be in the hundreds rather
than the thousands.
However, my focus is very different from other JVM books. You could say I'm
approaching things from the completely opposite direction. Your task is to get
from bytecode to source, whereas everyone else wants to know how source is
translated into bytecode and ultimately executed. You should be much more
interested in how a classfile can be turned into source rather than how a classfile
is interpreted.
In this chapter, you'll be looking at how a classfile can be disassembled into
bytecodes and how these bytecodes can be turned into source. Of course, you
need to know how each bytecode functions, but you should be less interested in
what happens to them when they are within the JVM, and my emphasis will differ
accordingly.
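As a first taste of pulling a classfile apart, here is a minimal sketch that reads only the opening eight bytes of a classfile: the magic number followed by the minor and major version numbers. The class name `ClassHeader` and the method `readVersion` are hypothetical names chosen for illustration, and the sample bytes in `main` are built by hand rather than taken from a real compiled class.

```java
import java.nio.ByteBuffer;

public class ClassHeader {
    /** Magic number that opens every valid classfile. */
    static final int MAGIC = 0xCAFEBABE;

    /**
     * Reads the first eight bytes of a classfile image.
     * Returns {minor_version, major_version}; rejects anything
     * that does not start with the classfile magic number.
     */
    public static int[] readVersion(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the classfile format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();                 // u4 magic
        if (magic != MAGIC) {
            throw new IllegalArgumentException(
                "Not a classfile: magic = 0x" + Integer.toHexString(magic));
        }
        int minor = buf.getShort() & 0xFFFF;      // u2 minor_version (unsigned)
        int major = buf.getShort() & 0xFFFF;      // u2 major_version (unsigned)
        return new int[] { minor, major };
    }

    public static void main(String[] args) {
        // Hand-built header (an assumption for the demo): CAFEBABE, minor 0,
        // major 46, which corresponds to the JDK 1.2 classfile format.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 46
        };
        int[] version = readVersion(header);
        System.out.println("minor=" + version[0] + " major=" + version[1]);
        // prints "minor=0 major=46"
    }
}
```

To try it on a real class, you could pass in the bytes of any compiled `.class` file instead of the hand-built array; everything after these eight bytes is left unread by this sketch.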
architecture, as well as the high-level nature of many of its instructions, all conspire
against the programmer in favor of the decompiler.
At this point, it is probably also worth mentioning the fragile superclass problem.
When a new method is added to a class in C++, all classes that reference that class
need to be recompiled. Java gets around this by putting all the necessary symbolic
information into the classfile. The JVM then takes care of all the linking and final
name resolution, loading all the required classes (including any externally
referenced fields and methods) on the fly. This delayed linking or dynamic loading,
possibly more than anything else, is why Java is so much more prone to decompilation.
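One way to see how much symbolic information survives compilation is to ask a loaded class for its own member names through reflection. The sketch below is only an illustration: `SymbolicNames`, `Account`, and `memberNames` are hypothetical names, and reflection is used here as a convenient window onto the names stored in the classfile, not as a decompilation technique in itself.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SymbolicNames {
    /** Hypothetical sample class; its member names end up in the classfile. */
    public static class Account {
        private long balanceInCents;

        long deposit(long amountInCents) {
            balanceInCents += amountInCents;
            return balanceInCents;
        }
    }

    /** Lists the declared field and method names of a class, sorted. */
    public static List<String> memberNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic()) {            // skip compiler-generated members
                names.add("field " + field.getName());
            }
        }
        for (Method method : type.getDeclaredMethods()) {
            if (!method.isSynthetic()) {
                names.add("method " + method.getName());
            }
        }
        Collections.sort(names);                   // reflection order is unspecified
        return names;
    }

    public static void main(String[] args) {
        for (String name : memberNames(Account.class)) {
            System.out.println(name);
        }
    }
}
```

The original identifiers `balanceInCents` and `deposit` come back intact at runtime, which is the same information a decompiler reads straight out of the classfile.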
By the way, I'm going to ignore native methods in these discussions. Native
methods, of course, are methods implemented in native C or C++ code that is
incorporated into the application. This spoils Java application portability, but it is
one surefire way to prevent a Java program from being decompiled.
So without further ado, let's take a brief look at the design of the JVM.
• Heap
• Method area
• Stack
Every application or applet has its own heap and method area and every thread
has its own register or program counter and program stack. Each program stack is
then further subdivided into stack frames, with each method having its own stack
frame. That's a lot of information for one paragraph, so in Figure 2-1, I illustrate this
in a simple diagram.
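The idea that each thread owns its own program stack, with one frame per method call, can be observed from ordinary Java code. The following is a rough sketch, assuming a JVM that reports every frame in `Thread.getStackTrace()` (the API permits a JVM to omit frames, although typical HotSpot builds do not); `StackFrames` and `frameCount` are hypothetical names.

```java
public class StackFrames {
    /**
     * Calls itself extraCalls times, then reports how many frames
     * are currently on the calling thread's stack.
     */
    public static int frameCount(int extraCalls) {
        if (extraCalls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return frameCount(extraCalls - 1);         // each call pushes one more frame
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = new Runnable() {
            public void run() {
                int shallow = frameCount(0);
                int deep = frameCount(5);
                // Each thread measures its own stack, independently of the other.
                System.out.println(Thread.currentThread().getName()
                        + ": " + (deep - shallow) + " extra frames");
            }
        };
        Thread first = new Thread(task, "worker-a");
        Thread second = new Thread(task, "worker-b");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

Under that assumption, each worker should report five extra frames, one per additional method call, and the two lines may appear in either order because the threads run independently.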