Here’s my ten-word review of Joshua Bloch’s Effective Java: If You Write Java, You Need To Read This Book.
I wanted to say that up front, because I’m about to talk at great length about one of its shortcomings, which could give the impression, purely by word count, that I don’t think it’s a good book. This is just a reflection of how many more words it takes to explain a negative point than a positive one. The book is irrefutably useful and you should read it (if you write Java).
To give the issue a bit of context, here’s a fact that I was only dimly aware of before doing some mid-reading research:
If you get the security settings right, the Java Virtual Machine guarantees access control.
If you declare a field private, run on a trusted JVM, and set the security manager to disallow JNI, reflective changes to accessibility, and a handful of other things, then you can run untrusted third-party code in the same process, and even pass your objects to it, and that code will never be able to get at the private field.
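To make the guarantee concrete, here is a minimal sketch of what it is guarding against. The class and field names (Account, secret, ReflectionDemo) are made up for illustration, and the sketch assumes a default JVM configuration with no restrictive security policy and no module boundary in the way. The setAccessible call is the step a locked-down security manager was meant to refuse. Newer JDKs have deprecated the security manager, so treat the policy side as background rather than something this example exercises.

```java
import java.lang.reflect.Field;

// Hypothetical class for illustration; not taken from the book.
class Account {
    private final String secret;

    Account(String secret) {
        this.secret = secret;
    }
}

class ReflectionDemo {
    // Reads Account's private field through reflection.
    // This works only when nothing (security policy, module boundary)
    // blocks the setAccessible call.
    static String peek(Account account) {
        try {
            Field field = Account.class.getDeclaredField("secret");
            field.setAccessible(true); // the call a locked-down JVM would refuse
            return (String) field.get(account);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection was blocked", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(peek(new Account("hunter2"))); // prints hunter2
    }
}
```

In other words, private on its own is an encapsulation tool; it only becomes a security boundary once the platform also refuses calls like this one.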
This is such a big deal that I don’t understand why it isn’t a bigger deal. It’d be impossible to make this kind of claim in a natively compiled language, because you can always pointer-arithmetic your way to the private data. Python makes only the vaguest of gestures in the direction of information hiding, and certainly doesn’t guarantee it. Maybe C# and the rest of the .NET family make some guarantees; I’m not sure (although whether you’d trust Microsoft to get the security right is a different story, and the same question should be asked of Sun, now Oracle).
Here’s the issue though. This means that access control in Java actually has two different (although overlapping) purposes: encapsulation, and security. And they’re not the same thing.
Encapsulation is about reducing the complexity and increasing the abstraction of classes by hiding their implementation. Security is about stopping people from seeing or changing data that they shouldn’t be able to. Security is for protecting users from malicious programmers; encapsulation is for protecting programmers from themselves.
Security at this level isn’t always possible. If you’re writing a library for other people to use in a JVM that they control, you can’t expect to hide anything from them – they can turn the security manager way down, or change your bytecode, or run a modified JVM. (At university we had an assignment that involved black-box testing of algorithms that we were given as obfuscated JARs. I worked out that you could peek at some of the internals by rebuilding the Java standard libraries with String declared non-final, and passing in a subclass with some instrumentation added. And no, I didn’t actually do it.)
Most times that someone compromises their own system, they can only do damage to themselves. Of course, you probably still want to make your library as tight as possible so that it isn’t a security hole for other code that it interacts with. The point is that it can be dangerous, or at least an unnecessary programming burden, to rely on language guarantees for security if you’re not always going to have control of the platform.
On the other side, encapsulation isn’t always desirable. Well, maybe it is. There’s heated debate about this. Some people (many of them Java devotees) argue that programmers will find and use every available undocumented feature, and anything you inadvertently expose will doom you to support it forever. Others (e.g. a high proportion of Python fans) say that an API is as much a social contract as a technical one, and that if someone wants to work around an interface that doesn’t meet their needs then, well, we’re all adults, and they’re welcome to do so as long as they accept the consequences if the implementation changes.
The point is that encapsulation and security are different requirements. Making a field private because it’s an implementation detail is one decision; making a field private because using it would open a security hole is a very different decision. If you try to squeeze the concepts into one then at some point you’re going to make a poor decision. And now we get to my one (and minor) gripe with Effective Java: it doesn’t do enough to distinguish between them.
Some of Bloch’s points (e.g. Item 10: Always override toString) are clearly about programmer-friendly abstractions. Other points (Item 76: Write readObject methods defensively) are clearly about security – no programmer would (or could, reliably) exploit it just to get around API restrictions. In one place (Item 39: Make defensive copies when needed) he mentions that a particular security measure has a big enough performance hit that it can be valid to leave it open, if it’s in an environment where misuse will only hurt the (mis)user. But in other places it’s not so clear exactly what kind of advice he’s giving, which could lead readers to apply the advice in the wrong way.
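The defensive-copy advice is a handy place to see both motives in one piece of code. What follows is a rough sketch, not Bloch’s own listing: the names Period and PeriodDemo are hypothetical here, and the sketch assumes the mutable type being guarded is java.util.Date. The copies protect an invariant (start is never after end) from anyone holding a reference to the original arguments, whether that someone is a careless colleague or hostile code.

```java
import java.util.Date;

// Hypothetical class for illustration.
final class Period {
    private final Date start;
    private final Date end;

    Period(Date start, Date end) {
        // Copy first, then validate the copies, so the caller cannot
        // change the dates between the check and the assignment.
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.after(this.end)) {
            throw new IllegalArgumentException("start is after end");
        }
    }

    // Return copies so callers cannot mutate the internal state.
    Date start() {
        return new Date(start.getTime());
    }

    Date end() {
        return new Date(end.getTime());
    }
}

class PeriodDemo {
    public static void main(String[] args) {
        Date start = new Date(0L);
        Date end = new Date(1000L);
        Period period = new Period(start, end);

        end.setTime(-1L);          // mutate the caller's object after construction
        period.end().setTime(-1L); // mutate the copy handed back by the accessor

        System.out.println(period.end().getTime()); // prints 1000
    }
}
```

Each copy costs an allocation, which is the trade-off mentioned above: inside a package where every caller is trusted, skipping the copies gives up the security property and keeps only the convention.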
Part of this might be that Bloch is writing from the perspective of someone who worked on the Java platform APIs, which sit in the part of the Venn diagram where encapsulation and security do overlap: they’re widely used, so any leaky implementation details are guaranteed to become a compatibility issue; and they’re the basis for every other API (even a trivial class extends Object) and available to malicious code even on a trusted JVM, which effectively makes them part of the platform’s security guarantee. And I suppose you could argue (indeed, he says something similar to this in the introduction) that you don’t always know where your code will end up, so aiming for as much encapsulecurity as possible isn’t a bad thing.
And frankly that’s a pretty good argument. Which is why you still need to read this book.
(A few other things bugged me while reading it, but most of them were directed at Java rather than the book itself. There might still be another post or two on this topic.)