Etymon: It's not as scary as it sounds.

Thursday, July 29, 2004

Greg Black comments that he took a look at Joe Armstrong's thesis, which I linked to below. In case his description makes it sound intimidating: the discussion of the error-handling philosophy, "let it crash", is a single section of chapter 4 (about 3 pages). Of course, handling errors is only one step towards a reliable system.

In fact, the chapters of the thesis are largely approachable independently of each other. Chapters 2 and 4 (Architecture and Programming Principles) are particularly good in this regard.

In the meantime, for those who are feeling too lazy to read the actual PDF, an executive summary:

We don't know how to write bug-free programs
So every substantial program will have bugs
Even if we are lucky enough never to trigger our own bugs, unexpected interactions with the outside world (including the hardware we are running on) will cause periodic failures in any long-running process
So make sure any faults that do occur can't interfere with the execution of your program
Faults are nasty, subtle, vicious creatures with thousands of non-deterministic side-effects to compensate for
So the only safe way to handle a bug is to terminate the buggy process
So don't program defensively: Just let it crash, and make sure
1. Your runtime provides adequate logging/tracing/hot-upgrade support to detect/debug/repair the running system
2. You run multiple levels of supervisor/watchdog all the way from supervisor trees to automatic, hot-failover hardware clusters
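
For anyone who wants to see the shape of the idea in code, here is a rough sketch in Python rather than Erlang. It is not taken from the thesis: `parse_reading`, `worker`, and `supervise` are hypothetical names invented for illustration, and an ordinary function call stands in for a real isolated process. The worker contains no defensive code at all, and a single supervisor logs each crash, restarts the worker from fresh input, and escalates once it runs out of restarts.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("supervisor")


def parse_reading(raw):
    # No defensive checks: bad input raises and the worker "crashes".
    return int(raw)


def worker(inputs):
    """Hypothetical worker with no try/except of its own."""
    return sum(parse_reading(raw) for raw in inputs)


def supervise(task, make_args, max_restarts=3):
    """Run task; on a crash, log it and restart with fresh arguments.

    Assumption: a function call stands in for a separate process.
    After max_restarts crashes the exception is re-raised, which hands
    the failure to whatever supervises this supervisor.
    """
    restarts = 0
    while True:
        try:
            return task(*make_args())
        except Exception:
            log.exception("worker crashed after %d restart(s)", restarts)
            if restarts == max_restarts:
                raise  # escalate to the next level up
            restarts += 1


# The first batch contains garbage, so the first run crashes;
# the restarted worker gets a clean batch and succeeds.
batches = iter([["1", "oops"], ["1", "2"]])
result = supervise(worker, lambda: (next(batches),))
```

A real system would run each worker in its own process, so that a crash cannot corrupt anyone else's state, and would stack supervisors in a tree. The sketch only shows the division of labour: workers do the work, supervisors do the error handling.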

Simple really ;)
