Thursday, November 25, 2004

Well worth the time.

Went back and watched two of the Little Languages 2 Workshop presentations: Joe Armstrong's "Concurrency Oriented Programming" and Todd Proebsting's "Disruptive Programming Language Technologies". I highly recommend both talks to anyone who is the slightest bit interested in programming languages, or has any serious involvement in development. It is just unfortunate that they are only available as RealVideo. Still, if you have a RealVideo player installed, they are both well worth the time.

Note these are links to the two sessions of the workshop. It's convenient that both my favourite talks from this workshop happen to be first in each session.

Tuesday, November 23, 2004

The Theory of Classification.

Andrew Newman came across this set of articles on OO theory. I stumbled across them about a year ago, read the first few, changed jobs, and promptly forgot to go find them again. I remember them as well worth the time to read.

This is the first article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. The object-oriented notion of classification has for long been a fascinating issue for type theory, chiefly because no other programming paradigm has so sought to establish systematic sets of relationships between all of its types. Over the series, we shall seek to find answers to questions such as: What is the difference between a type and a class? What do we mean by the notion of plug-in compatibility? What is the difference between subtyping and subclassing? Can we explain inheritance, method combination and template instantiation? Along the way, we shall survey a number of different historical approaches, such as subtyping, F-bounds, matching and state machines and seek to show how these models explain the differences in the behaviour of popular object-oriented languages such as Java, C++, Smalltalk and Eiffel. The series is titled "The Theory of Classification", because we believe that all of these concepts can be united in a single theoretical model, demonstrating that the object-oriented notion of class is a first-class mathematical concept!
Anyway, the links:
"The Theory of Classification"
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15.

Monday, November 15, 2004

Links found.

Just noting that I have updated the Arrows 'n Robots summary with links to citeseer and Wan's thesis page on the Yale-group pages. Also wanted to dump the URL of the Yale group here so next time I can find it more easily.

http://haskell.cs.yale.edu/yale/
http://haskell.cs.yale.edu/yale/publications.html

Category Theory

I have had a few people ask me to explain the utility and significance of category theory. This proves rather difficult given I am a complete novice in this area myself, desperately trying to keep my head above water in the simplest of papers. Still, I found this paragraph useful, and I decided to post it here so I can find it again.

Category theory unifies mathematical structures in a second, perhaps even more important, manner. Once a type of structure has been defined, it quickly becomes imperative to determine how new structures can be constructed out of the given one and how given structures can be decomposed into more elementary substructures. For instance, given two sets A and B, set theory allows us to construct their cartesian product A X B. For an example of the second sort, given a finite abelian group, it can be decomposed into a product of some of its subgroups. In both cases, it is necessary to know how structures of a certain kind combine. The nature of these combinations might appear to be considerably different when looked at from too close. Category theory reveals that many of these constructions are in fact special cases of objects in a category with what is called a "universal property". Indeed, from a categorical point of view, a set-theoretical cartesian product, a direct product of groups, a direct product of abelian groups, a product of topological spaces and a conjunction of propositions in a deductive system are all instances of a categorical concept: the categorical product. What characterizes the latter is a universal property. Formally, a product for two objects a and b in a category C is an object c of C together with two morphisms, called the projections, p: c → a and q: c → b such that, and this is the universal property, for all object d with morphisms f: d → a and g: d → b, there is a unique morphism h: d → c such that p o h = f and q o h = g. Notice that we have defined a product for a and b and not the product for a and b. Indeed, products and, in fact, every object with a universal property, are defined up to (a unique) isomorphism. Thus, in category theory, the nature of the elements constituting a certain construction is irrelevant. What matters is the way an object is related to the other objects of the category, that is, the morphisms going in and the morphisms going out, or, put differently, how certain structures can be mapped into it and how it can map its structure into other structures of the same kind.
Jean-Pierre Marquis, Stanford Encyclopedia of Philosophy: Category Theory
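
To make the universal property concrete, here is a tiny Haskell sketch (mine, not from the article): pairs are the categorical product, fst and snd are the projections p and q, and fork builds the unique mediating morphism h.

-- Pairs as the categorical product: fst and snd are the projections,
-- and fork f g is the unique h with fst . h = f and snd . h = g.
fork :: (d -> a) -> (d -> b) -> d -> (a, b)
fork f g x = (f x, g x)

main :: IO ()
main = print (fork (+ 1) (* 2) (3 :: Int))   -- prints (4,6)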

Friday, November 12, 2004

Arrows, Robots, and Functional Reactive Programming --- Summary

First some background reading from the bibliography I'll have to read eventually:

  • John Hughes. Generalising monads to arrows. Science of Computer Programming, 37:67-111, May 2000.
  • Ross Paterson. A Notation for Arrow-based Libraries. In ICFP'01: International Conference on Functional Programming, pages 229-240, Firenze, Italy, 2001.
  • Zhanyong Wan. Functional Reactive Programming for Real-Time Embedded Systems. PhD thesis, Department of Computer Science, Yale University, December 2002.
  • Zhanyong Wan, Walid Taha, and Paul Hudak. Real-time FRP. In ICFP'01: International Conference on Functional Programming, Firenze, Italy, September 2001. ACM Press.
  • Zhanyong Wan, Walid Taha, and Paul Hudak. Event-driven FRP. In Proceedings of the Fourth International Symposium on Practical Aspects of Declarative Languages. ACM, Jan 2002.

Naturally, when you are using FRP the core concept is that of the signal. So we define a signal to have the type Signal a = Time -> a, i.e. a function of Time returning a value of type 'a'; s(t). The robot used throughout the paper is a simple differential drive with a handful of basic sensors. So given two signals vr(t) and vl(t), i.e. vr :: Signal Double and vl :: Signal Double, which are the velocities of the right and left wheels respectively, we can write a reactive program for calculating the robot's position:

x     = (1 / 2) * integral ((vr + vl) * cos theta)
y     = (1 / 2) * integral ((vr + vl) * sin theta)
theta = (1 / l) * integral (vr - vl)
Of course the nice thing about this program is that it is a direct transliteration of the physics equations governing the motion of such a robot! This is why FRP is important.

Such programs are clear, concise, correct, and unfortunately completely unworkable in the general case. The problem is that while in the small (with simple toy examples) you couldn't get better than this, larger programs tend to lead to opaque space and time management, with resultant space and time leaks. The focus of this paper is how to retain most of the legibility and good domain match of FRP while preventing leaks. The FRP DSL developed in this paper is called Yampa, and it solves the leak problem by disallowing signals as first-class values. Instead signals are manipulated indirectly via a set of combinators called signal transformers or signal functions.

SF a b = Signal a -> Signal b
This of course is very similar to the Haskell IO or State monads, and reminiscent of Hutton's Monadic Parser Combinators. On the other hand, as presaged by the title, the authors use Arrows rather than Monads as the basis for their combinator library.
A good analogy for this idea is a state or IO monad, where the state is hidden, and a program consists of a linear sequencing of actions that are eventually run by an interpreter or the operating system. But in fact arrows are more general than monads, and in particular the composition of signal functions does not have to be completely linear
Specifically, Arrow contains a fixed-point combinator, loop, which provides access to recursion.

The basic combinators include:

lift function: arr :: ( a -> b ) -> SF a b
Takes a normal function from type 'a' to 'b', and returns a signal-function/transformer that will convert a signal of type 'Signal a' to 'Signal b'.
compose: (>>>) :: SF a b -> SF b c -> SF a c
Takes two signal functions and returns the composition of them
tupling: (&&&) :: SF a b -> SF a c -> SF a (b, c)
This permits you to take one signal and apply two functions to it simultaneously, producing two signals both independently derived from the input. And of course, once you have tupled outputs....
tuple-transform: (***) :: SF a b -> SF c d -> SF (a, c) (b, d)
Takes two signal functions and returns a transformer that applies them to the respective signals in a tupled signal.
Now arr, (>>>), and (&&&) form a minimal universal set. However, to fit the Arrow framework defined by Hughes, we use the original minimal set: arr, (>>>), and first (which takes only one signal function, applies it to the first signal of the tuple, and passes the second through unchanged).

Some standard combinators provided:

  • identity :: SF a a
  • constant :: b -> SF a b
  • time :: SF a Time
Also the standard arrow combinators (remember, when thinking of FRP, just substitute 'SF' for 'a' in the type-sigs):
  • arr :: Arrow a => (b -> c) -> a b c
  • arr2 :: Arrow a => (b -> c -> d) -> a (b, c) d
  • (>>>) :: Arrow a => a b c -> a c d -> a b d
  • (<<<) :: Arrow a => a c d -> a b c -> a b d
  • first :: Arrow a => a b c -> a (b, d) (c, d)
  • second :: Arrow a => a b c -> a (d, b) (d, c)
  • (***) :: Arrow a => a b c -> a b' c' -> a (b, b') (c, c')
  • (&&&) :: Arrow a => a b c -> a b c' -> a b (c, c')
  • loop :: Arrow a => a (b, d) (c, d) -> a b c
Note that with loop, 'd' provides a channel for feedback, and hence recursion. Notice also the strong streamish feel to the API. I would like to see an Arrow-based treatment of the unix-pipe concept; I think it could result in a powerful scripting framework. Also worth noting that some signal functions are stateful (integral is one example), and as such cannot be defined without access to the underlying signals (forbidden). Hence they must either be pre-defined, or defined as a combination of pre-defined stateful signal functions.
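
As a rough sketch of that streamish idea (a toy of my own, not Yampa's implementation, which is a continuous-time abstraction), here is the classic list-based stream-function arrow, where a signal is approximated by its list of samples:

import Control.Arrow
import Control.Category
import Prelude hiding (id, (.))

-- A toy stream-function arrow: a "signal" is a list of samples.
newtype StreamF a b = StreamF { runStreamF :: [a] -> [b] }

instance Category StreamF where
  id                    = StreamF (\xs -> xs)
  StreamF g . StreamF f = StreamF (\xs -> g (f xs))

instance Arrow StreamF where
  arr f             = StreamF (map f)        -- pointwise lifting
  first (StreamF f) = StreamF (\ps ->        -- assumes f is length-preserving
    let (xs, ys) = unzip ps in zip (f xs) ys)

-- e.g. runStreamF (arr (+ 1) >>> arr (* 2)) [1, 2, 3]  ==  [4, 6, 8]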

So now convert the original FRP expression for x(t) to raw arrow-form:

x = (1/2) * integral ((vr + vl) * cos theta)

given: vrSF :: SF RobotInput Velocity  (SF that extracts vr from a RobotInput)
   and vlSF :: SF RobotInput Velocity  (SF that extracts vl from a RobotInput)
   and thetaSF :: SF RobotInput Angle

xSF :: SF RobotInput Distance   -- Signal Function transforming RobotInput -> Distance
xSF = (((vrSF &&& vlSF) >>> arr2 (+)) &&& (thetaSF >>> arr cos)) >>> arr2 (*) >>> integral >>> arr (/2)
  Or, slightly more legibly:
xSF = let v = (vrSF &&& vlSF) >>> arr2 (+)
          t = thetaSF >>> arr cos
      in (v &&& t) >>> arr2 (*) >>> integral >>> arr (/2)
  which isn't all that much better.
So what we have gained above is the ability to mediate access to signals behind a reasonably safe, yet complete, abstraction. However the loss in clarity is striking. Fortunately there is a proposed arrow syntax that is similar to Haskell's monadic do-syntax.
xSF' :: SF RobotInput Distance
xSF' = proc inp -> do
  vr    <- vrSF     -< inp
  vl    <- vlSF     -< inp
  theta <- thetaSF  -< inp
  i     <- integral -< (vr + vl) * cos theta
  returnA -< (i / 2)
Which, while not as nice as the original FRP, has the triple advantage of being declarative, legible, and safe!
It is important to note that the arrow syntax allows one to get a handle on a signal's values (or samples), but not on the signals themselves. In other words, first recalling that a signal function SF a b can be thought of as a type Signal a -> Signal b, which in turn can be thought of as type (Time -> a) -> (Time -> b), the syntax allows getting a handle on values of type a and b, but not on values of type Time -> a or Time -> b.
Although I can't reproduce it here, Fig 2 is dramatic. The correspondence between the arrow-based signal functions and the signal-flow diagrams shown is striking, and exciting. I have been excited about the potential of FRP to liberate UI programming from the tyranny of the state-machine; Fig 2 amazed me with the clarity available. I am now, more than ever, convinced that I am on the right track.

Events diverge from Signals slightly in that while they are also temporal values, events don't so much change from one value to another over time as simply not exist between 'impulses' (in the signal-processing sense). In fact there is an isomorphism between impulse-streams vs. continuous-signals and event-streams vs. signals. In previous papers the authors accepted this distinction as significant, and the two were treated separately. However, by noting an isomorphism between Event a and Maybe a, they are able to define a signal of type Signal (Event b), i.e. an event stream, which becomes a signal that can be thought of as alternating between Nothing and Just a. A signal function of type SF a (Event b) generates an event stream, and is called an event source.
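
A minimal sketch of that correspondence (the constructor names here are my guesses, not necessarily the paper's):

-- Event a is isomorphic to Maybe a:
data Event a = NoEvent | Event a

toMaybe :: Event a -> Maybe a
toMaybe NoEvent   = Nothing
toMaybe (Event a) = Just a

fromMaybe' :: Maybe a -> Event a
fromMaybe' Nothing  = NoEvent
fromMaybe' (Just a) = Event a

-- An event stream is then a Signal (Event b), and an event source
-- is an SF a (Event b).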

There are a number of combinators for use with event-streams. I'm just going to list them with minimal discussion, for more than that see the paper.

switch, dSwitch :: SF a (b, Event c) -> (c -> SF a b) -> SF a b  
        -- switch or delayed-switch.  Note the function from the event payload to a new SF a b.

rSwitch, drSwitch :: SF a b -> SF (a, Event (SF a b)) b 
        -- note event is defined to carry an SF (possibly inserted by tag), this allows for recurring switch.

tag :: Event a -> b -> Event b

mergeBy :: (a -> a -> a) -> Event a -> Event a -> Event a
lMerge, rMerge :: Event a -> Event a -> Event a
  -- Different ways of handling event collision.

-- Useful event-sources:

edge :: SF Bool (Event ())
  -- Obtain an edge-triggered event-stream from a (Signal Bool)

never :: SF a (Event b)
now :: b -> SF a (Event b)
after :: Time -> b -> SF a (Event b)
repeatedly :: Time -> b -> SF a (Event b)

-- Useful event-stream transform back to signal space:

hold :: a -> SF (Event a) a
accum :: a -> SF (Event (a -> a)) (Event a)

accumHold :: a -> SF (Event (a -> a)) a
accumHold init = accum init >>> hold init
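
To give a feel for the semantics of hold, here is a toy sampled-stream model (mine, not Yampa's implementation):

data Event a = NoEvent | Event a   -- as in the earlier sketch

-- hold: remember the last event's payload between occurrences.
holdSamples :: a -> [Event a] -> [a]
holdSamples _ []             = []
holdSamples a (NoEvent : es) = a : holdSamples a es
holdSamples _ (Event a : es) = a : holdSamples a es

-- e.g. holdSamples 0 [NoEvent, Event 5, NoEvent, Event 7] == [0, 5, 5, 7]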

Finally, just to show how it all goes together, the paper implements a few robot controllers. First you implement your robot's controller as a signal function from RobotInput to RobotOutput: an FRP mapping from the robot's sensor signals to signals which the signal-function processor will apply to the robot's actuators. This controller is actually parameterised by the robot's properties, but that isn't surprising:

type RobotController = RobotProperties -> SF RobotInput RobotOutput
Finally the FRP engine is implemented as a function from a controller to IO (). In the case of the robot simulator implemented for the paper, the engine also takes a second controller and a world description.
runSim :: Maybe WorldTemplate -> RobotController -> RobotController -> IO ()
Of course, as it returns the IO monad, this can be trivially instantiated as a Haskell main function.

All in all a fascinating paper --- well worth the time to read.

Thursday, November 11, 2004

Arrows, Robots, and Functional Reactive Programming --- Prelude

Now this is a fun paper. I mean, Category Theory, FRP, and you get to play with Robots! What more could you want? So let's take these three elements individually, and then I will try and summarise the paper.

Category Theory
My understanding of Category Theory is as a systematic treatment of abstraction and generalisation. It allows us to take collections of mathematical proofs and theorems that have the same structure/form, abstract them into a categorical notation, and then generalise them by providing mappings from the categorical constructs to concepts in specific domains. This is naturally of significant interest to type-theorists, who treat programs as proofs and want access to the power of parametric/polymorphic types and functions without sacrificing rigour. Arrows are a concept from category theory, a generalisation of Monads which, beyond just lifting sequencing to a first-class object, allows parameterised composition of sequencing. This gives us the ability to support the more complex temporal sequencing of operations required by...
Functional Reactive Programming
Or, as that is a bit of a mouthful, FRP. It is one of the most intuitive and widespread programming paradigms in use today: an approach to programming that has been made so unobtrusive that four of the five other members of my family use it (in the form of a spreadsheet) without realising they are programming. At a basic level, FRP introduces a new parameterised datatype into the language, the time-varying 'signal'. By then permitting functions to be defined that accept these signals and return time-varying signals, you eliminate the shared state, duplicated logic, fragmented code paths, and explicit state-machines spawned by current, traditional event-driven approaches. As a basis for comparison, just imagine a traditional spreadsheet next to one where, to implement a formula for a cell, you would have to attach an 'onChange' callback to each referenced cell to recalculate the value.
Robots
Robots have a number of constraints that have not been traditionally considered strengths of functional programming.
  • Large numbers of discrete and continuous time-varying inputs that are most definitely *not* referentially transparent.
  • Real-time constraints that impose a requirement to avoid time-leaks, which encourages the use of imperative operation sequences, execution of which can be easily suspended and resumed as higher-priority tasks require resources.
  • Extensive use of embedded controllers and limited computational resources that place a premium on transparency of resource use and strict avoidance of space-leaks.
  • A history of research largely based on low-level systems languages (i.e. MIML) and assembly, which makes for a hostile reception for many of the assumptions of higher-level functional programming.
Of course FRP is a natural fit to the first problem. However it has traditionally achieved that fit by trading off certainty and predictability in both time and space analysis. This paper attempts to address these two problems, in order to allow FRP's natural (theoretical) fit to robotics to be exploited.

Bananas, Lenses, Envelopes, and Barbed Wire --- Finale

Finished the Bananas and Lenses paper on the weekend, along with another on FRP that I'll summarise later. Section 4 takes the previous work based on lists from section 2 and combines it with the category-theory results from section 3 to develop a theory of cata/ana/hylo/para-morphisms over arbitrary datatypes. Section 5 is a very brief extension to parametric types.

Unfortunately I simply don't have enough category theory to follow the discussion properly, let alone explain it here. I think it is probably time to make another visit to Amazon and pick up a book or two on category theory. I'm thinking probably Pierce's introduction, and possibly Crole's "Categories for Types" on the basis of the review there. I've already read Pierce's intro; it is mercifully brief, and yet even from a quick reading I found I remembered enough to follow fragments of the paper. Unfortunately I found it too brief to allow me to learn enough to follow the rest :/.

Anyway, this paper was just for fun; fun was had; and I come away with a fresh perspective on recursion. That counts as a win for me.

Wednesday, October 06, 2004

Back.

Got back on Saturday from a week-long holiday helping run a holiday computer-camp for high-school kids. Now mostly recovered, and will probably blog a little on it later.

In the meantime I thought I might quickly point at an interesting Wired article I found on slashdot, looking at one of the most interesting economic effects of the net: the growth of niche markets. Recommended reading: The Long Tail.

Monday, September 20, 2004

Just how stupid do you think we are?

Seen on #prolog on freenode:

--> Initial_KK (noemail@CPE00e0292573fd-CM400026081604.cpe.net.cable.rogers.com) has joined #prolog
<Initial_KK> hi All
<Initial_KK> what's this program do?
<Initial_KK> member(X,[X|_]).
<Initial_KK> member(X,[_|T]) :- member(X,T).
My guess? Your homework, you lazy slob.

Friday, September 17, 2004

Orthogonality == Flexible ?

As a core developer of Kowari I am regularly faced with both the good and the unfortunate aspects of RDF. One thing I continually run up against is the seemingly arbitrary distinctions between subject, predicate, and object nodes. In this particular case, I still can't find out why RDF won't permit blank nodes as predicates. The same goes for literals as subjects and predicates. Having a string literal as a predicate or subject is a little strange, and almost certainly bad style, but a typed literal most certainly makes sense. If I want to make statements about the number 5 (LT 6, GT 4, isPrime, isOdd, etc.) currently we either have to use bnodes or fake URIs, so we get:

_:bnA <owl:sameAs> '5'^^<xmls:decimal>
_:bnB <owl:sameAs> '4'^^<xmls:decimal>
_:bnC <owl:sameAs> '6'^^<xmls:decimal>
_:bnA <lessThan> _:bnC
_:bnA <greaterThan> _:bnB
....
vs.
'5'^^<xmls:decimal> <lessThan> '6'^^<xmls:decimal>
'5'^^<xmls:decimal> <greaterThan> '4'^^<xmls:decimal>
...
This is particularly important when you consider what's involved in merging two such documents. Consider trying to merge a document containing primes with a document containing prime-factors. As we are currently forced to use bnodes, and bnode refids are only valid within a single document, we are going to have to stuff around doing bnode unification.

Finally, the fact that a URI is itself perfectly fine as a typed literal suggests that all we really need is bnodes and typed literals (and a fair bit of syntactic sugar, but that's a serialisation issue, so we can ignore it for semantics). At the end of the day, RDF is very cool, but I suspect a fully orthogonal triple-store is even cooler. I suppose it's just fortunate that Kowari is an RDF-store built on top of a generic triple-store ;)

Friday, September 10, 2004

Reading

Sometimes life throws you a curve ball. The past week I have been scrambling to help my parents respond to their landlord's reasonable, yet highly inconvenient, desire to sell their house with vacant possession. Consequently my blog has been neglected of late.

I enjoy reading essays, and I have definitely enjoyed reading some of Malcolm Gladwell's contributions to the New Yorker (Gladwell also wrote The Tipping Point, a book I am going to have to read, which discusses the effects of non-linearity in social systems). I particularly enjoyed Blowing Up, a look at non-linearity applied to the stock-market; The Mosquito Killer, a fascinating look at the anti-malaria campaigns of the 20th century, and DDT in particular; The Dead Zone, the hunt for the Spanish Flu; and The Coolhunt, a curious look into the 'cool hunters', a profession I never realised existed. Finally, The Tipping Point, an article written four years before the publication of Gladwell's book, is a reminder that non-linearity is a fact of life, and humans are particularly bad at anticipating the ramifications.

Thursday, September 02, 2004

Scheme Cookbook.

A few of my friends are also in the process of learning Scheme. I stumbled across this interesting site which will no doubt be useful: The Scheme Cookbook. Greg, with his habit of implementing cat, will likely be particularly interested in the chapter on Files and Directories.

Paul Graham is at it again.

Paul Graham has published yet another essay. This time the topic is, not inappropriately, essays. As expected, the essay is worth the time to read --- if nothing else his essays are entertaining. Graham presents the essay as a form of exploratory reasoning, attempting to differentiate it from the dissertation, which he presents as far more rhetorical in nature. To illustrate this he draws an analogy between the relationship of thought to essay and that of conversation to dialogue.

Fundamentally an essay is a train of thought-- but a cleaned-up train of thought, as dialogue is cleaned-up conversation. Real thought, like real conversation, is full of false starts. It would be exhausting to read. You need to cut and fill to emphasize the central thread, like an illustrator inking over a pencil drawing. But don't change so much that you lose the spontaneity of the original.
I couldn't help wondering if a better analogy would be the relationship between blog and essay. I certainly don't take the time to polish my blog posts that Paul takes to polish his essays.

I personally found it thought-provoking, and if that is a common experience, I guess that would make it a successful essay. I certainly wasn't surprised to see it dwell on Graham's other bugbear: the shocking state of the modern high-school curriculum. Still, I find some irony in my main response: pondering the possibility that the internet may trigger a resurgence in the importance placed on rhetoric and the essayist in our society; this would certainly be no bad thing.

Wednesday, September 01, 2004

Equality and Equivalence.

My previous entry on the implementation of Object.equals() has attracted some comment, but the principal response was from Adrian Sutton, who objected to my example of a buggy equals implementation.

In one respect his criticism is valid; in general a subclass should delegate to its superclass for inherited state equality. To do otherwise is to break encapsulation and duplicate code. On the other hand his complaint misses the point:

Class B is incorrectly assuming that it knows how to compare class A's data, and more importantly class B is changing the behavior of A in a way that is not compatible with the original model. Essentially, class A defines equality as having the same n1 values and class B is in the wrong for changing that. [emph added]
What Adrian misses is that if B is unable to meaningfully extend A's definition of equality, then A should indicate that by declaring its implementation final; in the absence of a final declaration, B is entitled to extend it. I give an example of a meaningful use of final/instanceof in the Shape/Circle/Ellipse example at the end of the post, which I chose as it conveniently demonstrates all three cases associated with appropriate use of instanceof:
  1. Abstract class implementing equals using instanceof to allow proper delegation of equality in subclasses.
  2. Extensible concrete class, uses instanceof to allow subclassing; equals declared final to enforce contract
  3. Subclassing the concrete class, overriding accessors permitting inherited (unoverridable) equals method to operate correctly (besides meeting the superclass contract ;).
On the other hand, Adrian is mistaken when he suggests that the use of getClass() in equals is a bug. There are two flaws in his reasoning. The first arises from a confusion between syntax and semantics. Adrian offers a code fragment, suggesting that it would need A to use instanceof for correctness:
class C extends A {
  public int getValue() {
    return n1;
  }
}
The confusion is in thinking this defines a new class. This is a discussion of semantics, hence any class offered as a counter-example must be a new class semantically, while the above is new strictly as syntax. The easiest way to approach the semantics of a class is as a function constructing independent, concurrent, finite automata. Now C doesn't define any new states, and doesn't define any new behaviour (state transitions); semantically, then, it doesn't define a new class --- mathematically, C is still isomorphic up to congruence with A, so in a very real sense they are the same class. What C actually defines is a function: a new, if rather clumsy, projection on A returning n1. I had better make it clear that I'm not suggesting there is a non-clumsy way of doing this, rather that the reason why it is valid to want a C to be equivalent to an A is that C doesn't define a new class, just a new function. Hence it is a completely different case to B, which is a non-congruent class. If you have a concrete class that must support subclassing, instanceof is a bug.

Moreover this is exactly what we would expect when we consider what it actually means for two objects to be equivalent. Remember, an object is an FA, and as such there is a well-established body of work on the meaning of equivalence; anyone interested should read up on simulation and bisimulation. Informally, however, the key is recognising that the bundling of state and behaviour provided by an object is not just something we get to ignore when it becomes inconvenient, at least not without losing correctness. If we are serious about OO, we are not just answering the question "are we currently in the same state?", but "are we in the same state, and will we remain in the same state in response to the same stimuli?". In other words, it doesn't matter that B.n1 == A.n1; if A and B have different behaviours they aren't the same object.

It becomes particularly important to recognise this when you face the prospect of also implementing Comparable. The problem here is that the contract for compareTo is that of a total order. The best you can manage on A U B is a pre-order, and once again you are left with the prospect of subtle and obscure bugs. From the javadoc for Comparable:

This interface imposes a total ordering on the objects of each class that implements it. This ordering is referred to as the class's natural ordering, and the class's compareTo method is referred to as its natural comparison method. ... For the mathematically inclined, the relation that defines the natural ordering on a given class C is:
       {(x, y) such that x.compareTo((Object)y) <= 0}.
The quotient for this total order is:
       {(x, y) such that x.compareTo((Object)y) == 0}.
It follows immediately from the contract for compareTo that the quotient is an equivalence relation on C, and that the natural ordering is a total order on C. When we say that a class's natural ordering is consistent with equals, we mean that the quotient for the natural ordering is the equivalence relation defined by the class's equals(Object) method:
     {(x, y) such that x.equals((Object)y)}.
Yet after all this, I do share Adrian's pain. The unfortunate fact is that if the author implemented A.equals properly, but forgot to include an accessor to A.n1, there is no clean way of gaining access to it. If you do find yourself in this fix, you are left with either direct bytecode hacking (the JVM doesn't know about the access permissions, so you can just access n1 directly), or package injection (if n1 doesn't have private scope). Alternatively, if the author has followed Adrian's advice, it becomes impossible to implement B correctly; and as Adrian's use of instanceof is wrong at a semantic level, no amount of bytecode hackery will save you.

So yet again we come back to my original point. Either you are implementing an abstract class, or you are writing a concrete class and can support either subclass equivalence or polymorphic equality. "But I need both"!? Tough; Java doesn't support both, and a good engineer knows to work within the limitations of their tools. That a sizable segment of the Java community continues to ignore the limitations of their language by presenting incorrect code, pretending that both are achievable, is a sad indictment of the state of that community.

Tuesday, August 31, 2004

Bananas, Lenses, Envelopes, and Barbed Wire --- Section 3

Apparently the paper gives all its definitions in terms of the category CPO (Complete Partial Orders); I'm not entirely sure what the difference between a complete and an incomplete partial order is, so the significance probably escapes me. A lot of this section is also reminiscent of Pierce's introduction to category theory, so I should probably go back and reread that.

Functors: I remember functors being fun reading in Pierce. In this case we are interested in bifunctors and monofunctors: mapping pairs of types into types (and pairs of functions into functions), and types into types (and functions into functions), respectively. Both cases preserve id and compose.

Products: Standard category fare. We have the bifunctor || (at least I assume it's a bifunctor; the paper doesn't say, but it meets the definition above) which takes two types and maps them to the type of pairs of those two types (the paper's example: D||D' = {(d,d') | d <- D, d' <- D'} --- I'm using <- instead of the 'element of' symbol; I really need to get a better blog editor, one I can type symbols into easily; (f||g)(x,x') = (f x, g x')) and the expected projection and tupling combinators.

Sum: Interesting, latent-typing as a functor, or at least something that looks like latent-typing to me. Also interesting to note the use of bottom (_|_) to include undecidability.

 D|D'          = ({tag0}||D) U ({tag1}||D') U {_|_}
(f|g) _|_      = _|_
(f|g) (tag0,x) = (tag0, f x)
(f|g) (tag1,x) = (tag1, g x)
Also worth noting the injection and selection operators:
i0 x = (tag0, x)
i1 y = (tag1, y)
f v g --- which applies f to arguments tagged 0,
                and g to arguments tagged 1.

Arrow: Some type of 'wrapping' function. The example given is (f -> g) h = g $ h $ f. This suggests some sort of injection into f -> g, but I'm not entirely sure how. Apparently this is related to curry, uncurry, and eval, but again I don't see the relationship.

Identity, Constants: Pretty much as expected. Lifting and Sectioning, will require a little more thought.

Interesting definition of a constant as a function of 1 -> A, 1 of course having only a single member, void. Also, given p :: A -> bool, and p? a = i0 a or i1 a as p a is true/false respectively, you have f v g $ p? modelling if p then f else g.
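
In Haskell terms (my mapping, not the paper's notation) the sum is Either, the injections i0 and i1 are Left and Right, and f v g is either f g; p? then models the conditional:

query :: (a -> Bool) -> a -> Either a a   -- the paper's p?
query p a = if p a then Left a else Right a

-- (f v g) composed with p?  models  "if p then f else g":
cond :: (a -> Bool) -> (a -> b) -> (a -> b) -> a -> b
cond p f g = either f g . query p

-- e.g. cond even (* 2) (+ 1) (3 :: Int)  ==  4   (3 is odd, so + 1)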

The rest of section 3 easily exceeds my grasp of category theory. We define a representation for categorical recursion, then move on to recursive types, specifically the least-fixed-point of a monofunctor. Three examples given are cons-lists, binary trees, and natural numbers.

Having covered the background theory, the paper moves on to definitions of our operators over arbitrary data types. That will have to wait until I've recovered from section 3 ;).

Monday, August 30, 2004

Bananas, Lenses, Envelopes, and Barbed Wire.

Read the first two sections of "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire" over lunch. So far a very interesting paper, although looking ahead, I suspect it will probably end up exhausting my understanding of category theory reasonably soon.

The paper has 6 sections. Section 1 is a short introduction, motivation, and review of the literature; section 2, a short introduction to the four recursion operators cata, ana, hylo, and para; section 3 draws from category theory to outline a theory of algebraic data types used in the rest of the paper; section 4, a formal definition of the operators introduced in section 2 using the theory developed in section 3; section 5 extends some of section 4 to parameterised types; section 6 is the conclusion, acknowledgements, and references.

The paper aims to define a series of four 'recursion operators': cata, ana, hylo, and para, and then use them as the basis for a functional programming calculus. These operators correspond to constructors for cata-, ana-, hylo-, and paramorphisms respectively. The reason for doing this is that unstructured recursion isn't amenable to structural analysis and manipulation, so it is preferable to generalise various recursive forms into various forms of structured recursion before trying to develop the calculus.

The examples given are all with respect to your standard inductive list data-type:

A* -> Nil | Cons(A A*)
(It's possibly worth noting that this is a direct algebraic analogue to your standard linked-list definition in an imperative language)
Cata
Constructs a catamorphism, more generally known as a right fold. Think of it as collapsing a list into a value. It takes two parameters, a binary function and an identity.
h = cata(b, .) 
  ->  h Nil          = b
      h (Cons(a,as)) = a . h(as)

and

h :: A* -> B
b :: B
. :: A -> B -> B
cata is denoted using 'banana brackets': (|b, .|). A good example of a catamorphism is length().
length = cata(0, (+) $ 1)  
  note: (+) $ 1  is the right assoc composition of addition and the 1
        function (which ignores its argument and returns 1)
 
proof by substitution:

length Nil = 0
length (Cons(a,as)) =  ((+) $ 1) a length(as)
                    => (+) 1(a) length(as)
                    => 1 + length(as)
Which is of course the standard definition of length::A* -> N
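
For comparison, here is a quick Haskell rendering (mine, not the paper's) of the list catamorphism --- essentially foldr --- with length defined in terms of it:

-- cata over cons-lists: b is the Nil case, op the Cons case.
cataL :: b -> (a -> b -> b) -> [a] -> b
cataL b _  []     = b
cataL b op (a:as) = a `op` cataL b op as

lengthL :: [a] -> Int
lengthL = cataL 0 (\_ n -> 1 + n)   -- the "(+) $ 1" trick above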
Ana
Constructs an anamorphism, more generally known as an unfold. This is naturally the inverse of a fold, and expands a value into a list. It takes two parameters, an induction function and a termination predicate.
h = ana(g, p) 
  ->  h b = | p b  -> Nil
            | else -> Cons(a, h b') where (a, b') = g b

and

h :: B -> A*
g :: B -> (A B)
p :: B -> boolean
ana is denoted with 'lens brackets': ([g, p]). A good example of an anamorphism is zip; an easier one to follow is count(n), which returns the list of numbers incrementing from n.
count n = ana(g, False) where False a = false,
                              g a = (a, a + 1)
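
The same in Haskell; note the shape matches Data.List.unfoldr, with the predicate playing the role of unfoldr's Nothing case:

-- ana over cons-lists: grow a list from a seed b.
anaL :: (b -> (a, b)) -> (b -> Bool) -> b -> [a]
anaL g p b
  | p b       = []
  | otherwise = let (a, b') = g b in a : anaL g p b'

-- count n: the infinite list n, n+1, ... (take some of it!)
countFrom :: Integer -> [Integer]
countFrom = anaL (\n -> (n, n + 1)) (const False)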
Hylo
Constructs a hylomorphism. This is your classic inductive recursion from one value to another (think factorial, or fibonacci). This can be represented as an anamorphism that builds the recursive call-stack, followed by an catamorphism that folds the stack into a result.
h = hylo(c, ., g, p) 
  ->  h a = | p a  -> c
            | else -> b . (h a') where (b, a') = g a

and

h :: A -> C
c :: C
. :: B -> C -> C


g :: A -> (B A)
p :: A -> boolean
hylo is denoted using 'envelope brackets': [|(c, .), (g, p)|]. A good example of a hylomorphism is factorial.
fact n = [|(1, *),(g, p)|] where g n -> (n, n - 1),
                                 p = | 0    -> true
                                     | else -> false

substitute:

fact = hylo(c, ., g, p) 
  ->  fact n = | n == 0  -> 1
               | else    -> b * (fact a') where (b, a') = (n, n - 1)
            => | else    -> n * (fact (n - 1)).

Which is of course the classic factorial
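
And a Haskell rendering (again mine, not the paper's), with factorial as the instance:

-- hylo: conceptually ana then cata, fused into a single recursion.
hyloL :: c -> (b -> c -> c) -> (a -> (b, a)) -> (a -> Bool) -> a -> c
hyloL c op g p a
  | p a       = c
  | otherwise = let (b, a') = g a in b `op` hyloL c op g p a'

factorial :: Integer -> Integer
factorial = hyloL 1 (*) (\n -> (n, n - 1)) (== 0)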
Para
Constructs a paramorphism. This is the weakest description of any of the four. I can't really claim to understand para, but it is immediately recognisable as structured recursion over inductive datatypes. I'm hoping that a definition will be provided later in the paper, but unlike the other three, this intro section only provides examples. It does however cite another paper by one of the authors, Meertens: Paramorphisms. The example for numbers (num ::= 0 | 1 + num):
h = para(b, .) 
  ->  h 0       = b
      h (1 + n) = n . (h n)
para is denoted using 'barbed-wire brackets': [<b, .>]. Factorial can be written as a paramorphism as:
fact n = [< 1, (g) >] where n g m = (1 + n) * m
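
A Haskell sketch of para over the naturals; the step function sees both the predecessor n and the recursive result h n:

-- para over naturals (assumes n >= 0).
paraN :: b -> (Integer -> b -> b) -> Integer -> b
paraN b _  0 = b
paraN b op n = (n - 1) `op` paraN b op (n - 1)

factPara :: Integer -> Integer
factPara = paraN 1 (\n m -> (1 + n) * m)   -- factPara 4 == 24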

Friday, August 27, 2004

Object.equals()

Like many languages, Java provides two equality methods to test objects for equivalence. The first is accessed via the '==' operator and does a simple heap reference comparison. The other is a standard method on Object, equals(), which is intended to be overridden by the programmer to perform a value-based comparison. Now the '==' operator is defined by the language spec, and ultimately implemented by the JVM developers, hence not our concern. Unfortunately the Java community has failed to properly document the subtleties involved in writing a correct equals method, and so I continue to come across incorrect implementations and, worse, incorrect recommendations.

But first we have to understand what we are dealing with. Fundamentally when we are defining an equals method on a class A, what we are defining is an Equivalence Relation over the set of all objects of class A. This is important because it highlights that we can define our equals method to return anything we want provided we meet three rules.

  • Reflexivity: a.equals(a) -> true
  • Symmetry: a.equals(b) <-> b.equals(a)
  • Transitivity: a.equals(b) && b.equals(c) -> a.equals(c)
If we break any of these rules someone will suffer subtle and devious bugs.

Note: To save typing I'm going to leave out the hashCode method in each of these examples, but be aware that if you override equals() you *must* override hashCode() to match.

One consequence of the requirements is that we must ensure that any test we use in our equals implementation meets these requirements as well. So let's consider the standard mistake:

class A {
  int n1;
  public boolean equals(Object o) {
    if (!(o instanceof A)) {
      return false;
    } else {
      A rhs = (A)o;
      return this.n1 == rhs.n1;
    }
  }
}
This is a very common implementation of equals, it has even made its way into a number of Java books. Worse, the bug only manifests in the presence of implementation inheritance of concrete classes.
class B extends A { 
  int n2;
  public boolean equals(Object o) {
    if (!(o instanceof B)) {
      return false;
    } else {
      B rhs = (B)o;
      return this.n1 == rhs.n1 && this.n2 == rhs.n2;
    }
  }
}
To demonstrate the bug let's define a simple function on A's:
static void printEqual(A a1, A a2) {
  if (a1.equals(a2)) {
    System.out.println("EQUAL");
  } else {
    System.out.println("NOT-EQUAL");
  }
}
And test it:
  A test1 = new A();
  A test2 = new B();
  printEqual(test1, test2);
      => Prints "EQUAL"
  printEqual(test2, test1);
      => Prints "NOT-EQUAL" !!!!
And we've broken symmetry. This will bite you. There is no situation in which an asymmetrical equality is not a bug. Unfortunately it is scary how many places on the net present this as the exemplar equals() implementation.

So what does a correct equals method look like? Well, there are two. The first is very straightforward, and should be the default choice.

class A {
  int n1;
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    } else if (o == null || !getClass().equals(o.getClass())) {
      return false;
    } else {
      A rhs = (A)o;
      return this.n1 == rhs.n1;
    }
  }
}

class B extends A { 
  int n2;
  public boolean equals(Object o) {
    if (!super.equals(o)) {
      return false;
    } else {
      B rhs = (B)o;
      return this.n2 == rhs.n2;
    }
  }
}
Now test1 != test2 AND test2 != test1. Also, by delegating from B to A at the start of B::equals() we ensure any inaccessible state in A is handled appropriately, and more importantly, that the equality code for A isn't duplicated in B. If you can't delegate from B->A like this, then it suggests that your B isn't actually an A, and you should check your class design (you probably should be using delegation rather than inheritance).

There are times however when you would really like a B to compare equal to an A. The classic example is the circle/ellipse or square/rectangle shape example. It would be really nice if an ellipse with equal axes compared equal to an equivalent instance of a circle sub-class. The way we can achieve this is to recognise that the core requirement of a correct equals implementation is that the operations we use to implement it meet the same reflexivity/symmetry/transitivity requirements as the method itself. Next note that instanceof is only asymmetric in the subclass; and in fact any equality check added in the subclass will lead to asymmetry! So if we do need subclassed equality, we can't permit a subclass to override the equals() method.

abstract class Shape {
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    } else if (o instanceof Shape) {
      // !! Note: instanceof is safe here as Shape is abstract.
      Shape rhs = (Shape)o;
      return [shape specific comparison --- maybe position];
    } else {
      return false;
    }
  }
}

class Ellipse extends Shape { 
  int minor, major;

  public int getMinor() { return minor; }
  public int getMajor() { return major; }

  // !!! NOTE THE USE OF *final* to prevent asymmetry!
  // Also note the exclusive use of accessors.
  public final boolean equals(Object o) {
    if (!super.equals(o)) {
      return false;
    } else if (o instanceof Ellipse) {
      // instanceof is safe here: equals() is final, so no subclass
      // can introduce asymmetry, and a Circle may equal an Ellipse.
      Ellipse rhs = (Ellipse)o;
      return this.getMajor() == rhs.getMajor()
          && this.getMinor() == rhs.getMinor();
    } else {
      return false;
    }
  }
}

public class Circle extends Ellipse {
  public Circle(int radius) { major = minor = radius; }
  public int getRadius()    { return major; }
}
Note that this is probably a rather cumbersome hierarchy. A better approach would be:
public abstract class Ellipse {
  public abstract int getMajor();
  public abstract int getMinor();
  public final boolean equals(Object o) {
    if (!(o instanceof Ellipse)) {
      // No superclass to delegate to here; instanceof is safe as
      // equals() is final.
      return false;
    } else {
      Ellipse rhs = (Ellipse)o;
      return this.getMajor() == rhs.getMajor()
          && this.getMinor() == rhs.getMinor();
    }
  }
}

public class EllipseImpl extends Ellipse {
  int major, minor;
  ...
}

public class CircleImpl extends Ellipse {
  int radius;
  public int getMajor() { return radius; }
  ...
}
Note that a CircleImpl will still compare equal with an appropriate EllipseImpl.

Please. If you are programming Java, believe me, these really are the only two options open to you. I really am very tired of tracking down the subtle bugs that result when this advice is ignored.

Thursday, August 12, 2004

SICP Lectures.

A friend of mine is studying CS at UQ, and this year they are using The Structure and Interpretation of Computer Programs as a first-year text. I am personally pleased to see UQ buck the 'vocational education' trend that has been decimating IT degree courses recently. OTOH it does make life a little tougher on the students, and although this is a good thing, there are ways of making it a little easier. One is to point students at the videos of the SICP lectures given by Abelson and Sussman at MIT. I have watched them and they are very good; I highly recommend them to anyone approaching the book looking for some extra insight into the material. It is also worth pointing at the online version of the book in case anyone wasn't aware it was available.

Cale the Corporatist Corsair

Apparently the BSA are looking for a new name for their new mascot. I'm a little surprised at their honesty; personally I find a corporate extortionist weasel who attempts to indoctrinate tech-savvy kids into acquiescing to the corporatist annexation of their public-domain rights rather appropriate for the BSA.

Personally I'm pushing for "Cale the Corporatist Corsair".

Monday, August 09, 2004

Why Java sucks.

Ok, so this could just as easily have been titled 'Why [C/C++/Ada/...] sucks', but I use Java more than any of these atm, so I went with Java. Back on lambda the 'Why types are interesting' thread continues unabated. However this post by Paul Snively grabbed my attention. It is definitely worth reading in its entirety, but two paragraphs in particular stand out.

[Alan] Kay takes a hard opposite perspective to mine on the static typing vs. late binding question, saying, in summary (not a direct quote), "What to do until software engineering is invented... do late binding, which means being able to make the changes easily." Actually, perhaps it's not really opposite, since he could be interpreted to mean "either do software engineering if you can or make the cost of changes as low as possible, which is best done at the current state of the art with late binding." That is, get out of these tools that do neither software engineering nor dynamic change well (C, C++, Java...) and either head for software engineering (SML, O'Caml, Haskell, Concurrent Clean, Epigram, Alice...) or head for late binding (Common Lisp, Scheme, Smalltalk, Oz...). Formulated that way, I can't help but agree with him, and merely choose to focus on heading towards software engineering after spending decades in late binding.
At the end of the day, I want modern type theoretic ideas in my programming language so that, at compile time, I can:
  • Know that my code is deadlock-free.
  • Know that my code is race-condition free
  • Know that my code doesn't leak authority.
  • Know that my code is upgradable without violating encapsulation or leaking authority.
  • Will parallelize across multiple processors automagically when possible and serialize when it's not.
  • Will not throw an exception that will not be caught and handled.
  • Will have inferred all memory allocation and deallocation and do it automagically for me, letting me know if the dynamic characteristics of my code require the linking of a garbage collector.
  • Will let me know if I attempt to take the tail of an empty list. :-)
In case anyone doesn't realise: all the above points are already possible, although AFAIK they have not yet been demonstrated possible at the same time.

Interesting scripting language for Java.

I had a quick look at groovy a few months back when it was announced on lambda. To be perfectly honest I wasn't particularly impressed. If I want a latent-typed, 'agile' scripting language, and don't want to fight too hard to be allowed to use it, I will use Jython; or, if I'm writing for myself or have a tech-savvy boss, I'll use the scripting language, aka scheme (specifically SISC, or possibly JScheme). As far as I can tell groovy is just Yet Another Scripting Language. I mean, what's the point!?

Then I came across a comparison between groovy and another JVM scripting language, Nice. Bryn Keller does a quick comparison between groovy and nice on his blog. So what makes nice interesting, when groovy is so yawn-worthy? Well, nice is statically typed! It looks like a serious attempt at dragging the java world into the world of real compile-time type safety --- a world without NullPointer/ClassCastExceptions, and with the free-flowing syntactic style of your traditional latent-typed languages.

As an example compare the following (from the xoltar post):

Groovy
if (customer.orders.any { it.amount > 1000 && 
	it.product.type == "cheese"} ) {
  doSomething();
}
Nice
if (customers.orders.any(Order ord => ord.amount > 1000 && 
	ord.product.type == "cheese")) {
  doSomething();
}
Which, given you can infer the type of ord from its use, is intended to shortly become:
if (customers.orders.any(ord => ord.amount > 1000 && 
	ord.product.type == "cheese")) {
  doSomething();
}
Which differs from the groovy example in only the trivial matter of being able to name the parameter to the closure; and in the not-at-all-trivial matter that the compiler will ensure static type safety at compile time!

Just skimming the nice website, I still suspect I will find ML a nicer language. Still, here is a language targeted at interoperation with Java that supports real parametric types, proper closures, multiple dispatch (goodbye visitor pattern ;), and tuples, all in a statically typed language with HM inferencing. What's not to like?

Saturday, August 07, 2004

Tired.

I just finished my third first-aid duty at the Queensland Royal Show (aka the Ekka). It was only 6 hours, so I don't know why I'm feeling so exhausted. For some reason duties seem to tire me out, and this one wasn't even particularly eventful (fortunately); I wrote up incident reports for two almost identical injuries from one of the sideshow rides, and then learnt there had been more cases than my two. Hopefully something will be done about it; I hate to see kids hurt. I was assigned ringside during the leadup and through the fireworks, which was nice; although the fireworks were rather disappointing, they were still fun.

I wrote some quick examples of continuation passing style over lunch before I went on duty, but I'm too tired to write the commentary now, so I'll do that tomorrow or Monday and post part 3 then. Byron will be pleased though: one of them is a simple regex engine, which I think comes out rather nicely using CPS.

For now it's bedtime I think ;).

Friday, August 06, 2004

Continuations - part 2

Back to my attempt to explain continuations.

What is a continuation? The short answer is "What the program should do next". Another way of putting it would be "Everything the program hasn't done yet, wrapped up in something that looks sort-of like a function". Now I'm going to make the possibly arrogant assumption that, as that wasn't a sufficient explanation for me, it won't be sufficient here either.

Let's go back and reconsider an LCO example. This time we want to write a sigma function: it takes a non-empty list of numbers and returns the sum of the list.

; (sigma '(1 2 3 4)) -> 10
(define (sigma lst)
  (if (null? (cdr lst))
      (car lst)
      (+ (car lst) (sigma (cdr lst)))))
Now the astute reader will be wondering why I've defined sigma like this, associating to the right, rather than to the left (which would have been simpler). The reason is that this function cannot be trivially converted into the accumulator form we used in the previous LCO example. More importantly, there are a number of recursive functions which are easier to define associating to the right than to the left (I recommend trying to use append in place of + to define map in this manner). Now it would be nice to be able to manage this sort of function in constant stack, and this is a nice simple example to demonstrate how we can do exactly that. In case any readers are wondering what I mean by associating left vs. right, I suggest you compare the above to the left-associative version below and consider the order in which the elements of the list are combined.
(1 + (2 + (3 + 4))) - right assoc
(((1 + 2) + 3) + 4) - left assoc
; Left associative
(define (sigma lst)
  (letrec ((sigma-iter
    (lambda (lst accum)
       (if (null? lst)
           accum
           (sigma-iter (cdr lst) (+ accum (car lst)))))))
  (sigma-iter (cdr lst) (car lst))))
(Note we could have defined sigma-iter the same way we did earlier, as a global with (define ...); however it really is a helper function, and so is better defined locally to sigma --- besides, it also gives me a chance to introduce lambda, which we will be using later.)

The secret to converting the right-associative version into LCO is to recognise that the last thing we want to do before returning is call sigma on the tail of the list. Now this looks difficult, as the last thing we actually do is the addition. Still, I'm a great believer in wishful thinking, so I'm going to pretend I have a function that will take the result of the recursive call and finish off everything I was supposed to do next for me. Then I can pass that into the next call to sigma and have it finish for me.

(define (sigma lst finisher)
  (if (null? (cdr lst))
      (finisher (car lst))
      (finisher (sigma (cdr lst) finishing-function))))
Oops, now we have a call to 'finisher' in the road. Still, that's ok: we can just pretend that we have a new finishing-function each time that will both handle the addition for this call, and the call to the finisher we were passed.
(define (sigma lst finisher)
  (if (null? (cdr lst))
      (finisher (car lst))
      (sigma (cdr lst) new-finishing-function)))
Now this is looking better: all we have to do is work out how to define an appropriate finishing function and we're done. Before we do that, however, it is worth noting (as most of you will have noticed by now anyway) that the new-finishing-function's definition, "finish off everything I want to do next", is just a rephrasing of the definition I gave for a continuation at the start.

Let's compare the non-tail-recursive call to the continuation-based tail-recursive call:

; non-tail-recursive
(+ (car lst) (sigma (cdr lst)))
; tail-recursive
(sigma (cdr lst) continuation)
Remember, the definition of the continuation is that it will be receiving the result of the recursive call to sigma as a parameter, so let's call that x and substitute it into the non-tail-recursive version:
; let x <- (sigma (cdr lst))
(+ (car lst) x)
This will calculate the result of our call, but we have one thing left to do: we have to return the result to our caller. However our caller has finished (its call to us was a tail-call, remember), and the destination of our result was passed as a continuation. So we don't actually return so much as make a tail-call to the continuation (note this is a major reason why LCO is so useful: if we didn't have LCO our stack frame would be leaked, but with LCO there is no leak).
; pass (+ (car lst) x) to the continuation.
(continuation (+ (car lst) x))
So what we need is to define a small function that captures this idea, and pass it into the recursive call to sigma.
; Scheme
(lambda (x) (continuation (+ (car lst) x)))

# Python - just in case it's useful for understanding the closure above.
# NOTE: As Python doesn't provide LCO this *will* leak stack.
lambda x: continuation(lst[0] + x)
Now put the pieces together and we're almost there:
(define (sigma lst continuation)
  (if (null? (cdr lst))
      (continuation (car lst))
      (sigma (cdr lst)
             (lambda (x) (continuation (+ (car lst) x))))))
The next step is to restore the original signature and make this CPS version a helper function (along the way we rename continuation to k; it's shorter ;).
(define (sigma-helper lst k)
  (if (null? (cdr lst))
      (k (car lst))
      (sigma-helper (cdr lst) (lambda (x) (k (+ (car lst) x))))))

(define (sigma lst)
  (sigma-helper lst initial-continuation))
Final step... what do we want as the initial-continuation? Well, it will be passed the result of the outer sigma, and what we want is the result of the outer sigma... sounds like a job for the most trivial function you can think of, also known as the I combinator. Let me introduce id:
(define (id x) x)

(define (sigma lst)
  (sigma-helper lst id))
And we're done. Let's step back and take a look at what we've done. We have taken a non-tail-recursive function and split it into two pieces separated by a tail-call. We execute the first half, and pass the second half as a parameter, a continuation, to the tail-call. It is worth noting three points:
  • There is nothing special about the precise point at which we decided to split the function; at any point we could have wrapped the rest of the function in a continuation and passed it into another function. We'll come back to this when we discuss call-with-current-continuation later.
  • The continuation ends up being an ordinary function. We can save this function and call it as many times as we like, from places other than inside the sigma call; after all, once we have it, it's just a function (see the sketch after this list). This doesn't make much sense with sigma, but other algorithms do make use of this trick. Again, we will come back to this with call/cc.
  • The sigma-helper function never returns! In every case the helper function replaces return with a call to its continuation parameter. For this reason the sort of continuation used in this example is called a return-continuation. Every return in every function can be treated as a call to an implicit return-continuation. There are other uses for continuations; return-continuations are just the easiest to grasp, hence the subject of this post. Again, I'll come back to other uses when I discuss call/cc.
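
To make the second point concrete, here is a minimal sketch (assuming the definitions above; saved-k is just an illustrative name) of stashing a continuation away and calling it again later like any other function:

(define saved-k #f)

(define (sigma-helper lst k)
  (if (null? (cdr lst))
      (begin (set! saved-k k)   ; squirrel away the final continuation
             (k (car lst)))
      (sigma-helper (cdr lst) (lambda (x) (k (+ (car lst) x))))))

(sigma '(1 2 3))   ; => 6
(saved-k 10)       ; => 13, ie. 10 + 2 + 1 --- it's just a function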

This technique, where you pass return-continuations around instead of returning normally, is called continuation-passing style (CPS). It is a powerful approach to programming, and a common transformation in compilers for functional languages.
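
To see the mechanics in action, here is a hand expansion of (sigma '(1 2 3)) against the definitions above (k1 and k2 are just labels for the intermediate closures):

(sigma '(1 2 3))
; => (sigma-helper '(1 2 3) id)
; => (sigma-helper '(2 3) (lambda (x) (id (+ 1 x))))   ; call this k1
; => (sigma-helper '(3)   (lambda (x) (k1 (+ 2 x))))   ; call this k2
; => (k2 3)         ; base case: (null? (cdr lst))
; => (k1 (+ 2 3))   ; ie. (k1 5)
; => (id (+ 1 5))   ; ie. (id 6)
; => 6

Every step is a tail-call, so with LCO the whole computation runs in constant stack; the pending additions live in the chain of closures rather than in stack frames.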

I would also appreciate any feedback on this introduction, both positive comments and suggestions for improvement. Hopefully I'll have time to do a worked example and a discussion of call/cc soon.

Thursday, August 05, 2004

Python and LCO.

Clinton makes a valid point I should have made clear myself. Python doesn't support LCO, and probably never will. I am only using Python as a convenient language to provide anyone who doesn't already know Scheme with a parallel implementation, to help them pick up enough of the language to follow my later discussions on continuations.

I apologise to anyone who may have been confused.

Continuations - part 1.

Ok, so why did I write a post on last-call-optimisation? Well, I find it very difficult to explain continuations to someone who doesn't understand LCO, so I thought I should post a quick example of LCO before this. Of course the reason behind this sudden urge to actually fulfil my promise to explain continuations here is Lambda again: this time an extended discussion on continuations sparked by Keith Devens' request for help understanding the things, itself sparked by a particularly interesting analogy by Luke Palmer in news:perl.perl6.language, now forever dubbed the "Continuation Sandwich" --- the relevant part of his post is:

Say you're in the kitchen in front of the refrigerator, thinking about a sandwitch. You take a continuation right there and stick it in your pocket. Then you get some turkey and bread out of the refrigerator and make yourself a sandwitch, which is now sitting on the counter. You invoke the continuation in your pocket, and you find yourself standing in front of the refrigerator again, thinking about a sandwitch. But fortunately, there's a sandwitch on the counter, and all the materials used to make it are gone. So you eat it. :-)

Now, before we begin, I'm going to deal with continuation-passing-style and call-with-current-continuation separately; but as a taste of what we're in for, interested parties can check my CPS-in-Java example on the haskell.org wiki, or the most lucid description of call/cc I've seen, in Chapter 13 of Teach Yourself Scheme in Fixnum Days.

In the meantime, it's 0021 hours here, so I'm going to sleep first, and finish this later.

Last-Call-Optimisation

First let's consider that hoary old chestnut, the factorial.

; Scheme
(define (fact n)
  (if (zero? n)
      1
      (* n (fact (- n 1)))))

# Python
def fact(n):
  if not n:
    return 1
  else:
    return n * fact(n - 1)
Note: I am going to have to use Scheme later, so for the next few examples I'll be using Scheme and Python together to give non-schemers a basis for comparison.

Now this is a very straightforward recursive definition of the function. The only trouble is that we are going to grow the stack in proportion to n. So we end up with the following (abbreviated) call-tree:

fact 4
  fact 3
    fact 2
      fact 1
        fact 0
        * 1 1
      * 2 1
    * 3 2
  * 4 6
-> 24
Note that for (fact 4) we end up with five stack frames on our call stack, and unfortunately there is no alternative as the code is written. We can however eliminate this inefficiency by converting to an iterative version:
; Scheme
(define (fact-iter n accum)
  (if (zero? n)
      accum
      (fact-iter (- n 1) (* n accum))))

(define (fact n)
  (fact-iter n 1))

# Python
def factiter(n, accum):
  if n == 0:
    return accum
  else:
    return factiter(n - 1, n * accum)

def fact(n):
  return factiter(n, 1)
Now the call stack looks like:
fact 4
  fact-iter 4 1
    fact-iter 3 4
      fact-iter 2 12
        fact-iter 1 24
          fact-iter 0 24
-> 24
Which doesn't look very useful until you consider that as each call to fact-iter returns, its caller simply passes the resulting value up the call-chain. In other words, the caller's stack-frame is never used after it calls the next fact-iter in the chain; and what is never used doesn't need to be kept. This is called last-call-optimisation, and is a very common optimisation used by many languages to make the above code as efficient as the while-loop solution traditionally taught in imperative languages:
/* C */
int fact(int n) {
  int accum = 1;
  while (n) {
    accum = accum * n;
    n--;
  }
  return accum;
}
In fact the traditional code generated by an LCO-supporting compiler looks a lot like this:
/* MIML ;) */
int fact(int n) {
  int accum = 1;
factiter:
  if (n == 0) {
    return accum;
  } else {
    accum = n * accum;
    n = n - 1;
    goto factiter;
  }
}
Which is pretty much exactly what the while loop would compile to. This still leaves the question "Why?" --- what's wrong with the while loop?

The answer to this is that factorial is defined recursively:

       / 0 -> 1
  n! = |
       \ n -> n * (n - 1)!
and it is trivial to compare this definition to the first recursive example and check that it is correct. The tail-recursive expression in the second example isn't as easy to confirm, but it is at least in a form amenable to a trivial inductive proof. Because the while loop doesn't abstract away the inductive step, but incorporates it into the loop pre/post-conditions and invariants, it is less amenable to proof. In fact the three examples can be seen as successively lower levels of abstraction, moving away from the problem domain (a maths definition) towards the raw metal; and, almost as a tautology, the closer we can stay to the problem domain the better.
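
For the curious, the induction in question is short. Stated informally against the code above:

; Claim: (fact-iter n accum) = accum * n!
; Base:  (fact-iter 0 accum) = accum = accum * 0!
; Step:  (fact-iter n accum) = (fact-iter (- n 1) (* n accum))
;                            = (n * accum) * (n - 1)!    ; induction hypothesis
;                            = accum * n!
; Hence (fact n) = (fact-iter n 1) = 1 * n! = n!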

Now this is great for linear algorithms, or algorithms that process linearly recursive datastructures (a fancy name for linked-lists ;). But consider a pre-order traversal of a binary tree: this can be made tail-recursive, but it is a fair bit more involved, and the subject of my next post.

Wednesday, August 04, 2004

Rdf, indices, hypergraphs, and iTql

An rdf database is a set of declarative statements: [subject predicate object]. The traditional way of representing this is as a directed graph, with 'subject' and 'object' nodes linked by 'predicate' edges. This is a useful representation, and it successfully visualises most rdf-graphs. Unfortunately for visualisers, for rdf to be useful you need to be able to attach semantics to predicates, and the logical way to do this is to make more statements. A quick example:

Fred parentOf John
John parentOf Lucy
John parentOf Jane
This becomes much more useful if we can utilise statements such as
parentOf   subsetOf    relativeOf
relativeOf hasProperty transitive
relativeOf hasProperty symmetric
which allow us to match against match(Jane relativeOf $x) and get back { x=Fred, John, Lucy, Jane }.
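
A minimal sketch of that inference in Scheme, treating relativeOf as the symmetric-transitive closure of parentOf (base-facts, relatives-of, and the representation are my own illustrative names, not iTql's):

; parentOf base facts as (subject object) pairs.
(define base-facts '((Fred John) (John Lucy) (John Jane)))

(define (filter keep? lst)
  (cond ((null? lst) '())
        ((keep? (car lst)) (cons (car lst) (filter keep? (cdr lst))))
        (else (filter keep? (cdr lst)))))

; relativeOf is symmetric, so use each parentOf pair in both directions.
(define edges (append base-facts (map reverse base-facts)))

; relativeOf is transitive, so walk the edges to a fixpoint.
(define (relatives-of start)
  (let loop ((frontier (list start)) (seen (list start)))
    (if (null? frontier)
        seen
        (let ((next (filter (lambda (v) (not (member v seen)))
                            (map cadr
                                 (filter (lambda (e) (eq? (car e) (car frontier)))
                                         edges)))))
          (loop (append (cdr frontier) next)
                (append seen next))))))

; (relatives-of 'Jane) => (Jane John Lucy Fred) --- the binding set above.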

This is all well and good until you try to visualise the union of these two graphs (often called the base-facts and the schema). The first problem you face is multiple edges between the vertices, forcing you to move to a 'directed-graph with named edges'; not too hard to visualise, but it isn't looking much like a traditional graph anymore. The real kicker comes when you try to visualise the relationship between parentOf and relativeOf: now you have an edge between two edges --- and we're not in Kansas anymore, Toto.

So what do we have? It is certainly 'graph-like', but it long ago ceased to be a traditional graph. More importantly, we haven't even started considering 'statings', 'named-graphs', or 'reification'. The real problem is that a predicate is a valid vertex in the graph, an equal partner (conceptually) with its more numerous subject and object brethren. Fortunately we're not the first to consider this problem. First consider the definition of a graph (preempting my summary of Trudeau Ch 2).

A graph is an object consisting of two sets called its vertex set and its edge set. The vertex set is a finite nonempty set. The edge set is a set of two-element subsets of the vertex set.
Looking at the definition, it becomes quite obvious that what we need is for the edge set to contain three-element subsets of the vertex set. This conveniently has a name: a 3-hypergraph. More importantly it can be easily expanded to support named-graphs by using a 4-hypergraph, and as Paul Gearon noted, this has immediate relevance to the indices kowari uses. It also has immediate relevance to crystallising iTql's semantics, allowing us to integrate the trans(), walk(), and the new ???() graph constraints --- we're still arguing over a name for what I like to call exclude(); and the original Constraint constraint (include()???). Each of these then returns an element of the powerset of VxVxV (where V is the vertex set). The elements of these resolved constraints are statements, and as such amenable to predicate logic; and iTql provides conjunction and disjunction operations (and, or) and returns a sum-of-products solution to the resulting compound predicate.

So iTql provides four levels of operations:

Graph Operations
union, intersection, and disjoint-union of 3-hypergraphs in the FROM clause. Note the result of this is actually a 4-hypergraph.
Graph Constraints
inclusive, transitive, reachability, and soon exclusive constraints against a specific 4-hypergraph (by default the graph returned from the FROM clause). Note these are parameterised by variables, and the result is a set of variable bindings that substitute to satisfy the constraint. These form the primitive constraints in an iTql WHERE clause.
Predicate Logic Expressions
conjunction and disjunction of variable binding sets, and type-specific boolean functions (ie. comparison operators >, <, ==, etc) forming the rest of the WHERE clause. The result of this is a product-of-sums we represent as an implementation of the 'Tuples' interface. It is probably worth noting two things. First, a Tuples fails to be a relation only because the individual variable binding sets have differing cardinalities; we fake this by padding with a unique UNBOUND value which carries DONT-CARE semantics (see the sketch after this list). Second, this can only happen when the expression includes a non-union-compatible disjunction.
Projection, Aggregation, and Subqueries
ie. the SELECT clause: a list of functions to apply to the padded product-of-sums returned by the WHERE clause, the results of which are zipped into the conventional tabular format most application developers find useful.
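
A toy illustration of that padding (the variable names and layout here are mine, not kowari's). Given one binding set from a constraint over ($x $y) and another from a constraint over ($x $z), their disjunction is not union compatible, so the Tuples pads to the common header:

  x | y       | z
  1 | 2       | UNBOUND
  3 | UNBOUND | 4

Each row keeps its real bindings; UNBOUND marks the columns the originating constraint said nothing about.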

Saturday, July 31, 2004

Normal people in bad situations.

Clinton wonders aloud about the behaviour of the Khmer Rouge interrogators. Unfortunately this sort of behaviour is both normal and predicted; anyone who doesn't understand this needs to take the time to read about such experiments as "The Stanford Prison Experiment"

Welcome to the Stanford Prison Experiment web site, which features an extensive slide show and information about this classic psychology experiment, including parallels with the recent abuse of Iraqi prisoners. What happens when you put good people in an evil place? Does humanity win over evil, or does evil triumph? These are some of the questions we posed in this dramatic simulation of prison life conducted in the summer of 1971 at Stanford University.

Friday, July 30, 2004

Why types? -- lambda, the main course

First the surreal, then possibly one of the most interesting threads I think I've seen on lambda. Probably why I'm now clocking 10 hours catching up on lambda, and not only am I still at it, but I'm still motivated.

In Why type systems are interesting, Anton van Straaten lifted a comment by Luke Gorrie to a toplevel discussion:

If someone could next show what makes type systems so interesting in terms that I'll understand then I'll be well on my way into the modern functional programming world.. :-)
and in doing so started a fascinating thread that has seen 44 posts so far and spawned at least one independent thread. I seriously recommend it to anyone who considers themselves a serious programmer. A few highlights:

Anton managed to summarise the thread in the introduction:

When you write code, even in the most dynamic language, you always know something about the types of the values you're operating on. You couldn't write useful code otherwise - if you doubt that, try coming up with some examples of operations you can perform on a value without knowing something about the type of that value. Type systems provide a way to encode, manipulate, communicate and check this type knowledge. In an earlier thread, you mentioned that you write the types of functions in the comments. That's an informal type system! Why do you use a type system? Because type systems are useful, important, and as a result, interesting.
Note that the first paragraph is deceptively simple. Consider the most common example of a generic data-structure, the list. If you have a truly heterogeneous list, one whose contents are truly untyped, then your list is actually indistinguishable from a natural number! The only operations you can perform on a completely typeless list form the monoid ([], Lists, concat), which is of course isomorphic to the monoid (0, Naturals, +), with length() as the isomorphism mapping lists to naturals. For a fuller discussion of this I recommend Basic Category Theory for Computer Scientists, Benjamin Pierce (and before I get asked, I borrowed it from my local university library).
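
A quick Scheme illustration of the monoid point (the element values are incidental --- with no knowledge of them, the spine is all you can use):

(length (append '(a b) '(c d e)))       ; => 5
(+ (length '(a b)) (length '(c d e)))   ; => 5, length maps append to +
(length '())                            ; => 0, and '() to 0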

Anyone who has been reading lambda much immediately recognises Frank Atanassow: his writing style is distinctive, and his opinions are both strongly held and consistent. One of the really exciting things about this thread was that Frank's comment Rebuttal finally managed to clarify for me what he has been talking about these past months. Although if you haven't been reading lambda much and intend to read that post, I would strongly advise you to read No contraints[sic] first, and keep in mind Anton's comment in Constraints

if b then 42 else "ha!" becomes something like: if b then Num 42 else Str "ha!" ...where Num and Str are type constructors for the Univ type. This is kind of ironic, considering that it's the latter language that's supposed to be able to inference types. You can think of dynamically-typed languages as a shortcut syntax for creating programs which rely on a universal type. This demonstrates that statically-typed languages can be less expressive than dynamically-typed ones. (emphasis mine)
It was also very amusing to watch a curveball from Julian Squires, who wanted to know how static type systems interact with his desire to actively develop an executing application (think Smalltalk, or possibly PHP) in Re: Software development as a static activity. Personally I'm not sure any existing languages combine a modern static type system with the ability to do runtime modifications. Still, even if you only consider static type checking a safety net, surely one time you would really, really like to have it is when doing something as delicate as runtime development.

Just surreal.

The past couple of evenings have been 'catch up on lambda' evenings. So I only just stumbled across one of the most surreal threads I think I've ever read. Old computer science and technical books worth searching for certainly looked promising. It started well, and I will definitely have to keep an eye out for "ACM Turing Award Lectures: The First Twenty Years". Then someone mentioned generating functions and type derivatives, and before too long I end up ambushed with

The |x| < 1 condition is a red herring in this case. It is quite standard to work with generating functions in the context of the ring of formal power series k[[x]] with coefficients in a field k. (You can generalize the heck out of this definition.) A formal power series is a polynomial except for the fact that we don't require the elements of its coefficient sequence to become zero after a certain point. We define addition in this ring in the obvious way and multiplication by convolution of the coefficient sequences, just like you would in a polynomial ring.
Ouch! My poor head. I still need warning and a good run-up when approaching this sort of thing.

Trudeau: Ch 1. Maths, Games, and Education.

I'm currently reading chapter 4 of Trudeau, but let's do this properly and start with chapter 1.

Chapter 1 is the introduction and, like all introductions everywhere, is largely content-free --- or more precisely, is the realm of opinion and motivation. But before I discuss Trudeau's opinion of why I should read this book: Introduction to Graph Theory has what is probably the most exuberant blurb I have ever read on a textbook.

This delightfully written introduction to graph theory offers a stimulating excursion into pure mathematics. It is aimed at those whom the author calls "the mathematically traumatized," but it is a treasury of challenging fun ... "The topics are so well motivated, the exposition so lucid and delightful, that the book's appeal should be virtually universal....[sic] Every library should have several copies"
Wow, somebody was certainly in a good mood. You could almost be forgiven for forgetting that you're actually reading about a textbook. I think this gives new meaning to 'rave review'.

Anyway, back on topic, Trudeau would like us to approach pure mathematics as a game.

Basically pure mathematics is a box of games. At last count it contained more than eighty of them. One of them is called "Euclidean geometry".
He then goes on to address the closed-world nature of pure maths, something that I, as an engineer, struggle to deal with.
Games have one more feature in common with pure mathematics. It is subtle but important. It is that the objects with which a game is played have no meaning outside the context of the game. ... You may balk at this, since for historical reasons, chessmen have names and shapes that imply an essential correspondence with the external world. The key word is "essential"; there is indeed a correspondence, but it is inessential. ...this interpretation of the game, like all others, has nothing to do with chess per se.
So in a similar way, while the words plane, point, and line suggest real-world analogues, it is only an implied interpretation. In Euclidean geometry they are axiomatic, with specific properties, but without definition.
...no one knows what planes, points or lines are, except to say that they are objects which are related to one another in accordance with the axioms. The three words are merely convenient names for the three types of object geometers play with. Any other names would do as well.
Still, the most interesting paragraph in the introduction is right at the beginning. Along with a number of my peers, I cruised through high school mathematics. In fact I largely finished reading Lord of the Rings and War and Peace in my maths classes. It seemed that my teachers would take about a week to teach a day's work, and yet it was obvious that despite the slow pace many of my classmates were still being left behind. Naturally, as a geek, I have spent a fair amount of time analysing and self-analysing to try to understand the difference. While I was still at school, I just assumed I was more intelligent. However, experience has since taught me I wasn't. Which reopens the question: "What was I doing that made me so much more effective at learning high school maths?" I certainly didn't work harder. I believe Trudeau's introduction gives us a clue:
To understand what mathematics is, you need to understand what pure mathematics is. Unfortunately, most people have either seen no pure mathematics at all, or so little that they have no real feeling for it. Consequently most people don't really understand mathematics; I think this is why many people are afraid of mathematics and quick to proclaim themselves mathematically stupid. Of course, since pure mathematics is the foundation of applied mathematics, you can see the pure mathematics beneath the applications if you look hard enough. But what people see, and remember, is a matter of emphasis. People are told about bridges and missiles and computers. Usually they don't hear about the fascinating intellectual game that lies beneath it all.
My current hypothesis is that my gift was the ability to see past the smokescreen of 'applications' and rote-work to the fundamental patterns and structure of the pure mathematics underneath. This would also explain why I find such a return on time spent studying computer science --- and consequently my interest in language semantics, and hence why I'm reading a book on graph theory largely for fun. So, assuming I am right, can this insight into how a 'naturally gifted' high-school maths student attains their aptitude be used to improve the pedagogy of our existing maths curriculums?

Shiny Papers.

I'm a reading magpie. I find it almost impossible to browse citeseer or Amazon without walking away with at least a half dozen new books, and twenty or thirty new papers to read. Citeseer is particularly bad... 'Ooo, shiny paper'... and I have yet another pdf to add to my already impossibly huge collection. So I've decided to be a little more systematic about my reading.

I'm going to start posting summaries of the articles, papers, and books I'm reading. This will hopefully result in a permanent record for my own reference, and if I'm lucky, suggestions for additional reading.

So for starters I thought I would post my current (heavily abbreviated) reading list.

Essentials of Programming Languages, Daniel Friedman
I'll be working through this one concurrently with the other papers, as I want to work the examples and that will slow down the reading. I would be interested in advice: this is a textbook widely used in CS courses; would it be inappropriate for me to post my solutions?
Types and Programming Languages, Benjamin Pierce
I will be starting this as soon as I finish EOPL. Again I expect to work the examples, and a similar question arises; except that this is a graduate text, so hopefully there is less chance of encouraging cheating.
And concurrently with this:
Introduction to Graph Theory, Richard J. Trudeau
Making reliable distributed systems in the presence of software errors, Joe Armstrong
The Polyadic PI Calculus : A Tutorial, Robin Milner
I'll be rereading this one in order to summarise it properly. I'm finding it too useful as a framework for thinking about OOP not to want a more concrete record of my own understanding of this topic.
Genuinely Functional User Interfaces, Antony Courtney, Conal Elliott
This is my primary area of interest, so I would particularly welcome any good references on FRP, Arrows, and FRP/Arrows applied to GUIs.
Arrows, Robots, and Functional Reactive Programming, Paul Hudak, Henrik Nilsson, Antony Courtney, John Peterson
There are more, but that will do for now. I'll keep this list updated as I find other 'shiny papers', receive recommendations, or actually finish reading some of them ;).

Thursday, July 29, 2004

It's not as scary as it sounds.

Greg Black comments that he took a look at Joe Armstrong's thesis I linked to below. Just in case his description makes it sound intimidating: the error-handling philosophy discussion --- let it crash --- is one section of chapter 4 (ie. about 3 pages). Of course, handling errors is only one step towards a reliable system.

In fact, the chapters of the thesis are largely approachable independently of each other. Chapters 2 and 4 (Architecture and Programming Principles) are particularly good in this regard.

In the meantime, for those who are feeling too lazy to read the actual pdf, an executive summary:

  • We don't know how to write bug-free programs
  • So every substantial program will have bugs
  • Even if we are lucky enough to miss our bugs, unexpected interactions with the outside world (including the hardware we are running on) will cause periodic failures in any long-running process
  • So make sure any faults that do occur can't interfere with the execution of your program
  • Faults are nasty, subtle, vicious creatures with thousands of non-deterministic side-effects to compensate for
  • So the only safe way to handle a bug is to terminate the buggy process
  • So don't program defensively: Just let it crash, and make sure
    1. Your runtime provides adequate logging/tracing/hot-upgrade support to detect/debug/repair the running system
    2. You run multiple levels of supervisor/watchdog all the way from supervisor trees to automatic, hot-failover hardware clusters
Simple really ;)

Wednesday, July 28, 2004

Let it crash

Martin Pool links to a usenix paper from last year on Crash-only software. It is an interesting paper, the first I have seen applying this principle to J2EE. Still, I was rather disappointed that the only prior work on supervisory fault recovery cited seemed to be their own. The erlang community, for instance, has made this a fundamental part of their culture, and the approach is pervasive from the language primitives through the standard libraries to system architecture recommendations and community norms.

Anyone who might be interested in more detail on how this philosophy works in practice is encouraged to check out Joe Armstrong's PhD thesis Making reliable distributed systems in the presence of software errors.

Thursday, July 22, 2004

The importance of checking references.

Jeff Pollock has responded to Paul Gearon's post on Jeff's recent comments to the RDF DAWG list. In doing so he makes three points. I will address his other two points another time (if Paul doesn't beat me to it), but wanted to make a quick note here regarding the first.

Jeff's quote from the dawg charter didn't ring true to me in light of various comments on the dawg list, so I went and checked. Jeff wrote:

the DAWG charter says "It is a requirement that the query language be compatible with an XQuery context"
The charter reads:
There is a requirement for RDF data to be accessible within an XML Query context.
And in fact even this snippet loses significant and important context. A fuller contextual quote reads:
At this stage, it is not clear to what extent XQuery technology is applicable to the task of querying RDF datasets. The RDF DAWG should aim to maximize W3C technology re-use, while also taking account of differences between the RDF graph data model and the XQuery data model. There is a requirement for RDF data to be accessible within an XML Query context. The working group should specify at least one mechanism for exposing RDF query facilities in an XQuery environment; that is, a way to take a piece of RDF Query abstract syntax and map it into a piece of XML Query using some form of extension to XQuery. ... While the data model of the query language of this protocol is dissimilar to that of XQuery, a non-XML concrete syntax might reuse syntactic elements from XQuery to aid learning time, even if XQuery is not chosen as the strawman.
Note that the sentence Jeff quoted refers to an extension to XQuery that would allow an RDF query to be embedded in, and accessed by, an XQuery. This is very different from a requirement for compatibility, and falls far short of requiring the use of an XQuery syntax for the rdf query language.

It is also worth noting here that the charter explicitly recognises the difficulty in mapping XML-infoset semantics to rdf.

At this stage, it is not clear to what extent XQuery technology is applicable to the task of querying RDF datasets. ... While the data model of the query language of this protocol is dissimilar to that of XQuery, ... While RDF query addresses a different data domain (the RDF graph),
On the other hand, the charter also naturally recognises the value in cohesive standards that leverage off each other:
The RDF DAWG should aim to maximize W3C technology re-use, while also taking account of differences between the RDF graph data model and the XQuery data model. ... a non-XML concrete syntax might reuse syntactic elements from XQuery to aid learning time ... any query mechanisms defined by this group must leverage off, and, where possible, interoperate with, the related XML recommendations
In conclusion I will repeat what I said in my first post on this subject.

XML has a document-centric, single-rooted, hierarchical view of its data, and XQuery reflects that. RDF on the other hand has an open-world, non-rooted, cyclic, directed, graph-based view. It is not immediately obvious how to reconcile these two approaches; neither is it apparent that a syntax designed to query the former will be appropriate for the latter.

I would have absolutely no problem with discovering that XQuery can be married with the semantics required by an RDF query language; in fact that would be great.

Update: Andy Seaborne has posted a link to a paper by the Jena team which nicely illustrates how RDF can be made accessible to XQuery/XSLT.