Etymon: November 2004

Thursday, November 25, 2004

Well worth the time.

Went back and watched two of the Little Languages 2 Workshop presentations. Specifically Joe Armstrong and Todd Proebsting's presentations on "Concurrency Orientated Programming", and "Disruptive Programming Language Technologies" respectively. I highly recommend both these talks to anyone who is the slightest bit interested in programming languages, or has any serious involvement in development. It is just unfortunate that they are only available as real-video. Still if you have r-v installed, they are both well worth the time.

Note these are links to the two sessions of the workshop. It's convenient that both my favourite talks from this workshop happen to be first in each session.

Tuesday, November 23, 2004

The Theory of Classification.

Andrew Newman came across this set of articles on OO theory. I stumbled across them about a year ago, read the first few, changed jobs, and promptly forgot to go find them again. I remember them as well worth the time to read.

This is the first article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. The object-oriented notion of classification has for long been a fascinating issue for type theory, chiefly because no other programming paradigm has so sought to establish systematic sets of relationships between all of its types. Over the series, we shall seek to find answers to questions such as: What is the difference between a type and a class? What do we mean by the the notion of plug-in compatibility? What is the difference between subtyping and subclassing? Can we explain inheritance, method combination and template instantiation? Along the way, we shall survey a number of different historical approaches, such as subtyping, F-bounds, matching and state machines and seek to show how these models explain the differences in the behaviour of popular object-oriented languages such as Java, C++, Smalltalk and Eiffel. The series is titled "The Theory of Classification", because we believe that all of these concepts can be united in a single theoretical model, demonstrating that the object-oriented notion of class is a first-class mathematical concept!

Anyway, the links:

"The Theory of Classification"

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15.

Monday, November 15, 2004

Links found.

Just noting that I have updated the Arrows 'n Robots summary with links to citeseer and Wan's thesis page on the Yale-group pages. Also wanted to dump the url to the Yale group here so next time I can find it more easily.

http://haskell.cs.yale.edu/yale/ http://haskell.cs.yale.edu/yale/publications.html

Category Theory

I have had a few people ask me to explain the utility and signifigance of category theory. This proves rather difficult given I am a complete novice in this area myself; desprately trying to keep my head above water in the simplest of papers. Still I found this paragraph useful, and I decided to post it here so I can find it again.

Category theory unifies mathematical structures in a second, perhaps even more important, manner. Once a type of structure has been defined, it quickly becomes imperative to determine how new structures can be constructed out of the given one and how given structures can be decomposed into more elementary substructures. For instance, given two sets A and B, set theory allows us to construct their cartesian product A X B. For an example of the second sort, given a finite abelian group, it can be decomposed into a product of some of its subgroups. In both cases, it is necessary to know how structures of a certain kind combine. The nature of these combinations might appear to be considerably different when looked at from too close. Category theory reveals that many of these constructions are in fact special cases of objects in a category with what is called a "universal property". Indeed, from a categorical point of view, a set-theoretical cartesian product, a direct product of groups, a direct product of abelian groups, a product of topological spaces and a conjunction of propositions in a deductive system are all instances of a categorical concept: the categorical product. What characterizes the latter is a universal property. Formally, a product for two objects a and b in a category C is an object c of C together with two morphisms, called the projections, p: c → a and q: c → b such that, and this is the universal property, for all object d with morphisms f: d → a and g: d → b, there is a unique morphism h: d → c such that p o h = f and q o h = g. Notice that we have defined a product for a and b and not the product for a and b. Indeed, products and, in fact, every object with a universal property, are defined up to (a unique) isomorphism. Thus, in category theory, the nature of the elements constituting a certain construction is irrelevant. What matters is the way an object is related to the other objects of the category, that is, the morphisms going in and the morphisms going out, or, put differently, how certain structures can be mapped into it and how it can map its structure into other structures of the same kind.

Jean-Pierre Marquis, Stanford Encyclopedia of Philosophy: Category Theory

Friday, November 12, 2004

Arrows, Robots, and Functional Reactive Programming --- Summary

First some background reading from the bibliography I'll have to read eventually:

John Hughes. Generalising monads to arrows. Science of Computer Programming, 37:67-111, May 2000
Ross Paterson. A Notation for Arrow-based Libraries. In ICFP'01: International Conference on Functional Programming, pages 229-240, Firenze, Italy, 2001
Zhanyong Wan. Functional Reactive Programming for Real-Time Embedded Systems. PhD thesis, Department of Computer Science, Yale University, December 2002
Zhanyong Wan, Walid Taha, and Paul Hudak. Real-time FRP. In Proceedings of Sixth ACM SIGPLAN '00 Conference on Programming Language Design and Implementation (PLDI), pages 242-252, Vancouver, BC, Canada, June 2000. ACM, ACM Press.
Zhanyong Wan, Walid Taha, and Paul Hudak. Event-driven FRP. In Proceedings of Fourth International Symposium on Practical Aspects of Declarative Languages. ACM, Jan 2002

Naturally when you are using FRP the core concept is that of the signal. So we define signal to have the type Signal a = Time -> a, ie a function of Time returning a value of type 'a'; s(t). The robot used throughout the paper is a simple differential drive with a handful of basic sensors. So given two signals vr(t) -> vr::Signal Double and vl(t)vl::Signal Double which are the velocities of the right and left wheels respectively we can write an reactive program for calculating the robots position:

x     = (1 / 2) * integral ((vr + vl) * sin theta)
y     = (1 / 2) * integral ((vr + vl) * cos theta)
theta = (1 / l) * integral (vr - vl)

Of course the nice thing about this program is that it is a direct transliteration of the physics equations governing the motion of such a robot! This is why FRP is important.

Such programs are clear, consise, correct, and unfortunately completely unworkable in the general case. The problem is that while in the small (with simple toy examples) you couldn't get better than this, larger programs tend to lead to opaque space and time management with the resultant space and time leaks. The focus on this paper is how to retain most of the legibility and good domain match of FRP while preventing leaks. The FRP DSL developed in this paper is called Yampa, and it solves the leak problem by disallowing signals as first class values. Instead signals are manipulated indirectly via set of combinators called signal transformers or signal functions.

SF a b = Signal a -> Signal b

This of course is very similar to the Haskell IO or State monads, and reminisent of Hutton's Monadic Parser Combinators. On the other hand, as presaged by the subject, the authors use Arrows rather than Monads as the basis for their combinator library.

A good analogy for this idea is a state or IO monad, where the state is hidden, and a program consists of a linear sequencing of actions that are eventually run by an interpreter or the operating system. But in fact arrows are more general than monads, and in particular the composition of signal functions does not have to be completely linear

Specifically Arrow contains a fixedpoint combinator loop which provides access to recursion.

The basic combinators include.

lift function: arr :: ( a -> b ) -> SF a b: Takes a normal function from type 'a' to 'b', and returns a signal-function/transformer that will convert a signal of type 'Signal a' to 'Signal b'.
compose: (>>>) :: SF a b -> SF b c -> SF a c: Takes two signal functions and returns the composition of them
tuping: (&&&) :: SF a b -> SF a c -> SF a (b, c): This permits you to take one signal and apply two functions to it similtaniously producing two signals both independently derived from the input. And of course, once you have tupled outputs....
tuple-transform: (***) :: SF a b -> SF c d -> SF (a c) (b d): Take two signal functions and returns a transformer that applies them to the respective signals in a tupled signal.

Now arr, (>>>), and (&&&) form a minimal universal set. However to fit the Arrow framework defined by Hughes, we use the original minimal set defined as arr, (>>>), and first (which only takes one argument, and applies it to the first signal in the tuple, passing the second unchanged).

Some standard combinators provided:


identity :: SF a a
constant :: b -> SF a b
time :: SF a Time

Also the standard arrow combinators (remember, when thinking of FRP, just substitute 'SF' for 'a' in the type-sigs):



arr :: Arrow a => (b -> c) -> a b c
arr2 :: Arrow a => (b -> c -> d) -> a (b, c) d
(>>>) :: Arrow a => a b c -> a c d -> a b d
(<<<) :: Arrow a => a c d -> a b c -> a b d
first :: Arrow a => a b c -> a (b, d) (c, d)
second :: Arrow a => a b c -> a (d, b) (d, c)
(***) :: Arrow a => a b c -> a b' c' ->a (b, b') (c, c')
(&&&) :: Arrow a => a b c -> a b c' -> a b (c, c')
loop :: Arrow a => a (b, d) (c, d) -> a b c

Note that with loop, 'd' provides a channel for feedback, and hence recursion. Notice also the strong streamish feel to the api. I would like to see a Arrow based treatment of the unix-pipe concept. I think it could result in a powerful scripting framework. Also worth noting that some signal functions are stateful (integral is one example), and as such cannot be defined without access to the underlying signals (forbidden). Hence they must either be pre-defined, or defined as a combination of pre-defined stateful signal functions.

So now convert the original FRP expression for x(t) to raw arrow-form:

x = (1/2) * integral ((vr + vl) * cos theta)

given: vrSF :: SF RobotInput Velocity  (SF that extracts vr from a RobotInput)
   and vlSF :: SF RobotInput Velocity  (SF that extracts vl from a RobotInput)
   and thetaSF :: SF RobotInput Angle

xSF :: SF RobotInput Distance   -- Signal Function transforming RobotInput -> Distance
xSF = (((vrSF && vlSF) >>> arr2 (+)) &&& (thetaSF >>> arr cos)) >>> arr2 (*) >>> integral >>> arr (/2)
  or slightly more legible:
xSF = let v = (vrSF &&& vlSF) >>> arr2 (+)
          t = thetaSF >>> arr cos
      in (v &&& t) >>> arr2 (*) >>> integral >>> arr (/2)
  which isn't all that much better.

So what we have gained above is the ability to mediate access to signals behind a reasonably safe, yet complete abstraction. However the loss in clarity is striking. Fortunately there is a proposed arrow-syntax that is similar to Haskell's monadic do-syntax.

xSF' :: SF RobotInput Distance
xSF' = proc inp -> do
  vr    <- vrSF     -< inp
  vl    <- vlSF     -< inp
  theta <- thetaSF  -< inp
  i     <- integral -< (vr + vl) * cos theta
  returnA -< (i / 2)

Which while not as nice as the original FRP, has the triple advantage of being declarative, legible, and safe!

It is important to note that the arrow syntax allows one to get a handle on a signal's values (or samples), but not on the signals themselves. In other words, first recalling that a signal function SF a can be thought of as a type Signal a -> Signal b, which in turn can be thought of as type (Time -> a) -> (Time -> b), the syntax allows getting a handle on values of type a and b, but not on values of type Time -> a or Time -> b.

Although I can't reproduce it here, Fig 2 is dramatic. The correspondance between arrow-based signal-functions, and the signal-flow-diagrams shown is striking, and exciting. I have been excited about the potential of FRP to liberate UI programming from the tyranny of the state-machine; Fig 2 amazed at the clarity available. I am now, more than ever, convinced that I am on the right track.

Events diverge from Signals slightly in that while they are also temporal values, event's don't so much change from one value to another over time, as simply not exist between 'impulses' (in the signal processing sense). In fact their is an isomorphism between impulse-streams vs. continuous-signals and event-streams vs. signals. In previous papers the authors accepted this distinction as signifigant, and the two were treated seperately. However by noting an isomorphism between Event a and Maybe a they are able to define a signal of type Signal (Event b) ie. an event stream, which becomes a signal that can be thought of as alternating between Nothing and Just a. A signal function of type SF a (Event b) generates an event stream, and is called an event source.

There are a number of combinators for use with event-streams. I'm just going to list them with minimal discussion, for more than that see the paper.

switch, dSwitch :: SF a (b, Event c) -> (c -> SF a b) -> SF a b  
        -- switch or delayed-switch.  Note the function from the event payload to a new SF a b.

rSwitch, drSwitch :: SF a b -> SF (a, Event (SF a b)) b 
        -- note event is defined to carry an SF (possibly inserted by tag), this allows for recurring switch.

tag :: Event a -> b -> Event b

mergeBy :: (a -> a -> a) -> Event a -> Event a -> Event a
lMerge, rMerge :: Event a -> Event a -> Event a
  -- Different ways of handing event collision.

-- Useful event-sources:

edge :: SF Bool (Event ())
  -- Obtain an edge trigged event-stream from a (Signal Boolean)

never :: SF a (Event b)
now :: b -> SF a (Event b)
after :: Time -> b -> SF a (Event b)
repeatably :: Time -> b -> SF a (Event b)

-- Useful event-stream transform back to signal space:

hold :: a -> SF (Event a) a
accum :: a -> SF (Event (a -> a)) (Event a)

accumHold :: a -> SF (Event (a -> a)) a
accumHold init = accum init >>> hold init

Finally just to show how it goes together, the paper implements a few robot controllers. First you implement your robot's controller as a signal function from RobotInput to RobotOutput. So a FRP mapping from the robots sensor-signals to signals which the signal-function-processor will apply to the robot's actuators. This controller is actually parametised by the robots properties, but that isn't surprising:

type RobotController = RobotProperties -> SF RobotInput RobotOutput.

Finally the FRP engine is implemented as a function from a controller to IO(). In the case of the RobotSimulator implemented for the paper, the engine also takes a second controller and a world description.

runSim :: Maybe WorldTemplate -> RobotController -> RobotController -> IO ()

Of course, returning the IO monad, this can be trivially instansated as as Haskell main function.

All in all a fascinating paper --- well worth the time to read.

Thursday, November 11, 2004

Arrows, Robots, and Functional Reactive Programming --- Prelude

Now this is a fun paper. I mean, Category Theory, FRP, and you get to play with Robots! What more could you want? So lets take these three elements individually, and then I will try and summarise the paper.

Category Theory

My understanding of Category Theory is as a systematic treatment of abstraction and generalisation. It allows us to take collections of mathmatical proofs and theorms that have the same structure/form, abstract them into a categorical notation, and then generalise them by providing mappings from the categorial constructs to concepts in specific domains. This is naturally of signifigant interest to type-theorists who treat programs as proofs and want access to the power of parametic/polymorphic types and functions without sacrificing rigor. Arrows are a concept from category theory, a generalisation of Monads which beyond just lifting sequencing to a first-class object, allow parametised composition of sequencing. This gives us the ability to support the more complex temporal sequencing of operations required by...

Functional Reactive Programming

Or as that is a bit of a mouthful, FRP; is one of the most intuitive and widespread programming paradigms in use today. It is an approach to programming that has been made so unobtrusive that four of the five other members of my family use it (in the form of a spreadsheet) without realising they are programming. At a basic level, FRP introduces a new paramatised datatype into the language, the time-varying 'signal'. By then permitting functions to be defined that accept these signals and return time-varying signals, you eliminate the shared-state, duplicated-logic, fragmented code paths, and explicit state-machines spawned by current, traditional event-driven approaches. As a basis for comparison, just imagine comparing a traditional spreadsheet with one where to implement a formula for a cell you would have to attach an 'onChange' callback to each referenced cell that recalculated the value.

Robots

Robots have a number of constraints that have not been traditionally considered strengths of functional programming.

Large numbers of discrete and continuous time-varying inputs that are most definately *note* referentially transparent.
Real-time constraints that impose a requirement to avoid time-leaks, which encourages the use of imperative operation sequences. Execution of which can be easilly suspended and resumed as higher priority tasks require resources.
Extensive use of embedded controllers and limited compuational resources that place a premium on transparency of resource use and strict avoidance of space-leaks.
A history of research largely based on low level system languages (ie MIML) and assembly, which makes for a hostile reception for many of the assumptions of higher-level functional programming

Of course FRP is a natural fit to the first problem. However it has traditionally done so by trading off certainty and predictability in both time and space analysis. This paper attempts to address these two problems in order to allow the exploitation of FRP's natural (theoretical) fit to robotics to be exploited.

Bananas, Lenses, Envelopes, and Barbed Wire --- Finale

Finished the Bananas and Lenses paper on the weekend, along with another on FRP that I'll summarise later. Section 4 takes the previous work based on Lists from section 2 and combines it with the category-theory results from section 3 to develop a theory of cata/ana/hylo/para -morphisms over arbitary datatypes. Section 5 is a very brief extension to parametric types.

Unfortunately I simply don't have enough category theory to follow the discussion properly, let alone explain it here. I think it is probably time to make another visit to Amazon and pick up a book or two on category theory. I'm thinking probably Pierce's introduction, and possibly Crole's "Categories for Types" on the basis of the review there. I've already read Pierce's intro, and it is mercifully brief and yet even from a quick reading I found I remembered enough to follow fragments of the paper. Unfortunately I found it too brief to allow me to learn enough to follow the rest :/.

Anyway, this paper was just for fun; fun was had; and I come away with a fresh perspective on recursion. That counts as a win for me.