Monday, January 23, 2006

Loops Considered Harmful (personal reprise)

Paul Gearon posted a short rant on Java's verbosity and general clunkiness. The problem is to write a function that takes a string and returns a string containing the distinct characters of that string in sorted order. While Paul does provide a solution in Java, he only alludes to the sort of function you might expect to write. So I thought I might have a go at the problem:

Assuming we have a string type, and a sorted-set in our standard library:

(lambda (str) (reverse-list->string
                (sorted-set-fold (flip cons) '()
                     (string-fold insert (sorted-set) str))))
Note: The reverse reduces the complexity from O(n^2 log m) to O(n log m)
3 lines. No mess, no fuss.
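For comparison, here is a rough equivalent in Python (a sketch of the same idea, not Paul's Java; the function name is mine):

```python
def distinct_sorted(s):
    # set() keeps the distinct characters; sorted() orders them;
    # join() turns the result back into a string.
    return "".join(sorted(set(s)))

print(distinct_sorted("mississippi"))  # → "imps"
```

One line of substance, for much the same reason the Scheme version is three: the standard library already provides sets, sorting, and folds.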

Tuesday, January 10, 2006

Trouble in Kowari-land

Times of transition are always hard, and cultures do not always mix without conflict. The past year has been a time of transition for Kowari, as we have had to move from a close-knit circle of developers mostly working in the same room, to a distributed open-source community. It has of course been complicated by the need to try and reach an understanding with the new copyright owners of much of the Kowari codebase, Northrop Grumman.

Yesterday I had the unpleasant duty of accepting the resignation of a good friend, colleague, and co-developer/admin from the Kowari project after he received legal threats from Northrop. I remember watching these sorts of tussles go down in my years within the OSS community. I have always felt sympathy for those volunteers who find themselves in the way of corporate lawyers. Now as the maintainer who has to sit and wait for NG's response I realise just how important the freedom in free-software really is. The MPL 1.1 is now the only thing standing between me and the loss of the Kowari project.

It's not really surprising that this story is starting to find its way onto the blogosphere. I would like to thank the SOFA community for their note of support. In response to Clark Parsia's questions about the legal status of Kowari:

  1. The copyright in the vast majority of Kowari as of Jan 2005, and consequently the majority of Kowari now, was owned by Tucana Technologies, and hence now by NG. However the majority of the work done in 2005 was done by parties unassociated with NG, and this work is copyrighted by the authors (or in some cases the parties that paid for the work). Of particular note is the work done by Paul Gearon on the krule engine, and the work commissioned by the DSTO on the query rewriting. The former particularly is perfectly capable of independent existence, and while it currently lives in the Kowari source tree it is not derivative.
  2. There have been no releases since Tucana went insolvent. It took 6 months to reorganise after Tucana's fold, and another 3-4 months to prepare for release. We were ready for release in November, however held off at the explicit (and inconvenient) request of NG in an effort to support their integration into the community. I should be clear here: the 9th of January release date was requested by Northrop, and acceded to by the Kowari administrators as a sign of good faith. We had originally planned to release 1.1 in November.
  3. IANAL, but my understanding of the MPL --- and my personal first-hand knowledge of the express intent of Tucana when the MPL was chosen as a license --- is that anyone who cares to has the right to take a snapshot of the sourceforge repository; label it MyKowariRelease1.1; and release it to the public. And indeed if open-source means anything, it means the right to fork. However at the present point in time, I am the administrator of the Kowari project, and Northrop have the same right to fork from Kowari as anyone else does.
As is clear in Clark Parsia's blog post above, this uncertainty is hurting Kowari, so I will do whatever I can to ensure it is resolved promptly. All I can ask is to please be patient, and let's see if we can't help Northrop to become a productive member of the open-source community.

Friday, December 23, 2005

Humourous post on ocaml-list

Ok so the fact that I found this hilarious probably suggests I need my head examined... I still thought I'd share it though.

From: hars@bik-gmbh.de
Subject: Re: [Caml-list] Camlp4 in a Unix pipe
Message-Id: <4383605A.7030309@bik-gmbh.de>

Alessandro Baretta wrote:
Cool command line! How did you discover the '-' option? 

Out of habit? In the Unix shell monad, "foo >>= bar" happens to be written
as "foo | bar -" ("return" is written "cat", in case you wondered. So "Hey,
that is a useless use of cat!" really means 'Hey, you ignored the monad laws!")

Yours, Florian (time to go home, I know).
So yeah at this point I probably have to concede I'm strange :).

Friday, December 16, 2005

Alternative Logics

Found via Google Blogs: A Neighbourhood of Infinity: Alternative Logic.

An interesting read.

Google Blogging and an interesting assertion.

Read Paul's post on the latest addition to Google Labs. Installed Google Blogger, and WOW this is cool. Naturally first stop was to check out etymon (disappointing), and second to visit sites I visit regularly. Stumbled upon this quote recounted by Martin Fowler; would love to see some measurements.

While I'm on the topic of concurrency I should mention my far too brief chat with Doug Lea. He commented that multi-threaded Java these days far outperforms C, due to its memory management and garbage collector. If I recall correctly he said "only 12 times faster than C means you haven't started optimizing".

Tuesday, December 13, 2005

Reading List Reprise

When I started this blog I posted my reading list. Having recently finished Types and Programming Languages, I thought it was probably time to revisit the list.

Essentials of Programming Languages, Daniel Friedman
I finished this early in the year. Unfortunately I never did finish working the examples beyond the first couple of chapters. There is some very interesting material in this book, particularly the last couple of chapters.
Types and Programming Languages, Benjamin Pierce
I recently finished this (back in October), and loved it. This book is illuminating, and it definitely achieved its goal. This book is a superb example of how to introduce a reader to a field, and provide the necessary preparation for accessing further works.
Introduction to Graph Theory, Richard J. Trudeau
Read, and yes it was a fun book, but unfortunately it never really reached the depth I needed for it to be immediately applicable. What I received from this book was mostly vocab.
Making reliable distributed systems in the presence of software errors, Joe Armstrong
I know Greg was annoyed by some trivial errors in some of the example code; however I am very glad I've read this thesis. This is the closest I have seen to practical self-repairing systems. A gentle and easy read, it lays out a practitioner's framework for designing reliable systems. Highly recommended.
The Polyadic Pi-Calculus: A Tutorial, Robin Milner
Never did finish reading this as I ended up borrowing Milner's book and reading it instead. I'm curious to know if there has been any work on the relationship between pi-calculus processes and continuations.
Genuinely Functional User Interfaces, Antony Courtney, Conal Elliott
As I said this is my primary area of interest, and it remains the long-term focus of my efforts. This paper remains a core influence on my interest in Type-Theory, Category Theory, and Language Semantics.
Arrows, Robots, and Functional Reactive Programming, Paul Hudak, Henrik Nilsson, Antony Courtney, John Peterson
Oops, that's right, I was going to read this one... I'll have to carry this one over to my new reading list...
Other works read worthy of note include:
Purely Functional Data Structures, Chris Okasaki
This book stands alongside my copies of Cormen, Date, Brooks, and K&R without shame. Only a year old and already starting to wear from use. Personally I consider it essential reading for anyone serious about programming. I am seriously considering buying a second copy so I can lend one without losing it from my library. This book is important enough that I will be posting separately about it.
Other things read and recommended:
  • I reread Fred Brooks, The Mythical Man-Month
  • I reread Kernighan and Pike, The Practice of Programming
  • I'm reading selected chapters of Benjamin Pierce, Advanced Topics in Types and Programming Languages, which is proving to be even better than TAPL due to its focus on more practical issues (although it does rather assume you have at least a basic grasp of the theory in TAPL :)
Other things of interest include:
  • I'm halfway through Joseph Landin, An Introduction to Algebraic Structures
  • Several papers on Core Erlang (I'll track down citations if anyone is interested)
  • Various standards including the RDF stack, and the XMPP RFCs
  • Started Roy Crole, Categories for Types

Monday, December 12, 2005

Found on Lambda
... I'm guessing you won't like my new Redneck Scheme dialect, in which call-with-current-continuation is renamed y'all-come-back-soon-now-y'hear?... Anton van Straaten

Wednesday, December 07, 2005

Kowari insert/select and blank-nodes.

I was asked recently to justify Kowari's decision to maintain blank-node identity across an insert/select command. My response was along the lines of treating Kowari as a 4-graph partitionable as rdf-compliant 3-graphs (actually technically a superset as we don't restrict the domains of subjects and predicates as rdf does). At this point the entire 4-graph is within the scope of the existential, and consequently it is valid to retain blank-node identity between the rdf-partitions (Kowari models).

However it did send me back to the RDF-SEMANTICS spec again where I found the following in 1.5. Blank Nodes as Existential Variables

This effectively treats all blank nodes as having the same meaning as existentially quantified variables in the RDF graph in which they occur, and which have the scope of the entire graph. In terms of the N-Triples syntax, this amounts to the convention that would place the quantifiers just outside, or at the outer edge of, the N-Triples document corresponding to the graph. This in turn means that there is a subtle but important distinction in meaning between the operation of forming the union of two graphs and that of forming the merge. The simple union of two graphs corresponds to the conjunction ( 'and' ) of all the triples in the graphs, maintaining the identity of any blank nodes which occur in both graphs. This is appropriate when the information in the graphs comes from a single source, or where one is derived from the other by means of some valid inference process, as for example when applying an inference rule to add a triple to a graph. Merging two graphs treats the blank nodes in each graph as being existentially quantified in that graph, so that no blank node from one graph is allowed to stray into the scope of the other graph's surrounding quantifier. This is appropriate when the graphs come from different sources and there is no justification for assuming that a blank node in one refers to the same entity as any blank node in the other.
(emphasis added)

Later in 2. Simple Entailment between RDF graphs we come across the Merging Lemma
Merging lemma. The merge of a set S of RDF graphs is entailed by S, and entails every member of S. [Proof] This means that a set of graphs can be treated as equivalent to its merge, i.e. a single graph, as far as the model theory is concerned. This can be used to simplify the terminology somewhat: for example, the definition of S entails E, above, can be paraphrased by saying that S entails E when every interpretation which satisfies S also satisfies E. The example given in section 1.5 shows that it is not the case, in general, that the simple union of a set of graphs is entailed by the set. The main result for simple RDF inference is:
This is relevant because an alternative view of Kowari is as a set of named graphs.

Last time I read this document was when I first started with Tucana, and hadn't heard of RDF before. It is amazing how much more understandable the rdf-semantics document is once you've spent some time working with RDF.

Wednesday, August 17, 2005

Persistent Datastructures

Many moons ago I wrote a short tutorial on pthreads. It was adopted (with permission) by a number of universities to provide a brief introduction for their students doing pthread based coursework. It strikes me as bizarre, but the tutorial remains the top google hit for 'pthreads tutorial'. Consequently I periodically get requests for help with threading issues.

When I wrote the tutorial I was still enamoured with threads, an infatuation that didn't last very long :). Since then, my experience working with pthreads, auto-parallelising compilers, java, erlang, kowari, and 10 years of professional programming, have fundamentally changed my perspective on how to approach concurrency.

  1. Concurrency bad
  2. See 1
  3. Shared-memory concurrency very very bad
  4. see 3
However we can't always avoid concurrency so:
  • Process isolation good
  • Message-Passing/Rendezvous good
  • Co-routine/Continuation good
Again, sometimes we can't avoid shared-memory concurrency if we want to make good use of some modern cpu architectures so:
  • Locks are bad - you've accepted all the pain and suffering associated with shared-memory concurrency, and now you're going to SERIALISE operations!!

Assuming you are stuck using shared-memory threads, what do you do now? Well first hope like hell you have a garbage collector. If you don't it is going to hurt enough I would suggest you give up now and go back and reconsider the reasons why you discounted the other options above -- anything is better than this. If you do have a garbage collector, start reading up on persistent datastructures.

Persistent datastructures are very important in safe efficient thread-based design because they don't require locking. They are called persistent because they are immutable. Once you have a reference to a structure that structure is guaranteed const. What this means is that any method that would mutate an ephemeral (ie, what you are used to) structure returns a new structure.

As an example consider a list

x = []              | x = null 
y = x.prepend(1)    | y = { 1, x }
z = y.prepend(2)    | z = { 2, y }
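The same idea as a minimal Python sketch (the class and function names are mine, purely for illustration):

```python
class PList:
    """A persistent singly-linked list node: prepend returns a new list,
    never mutating the old one."""
    __slots__ = ("head", "tail")

    def __init__(self, head=None, tail=None):
        self.head, self.tail = head, tail

EMPTY = PList()  # the shared empty list

def prepend(lst, value):
    # The old list becomes the tail of the new node: O(1), no copying,
    # and every previous reference still sees its original list.
    return PList(value, lst)

x = EMPTY            # x = []
y = prepend(x, 1)    # y = { 1, x }
z = prepend(y, 2)    # z = { 2, y }
assert z.head == 2 and z.tail is y and y.tail is x
```

Note the structural sharing: z does not copy y, it points at it, which is exactly why no locking is needed once a reader holds a reference.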
Now if you need the rest of your program to see the updated list you use some sort of reference cell. The simplest is just a one element queue. This is a classic reader-writer problem; except you don't have to worry about evicting readers before the writer does his thing.
RefCell:
  init(T data);   // Initialises the cell with data.  All subsequent access to data must originate with this cell.
  T read();       // Returns the current value of data (internally access is protected by the 'access lock').
  T writeLock();  // Obtains the 'write lock' (or blocks), and returns the current value of data.
  unLock(T data); // Updates the current value of data under the 'access lock', and releases the 'write lock'.
Note there is almost no contention for readers, no book-keeping, and once a reader has a reference to data, all further access to the structure is lock-free.
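Here is one way the RefCell interface above might be sketched in Python, assuming the stored value is itself persistent; the two locks mirror the pseudocode's 'access lock' and 'write lock':

```python
import threading

class RefCell:
    """A one-element reference cell over a persistent value.
    Readers only briefly take the access lock to copy the reference;
    writers serialise among themselves on the write lock."""

    def __init__(self, data):
        self._access = threading.Lock()  # protects the reference itself
        self._write = threading.Lock()   # held by the (single) writer
        self._data = data

    def read(self):
        with self._access:               # brief: just copies the reference
            return self._data

    def write_lock(self):
        self._write.acquire()            # blocks other writers
        return self.read()               # current value, to derive the update from

    def unlock(self, data):
        with self._access:
            self._data = data            # atomically publish the new structure
        self._write.release()

cell = RefCell((1, 2))                   # tuples standing in for a persistent structure
old = cell.write_lock()
cell.unlock(old + (3,))                  # "mutation" = publish a derived structure
assert cell.read() == (1, 2, 3)
```

Once read() returns, the caller holds an immutable structure and needs no further locking, which is the whole point.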

However take another look at it. Step back and think about what this is and you will realise that what we have written here is a transaction-based datastore. What this means is that you can now go examine all your transaction management theory and introduce multiple-writers, lock-stealing, and the rest. Now as long as T is persistent, you are atomic, consistent, and isolated. If you ensure unLock flushes the structure to disk you now have an ACID datastore (surprised? :). Add consistency checks to the mutation operations and you can call it a database :).

Granted, without a garbage collector, memory management is a bitch. The standard reference on persistent structures is Chris Okasaki. A quick google will get you a fair way; however citeseer is the motherlode. If you have access to a university library, I would look up his book (Purely Functional Data Structures), which will save you a lot of searching on citeseer :).

It is also worth noting that once you leave the ephemeral space, you need to start making decisions beyond the simple "Vector/List/Hash/Tree". There are a lot more tradeoffs to consider. Do you really need the efficient random-access of a vector, or are you actually using it as a queue, or deque? A persistent queue can be implemented with O(1) amortized operations, but the best I have been able to find (quickly) for a true 'vector' is O(log i) access (i = index) [citeseer]. Just as importantly, a persistent queue is very simple, almost identical in fact to an efficient ephemeral queue; whereas a persistent random-access-list is much more complicated than a vector.
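To show just how simple a persistent queue can be, here is the classic two-list (Okasaki-style) queue sketched in Python, using tuples as immutable lists (function names are mine):

```python
# Okasaki's two-list queue: 'front' holds elements ready to dequeue,
# 'back' holds newly enqueued elements in reverse order.  Reversing back
# into front only when front empties gives amortised O(1) per operation.

def empty():
    return ((), ())

def enqueue(q, x):
    front, back = q
    return (front, (x,) + back)          # O(1): prepend to back, share front

def dequeue(q):
    front, back = q
    if not front:
        front, back = tuple(reversed(back)), ()  # occasional O(n) rotation
    return front[0], (front[1:], back)   # (value, remaining queue)

q = empty()
for i in (1, 2, 3):
    q = enqueue(q, i)
x, q = dequeue(q)
assert x == 1   # FIFO order preserved, old versions of q still valid
```

Every intermediate version of the queue remains usable after further operations, which is the defining property of persistence.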

Saturday, June 04, 2005

Using Jabber as a communications framework

Adrian Sutton asks about using Jabber as a communications framework for developing a bug-tracking/QA application. Two items immediately spring to mind.

I haven't had any experience with jabber except as a casual user, however these are projects that have caught my eye as things I would like to investigate when/if I ever have time. So Adrian, if you do check them out, let me know how you go :).

Wednesday, May 25, 2005

Silly old lady

There was an old lady who swallowed a bird;
How absurd, to swallow a bird!
She swallowed the bird to catch the spider
That wriggled and jiggled and wiggled inside her.
She swallowed the spider to catch the fly.
But I dunno why she swallowed that fly -
Perhaps she'll die

Thursday, January 13, 2005

Long Time no See

I realise it has been almost two months since I last blogged, I can only plead two excuses for this lapse. The first reason is that I have met an incredible woman and have dedicated a substantial amount of time recently in getting to know her better. The second is that on the 26th of December, Tucana Technologies (my employer) closed its doors. This is an incredible blow as without reservation, this was the best job I have ever had. I can only hope that I will be able to find something half as involving, challenging, and inspiring as my work on Kowari and RDF. Anyway, unless I should suddenly find myself with employment, I should be able to address some of the unfinished posts I have lined up for this blog over the next week or two.

On that note, I have spent some time trying to install GHC under OSX 10.2, unfortunately the only install I have been able to find and install thus far is for 10.3. If anyone out there knows of a package or install method for 10.2 it would be much appreciated.

Finally if anyone is interested in employing a Senior Computer Systems Engineer with a strong background in metadata, RDF, and in the use and design of more languages than you can poke a stick at, feel free to contact me on andrae@humbug.org.au; CV available on request.

Thursday, November 25, 2004

Well worth the time.

Went back and watched two of the Little Languages 2 Workshop presentations. Specifically Joe Armstrong's and Todd Proebsting's presentations on "Concurrency Orientated Programming" and "Disruptive Programming Language Technologies" respectively. I highly recommend both these talks to anyone who is the slightest bit interested in programming languages, or has any serious involvement in development. It is just unfortunate that they are only available as RealVideo. Still, if you have RealVideo installed, they are both well worth the time.

Note these are links to the two sessions of the workshop. It's convenient that both my favourite talks from this workshop happen to be first in each session.

Tuesday, November 23, 2004

The Theory of Classification.

Andrew Newman came across this set of articles on OO theory. I stumbled across them about a year ago, read the first few, changed jobs, and promptly forgot to go find them again. I remember them as well worth the time to read.

This is the first article in a regular series on object-oriented type theory, aimed specifically at non-theoreticians. The object-oriented notion of classification has for long been a fascinating issue for type theory, chiefly because no other programming paradigm has so sought to establish systematic sets of relationships between all of its types. Over the series, we shall seek to find answers to questions such as: What is the difference between a type and a class? What do we mean by the notion of plug-in compatibility? What is the difference between subtyping and subclassing? Can we explain inheritance, method combination and template instantiation? Along the way, we shall survey a number of different historical approaches, such as subtyping, F-bounds, matching and state machines and seek to show how these models explain the differences in the behaviour of popular object-oriented languages such as Java, C++, Smalltalk and Eiffel. The series is titled "The Theory of Classification", because we believe that all of these concepts can be united in a single theoretical model, demonstrating that the object-oriented notion of class is a first-class mathematical concept!
Anyway, the links:
"The Theory of Classification"
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15.

Monday, November 15, 2004

Links found.

Just noting that I have updated the Arrows 'n Robots summary with links to citeseer and Wan's thesis page on the Yale-group pages. Also wanted to dump the url to the Yale group here so next time I can find it more easily.

http://haskell.cs.yale.edu/yale/ http://haskell.cs.yale.edu/yale/publications.html

Category Theory

I have had a few people ask me to explain the utility and significance of category theory. This proves rather difficult given I am a complete novice in this area myself; desperately trying to keep my head above water in the simplest of papers. Still I found this paragraph useful, and I decided to post it here so I can find it again.

Category theory unifies mathematical structures in a second, perhaps even more important, manner. Once a type of structure has been defined, it quickly becomes imperative to determine how new structures can be constructed out of the given one and how given structures can be decomposed into more elementary substructures. For instance, given two sets A and B, set theory allows us to construct their cartesian product A X B. For an example of the second sort, given a finite abelian group, it can be decomposed into a product of some of its subgroups. In both cases, it is necessary to know how structures of a certain kind combine. The nature of these combinations might appear to be considerably different when looked at from too close. Category theory reveals that many of these constructions are in fact special cases of objects in a category with what is called a "universal property". Indeed, from a categorical point of view, a set-theoretical cartesian product, a direct product of groups, a direct product of abelian groups, a product of topological spaces and a conjunction of propositions in a deductive system are all instances of a categorical concept: the categorical product. What characterizes the latter is a universal property. Formally, a product for two objects a and b in a category C is an object c of C together with two morphisms, called the projections, p: c → a and q: c → b such that, and this is the universal property, for all object d with morphisms f: d → a and g: d → b, there is a unique morphism h: d → c such that p o h = f and q o h = g. Notice that we have defined a product for a and b and not the product for a and b. Indeed, products and, in fact, every object with a universal property, are defined up to (a unique) isomorphism. Thus, in category theory, the nature of the elements constituting a certain construction is irrelevant. 
What matters is the way an object is related to the other objects of the category, that is, the morphisms going in and the morphisms going out, or, put differently, how certain structures can be mapped into it and how it can map its structure into other structures of the same kind.
Jean-Pierre Marquis, Stanford Encyclopedia of Philosophy: Category Theory

Friday, November 12, 2004

Arrows, Robots, and Functional Reactive Programming --- Summary

First some background reading from the bibliography I'll have to read eventually:

  • John Hughes. Generalising monads to arrows. Science of Computer Programming, 37:67-111, May 2000
  • Ross Paterson. A Notation for Arrow-based Libraries. In ICFP'01: International Conference on Functional Programming, pages 229-240, Firenze, Italy, 2001
  • Zhanyong Wan. Functional Reactive Programming for Real-Time Embedded Systems. PhD thesis, Department of Computer Science, Yale University, December 2002
  • Zhanyong Wan, Walid Taha, and Paul Hudak. Real-time FRP. In Proceedings of Sixth ACM SIGPLAN '00 Conference on Programming Language Design and Implementation (PLDI), pages 242-252, Vancouver, BC, Canada, June 2000. ACM, ACM Press.
  • Zhanyong Wan, Walid Taha, and Paul Hudak. Event-driven FRP. In Proceedings of Fourth International Symposium on Practical Aspects of Declarative Languages. ACM, Jan 2002

Naturally when you are using FRP the core concept is that of the signal. So we define signal to have the type Signal a = Time -> a, ie a function of Time returning a value of type 'a'; s(t). The robot used throughout the paper is a simple differential drive with a handful of basic sensors. So given two signals vr(t) and vl(t) (vr :: Signal Double and vl :: Signal Double), which are the velocities of the right and left wheels respectively, we can write a reactive program for calculating the robot's position:

x     = (1 / 2) * integral ((vr + vl) * cos theta)
y     = (1 / 2) * integral ((vr + vl) * sin theta)
theta = (1 / l) * integral (vr - vl)
Of course the nice thing about this program is that it is a direct transliteration of the physics equations governing the motion of such a robot! This is why FRP is important.
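To make the correspondence with the physics concrete, here is a rough Euler-integration of these differential-drive equations in Python. This is a numerical sketch of the maths, not FRP, and the function names are mine:

```python
import math

def simulate(vr, vl, l, dt, steps):
    """Euler-integrate the differential-drive equations:
       theta' = (vr - vl) / l
       x'     = (vr + vl)/2 * cos(theta)
       y'     = (vr + vl)/2 * sin(theta)
    vr and vl are signals: functions of time returning wheel velocities."""
    x = y = theta = 0.0
    for k in range(steps):
        t = k * dt
        r, s = vr(t), vl(t)
        x += 0.5 * (r + s) * math.cos(theta) * dt
        y += 0.5 * (r + s) * math.sin(theta) * dt
        theta += (r - s) / l * dt
    return x, y, theta

# Equal wheel speeds: the robot drives straight along the x-axis.
x, y, theta = simulate(lambda t: 1.0, lambda t: 1.0, l=0.2, dt=0.01, steps=100)
assert abs(x - 1.0) < 1e-6 and abs(y) < 1e-9 and theta == 0.0
```

What FRP buys you is that the program stays at the level of the three integral equations; the stepping loop above is exactly the plumbing the FRP engine hides.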

Such programs are clear, concise, correct, and unfortunately completely unworkable in the general case. The problem is that while in the small (with simple toy examples) you couldn't get better than this, larger programs tend to lead to opaque space and time management with the resultant space and time leaks. The focus of this paper is how to retain most of the legibility and good domain match of FRP while preventing leaks. The FRP DSL developed in this paper is called Yampa, and it solves the leak problem by disallowing signals as first-class values. Instead signals are manipulated indirectly via a set of combinators called signal transformers or signal functions.

SF a b = Signal a -> Signal b
This of course is very similar to the Haskell IO or State monads, and reminiscent of Hutton's Monadic Parser Combinators. On the other hand, as presaged by the title, the authors use Arrows rather than Monads as the basis for their combinator library.
A good analogy for this idea is a state or IO monad, where the state is hidden, and a program consists of a linear sequencing of actions that are eventually run by an interpreter or the operating system. But in fact arrows are more general than monads, and in particular the composition of signal functions does not have to be completely linear
Specifically Arrow contains a fixed-point combinator loop which provides access to recursion.

The basic combinators include:

lift function: arr :: ( a -> b ) -> SF a b
Takes a normal function from type 'a' to 'b', and returns a signal-function/transformer that will convert a signal of type 'Signal a' to 'Signal b'.
compose: (>>>) :: SF a b -> SF b c -> SF a c
Takes two signal functions and returns the composition of them
tupling: (&&&) :: SF a b -> SF a c -> SF a (b, c)
This permits you to take one signal and apply two functions to it simultaneously, producing two signals both independently derived from the input. And of course, once you have tupled outputs....
tuple-transform: (***) :: SF a b -> SF c d -> SF (a, c) (b, d)
Take two signal functions and returns a transformer that applies them to the respective signals in a tupled signal.
Now arr, (>>>), and (&&&) form a minimal universal set. However to fit the Arrow framework defined by Hughes, we use the original minimal set defined as arr, (>>>), and first (which only takes one argument, and applies it to the first signal in the tuple, passing the second unchanged).
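To get a feel for these combinators, here is a toy Python model of signal functions as ordinary functions from Signal a to Signal b, with signals as functions of time. This is purely illustrative (Yampa's real implementation is nothing like this, precisely because naive signals leak):

```python
# A signal is a function of time; a signal function transforms signals.

def arr(f):
    """Lift a plain function a -> b to a signal function SF a b."""
    return lambda signal: (lambda t: f(signal(t)))

def compose(sf1, sf2):
    """(>>>): feed sf1's output signal into sf2."""
    return lambda signal: sf2(sf1(signal))

def fanout(sf1, sf2):
    """(&&&): apply both SFs to the same input signal, tupling the outputs."""
    return lambda signal: (lambda t: (sf1(signal)(t), sf2(signal)(t)))

def first(sf):
    """first: apply sf to the first component of a pair-signal, pass the rest."""
    return lambda signal: (lambda t: (sf(lambda u: signal(u)[0])(t), signal(t)[1]))

double = arr(lambda v: 2 * v)
inc = arr(lambda v: v + 1)
sig = lambda t: t                        # the identity signal: s(t) = t

assert compose(double, inc)(sig)(3) == 7        # double >>> inc
assert fanout(double, inc)(sig)(3) == (6, 4)    # double &&& inc
```

Note that fanout is definable from first, compose, and arr, which is why Hughes' minimal set is {arr, (>>>), first}.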

Some standard combinators provided:

  • identity :: SF a a
  • constant :: b -> SF a b
  • time :: SF a Time
Also the standard arrow combinators (remember, when thinking of FRP, just substitute 'SF' for 'a' in the type-sigs):
  • arr :: Arrow a => (b -> c) -> a b c
  • arr2 :: Arrow a => (b -> c -> d) -> a (b, c) d
  • (>>>) :: Arrow a => a b c -> a c d -> a b d
  • (<<<) :: Arrow a => a c d -> a b c -> a b d
  • first :: Arrow a => a b c -> a (b, d) (c, d)
  • second :: Arrow a => a b c -> a (d, b) (d, c)
  • (***) :: Arrow a => a b c -> a b' c' ->a (b, b') (c, c')
  • (&&&) :: Arrow a => a b c -> a b c' -> a b (c, c')
  • loop :: Arrow a => a (b, d) (c, d) -> a b c
Note that with loop, 'd' provides a channel for feedback, and hence recursion. Notice also the strong streamish feel to the API. I would like to see an Arrow-based treatment of the unix-pipe concept. I think it could result in a powerful scripting framework. It is also worth noting that some signal functions are stateful (integral is one example), and as such cannot be defined without access to the underlying signals (forbidden). Hence they must either be pre-defined, or defined as a combination of pre-defined stateful signal functions.

So now convert the original FRP expression for x(t) to raw arrow-form:

x = (1/2) * integral ((vr + vl) * cos theta)

given: vrSF :: SF RobotInput Velocity  (SF that extracts vr from a RobotInput)
   and vlSF :: SF RobotInput Velocity  (SF that extracts vl from a RobotInput)
   and thetaSF :: SF RobotInput Angle

xSF :: SF RobotInput Distance   -- Signal Function transforming RobotInput -> Distance
xSF = (((vrSF &&& vlSF) >>> arr2 (+)) &&& (thetaSF >>> arr cos)) >>> arr2 (*) >>> integral >>> arr (/2)
  or slightly more legible:
xSF = let v = (vrSF &&& vlSF) >>> arr2 (+)
          t = thetaSF >>> arr cos
      in (v &&& t) >>> arr2 (*) >>> integral >>> arr (/2)
  which isn't all that much better.
So what we have gained above is the ability to mediate access to signals behind a reasonably safe, yet complete abstraction. However the loss in clarity is striking. Fortunately there is a proposed arrow-syntax that is similar to Haskell's monadic do-syntax.
xSF' :: SF RobotInput Distance
xSF' = proc inp -> do
  vr    <- vrSF     -< inp
  vl    <- vlSF     -< inp
  theta <- thetaSF  -< inp
  i     <- integral -< (vr + vl) * cos theta
  returnA -< (i / 2)
Which while not as nice as the original FRP, has the triple advantage of being declarative, legible, and safe!
It is important to note that the arrow syntax allows one to get a handle on a signal's values (or samples), but not on the signals themselves. In other words, first recalling that a signal function SF a b can be thought of as a type Signal a -> Signal b, which in turn can be thought of as type (Time -> a) -> (Time -> b), the syntax allows getting a handle on values of type a and b, but not on values of type Time -> a or Time -> b.
Although I can't reproduce it here, Fig 2 is dramatic. The correspondence between arrow-based signal-functions and the signal-flow-diagrams shown is striking, and exciting. I have been excited about the potential of FRP to liberate UI programming from the tyranny of the state-machine; Fig 2 amazed me with the clarity available. I am now, more than ever, convinced that I am on the right track.

Events diverge from Signals slightly in that while they are also temporal values, events don't so much change from one value to another over time, as simply not exist between 'impulses' (in the signal processing sense). In fact there is an isomorphism between impulse-streams vs. continuous-signals and event-streams vs. signals. In previous papers the authors accepted this distinction as significant, and the two were treated separately. However by noting an isomorphism between Event a and Maybe a they are able to define a signal of type Signal (Event b), ie. an event stream, which becomes a signal that can be thought of as alternating between Nothing and Just b. A signal function of type SF a (Event b) generates an event stream, and is called an event source.

There are a number of combinators for use with event-streams. I'm just going to list them with minimal discussion; for more than that, see the paper.

switch, dSwitch :: SF a (b, Event c) -> (c -> SF a b) -> SF a b  
        -- switch or delayed-switch.  Note the function from the event payload to a new SF a b.

rSwitch, drSwitch :: SF a b -> SF (a, Event (SF a b)) b 
        -- the event carries a new SF (possibly inserted via tag), allowing a recurring switch.

tag :: Event a -> b -> Event b

mergeBy :: (a -> a -> a) -> Event a -> Event a -> Event a
lMerge, rMerge :: Event a -> Event a -> Event a
  -- Different ways of handling event collision.

-- Useful event-sources:

edge :: SF Bool (Event ())
  -- Obtain an edge-triggered event-stream from a (Signal Bool)

never :: SF a (Event b)
now :: b -> SF a (Event b)
after :: Time -> b -> SF a (Event b)
repeatedly :: Time -> b -> SF a (Event b)

-- Useful event-stream transform back to signal space:

hold :: a -> SF (Event a) a
accum :: a -> SF (Event (a -> a)) (Event a)

accumHold :: a -> SF (Event (a -> a)) a
accumHold init = accum init >>> hold init
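As a small illustration of how these fit together (my own sketch, not from the paper), a signal function counting the rising edges of a boolean input can be built from the combinators above:

```haskell
-- Count rising edges of a Bool signal: each edge event is tagged
-- with an increment function, then accumulated and held.
countEdges :: SF Bool Int
countEdges = edge >>> arr (\e -> e `tag` (+1)) >>> accumHold 0
```

Note how tag turns the payload-free Event () into an Event (Int -> Int), exactly what accumHold expects.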

Finally, just to show how it all goes together, the paper implements a few robot controllers. First you implement your robot's controller as a signal function from RobotInput to RobotOutput: an FRP mapping from the robot's sensor signals to signals that the signal-function processor will apply to the robot's actuators. This controller is actually parameterised by the robot's properties, but that isn't surprising:

type RobotController = RobotProperties -> SF RobotInput RobotOutput
Finally the FRP engine is implemented as a function from a controller to IO (). In the case of the RobotSimulator implemented for the paper, the engine also takes a second controller and a world description:
runSim :: Maybe WorldTemplate -> RobotController -> RobotController -> IO ()
Of course, since it returns an IO action, this can be trivially instantiated as a Haskell main function.
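Putting the pieces together, a trivial and entirely hypothetical controller and main might look like the following (defaultRobotOutput is an assumed placeholder; the actual RobotOutput fields are whatever the simulator defines):

```haskell
-- A do-nothing controller: ignores the robot's properties and
-- sensors, emitting a constant output forever.
idleController :: RobotController
idleController _props = constant defaultRobotOutput

main :: IO ()
main = runSim Nothing idleController idleController
```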

All in all a fascinating paper --- well worth the time to read.

Thursday, November 11, 2004

Arrows, Robots, and Functional Reactive Programming --- Prelude

Now this is a fun paper. I mean, Category Theory, FRP, and you get to play with Robots! What more could you want? So let's take these three elements individually, and then I will try and summarise the paper.

Category Theory
My understanding of Category Theory is as a systematic treatment of abstraction and generalisation. It allows us to take collections of mathematical proofs and theorems that have the same structure/form, abstract them into a categorical notation, and then generalise them by providing mappings from the categorical constructs to concepts in specific domains. This is naturally of significant interest to type theorists, who treat programs as proofs and want access to the power of parametric/polymorphic types and functions without sacrificing rigor. Arrows are a concept from category theory: a generalisation of Monads which, beyond just lifting sequencing to a first-class object, allows parameterised composition of sequencing. This gives us the ability to support the more complex temporal sequencing of operations required by...
Functional Reactive Programming
Or, as that is a bit of a mouthful, FRP. It is one of the most intuitive and widespread programming paradigms in use today: an approach to programming made so unobtrusive that four of the five other members of my family use it (in the form of a spreadsheet) without realising they are programming. At a basic level, FRP introduces a new parameterised datatype into the language, the time-varying 'signal'. By then permitting functions to be defined that accept these signals and return time-varying signals, you eliminate the shared state, duplicated logic, fragmented code paths, and explicit state-machines spawned by traditional event-driven approaches. As a basis for comparison, just imagine a spreadsheet where, to implement a formula for a cell, you had to attach an 'onChange' callback to each referenced cell to recalculate the value.
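In signal-function terms, the spreadsheet cell is just a pure function lifted over its input signals (the cell names here are mine, purely illustrative):

```haskell
-- cellC = cellA + cellB, recomputed automatically as the inputs
-- vary over time; no onChange callbacks, no stored state.
cellC :: SF (Double, Double) Double
cellC = arr (\(a, b) -> a + b)
```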
Robots
Robots have a number of constraints that have not been traditionally considered strengths of functional programming.
  • Large numbers of discrete and continuous time-varying inputs that are most definitely *not* referentially transparent.
  • Real-time constraints that impose a requirement to avoid time-leaks, which encourages the use of imperative operation sequences whose execution can be easily suspended and resumed as higher-priority tasks require resources.
  • Extensive use of embedded controllers and limited computational resources that place a premium on transparency of resource use and strict avoidance of space-leaks.
  • A history of research largely based on low-level systems languages (i.e. MIML) and assembly, which makes for a hostile reception for many of the assumptions of higher-level functional programming.
Of course FRP is a natural fit for the first problem. However, it has traditionally achieved this by trading off certainty and predictability in both time and space analysis. This paper attempts to address these two problems in order to allow FRP's natural (theoretical) fit to robotics to be exploited.

Bananas, Lenses, Envelopes, and Barbed Wire --- Finale

Finished the Bananas and Lenses paper on the weekend, along with another on FRP that I'll summarise later. Section 4 takes the previous work based on Lists from section 2 and combines it with the category-theory results from section 3 to develop a theory of cata-/ana-/hylo-/para-morphisms over arbitrary datatypes. Section 5 is a very brief extension to parametric types.

Unfortunately I simply don't have enough category theory to follow the discussion properly, let alone explain it here. I think it is probably time to make another visit to Amazon and pick up a book or two on category theory. I'm thinking probably Pierce's introduction, and possibly Crole's "Categories for Types" on the basis of the review there. I've already read Pierce's intro, and it is mercifully brief and yet even from a quick reading I found I remembered enough to follow fragments of the paper. Unfortunately I found it too brief to allow me to learn enough to follow the rest :/.

Anyway, this paper was just for fun; fun was had; and I come away with a fresh perspective on recursion. That counts as a win for me.

Wednesday, October 06, 2004

Back.

Got back on Saturday from a week-long holiday helping run a holiday computer-camp for high-school kids. Now mostly recovered, and will probably blog a little on it later.

In the meantime I thought I might quickly point at an interesting Wired article I found on Slashdot, looking at one of the most interesting economic effects of the net: the growth of niche markets. Recommended reading: The Long Tail.