Thursday, April 20, 2006

What is Object-Oriented Programming

Ok, so I've spent the past few days ragging on various applications of object-oriented programming. So what do I consider OOP good for? Well the answer naturally depends on what you consider OOP to be. So before I attempt to ask "What is it good for?", I will prepare the way by looking at "What is Object Oriented Programming"?

The biggest problem with giving a definition for object-oriented programming is that there are so many to choose from. Worse, the vast majority consist of little more than vague handwaving giving little or no thought to the existence of non-OO techniques. Consider the list of features listed by Johnathan Rees on Paul Graham's site.

Consider the first example. Encapsulation is not a distinguishing feature of OOP. Not only is it common in numerous non-OO languages, but it is not supported by many OO languages including both Smalltalk and Python! Moreover encapsulation is a friendly name given to existentially quantified types, which are the underlying theory used to understand Modules. Ie. when you see a feature offering encapsulation, what you are really seeing is a module/package system (possibly in disguise). Yes that does mean that Java/C#/C++ classes are doing double duty as both a type definition AND a module system. Yes that is the cause of several problems you are probably experiencing. No it doesn't have to be that way, there are languages that successfully seperate the two orthoginal concepts.

I make use of theory here because it allows me to avoid handwaving. Specifically it allows me to avoid the trap of promoting specific language features to the status of paradigm definition just because they are favoured, or even common in OOP languages. Using this approach I can quickly eliminate Encapsulation, Protection, Ad hoc Polymorphism, Parametric Polymorphism, and Sum-of-Product-of-Function from Ree's list. This leaves us with only "Everything is an object", "All you can do is send a message", Specification Inheritance, and Implementation Inheritance to consider. So lets map these to theory and see what we have left:

Everything is an object
This is literally a meaningless phrase as either it is strictly a matter of syntax (2.add(3) vs. 2 + 3); or it is a matter of purity, at which point it begs the question.
All You can do is send a message
Message Passing. You can take your pick of various algebras and calculi to model this. CSP was popular for a while; Process Algebra has its admirers; but my favourate is the pi-calculus.
Specification Inheritance
This is subtype-polymorphism using subsumption.
Implementation Inheritance
Open Recursion. Not alot of use if you don't have subtype-polymorphism as well, and proves difficult to model when you do. The simplest treatments use a least-fixed-type operator applied to a 'self' abstraction.

Ultimately the above three concepts coalese into two conceptualisations of OOP.

The first focuses on the 'object', considering them as identifiable, autonomous, finite automatons communicating via message-passing. Conceptually we are talking about highly concurrent actor based models. This approach is the once taken by Simula, Erlang, Pick, and friends. However while it corresponds closest to the original conception of OOP, it is barely recognisable in modern mainstream OOP languages. In fact these languages are more likely to be described as concurrency-oriented than object-oriented.

The second focuses on the 'class', considering OOP in terms inheritance and subtyping. Conceptually we are talking about languages that provide records containing function typed attributes that are parametised by a recursive 'self' parameter that provides for 'late-binding'. The conventional way to model this theoretically is via Abardi and Cardelli's Object Calculus (or you could try one of the papers available online ie A Theory of Primitive Objects (second order systems) or An Imperative Object Calculus). Representitive languages include all the usual suspects: Java, C++, C#, Smalltalk, etc.

While concurrent agents with message passing might be the original definition of OO, as mentioned above most programmers who consider themselves OO wouldn't recognise this as OO; certainly almost none of the languages traditionally considered OO would fit this definition. So it is pointless to pursue it further. That leaves us with a single definition.

Object Oriented Programming is any programming based on a combination of subtype polymorphism and open recursion

Translated back into more common terms - programming based on a combination of

  1. Polymorphic functions that accept as actual parameters values of any subtype of their formal parameters, and
  2. Late binding, based on a distinguished 'self' parameter to functions that provides access to attributes of a record/object based on its runtime type.
Which is consise, precise, and coincides with our intuition --- even if it does leave out a few 'traditional' aspects commonly included in definitions of OOP. Now to produce an object oriented language you need to add at least the idea of functions and function application without which you are not turing complete and you will need records, without which you can't define subtyping. These don't belong to any paradigm rather they are what it means to be a programming language. You probably want to add a store and references, but again these do not belong to OO rather they belong to the super-paradigm Iterative Programming. There are all sorts of features you can add to your language but only two, subtype-polymorphism and open recursion, make you Object Oriented.


Sean Geoghegan said...

I wonder if subtype polymorphism is only relevant to statically typed OO programming languages. In a dynamically typed OO language, like Ruby and Python (and I assume Smalltalk though I have no experience with it) the type of the parameters to a method are irrelevant, it is only important that they respond to the 'messages' required by that method.

This leaves you with just open recursion. Which might actually be a rather elegant definition of OO, for those who know what it is. ;)

Anonymous said...

Great blog, I'm really enjoying the theoretical analysis of familiar mainstream concepts.

I'm still struggling with the distinction between subtype polymorphism and constrained parametric polymorphism. Does subtype polymorphism mean existential quantification of each argument to a function by their formal type, whereas traditional bounded parametric polymorphism insists on the same type? I ask because Haskell (a language pretty far from OO) typeclasses appear to imply some kind of subtyping thing going on (though dispatch is done on type instead of value).

gippy honey said...

The difference between a successful person and others is not a lack of strength, not a lack of knowledge, but rather a lack of will.
Provigil Online

gippy honey said...

Design is not just what it looks like and feels like. Design is how it works.
Personal Trainers Liverpool

gippy honey said...

Innovation distinguishes between a leader and a follower.
Glass Curtains Spain

gippy honey said...

You can't just ask customers what they want and then try to give that to them. By the time you get it built, they'll want something new.

gippy honey said...

Being the richest man in the cemetery doesn't matter to me. Going to bed at night saying we've done something wonderful, that's what matters to me.
Modafinil Online