Featherweight Musings

Wednesday, July 22, 2009

A rant - empirical analysis

Having typed up my ECOOP notes and gone through the proceedings a bit, I have actually got a little bit angry about the state of affairs concerning empirical analysis.

A few years ago it seemed that any kind of empirical analysis was a bonus and as long as you had a problem and a solution and a soundness proof you were good to go. The obvious problem is that there is nothing to say that your solution addresses the problem, that the problem is serious, or that the solution is practical.

Bit by bit reviewers have started asking for some empirical evidence to answer these questions. Can someone tell me why this fashion started?

I don't have too much of a problem with this, except that we are doing research here, and some of these crazy ideas which aren't practical and don't really solve any problem lead to solutions that are and do. If you press too hard for practicality you lose the paradigm shifting ideas that only seem practical in ten years time.

What has made me angry, is that the pendulum seems to have swung too far: it now seems really hard to get something published without empirical analysis, which makes it hard to be productive unless you have an army of students and constrains some good, but not immediately practical, ideas. Furthermore, there seem to be a few papers published at ECOOP which seem to be half a paper - just the evaluation or motivation section of a decent paper, without any real idea. These seem to have been published on the strength of the empirical analysis alone, without having asked how much of a contribution this analysis actually is.

I agree there is a need for analysing existing work, but some just seems unnecessary and, frankly, boring.

I count four papers at ECOOP that I would reject for being pointless (there is at least one paper (probably more, but I can't be bothered to look through the proceedings again) that is pure analysis and I think is useful). No names, obviously.

I'd be very interested to here othe peoples views on this, cheers!

ECOOP day 3

Tobias gave the first technical talk and was surprisingly bright-eyed; apparently that's what being Swedish does for you. The paper is entitled Loci: Simple Thread-Locality for Java. There is a very simple ownership system (two annotations - shared and thread) where threads own their thread local data and the system restricts references accordingly, thus data is kept thread-local. It's all enforced statically, and they have a compiler. Their empirical evaluation found that most classes could be made thread-local.

Next was Failboxes: Provably Safe Exception Handling presented by Bart Jacobs the younger. These are essentially a safer version of exceptions (although they are layered on top of exceptions, they're not a replacement). The problem with standard exceptions is that side effects made inside a try block will remain even if the try block fails and an exception is thrown. Fail boxes allow non-lexical scoping of expressions and so failure can prevent execution of dependent expressions which are outside the try block.

Weirdest talk of the conference award goes to Yossi Gil presenting "Are we ready for a Safer Construction Environment". The idea is small but sweet and is a restriction on the way objects can be constructed. The problem is that during object construction, objects are 'half-baked' that is, construction has not finished, fields may not have been initialised, and invariants may not have been established. They propose restricting construction to their hard-hat model, which means that means no virtual method calls and no exposure of this to external methods.

Now, whilst this is a solution, it's not very clever, it seems to be along the lines of "if you do nothing, you won't get hurt" and to me seems like it should be a paragraph in a 'how to program' textbook, not a full paper in a PL conference. But, they have done loads of corpus analysis to confirm the obvious, and since that seems to be in fashion now (why? Can someone please tell me why?!), it has got published. Having said that it is something that I would bear in mind if I were making a system where invariants at construction was important, so I guess it should be published somehow...

Christian Haack then talked about Type-Based Object Immutability with Flexible Initialisation. This seems like a lightweight ownership system for immutability. I think the key of it is that you enforce owners-as-dominators (or something like it) during initialisation to ensure an object is initialised 'privately' and then enforce a kind of owners-as-modifiers (it's not total immutability) afterwards. Which is nice as far as it goes. But I feel there is a sweet spot with the immutability/uniqueness/ownership combination which has not quite been hit, although this and other papers are getting close.

Finally, Ali Ibrahim (I think) talked about Remote Batch Invocation for Compositional Object Services. This is an optimisation of RMI so that remote calls are batched together for the sake of efficiency. They introduce a batch keyword to the language, and within the batch scope, the order of execution can be changed so that remote calls are batched. It's all bit cleverer than that, for example, they do 're-forestation' of data structures to improve batching of loops. They have an implementation and plenty of empirical analysis.

Tuesday, July 21, 2009

ECOOP day 2

Today's invited talk was by Dave Ungar and it felt more like a self-help tutorial than an academic talk. In fact all three of the talks I attended had a whiff of esteem boosting and soul searching about them: are we academics really that insecure in our endeavours or about out role in computer science?

Anyway, Dave still seemed pretty bitter about the lack of success of Self and the way it was handled at Sun, which might well be justified, but didn't fit into the therapeutic tone of the rest of the talk. The points I took from the talk were: interpreters are better than compilers (and this needs technologies such as JIT to realise), design languages on principles rather than examples, and simplicity is better than expressiveness.

The first technical talk of the day was Adding State and Visibility Control to Traits Using Lexical Nesting. The authors seem to back off the founding principles of traits by adding stateful traits (which are now exactly like classes or maybe mixins?) and nesting of objects to give visibility control. Composition is achieved by delegation and delegate methods are automatically generated. Of course, with state you then get back all the old multiple inheritance problems, which are addressed to some extent. So although nice ideas and well-thought out it seems to go completely against Dave Ungar's maxims of simplicity before expressivity and principles before examples.

I liked Modular Visitor Components:..., although I'm not sure if it's even close to practical. This is a possible solution to an extension of the expression problem - the expression families problem. I wasn't quite clear what this was or to what extent previous solutions address it, in particular where virtual classes fall down. The solution was implemented in Scala using a collection of quite terrifying PL technologies - higher order type parameters, variance, traits, and self types. The result is a suitably complex type system that is nicely expressive.

The other joint-best paper of the conference was Debugging Method Names. They find bugs in names such as getString not returning a value or not returning something of type String or setting something etc. They do all this automatically by analysing a whole bunch of programs, which is certainly clever but didn't really seem like a programming languages problem, or even that hard from the talk, maybe the paper better identifies the problems.

Monday, July 20, 2009

ECOOP Day 3 - Day 1 of the main conference

The first invited talk (slides available here, which you should check out for the graphs, if nothing else) of the conference was from Simon Peyton Jones and was about type classes in Haskell, and, more generally, the relation between Haskell and OO langs. The talk was really good, SPJ has so much energy and gives a really, really good presentation. The big point that I took away was that Haskell has universal polymorphism (parametric polymorphism) and OO langs (i.e., Java) typically have subtyping (inclusion polymorphism). I would generalise so far as to say that the primary form of polymorphism in functional languages is universal quantification, and the primary form of polymorphism in OO languages is existential quantification, but then you all know I have a soft spot for the old backwards 'E'.

The talk started with a quick overview of Haskell and then motivated type classes as a solution to the problem of how to do universally quantified functions, but restricted to certain types (eg don't want to have an equals function that can take functions as arguments). I won't pretend to fully understand type classes, but they seem cool. What is really cool, is SPJ's claim that Wadler and Blott just plucked the idea out of thin air!

The next part of the talk compared Haskell type classes to OO dispatch. Both are essentially forms of dynamic dispatch, but type classes can be compiled away, which virtual methods cannot be. Furthermore, type classes dispatch based only on a value's type, whereas virtual methods dispatch on the value itself (by which I assume he means runtime types). Type class dispatch is different to Java overloading too, because overloading is a static mechanism (it can be compiled down to no choice at all at runtime), but type classes dispatch at runtime (even though the mechanism is compiled out statically - actually now I write this I'm not 100% sure how this is difference from OO dispatch, other than OO has subtyping...).

SPJ also talked about things he envies in OO-land. First up was the power of the dot, which seemed to come down to IDE support and overloading - the latter of which no-one in OO-land seems to like anymore! Next were heterogeneous lists, which I think is just another request for existential types, just like subtyping is. He then discussed future work a bit - the most interesting sounding bit of which is type functions.

The first technical paper was Coinductive Type Systems for Object-Oriented Languages, which was joint-best paper of the conference, but went completely over my head. I will return to it later though.

Next, Checking framework Interactions with Relationships presented by Ciera Jaspan. Their FUSION system allows developers to specify relationship constraints, effectively multi-object invariants, which are implicitly time-sensitive (by which I mean that there is no explicit notion of time in their system, but that order matters, these are not static invariants which must be maintained: the developer can specify "x then y" or "x must follow y"). Checking can be sound or complete, but not both, or a pragmatic variant which is neither, but hopefully the most useful in practise.

After lunch, Nick Mitchell talked about Making Sense of Large Heaps, which, I predict, will be a paper that gets a lot of references. He presented a tool, Yeti, which analyses heaps by breaking them into 'data structures' and 'sections'. The essential concepts used in the analysis are those of ownership and domination, and it is interesting to see ownership used to analyse, rather than prescribe.

The last session of the day was great, starting with Implicit Dynamic Frames:... by Jan Smans et al. It is a nice approach to verification by specifying access permissions in pre- and post-conditions. Pure methods and 'predicates' - pure methods which abstract assertions rather than expressions - allow the language to support abstraction.

The next talk - Fine-Grained Access Control with Object-Sensitive Roles, presented by Dan Marino - was great, using Britney Spears' medical notes as motivation. I am very surprised that nothing like this has been done before, it seems so sensible. The idea is that rather than just using roles to do security, you parametrise the roles by the ids of objects so that you get per-object, rather than per-role, security. Good idea, good execution. Also, nice use of effects and existential quantification.

Last talk of the day was Practical API Protocol Checking with Access Permissions presented by Nels Beckman. This uses typestates and access permissions to specify protocols which can be checked by static analysis. This paper is an evaluation of that work, and lo and behold, it works! There are some interesting points raised along the way, such as hasNext methods in JDBC not being pure (which I've come across a while ago, but can't remember where).

Wednesday, July 15, 2009

ECOOP day 2: IWACO

Viktor Kuncak gave the invited talk, the theme of which is his effort to get "verified abstractions on developers' desks". This boiled down to verifying library code, the hard part of which is accounting for the different state in the library vs. the application. He talked about his tool Jahob, which apparently does the business.

I chaired the first technical session of the day at IWACO, and it was the first session I've chaired at a workshop/conference, so I was a bit nervous. But all the talks were good and it went OK.

The first talk was by Nels Beckman and was about using access permissions to reduce STM overhead. This was a great little talk and had a great presentation of the various kinds of access and how they relate. The relation to STM was straightforward, but looked useful and they had empirical evidence to back it up - although 9% seems not too much of an improvement. It was also interesting that their optimisations incurred more overhead in some cases.

Eric Kerfoot presented the next talk, which used a very simple ownership scheme to avoid deadlock. Although the ownership aspects of this work were not that interesting (and probably not flexible enough for real life), it is good to see an application of ownership. The only problem being that the work seemed to be able to introduce more deadlocks if the wait clauses never become true.

William Retert presented the last talk of the session (and my favourite paper from the workshop) which compared everything even vaguely IWACO related using fractional permissions. This is a nice thing to do and something I'm very interested in seeing. I like being able to see exactly how these different mechanisms relate to each other. Unfortunately, other than encoding them all with fractional permissions, there wasn't much effort to then explain how these permissions relate at an intuitive level. I would have liked to see how their owners-as-effectors property relates to owners-as-modifiers, it is not quite a generalisation. I think it would also be interesting to extend the permissions beyond reading/writing to referencing so that owners-as-dominators could be expressed.

After lunch, Sophia presented work on flexible object invariants. The big point is that object invariants should be first class, and not just treated as syntactic sugar for pre- and post-conditions. An interesting example of invariants that can't be enforced in a tree-like structure are PIPs - Priority Inheritance Protocols, which involve cyclic structures.

I gave my first talk on wildcards in OGJ which went pretty well, and people seemed to 'get', although there was a bit of an atmosphere of "this is interesting, but so what?".

After the coffee break Paley spoke about combining ownership with relationships, in particular tackling the question of how relationships may be mutated with respect to their ownership properties. Finally I gave my second talk on comparing Universes and Ownership Types. Unfortunately, by this time I was mentally and physically exhausted and I was barely coherent for most of the talk and didn't handle the questions too well, which is a shame because the questions were really on point and I could have given more enlightening answers.

Monday, July 13, 2009

ECOOP day 1 - FTfJP and SToP

And so to the conference; first day - the FTfJP and SToP workshops. The former good, the latter great, it really seemed a workshop in the right place at the right time.

I spent the morning at FTfJP, including presenting my paper on wildcards and subtyping, which was surprisingly well received. It seemed to get people thinking and I got good questions. I was very worried that it would be a hard sell and people wouldn't see the point, but thankfully I was proved wrong.

The invited talk was by Todd Millstein and was on his generic effects system. Unfortunately I was rather tired and couldn't pay much attention, but it was a good talk on an interesting topic. I highly recommend the paper which (I think) was at TLDI this year. The most interesting thing I took away from the talk was the relationship between effects and permissions, which is something I've been thinking about myself rather a lot recently.

The first technical talk was A Typed Intermediate Language for Supporting Interfaces by Juan Chen. What I found interesting was the use of existential types for class types which I've also made use of. It is interesting here because she is using them for, essentially, implementation reasons in the intermediate language; specifically to give types to iTables for dispatching method calls on interface types.

Next up was Range Parameterized Types: Use-site Variance without the Existential Questions by Bruno De Fraine which proposes a simpler version of Java wildcards. The system is less complex, but this should not be surprising as it is less complex. In particular there seems no difference from existential types which cannot be unpacked over a scope (i.e., wildcards capture). Although the formalism is direct, it is significantly more complicated (and less elegant) than an existential types system.

Reuben Rowe then presented Semantic Predicate Types for Class-based Object Oriented Programming. This proposed adapting and generalising intersection types to the OO world as predicates and presented an approximation theorem which, I'm afraid, went slightly over my head. I will be reading the paper though...

After lunch I switched workshop and Attended SToP which was a proper workshop (as opposed to a mini-conference) and had a lot of discussion; it was, therefore, running an hour behind schedule. Which meant that I could catch the second half of Phil Wadler's talk. This reminded me how good a speaker he was and also how cool the blame thing is. On the downside, the talk was very similar to the one I saw at ESOP a few months ago. It included the same division of subtyping into four variations and higher order casts, but added polymorphism. This turns out to be surprisingly difficult - it interacts badly with some of the subtyping.

Next was a talk on Diamondback Ruby which I remember being interesting, but my notes of the talk are terrible.

The next talk (I'm on the plane now and so can't look up the titles/authors, sorry) was about typing in scheme and had the interesting conclusion that type checking extended with gradual typing improved expressivity, even when no dynamic typing was used!

Next up: Atsushi Igarashi's student with progress towards gradual typing in FGJ. Mathias didn't seem to think that this was gradual typing at all. They used a type dynamic, denoted with a question mark, but which didn't correspond (with generics) to wildcards. This seemed unnecessarily confusing.

The final talk (presented by Gregor and his crazy hat) analysed Javascript programs in order to find out how many (and which) dynamic features were used. The conclusion seemed to be quite a lot in real life, but in quite fixed patterns. Also that Sunspider is not a realistic set of benchmarks.

Friday, July 10, 2009

Banquet Speech

Last night was the conference banquet and William Cook gave the speech and it was good. I thought I would blog about it now, while it's fresh in my mind, even though this post will be out of order.

The speech involved William doing a strip tease from industry suit to academic shorts and t-shirt; luckily he stopped there (but kudos to SPJ - "we want more!").

The theme of the talk was industry and academia and also a bit of motivation for academics. There were many interesting points raised, including: industry are not as slow/conservative as we think; industry have lots of ideas - problems and solutions; industry need complete, not partial, solutions; and many more.

One point I thought was intersting was about the WWW: William wondered why it was an industrial, not academic invention, and suggested this was a shortcoming of academia: apparently, the fundamentals were done in academia, but putting it all together was not "publishable" and so was never tackled. I think that this is a perfect synergy between academia and industry. Academia tackled the fundamental problems, which it was it does best - small, difficult, abstract, and, presumably, not commericially viable problems., and insustry used the academic research to produce a 'product' - something academia couldn't do easily because it requires a lot of work, and shouldn't do, because it involves work that is alredy well-understood.

The other thing that made me think from William's talk was his idea that one should have a "purpose" in research or anything else. I tend to have short term goals, but realise that I have no longer term purpose in my research. I thimk, this has been something I've been looking for, subconciously, but haven't yet found. And he is right, that nearly all academics who I look up to, have obvious purposes in their work (although often more than one, it seems). So the search is on!

Wednesday, July 08, 2009

Ecoop '09

Ah, ECOOP is here again. This year, the world's sweatiest conference on object-oriented programming is being held in Genova, Italy. It's a lovely city full of very old and expensive-looking buildings, mostly old palaces - I am amazed that there could be so many palaces in one city! The conference is being held in one such palace and it is an awe-inspiring place for a conference - the conference room would suit the Queen better than a bunch of shabby academics. The food is also much better than normal - even the coffee is drinkable. On the downside, there is no aircon, so it is very hot and sweaty, there are not enough power points, the wifi has issues, and there are three toilets to be shared amongst 200 or so delegates - not good. It is also expensive, but hey.

I arrived on friday and headed to the aquarium, which was amazing. Dolphins, sharks, turtles, penguins, seals, and many, many fish - it was great to hang out with underwater stuff again. Also hummingbirds which were so cool, buzzing around and hovering and twisting around in the air, one of the nicest things I've seen really.

I then met up with Dave Clarke and the Swedes (although Dave is really swedish, it just doesn't say that in his passport), it was great to catch up with everyone, just a shame Nik and Maria weren't sticking around for the conference.

So, in the next few posts I'll go over some of the talks that I enjoyed. But first some general impressions: much less language design stuff - there are barely any formal systems in the proceedings and those are not really proposing new features. SToP (Scripts To Programs) was the most popular workshop, and gradual/hybrid typing seems to be very much in fashion right now. Ownership seems to be losing peoples' interest - which makes me sad, but I guess there has been a lot of work in the area. In fact I would say people are looking beyond static type systems in general a lot more (although there is still lots of interest in verification). Effects seems to be everywhere though, or at least effect-like things, which made me think of effects. A lot of empirical studies of one kind or another are around. Dynamic languages (not just dynamic typing) seem to be popular, and reflection in particular is getting talked about (although I haven't seen any papers). Much fewer industry poeple, although I guess that is to be expected with the economy, and ECOOP hasn't had a strong industry presence for a while. More people have multiple papers.