Yesterday I had a discussion with Raffael Krebs, a master's student at SCG, who was disappointed with his Java design because it lacked extensibility. The main problem was that he wanted parallel class hierarchies, which you can't easily do in Java.
This problem isn't new. Let's illustrate it and briefly explore the possible designs, first in pure Java, then with existing language extensions.
Below is a sample situation. The core of the model is composed of Document, Item, Text and Image. The Reader reads data from a file and creates the model. The client uses the model. To be generic, it must be possible to specialize the model with as little effort as possible, that is, to provide reuse in a natural way. Conceptually, one would like to specialize (that is, subclass) the model, for instance for HTML documents.
This conceptual view cannot be achieved in Java. There are several problems: (1) multiple inheritance is not supported, (2) the Reader class still references the core model and will not create instances of the more specialized one, and (3) in most statically typed languages, a subclass cannot specialize the parameter types of an inherited method covariantly; in Java, for instance, add(HtmlItem) in HtmlDocument would overload add(Item) rather than override it.
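The third problem can be seen in a few lines. This is only a minimal sketch: the class names follow the text, but the method bodies and the OverloadDemo helper are my own assumptions for illustration.

```java
// Hypothetical minimal model; only the class names come from the text.
class Item {}
class HtmlItem extends Item {}

class Document {
    String add(Item item) { return "Document.add(Item)"; }
}

class HtmlDocument extends Document {
    // Narrower parameter type: this declares an overload, not an override.
    // Putting @Override on this method would be a compile-time error.
    String add(HtmlItem item) { return "HtmlDocument.add(HtmlItem)"; }
}

class OverloadDemo {
    static String viaBaseReference() {
        Document doc = new HtmlDocument();
        // Overload resolution uses the static type Document,
        // so only add(Item) is considered, and it was never overridden.
        return doc.add(new HtmlItem());
    }

    static String viaSpecificReference() {
        HtmlDocument doc = new HtmlDocument();
        // Both overloads are visible here; the most specific one is chosen.
        return doc.add(new HtmlItem());
    }
}
```

A client that only knows the base Document type therefore never reaches the specialized method, which is why the specialized family ends up needing a downcast or some other workaround.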
The first two problems can be circumvented with interfaces and factories. The third one might require a downcast in add() if the specialized classes have added methods. If only existing methods have been overridden, traditional polymorphism is enough.
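Here is a rough sketch of what the interface-and-factory workaround might look like. The ModelFactory interface, the toHtml() method, and the list of strings standing in for the file are all assumptions I made up for the example; Image is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

interface Item { String data(); }
// Interfaces stand in for the missing multiple inheritance.
interface HtmlItem extends Item { String toHtml(); }

class Text implements Item {
    private final String data;
    Text(String data) { this.data = data; }
    public String data() { return data; }
}

class HtmlText extends Text implements HtmlItem {
    HtmlText(String data) { super(data); }
    // Method added by the specialized family (hypothetical).
    public String toHtml() { return "<p>" + data() + "</p>"; }
}

class Document {
    protected final List<Item> items = new ArrayList<>();
    void add(Item item) { items.add(item); }
}

class HtmlDocument extends Document {
    final StringBuilder html = new StringBuilder();

    @Override
    void add(Item item) {
        // The downcast: fails with ClassCastException if a plain Text is passed.
        HtmlItem htmlItem = (HtmlItem) item;
        html.append(htmlItem.toHtml());
        super.add(item);
    }
}

// The factory decides which family gets instantiated.
interface ModelFactory {
    Document createDocument();
    Item createText(String data);
}

class BaseFactory implements ModelFactory {
    public Document createDocument() { return new Document(); }
    public Item createText(String data) { return new Text(data); }
}

class HtmlFactory implements ModelFactory {
    public Document createDocument() { return new HtmlDocument(); }
    public Item createText(String data) { return new HtmlText(data); }
}

class Reader {
    private final ModelFactory factory;
    Reader(ModelFactory factory) { this.factory = factory; }

    // A list of lines stands in for the file content.
    Document read(List<String> lines) {
        Document doc = factory.createDocument();
        for (String line : lines) {
            doc.add(factory.createText(line));
        }
        return doc;
    }
}
```

The Reader no longer references concrete model classes, but nothing in the types prevents someone from mixing an HtmlDocument with a plain Text; the mistake only shows up at run time in the downcast.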
Generics provide a partial solution to the problem. Document can accept a type parameter <A extends Item>, so that HtmlDocument can be defined as an extension of Document<HtmlItem> (see this answer for an example). The downcast in method add() disappears. Item is however abstract, and the system assumes there are two implementations of it, Text and Image. With generics, the class Document cannot create instances of these with regular instantiations such as new Text() or new Image(). First, the result would not necessarily comply with the type parameter <A extends Item>; second, these instantiations would not be rewired to the corresponding class in the other family. If classes were first-class, one could at least define abstract static factory methods createText() and createImage() in Item. This is however not possible in Java: because classes are not first-class, static methods cannot be overridden.
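A minimal sketch of the generic variant could look as follows. The render() and toHtml() methods are hypothetical additions of mine, there only to show that the downcast is gone.

```java
import java.util.ArrayList;
import java.util.List;

abstract class Item {
    abstract String data();
}

abstract class HtmlItem extends Item {
    abstract String toHtml(); // hypothetical method of the HTML family
}

class HtmlText extends HtmlItem {
    private final String data;
    HtmlText(String data) { this.data = data; }
    String data() { return data; }
    String toHtml() { return "<p>" + data + "</p>"; }
}

class Document<A extends Item> {
    protected final List<A> items = new ArrayList<>();

    void add(A item) { items.add(item); }

    // Document cannot contain `items.add(new Text())`:
    // a Text is not guaranteed to be an A, so it would not compile.
}

class HtmlDocument extends Document<HtmlItem> {
    String render() {
        StringBuilder sb = new StringBuilder();
        for (HtmlItem item : items) {
            sb.append(item.toHtml()); // no downcast needed
        }
        return sb.toString();
    }
}
```

With this design, add() on an HtmlDocument only accepts HtmlItem instances at compile time, but any code in Document that needs to create items still has to be handed a factory from outside.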
In a dynamically typed language, the need for interfaces vanishes, as does the third issue. In both cases, however, problems (1) and (2) remain.
The lack of multiple inheritance leads in this case to code duplication between classes that have no inheritance relationship. To avoid such duplication, delegation could be used: an instance of HtmlText could delegate to an instance of BaseText. As pure delegation is not supported, this requires writing boilerplate forwarding methods, which is still clumsy. Another way to avoid code duplication would be to factor the duplicated code into traits or mixins that can be reused among unrelated classes.
No matter what, this design is ugly, and it really is found in practice.
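To make the delegation workaround concrete, here is a small sketch. HtmlText already has a superclass (HtmlItem), so it cannot also extend BaseText and has to forward to it by hand. The length(), wrap() and toHtml() methods are hypothetical, made up for the example.

```java
interface Item {
    String data();
    int length();
}

// Base family: holds the logic we would like to reuse.
class BaseText implements Item {
    private final String data;
    BaseText(String data) { this.data = data; }
    public String data() { return data; }
    public int length() { return data.length(); }
}

// HTML family: shared behavior lives in an abstract superclass,
// which uses up the single inheritance slot.
abstract class HtmlItem implements Item {
    String wrap(String tag) {
        return "<" + tag + ">" + data() + "</" + tag + ">";
    }
    abstract String toHtml();
}

class HtmlText extends HtmlItem {
    // Cannot extend BaseText as well, so delegate to an instance of it.
    private final BaseText delegate;

    HtmlText(String data) { this.delegate = new BaseText(data); }

    // Boilerplate forwarding methods, one per reused method.
    public String data() { return delegate.data(); }
    public int length() { return delegate.length(); }

    String toHtml() { return wrap("p"); }
}
```

Every method added to BaseText later needs a matching forwarder in HtmlText, and the same again for HtmlImage, which is exactly the clumsiness mentioned above.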
Revisiting the three problems
The problems discussed above can be rephrased as three separate questions:
– How can the reader produce the right kind of document (either basic or HTML) without invasive changes?
– How can variations of a class hierarchy be created with minimal effort and no boilerplate code?
– How can the type system be made aware of logical relationships between types, e.g. that HtmlDocument always goes with HtmlItem?
The fourth problem
A fourth problem, not depicted above but one that happens in practice, is that two class hierarchies relate to each other, but in slightly incompatible ways. This typically happens due to software evolution, when methods are renamed, signatures are changed, etc. An example would be changing the “data” in Item from binary bytes to a base64 string. While certain kinds of changes can be accommodated by providing conversion functions (which act as wrappers), this bloats the design and prevents clean evolution. Sometimes it is impossible to create an inheritance relationship between the two variants of the classes, which means that whole graphs must be adapted back and forth from one representation to the other. We state the fourth problem as follows:
– How can graphs of objects be converted from one logical representation to another easily, without the conversion logic bloating the code?
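As an illustration of the byte versus base64 example, here is what the hand-written conversion tends to look like. All class names below (BinaryItem, Base64Item, Converter, and so on) are hypothetical; the only library API used is java.util.Base64 from the JDK.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Old representation: raw bytes.
class BinaryItem {
    final byte[] data;
    BinaryItem(byte[] data) { this.data = data; }
}

class BinaryDocument {
    final List<BinaryItem> items = new ArrayList<>();
}

// New representation: base64-encoded string.
class Base64Item {
    final String data;
    Base64Item(String data) { this.data = data; }
}

class Base64Document {
    final List<Base64Item> items = new ArrayList<>();
}

// The two families share no supertype, so the whole graph
// has to be copied node by node, in both directions.
class Converter {
    static Base64Document toBase64(BinaryDocument source) {
        Base64Document target = new Base64Document();
        for (BinaryItem item : source.items) {
            String encoded = Base64.getEncoder().encodeToString(item.data);
            target.items.add(new Base64Item(encoded));
        }
        return target;
    }

    static BinaryDocument toBinary(Base64Document source) {
        BinaryDocument target = new BinaryDocument();
        for (Base64Item item : source.items) {
            byte[] decoded = Base64.getDecoder().decode(item.data);
            target.items.add(new BinaryItem(decoded));
        }
        return target;
    }
}
```

With only two classes this is tolerable, but the Converter grows with every class and every field in the model, and it has to be kept in sync with both families by hand.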
The fifth problem
A fifth problem, not discussed so far, is actually a requirement:
– How can new families be introduced a posteriori, without entailing modification or invalidation of existing code?
This problem is usually referred to as modularity.
These problems are clearly related to each other, as the example shows. A definitive solution to all of them is, to my knowledge, still not available, but there has been progress in programming languages that provides means to tackle parts of them:
- Design patterns (strategy, factory)
- Static & dynamic reuse mechanisms (delegation, traits, talents)
- Family polymorphism
- Virtual classes (Newspeak/gbeta, dependency injection, first-class classes)
- Object Algebras
- Local rebinding (classbox)
- Open classes
- Multi-dimensional dispatch (subject-oriented programming, context-oriented programming, worlds, namespace selector)
- Non-standard type coercion (translation polymorphism in Object Teams & Lifted Java, expanders)
- Type versioning (type hot swapping, upgradeJ, revision classes)
It seems to me there is room for something still missing that would solve the problem depicted above, or maybe it exists and I am simply not aware of it?