Direkt zum Hauptbereich

A Lesson to Learn from Compiler Construction

Some few days ago, I was struck by an insight. This insight concerned a certain way of structuring a software problem in order to reduce complexity. I realized that I used this pattern subconsciously for quite some time. However, I wasn't really aware of it until I had a discussion with one of my diploma students end of 2005. We had a discussion about the component framework he was supposed to built. I was bubbling over with ideas -- and the diploma student became more and more frightened about all these features, I was so enthusiastic about. Neither did he comprehend what I was after, nor did he see the slightest chance to get all this implemented. Suddenly, it occurred to me that there was a way out of this complexity: "Gosh, why do I torture this poor guy with all my ideas. What I'm really after is something much more simpler. If I break things down, I just need some core functions implemented, which make up the basic component framework and let it run. The rest, my fancy features, can be understood as a way of use of this simple component framework."

As it turned out, the simple component framework consists of the notion of a component, which can be composed of other components, communication channels between components and that's it, basically. The more complex component framework, which I originally had in my mind, has some nice "add-ons": two-way interfaces called ports, communication channels to model distribution called complex connectors, typed ports, stuff like that. (It's an extended version of the ROOM language, the Real-Time Object-Oriented Modeling language. Some of its concepts made it into the UML 2.0 standard.)

While the simple component framework is relatively easy to built, the more advanced component framework is not built at all. Instead, the conceptions of the advanced framework and their allowed combination of use are defined by a configuration format. The format sets the rules, what kind of arrangements are allowed, and enforces contextual consistency. It's a declarative approach. A concrete configuration describes a setup according to the rules and conventions. For example, two ports can be connected by a connector only if the associated communication protocols match.

The point is that this configuration can be translated into a configuration of the "simple" component framework. Ports and complex connectors, for example, can be mapped to components of the simple component framework. Of course, ports and complex connectors are not equal to components, otherwise you wouldn't have come up with these conceptions in the advanced component framework. There is semantics attached to these concepts! A port serves a certain purpose, it's a kind of interface, whereas a component is just a component. The purpose of a simple component is just to do what its behavioral specification promises to do. In this sense, you are allowed to do whatever you want to do with a component. A port, however, must be used in connection with a "hosting" component. A port is to be used in a certain way for a certain purpose, which makes up its semantics. The key issue to understand is that ports (and complex connectors as well) can be mapped to components for the purpose of execution! But there is a price to be paid: the loss of semantics within the simple component framework. To compensate this loss of semantics, one could allow to annotate components in the simple component framework with meta-information. This is also a very helpful technique in order to trace debug information back to its source in the advanced component framework.

What I explained here is a clear cut between a simple operational level and a more complex configurational level. This directs the resolution of complexity to different levels. If you think about it for some time, you might notice that this technique is used e.g. in compiler construction. Java is a configuration schema enforcing certain rules and conventions. Classes, methods, attributes and so on serve a certain purpose. But they are all translated to a very basic level of commands, a ByteCode format, which can be executed by a virtual machinery, the Java Virtual Machine.

After I have written all this down, I ask myself: What the heck was the reason I was struck by an insight? Am I stupid? Isn't all this obvious? It is, if you design a compiler for a language. But sometimes, when you approach a problem from a different angle, not having language design issues in mind, you might overlook this. I read a book about ROOM, which discusses implementation issues of the ROOM component language in quite detail. But while didactically excellent, it uses a more complicated approach, which didn't let me realize that there is another road down the hill. Luckily, now I'm enlightened! ;-)

Beliebte Posts aus diesem Blog

Lidl und der Kassen-Bug

Es gibt Fehler, im Informatiker-Jargon "Bugs", die etwas anrühriges haben. Ich bat den Menschen an der Kasse bei Lidl um einen Moment Geduld und meine Kinder um Ruhe, um nicht den wunderbaren Moment zu verpassen, bei dem es passierte. Der Lidl-Mensch fluchte kurz auf -- und ich war entzückt! "Einen Moment, davon muss ich ein Foto machen!" Und dann machte ich noch eines. Ich bin heute extra für diesen Fehler zu Lidl gepilgert -- ich wollte es mit eigenen Augen sehen. Gestern hat mir ein Student (vielen Dank Herr Breyer) von diesem Fehler in einer EMail berichtet. Ein richtig schöner Fehler, ein Klassiker geradezu. Ein Fehler, den man selten zu Gesicht bekommt, so einer mit Museumswert. Dafür wäre ich sogar noch weiter gereist als bis zum nächsten Lidl. Der Fehler tritt auf, wenn Sie an der Kasse Waren im Wert von 0 Euro (Null Euro) bezahlen. Dann streikt das System. Die kurze Einkaufsliste dazu: Geben Sie zwei Pfandflaschen zurück und Lidl steht mit 50 Cent bei Ihne...

Syntax und Semantik

Was ist Syntax, was ist Semantik? Diese zwei Begriffe beschäftigen mich immer wieder, siehe zum Beispiel auch " Uniform Syntax " (23. Feb. 2007). Beide Begriffe spielen eine entscheidende Rolle bei jeder Art von maschinell-verarbeitbarer Sprache. Vom Dritten im Bunde, der Pragmatik, will ich an dieser Stelle ganz absehen. Die Syntax bezieht sich auf die Form und die Struktur von Zeichen in einer Sprache, ohne auf die Bedeutung der verwendeten Zeichen in den Formen und Strukturen einzugehen. Syntaktisch korrekte Ausdrücke werden auch als "wohlgeformt" ( well-formed ) bezeichnet. Die Semantik befasst sich mit der Bedeutung syntaktisch korrekter Zeichenfolgen einer Sprache. Im Zusammenhang mit Programmiersprachen bedeutet Semantik die Beschreibung des Verhaltens, das mit einer Interpretation (Auslegung) eines syntaktisch korrekten Ausdrucks verbunden ist. [Die obigen Begriffserläuterungen sind angelehnt an das Buch von Kenneth Slonneger und Barry L. Kurtz: Formal Syn...

Factor @ Heilbronn University

It was an experiment -- and it went much better than I had imagined: I used Factor (a concatenative programming language) as the subject of study in a project week at Heilbronn University in a course called "Software Engineering of Complex Systems" (SECS). Maybe we are the first university in the world, where concatenative languages in general and Factor in specific are used and studied. Factor is the most mature concatenative programming language around. Its creator, Slava Pestov, and some few developers have done an excellent job. Why concatenative programming? Why Factor? Over the years I experimented with a lot of different languages and approaches. I ran experiments using Python, Scheme and also Prolog in my course. It turned out that I found myself mainly teaching how to program in Python, Scheme or Prolog (which still is something valuable for the students) instead of covering my main issue of concern: mastering complexity. In another approach I used XML as a tool ...