Direkt zum Hauptbereich

Concatenative Programming via Rewriting in Scheme

The kernel of a concatenative programming language can be easily implemented using a rewriting system. As a proof-of-concept the presented implementation fully relies on Scheme's rewriting system at macro expand time! All one needs is define-syntax and -- in order to provide access to Scheme's built-in procedures -- a single defmacro.

(Apologies, if the formatting makes the code below hard to read. If you like to get the Scheme code, just drop me an email.)

Rewriting rules concerning the data stack (ds) are written as follows:

(rule swap : (@d ... snd fst) ==> (@d ... fst snd))
^^^^ ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
word datastack before datastack after

Note that the topmost element of the data stack is the rightmost element in the notation above.

Rewriting rules concerning the data _and_ the program stack (ps) are written as

(rule call : (@d ... (@q ...)) (@p ...) ==> (@d ...) (@q ... @p ...))
^^^^ ^^^^^^^^^^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^^^^^^^^
word ds before ps before ds after ps after

Note that the topmost element of the program stack is the leftmost(!) element. Letting the data stack and the program stack "touch" their heads, makes it easier to describe the process when a word or quotation moves from the top of the program stack to the top of the data stack. The item just appears to move over to the other side.

Quotations are written as lists. The notation for patterns is according to the conventions for syntax-rules in Scheme. The use of a "@" in patterns has no special syntactic meaning, it's just a personal preference.

To run a concatenative program type

> (run () (2 3 plus))
^^ ^^^^^^^^^^
ds ps

If you type this in at the console, the result just shows the resulting data stack since the program stack is empty.

Here are some examples and their results (indicated by "-->"):

> (run (1 2) (3 4)) --> (1 2 3 4)
> (run () (2 1 swap dup 1 plus)) --> (1 2 3)
> (run (1 2) ((plus dup) call)) --> (3 3)
> (run () (4 5 1 2 eq (plus) (minus) ifte)) --> (-1)
> (run () (4 5 1 2 eq neg (plus) (minus) ifte)) --> (9)

This is the way, I implemented the rewriting system with Scheme macros only. I use PLT Scheme, choose language "Pretty Big".

First, we need to import "defmacro". In addition, we need a nice macro hack from Oleg Kiselyov, see "How to write symbol? with syntax-rules". We use it in the definition of 'run'.

(require mzlib/defmacro)

(define-syntax symbol??
(syntax-rules ()
((symbol?? (x . y) kt kf) kf) ; It's a pair, not a symbol
((symbol?? #(x ...) kt kf) kf) ; It's a vector, not a symbol
((symbol?? maybe-symbol kt kf)
(let-syntax
((test
(syntax-rules ()
((test maybe-symbol t f) t)
((test x t f) f))))
(test abracadabra kt kf)))))

The rule macro transforms rewrite rules into new macros. For example, the rule for 'swap'

(rule swap : (@d ... snd fst) ==> (@d ... fst snd))

creates the following new macro

((swap (@d ... snd fst) ps) (run (@d ... fst snd) ps))

That means, whenever an s-expression with 'swap' being leftmost is encountered, such like '(swap (1 2 3) (4 5))', the macro is applied. In the example, the result is '(run (1 3 2) (4 5))'.

(define-syntax rule
(syntax-rules (: ==>)
((_ word : (@d1 ...) ==> (@d2 ...))
(define-syntax word
(syntax-rules ()
((_ (@d1 ...) ps) (run (@d2 ...) ps)))))
((_ word : (@d1 ...) ==> (@d2 ...)
(@d3 ...) ==> (@d4 ...))
(define-syntax word
(syntax-rules ()
((_ (@d1 ...) ps) (run (@d2 ...) ps))
((_ (@d3 ...) ps) (run (@d4 ...) ps)))))
((_ word : (@d1 ...) (@p1 ...) ==> (@d2 ...) (@p2 ...))
(define-syntax word
(syntax-rules ()
((_ (@d1 ...) (@p1 ...)) (run (@d2 ...) (@p2 ...))))))
((_ word : (@d1 ...) (@p1 ...) ==> (@d2 ...) (@p2 ...)
(@d3 ...) (@p3 ...) ==> (@d4 ...) (@p4 ...))
(define-syntax word
(syntax-rules ()
((_ (@d1 ...) (@p1 ...)) (run (@d2 ...) (@p2 ...)))
((_ (@d3 ...) (@p3 ...)) (run (@d4 ...) (@p4 ...))))))
))

The run macro keeps the rewriting process running until the program stack is empty. The assumption is that there is a rewriting rule defined for symbols used on the program stack.

(define-syntax run
(syntax-rules ()
((run (@d ...) ()) ; we are done
'(@d ...))
((run (@d ...) (wrd @p ...)) ; inspect next wrd on program stack
(symbol?? wrd
(wrd (@d ...) (@p ...)) ; activate macro associated with wrd
(run (@d ... wrd) (@p ...)))))) ; move wrd on top of data stack

In order to have access to Scheme's built in procedures, we introduce a rewriting rule for a word called 'alien'. In Scheme, a procedure such as '+' can also be called using apply:

> (+ 2 3)
5
> (apply + '(2 3))
5

Thus, 'alien' expects the arguments and a symbol on top of the data stack, each wrapped in a quotation/list. To interact with Scheme, we need to use defmacro _alien.

(defmacro _alien (op q ds ps)
`(run (,@ds ,(apply (eval op) q)) ,ps))

(define-syntax alien
(syntax-rules ()
((_ (@d ... (@q ...) (op)) (@p ...))
(_alien op (@q ...) (@d ...) (@p ...)))))

Now let us define some rewriting rules:

(rule swap : (@d ... snd fst) ==> (@d ... fst snd))
(rule dup : (@d ... fst) ==> (@d ... fst fst))
(rule quote1 : (@d ... fst) ==> (@d ... (fst)))
(rule quote2 : (@d ... snd fst) ==> (@d ...(snd fst)))
(rule quote3 : (@d ... trd snd fst) ==> (@d ... (trd snd fst)))
(rule call : (@d ... (@q ...)) (@p ...) ==> (@d ...) (@q ... @p ...))
(rule ifte : (@d ... #f (@t ...) (@e ...)) (@p ...) ==> (@d ...) (@e ... @p ...)
(@d ... obj (@t ...) (@e ...)) (@p ...) ==> (@d ...) (@t ... @p ...))
(rule plus : (@d ...) (@p ...) ==> (@d ...) (quote2 (+) alien @p ...))
(rule minus : (@d ...) (@p ...) ==> (@d ...) (quote2 (-) alien @p ...))
(rule mult : (@d ...) (@p ...) ==> (@d ...) (quote2 (*) alien @p ...))
(rule div : (@d ...) (@p ...) ==> (@d ...) (quote2 (/) alien @p ...))
(rule eq : (@d ...) (@p ...) ==> (@d ...) (quote2 (eq?) alien @p ...))
(rule neg : (@d ... #f) ==> (@d ... #t)
(@d ... obj) ==> (@d ... #f))

And we are done ;-)

For information about concatenative languages, see concatenative.org.

Beliebte Posts aus diesem Blog

Lidl und der Kassen-Bug

Es gibt Fehler, im Informatiker-Jargon "Bugs", die etwas anrühriges haben. Ich bat den Menschen an der Kasse bei Lidl um einen Moment Geduld und meine Kinder um Ruhe, um nicht den wunderbaren Moment zu verpassen, bei dem es passierte. Der Lidl-Mensch fluchte kurz auf -- und ich war entzückt! "Einen Moment, davon muss ich ein Foto machen!" Und dann machte ich noch eines. Ich bin heute extra für diesen Fehler zu Lidl gepilgert -- ich wollte es mit eigenen Augen sehen. Gestern hat mir ein Student (vielen Dank Herr Breyer) von diesem Fehler in einer EMail berichtet. Ein richtig schöner Fehler, ein Klassiker geradezu. Ein Fehler, den man selten zu Gesicht bekommt, so einer mit Museumswert. Dafür wäre ich sogar noch weiter gereist als bis zum nächsten Lidl. Der Fehler tritt auf, wenn Sie an der Kasse Waren im Wert von 0 Euro (Null Euro) bezahlen. Dann streikt das System. Die kurze Einkaufsliste dazu: Geben Sie zwei Pfandflaschen zurück und Lidl steht mit 50 Cent bei Ihne...

Syntax und Semantik

Was ist Syntax, was ist Semantik? Diese zwei Begriffe beschäftigen mich immer wieder, siehe zum Beispiel auch " Uniform Syntax " (23. Feb. 2007). Beide Begriffe spielen eine entscheidende Rolle bei jeder Art von maschinell-verarbeitbarer Sprache. Vom Dritten im Bunde, der Pragmatik, will ich an dieser Stelle ganz absehen. Die Syntax bezieht sich auf die Form und die Struktur von Zeichen in einer Sprache, ohne auf die Bedeutung der verwendeten Zeichen in den Formen und Strukturen einzugehen. Syntaktisch korrekte Ausdrücke werden auch als "wohlgeformt" ( well-formed ) bezeichnet. Die Semantik befasst sich mit der Bedeutung syntaktisch korrekter Zeichenfolgen einer Sprache. Im Zusammenhang mit Programmiersprachen bedeutet Semantik die Beschreibung des Verhaltens, das mit einer Interpretation (Auslegung) eines syntaktisch korrekten Ausdrucks verbunden ist. [Die obigen Begriffserläuterungen sind angelehnt an das Buch von Kenneth Slonneger und Barry L. Kurtz: Formal Syn...

Factor @ Heilbronn University

It was an experiment -- and it went much better than I had imagined: I used Factor (a concatenative programming language) as the subject of study in a project week at Heilbronn University in a course called "Software Engineering of Complex Systems" (SECS). Maybe we are the first university in the world, where concatenative languages in general and Factor in specific are used and studied. Factor is the most mature concatenative programming language around. Its creator, Slava Pestov, and some few developers have done an excellent job. Why concatenative programming? Why Factor? Over the years I experimented with a lot of different languages and approaches. I ran experiments using Python, Scheme and also Prolog in my course. It turned out that I found myself mainly teaching how to program in Python, Scheme or Prolog (which still is something valuable for the students) instead of covering my main issue of concern: mastering complexity. In another approach I used XML as a tool ...