Zoem

an interpretive macro/programming language

Zoem: <Dutch>; The sound made by electrical devices and flying bugs. Pronounced: zoom or zum; the vowel is short.

Zoem is an interpretive macro/programming language that is evaluated by the macro processor called zoem. It is used by Portable Unix Documentation (PUD) and Aephea. The latter is a general authoring tool for writing HTML documents and provides both useful abstractions and a framework for creating new abstractions. It uses and promotes the use of CSS. A small core of Aephea has been ported to the typesetting language troff. This core is used in PUD, which provides mini-languages for FAQ documents and UNIX manual pages. Documents written in PUD can be output to troff and html, and further to plain text, PostScript, and PDF.

A zoem snippet to illustrate some features

\def{fib#1}{
   \push{fibonacci}
   \set{a}{1}
   \set{b}{1}
   \set{c}{0}
   \while{\let{\a <= \1}}{
      \setx{c}{\a}
      \setx{a}{\let{\a + \b}}
      \write{-}{txt}{\c\|}
      \setx{b}{\c}
   }
   \pop{fibonacci}
}
\: need to escape newlines below, lest they ruin the prompt
\write{-}{device}{Enter a number please, then press <cr>, <ctrl-d>\@{\N>\s}}\
\setx{num}{\zinsert{-}}\
\fib{\num}

The snippet computes Fibonacci numbers up to a limit provided by the user. It features different set macros, iteration, and arguments (delimited by curly braces), and shows that any special meaning or activity is always instigated by a backslash. Comments are introduced by \:. The sequence \@{..} is a way to explicitly control whitespace and can abstract over different output devices such as plain text, HTML or troff. The full language description is given in the Zoem User Manual.
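For readers more at home in a mainstream language, the loop inside fib#1 can be mirrored in Python. This is only a sketch of the control flow as read from the snippet above, not anything shipped with zoem. The name fib_upto is made up for illustration, and the values are collected in a list rather than written to an output file one by one.

```python
def fib_upto(limit):
    """Mirror the while loop of the zoem macro fib#1 (illustrative only)."""
    a, b, c = 1, 1, 0          # \set{a}{1} \set{b}{1} \set{c}{0}
    out = []
    while a <= limit:          # \while{\let{\a <= \1}}
        c = a                  # \setx{c}{\a}
        a = a + b              # \setx{a}{\let{\a + \b}}
        out.append(c)          # \write{-}{txt}{\c\|}
        b = c                  # \setx{b}{\c}
    return out
```

Calling fib_upto(20) collects every value of a that does not exceed 20, in the order the zoem loop would write them.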

Who uses zoem?

Very very very, very, very very very very few people. You'd be part of an elite group.

Zoem was used in a 128-language Quine relay by Yusuke Endoh - https://github.com/mame/quine-relay, with languages in alphabetical order.

This is a Ruby program that generates Rust program that generates Scala program that generates ...(through 128 languages in total)... REXX program that generates the original Ruby code again.

Zoem is in that august list of languages by virtue of satisfying the criterion for inclusion. It has slowly dawned on me this may be why zoem was called to life, aside from my own uses for it.

The magnitude of the Quine relay challenge is .. magnitudinous, see e.g. https://esoteric.codes/blog/the-128-language-quine-relay.

According to Endoh, the most challenging transitions were Befunge to BLC8 to brainfuck, as all three are esoteric. It runs through languages strictly in alphabetical order, so there was no opportunity to group easier-to-work-with languages together. BLC8 functions bit-wise, which meant finding a byte-aligned encoding way to work with it, to feed brainfuck. Other esolangs presented challenges as well; Piet, the language that uses images as source code, was a bit easier as it came after Perl 6, which bundles Zlib as standard library, making it straightforward to generate a PNG file. Had it followed, say, brainfuck, it would have been a much larger challenge.

It is tempting to speculate that zoem may be one of the easier languages to transition through, since either doubled backslashes or no backslashes yield a similar or unchanged text. But then easy should not be allowed anywhere near the phrase 128-language Quine relay (languages in alphabetical order).

Some aspects of the Zoem language highlighted

This pertains to the core zoem language, not to the packages built on top of it.

two-stage processing: macro evaluation and character filtering

simplicity in design

a single first-level meta character

a single way of delimiting strings and scopes

generic building blocks

macros can be easily treated as data

strict syntax, no fuzzy context rules

inside out evaluation if needed

arithmetic environment

data storage environment (multi-dimensional hashes)

iteration/list construct (apply macro)

exception framework integrated with error framework, applicable to arbitrarily deeply nested expressions, enabling e.g. the \catch{towel}{\while{1}{..deep-stuff..\throw{towel}..}} idiom.

interactive mode fully recovering from errors

interactive mode can be started from the command line, from within a file, or triggered when an error occurs during processing.

easy and comprehensive IO, control operators, dictionary stacks, system commands, comprehensive tracing

line-based text formatting capabilities, including positional alignment (left, centered, right), substring alignment, background padding, and virtual length specification (byte/glyph mismatch compensation).

regexp environment

fast

autotooled (courtesy Joost van Baal), should build on all Unix platforms

News

10 Dec 2021
zoem-21-341 released. From the ChangeLog:

Zoem is still as stable as it has been for the last ten years. No development or features are planned, with changes limited to bug fixes, documentation, and potentially code triage focussed on clarification of corner cases and tightening of weird nesting patterns involving file I/O.

Zoem source is now hosted on https://github.com/micans/zoem. Compiling zoem now requires installation of cimfomfa, a C utility library. Cimfomfa is hosted on https://github.com/micans/cimfomfa.

Tar releases will for the foreseeable future still be hosted on https://micans.org/zoem and https://micans.org/cimfomfa.

Previously cimfomfa was imported in the zoem source tree, but this practice has stopped. The zoem source tree contains a script that will download both the cimfomfa and zoem tar archives and compile both.

Fixed bug in let#1 (this is a cimfomfa bug); it would think that abs(2) == abs(1) is true. The cause was a combination of two things: failure to maintain float/integer correspondence by abs and other functions, and a wrong ternary cascade that would check float (in)equality even if the integer (in)equality test was already conclusive.
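To make the two interacting causes concrete, here is a small Python model of how such a bug can arise. It is emphatically not the cimfomfa code. The Num class, the function names, and the choice of 0.0 as the neglected float value are all assumptions made for illustration. The model only shows how an abs that fails to keep the float and integer views in step, combined with a comparison that consults the float view after the integer test has already settled the matter, can report abs(2) == abs(1) as true.

```python
from dataclasses import dataclass


@dataclass
class Num:
    """Hypothetical value carrying an integer view and a float view."""
    ival: int
    fval: float
    is_int: bool


def from_int(n):
    return Num(n, float(n), True)


def abs_buggy(x):
    # Assumption: only the integer view is updated; the float view is
    # not maintained and ends up at a default of 0.0.
    return Num(abs(x.ival), 0.0, x.is_int)


def abs_fixed(x):
    # Both views are kept in correspondence.
    if x.is_int:
        n = abs(x.ival)
        return Num(n, float(n), True)
    return Num(0, abs(x.fval), False)


def eq_buggy(x, y):
    # Wrong cascade: when the integers differ, the float test still runs.
    int_eq = x.is_int and y.is_int and x.ival == y.ival
    return True if int_eq else x.fval == y.fval


def eq_fixed(x, y):
    # The integer test is conclusive whenever both values are integers.
    if x.is_int and y.is_int:
        return x.ival == y.ival
    return x.fval == y.fval
```

In this model either fix on its own hides the symptom, which is one plausible reading of why the bug needed both flaws at once to show up.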

22 Sep 2010
In interactive mode zoem can utilise readline editing and history capabilities if available. A new builtin macro seq#4 can be used as a simple type of for loop. Environments that are not closed at the end of processing are now reported in a diagnostic message.

08 Jan 2010
The HTML document framework has been wrapped up and split off under the name Aephea, a bacronym that grandiosely expands to Adaptable Exo-skeleton for Practical HTML Extension and Abstraction. The Portable UNIX Documentation (PUD) minilanguages are now shipped with Aephea. PUD provides specialised support for two mini languages, classic UNIX manual pages and FAQs, for two output devices, troff and HTML. Aephea provides more elaborate support for a single output device, HTML. The zoem package from now on provides only resources and documentation for the macro/programming language zoem itself.

Dictionaries can be named, and keys can be set and retrieved in dictionaries that are located by name. Support for HTML was streamlined.

05 Sep 2008
The 08-248 release fixes a very old (and so far unnoticed) bug that manifested itself only on certain platforms. It quite likely only surfaces now because zoem is being used on more platforms, and because of the arrival of 64-bit platforms that often implement C's variadic argument lists in a way that exposes the bug.

25 Jul 2007
The 07-205 release contains new zoem examples that are solutions to the N-queens problem, showing the extent to which zoem allows regular programming. While zoem is not the most suitable language for solving this type of problem, the solution as written in zoem is not overly contrived. It uses zoem's powerful exception mechanism, which enables conditional transfer of control by rewinding the execution stack. Dictionary stacks (a concept separate from the execution stack) are used to separate namespaces for recursive invocations of the same macro. The solution also features the zoem iteration primitive, apply, and the use of the bottom user dictionary as a global namespace.

Syntax was introduced to directly access the bottom user dictionary, which can thus unambiguously act as a global namespace.
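The idea of using an exception to transfer control out of a deep recursion can be sketched in Python. This is not a transcription of the zoem examples shipped with the release. The names Found, place and solve are invented here, and the sketch stops at the first solution it meets.

```python
class Found(Exception):
    """Raised to unwind the recursion once a full placement exists."""

    def __init__(self, placement):
        super().__init__(placement)
        self.placement = placement


def place(row, n, cols):
    """Try to put a queen in `row`; cols[r] is the column used in row r."""
    if row == n:
        # Transfer control out of all nested calls at once.
        raise Found(list(cols))
    for col in range(n):
        # A square is safe if no earlier queen shares its column or diagonal.
        safe = all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))
        if safe:
            cols.append(col)
            place(row + 1, n, cols)
            cols.pop()  # backtrack


def solve(n):
    """Return one placement as a list of columns, or None if there is none."""
    try:
        place(0, n, [])
    except Found as hit:
        return hit.placement
    return None
```

Python gives every call its own local variables for free. In zoem that separation is what the pushed dictionaries provide, which is why the dictionary stack features in the zoem solution.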

22 Mar 2006
With 06-080 user macros are allowed to shadow (i.e. override) primitives. Syntax is provided to access primitives even if they are shadowed and thus inaccessible with regular syntax. This greatly aids in avoiding name clashes between modules and documents (by using the special syntax in modules), and it means that new primitives can be introduced in zoem without breaking older documents (as these documents are now allowed to shadow primitives).

The switch#2 primitive can group branches, whilst#2 flushes output immediately rather than building up the result the way while#2 does, and the session macro \__line__ should report the correct line number also in the presence of escaped newlines.
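Shadowing can be pictured with a toy name-resolution model in Python. It makes no claim about zoem's internals. The class Interp and its methods are hypothetical, and the lookup order (user dictionaries from the top of the stack downwards, then primitives) is an assumption chosen to match the description above.

```python
class Interp:
    """Toy model: a stack of user dictionaries in front of the primitives."""

    def __init__(self, primitives):
        self.primitives = dict(primitives)  # built-in macros
        self.user = [{}]                    # user[0] is the bottom dictionary

    def push(self):
        self.user.append({})

    def pop(self):
        if len(self.user) == 1:
            raise RuntimeError("cannot pop the bottom dictionary")
        self.user.pop()

    def define(self, key, value):
        # Definitions land in the top user dictionary.
        self.user[-1][key] = value

    def lookup(self, key):
        # Regular access: user definitions shadow primitives.
        for scope in reversed(self.user):
            if key in scope:
                return scope[key]
        return self.primitives[key]

    def lookup_primitive(self, key):
        # Stand-in for the special syntax that bypasses user definitions.
        return self.primitives[key]
```

A module that always goes through lookup_primitive keeps working even if a document defines a macro with the same name as a primitive, which is the point of the special syntax.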

15 Feb 2006
In 06-046 many primitives were recoded in modules (counters and references).

24 Nov 2005
It is possible since 05-328 to register macros to be processed after all regular processing has finished. This can be used by modules to do consistency checks and optionally issue warnings or summary statistics (for example, the number of undefined references that were found).

Documentation

The Zoem User Manual (html only). The definition of the zoem language.

The zoem interpreter manual (PostScript).

Download zoem

Download from micans.org and install from source.

Look at the ChangeLog.

Debian
Zoem is now shipped with Debian, courtesy Debian developer (and zoem autoconfiscator) Joost van Baal.

OpenBSD
Andreas Kahari built an OpenBSD port. Hopefully that link is sufficient to get interested OpenBSD people going.

Release log

Here is the zoem ChangeLog.

Some TODO items

enter zoem in the Unicode age. Read more further below.

fix the inspect macro interface and semantics (regular expression functionality).

more accessible syntax for nested anonymous or regular macros to facilitate staged processing (pipes).

customizable escape character (requires demand first)

more power in the filtering language (requires demand first)

separate macro packages from zoem package

UNIX manual and FAQ macros depend on ascii character encoding (in specifying filter rules)

printf, substr, length, split, pack macros (requires demand first)

Regarding Unicode, I have no clue yet what the sane ideas and requirements are. This project is postponed until at least a quarter-clue has emerged. Now let us call a hypothetical Unicode-enabled zoem successor zoef. Should zoef be encoding-agnostic, or is it reasonable to assume UTF-8? The current zoem language basically enables mapping ASCII and custom ASCII encodings to other character sets. Should zoef aspire to map Unicode to other encodings (unlikely)? Should zoef just expand ASCII maps and custom encodings with a Unicode target, perhaps adding explicit syntax for Unicode characters? Note that the current zoem is perfectly capable of handling UTF-8 input, with the caveat that it treats the input stream as a byte stream.
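What the byte stream caveat means in practice is easiest to see with lengths. The Python sketch below is an illustration only. The helper names are invented, and counting code points is merely an approximation of counting glyphs. It shows how padding to a fixed width goes wrong when the width is measured in bytes.

```python
def byte_length(s):
    """Length as seen by a processor that treats UTF-8 input as bytes."""
    return len(s.encode("utf-8"))


def glyph_length(s):
    """Code point count; close to the glyph count for precomposed text."""
    return len(s)


def pad_bytewise(s, width):
    # Pads too little when s contains multi-byte characters.
    return s + " " * max(0, width - byte_length(s))


def pad_glyphwise(s, width):
    return s + " " * max(0, width - glyph_length(s))


word = "caf\u00e9"  # four code points, five bytes in UTF-8
```

Here byte_length(word) is 5 while glyph_length(word) is 4, so padding by bytes to a width of 6 leaves the column one position short.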

Author

zoem the language and zoem the interpreter were written by Stijn van Dongen. zoem's build environment was created by Joost van Baal.

Miscellaneous

This page used to list a number of macro and mark-up processors and converters. That section has now got its own page.