Zoem: <Dutch> The sound made by electrical devices and flying bugs. Pronounced: zoom or zum; the vowel is short.
Zoem is an interpretive macro/programming language that is evaluated by the macro processor called zoem. It is used by Portable UNIX Documentation (PUD) and Aephea. The latter is a general authoring tool for writing HTML documents and provides both useful abstractions and a framework for creating new abstractions. It uses and promotes the use of CSS. A small core of Aephea has been ported to the typesetting language troff. This core is used in PUD, which provides mini-languages for FAQ documents and UNIX manual pages. Documents written in PUD can be output to troff and HTML, and from there to plain text, PostScript, and PDF.
The following features pertain to the core zoem language, not to the packages built on top of it.
two-stage processing: macro evaluation and character filtering
simplicity in design
a single first-level meta character
a single way of delimiting strings and scopes
generic building blocks
macros can be easily treated as data
strict syntax, no fuzzy context rules
inside out evaluation if needed
arithmetic environment
data storage environment (multi-dimensional hashes)
iteration/list construct (apply macro)
exception framework integrated with error framework, applicable to arbitrarily deeply nested expressions, enabling e.g. the \catch{towel}{\while{1}{..deep-stuff..\throw{towel}..}} idiom
interactive mode fully recovering from errors
interactive mode can be started from the command line, from within a file, or triggered when an error occurs during processing.
easy and comprehensive IO, control operators, dictionary stacks, system commands, comprehensive tracing
line-based text formatting capabilities, including positional alignment (left, centered, right), substring alignment, background padding, and virtual length specification (byte/glyph mismatch compensation).
regexp environment
fast
autotooled (courtesy Joost van Baal), should build on all Unix platforms
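The \catch{towel}{\while{1}{..\throw{towel}..}} idiom in the list above escapes from an endless loop at arbitrary nesting depth. As a rough analogy only (this is Python, not zoem, and the names Towel, deep_stuff and run are made up for illustration), the same control flow can be sketched as:

```python
class Towel(Exception):
    """Hypothetical tag, playing the role of the 'towel' label in the idiom."""


def deep_stuff(depth, log):
    # Recurse a few levels to stand in for "..deep-stuff..".
    log.append(depth)
    if depth == 3:
        raise Towel()              # analogous to \throw{towel}
    deep_stuff(depth + 1, log)


def run():
    log = []
    try:                           # analogous to \catch{towel}{...}
        while True:                # analogous to \while{1}{...}
            deep_stuff(0, log)
    except Towel:
        pass                       # the loop is abandoned, control resumes here
    return log


print(run())  # [0, 1, 2, 3]
```

The throw unwinds every nested call and the enclosing loop in one step, which is the point of the idiom: no intermediate level needs to test a flag.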
22 Sep 2010
In interactive mode zoem can utilise readline editing and history
capabilities if available. A new builtin macro seq#4 can be
used as a simple type of for loop. Environments that are not
closed at the end of processing are now reported in a diagnostic
message.
08 Jan 2010
The HTML document framework has been wrapped up and split off under the
name Aephea, a bacronym that grandiosely expands to
Adaptable Exo-skeleton for Practical HTML Extension and Abstraction.
The Portable UNIX Documentation (PUD) mini-languages are now shipped with
Aephea. PUD provides specialised support for two mini-languages, classic UNIX manual pages
and FAQs, for two output devices, troff and HTML. Aephea provides more
elaborate support for a single output device, HTML.
The zoem package from now on provides only resources and documentation
for the macro/programming language zoem itself.
Dictionaries can be named, and keys can be set and retrieved in dictionaries that are located by name. Support for HTML was streamlined.
05 Sep 2008
The 08-248 release fixes a very old (and so far unnoticed)
bug that manifested itself only on certain platforms. It quite likely only
surfaces now because zoem is being used on more platforms, and because of
the arrival of 64-bit platforms that often implement C's variadic argument
lists in a way that exposes the bug.
25 Jul 2007
The 07-205 release contains new zoem examples that are solutions to the
N-queens problem, showing the extent to which zoem allows regular
programming. While zoem is not the most suitable language for solving this
type of problem, the solution as written in zoem is not overly contrived.
It uses zoem's powerful exception mechanism, which
enables conditional transfer of control by rewinding the execution stack.
Dictionary stacks (a concept separate from the execution stack) are used to
separate namespaces for recursive invocations of the same macro. The
solution also features the zoem iteration primitive, apply,
and the use of the bottom user dictionary as a global namespace.
Syntax was introduced to directly access the bottom user dictionary, which can thus unambiguously act as a global namespace.
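To make the interplay of these ideas concrete, here is a minimal sketch in Python (not zoem, and not the shipped example). It assumes a dictionary stack is a stack of key/value maps searched from the top down, with the bottom map acting as the global namespace. DictStack, Solved, place and first_solution are hypothetical names invented for this illustration.

```python
class DictStack:
    """A stack of dictionaries; lookups search from the top down."""

    def __init__(self):
        self.stack = [{}]                 # stack[0] is the bottom dictionary

    def push(self):
        self.stack.append({})

    def pop(self):
        if len(self.stack) == 1:
            raise RuntimeError("cannot pop the bottom dictionary")
        self.stack.pop()

    def set(self, key, value):
        self.stack[-1][key] = value       # define in the top dictionary

    def get(self, key):
        for d in reversed(self.stack):
            if key in d:
                return d[key]
        raise KeyError(key)

    @property
    def bottom(self):
        return self.stack[0]              # direct access to the global namespace


class Solved(Exception):
    """Hypothetical signal thrown once a full placement has been found."""


def place(ds, n, row, cols):
    ds.push()                             # fresh namespace for this invocation
    try:
        ds.set("row", row)
        if row == n:
            ds.bottom["solution"] = list(cols)
            raise Solved()                # rewind the whole call stack at once
        for c in range(n):
            # A queen at (row, c) must not share a column or diagonal.
            if all(c != pc and abs(c - pc) != row - pr
                   for pr, pc in enumerate(cols)):
                place(ds, n, row + 1, cols + [c])
                # The recursive call did not clobber this invocation's 'row'.
                assert ds.get("row") == row
    finally:
        ds.pop()


def first_solution(n):
    ds = DictStack()
    try:
        place(ds, n, 0, [])
    except Solved:
        pass
    return ds.bottom.get("solution")


print(first_solution(4))  # [1, 3, 0, 2]
```

Each recursive invocation pushes its own dictionary, so the key "row" never collides between levels, while the result is written to the bottom dictionary so that it survives the unwinding.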
22 Mar 2006
With 06-080 user macros are allowed to shadow (i.e. override)
primitives. Syntax is provided to access primitives even
if they are shadowed and thus inaccessible with regular syntax.
This greatly aids in avoiding name clashes between
modules and documents (by using the special syntax in modules),
and it means that new primitives can be introduced in zoem
without breaking older documents (as these documents are
now allowed to shadow primitives).
The switch#2 primitive can group branches. The whilst#2 primitive flushes output immediately rather than building up the result the way while#2 does. The session macro \__line__ should report the correct line number even in the presence of escaped newlines.
15 Feb 2006
In 06-046 many primitives were recoded
in modules (counters and references).
24 Nov 2005
It is possible since 05-328 to register macros to be processed after all
regular processing has finished. This can be used by modules
to do consistency checks and optionally issue warnings or
summary statistics (for example, the number of undefined references
that were found).
Download from micans.org and install from source.
Debian
Zoem is now shipped with Debian,
courtesy Debian developer (and zoem autoconfiscator) Joost van Baal.
OpenBSD
Andreas Kahari built an
OpenBSD port.
Hopefully that link is sufficient to get interested OpenBSD people going.
Here is the zoem ChangeLog.
bring zoem into the Unicode age. Read more further below.
fix the inspect macro interface and semantics (regular expression functionality).
more accessible syntax for nested anonymous or regular macros to facilitate staged processing (pipes).
customizable escape character (requires demand first)
more power in the filtering language (requires demand first)
separate macro packages from zoem package
UNIX manual and FAQ macros depend on ASCII character encoding (in specifying filter rules)
printf, substr, length, split, pack macros (requires demand first)
Regarding Unicode, I have no clue yet what the sane ideas and requirements are. This project is postponed until at least a quarter-clue has emerged. Now let us call a hypothetical Unicode-enabled zoem successor zoef. Should zoef be encoding-agnostic, or is it reasonable to assume UTF-8? The current zoem language basically enables mapping ASCII and custom ASCII encodings to other character sets. Should zoef aspire to map Unicode to other encodings (unlikely)? Should zoef just expand ASCII maps and custom encodings with a Unicode target, perhaps adding explicit syntax for Unicode characters? Note that the current zoem is perfectly capable of handling UTF-8 input, with the caveat that it treats the input stream as a byte stream.
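The byte-stream caveat matters mostly where lengths are computed, for example when padding text to a fixed width. The following Python sketch (not zoem; pad_bytes and pad_glyphs are hypothetical helpers, and code points are used as a rough stand-in for glyphs) shows how the two notions of length diverge for UTF-8 input:

```python
def pad_bytes(s: str, width: int) -> bytes:
    """Right-pad treating the text as a byte stream (length = number of bytes)."""
    raw = s.encode("utf-8")
    return raw + b" " * max(0, width - len(raw))


def pad_glyphs(s: str, width: int) -> bytes:
    """Right-pad using the number of code points as the display length."""
    return s.encode("utf-8") + b" " * max(0, width - len(s))


word = "zo\u00ebm"                       # 4 code points, 5 bytes in UTF-8

print(len(word), len(word.encode("utf-8")))          # 4 5
print(len(pad_bytes(word, 8).decode("utf-8")))       # 7: one column short
print(len(pad_glyphs(word, 8).decode("utf-8")))      # 8: as intended
```

A processor that counts bytes pads one column too few for every two-byte character, which is the kind of mismatch that an explicit length override can compensate for. Combining characters and wide glyphs would need more care than this sketch takes.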
zoem the language and zoem the interpreter were written by Stijn van Dongen. zoem's build environment was created by Joost van Baal.
This page used to list a number of macro and mark-up processors and converters. That section has now got its own page.