Objectionary: Dictionary and Factory for EO Objects
Since the time of Kernighan and Ritchie we have shared binary code in libraries. You need to print some text with printf() in C++? You get the libc library with 700+ other functions inside. You need to copy a Java stream? You get Apache Commons IO with all its other methods and classes. Whatever you need, be it an object, a class, a function, or a method, you have to add the entire library to your build.
Wouldn’t it be more elegant to deal with individual objects instead?
The idea is not new and not mine. I got it from the book Object Thinking by David West, where he suggested creating an Objectionary (page 306), a “combination of dictionary and object factory,” with the following properties:
- The total number of objects is less than 2000;
- Each object is an autonomous executable entity;
- Every object has a unique ID and a unique “address”;
- Objects are nothing more than collections of objects;
- Objects require hardware-specific VMs for execution.
Seventeen years later (the book was published in 2004), we implemented the idea on top of EO, our new programming language. The language is intentionally much simpler than Java or C++. You can read its more or less formal description here.
To turn an EO program into an executable entity and release it to the Objectionary, one has to go through the following mandatory steps, assuming the JVM is used as a target platform (the steps marked with 🌵 are implemented by our eo-maven-plugin):
- Discover🌵: find all foreign aliases
- Pull🌵: download foreign
- Resolve🌵: download and unpack
- Place🌵: move artifact
- Mark🌵: mark .eo sources found in
- ↑ Go back to Parse if some .eo files are still not parsed
- Assemble🌵: same as above, but for tests
- Test: run all unit tests
- Unplace🌵: remove artifact
- Unspile🌵: remove auto-generated
- Copy🌵: copy
- Deploy: package .jar artifact and put it into Maven Central
- Push: send a pull request to yegor256/objectionary
- Merge: we test and merge the pull request
It is an iterative process, which loops over and over again until all required .eo objects are parsed and their atoms are present. Then, .xmir files are transpiled to .java and compiled to .class binaries. Then, they are tested, packaged, and deployed to Maven Central. Finally, the sources are merged into the master branch of Objectionary via a pull request.
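To make the looping part more tangible, here is a minimal sketch of such a fixed-point iteration. It is not the plugin's code: the catalog is modeled as a plain dictionary, the `parse` and `discover` callbacks are hypothetical stand-ins, and the object names in the toy graph are made up.

```python
def assemble(catalog, parse, discover):
    """Repeat parse/discover cycles until no registered object is unparsed.

    catalog  -- dict mapping an object name to "registered" or "parsed"
    parse    -- hypothetical callback: parse(name) returns a parsed form
    discover -- hypothetical callback: discover(parsed) returns referenced names
    Returns the number of cycles that did some work.
    """
    cycles = 0
    while True:
        pending = [name for name, status in catalog.items() if status != "parsed"]
        if not pending:
            return cycles
        cycles += 1
        for name in pending:
            parsed = parse(name)
            catalog[name] = "parsed"
            # Newly seen references enter the catalog and trigger another cycle.
            for ref in discover(parsed):
                catalog.setdefault(ref, "registered")


# Toy dependency graph standing in for real .eo sources (made-up names).
GRAPH = {"app": ["user", "stdout"], "user": ["stdout"], "stdout": []}
catalog = {"app": "registered"}
cycles = assemble(catalog, parse=lambda name: GRAPH[name], discover=lambda refs: refs)
```

The loop stops only when a full pass finds nothing left to parse, which is the same stopping condition the process above relies on.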
The first part of the algorithm can be automated with our Maven plugin, simply by placing .eo sources into src/main/eo/ and adding the plugin to your build. The register goal will scan the src/main/eo/ directory, find all .eo sources, and “register” them in a special CSV catalog at target/eo-foreigns.csv. Next, the assemble goal will call a chain of goals, ending with resolve. All these goals use the CSV catalog when they parse, optimize, pull and so on.
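Here is a rough sketch of what registering could look like. The column names (`id`, `source`, `status`) and the `registered` status value are my assumptions for illustration; the real catalog layout used by the plugin may be different.

```python
import csv
import tempfile
from pathlib import Path


def register(sources_dir, catalog_path):
    """Scan sources_dir for .eo files and list them in a CSV catalog.

    The catalog columns are assumed for this sketch, not taken from the plugin.
    """
    sources = sorted(Path(sources_dir).rglob("*.eo"))
    Path(catalog_path).parent.mkdir(parents=True, exist_ok=True)
    with open(catalog_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["id", "source", "status"])
        for src in sources:
            # org/example/app.eo becomes the object name org.example.app
            rel = src.relative_to(sources_dir).with_suffix("")
            writer.writerow([".".join(rel.parts), src.as_posix(), "registered"])
    return len(sources)


# Demo on a throwaway directory with one made-up source file.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "src/main/eo/org/example")
    pkg.mkdir(parents=True)
    (pkg / "app.eo").write_text("# placeholder source\n")
    catalog_file = Path(tmp, "target/eo-foreigns.csv")
    count = register(Path(tmp, "src/main/eo"), catalog_file)
    rows = list(csv.reader(catalog_file.read_text().splitlines()))
```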
When all of them are done, assemble checks the catalog: do any .eo files still require parsing? If they do, another cycle starts, again with parsing. When all .eo files are parsed, transpile is executed, which turns .xmir files into .java files and places them into
target/generated-sources. The rest is done by the
Let’s discuss each step in detail.
Say, this is the
.eo source code at
It will be parsed to this XMIR (XML Intermediate Representation):
If you wonder what this XML means, read this document: there is a section about XMIR.
At this step the XMIR produced by the parser goes through many XSL transformations, sometimes getting additional elements and attributes. Our example XMIR may get a new attribute @ref, which links the reference to user back to the line where the object was defined:
Some XSL transformations may check for grammar or semantic errors and add a new element <errors/> if something wrong is found. Thus, if parsing didn’t find any syntax errors, all other errors will be visible inside the XMIR document, for example, like this:
By the way, this is not a real error, I just made it up.
At this step we find out which objects are “foreign”. In our example, user is not foreign, since it’s defined in the code we have in front of us, while the object stdout is not defined here, which is why it is a foreign one.
Going through all .xmir files, we can easily judge which objects are foreign just by looking at their names. Once we see a reference to an object like stdout, we check for the presence of the file org/eolang/io/stdout.eo in the directory with .eo sources. If the file is absent, we put the object name into the CSV catalog and claim it to be foreign.
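The check itself is simple enough to sketch in a few lines. This is only an illustration, assuming that a dotted object name maps one-to-one to a file path under the sources directory; the function name `find_foreign` is hypothetical.

```python
import tempfile
from pathlib import Path


def find_foreign(referenced, sources_dir):
    """Return the referenced object names that have no local .eo source."""
    foreign = []
    for name in referenced:
        # org.eolang.io.stdout is expected at org/eolang/io/stdout.eo
        path = Path(sources_dir, *name.split(".")).with_suffix(".eo")
        if not path.is_file():
            foreign.append(name)
    return foreign


# Demo: user.eo exists locally, stdout.eo does not.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "user.eo").write_text("# placeholder source\n")
    foreign = find_foreign(["user", "org.eolang.io.stdout"], tmp)
```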
Here we simply try to find source code .eo files for all foreign objects in Objectionary. We find them there and pull them to the local disk. Pay attention: we pull the sources. Not binaries or compiled XMIR documents, but the sources. This is what a source file may look like after the pull:
The object is an atom. This means that even though we have its source code, it’s not complete without a piece of platform-specific binary code. An atom is an object implemented by the runtime platform where the EO program is executed (also known as an FFI mechanism).
The line that starts with +rt (runtime) explains where to get the runtime code. The jvm part is the name of the runtime.
By the way, a program may contain a number of +rt meta instructions, for example:
Here, three runtime platforms will know where to get the missing code: EO➝Java will go to Maven Central for the JAR artifact, EO➝Ruby will go to RubyGems, trying to find the eo-core gem in the requested version, while EO➝Python will go to PyPI, trying to find the eo-basics package in the requested version.
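Here is a small sketch of how a tool could read such meta instructions and pick the one for its platform. I assume that each line has the shape `+rt <runtime> <locator>`; the JVM artifact name and all version numbers below are hypothetical placeholders.

```python
def runtime_locators(source):
    """Map each runtime name to its locator, read from +rt meta lines.

    Assumes every meta line looks like "+rt <runtime> <locator>".
    """
    locators = {}
    for line in source.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "+rt":
            locators[parts[1]] = parts[2]
    return locators


# Made-up program header: the artifact name and versions are placeholders.
SOURCE = """\
+rt jvm org.example:example-runtime:1.2.3
+rt ruby eo-core:1.2.3
+rt python eo-basics:1.2.3

[] > app
"""
locators = runtime_locators(SOURCE)
```

Each transpiler would then look up only its own key, for example `locators["jvm"]`, and ignore the rest.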
Next we place all .class files found in the unpacked JAR into the target/classes directory. We do this in order to help Maven Compiler Plugin find them in the classpath.
In each JAR file that arrives we can find .eo sources. They are the programs this JAR file had in its classpath while it was built. We consider them foreign objects too and add them to the CSV catalog.
When all foreign objects registered in the catalog are downloaded, compiled, and optimized, we are ready to start transpiling. Instead of compiling XMIR directly to Bytecode, we transpile it to Java and let the Java compiler do the job of generating Bytecode.
We believe that there are a few benefits of transpiling to Java vs. compilation to Bytecode:
- Output code is easier to read and debug,
- Optimization power of existing compilers is reused,
- Complexity of a transpiler is lower than of a compiler,
- Portability of the output code is higher.
We already have two EO➝Java transpilers: the canonical one and the one made by HSE University. We also have an experimental EO➝Python transpiler made by students of Innopolis University. Most probably, when you read this article, there will be more transpilers available.
Even though we believe in transpiling, it’s still possible to create EO➝Bytecode, EO➝LLVM, or EO➝x86 compilers. You are more than welcome to try!
At this step, the standard Maven Compiler Plugin takes the generated .java files and turns them into .class binaries.
Here, we remove all .class files unpacked from dependencies. This is necessary in order to avoid getting them packaged into the JAR.
We do placing and then unplacing simply because Maven Compiler Plugin doesn’t allow us to extend the classpath at runtime. If it were possible, we would just download dependencies from Maven Central and add them to the classpath, without unpacking, placing, and then unplacing.
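The place-then-unplace trick can be sketched like this. It is an illustration only: I assume the list of placed files is remembered in memory between the two steps, while a real tool would have to persist it somewhere. The function names and file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def place(unpacked_dir, classes_dir):
    """Copy dependency .class files into classes_dir and remember them."""
    placed = []
    for src in sorted(Path(unpacked_dir).rglob("*.class")):
        dest = Path(classes_dir) / src.relative_to(unpacked_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        placed.append(dest)
    return placed


def unplace(placed):
    """Remove only the files that place() put there earlier."""
    for path in placed:
        path.unlink(missing_ok=True)


# Demo: one dependency class is placed and removed, our own class survives.
with tempfile.TemporaryDirectory() as tmp:
    unpacked = Path(tmp, "unpacked")
    classes = Path(tmp, "target/classes")
    (unpacked / "org/example").mkdir(parents=True)
    (unpacked / "org/example/Atom.class").write_bytes(b"")
    classes.mkdir(parents=True)
    (classes / "Own.class").write_bytes(b"")

    placed = place(unpacked, classes)
    visible = (classes / "org/example/Atom.class").is_file()
    unplace(placed)
    removed = not (classes / "org/example/Atom.class").exists()
    own_kept = (classes / "Own.class").is_file()
```

Tracking exactly what was placed matters: unplacing by file pattern alone could delete our own compiled classes too.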
Here, we delete all .class files that were auto-generated from .eo. We don’t want to ship binaries that can be generated from .eo sources. We only want to ship atoms, which are .java files originally.
At this step we take all .eo sources from src/main/eo/ and copy them to the target/classes/EO-SOURCES/ directory. Later, they will be packaged together with .class files into a .jar, which will be deployed to Maven Central. While copying, we replace 0.0.0 in the runtime version with the version currently being deployed. Take a look at such a file in its source repository:
The version at the +rt line is 0.0.0. When sources are copied to the JAR, this text is replaced.
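The replacement can be sketched as a line-by-line rewrite. I assume here that the version is the last colon-separated part of the +rt line; the artifact name and the release version are hypothetical.

```python
def stamp_version(source, version):
    """Replace the 0.0.0 marker at the end of +rt lines with a real version.

    Assumes the marker is the last colon-separated part of the +rt line.
    Other mentions of 0.0.0 in the file are left untouched.
    """
    out = []
    for line in source.splitlines():
        if line.startswith("+rt ") and line.endswith(":0.0.0"):
            line = line[: -len("0.0.0")] + version
        out.append(line)
    return "\n".join(out) + "\n"


# Made-up source: only the +rt line should change.
SOURCE = (
    "+rt jvm org.example:example-runtime:0.0.0\n"
    "\n"
    "# the text 0.0.0 in a comment stays as it is\n"
)
stamped = stamp_version(SOURCE, "1.2.3")
```

Touching only the +rt lines keeps the rewrite from corrupting other places where the same text appears by coincidence.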
The motivation to ship sources together with binaries is the following. When atom binaries are compiled from Java to Bytecode, they stay next to transpiled sources. They are compiled together. Moreover, unit tests also rely on both atom sources and auto-generated/transpiled sources. We want future users of the JAR to know what sources we had in place when the compilation was going on, to maybe let them reproduce it or at least know what were the surroundings of the binaries they get.
From a more practical standpoint, we need these sources in the JAR in order to let the Mark step understand what objects are worth pulling next to the atoms resolved.
Here, we package everything from target/classes/ into a JAR archive and deploy it to Maven Central.
I suggest deploying sources to GitHub Pages too, to let users see them on the Web. Also, it will be helpful later when we make a pull request to Objectionary. There is a script in one of my EO libraries that deploys .eo sources to GitHub Pages, replacing the 0.0.0 version markers in them correctly.
When the deployment is finished and Maven Central updates its CDN servers, it’s time to submit a pull request to yegor256/objectionary. The .eo sources of objects go into objects/ and their unit tests into tests/. Basically, we just copy them over there. But, stop… one important detail. In the sources, as was said earlier, +rt versions are set to 0.0.0. Here, when we copy to Objectionary, versions must be set to real numbers.
When the pull request arrives, a GitHub Action pre-configured in the repository transpiles .eo sources to all known platforms and runs all unit tests. If everything is clean, we review the pull request and decide whether the suggested objects go along with others already present in the Objectionary.
Once the pull request is merged, the objects become part of the centralized dictionary of all objects of EO. Take a look at this pull request, where a new object was submitted to Objectionary, after its atom was deployed to Maven Central.