Document Schema in Textended
Textended is a projectional editor. So far, the core representation has been a tree. Every file starts with a header, followed by a forest built from these primitives:
- A Labelled list
- A Labelled string or binary blob
- A Label, a symbol.
This is also the document model once the file has been read in, and the user can edit it directly. This means the document can represent a large number of constructs that are not valid in any language. To interpret the constructs or lay them out on the screen, we need to read them.
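As a rough illustration, the three primitives could be modelled in Python along these lines. This is only a sketch: the class names (`Symbol`, `Blob`, `LabelledList`) and the example labels are hypothetical and are not taken from Textended's source.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Symbol:
    """A bare label with no payload (hypothetical name)."""
    label: str


@dataclass
class Blob:
    """A labelled string or binary blob (hypothetical name)."""
    label: str
    data: Union[str, bytes]


@dataclass
class LabelledList:
    """A labelled list whose items are again primitives (hypothetical name)."""
    label: str
    items: List["Node"] = field(default_factory=list)


Node = Union[Symbol, Blob, LabelledList]

# The body of a file is a forest: a plain sequence of top-level nodes.
# The labels used here ("def", "name", ...) are made up for the example.
forest: List[Node] = [
    LabelledList("def", [
        Blob("name", "add"),
        LabelledList("arguments", [Blob("name", "a"), Blob("name", "b")]),
        LabelledList("body", [Symbol("pass")]),
    ]),
    # Nothing in the model itself forbids shapes that no language accepts.
    LabelledList("def", [Symbol("pass"), Blob("raw", b"\x00\x01")]),
]
```

The second tree is perfectly legal as a document, yet a reader for any particular language would have to reject it or fall back to a generic layout.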
You could read the above lists in directly and interpret or construct a layout from them. This is impractical for several reasons:
- You end up with big if/elif/else blocks, and to extend the language you have to implement something like Lisp macros.
- If the document isn't valid in your language, the layout needs a fallback that you have to code yourself.
- It is hard to construct a parser for the language from a definition consisting of Python functions.
While coding in my editor, I realised that a layouter dropping to its fallback is stressful and confusing. It gives the impression that the code changed unpredictably, even if you only made a small change. Fallback layouts should be rare sights during programming sessions. For this reason every language construct should be identifiable without its context.
I designed schemas that resemble context-free grammars (CFGs). Every recognizable rule must be labelled. Additionally, those rules must end in context markers or match primitives. The design pivots around the following method:
result = schema.recognize(node)
If the node validates against the schema, you can apply the returned rule to interpret or lay out the construct:
data = result.build(function, node)
The given function is used to populate the returned structure. Its signature is:
function(slot, node) -> value
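To make the flow concrete, here is a toy sketch of how `recognize` and `build` could fit together. Only the method names and the `function(slot, node)` signature come from the text above. The `Node`, `Rule` and `Schema` classes, the dictionary returned by `build`, and the `add` rule are assumptions made for illustration, not Textended's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a document node."""
    label: str
    items: list = field(default_factory=list)  # children of a list node
    text: Optional[str] = None                 # payload of a string node


class Rule:
    """A labelled rule whose children fill named slots, in order (assumed shape)."""

    def __init__(self, label, slots):
        self.label = label
        self.slots = slots

    def build(self, function, node):
        # Call function(slot, child) once per slot and collect the values.
        return {slot: function(slot, child)
                for slot, child in zip(self.slots, node.items)}


class Schema:
    def __init__(self, rules):
        self.rules = {rule.label: rule for rule in rules}

    def recognize(self, node):
        # The lookup uses only the node's own label and arity, never its context.
        rule = self.rules.get(node.label)
        if rule is not None and len(node.items) == len(rule.slots):
            return rule
        return None  # the caller falls back to a generic layout


schema = Schema([Rule("add", ["left", "right"])])
node = Node("add", [Node("int", text="1"), Node("int", text="2")])


def evaluate(slot, child):
    # A real interpreter would recurse through the schema here.
    return int(child.text)


result = schema.recognize(node)
data = result.build(evaluate, node)
```

Because recognition depends only on the node itself, editing a sibling or a parent cannot suddenly push this node into the fallback path.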
The schemas are stored as Textended files. To bootstrap this, there is a schema describing the schemas, constructed within a Python script.
Existing treepython programs cannot be described in the new schema, so I have to discard or translate them. Writing code in treepython was not very fast, so it won't take long to rewrite those files for the new schema by hand. The incompatibility exists because I took advantage of the CFG. The old treepython declaration could be described like this:
stmt -> :def name arguments stmt*
Now it can be described like this:
stmt -> :def name (argument*) (stmt*)
Since the user is less affected by the actual structure of the code, I no longer keep variable-sized lists that have a prefix. It is more convenient this way.
It is unlikely that this schema will change much. The more primitive or fundamental my code constructs are, the less reason there seems to be to change them. My schema is a CFG defined over the leaves of the document. Could it get simpler than that?
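The difference between the two shapes can be sketched with plain Python lists. The encoding below (a list whose first element is the label) and the helper names `read_old` and `read_new` are assumptions for illustration only. The old `arguments` nonterminal is shown here as a nested list, which may not match how treepython stored it.

```python
# Old shape: stmt -> :def name arguments stmt*
# A fixed prefix (name, arguments) followed by any number of statements.
old_def = ["def", "add", ["a", "b"], "stmt1", "stmt2"]

# New shape: stmt -> :def name (argument*) (stmt*)
# Always exactly three children; the variable parts live in their own lists.
new_def = ["def", "add", ["a", "b"], ["stmt1", "stmt2"]]


def read_old(node):
    # The reader has to split the prefix from the variable-length tail.
    label, name, arguments, *body = node
    assert label == "def"
    return {"name": name, "arguments": arguments, "body": body}


def read_new(node):
    # Fixed arity: every child maps to exactly one slot.
    label, name, arguments, body = node
    assert label == "def"
    return {"name": name, "arguments": arguments, "body": body}
```

Both readers recover the same information, but in the new shape each child corresponds to exactly one slot, so a rule can be matched by checking a fixed number of children.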