Loops and Overloops for Tree Walking Automata: Companion PageVincent HUGOT |
This web page complements the following paper:
[Go back to my main page].
This software merely serves to illustrate the algorithms of the paper, which I think is useful given the explosive character of these transformations, which prohibits doing things by hand even for very trivial TWA.
Note about the source code: (in case you read it, which you really don’t need to do)
Although I have some confidence that the code is correct, it is very messy, as it was written before the transformation was fully formalised – and it was not intended to be read by anyone else... Also note that there is a lot of source code which is completely unrelated to TWA – this is in fact my work tool for all kinds of tree automata research. So, in a nutshell: read and use at your own risk.
Related work:
The only other program I known of which implements the TWA → BUTA transformation (loop-based only, of course) is dtwa-tools^{1}, available here, which was developed for this article. However, I am fairly certain the algorithm which it implements is not entirely correct. We have discussed in our paper how the BUTA states given in the proof sketches – 𝕋 × 2^{Q2} – were surprising, as loops cannot be considered independently of the symbol which supports them. Which is why our loop-based algorithm uses states in Σ × 𝕋 × 2^{Q2}. In dtwa-tools, the implementor followed the proof sketches, and went around the lack of symbol by allowing loops to be closed if there exists a symbol for which there is an ↑-transition which closes the loop (and this is exactly what we did at first, too). However, for the reason discussed in the paper, this may compute more loops than actually exist – and we have given a trivial counter-example.
Binaries are available for the following operating systems:
The OCaml source code can be downloaded from my subversion repository at http://vincent-hugot-projects.googlecode.com/svn/trunk/ocaml/TAged-emptiness2/. Note that this repository is in constant evolution, which means that the latest version of the code may have little to do with the binary file you downloaded. For this reason, you may want to download the exact revision corresponding to the binaries you want to compile – or whose code you simply desire to read. This is why the revision number is provided – as annotations of the form (rXXX) – in the above list. Note that the revision number pertains to the whole repository, and not to any specific project in it.
If you want to build the proof of concept from source, you will need to pull at least ocaml/common (shared files) and ocaml/Taged-emptiness2 (the POC), and run setup.sh from common, and then make.sh from the POC’s directory. You will need wget, ocaml, camlp4 (with camlp4-extras), and menhir. Of course, this assumes some “Unix-y” environment, capable of running shell scripts. I work with GNU/Linux (Kubuntu), tested with Cygwin under Windows XP SP3, and assume it would work with a Mac. If you are using the Windows/MinGW or Windows/MVSC version of OCaml, then you will probably have to do a little bit of work to convert the shell scripts into MSDOS BAT files.
In this section, POC designates the name of the proof-of-concept executable, for instance poc.linux if you downloaded the linux version. The $ indicates a command prompt, regardless of which OS you are actually using.
The program is used in the following way:
$ POC [dbg] [_clean] twa <TWA> <TREE>
// left-most leaf is 'a' TWA Alphabet a 0 b 0 c 0 f 2 g 2 h 2 States ql // left qu // up Inits ql Finals qu Rules a ql {@,<} * qu ALL qu < ! qu ALL 2 ql {@,<} < qlThe syntax is rather straight-forward:
['_' '-' 'A'-'Z' 'a'-'z' '0'-'9' '*' '$' '#' ''' '~']+
g(f(a,b), c)
EXAMPLE: The output presented in the next section has been generated from a working copy of my subversion repository, using the command :
$ POC _clean twa t/lncs.twa t/ciaa-tree
The program generates a PDF file such as this one: . In this section we give a few explanations concerning the contents of such a file.
Note that the sets all appear with their cardinality. For instance
alphab = #6{a/0, b/0, c/0, f/2, g/2, h/2}
denotes the set Σ = {a,b,c, f,g,h}, where the notation “σ/k” means that the symbol σ is of arity k. So in that case we have Σ_{0} = {a,b,c} and Σ_{2} = {f,g,h}. The #6 at the beginning simply indicates that |Σ| = 6.
This run is represented in two ways:
The convention used throughout the document is that states are all written in bold, with final states (resp. initial states) being additionally underlined (resp. in italics).
In other word, each node contains exactly all states in which the TWA can visit this node.
Once a power run has been computed, it is easy to determine whether or not the tree is accepted: it is if and only if a final state appears at the root of the tree.
There are two parts for each BUTA:
TAGED 'fromTWA-Sigma' [1986] = {where ’fromTWA-Sigma’ is an arbitrary “name” given to the automaton, and the number between square brackets, here [1986], is its size. The size of a BUTA ⟨Σ,Q,F,Δ⟩ is defined – in the usual way – as:
∥ ⟨Σ,Q,F,Δ⟩ ∥ = |Q| + |
| [arity(f) +2]. |
==rel = #0{} <>rel = #0{}are meaningless and should be ignored. They are simply remnants of the fact that the implemented BUTA are actually the more general model of TAGEDs, which admit global equality and disequality constraints. Since we use neither here, those sets are always empty.
For each of these BUTA, a power run is provided. The idea is the same as that of power runs for TWA. This time, each node α is labelled by the set of all states q such that the subtree t|_{α}∈ Lang(B, q), where B is the BUTA.
Or, it could also be seen as the run of a determinised version of B on the tree.
A tree is accepted by the BUTA iff there is a final state at the root of the power run.
If you have any questions which are not covered by this document, you are welcome to contact me at vincent.hugot@gmail.com. Time permitting, I will update this page with the answer.
This document was translated from L^{A}T_{E}X by H^{E}V^{E}A.