This is a description of the algorithm searching for a query in the trees. For the sake of simplicity, a single-tree query is presumed.
Query Preprocessing (once for each query)
File Preprocessing (once for each file)
Matching heads (once for each file)
Creating matching lists of tree-nodes for each query-node (once for each tree)
Assembling a tree (once for each tree for which matching lists have been successfully created)
Checking references (once for each assembled tree - a matching candidate)
The global head is scanned first and positions of some attributes and all meta-attributes are stored (function scanGlobalHead)
The query tree(s) is preprocessed (function preprocessTrees) – some information about nodes is collected (depth_from_root, max_depth_to_leaf, nodes_in_subtree, depth_first_order, usage of meta-attribute _name, regular expressions are compiled)
The head is scanned and positions of some attributes (hide etc) are stored (function scanHead)
The global head and the file head are matched (function matchHeads)
A deep-first list of query-nodes is created (function treeToTVS)
The tree is preprocessed (function preprocessTree) – some information about nodes is collected (depth_from_root, max_depth_to_leaf, nodes_in_subtree, depth_first_order)
For each query-node, a list of tree-nodes matching the query-node is created. References are ignored at this stage.
The algoritm starts from the nodes matching the query-root (function findSubtreesTVSList). It takes them one by one and tries to assemble the rest of the tree.
If all query-nodes have been processed, the assembled tree is a candidate for matching the query. All that remains is checking the references.
If not, the next node from the list of query-nodes is taken. They are ordered depth-first, therefore the query-father is always processed before all its query-sons. It tries to assemble a part of the tree from all tree-nodes matching this query-node one by one.
In the assembled tree, all references are checked for validity (function isOKReferences). Number of occurrences at nodes that contain references is checked as well.