Adding UAST Annotations

Intro

Once you have written the code-to-AST parser, the next step is to write the annotation code. This is the Go code that establishes the rules to transform the native AST into the UAST (a process we call normalizing). Most of the Go files of the driver are auto-generated by the bblfsh-sdk init tool, but to translate the native AST to the UAST you need to complete two skeleton files that are referenced in the (auto-generated) driver/main.go file.

File: tonode.go

ObjectToNode implementation

An implementation of native.ObjectToNoder is provided in the SDK for trees represented as nested JSON objects. It maps certain internal keys of the native AST to specific key roles in the UAST. This is needed so that the rule-applying process knows which parts of the native AST to use for the translation. The skeleton of this struct is automatically generated by the bblfsh-sdk init command.

The type and meaning of each field in the native.ObjectToNoder struct are described in the GoDoc documentation of the type.
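To make the idea of key mapping more concrete, here is a minimal, self-contained sketch. It does not use the SDK: the Node and ToNoder types, the ToNode and Parse functions, and the "internalRole" property name are all hypothetical stand-ins, and only the InternalTypeKey and TokenKeys names are taken from this document. The real struct has more fields and may behave differently, so check its GoDoc before relying on any detail shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Node is a simplified stand-in for a UAST node (assumption, not the SDK type).
type Node struct {
	InternalType string
	Token        string
	Properties   map[string]string
	Children     []*Node
}

// ToNoder is a hypothetical, reduced version of the configuration struct.
type ToNoder struct {
	InternalTypeKey string          // key that holds the native node type
	TokenKeys       map[string]bool // keys whose string value is the token
}

// ToNode converts a decoded JSON object into a Node. Nested objects and
// arrays of objects become children, and the key that held them is stored
// as an "internalRole" property. Non-string scalars are ignored here.
func (c ToNoder) ToNode(obj map[string]interface{}) *Node {
	n := &Node{Properties: map[string]string{}}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic child order for this sketch
	for _, key := range keys {
		switch v := obj[key].(type) {
		case string:
			switch {
			case key == c.InternalTypeKey:
				n.InternalType = v
			case c.TokenKeys[key]:
				n.Token = v
			default:
				n.Properties[key] = v
			}
		case map[string]interface{}:
			n.Children = append(n.Children, c.child(key, v))
		case []interface{}:
			for _, item := range v {
				if m, ok := item.(map[string]interface{}); ok {
					n.Children = append(n.Children, c.child(key, m))
				}
			}
		}
	}
	return n
}

// child converts a nested object and records the key it was found under.
func (c ToNoder) child(key string, obj map[string]interface{}) *Node {
	ch := c.ToNode(obj)
	ch.Properties["internalRole"] = key
	return ch
}

// Parse decodes a JSON document and converts its root object.
func Parse(src string, c ToNoder) (*Node, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(src), &obj); err != nil {
		return nil, err
	}
	return c.ToNode(obj), nil
}

func main() {
	cfg := ToNoder{InternalTypeKey: "type", TokenKeys: map[string]bool{"id": true}}
	root, err := Parse(`{"type": "Assign", "target": {"type": "Name", "id": "x"}}`, cfg)
	if err != nil {
		panic(err)
	}
	child := root.Children[0]
	fmt.Println(root.InternalType, child.InternalType, child.Token, child.Properties["internalRole"])
}
```

The point of the sketch is only that the configuration tells the converter which keys carry the node type and the token, so that later rules can match on them.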

Examples

Check the already implemented Java ObjectToNoder or the Python one for real-world examples of a filled-in ObjectToNoder struct.

File: annotation.go

AnnotationRules

This variable holds a Rule object that results from applying the different rules using the functions defined in the sdk/uast/ann/ann.go file. It is built by calling the On rule constructor and chaining calls to the different selectors, rule constructors (a rule can itself hold other rules) and predicates to build your AST to UAST normalizer.

To see how rules are built it's important to understand that a rule is composed of:

  • One or more predicates that select which nodes match the rule.
  • One or more actions that are applied to the matching nodes.
  • Zero or more selectors that optionally select more nodes, starting from the current ones, and apply the rules in their argument list to them.
  • Zero or more additional rules contained inside the selectors.

The rule constructor is the function On(predicates...) *Rule. As you can see in the signature, it takes one or more predicates and returns a rule pointer. This allows chaining the different methods, which also take and return a rule pointer. A general form for a rule definition could look like this:

// Simple example:
var r = On(somePredicate).SomeAction(actionParams)

// A more elaborate example:
var r2 = On(otherPredicate).SomeSelector(
    On(thirdPredicate).SomeAction(actionParams), // embedded rule
    On(fourthPredicate).SomeSelector( // embedded rule with its own children
        On(fifthPredicate).SomeAction(actionParams),
    )).SomeAction(actionParams)
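The chaining works because every method returns the rule it was called on. The following self-contained sketch illustrates that mechanism with a toy model. It is not the SDK implementation: the Node, Rule and Apply definitions are assumptions made for illustration, roles are plain strings instead of uast.Role values, and the role names are hypothetical.

```go
package main

import "fmt"

// Node is a simplified stand-in for uast.Node (assumption, not the SDK type).
type Node struct {
	InternalType string
	Roles        []string
	Children     []*Node
}

// Predicate mirrors the predicate shape described in this document.
type Predicate func(n *Node) bool

// Rule is a toy rule: predicates, roles to add, and rules for the children.
type Rule struct {
	predicates []Predicate
	roles      []string
	children   []*Rule
}

// On builds a rule that matches when all the given predicates hold.
func On(predicates ...Predicate) *Rule {
	return &Rule{predicates: predicates}
}

// Roles records roles to add and returns the rule so calls can be chained.
func (r *Rule) Roles(roles ...string) *Rule {
	r.roles = append(r.roles, roles...)
	return r
}

// Children attaches rules that are applied to first-level children only.
func (r *Rule) Children(rules ...*Rule) *Rule {
	r.children = append(r.children, rules...)
	return r
}

// Apply adds the roles to n if every predicate matches, then applies the
// embedded rules to each first-level child of n.
func (r *Rule) Apply(n *Node) {
	for _, p := range r.predicates {
		if !p(n) {
			return
		}
	}
	n.Roles = append(n.Roles, r.roles...)
	for _, child := range n.Children {
		for _, cr := range r.children {
			cr.Apply(child)
		}
	}
}

// HasInternalType is a toy version of the predicate with the same name.
func HasInternalType(t string) Predicate {
	return func(n *Node) bool { return n.InternalType == t }
}

// Annotate builds a small tree, applies a chained rule and returns the tree.
func Annotate() *Node {
	tree := &Node{InternalType: "Module", Children: []*Node{
		{InternalType: "If"},
		{InternalType: "Name"},
	}}
	rule := On(HasInternalType("Module")).Roles("File").Children(
		On(HasInternalType("If")).Roles("Statement", "If"), // embedded rule
	)
	rule.Apply(tree)
	return tree
}

func main() {
	tree := Annotate()
	fmt.Println(tree.InternalType, tree.Roles)
	for _, c := range tree.Children {
		fmt.Println(" ", c.InternalType, c.Roles)
	}
}
```

In this toy model the "Name" child receives no roles because no embedded rule matches it, which shows why each native node type usually needs its own rule.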

Predicates

A predicate is any function with the following signature:

func(n *uast.Node) bool

Here uast.Node is the type of the nodes that are iterated during the annotation process. You can define your own predicate functions, but the ann.go file already defines the ones that you will need most of the time:

  • Any(): Always returns true. It's usually used to match the first node in the native AST and add rules from there.

  • Not/And/Or(predicate): Applies the expected boolean operations to the predicate(s) used as argument.

  • HasInternalType(type string): Checks if the InternalType of the node matches the string provided. The key that holds the InternalType of a native AST node is defined in the parser.ObjectToNoder.InternalTypeKey field.

  • HasToken(name string): Checks if the node has a token matching the given string. The keys of the node that are treated as a Token are defined in the parser.ObjectToNoder.TokenKeys map.

  • HasChild(predicate): Checks if any of the first level children of the current node matches the given predicate.

  • HasProperty(key, value string): Checks if the AST node has the given key and that key has the given value.

  • HasInternalRole(key string): Checks if the node has an InternalRole matching the specified key; this is a shortcut to HasProperty(uast.InternalRoleKey, key). The InternalRole of a child node is the object key that holds it in its parent.
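As an illustration of how predicates compose, and of what a custom one might look like, here is a self-contained sketch. It uses a toy Node type rather than uast.Node, and its HasInternalType, HasProperty, Not, And and Or functions are simplified re-implementations written for this example, not the ones in ann.go. HasTokenPrefix is a hypothetical custom predicate that does not exist in the SDK.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a simplified stand-in for uast.Node (an assumption for this sketch).
type Node struct {
	InternalType string
	Token        string
	Properties   map[string]string
}

// Predicate has the shape described above, using the toy Node type.
type Predicate func(n *Node) bool

// HasInternalType matches nodes whose internal type equals t.
func HasInternalType(t string) Predicate {
	return func(n *Node) bool { return n.InternalType == t }
}

// HasProperty matches nodes that have the key set to exactly the value.
func HasProperty(key, value string) Predicate {
	return func(n *Node) bool {
		got, ok := n.Properties[key]
		return ok && got == value
	}
}

// Not negates a predicate.
func Not(p Predicate) Predicate {
	return func(n *Node) bool { return !p(n) }
}

// And matches when every predicate matches.
func And(ps ...Predicate) Predicate {
	return func(n *Node) bool {
		for _, p := range ps {
			if !p(n) {
				return false
			}
		}
		return true
	}
}

// Or matches when at least one predicate matches.
func Or(ps ...Predicate) Predicate {
	return func(n *Node) bool {
		for _, p := range ps {
			if p(n) {
				return true
			}
		}
		return false
	}
}

// HasTokenPrefix is a hypothetical custom predicate: it matches nodes
// whose token starts with the given prefix.
func HasTokenPrefix(prefix string) Predicate {
	return func(n *Node) bool { return strings.HasPrefix(n.Token, prefix) }
}

func main() {
	n := &Node{
		InternalType: "Name",
		Token:        "_private",
		Properties:   map[string]string{"ctx": "Load"},
	}
	privateName := And(HasInternalType("Name"), HasTokenPrefix("_"))
	fmt.Println("private name:", privateName(n))
	fmt.Println("not a store:", Not(HasProperty("ctx", "Store"))(n))
}
```

Because a predicate is just a function value, a custom one can be passed anywhere a built-in one is accepted, including as an argument to the boolean combinators.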

Actions

  • Roles(roles ...uast.Role): Adds the given UAST roles to the currently selected node(s).

  • Error(err Error): Produces an error and stops processing the AST.

Selectors

  • Self(rules ...*Rule): Selects the current node.

  • Children(rules ...*Rule): Selects the children of the current node. Note that only the first-level children are selected; second-level children ("grandchildren") and anything below them are not (use Descendants for those).

  • Descendants(rules ...*Rule): Selects all (recursive) descendants of the selected node.

  • DescendantsOrSelf(rules ...*Rule): Selects the current node and all (recursive) descendants.
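The difference between these selectors is easiest to see on a small tree. The sketch below is a toy model rather than the SDK code: its functions return the selected nodes instead of applying rules to them, the Node type is a hypothetical stand-in, and the pre-order traversal is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a simplified stand-in for uast.Node (an assumption for this sketch).
type Node struct {
	Name     string
	Children []*Node
}

// Children returns only the first-level children of n.
func Children(n *Node) []*Node {
	return append([]*Node(nil), n.Children...)
}

// Descendants returns every node below n, visiting each child before
// that child's own descendants (pre-order).
func Descendants(n *Node) []*Node {
	var out []*Node
	for _, c := range n.Children {
		out = append(out, c)
		out = append(out, Descendants(c)...)
	}
	return out
}

// DescendantsOrSelf returns n followed by all of its descendants.
func DescendantsOrSelf(n *Node) []*Node {
	return append([]*Node{n}, Descendants(n)...)
}

// Joined renders the names of the selected nodes as a comma-separated list.
func Joined(nodes []*Node) string {
	names := make([]string, 0, len(nodes))
	for _, n := range nodes {
		names = append(names, n.Name)
	}
	return strings.Join(names, ",")
}

// SampleTree builds root -> (a -> a1), b.
func SampleTree() *Node {
	return &Node{Name: "root", Children: []*Node{
		{Name: "a", Children: []*Node{{Name: "a1"}}},
		{Name: "b"},
	}}
}

func main() {
	root := SampleTree()
	fmt.Println("Children:         ", Joined(Children(root)))
	fmt.Println("Descendants:      ", Joined(Descendants(root)))
	fmt.Println("DescendantsOrSelf:", Joined(DescendantsOrSelf(root)))
}
```

In this model the grandchild a1 is reached by Descendants and DescendantsOrSelf but not by Children, which matches the note on the Children selector.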

Examples:

Check the already implemented Java annotator or the Python annotator for real-world examples of this API in action.

When in doubt...


You can ask any question on the project's public Babelfish Slack channel, which is very friendly to newcomers to the project.

Finally, if you really think that the UAST roles have no counterpart for the native role that you want to map, you can open an issue on the SDK project, or fork the Babelfish SDK project on GitHub, add the new role to the file uast/uast.go and make a PR. Don't expect the role to be added immediately; we're somewhat picky about adding roles to the UAST and, depending on the stage of the project, we strive to add the more generalizable roles first before adding exotic or very language-specific ones. If your role falls into the second category, the PR will be tagged as "need-research", which means that it will be re-evaluated when a similar role is needed for other languages (and thus we can see how to generalize it to cover more ground) or when there is a new version of the UAST.
