A *Hello* ‘program’ is defined by the `hello` production, and consists of the keyword `hello`, followed by an identifier and a `!`. For example, valid (and boring) Hello texts would be `"hello world!"`, `"hello\nworld!"` or `" hello you !"`. Invalid examples would be `"hello world"` (missing `!`), `"Hello World!"` (uppercase `Hello`) or `"hello!"` (missing identifier). Note that ANTLR4 by default makes literal terminals (e.g., `'hello'` above) *reserved keywords*, so `"hello hello!"` is not valid (`hello` will never match `ID`).
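To make the valid and invalid cases concrete, here is a small hand-written check in plain Java. It is only a sketch of the language described above, not the ANTLR-generated parser: the class name `HelloCheck` is made up for illustration, and it assumes that `ID` is one or more ASCII letters and that whitespace is whatever `\s` matches in a Java regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-written approximation of the hello rule (hypothetical helper, not generated by ANTLR). */
final class HelloCheck {
    // Assumption: ID is one or more ASCII letters; whitespace between tokens is ignored.
    private static final Pattern HELLO =
            Pattern.compile("\\s*hello\\s+([A-Za-z]+)\\s*!\\s*");

    /** Returns true if the text is 'hello', an identifier other than 'hello', and '!'. */
    static boolean isValidHello(String text) {
        Matcher m = HELLO.matcher(text);
        // 'hello' is a reserved keyword, so it cannot double as the identifier
        return m.matches() && !m.group(1).equals("hello");
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello world!", "hello\nworld!", " hello you !",
            "hello world", "Hello World!", "hello!", "hello hello!"
        };
        for (String s : samples) {
            System.out.println(isValidHello(s) + "  \"" + s.replace("\n", "\\n") + "\"");
        }
    }
}
```

Running `main` should report `true` for the first three samples and `false` for the remaining four, matching the examples in the text.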
When the project is built (during the `generate-sources` phase), ANTLR4 will generate several Java classes implementing a parser for the *Hello* language:
* `HelloLexer.java` – the *lexer*, which splits an input string/stream into a stream of *tokens* or words (`'hello'`, `ID` or `WS` for this grammar; whitespace (`WS`) is ignored)
* `HelloParser.java` – the *parser*, which recognizes the sentence structure of the input text, and (optionally) builds a *parse tree*
* `HelloListener.java` – a *listener* interface, used together with a parse tree walker to perform an action for each node in the parse tree
* `HelloBaseListener.java` – a *listener* class, with default do-nothing methods for each type of parse tree node
We can use the parser like this (see example in `src/main/java/inf225/examples/HelloExample.java`):
* First, set up the input and the lexer; this will give us a stream of tokens (words):
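```java
String input = "hello world!";
// a lexer that splits the input string into tokens
HelloLexer lexer = new HelloLexer(CharStreams.fromString(input));
CommonTokenStream tokens = new CommonTokenStream(lexer);
```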
* Next, make a `HelloParser` that reads the tokens:
```java
HelloParser parser = new HelloParser(tokens);
```
* Finally, we can get the *context* for the non-terminal we're interested in – the parser will then try to match the input to the production rule for the non-terminal (`hello: 'hello' ID '!'` in our case – i.e., we expect to find a `hello` token, an identifier and an exclamation mark):
```java
// the method name here matches the name of the non-terminal in the grammar (hello)
HelloContext tree = parser.hello();
```
### Tokens
You can easily examine the token stream by asking for a list of tokens (`getTokens()`). First, you must make sure that all the input has been processed, by calling `fill()`. The parser will normally read tokens one by one (possibly looking ahead a few tokens), so the lexer produces tokens on demand; `fill()` makes it finish the job.
The token stream itself gives you enough information to do very simple syntax highlighting; e.g., adding colours for keywords, string literals and so on.
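As a rough illustration of token-based highlighting, the sketch below tags each word of a Hello text by its token type. It is a plain-Java stand-in rather than ANTLR code: the class `HelloHighlighter`, the `<kw>`/`<id>` markers and the assumption that identifiers are ASCII letters are all made up for the example. With the real token stream you would instead loop over the tokens and choose a style based on each token's type.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy highlighter for Hello texts (hypothetical; does not use the generated lexer). */
final class HelloHighlighter {
    // Either a word (assumed to be ASCII letters) or a run of anything else
    private static final Pattern TOKEN = Pattern.compile("(?<id>[A-Za-z]+)|[^A-Za-z]+");

    /** Wraps the keyword 'hello' in <kw> tags and other identifiers in <id> tags. */
    static String highlight(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String text = m.group();
            if (m.group("id") != null) {
                // the token type decides the style: keyword or plain identifier
                String tag = text.equals("hello") ? "kw" : "id";
                out.append('<').append(tag).append('>')
                   .append(text)
                   .append("</").append(tag).append('>');
            } else {
                // whitespace and punctuation are copied through unchanged
                out.append(text);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("hello world!"));
    }
}
```

For the input `hello world!` this prints `<kw>hello</kw> <id>world</id>!`. Note that no parsing is involved; the decision is made per token, which is why this kind of highlighting also works on text that is not syntactically valid.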
### Parse trees
To see the parse result, we can use a `ParseTreeWalker` to visit all the nodes in the parse tree, giving it a listener that will be called for each node:
```java
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk(new HelloBaseListener() {
    @Override
    public void visitTerminal(TerminalNode node) {
        System.out.println("'" + node + "' ");
    }
    // you can also add visit methods for error nodes, and before and after a non-terminal
}, tree);
```
The output should look like this (for input `hello world!`):
```
'hello'
'world'
'!'
```
A more interesting walker would pick out who we're saying hello to:
```java
new ParseTreeWalker().walk(new HelloBaseListener() {
    @Override
    public void enterHello(HelloContext ctx) {
        System.out.print("Saying hello to '" + ctx.getChild(1) + "'!");
    }
}, tree);
```
This gives the output `Saying hello to 'world'!`
## Expressions
For a more interesting example, have a look at `Expr.g4` and `ExprExample.java`, which define a very simple language for prefix expressions with a single operator (`+`). The tree walker is used to evaluate the expressions using a stack: literal numbers are pushed onto the stack, and the plus operator pops two numbers, adds them and pushes the result. Try improving it by adding more operators!
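The stack technique can be tried out without ANTLR. The sketch below is not the code from `ExprExample.java`; the class name `PrefixEval` is hypothetical, and it assumes the expression is written as whitespace-separated integers and `+` signs. It scans the tokens from right to left so that both operands are already on the stack when their operator is reached, which is the same ordering a tree walker gets by acting on a node after its children have been visited.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Stack-based evaluator for prefix expressions with '+' (hypothetical, ANTLR-free sketch). */
final class PrefixEval {
    /** Evaluates a whitespace-separated prefix expression such as "+ 1 + 2 3". */
    static int eval(String expr) {
        String[] tokens = expr.trim().split("\\s+");
        Deque<Integer> stack = new ArrayDeque<>();
        // Right-to-left scan: operands are pushed before the operator that uses them
        for (int i = tokens.length - 1; i >= 0; i--) {
            String t = tokens[i];
            if (t.equals("+")) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("'+' needs two operands: " + expr);
                }
                int left = stack.pop();
                int right = stack.pop();
                stack.push(left + right);
            } else {
                // literal numbers are pushed onto the stack
                stack.push(Integer.parseInt(t));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed expression: " + expr);
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval("+ 1 + 2 3")); // prints 6
    }
}
```

Adding another operator only needs one more branch that pops two values and pushes the combined result; for non-commutative operators such as `-`, the order of `left` and `right` matters.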
# Maven Setup
This project comes with a working Maven `pom.xml` file. You should be able to import it into Eclipse using *File → Import → Maven → Existing Maven Projects* (or *Check out Maven Projects from SCM* to do Git cloning as well). You can also build the project from the command line with `mvn package`.
Pay attention to these folders:
* `src/main/java` – Java source files go here (as usual for Maven)
* `src/main/antlr4` – ANTLR4 grammar files (`*.g4`) go here; use sub-folders to place the generated parser in a specific Java package
* `src/test/java` – JUnit tests
* `target/generated-sources/antlr4` – ANTLR4 will place Java source code here (this happens automatically during compilation or if you run `mvn generate-sources`)
* `target/classes` – compiled Java class files
* `target/*.jar` – your compiled project, packaged in a JAR file
#### POM snippets
If you're setting up / adding ANTLR4 to your own project, you can cut and paste these lines into your `pom.xml` file.
* You should make sure that both the parser generator and the runtime use the same version, so define the version number in `<properties>…</properties>`:
```xml
<antlr4.version>4.8-1</antlr4.version>
```
* The ANTLR4 runtime is needed to run the compiled parser; add it in the `<dependencies>…</dependencies>` section:
* The ANTLR4 Maven plugin includes the ANTLR4 tool, and is needed to generate the parser during compilation; add it to `<build><plugins>…</plugins></build>`:
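
The runtime dependency and the plugin entry described in the last two points might look roughly like the fragment below. This is a minimal sketch using the standard `org.antlr` coordinates and the `antlr4.version` property defined above; the `pom.xml` shipped with this project may set additional plugin options, so treat it as a starting point rather than a copy of the project's configuration.

```xml
<dependencies>
  <!-- runtime library needed by the generated lexer and parser -->
  <dependency>
    <groupId>org.antlr</groupId>
    <artifactId>antlr4-runtime</artifactId>
    <version>${antlr4.version}</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <!-- runs the ANTLR4 tool on src/main/antlr4 during generate-sources -->
    <plugin>
      <groupId>org.antlr</groupId>
      <artifactId>antlr4-maven-plugin</artifactId>
      <version>${antlr4.version}</version>
      <executions>
        <execution>
          <goals>
            <goal>antlr4</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

If your `pom.xml` already has `<dependencies>` or `<build><plugins>` sections, add only the inner `<dependency>` and `<plugin>` elements to them.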