UniCC LALR(1) Parser Generator
Version | 1.6.2 (2019) |
Download | v1.6.2 (source code), v1.6.0 (Windows 32-bit setup) |
GitHub | https://github.com/phorward/unicc |
License | BSD |
UniCC is a universal LALR(1) parser generator, targeting C, C++, Python, JavaScript, JSON and XML.
Overview
UniCC (UNIversal Compiler-Compiler) compiles an augmented grammar definition into program source code that parses the described grammar. Because UniCC is intended to be target-language independent, it can be configured via template definition files to emit parsers in almost any programming language.
UniCC comes with out-of-the-box support for the programming languages C, C++, Python (both 2.x and 3.x) and JavaScript. Parsers can also be generated into JSON and XML.
UniCC can generate both scanner-less and scanner-mode parsers. Scanner-less parsing, the more powerful of the two, is the default. It breaks down the barrier between the grammar and its tokens, so that tokens are under full control of the context-free grammar. Scanner-less parsing requires the provided grammar to be rewritten internally according to its whitespace and lexeme settings.
Screenshot: UniCC v1.5 compiling & running C, C++, Python 2 & 3 and JavaScript (Node) test suite
Example
This is the full definition of a four-function arithmetic syntax, including its integer calculation semantics (in C).
#!language C ; // <- target language!
#whitespaces ' \t';
#lexeme int;
#default action [* @@ = @1 *];
#left '+' '-';
#left '*' '/';
//Defining the grammar
calc$ : expr [* printf( "= %d\n", @expr ) *]
;
expr : expr '+' expr [* @@ = @1 + @3 *]
| expr '-' expr [* @@ = @1 - @3 *]
| expr '*' expr [* @@ = @1 * @3 *]
| expr '/' expr [* @@ = @1 / @3 *]
| '(' expr ')' [* @@ = @2 *]
| int
;
int : '0-9' [* @@ = @1 - '0' *]
| int '0-9' [* @@ = @int * 10 + @2 - '0' *]
;
To build and run this example, run:
$ unicc expr.par
$ cc -o expr expr.c
$ ./expr -sl
3*10-(2*4)+1
= 23
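To make the semantics of the grammar above easier to follow, here is a minimal hand-written sketch in C of what it computes. This is not the code UniCC generates (UniCC emits a table-driven LALR(1) parser); it swaps in a simple precedence-climbing evaluator that mirrors the two `#left` levels, the `' \t'` whitespace setting and the digit-accumulating `int` rule. All function names (`eval_expr`, `parse_expr` and so on) are hypothetical, the sketch performs no syntax error reporting, and it guards division by zero, which the grammar itself does not.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch only: hypothetical names, not UniCC output. */

static const char *cur;                 /* current read position */

static int parse_expr(int min_prec);

/* Mirrors "#whitespaces ' \t'": blanks and tabs are skipped between tokens. */
static void skip_ws(void)
{
    while (*cur == ' ' || *cur == '\t')
        cur++;
}

/* Mirrors "int : '0-9' | int '0-9'": accumulate digits left to right. */
static int parse_int(void)
{
    int value = 0;
    while (*cur >= '0' && *cur <= '9')
        value = value * 10 + (*cur++ - '0');
    skip_ws();
    return value;
}

/* Mirrors "'(' expr ')' | int". */
static int parse_primary(void)
{
    if (*cur == '(') {
        cur++;
        skip_ws();
        int value = parse_expr(1);
        if (*cur == ')') {
            cur++;
            skip_ws();
        }
        return value;
    }
    return parse_int();
}

/* "#left '+' '-';" is the lower level, "#left '*' '/';" the higher one. */
static int prec_of(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;       /* not a binary operator */
    }
}

static int parse_expr(int min_prec)
{
    int lhs = parse_primary();
    for (;;) {
        char op = *cur;
        int prec = prec_of(op);
        if (prec == 0 || prec < min_prec)
            return lhs;
        cur++;
        skip_ws();
        /* prec + 1 makes the operator left-associative, like #left. */
        int rhs = parse_expr(prec + 1);
        switch (op) {
        case '+': lhs = lhs + rhs; break;
        case '-': lhs = lhs - rhs; break;
        case '*': lhs = lhs * rhs; break;
        case '/': lhs = (rhs != 0) ? lhs / rhs : 0; break; /* sketch-only guard */
        }
    }
}

/* Evaluate a whole expression string, e.g. "3*10-(2*4)+1". */
int eval_expr(const char *src)
{
    assert(src != NULL);
    cur = src;
    skip_ws();
    return parse_expr(1);
}
```

Calling `eval_expr("3*10-(2*4)+1")` yields 23, matching the session above. With UniCC, the same precedence and associativity behavior comes from the two `#left` directives rather than from hand-written recursion.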
More real-world examples of parsers implemented with UniCC include xpl, rapidbatch and ViUR logics. Others can be found in the examples folder.
Features
UniCC provides the following features and tools:
- Grammars are expressed in a powerful Backus-Naur-style meta language
- Generates parsers in C, C++, Python, JavaScript, JSON and XML
- Scanner-less and scanner-mode parser construction supported
- Built-in full Unicode processing
- Grammar prototyping features, virtual productions and anonymous nonterminals
- Abstract syntax tree notation features
- Semantically determined symbols
- Standard LALR(1) conflict resolution
- Platform-independent (console-based)
Documentation
The UniCC User Manual is the definitive guide and reference for the UniCC parser generator.
It covers a general introduction to UniCC's features, a beginner's tutorial on parsing in general and on implementing parsers with UniCC, a UniCC compiler and language reference guide, and a user's reference for the C, C++ and Python targets. As a state-of-the-art example, the compiler for a simple programming language called xpl is developed alongside this how-to guide.
The manual is continuously updated and extended with more detailed information and new chapters. Hopefully it answers all the questions that come up when UniCC becomes the workhorse of your upcoming compiler project. If not, don't hesitate to send an e-mail to get individual support and help with the UniCC Parser Generator and its associated modules.
Licensing
The UniCC LALR(1) Parser Generator can be used, modified and distributed under the BSD open source license.