Hello Compilers
Goals
Our goal is to extend the parser of the previous lab to a simple calculator that can handle complex numbers and two new operators @
and &
that will be interpreted as max
and min
respectively. Here is an example of the kind of expressions we want to handle:
➜ hello-compilers-solution git:(essay-2024-02-14) bin/mmt.js '2-3i*5i'
gives the output:
const {Complex, factorial, max, min, print } = require("/Users/casianorodriguezleon/campus-virtual/2122/learning/jison-learning/hello-compilers-solution/src/support-lib.js");
print(Complex('2').sub(Complex('3i').mul(Complex('5i'))));
We can execute the output with node
:
➜ hello-compilers-solution git:(essay-2024-02-14) bin/mmt.js '2-3i*5i' | node -
17
➜ hello-compilers-solution git:(essay-2024-02-14) bin/mmt.js '2-3i*5i&9' | node -
9
The expression a&b
is equivalent to min{a,b}
and &
has smaller priority than the arithmetic operators.
We can supply a file with the expression to translate with the -f
option:
➜ hello-compilers-solution git:(main) ✗ cat test/input/minmax.calc
3i&(2+4)@(5+6)
The a@b
expression is equivalent to max{a,b}
and @
has smaller priority than &
.
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js -f test/input/minmax.calc
const { print,max,min,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(Math.max(Math.min(Complex('3i'), Complex('2').add(Complex('4'))), Complex('5').add(Complex('6'))));
When we execute the output with node
we get:
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js -f test/input/minmax.calc | node -
11
Here is an example of the executable bin/mmt
options:
hello-compilers-solution git:(main) bin/mmt.js -h
Usage: mmt [options] [expression]
Arguments:
expression Expression to translate (default: null)
Options:
-V, --version output the version number
-f --file <file> path to the input file with the expression to translate
-a --ast <file.json> path to file with the AST in JSON format
-j --js <file> file to output the resulting javascript
-h, --help display help for command
Two new operators & and @
Let us consider a notation of arithmetic in which the @
and &
symbols on numbers are defined as the max
and min
operations. Thus, with this notation
and
Extienda el traductor de la calculadora (opens in a new tab) para que admita estas expresiones aritméticas y las traduzca a un programa JavaScript que las compute y las imprima.
Supondremos que el mínimo &
tiene mas prioridad que el máximo @
. Por ejemplo, la entrada podría ser traducida al siguiente código JS:
console.log(max(234, min(325,57)))
Compared to the other tokens, give a low priority to the @
and &
operators so that
an expression 4!@3**2
should be interpreted as (4!)@(3**2)
and produces a JS code as max(factorial("4"), pow("3", "2")))
.
Since &
and @
have a lower priority than +
and -
, an expression like 4+5i&3-2i
should be interpreted as (4+5i)&(3-2i)
.
Extienda el programa Jison hecho en la práctica anterior con estas dos operaciones para que produzca un AST compatible Espree conteniendo el correspondiente código JS. A continuación utilice escodegen.generate(ast) (opens in a new tab) para generar el código JS
Dealing with Ambiguity
For the &
and @
operators you can extend the initial incomplete grammar in the assignment repo this way:
%{
const { buildLiteral, buildRoot, buildMin } = require('./ast-build');
%}
%%
es: e
;
e:
e '@' e
| e '&' e
| N
| '(' e ')'
;
The problem with this grammar is that it is ambiguous. Expressions like 2&3@4
have more than one concrete syntax tree.
On one side:
that will lead to the interpretation 2&(3@4)
; but we have also this other syntax tree:
that leads to the interpretation (2&3)@4
.
To break the ambiguity you have to set that the precedence of the token &
is higher that the one of the token @
.
You have also to fix the ambiguity for phrases like 2&3&4
and 3@4@5
favouring a left associativity interpretation, i.e. preferring (2&3)&4
to 2&(3&4)
and (3@4)@5
to 3@(4@5)
.
An expression 4+5i&3-2i
should be interpreted as (4+5i)&(3-2i)
and can produce a JS code as min(add("4", "5i"), (sub("3", "2i")))
.
Breaking Ambiguity in Jison
To deal with issues of ambiguity in grammar, you can consult
- Conflict Solving in Yacc
- the Precedence and Associativity (opens in a new tab) section of the old PL notes
- See the examples in the Repo ULL-ESIT-PL/jison-prec (opens in a new tab).
These is a simplified version of the rules to resolve conflicts and ambiguities in a Yacc-like parser generator:
Precedence Rules
- La precedencia de los tokens se hace en la cabecera del programa Jison; esto es: antes del primer
%%
usando la sintáxis%left token ...
,%right token ...
o%nonassoc token ...
- La precedencia de una regla de producción es la precedencia del último token que aparece en la parte derecha de la regla.
- Por ejemplo la precedencia de será la precedencia que le demos al token en la cabecera del programa Jison.
- Si no existen tokens en la parte derecha con precedencia asignada o bien si la precedencia por defecto no es la deseada, se puede usar la directiva
%prec
para asignar la precedencia deseada a la regla. Por ejemplo:asigna la precedencia del tokenexp: '-' exp %prec 'STRONG'
STRONG
a la reglaexp: '-' exp
- Cuando el parser detecta un conflicto y ve que hay dos posibles vias de continuar la construcción del árbol: Una que indica que quizá se aplicó la regla y otra que indica que quizá se pueda seguir leyendo el token a la entrada,
- El parser compara las precedencias del token y de la regla y se queda con el de mas prioridad.
- Si es el token quien tiene mayor prioridad avanzará en la lectura desplazando el token y buscando nuevos símbolos (se dice que hace un shift; en este caso el AST se "hundirá" a derechas) y
- Si es la regla completará el subárbol parcial y continuará en su construcción del árbol (se dice que hace un reduce y en este caso el árbol construido estará más hundido a izquierdas)
- Los tokens declarados en la misma línea mediante una declaración
%left
o%right
tienen igual precedencia e igual asociatividad. - La precedencia es mayor cuanto mas abajo su posición en el texto
Complex Numbers
Extend the regular expressions in the lexical analyzer to cover both floating point real numbers like 2.53e-2
and floating point imaginary numbers like 2.9e-5i
or -i
.
The complex.js (opens in a new tab) library provides a constructor Complex
and methods mul
, add
, sub
, div
, etc. that can be used this way:
let Complex = require('complex.js');
let c = Complex({ re: 1.0e1, im: 8}); // Same: let c = Complex("1.0e1 + 8i");
console.log(c); // { re: 10, im: 8 }
console.log(c.mul({re: 2, im: 2}).div(2).sub(3, 2)); // { re: -1, im: 16 }
console.log(c.add({re: 3, im: 9})); // { re: 13, im: 17 }
For more examples of use of the library visit ULL-ESIT-PL/hello-complex.js (opens in a new tab)
Write the code inserting the support functions and the require to complex.js (opens in a new tab) lib in the preamble
that is concatenated to the generated code.
Redefining the minimum and maximum operators
Use lexicographical order (opens in a new tab) to compare Complex numbers:
Redefining the factorial function
To keep compatibility with the calculator in the previous lab, you can extend the complex class with a factorial method.
You can either constrain the factorial to natural numbers like this:
#!/usr/bin/env node
const Complex = require("complex.js");
Complex.prototype.factorial = function() {
if (this.im !== 0) throw new Error(`Imaginary part must be zero. Instead is ${this.im}`);
let n = this.re;
if (!Number.isInteger(n)) throw new Error(`Not an Integer number ${n}`);
if ( n < 0) throw new Error(`Factorial of negative number ${n}`);
let result = Complex(1);
if (n === 0) return result;
for (let i = 1; i <= n; i++) {
result = result.mul(i);
}
return Complex({re: result.re, im: this.im});
};
console.log(Complex(process.argv[2]).factorial());
Here are several executions of the former example:
➜ complex-lib ./factorial.js "3"
{ re: 6, im: 0 }
➜ complex-lib ./factorial.js "3+2i"
Error: Imaginary part must be zero. Instead is 2
➜ complex-lib ./factorial.js "-3+0i"
Error: Factorial of negative number -3
➜ complex-lib ./factorial.js "-3.2+0i"
Error: Not an Integer number -3.2
The new version of the factorial
function has to be added in the preamble
of the generated code.
Or you can extend the factorial to complex numbers using the gamma function (opens in a new tab) and the reflection formula (opens in a new tab) for negative numbers. See an example of an implementation of gamma
in the file ULL-ESIT-PL/hello-complex.js/gamma.js (opens in a new tab)
If you compute factorials using the gamma function, you'll likely observe some small errors due to the fact that floating point numbers have limitations on how accurately a real or complex number can be represented:
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js '3!'
const { print,gamma,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(gamma(Complex('3')));
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js '3!' | node -
2.000000000000001
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js '(3+2i)!'
const { print,gamma,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(gamma(Complex('3').add(Complex('2i'))));
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js '(3+2i)!' | node -
-0.4226372863112039 + 0.871814255696508i
The translation scheme
Thus your calc translator must be able to generate code for expressions like
2!+3**2i-4i
that using the complex library augmented with our factorial
can be ultimately evaluated as:
Complex(2).factorial().add(Complex(3).pow("2i")).sub(Complex("4i"))
Using auxiliary proxy functions like sub
, add
, pow
, etc.
2!+3**2i-4i
can be translated as:
sub(add(factorial("2"), pow("3", "2i")), "4i")
which simplifies the AST and thus the translation.
Computing the dependencies
We can find the dependencies of the generated code by traversing the AST and collecting the names of the functions that are not part of the target language. We can use espree.estraverse for that:
const findUsedFunctions = function (ast) {
const usedSupportFunctions = new Set();
estraverse.traverse(ast, {
enter: function (node, _ ) {
if (node.type === "Identifier" && exportedSupportIdentifiers.includes(node.name)) {
usedSupportFunctions.add(node.name)
}
},
});
return Array.from(usedSupportFunctions);
};
Where exportedSupportIdentifiers
is an array with the names of the functions exported by the support library:
const exportedSupportIdentifiers = Object.keys(require("./support-lib.js")); // [ power, factorial ]
Tests
Añada pruebas usando Mocha y Chai o Jest
Tips
Probably you have experienced that when you change the computer you develop on or you make changes in the preamble
the tests that check the generated code fail. For instance, the require
mentions the specific path of the support lib in your computer:
➜ hello-compilers-solution git:(main) ✗ bin/mmt.js -f test/input/fact.calc
const { print,gamma,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(gamma(Complex('3')));
To fix this weakness, use a comment with a string in the template to separate the preamble from the generated code:
const { {{ dependencies }} } = require("{{root}}/src/support-lib.js");
/* end of preamble of generated code */
{{code}}
Then you can use an easy regular expression to remove the preamble from the generated code and check that the expected code and the actual code are equal.
After a change in a support lib function some tests may fail.
You can use the --grep
option of mocha
:
$ npx mocha -g 'factorial'
Also, you can use the only
methods of describe
and it
(opens in a new tab) to run only the tests that are failing or simply use a conditional to skip the tests that are failing. For instance if the &
and @
operators are not yet implemented, you can skip the tests that use them:
describe('Testing hello maxmin translator', () => {
for (let c of Checks) {
if (c.text.match(/[&@]+/)) continue;
it(`Test ${c.text} = ${c.result}`, () => { ... });
}
});
Mocking
Mocking means creating a fake version of an external or internal service that can stand in for the real one, helping your tests run more quickly and more reliably. When your implementation interacts with an object’s properties, rather than its function or behavior, a mock can be used.
Stubbing
Stubbing, like mocking, means creating a stand-in, but a stub only mocks the behavior, but not the entire object. This is used when your implementation only interacts with a certain behavior of the object.
To give an example: You can stub a database by implementing a simple in-memory structure for storing records. The object under test can then read and write records to the database stub to allow it to execute the test. This could test some behaviour of the object not related to the database and the database stub would be included just to let the test run.
If you instead want to verify that the object under test writes some specific data to the database you will have to mock the database. Your test would then incorporate assertions about what was written to the database mock.
Examples of Mocking and Stubbing
See the code at ast/test/test.mjs (opens in a new tab) in the repo hello-jison for an example of stubbing the console.log
.
Cuando vaya a escribir las pruebas de la práctica podemos intentar una aproximación como esta: ➜ arith2js-solution git:(dependencies) ✗ cat test/data/test1.calc
- Tomamos un objeto como
c = { text: "3! - 1", result: 5 }
con el atributotext
conteniendo la expresión de prueba y el atributoresult
el resultado esperado después de la traducción y evaluación del código - Construimos primero el árbol con
t = p.parse(c.text)
- Generamos el JavaScript con
js = escodegen.generate(t)
- Evaluamos el JavaScript con
result = eval(js)
- Si nuestro traductor es correcto
result
debería ser igualc.result
Suena bien ¿Verdad?
Pero en tal aproximación ¡tenemos un problema! y es que el código JavaScript generado para "3! - 1"
nos han pedido que sea:
➜ arith2js-solution git:(dependencies) ✗ cat test/data/test1.calc
3! - 1
➜ arith2js-solution git:(dependencies) ✗ bin/calc2js.mjs test/data/test1.calc
#!/usr/bin/env node
const factorial = n => (n === 0) ? 1 : n * factorial(n - 1);
console.log(factorial(3) - 1);
y si evaluamos el código resultante:
➜ arith2js-solution git:(dependencies) ✗ node
Welcome to Node.js v16.0.0.
Type ".help" for more information.
> result = eval(`const factorial = n => (n === 0) ? 1 : n * factorial(n - 1);
... console.log(factorial(3) - 1);`)
5
undefined
> result
undefined
¡La variable result
está undefined
!
Esto es así porque la llamada a console.log()
siempre retorna undefined
(no se confunda por el 5
que aparece en stdout
producido por el console.log
. El valor retornado es undefined
)
Así pues una aproximación como esta no funcionaría:
const p = require("../src/transpile.js").parser;
const escodegen = require("escodegen");
require("chai").should();
const Checks = [
{ text: "2+3*2", result: 8 },
{ text: "4-2-1", result: 1 },
];
describe("Testing translator", () => {
for (let c of Checks) {
it(`Test ${c.text} = ${c.result}`, () => {
const t = p.parse(c.text);
const js = escodegen.generate(t);
const result = eval(js);
result.should.equal(c.result);
console.log = oldLog;
});
}
});
No funcionaría porque lo que queda en result
es undefined
y no el resultado de 2+3*2
.
¿Cómo arreglarlo?
¡El patrón de Stubbing al rescate!
Sustituyamos el método log
del objeto console
con nuestra propia función adaptada a nuestras necesidades de testing console.log = x => x;
que retorna el valor del argumento pasado a console.log
. De esta forma podemos acceder al valor de la evaluación de la expresión:
it(`Test ${c.text} = ${c.result}`, () => {
let oldLog = console.log;
console.log = x => x;
const t = p.parse(c.text);
const js = escodegen.generate(t);
const result = eval(js);
result.should.equal(c.result);
console.log = oldLog;
});
}
});
Ahora result
contiene la evaluación de la expresión y las pruebas funcionan.
Covering
You can use nyc (opens in a new tab) to do the covering of your mocha tests. See the notes in covering.
Activate the GitHub pages of your repo (go to settings and use the default branch and the docs
folder) and be sure to include your covering report in the docs
folder. Add to your package.json
a script to run the tests and leave the report in the docs
folder.
Continuous Integration
Esta tarea tiene asociada una GitHub Action education/autograding (opens in a new tab) para ejecutar los tests proveidos por el profesor. Añada tests y compruebe en las GitHub actions de su repo el estado de las pruebas. Lea la sección GitHub Actions de los apuntes.
Videos
Vídeos de las clases del 19/02/2024 y 21/02/2024:
Este vídeo del 22/02/2023 puede ayudar con la elaboración de la práctica hello-compiler:
Related topics
- Computing Dependencies
- Solving Conflicts
- Generating an Espree compatible AST (opens in a new tab)
- Introduction to Compilers (opens in a new tab)
Rubric
hello-compilers Repos
References
Essentials for this lab
- See the examples in the repo ULL-ESIT-PL/hello-jison (opens in a new tab)
- https://astexplorer.net (opens in a new tab)
- Tipos de Nodos del AST y nombres de las propiedades de los hijos
- Escodegen repo en GitHub (opens in a new tab)
- Jison Documentation (opens in a new tab)
Jison and Syntax Analysis
- Análisis Sintáctico Ascendente en JavaScript (opens in a new tab)
- Jison
- Mi primer proyecto utilizando Jison (opens in a new tab) por Erick Navarro
- Folder jison/examples from the Jison distribution (opens in a new tab)
- Jison Debugger (opens in a new tab)
- Precedencia y Asociatividad (opens in a new tab)
- Construcción de las Tablas para el Análisis SLR (opens in a new tab)
- Algoritmo de Análisis LR (yacc/bison/jison) (opens in a new tab)
- Repo ULL-ESIT-PL-1718/jison-aSb (opens in a new tab)
- Repo ULL-ESIT-PL-1718/ull-etsii-grado-pl-jisoncalc (opens in a new tab)
- Leveling Up One’s Parsing Game With ASTs by Vaidehi Joshi 👍