Hello Compiler

Hello Compilers

Goals

Our goal is to extend the parser of the previous lab to a simple calculator that can handle complex numbers and two new operators @ and & that will be interpreted as max and min respectively. Here is an example of the kind of expressions we want to handle:

➜  hello-compilers-solution git:(essay-2024-02-14) bin/mmt.js '2-3i*5i'

gives the output:

const {Complex, factorial, max, min, print } = require("/Users/casianorodriguezleon/campus-virtual/2122/learning/jison-learning/hello-compilers-solution/src/support-lib.js");
print(Complex('2').sub(Complex('3i').mul(Complex('5i'))));

We can execute the output with node:

➜  hello-compilers-solution git:(essay-2024-02-14) bin/mmt.js '2-3i*5i' | node -
17
➜  hello-compilers-solution git:(essay-2024-02-14) bin/mmt.js '2-3i*5i&9' | node -
9

The expression a&b is equivalent to min{a,b} and & has smaller priority than the arithmetic operators.

We can supply a file with the expression to translate with the -f option:

➜  hello-compilers-solution git:(main) ✗ cat test/input/minmax.calc 
3i&(2+4)@(5+6)

The a@b expression is equivalent to max{a,b} and @ has smaller priority than &.

➜  hello-compilers-solution git:(main) ✗ bin/mmt.js -f test/input/minmax.calc         
const { print,max,min,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(Math.max(Math.min(Complex('3i'), Complex('2').add(Complex('4'))), Complex('5').add(Complex('6'))));

When we execute the output with node we get:

➜  hello-compilers-solution git:(main) ✗ bin/mmt.js -f test/input/minmax.calc | node -
11

Here is an example of the executable bin/mmt options:

 hello-compilers-solution git:(main) bin/mmt.js -h
Usage: mmt [options] [expression]

Arguments:
  expression            Expression to translate (default: null)

Options:
  -V, --version         output the version number
  -f --file <file>      path to the input file with the expression to translate
  -a --ast <file.json>  path to file with the AST in JSON format
  -j --js <file>        file to output the resulting javascript
  -h, --help            display help for command

Two new operators & and @

Let us consider a notation of arithmetic in which the @ and & symbols on numbers are defined as the max and min operations. Thus, with this notation

12@7=max{12,7}=1212 @ 7=\max\{12,7\}=12

and

12&7=min{12,7}=712 \& 7=\min\{12,7\}=7

Extienda el traductor de la calculadora (opens in a new tab) para que admita estas expresiones aritméticas y las traduzca a un programa JavaScript que las compute y las imprima.

Supondremos que el mínimo & tiene mas prioridad que el máximo @. Por ejemplo, la entrada 234@325&57234 @ 325 \& 57 podría ser traducida al siguiente código JS:

console.log(max(234, min(325,57)))

Compared to the other tokens, give a low priority to the @ and & operators so that an expression 4!@3**2 should be interpreted as (4!)@(3**2) and produces a JS code as max(factorial("4"), pow("3", "2"))).

Since & and @ have a lower priority than + and -, an expression like 4+5i&3-2i should be interpreted as (4+5i)&(3-2i).

Extienda el programa Jison hecho en la práctica anterior con estas dos operaciones para que produzca un AST compatible Espree conteniendo el correspondiente código JS. A continuación utilice escodegen.generate(ast) (opens in a new tab) para generar el código JS

Dealing with Ambiguity

For the & and @ operators you can extend the initial incomplete grammar in the assignment repo this way:

%{
const { buildLiteral, buildRoot, buildMin } = require('./ast-build');
%}
 
%%
es: e 
;
 
e: 
    e '@' e  
  | e '&' e  
  | N        
  | '(' e ')'
;

The problem with this grammar is that it is ambiguous. Expressions like 2&3@4 have more than one concrete syntax tree.

On one side:

that will lead to the interpretation 2&(3@4); but we have also this other syntax tree:

that leads to the interpretation (2&3)@4.

To break the ambiguity you have to set that the precedence of the token & is higher that the one of the token @.

You have also to fix the ambiguity for phrases like 2&3&4 and 3@4@5 favouring a left associativity interpretation, i.e. preferring (2&3)&4 to 2&(3&4) and (3@4)@5 to 3@(4@5).

An expression 4+5i&3-2i should be interpreted as (4+5i)&(3-2i) and can produce a JS code as min(add("4", "5i"), (sub("3", "2i"))).

Breaking Ambiguity in Jison

To deal with issues of ambiguity in grammar, you can consult

These is a simplified version of the rules to resolve conflicts and ambiguities in a Yacc-like parser generator:

💡

Precedence Rules

  1. La precedencia de los tokens se hace en la cabecera del programa Jison; esto es: antes del primer %% usando la sintáxis %left token ... , %right token ... o %nonassoc token ...
  2. La precedencia de una regla de producción AαA \rightarrow \alpha es la precedencia del último token que aparece en la parte derecha α\alpha de la regla.
    • Por ejemplo la precedencia de ee@ee \rightarrow e @ e será la precedencia que le demos al token @@ en la cabecera del programa Jison.
  3. Si no existen tokens en la parte derecha con precedencia asignada o bien si la precedencia por defecto no es la deseada, se puede usar la directiva %prec para asignar la precedencia deseada a la regla. Por ejemplo:
    exp: '-' exp %prec 'STRONG'
    asigna la precedencia del token STRONG a la regla exp: '-' exp
  4. Cuando el parser detecta un conflicto y ve que hay dos posibles vias de continuar la construcción del árbol: Una que indica que quizá se aplicó la regla AαA \rightarrow \alpha y otra que indica que quizá se pueda seguir leyendo el token tt a la entrada,
    1. El parser compara las precedencias del token y de la regla y se queda con el de mas prioridad.
    2. Si es el token quien tiene mayor prioridad avanzará en la lectura desplazando el token tt y buscando nuevos símbolos (se dice que hace un shift; en este caso el AST se "hundirá" a derechas) y
    3. Si es la regla completará el subárbol parcial αA\overset{A}{\overset{\triangle}{\alpha}} y continuará en su construcción del árbol (se dice que hace un reduce y en este caso el árbol construido estará más hundido a izquierdas)
  5. Los tokens declarados en la misma línea mediante una declaración %left o %right tienen igual precedencia e igual asociatividad.
  6. La precedencia es mayor cuanto mas abajo su posición en el texto

Complex Numbers

Extend the regular expressions in the lexical analyzer to cover both floating point real numbers like 2.53e-2 and floating point imaginary numbers like 2.9e-5i or -i.

The complex.js (opens in a new tab) library provides a constructor Complex and methods mul, add, sub, div, etc. that can be used this way:

  let Complex = require('complex.js'); 
  let c = Complex({ re: 1.0e1, im: 8}); // Same: let c = Complex("1.0e1 + 8i");
  console.log(c); // { re: 10, im: 8 }
  console.log(c.mul({re: 2, im: 2}).div(2).sub(3, 2)); // { re: -1, im: 16 }
  console.log(c.add({re: 3, im: 9})); // { re: 13, im: 17 }  

For more examples of use of the library visit ULL-ESIT-PL/hello-complex.js (opens in a new tab)

Write the code inserting the support functions and the require to complex.js (opens in a new tab) lib in the preamble that is concatenated to the generated code.

Redefining the minimum and maximum operators

Use lexicographical order (opens in a new tab) to compare Complex numbers:

a+bia+bi if and only if a<a or (a=a and bb),{\displaystyle a+bi\leq a^{\prime }+b^{\prime }i{\text{ if and only if }}a<a^{\prime }{\text{ or }}\left(a=a^{\prime }{\text{ and }}b\leq b^{\prime }\right),}

Redefining the factorial function

To keep compatibility with the calculator in the previous lab, you can extend the complex class with a factorial method.

You can either constrain the factorial to natural numbers like this:

hello-compilers/complex-lib/factorial.js
#!/usr/bin/env node
const Complex = require("complex.js");
 
Complex.prototype.factorial = function() {
  if (this.im !== 0) throw new Error(`Imaginary part must be zero. Instead is ${this.im}`);
  let n = this.re;
  if (!Number.isInteger(n)) throw new Error(`Not an Integer number ${n}`);
  if ( n < 0) throw new Error(`Factorial of negative number ${n}`);
  let result = Complex(1);
  if (n === 0) return result;
  for (let i = 1; i <= n; i++) {
    result = result.mul(i);
  }
  return Complex({re: result.re, im: this.im});
};
 
console.log(Complex(process.argv[2]).factorial());

Here are several executions of the former example:

➜  complex-lib ./factorial.js "3"
{ re: 6, im: 0 }
➜  complex-lib ./factorial.js "3+2i"
Error: Imaginary part must be zero. Instead is 2
➜  complex-lib ./factorial.js "-3+0i"
Error: Factorial of negative number -3
➜  complex-lib ./factorial.js "-3.2+0i"
Error: Not an Integer number -3.2

The new version of the factorial function has to be added in the preamble of the generated code.

Or you can extend the factorial to complex numbers using the gamma function (opens in a new tab) and the reflection formula (opens in a new tab) for negative numbers. See an example of an implementation of gamma in the file ULL-ESIT-PL/hello-complex.js/gamma.js (opens in a new tab)

If you compute factorials using the gamma function, you'll likely observe some small errors due to the fact that floating point numbers have limitations on how accurately a real or complex number can be represented:

➜  hello-compilers-solution git:(main) ✗ bin/mmt.js '3!'                            
const { print,gamma,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(gamma(Complex('3')));
➜  hello-compilers-solution git:(main) ✗ bin/mmt.js '3!' | node -
2.000000000000001
 
➜  hello-compilers-solution git:(main) ✗ bin/mmt.js '(3+2i)!'    
const { print,gamma,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(gamma(Complex('3').add(Complex('2i'))));
➜  hello-compilers-solution git:(main) ✗ bin/mmt.js '(3+2i)!' | node -
-0.4226372863112039 + 0.871814255696508i

The translation scheme

Thus your calc translator must be able to generate code for expressions like

2!+3**2i-4i

that using the complex library augmented with our factorial can be ultimately evaluated as:

Complex(2).factorial().add(Complex(3).pow("2i")).sub(Complex("4i"))

Using auxiliary proxy functions like sub, add, pow, etc. 2!+3**2i-4i can be translated as:

sub(add(factorial("2"), pow("3", "2i")), "4i")

which simplifies the AST and thus the translation.

Computing the dependencies

We can find the dependencies of the generated code by traversing the AST and collecting the names of the functions that are not part of the target language. We can use espree.estraverse for that:

const findUsedFunctions = function (ast) {
  const usedSupportFunctions = new Set();
  estraverse.traverse(ast, {
    enter: function (node, _ ) {
      if (node.type === "Identifier" && exportedSupportIdentifiers.includes(node.name)) {
        usedSupportFunctions.add(node.name)
      }
    },
  });
  return Array.from(usedSupportFunctions);
};

Where exportedSupportIdentifiers is an array with the names of the functions exported by the support library:

const exportedSupportIdentifiers = Object.keys(require("./support-lib.js")); // [ power, factorial ]

Tests

Añada pruebas usando Mocha y Chai o Jest

Tips

Probably you have experienced that when you change the computer you develop on or you make changes in the preamble the tests that check the generated code fail. For instance, the require mentions the specific path of the support lib in your computer:

➜  hello-compilers-solution git:(main) ✗ bin/mmt.js -f test/input/fact.calc
const { print,gamma,Complex } = require("/Users/casianorodriguezleon/campus-virtual/2324/pl2324/practicas/hello-compilers/hello-compilers-solution/src/support-lib.js");
print(gamma(Complex('3')));

To fix this weakness, use a comment with a string in the template to separate the preamble from the generated code:

src/template.js.txt
const { {{ dependencies }} } = require("{{root}}/src/support-lib.js");
/* end of preamble of generated code */
{{code}}

Then you can use an easy regular expression to remove the preamble from the generated code and check that the expected code and the actual code are equal.

After a change in a support lib function some tests may fail. You can use the --grep option of mocha:

$ npx mocha -g 'factorial'

Also, you can use the only methods of describe and it (opens in a new tab) to run only the tests that are failing or simply use a conditional to skip the tests that are failing. For instance if the & and @ operators are not yet implemented, you can skip the tests that use them:

test/test.js
describe('Testing hello maxmin translator', () => {
  for (let c of Checks) {
    if (c.text.match(/[&@]+/)) continue;
    it(`Test ${c.text} = ${c.result}`, () => {  ... });
  }
});

Mocking

Mocking means creating a fake version of an external or internal service that can stand in for the real one, helping your tests run more quickly and more reliably. When your implementation interacts with an object’s properties, rather than its function or behavior, a mock can be used.

Stubbing

Stubbing, like mocking, means creating a stand-in, but a stub only mocks the behavior, but not the entire object. This is used when your implementation only interacts with a certain behavior of the object.

To give an example: You can stub a database by implementing a simple in-memory structure for storing records. The object under test can then read and write records to the database stub to allow it to execute the test. This could test some behaviour of the object not related to the database and the database stub would be included just to let the test run.

If you instead want to verify that the object under test writes some specific data to the database you will have to mock the database. Your test would then incorporate assertions about what was written to the database mock.

Examples of Mocking and Stubbing

See the code at ast/test/test.mjs (opens in a new tab) in the repo hello-jison for an example of stubbing the console.log.

Cuando vaya a escribir las pruebas de la práctica podemos intentar una aproximación como esta: ➜ arith2js-solution git:(dependencies) ✗ cat test/data/test1.calc

  1. Tomamos un objeto como c = { text: "3! - 1", result: 5 } con el atributo text conteniendo la expresión de prueba y el atributo result el resultado esperado después de la traducción y evaluación del código
  2. Construimos primero el árbol con t = p.parse(c.text)
  3. Generamos el JavaScript con js = escodegen.generate(t)
  4. Evaluamos el JavaScript con result = eval(js)
  5. Si nuestro traductor es correcto result debería ser igual c.result

Suena bien ¿Verdad?

Pero en tal aproximación ¡tenemos un problema! y es que el código JavaScript generado para "3! - 1" nos han pedido que sea:

➜  arith2js-solution git:(dependencies) ✗ cat test/data/test1.calc 
3! - 1                                                                                                        
➜  arith2js-solution git:(dependencies) ✗ bin/calc2js.mjs test/data/test1.calc 
#!/usr/bin/env node
const factorial = n => (n === 0) ? 1 : n * factorial(n - 1);
console.log(factorial(3) - 1);

y si evaluamos el código resultante:

➜  arith2js-solution git:(dependencies) ✗ node                                
Welcome to Node.js v16.0.0.
Type ".help" for more information.
> result = eval(`const factorial = n => (n === 0) ? 1 : n * factorial(n - 1);
... console.log(factorial(3) - 1);`)
5
undefined
> result
undefined

¡La variable result está undefined!

Esto es así porque la llamada a console.log() siempre retorna undefined (no se confunda por el 5 que aparece en stdout producido por el console.log. El valor retornado es undefined)

Así pues una aproximación como esta no funcionaría:

const p = require("../src/transpile.js").parser;
const escodegen = require("escodegen");
require("chai").should();
 
const Checks = [
  { text: "2+3*2", result: 8 },
  { text: "4-2-1", result: 1 },
];
 
describe("Testing translator", () => {
  for (let c of Checks) {
    it(`Test ${c.text} = ${c.result}`, () => {
      const t = p.parse(c.text);
      const js = escodegen.generate(t);
      const result = eval(js);
 
      result.should.equal(c.result);
      
      console.log = oldLog;
    });
  }
});

No funcionaría porque lo que queda en result es undefined y no el resultado de 2+3*2. ¿Cómo arreglarlo?

¡El patrón de Stubbing al rescate!

Sustituyamos el método log del objeto console con nuestra propia función adaptada a nuestras necesidades de testing console.log = x => x; que retorna el valor del argumento pasado a console.log. De esta forma podemos acceder al valor de la evaluación de la expresión:

it(`Test ${c.text} = ${c.result}`, () => {
      let oldLog = console.log;
      console.log = x => x;
 
      const t = p.parse(c.text);
      const js = escodegen.generate(t);
      const result = eval(js);
 
      result.should.equal(c.result);
      
      console.log = oldLog;
    });
  }
});

Ahora result contiene la evaluación de la expresión y las pruebas funcionan.

Covering

You can use nyc (opens in a new tab) to do the covering of your mocha tests. See the notes in covering.

Activate the GitHub pages of your repo (go to settings and use the default branch and the docs folder) and be sure to include your covering report in the docs folder. Add to your package.json a script to run the tests and leave the report in the docs folder.

Continuous Integration

Esta tarea tiene asociada una GitHub Action education/autograding (opens in a new tab) para ejecutar los tests proveidos por el profesor. Añada tests y compruebe en las GitHub actions de su repo el estado de las pruebas. Lea la sección GitHub Actions de los apuntes.

Videos

Vídeos de las clases del 19/02/2024 y 21/02/2024:



Este vídeo del 22/02/2023 puede ayudar con la elaboración de la práctica hello-compiler:


Related topics

Rubric










hello-compilers Repos

References

Essentials for this lab

Jison and Syntax Analysis

Have a look