Mlang

The Mlang compiler has a traditionnal architecture consisting of various intermediate representations going from the source code to the target backend.

M Frontend

First, the source code is parsed according to the Menhir grammar specified in Mlang.Mparser. The grammar is not exactly LR(1) so we rely on Mlang.Parse_utils to backtrack, especially on symbol parsing. The target intermediate representation is Mlang.Mast, which is very close to the concrete syntax and can be printed using Mlang.Format_mast.

M Intermediate representation

The M language has a lot of weird syntactic sugars and constructs linked to its usage inside multiple DGFiP applications. Mlang.Mast_to_mir extracts from the AST the computational core corresponding to a DGFiP application into the M Variable Graph (Mlang.Mir), which consists basically of a flat map of all the definitions of the variables used in the application. The type system of M is very primitive, and basically all programs typecheck ; however Mlang.Mir_typechecker provides a top-down typechecking algorithm to split simple variables from tables.

At this point, the Mlang.Mir_dependency_graph modules interprets the MIR as a first-class graph and computes various reachability and cycle analysis in order to determine the computational flow inside the program. This dependency order is encapsulated with the program in Mlang.Mir_interface. Some typechecking and macro expansion is performed by Mlang.Mir_typechecker.

Mlang.Mir
Mlang.Mast_to_mir
Mlang.Format_mir
Mlang.Mir_typechecker
Mlang.Mir_interface
Mlang.Mir_dependency_graph

M++ Frontend

The M code can be called several times through a sort of driver that saves some variables between calls. This driver is encoded in the M++ domain-specific language that Mlang handles together with the M. M++ is parsed with Mlang.Mpp_parser

Mlang.Mpp_parser
Mlang.Mpp_ast

M++ Intermediate representation

From the M++ AST, Mlang.Mpp_frontend translates to Mlang.Mpp_ir by eliminating some syntactic sugars.

Mlang.Mpp_frontend
Mlang.Mpp_ir
Mlang.Mpp_format

M/M++ Common backend intermediate representation

The module Mlang.Mpp_ir_to_bir performs the inlining of all the M calls in the M++ programs and yields a single program in a new common intermediate representation, Mlang.Bir. Inputs and outputs for this representation can be specified using Mlang.Bir_interface.

The BIR representation is equipped with a fully fledged interpreter, Mlang.Bir_interpreter. The interpreter is instrumented for code coverage via Mlang.Bir_instrumentation, and can be parametrized via multiple sorts of custom floating point values for precision testing (Mlang.Bir_number).

Mlang.Bir_instrumentation
Mlang.Bir_interface
Mlang.Bir_interpreter
Mlang.Bir_number
Mlang.Bir
Mlang.Format_bir

Optimizations

While BIR is an AST-based representation, Mlang.Oir defines a dual of BIR in a control-flow-graph (CFG) for. You can go back and forth between those two implementations using Mpp.Bir_to_oir.

OIR is the right place to perform some basic program optimizations. Mlang.Inlining expands some of the small rules into their full expressions, while Mlang.Partial_evaluation and Mlang.Dead_code_removal simplify the program depending on the inputs and outputs. The main optimizing loop is implemented in Mlang.Oir_optimizations.

Mlang.Oir_optimizations
Mlang.Inlining
Mlang.Partial_evaluation
Mlang.Dead_code_removal

Testing M and M++ programs

Mlang comes with a testing framework for M and M++ programs that is based on the test format used by the DGFiP. The test files are parsed with Mlang.Test_parser into Mlang.Test_ast. Then, single or batch testing can be performed using Mlang.Test_interpreter.

Compiling M and M++ programs

M/M++ programs can be compiled to other programming languages using several backends that take BIR and produce source code files in their respective languages.

Mlang.Bir_to_python
Mlang.Bir_to_c
Mlang.Bir_to_dgfip_c