Mlang.Mast
Abstract Syntax Tree for M
This AST is very close to the concrete syntax. It features many elements that are dropped in later phases of the compiler, but may be used by other DGFiP applications.
Applications are rule annotations. The 3 main DGFiP applications seem to be:
batch: deprecated, formerly used to compute the income tax;
bareme: seems to compute the income tax;
iliad: usage unknown, much bigger than bareme.
module DomainId = StrSet
module DomainIdSet = StrSetSet
module DomainIdMap = StrSetMap
module ChainingSet = StrSet
module ChainingMap = StrMap
For generic variables, we record the list of their lowercase parameters.
A variable is either generic (with loop parameters) or normal.
A table index is used in expressions like TABLE[X], and can be a variable, an integer or the special X variable that stands for a "generic" index (to define table values as a function of the index). X is contained here in SymbolIndex because there can also be a variable named "X".
...
val get_table_size : table_size -> int
val get_table_size_opt : (table_size * 'a) option -> (int * 'a) option
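The signature of get_table_size_opt suggests that it applies get_table_size to the first component of an optional pair and leaves the second component untouched. The following is only a sketch of that reading: the real table_size definition is not shown on this page, so the single LiteralSize constructor below is a hypothetical stand-in, and the actual functions may behave differently.

```ocaml
(* Hypothetical stand-in: the real [table_size] definition is not shown here. *)
type table_size = LiteralSize of int

(* Assumed behaviour: extract the integer size. *)
let get_table_size : table_size -> int = function LiteralSize n -> n

(* Convert the size inside an optional pair, keeping the payload as is. *)
let get_table_size_opt : (table_size * 'a) option -> (int * 'a) option =
  function
  | None -> None
  | Some (size, payload) -> Some (get_table_size size, payload)
```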
type set_value =
| FloatValue of float Pos.marked
| VarValue of variable Pos.marked
| Interval of int Pos.marked * int Pos.marked
The M language has an extremely odd way to specify looping. Rather than having first-class local mutable variables whose values change at each loop iteration, the M language uses the changing loop parameter to instantiate the variable names inside the loop. For instance, somme(i=1..10:Xi) should evaluate to the sum of the variables X1, X2, etc. Parameters can be numbers or characters, and there can be several of them. We have to store all this information.
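To illustrate the name instantiation described above, here is a minimal sketch that substitutes a loop parameter into a variable name over an integer range. The helpers instantiate and expand_range are hypothetical names introduced for this example and are not part of Mlang.Mast; the sketch also assumes that every occurrence of the parameter character in the name is meant to be replaced.

```ocaml
(* Hypothetical helper: replace every occurrence of the loop parameter
   character [param] in [name] by the string [value]. *)
let instantiate (param : char) (value : string) (name : string) : string =
  String.concat value (String.split_on_char param name)

(* Hypothetical helper: instantiate [name] for each integer of the
   inclusive range [lo..hi]; an empty range yields the empty list. *)
let expand_range (param : char) (lo : int) (hi : int) (name : string) :
    string list =
  List.init
    (max 0 (hi - lo + 1))
    (fun k -> instantiate param (string_of_int (lo + k)) name)
```

With these definitions, expand_range 'i' 1 3 "Xi" yields the names X1, X2 and X3, which a sum function would then add up.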
type set_value_loop =
| Single of literal Pos.marked
| Range of literal Pos.marked * literal Pos.marked
| Interval of literal Pos.marked * literal Pos.marked
Values that can be substituted for loop parameters
type loop_variable = char Pos.marked * set_value_loop list
A loop variable is the character that should be substituted in variable names inside the loop, plus the set of values to substitute.
There are two kinds of loop variable declarations; they have different concrete syntax but are semantically the same.
val precedence : binop -> int
val is_left_associative : binop -> bool
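Functions such as precedence and is_left_associative are typically what a pretty-printer needs to emit only the necessary parentheses. The sketch below shows that technique on a toy expression type. The binop and expr types, the precedence numbers, and the helpers symbol, print_prec and print are all assumptions made for illustration; the real binop type and its precedence table are not shown on this page.

```ocaml
(* Hypothetical operator and expression types, for illustration only. *)
type binop = Add | Sub | Mul | Div
type expr = Lit of int | Bin of binop * expr * expr

(* Assumed conventions: a higher number binds tighter, and all four
   operators are left-associative. *)
let precedence = function Add | Sub -> 1 | Mul | Div -> 2
let is_left_associative = function Add | Sub | Mul | Div -> true
let symbol = function Add -> "+" | Sub -> "-" | Mul -> "*" | Div -> "/"

(* Print [e], adding parentheses when its operator binds less tightly
   than the surrounding context requires ([min_prec]). *)
let rec print_prec min_prec e =
  match e with
  | Lit n -> string_of_int n
  | Bin (op, l, r) ->
    let p = precedence op in
    (* On the non-associative side, an operand of equal precedence
       must be parenthesized. *)
    let lp, rp = if is_left_associative op then (p, p + 1) else (p + 1, p) in
    let s = print_prec lp l ^ " " ^ symbol op ^ " " ^ print_prec rp r in
    if p < min_prec then "(" ^ s ^ ")" else s

let print e = print_prec 0 e
```

Under these assumptions, a subtraction nested on the right is parenthesized while one nested on the left is not.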
type expression =
| TestInSet of bool * expression Pos.marked * set_value list | (* Test if an expression is in a set of value (or not in the set if the flag is set to |
| Comparison of comp_op Pos.marked
* expression Pos.marked
* expression Pos.marked (* Compares two expressions and produces a boolean *)
| Binop of binop Pos.marked * expression Pos.marked * expression Pos.marked
| Unop of unop * expression Pos.marked
| Index of variable Pos.marked * table_index Pos.marked (* Access a cell in a table *)
| Conditional of expression Pos.marked
* expression Pos.marked
* expression Pos.marked option | (* Classic conditional with an optional else clause ( |
| FunctionCall of func_name Pos.marked * func_args
| Literal of literal
| Loop of loop_variables Pos.marked * expression Pos.marked (* The loop is prefixed with the loop variables declarations *)
| NbCategory of string Pos.marked list Pos.marked
| Attribut of variable Pos.marked * string Pos.marked
| Size of variable Pos.marked
| NbAnomalies
| NbDiscordances
| NbInformatives
| NbBloquantes
The main type of the M language
and func_args =
| ArgList of expression Pos.marked list
| LoopList of loop_variables Pos.marked * expression Pos.marked
Functions can take an explicit list of arguments or a loop expression that expands into a list.
The rule is the main feature of the M language. It defines the expression of one or several variables.
An lvalue (left value) is a variable being assigned. It can be a table or a non-table variable
type formula =
| SingleFormula of formula_decl
| MultipleFormulaes of loop_variables Pos.marked * formula_decl
In the M language, you can define multiple variables at once. This is how looping is done at the level of definitions, since the definition can depend on the value of the loop variable (e.g. Xi can depend on i).
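As a rough illustration of how one looped definition stands for several concrete ones, the sketch below expands a formula into a list of instantiated definitions. The types here are deliberately simplified and hypothetical: formula_decl is reduced to two strings, the loop variables to a single character with a list of integers, and subst and expand are names invented for this example.

```ocaml
(* Simplified, hypothetical stand-ins for the types documented here. *)
type formula_decl = { lvalue : string; rhs : string }

type formula =
  | SingleFormula of formula_decl
  | MultipleFormulaes of (char * int list) * formula_decl

(* Replace every occurrence of the loop character [c] in [s] by [v]. *)
let subst (c : char) (v : int) (s : string) : string =
  String.concat (string_of_int v) (String.split_on_char c s)

(* Expand a formula into one concrete definition per loop value. *)
let expand : formula -> formula_decl list = function
  | SingleFormula d -> [ d ]
  | MultipleFormulaes ((c, values), d) ->
    List.map
      (fun v -> { lvalue = subst c v d.lvalue; rhs = subst c v d.rhs })
      values
```

The textual substitution on rhs is only a shortcut for this sketch; on the real AST the substitution would apply to variable names inside the expression tree.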
type print_arg =
| PrintString of string
| PrintName of variable Pos.marked
| PrintAlias of variable Pos.marked
| PrintIndent of expression Pos.marked
| PrintExpr of expression Pos.marked * int * int
type var_category_id = string Pos.marked list Pos.marked
type restore_vars =
| VarList of string Pos.marked list
| VarCats of string Pos.marked * var_category_id list * expression Pos.marked
type instruction =
| Formula of formula Pos.marked
| IfThenElse of expression Pos.marked
* instruction Pos.marked list
* instruction Pos.marked list
| ComputeDomain of string Pos.marked list Pos.marked
| ComputeChaining of string Pos.marked
| ComputeTarget of string Pos.marked
| ComputeVerifs of string Pos.marked list Pos.marked * expression Pos.marked
| VerifBlock of instruction Pos.marked list
| Print of print_std * print_arg Pos.marked list
| Iterate of string Pos.marked
* var_category_id list
* expression Pos.marked
* instruction Pos.marked list
| Restore of restore_vars Pos.marked list * instruction Pos.marked list
| RaiseError of error_name Pos.marked * variable_name Pos.marked option
| CleanErrors
| ExportErrors
| FinalizeErrors
type rule = {
rule_number : int Pos.marked;
rule_tag_names : string Pos.marked list Pos.marked;
rule_applications : application Pos.marked list;
rule_chaining : chaining Pos.marked option;
rule_formulaes : formula Pos.marked list; (* A rule can contain many variable definitions *)
}
type target = {
target_name : string Pos.marked;
target_file : string option;
target_applications : application Pos.marked list;
target_tmp_vars : (string Pos.marked * table_size Pos.marked option) list;
target_prog : instruction Pos.marked list;
}
type 'a domain_decl = {
dom_names : string Pos.marked list Pos.marked list;
dom_parents : string Pos.marked list Pos.marked list;
dom_by_default : bool;
dom_data : 'a;
}
type rule_domain_decl = rule_domain_data domain_decl
The M language has prototypes for declaring variables with types and various attributes. There are three kinds of variables: input variables, computed variables and constant variables.
Variable declarations are not application-specific, which is inconsistent.
type variable_attribute = string Pos.marked * int Pos.marked
Here are all the types a value can have. Date types don't seem to be used at all though.
type input_variable = {
input_name : variable_name Pos.marked;
input_category : string Pos.marked list;
input_attributes : variable_attribute list;
input_alias : variable_name Pos.marked; (* Unused for now *)
input_is_givenback : bool;
input_description : string Pos.marked;
input_typ : value_typ Pos.marked option;
}
type computed_variable = {
comp_name : variable_name Pos.marked;
comp_table : table_size Pos.marked option; | (* size of the table, |
comp_attributes : variable_attribute list;
comp_category : string Pos.marked list;
comp_typ : value_typ Pos.marked option;
comp_is_givenback : bool;
comp_description : string Pos.marked;
}
type variable_decl =
| ComputedVar of computed_variable Pos.marked
| ConstVar of variable_name Pos.marked * literal Pos.marked (* The literal is the constant value *)
| InputVar of input_variable Pos.marked
type var_category_decl = {
var_type : var_type;
var_category : string Pos.marked list;
var_attributes : string Pos.marked list;
}
These clauses are expressions referring to the variables of the program. They seem to be dynamically checked and trigger errors when false.
type verification_condition = {
verif_cond_expr : expression Pos.marked;
verif_cond_error : error_name Pos.marked * variable_name Pos.marked option; (* A verification condition error can be associated with a variable *)
}
type verification = {
verif_number : int Pos.marked;
verif_tag_names : string Pos.marked list Pos.marked;
verif_applications : application Pos.marked list; (* Verification conditions are application-specific *)
verif_conditions : verification_condition Pos.marked list;
}
type verif_domain_decl = verif_domain_data domain_decl
type error_ = {
error_name : error_name Pos.marked;
error_typ : error_typ Pos.marked;
error_descr : string Pos.marked list;
}
type source_file_item =
| Application of application Pos.marked (* Declares an application *)
| Chaining of chaining Pos.marked * application Pos.marked list
| VariableDecl of variable_decl
| Rule of rule
| Target of target
| Verification of verification
| Error of error_ (* Declares an error *)
| Output of variable_name Pos.marked (* Declares an output variable *)
| Function (* Declares a function, unused *)
| VarCatDecl of var_category_decl Pos.marked
| RuleDomDecl of rule_domain_decl
| VerifDomDecl of verif_domain_decl
type source_file = source_file_item Pos.marked list
type program = source_file list
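Since a program is a list of source files, each a list of marked items, a typical traversal flattens both levels and pattern-matches on the item constructor. The sketch below collects rule numbers in that way. It uses simplified, hypothetical types: the marked type is assumed to be a pair of a value and a location (represented here by a string), Rule carries only its number, and rule_numbers is a name invented for this example.

```ocaml
(* Assumed stand-in for Pos.marked: a value paired with a location. *)
type 'a marked = 'a * string

(* Simplified, hypothetical subset of source_file_item. *)
type source_file_item =
  | Rule of int marked
  | Output of string marked
  | Function

type source_file = source_file_item marked list
type program = source_file list

(* Collect the numbers of all rules, in file order then item order. *)
let rule_numbers (p : program) : int list =
  List.concat
    (List.map
       (fun file ->
         List.filter_map
           (fun (item, _pos) ->
             match item with Rule (n, _) -> Some n | _ -> None)
           file)
       p)
```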
val get_variable_name : variable -> string