Expression Tree
Evaluate expressions in real time against arbitrary data
Introduction
Consider the diagram above. The tree represents an expression,
(a – b) / (c*d + e)
Parentheses were added to aid in disambiguation.
Now, consider the following block of JSON,
{
“customerId”: “123-456”,
“a”: 1,
“b”: 0,
“c”: 1,
“d”: 2,
“e”: 3
}
Your task is to author a presentational component and one of the numbers that is rendered in the component is computed from the above formula. The actual values of a, b, c, d, and e vary across different runs of the application.
Just when you are about to mark the JIRA task completed, a change request is submitted. One of the child components is to be conditionally displayed according to the prescription,
(a-b) > (c*d)
A Typical Implementation
A typical Angular implementation might provide the JSON data as an Input. Values are read from that Input and used to compute the required numerical value inside the component. A conditional expression in the component’s template is likely bound to a visibility property that controls the optional display of a child component.
There is nothing intrinsically ‘wrong’ with this implementation. The approach does, however, introduce several issues. First, it quietly embeds business logic inside a purely presentational component. Second, it dilutes single source of truth for business logic. One part of this logic is specified before the component is written and may or may not be properly documented. Another small piece of business logic is buried inside the template and is only referenced in a JIRA ticket. Sometimes, we don’t even have that much of a paper trail. How many CR’s in your career had to be crammed into code from an e-mail or IM request? Of course, the deadline is yesterday, so what is the developer to do other than turn the CR around as quickly as possible?
Over time, these side discussions are deleted and/or forgotten, but the time bomb of coupling business logic and component code remains.
When is tighter coupling good?
Note that both the numerical value and conditional display of a child component depend on formulas or expressions whose specific values are taken from a known data structure.
What if we created a model that coupled data with the necessary expressions for business logic to control component display?
export interface CustomerModel
{
customerId: string;
a: number;
b: number;
c: number;
d: number;
e: number;
kpiExpression: string;
kpiDigits: number;
optionalExpression: string;
}
A concrete implementation of this model is illustrated in the following JSON
{
customerId: “123-456”,
“a”: 1,
“b”: 0,
“c”: 1,
“d”: 2,
“e”: 3
kpiExpression: “(a – b) / (c*d + e)“;
kpiDigits: 3;
optionalExpression: (a-b) > (c*d);
}
This model can be shared (and documented) across the front and back end. It couples data values with numerical and logical expressions that control how the component presents data. There is now a source of documentation for business logic as well as a single source of truth.
While this approach seems useful from an organizational standpoint, it just made the FE developer’s job about ten times harder. It’s very easy to code a simple formula even if it’s buried inside a component’s template or backing code. Parsing and evaluating a formula or general logical expression that is supplied in data is far more complex.
Is there any reasonable solution to this dilemma?
Function Evaluation
I first faced this problem in the early 2000’s when a client requested a general-purpose function graphing engine in Flash. I thought I did a really good job on the engine until a late change request was added that the client wanted to enter free-form functions in infix notation.
<side-note>infix, prefix, and postfix are notational conventions for describing expressions. Infix is how most people write expressions and infix notation is used throughout this article.</side-note>
What I needed was a general-purpose function parser and evaluation engine. Given one or more independent variables, i.e. x, y, z, a function is specified in a string such as
2*x + 72*x + 3*y^2 + sin(x*y)x^3 + y^2 + 2*z + cos(3*z)
The function is first parsed into an expression tree similar to that shown above, although it is stored differently as will be discussed at a later point in the article.
The function can be evaluated after parsing by providing numerical values of the independent variable(s) in an array.
This worked very well for a number of applications as long as the expression was a function of a specified number of independent variables and produced a single number as a result.
What about the above situation where a child component is conditionally displayed as a result of evaluating?
(a-b) > (c*d)
This expression has four independent variables, a, b, c, and d. The result of the expression, however, is a boolean, not a number.
Now, what?
Expression Engine
I extended the concept of an expression beyond a numerical function. An expression still operates on one or more independent variables. Those variables may be number, string, or boolean. An expression produces a result that is a number, string, or boolean. Some examples of general expressions include
(doctor == ‘approved’) && (x > 0)-3*x + y/2x = y(2*x + 1) < (3*y – 2)
The general process of using such an engine consists of
1 — specify a set of independent variables
2 — Parse the expression
3 — Evaluate the expression for specific values of each independent variable in the order they were defined.
Example:
Suppose we have an instance of an expression engine stored in the variable, ‘expression.’
expression.variables = [“x”, “y”];
const success: boolean = expression.parse( “(2*x + 1) > (3*y – 2)” );
expect(success).toBe(true);
let value: expressionValue = expression.evaluate([0, 0]);
expect(value).toBe(true);
value = expression.evaluate([1, 1])
expect(value).toBe(true);
value = expression.evaluate([-2, 3])
expect(value).toBe(false);
This example defines an expression in infix notation with the string “(2*x + 1) > (3*y — 2)”. The expression is parsed once. It is evaluated three times,
First evaluation: x = 0 and y = 0.
Second evaluation: x = 1 and y = 1
Third evaluation: x = -2 and y = 3.
In this example, the expression evaluation results in a number. In rare cases, the result of an expression could be a string. The most common result types are number and boolean.
The benefit of a packaged engine is that we now have the best of both worlds relative to the problem mentioned in the introduction. Business logic can be coupled with data in a manner that ensures some level of documentation and a single source of truth. This is made possible with a Typescript Interface and the fact that the block of input data is easy to read and understand, even by a non-programmer.
Engine Internals
TLDR; This section contains references to Computer Science concepts and is not necessary reading. I am giving you a Typescript Expression engine. If you are interested in a high-level overview of how it works, then read on. If you want free code, skip to the end 🙂
I’ll skip the parsing part, other than pointing out that it was originally written in a hurry in ActionScript and probably not the best code I’ve ever authored 🙂
I should also mention that forced parentheses are used for disambiguation. I found early on that many potential users of the engine or others who might modify expressions in a CMS are not familiar with mathematical order of operations. I may one day add this functionality as an option to the parser, but I would create a pre-processor that manually adds the correct parentheses to an expression to enforce order of operations. Then, the modified expression would be passed to the current parser. That parser happens to be very lightweight and is written in a manner that is fairly easy to follow.
One ‘trick’ that can be used to compact storage is to take advantage of the fact that each node of the expression tree has at most two children. In this case, the tree can be treated as a stack and stored in an Array. Now, an array has a lot of overhead, so if I had to parse a LOT of expressions, I’d likely optimize the engine by reorganizing the stack as a linked list. For now, it’s stored in an Array as the most common usage is parse a modest-length expression once and evaluate often.
The basic idea behind evaluation is to use a simple stack machine as an alternative to an inorder traversal of the expression tree. I don’t want to reinvent the wheel or retype what is already thoroughly documented online and in the open literature on Computer Science. If you want to see how an inorder traversal of the expression tree can be used to evaluate a result, here is a good article from GeeksForGeeks,
Evaluation of Expression Tree – GeeksforGeeks
and, if you want to see good illustrations of how to use a stack to evaluate expressions in a variety of formats (including infix), then this is a good read.
Expression Evaluation Using Stack | Coding Ninjas Blog
So, there is really nothing new under the Sun. In fact, I was introduced to these parsing and evaluation techniques in the very early 1990s as a C developer. Yes, I really am that old 🙂
Notice that every operation we perform takes place on one or two operands. So, we have expression operands and expression results, which leads to three important types in the Typescript expression engine.
export type expressionValue = number | string | boolean;export type expressionOperand = number | string | boolean;export type ExpressionFcn = (… args: Array<expressionOperand>) => expressionValue;
Internal (protected) class methods are used to evaluate the individual constituents of an expression. These can be referenced dynamically via two observations.
1 — A class compiles down to a function, which is still a JavaScript Object.
2 — The self-referential pointer (this) resolves to the Object’s scope, so internal methods can be referenced by this[some-method-name].
Well, number two isn’t exactly that easy with modern linters. The precise syntax that passes eslint is
let f: ExpressionFcn;
.
.
.f = (this as unknown as Record<string, ExpressionFcn>)[token] as ExpressionFcn;
This may seem like a lot of extra work, but it serves as a comment in that we are treating the class as a property bag, then extracting a specific property, which is specified in advance as having the type ExpressionFcn. We also know in advance that an ExpressionFcn takes an arbitrary list of arguments (one or two for this engine) and returns an expressionValue as a result. TypeScript alone takes care of some helpful documentation work.
Let’s finish up with a simple example,
Evaluate (2*x) >= (3*y – 2)
for arbitrary values of x and y. If the expression engine is to be used for many different expressions with different sets of independent variables, then initialize as
const expression: ExpressionEngine = new ExpressionEngine();
For the current example, the independent variables are known in advance and can be specified at construction,
const expression: ExpressionEngine = new ExpressionEngine([‘x’, ‘y]);
After parsing, the evaluation stack is as follows
[
‘num’,
‘2’,
‘var’,
‘y’,
‘num’,
‘3’,
‘fun2’,
‘mul’,
‘fun2’,
‘sub’,
‘var’,
‘x’,
‘num’,
‘2’,
‘fun2’,
‘mul’,
‘fun2’,
‘ge’
]
Each string is a simple symbolic code for a function, variable, or represents a constant value.
To evaluate the expression at x = -2 and y = 0, execute the code
expression.evaluate([-2, 0]);
You can see how the stack is processed in the evaluate method, shown in the gist below.
https://medium.com/media/f74ac65cf2e33dc4bda960936b4fada1/href
This type of code is what we used to call ‘bookkeeping’ back in the day. It’s just a lot of detail after detail after detail, even though the high-level approach is well-known and well documented in the open literature.
There is another method in the class called eval. This method also evaluates an expression based on supplied values for the independent variables. It also allows a pre-parsed execution stack to be passed as an argument. This was useful in Flash application where I was required to evaluate many different expressions in a game environment, so speed was imperative. There were always a fixed number of expressions for a given level that could be evaluated an unknown number of times. Every time an expression was changed to something that had already been evaluated, it was wasteful to re-parse the expression. So, each expression was given a UID and its parsed execution stack was stored in hash. The pre-parsed stack could be quickly retrieved and evaluated in real-time, as needed.
The Repo
Although I experimented a bit with such an engine in C, I never got serious about it until writing a function parser in ActionScript for a Flash application. I converted the function parser to TypeScript and expended it into the Expression Engine in 2017. I also open-sourced that initial engine in summer of 2017.
The code has been upgraded and included in the AMYR library and is available for commercial use under an Apache 2.0 license.
I’ve used this engine for everything from freeform function graphing to executing real-time logic returned by a server in multi-tenant applications. This engine was also central to my John Wick impossible task. I was asked to convert over 30K lines of Java code with additional third-party library dependencies such as MVEL to TypeScript. MVEL is a Map-based expression evaluation library for Java. I was able to quickly substitute the expression engine for MVEL and complete the impossible task without taking out a marker 🙂
Expression evaluation, however, is not the end game; in many cases, it’s just the beginning. The greatest power of this engine is not standalone use, but what can be built on top of the engine.
Coming next … a Decision Tree.
Typescript Expression Engine was originally published in ngconf on Medium, where people are continuing the conversation by highlighting and responding to this story.