Equations

From Efficient Java Matrix Library
Jump to navigation Jump to search

Writing succinct and readable linear algebra code in Java, using any library, is problematic. Originally EJML just offered two API's for performing linear algebra. A procedural API which provided complete control over memory and which algorithms were used, but was verbose and has a sharper learning curve. Alternatively you could use an object oriented API (SimpleMatrix) but lose control over memory and it has a limited set of operators. Neither of these API's produces code which is all that similar to how equations are written mathematically.

Languages, such as Matlab, are specifically designed for processing matrices and are much closer to mathematical notation. C++ offers the ability to overload operators allowing for more natural code, see Eigen. To overcome this problem EJML now provides the _Equation_ API, which allows a Matlab/Octave like notation to be used.

This is achieved by parsing text strings with equations and converting it into a set of executable instructions, see the usage example below:

eq.process("K = P*H'*inv( H*P*H' + R )");

It is easy to see that the equations code is the compact and readable. While the syntax is heavily inspired by Matlab and its kin, it does not attempt to replicate their functionality. It is also not a replacement for SimpleMatrix or the procedural API. There are situations where those other interfaces are easier to use and most programs would need to use a mix.

Equations is designed to have minimal overhead. It runs almost as fast as the procedural API and can be used such that all memory is predeclared.


Quick Start

The syntax used in Equation is very similar to Matlab and other computer algebra systems (CAS). It is assumed the reader is already familiar with these systems and can quickly pick up the syntax through these examples.

Let's start with a complete simple example then explain what's going on line by line.

01: public void updateP( DMatrixRMaj P , DMatrixRMaj F , DMatrixRMaj Q ) {
02:     Equation eq = new Equation();
03:     eq.alias(P,"P",F,"F",Q,"Q");
04:     eq.process("S = F*P*F'");
05:     eq.process("P = S + Q");
06: }

Line 02: Declare the Equation class.
Line 03: Create aliases for each variable. This allowed Equation to reference and manipulate those classes.
Line 04: Process() is called and passed in a text string with an equation in it. The variable 'S' is lazily created and set to the result of F*P*F'.
Line 05: Process() is called again and P is set to the result of adding S and Q together. Because P is aliased to the input matrix P that matrix is changed.

Three types of variables are supported, matrix, double, and integer. Results can be stored in each type and all can be aliased. The example below uses all 3 data types and to compute the likelihood of "x" from a multivariable normal distribution defined by matrices 'mu' and 'P'.

eq.alias(x.numRows,"k",P,"P",x,"x",mu,"mu");
eq.process("p = (2*pi)^(-k/2)/sqrt(det(P))*exp(-0.5*(x-mu)'*inv(P)*(x-mu))");

The end result 'p' will be a double. There was no need to alias 'pi' since it's a built in constant. Since 'p' is lazily defined how do you access the result?

double p = eq.lookupDouble("p");

For a matrix you could use eq.lookupMatrix() and eq.lookupInteger() for integers. If you don't know the variable's type then eq.lookupVariable() is what you need.

It is also possible to define a matrix inline:

eq.process("P = [10 0 0;0 10 0;0 0 10]");

Will assign P to a 3x3 matrix with 10's all along it's diagonal. Other matrices can also be included inside:

eq.process("P = [A ; B]");

will concatenate A and B horizontally.

Submatrices are also supported for assignment and reference.

eq.process("P(2:5,0:3) = 10*A(1:4,10:13)");

P(2:5,0:3) references the sub-matrix inside of P from rows 2 to 5 (inclusive) and columns 0 to 3 (inclusive).

This concludes the quick start tutorial. The remaining sections will go into more detail on each of the subjects touched upon above.

The Compiler

The current compiler is very basic and performs very literal translations of equations into code. For example, "A = 2.5*B*C'" could be executed with a single call to CommonOps.multTransB(). Instead it will transpose C, save the result, then scale B by 2.5, save the result, multiply the results together, save that, and finally copy the result into A. In the future the compiler will become smart enough to recognize such patterns.

Compiling the text string contains requires a bit of overhead but once compiled it can be run with very fast. When dealing with larger matrices the overhead involved is insignificant, but for smaller ones it can have a noticeable impact. This is why the ability to precompile an equation is provided.

Original:

eq.process("K = P*H'*inv( H*P*H' + R )");

Precompiled:

// precompile the equation
Sequence s = eq.compile("K = P*H'*inv( H*P*H' + R )");
// execute the results with out needing to recompile
s.perform();

Both are equivalent, but if an equation is invoked multiple times the precompiled version can have a noticable improvement in performance. Using precompiled sequences also means that means that internal arrays are only declared once and allows the user to control when memory is created/destroyed.

To make it clear, precompiling is only recommended when dealing with smaller matrices or when tighter control over memory is required.

When an equation is precompiled you can still change the alias for a variable.

eq.alias(0,"sum",0,"i");
Sequence s = eq.compile("sum = sum + i");
for( int i = 0; i < 10; i++ ) {
    eq.alias(i,"i");
    s.perform();
}

This will sum up the numbers from 0 to 9.

Debugging

There will be times when you pass in an equation and it throws some weird exception or just doesn't do what you expected. To see the tokens and sequence of operations set the second parameter in compile() or peform() to true.

For example:

eq.process("y = z - H*x",true);

When application is run it will print out

Parsed tokens:
------------
VarMATRIX
ASSIGN
VarMATRIX
MINUS
VarMATRIX
TIMES
VarMATRIX

Operations:
------------
multiply-mm
subtract-mm
copy-mm

Alias

To manipulate matrices in equations they need to be aliased. Both DMatrixRMaj and SimpleMatrix can be aliased. A copy of scalar numbers can also be aliased. When a variable is aliased a reference to the data is saved and a name associated with it.

DMatrixRMaj x = new DMatrixRMaj(6,1);
eq.alias(x,"x");

Multiple variables can be aliased at the same time too

eq.alias(x,"x",P,"P",h,"Happy");

As is shown above the string name for a variable does not have to be the same as Java name of the variable. Here is an example where an integer and double is aliased.

int a = 6;
eq.alias(2.3,"distance",a,"a");

After a variable has been aliased you can alias the same name again to change it. Here is an example of just that:

for( int i = 0; i < 10; i++ ) {
    eq.alias(i,"i");
    // do stuff with i
}

If after benchmarking your code and discover that the alias operation is slowing it down (a hashmap lookup is done internally) then you should consider the following faster, but uglier, alternative.

VariableInteger i =  eq.lookupVariable("i");
for( i.value = 0; i.value < 10; i.value++ ) {
    // do stuff with i
}

Submatrices

Sub-matrices can be read from and written to. It's easy to reference a sub-matrix inside of any matrix. A few examples are below.

A(1:4,0:5)

Here rows 1 to 4 (inclusive) and columns 0 to 5 (inclusive) compose the sub-matrix of A. The notation "a:b" indicates an integer set from 'a' to 'b', where 'a' and 'b' must be integers themselves. To specify every row or column do ":" or all rows and columns past a certain 'a' can be referenced with "a:". Finally, you can reference just a number by typeing it, e.g. "a".

A(3:,3)  <-- Rows from 3 to the last row and just column 3
A(:,:)   <-- Every element in A
A(1,2)   <-- The element in A at row=1,col=2

The last example is a special case in that A(1,2) will return a double and not 1x1 matrix. Consider the following:

A(0:2,0:2) = C/B(1,2)

The results of dividing the elements of matrix C by the value of B(1,2) is assigned to the submatrix in A.

A named variable can also be used to reference elements as long as it's an integer.

a = A(i,j)

Inline Matrix

Matrices can be created inline and are defined inside of brackets. The matrix is specified in a row-major format, where a space separates elements in a row and a semi-colon indicates the end of a row.

[5 0 0;0 4.0 0.0 ; 0 0 1]

Defines a 3x3 matrix with 5,4,1 for it's diagonal elements. Visually this looks like:

[ 5 0 0 ]
[ 0 4 0 ]
[ 0 0 1 ]

An inline matrix can be used to concatenate other matrices together.

[ A ; B ; C ]

Will concatenate matrices A, B, and C along their rows. They must have the same number of columns. As you might guess, to concatenate along columns you would

[ A B C ]

and each matrix must have the same number of rows. Inner matrices are also allowed

[ [1 2;2 3] [4;5] ; A ]

which will result in

[ 1 2 4 ]
[ 2 3 5 ]
[   A   ]

Built in Functions and Variables

Constants

pi = Math.PI
e  = Math.E

Functions

eye(N)       Create an identity matrix which is N by N.
eye(A)       Create an identity matrix which is A.numRows by A.numCols
normF(A)     Frobenius normal of the matrix.
det(A)       Determinant of the matrix
inv(A)       Inverse of a matrix
pinv(A)      Pseudo-inverse of a matrix
rref(A)      Reduced row echelon form of A
trace(A)     Trace of the matrix
zeros(r,c)   Matrix full of zeros with r rows and c columns.
ones(r,c)    Matrix full of ones with r rows and c columns.
diag(A)      If a vector then returns a square matrix with diagonal elements filled with vector
diag(A)      If a matrix then it returns the diagonal elements as a column vector
dot(A,B)     Returns the dot product of two vectors as a double.  Does not work on general matrices.
solve(A,B)   Returns the solution X from A*X = B.
kron(A,B)    Kronecker product
abs(A)       Absolute value of A.
max(A)       Element with the largest value in A.
min(A)       Element with the smallest value in A.
pow(a,b)     Scalar power of a to b.  Can also be invoked with "a^b".
sin(a)       Math.sin(a) for scalars only
cos(a)       Math.cos(a) for scalars only
atan(a)      Math.atan(a) for scalars only
atan2(a,b)   Math.atan2(a,b) for scalars only
exp(a)       Math.exp(a) for scalars and element-wise matrices
log(a)       Math.log(a) for scalars and element-wise matrices

Symbols

'*'        multiplication (Matrix-Matrix, Scalar-Matrix, Scalar-Scalar)
'+'        addition (Matrix-Matrix, Scalar-Matrix, Scalar-Scalar)
'-'        subtraction (Matrix-Matrix, Scalar-Matrix, Scalar-Scalar)
'/'        divide (Matrix-Scalar, Scalar-Scalar)
'/'        matrix solve "x=b/A" is equivalent to x=solve(A,b) (Matrix-Matrix)
'^'        Scalar power.  a^b is a to the power of b.
'\'        left-divide.  Same as divide but reversed.  e.g. x=A\b is x=solve(A,b)
'.*'       element-wise multiplication (Matrix-Matrix)
'./'       element-wise division (Matrix-Matrix)
'.^'       element-wise power. (scalar-scalar) (matrix-matrix) (scalar-matrix) (matrix-scalar)
'^'        Scalar power.  a^b is a to the power of b.
'''        matrix transpose
'='        assignment by value (Matrix-Matrix, Scalar-Scalar)

Order of operations: [ ' ] precedes [ ^ .^ ] precedes [ * / .* ./ ] precedes [ + - ]


Specialized Submatrix and Matrix Construction

Extracts a sub-matrix from A with rows 1 to 10 (inclusive) and column 3.
              A(1:10,3)
Extracts a sub-matrix from A with rows 2 to numRows-1 (inclusive) and all the columns.
              A(2:,:)
Will concat A and B along their columns and then concat the result with  C along their rows.
               [A,B;C]
Defines a 3x2 matrix.
           [1 2; 3 4; 4 5]
You can also perform operations inside:
           [[2 3 4]';[4 5 6]']
Will assign B to the sub-matrix in A.
            A(1:3,4:8) = B

Integer Number Sequences

Previous example code has made much use of integer number sequences. There are three different types of integer number sequences 'explicit', 'for', and 'for-range'.

1) Explicit:
   Example: "1 2 4 0"
   Example: "1 2,-7,4"     Commas needed to create negative numbers. Otherwise it will be subtraction.
2) for:
   Example:  "2:10"        Sequence of "2 3 4 5 6 7 8 9 10"
   Example:  "2:2:10"      Sequence of "2 4 6 8 10"
3) for-range:
   Example:  "2:"          Sequence of "2 3 ... max"
   Example:  "2:2:"        Sequence of "2 4 ... max"
4) combined:
   Example:  "1 2 7:10"    Sequence of "1 2 7 8 9 10"

User Defined Functions

It's easy to add your own custom functions too. A custom function implements ManagerFunctions.Input1 or ManagerFunctions.InputN, depending on the number of inputs it takes. It is then added to the ManagerFunctions in Equation by call add(). The output matrix should also be resized.

Example Customizing Equations

User Defined Macros

Macros are used to insert patterns into the code. Consider this example:

eq.process("macro ata( a ) = (a'*a)");
eq.process("b = ata(c)");

The first line defines a macro named "ata" with one parameter 'a'. When compiled the equation in the second line is replaced with "b = (a'*a)". The "(" ")" in the macro isn't strictly necissary in this situation, but is a good practice. Consider the following.

eq.process("b = ata(c)*r");

Will become "b = (a'*a)*r" but with out () it will be "b = a'*a*r" which is not the same thing!

NOTE:In the future macros might be replaced with functions. Macros are harder for the user to debug, but functions are harder for EJML's developer to implement.