The Great Refactoring
In version 0.31 a massive refactoring is going to be performed. This was brought about by the introduction of 32-bit data types and sparse matrices. The old naming scheme was inconsistent throughout the code base and didn't provide an easy way to describe all these different data types and the operations performed on them. Fortunately, the SimpleMatrix and Equations APIs have been left unchanged. Under the hood SimpleMatrix has changed a little bit: it can now handle 32-bit and 64-bit floats, depending on what the user passes in.
Summary of Changes:
- Introduction of 32-bit float matrices
- Introduction of sparse matrices
- Names of matrices
- Names of classes in procedural API
- Packages and location of some classes
- SimpleMatrix is starting to be able to support multiple matrix types internally
What has NOT changed:
- SimpleMatrix API
- Equations API
To make this transition easier, a Python script is being/has been written which will recursively perform the refactoring on older code written for version 0.30.
Feedback is Welcome! It's not too late to change what's listed here or to introduce other changes that can improve usability.
Why These Names?
There are a few reasons for the naming scheme below.
- The new 32-bit float code is auto generated from the 64-bit double code. The translation is done through a mostly dumb search and replace.
- This is actually a difficult requirement.
- Need to know which files should be translated by their file name only.
- Client libraries need to be able to auto generate their code from EJML without difficulty, e.g. can use Double as a keyword.
- The procedural interface follows the philosophy that everything is strongly typed and provides as much control to the user as possible
- Functions should take in a specific data type and not a general purpose one
- SimpleMatrix is provided for those who don't want to deal with this complexity
- Attempting to keep class names for commonly used data structure short to enable concise code
Thoughts and comments about alternative approaches to meet these goals are welcome!
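The "mostly dumb search and replace" requirement above can be illustrated with a short sketch. This is not EJML's actual generator: the rules, the `translate_64_to_32` and `should_translate` helpers, and the assumption that names end in a suffix such as `_F64` or `_R64` (as in Proposal 1 below) are all hypothetical.

```python
import re

# Hypothetical replacement rules; the real generator's rules are not
# described on this page.
RULES = [
    # _F64 -> _F32, _R64 -> _R32, _CR64 -> _CR32, ...
    (re.compile(r"_([A-Z]{1,2})64\b"), r"_\g<1>32"),
    # Java primitive and boxed types
    (re.compile(r"\bdouble\b"), "float"),
    (re.compile(r"\bDouble\b"), "Float"),
]


def translate_64_to_32(source: str) -> str:
    """Apply every rule in order to the text of a 64-bit source file."""
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source


def should_translate(file_name: str) -> bool:
    """Decide from the file name alone whether a file holds 64-bit code."""
    return re.search(r"_[A-Z]{1,2}64\.java$", file_name) is not None
```

Because the type and precision are encoded in a fixed position at the end of each name, a single regular expression can both select the files to translate and rewrite the identifiers inside them.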
Proposal 1
Matrix Name | Op-Suffix | Data Type |
---|---|---|
DMatrixRow_F64 | _R64 | Dense Row-Major real double |
DMatrixRow_C64 | _CR64 | Dense Row-Major complex double |
DMatrixRow_C32 | _CR32 | Dense Row-Major complex float |
DMatrixBlock_F64 | _B64 | Dense Block real double |
DMatrixFixed3x3_F64 | | Dense Fixed Sized 3x3 real double |
SMatrixCsc_F64 | _O64 | Sparse Compressed Column real double |
SMatrixCsc_C64 | _CO64 | |
SMatrixTriplet_F64 | _T64 | Sparse Triplet real double |
SMatrixTriplet_C64 | _CT64 | |
- CSC = Compressed Sparse Column (typical name) aka Compact Column
Matrix Class Names
Matrix class names strictly follow this pattern:
<S/D>Matrix<Data Structure>_<C/F><32/64>
- <S/D> The first character indicates if it's 'S' for sparse or 'D' for dense.
- <Data Structure> This section specifies how the matrix is encoded internally.
- <C/F> Whether the matrix holds complex ('C') or real ('F') numbers. Just accept that 'F' is for real numbers.
- <32/64> If the matrix uses 32-bit float or 64-bit double
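Since the pattern is strict, a name can be decoded mechanically. The sketch below parses a Proposal 1 matrix class name into its four parts; `MatrixName` and `parse_matrix_name` are hypothetical helpers written for illustration and are not part of EJML.

```python
import re
from typing import NamedTuple


class MatrixName(NamedTuple):
    sparse: bool       # True for 'S', False for 'D'
    structure: str     # e.g. "Row", "Block", "Csc", "Triplet"
    is_complex: bool   # True for 'C', False for 'F'
    bits: int          # 32 or 64


# <S/D>Matrix<Data Structure>_<C/F><32/64>
_PATTERN = re.compile(
    r"^(?P<kind>[SD])Matrix(?P<structure>[A-Za-z0-9]+)_(?P<number>[CF])(?P<bits>32|64)$"
)


def parse_matrix_name(name: str) -> MatrixName:
    """Split a matrix class name into the parts described above."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"not a valid matrix class name: {name!r}")
    return MatrixName(
        sparse=m["kind"] == "S",
        structure=m["structure"],
        is_complex=m["number"] == "C",
        bits=int(m["bits"]),
    )
```

For example, `parse_matrix_name("SMatrixCsc_C32")` yields a sparse, complex, 32-bit matrix with structure `"Csc"`, while an old-style name such as `DenseMatrix64F` raises `ValueError`.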
Operation Class Names
<Class Name>_<Type><32/64>
- <Class Name> This has not changed. CommonOps and NormOps are two examples
- <Type> A single character is used to indicate the internal data structure. 'C' is the first character if complex. Otherwise assume real.
- <32/64> If the matrix uses 32-bit float or 64-bit double
Matrix Data Structures
Character | Type |
---|---|
R | dense row-major |
B | dense row-major block |
N/A | Classes for fixed-size matrices follow their own naming scheme |
T | sparse triplet |
O | sparse compact column |
Examples:
- CommonOps_R64 for dense row major
- CommonOps_O64 for sparse compact column
- CommonOps_CR64 for complex dense row major
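The rule for operation class names can be written as a small function. This is only an illustrative sketch: `STRUCTURE_CODES` copies the character table above, and `op_class_name` is a hypothetical helper, not something EJML provides.

```python
# Character codes taken from the "Matrix Data Structures" table above.
STRUCTURE_CODES = {
    "dense row-major": "R",
    "dense row-major block": "B",
    "sparse triplet": "T",
    "sparse compact column": "O",
}


def op_class_name(class_name: str, structure: str, bits: int,
                  is_complex: bool = False) -> str:
    """Build <Class Name>_<Type><32/64>, with a leading 'C' if complex."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    code = STRUCTURE_CODES[structure]
    prefix = "C" if is_complex else ""
    return f"{class_name}_{prefix}{code}{bits}"
```

Calling `op_class_name("CommonOps", "dense row-major", 64, is_complex=True)` returns `"CommonOps_CR64"`, matching the third example in the list above.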
Historical
- DenseMatrix64F -> DMatrixRow_F64
- CDenseMatrix64F -> DMatrixRow_C64
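A recursive rename over older code could look roughly like the following. This is a minimal sketch and not the actual migration script: it only knows the two renames listed above, assumes the Proposal 1 names are the ones adopted, and `rename_in_text` and `rename_in_tree` are hypothetical names.

```python
import re
from pathlib import Path

# Only the two historical renames listed above; a real script would need
# the full mapping, including moved packages.
RENAMES = {
    "CDenseMatrix64F": "DMatrixRow_C64",
    "DenseMatrix64F": "DMatrixRow_F64",
}

# Word boundaries keep "DenseMatrix64F" from matching inside "CDenseMatrix64F".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in sorted(RENAMES, key=len, reverse=True)) + r")\b"
)


def rename_in_text(text: str) -> str:
    """Replace every old class name in a piece of source text."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)


def rename_in_tree(root: str) -> list:
    """Rewrite all .java files under root in place; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.java")):
        old = path.read_text(encoding="utf-8")
        new = rename_in_text(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed.append(path)
    return changed
```

Since the files are rewritten in place, it is safest to run something like this only on code that is under version control.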
Proposal 2
Matrix Name | Op-Suffix | Data Type |
---|---|---|
MatrixDense_64 | _DR64 | Dense Row-Major real double |
MatrixDense_64C | _DR64C | Dense Row-Major complex double |
MatrixDense_32C | _DR32C | Dense Row-Major complex float |
MatrixBlock_64 | _DB64 | Dense Block Row-Major real double |
MatrixBlock_64C | _DB64C | |
Matrix3x3_64 | | Dense Fixed Sized 3x3 real double |
Matrix3_64 | | Dense Fixed Sized 3 real double |
MatrixSparseCsc_64 | _SC64 | Sparse Compressed Column real double |
MatrixSparseCsc_64C | _SC64C | |
MatrixSparseTriplet_64 | _ST64 | Sparse Triplet real double |
MatrixSparseTriplet_64C | _ST64C | |
MatrixSparseSkyline_64 | _SK64 | Sparse Skyline real double (future) |
Matrix Class Names
All matrices start with "Matrix" in their name followed by their structure and number format. 'Dense' is an exception since it is the most commonly used matrix type. It should be a more recognizable name.
Operation Class Names
The only difference between this and the other proposal is which suffixes are used. The suffix pattern is:
_<D or S><Structure Letter><bits><C for complex>
Prefix letter is D for dense and S for sparse. Each matrix data structure will be assigned a letter for its operations. If the matrix type is complex then a C will be appended to the end.
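To make the suffix rule concrete, the sketch below derives the operation suffix from a Proposal 2 matrix name. The letter assignments are read off the Proposal 2 table above; `STRUCTURES` and `op_suffix` are hypothetical helpers for illustration only.

```python
import re

# (dense/sparse prefix, structure letter) per matrix family, read off the
# Proposal 2 table above.
STRUCTURES = {
    "MatrixDense": ("D", "R"),
    "MatrixBlock": ("D", "B"),
    "MatrixSparseCsc": ("S", "C"),
    "MatrixSparseTriplet": ("S", "T"),
    "MatrixSparseSkyline": ("S", "K"),
}

_NUMBER = re.compile(r"^(32|64)(C?)$")


def op_suffix(matrix_name: str) -> str:
    """Return _<D or S><Structure Letter><bits><C for complex>."""
    base, _, number = matrix_name.partition("_")
    m = _NUMBER.match(number)
    if base not in STRUCTURES or m is None:
        raise ValueError(f"no operation suffix defined for {matrix_name!r}")
    prefix, letter = STRUCTURES[base]
    bits, complex_flag = m.groups()
    return f"_{prefix}{letter}{bits}{complex_flag}"
```

For instance, `op_suffix("MatrixSparseTriplet_64C")` gives `"_ST64C"`. Fixed-size matrices such as `Matrix3x3_64` have no suffix in the table, so the function rejects them.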
Proposal 3
This is just a list of other ideas
MDRD (Matrix-Dense-RowMajor-Double)
- MatrixDense64
- MatrixDense_64
- DMatrixRow_64
- DMatrixR_64
- Matrix64rm
- CMatrix64rm
- MatrixDRM (D = double, RM = row-major)
- MatrixZRM (Z = double complex)
- CMatrix64rm
- MatrixSR_64
- MatrixDense_64C
- MatrixDoubleDense
- MatrixDoubleComplexDense
- MatrixDoubleDenseSingleRow
- MatrixDoubleComplexDenseSingleRow

MSCD (Matrix-Sparse-CSC-Double)
- MatrixCsc64
- MatrixCsc_64
- SMatrixCsc_64
- MatrixS64cc
- CMatrixS64cc
- MatrixSparseDCC (D = double, CC = compact column)
- MatrixSparseZCC (Z = double complex)
- MatrixSparseCsc_64
- MatrixSparseCSC_64
- MatrixDoubleSparseCsc
- MatrixDoubleComplexSparseCsc

Notes:
- Single refers to a single array being used to store the data, as opposed to multiple arrays, e.g. double[] as opposed to double[][]
- LAPACK: F for single real float, D for double real, C for single complex, and Z for double complex
Packages
Use your IDE to figure these out if the automated script misses them.