Sparse matrices and projection calculations

If you have ever worked on high-performance 3D applications, you know that every cycle counts. One of the problems programmers try to solve is reducing computation time when dealing with matrices and vectors, especially when the calculations run many times per frame. Here's a little trick that can save you some memory and cycles when computing projection. Consider a typical projection matrix P:

[ A, 0, 0, 0 ]
[ 0, B, 0, 0 ]
[ 0, 0, C, D ]
[ 0, 0, E, 0 ]

Projection is essentially just a linear transformation in homogeneous coordinate space. If vertex coordinates are already expressed in view space as a 4-element vector (Vv):

Vv = [vx, vy, vz, 1]

then getting clip space homogeneous coordinates (Vh) is simply:

Vh = P * Vv = [ a, b, c, w ]    (Vv taken as a column vector, which is the convention the shader code further down follows)

We then divide a, b and c by w to get the regular 3D coordinates. There’s a small optimization that can be done here, provided that:

1. Most elements of the projection matrix are 0 (this is called a sparse matrix), as in P above. If this is the case, performing the projection as a full matrix multiplication is unnecessary and creates computation and memory overhead.
2. The transformation/modelview matrix has its last row equal to [ 0, 0, 0, 1 ].
3. The object’s transformation/modelview matrix is non-projective (we wouldn’t have to bother otherwise!).

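If condition 1 is met, we only need 4 elements to determine projection instead of a full matrix. Homogeneous coordinates can be scaled by any positive constant without changing the result of the divide by w (a negative constant would flip the sign of w and break clipping), so we can divide every element by D, as long as D is positive: [ A/D, B/D, C/D, E/D ]. The D element itself becomes 1, which is why the shader below simply adds 1.0 to z.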
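The scaling step can be written out explicitly. This is only a sketch of the algebra, assuming Vv is treated as a column vector and multiplied as P·Vv (the convention the shader below appears to follow) and that D > 0:

```latex
\[
P\,V_v =
\begin{bmatrix}
A & 0 & 0 & 0 \\
0 & B & 0 & 0 \\
0 & 0 & C & D \\
0 & 0 & E & 0
\end{bmatrix}
\begin{bmatrix} v_x \\ v_y \\ v_z \\ 1 \end{bmatrix}
=
\begin{bmatrix} A\,v_x \\ B\,v_y \\ C\,v_z + D \\ E\,v_z \end{bmatrix}
\]
\[
\frac{1}{D}\,P\,V_v =
\begin{bmatrix}
\tfrac{A}{D}\,v_x \\[2pt]
\tfrac{B}{D}\,v_y \\[2pt]
\tfrac{C}{D}\,v_z + 1 \\[2pt]
\tfrac{E}{D}\,v_z
\end{bmatrix}
=
\begin{bmatrix} v_x \\ v_y \\ v_z \\ v_z \end{bmatrix}
\odot
\begin{bmatrix} A/D \\ B/D \\ C/D \\ E/D \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]
```

Here the symbol ⊙ denotes a component-wise product. Both vectors give the same coordinates after the divide by w, because they differ only by the positive factor 1/D.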

Condition 2 implies that for a proper model -> view vertex transformation we only need three 4-element vectors instead of a 4×4 modelview matrix. We can then get the view space coordinates of a vertex by performing dot products. Below is a simplified vertex shader showing how the optimization can be applied (in the actual codebase I used it to reduce the size of bone animation matrices, so the full code is slightly more complex):

#version 100
attribute highp vec3 inVertexPos;
uniform   highp vec4 inModelView[3]; // first three rows of modelview matrix
uniform   highp vec4 inProjection;   // projection matrix elements expressed as: [A/D, B/D, C/D, E/D]

void main()
{
    // transform from model space to view space
    highp vec3 viewSpacePos;
    viewSpacePos.x = dot(inModelView[0], vec4(inVertexPos, 1.0));
    viewSpacePos.y = dot(inModelView[1], vec4(inVertexPos, 1.0));
    viewSpacePos.z = dot(inModelView[2], vec4(inVertexPos, 1.0)); 

    // calculations using view space vertex position
    // (...)

    // transform from view space to clip space
    gl_Position = viewSpacePos.xyzz * inProjection;
    gl_Position.z += 1.0;
}
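To sanity-check the shader on the CPU, here is a minimal Python sketch that packs the two uniforms and compares the result with the full 4×4 path. All helper names (`pack_modelview`, `pack_projection`, `shader_clip`) and the numeric values are hypothetical, chosen only for illustration, and the sketch assumes the column-vector convention and D > 0.

```python
# Hypothetical host-side helpers; none of these names come from the original code.

def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def pack_modelview(mv):
    """Keep the first three rows; only valid if the last row is [0, 0, 0, 1]."""
    if [float(x) for x in mv[3]] != [0.0, 0.0, 0.0, 1.0]:
        raise ValueError("modelview matrix is projective; cannot drop its last row")
    return [list(row) for row in mv[:3]]

def pack_projection(p):
    """Return [A/D, B/D, C/D, E/D] for a sparse projection matrix with D > 0."""
    a, b, c, d, e = p[0][0], p[1][1], p[2][2], p[2][3], p[3][2]
    if d <= 0.0:
        raise ValueError("D must be positive for this variant")
    return [a / d, b / d, c / d, e / d]

def shader_clip(mv_rows, proj4, pos):
    """Mirror the vertex shader: three dot products, then a swizzled multiply."""
    v = [pos[0], pos[1], pos[2], 1.0]
    x, y, z = (sum(row[i] * v[i] for i in range(4)) for row in mv_rows)
    clip = [x * proj4[0], y * proj4[1], z * proj4[2], z * proj4[3]]  # xyzz * inProjection
    clip[2] += 1.0                                                   # the D/D term
    return clip

def perspective_divide(clip):
    """Divide x, y and z by w."""
    return [clip[i] / clip[3] for i in range(3)]

# Arbitrary example values (assumptions, not a realistic frustum).
P = [[1.5, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 1.2, 2.0],
     [0.0, 0.0, 1.0, 0.0]]
MV = [[0.0, -1.0, 0.0, 2.0],   # 90-degree rotation about z plus a translation
      [1.0,  0.0, 0.0, 1.0],
      [0.0,  0.0, 1.0, 5.0],
      [0.0,  0.0, 0.0, 1.0]]
pos = [1.0, 3.0, 3.0]

# Reference path: two full matrix-vector products.
ndc_full = perspective_divide(mat_vec(P, mat_vec(MV, pos + [1.0])))
# Optimized path: three vec4 rows plus one vec4 of projection elements.
ndc_packed = perspective_divide(
    shader_clip(pack_modelview(MV), pack_projection(P), pos))
```

Both paths should agree up to floating-point rounding, while the packed version needs 16 floats of uniform data instead of 32.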

Note that depending on how you define your projection matrix in your codebase, it might be necessary to flip signs. For example, when D is negative we divide by -D instead, so that the constant stays positive, i.e.:

// (...)

uniform   highp vec4 inProjection; // projection matrix elements expressed as: [-A/D, -B/D, -C/D, -E/D]

void main()
{
    // (...)
    gl_Position.z -= 1.0;
}
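As an illustration of when the flipped variant applies, the sketch below uses the common OpenGL-style perspective matrix, where E = -1 and D = 2·far·near / (near - far) is negative. The function names and the sample frustum are assumptions made for this example, not taken from the original code.

```python
import math

def perspective_elements(fovy_deg, aspect, near, far):
    """Non-zero elements (A, B, C, D, E) of an OpenGL-style perspective matrix."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return (f / aspect,
            f,
            (far + near) / (near - far),
            2.0 * far * near / (near - far),
            -1.0)

def pack_projection_signed(A, B, C, D, E):
    """Scale by 1/|D| so the factor stays positive; return (vec4, z offset)."""
    if D == 0.0:
        raise ValueError("D must be non-zero")
    scale = 1.0 / abs(D)
    z_offset = 1.0 if D > 0.0 else -1.0   # D * scale: +1.0 or -1.0
    return [A * scale, B * scale, C * scale, E * scale], z_offset

# Hypothetical frustum and a view-space point in front of the camera (negative z).
A, B, C, D, E = perspective_elements(60.0, 16.0 / 9.0, 0.1, 100.0)
x, y, z = 0.5, -0.25, -3.0

# Reference: full sparse matrix applied to [x, y, z, 1].
full_clip = [A * x, B * y, C * z + D, E * z]

# Packed variant, mirroring the sign-flipped shader.
proj4, z_offset = pack_projection_signed(A, B, C, D, E)
packed_clip = [x * proj4[0], y * proj4[1], z * proj4[2] + z_offset, z * proj4[3]]

ndc_full = [c / full_clip[3] for c in full_clip[:3]]
ndc_packed = [c / packed_clip[3] for c in packed_clip[:3]]
```

With this matrix the offset comes out as -1.0, matching `gl_Position.z -= 1.0`, and w stays positive for points in front of the camera, so clipping behaves the same as with the full matrix.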

This neat trick, as well as many others, comes from the PowerVR Performance Recommendations, which is an excellent resource for mobile graphics developers.