How Does *kf Generate Filters?

At the beginning of this year, An Uncommon Lab announced *kf (pronounced "star kay eff"), a tool for designing, analyzing, prototyping, and finalizing detailed Kalman filters and other recursive state estimation algorithms in MATLAB. So far, *kf's been useful in prototyping quick variations on spacecraft attitude determination algorithms and generating codegen-friendly Kalman filters for UAVs (with one exciting project scheduled to fly soon). But how does it work?

The Engine

Fundamentally, the engine behind *kf works with symbolic snippets of pseudocode ("expressions"), with attached assumptions (e.g., "This input matrix will always be diagonal." or "This calculation implies that we're using the optimal Kalman gain.").

When the user specifies what outputs she needs from the algorithm-to-be, what architecture she'd like to use, what the names of her models are, and what assumptions she can make, the engine gathers all of the pseudocode snippets for the architecture. It starts by looking through them to find the best (fastest/most robust) one that provides the requested output and that satisfies her assumptions. For example, a code snippet might look like this:

% Calculate the innovation covariance at sample k for 
% full H and R (most general).
% 
% Assumes: We're not using consider covariance.
% 
S_k = H_k * P_kkm1 * H_k.' + R_k;

It makes a little note that it will need this expression. Then, the engine determines what the "inputs" to the expression are -- all of the variables it needs to have first in order to perform the computation. In the above, that's H_k, P_kkm1, and R_k. Now, it looks to see if it already has expressions to provide these variables, and if not, searches for expressions that both provide them and satisfy assumptions. Sometimes, there aren't expressions to provide for a variable (such as the current measurement vector), and so these are noted to be inputs to the algorithm.

Proceeding in this way, the engine builds up a directed graph, where the expressions are nodes and the variables are the links between them. The engine can then operate on this graph to determine, e.g., the order in which the expressions should appear in code. (There are actually many more things that it does here, and we'll come back to this.)

Finally, the engine is ready to write some code. Each of those "variables" in the expressions is not intended to be a real variable; rather, variables represent ideas, and the engine determines how to store ideas as real variables in memory. For instance, there's usually no need to have separate variables for P_km1, P_kkm1, and P_k, all of which stand for the covariance matrix at various points in the algorithm. Instead, using and reusing a single P matrix in the real code would suffice — surely code written by hand would do it this way. Similarly, perhaps that R_k matrix from above is actually constant throughout the life of the algorithm. In that case, it may live in a "constants" structure. All told, when the engine gets around to turning this expression into actual code, it might look like this:

% Calculate the innovation covariance at sample k for 
% full H and R (most general).
S_k = H_k * P * H_k.' + constants.R_k;

Or, depending on options, it might even generate it as:

% Calculate the innovation covariance at sample k for 
% full H and R (most general).
S_k = symmetric_multiply(H_k, P);
S_k = symmetric_add(S_k, constants.R_k);

Or it could generate in a different target language too, like maybe C.

Sometimes, one needs to keep P_kkm1 separate from the other two forms, and the algorithm should see this and determine that P_kkm1 should have its own variable, and that P_km1 and P_k can be stored in P.

This process continues from the inputs to the outputs, using up all of the selected expressions and generating the necessary variables along the way, and then the user has a nice algorithm, custom tailored to her own models, preferences, and assumptions. Not bad.

The Harder Stuff

Of course, now we come back to those "many more things" that we skipped over. What about if/else statements? What about loops? How do those show up as nodes in this graph of expressions? This section could get long rather quickly, so let's stay pretty high level.

Consider this: if the time step for the filter-to-be isn't constant, then we might use an if/else like so to propagate to the current time:

% If time has elapsed since the last update...
%
% Assumes: Time step is not constant.
% 
if t_k > t_km1
    P_kkm1 = F_km1 * P_km1 * F_km1.' + Q_km1;
else
    P_kkm1 = P_km1; % The propagated form is just the prior.
end

This entire "frame" is really a type of expression. It currently outputs only P_kkm1, and its inputs are pretty clear too. Let's say that this gets selected, so now it needs to find an expression for F_km1. It finds this:

% Calculate the Jacobian using the user-provided function.
F_km1 = users_F_fcn(x, y, z);

Later on, once everything has been selected, we'll see that F_km1 is only necessary inside one of the blocks of the if/else. We don't want to spend time calculating F_km1 when it isn't used, so we determine if the expression can "move into" the if block like so (ignoring comments):

if t_k > t_km1
    F = users_F_fcn(x, y, z);
    P = F * P * F.' + Q_km1;
else
    P = P; % (More on this below.)
end

But can it do that? Well, that depends on the inputs to F_km1 expression. It may be that the expression depends on variables that, for whatever reason, this if/else expression won't be able to see. In that case, though inefficient, F_km1 just can't possibly "move in" and must be calculated wherever those other expressions live. On the other hand, maybe it can, and so we get to save some computation time.

So, the engine treats structures like if/else blocks and loops as expressions with internal scope (that is, with internal expressions), and its outputs are the outputs of its internal expressions. Whereas before we were dealing with a "flat" problem, we're now dealing with a problem inside a problem, etc. After building up the set of expressions that will be necessary, it determines how to organize these, such as by placing some expressions inside of other expressions, keeping constant expressions outside of loops, etc.

Further, notice that, once we've gone through the process of "expressing" the ideas as actual variables, the else block above becomes completely worthless (P = P; is not useful code). The engine watches for stuff like this and eliminates it. That would leave the whole else block empty, so it eliminates that too. The real code would then be:

if t_k > t_km1
    F = users_F_fcn(x, y, z);
    P = F * P * F.' + Q_km1;
end

Clearly, this is the best code for this example.

Ok, to summarize, the process looks like this:

  1. Interpret the assumptions.
  2. Add the user's desired outputs to the list of things we need to provide.
  3. For each thing we need:
    1. see if it's an input or has already been provided by something else, and bail if so;
    2. find an expression providing it that is consistent with the assumptions;
    3. and add the selected expression's inputs to the list of things we need.
  4. Use the directed graph to resolve the scope for each expression according to the rules for if/else blocks, loops, subfunctions, etc. This requires a lot of recursion and is actually the most expensive part of this process by far.
  5. Express the pseudocode as real code with actual variables:
    1. Start with the known inputs.
    2. Find the next thing to code.
      1. Is there an expression that only needs the currently available variables? (Note that this is recursive for if/else blocks and loops.)
      2. Will it overwrite a memory space that contains an idea that's necessary later? If so, skip it for now.
      3. If nothing could be found above, determine which variable(s) will need new memory spaces, and try again.
    3. Map the expression to memory spaces, eliminate worthless expressions and blocks, and finally write the code for it.
    4. Continue until all expressions have been used.

Clearly, adding the scope-within-a-scope turns this into a big problem that requires a lot of consideration. This is why good embedded programmers have jobs, and why *kf took a long time to produce.

New and Prior Art

The ideas of representing algorithms as directed graphs, operating on those graphs for efficiency, and managing variables (and memory in general) are not new, of course. Optimizers for compilers have done this for a long time, for instance.

Further, the idea of representing snippets of pseudocode and configuring those into real algorithms — even Kalman filters — is also not new. In fact, I was stunned when my friend pointed me to this paper by Dr. Johann Schumann about Autofilter, which uses similar ideas to generate code. It's also interesting that we generate code so differently, but with similar motivations!

What seems to be new in *kf is the idea that individual expressions have assumptions, and that the engine is free to generate algorithms using whichever expressions it can link together to get the desired outputs while satisfying those assumptions. One does not need to specify a "template" for how the algorithm should go, and so *kf can do far more than generating the expected forms of various filters. For instance, what's the template for an iterated extended Kalman filter with consider covariance and UD representation of the covariance matrix? How does that change if I opt instead for an information matrix? I could come up with these templates by hand, but I don't need to; *kf will happily produce them for me.

What the *kf engine fundamentally enables is the automatic creation of algorithms based on a user's models and assumptions (and nothing else). The user never needs to provide any of the statistical code forming the core of the filter. Further, since the engine knows about the user's models, it can go through a similar process to build up a simulation of the user's problem from an additional set of "simulation" snippets (e.g., propagating and drawing random noise values).

What Else?

Building up the engine was a lot of fun and a lot of work. I'd love to spend some time opening up this engine for use on other problems, even providing a general-purpose open-source package, but I need to know where it can make a real difference! It's ultimately useful when there are many possibilities for assumptions to interact, and when the generated code needs to be efficient (such as for microcontrollers or code that will run millions of times). Otherwise, one could just write a "regular", bloated library. And I haven't been able to determine where this would apply. Kalman filters live at an interesting intersection between embedded electronics (such as running on puny processors on UAVs) and diverse needs (it seems like every filter is unique in some way due to all of the possible permutations). What else would benefit from this level of algorithm generation? Email/comment if you have ideas!

MATLAB Compiler on OS X

The MathWorks has a great product for MATLAB® called MATLAB® Compiler™. It wraps up just about any program written MATLAB in an executable so others can run it without needing MATLAB or any kind of license. It's an expensive tool, but as a consultant who builds many math-heavy applications for clients who want to "just use" the tools I've created, it's been a big payoff for me.

However, when it comes to building for OS X, there are some bugs and quirks. I couldn't find any help on this online, so let's blast through them here. Note that this is all work for the developer; the user of the "compiled" application doesn't have to do anything. For the user, it will just work.

Continue reading

A Simulation Engine

I write a lot of simulations. For me, it's critical that I can pump out a solid simulation quickly and know that it's accurate, flexible, and fast. The best simulation engine out there is probably Simulink from The MathWorks, though others are doing interesting things too, such as SystemModeler from Wolfram. However, these are often geared towards building large, complex simulations in a graphical way. That's not always what I want. Let's consider:

Suppose I were to create a simulation of a satellite's orbit, attitude, and control. I'd probably pop open Simulink and start drawing blocks, especially if I needed the particular strengths of that platform. But now suppose I needed to simulate n interacting smallsats in a formation, but n was an input. Woah, suddenly that's hard to do in a platform where you draw each system as a little box. I would have to manually draw boxes for the 2-sat config, the 3-sat config, 4-sat, .... You can't just spontaneously reconfigure everything like you can with raw code. Plus, I often need the systems to function both in the simulator and outside of it in ways that just aren't possible with these platforms. Finally, if I need to be able to run the simulation tons of times, e.g. to train a controller or do post-flight analysis, these big simulation platforms kind of get in the way. They have serious strengths, and I love using them to their full potential, but when I need something else, I don't want to try to hack it out in a giant platform. I want something lean and easy and fast.

Continue reading