Yet Another Matlab Vs. Python Post — But This Time, for Aerospace GNC Engineers

There exists a lively debate on the internet about the goodness of Matlab as compared to Python.

That battle has not passed by my workplace. I work at X, formerly Google[x], as an aerospace guidance, navigation, and control (GNC) engineer on Project Wing. On one hand, we have a stable of talented GNC engineers who have worked on big aerospace projects for Boeing, Rockwell Collins, Blue Origin, and NAVAIR. On the other hand, we have perhaps THE best software engineers in the world, who have worked on (for example) Google Search, YouTube, Android, and Google Maps.

We aerospace engineers learned Matlab in our university days and have kept using it throughout our professional careers. Airbus and Boeing use Matlab to design their autopilots and use the Matlab Coder to produce C code, which flies their commercial aircraft every day. Companies like Raytheon trust Matlab to generate similar code for their missile control systems. The Matlab Aerospace Toolbox, Control System Toolbox, and Simulink are invaluable tools that speed up the controls design and analysis process.

The Google software engineers were kind of appalled at our heavy use of Matlab. In their eyes, it’s not a “real” programming language. “The indexing is wrong.” “You can’t scale it.” “You have to buy somebody else’s software just to run yours.” Most frustrating of all, Google’s amazing array of software development, testing, and review tools is all but incompatible with Matlab, making the codebase that much more difficult to maintain.

First, a defense of our Matlab use: If you hired a carpenter to work on your house, and he brought a funky Japanese hammer, how would you react? Would you chide him that his hammer doesn’t look or work like you’re used to? Or would you assume that as a craftsman, he’s brought the tool that can serve him best for the job you’ve hired him to do? The same with Matlab: most of the control design and analysis tasks that we’ve been hired to do, we know how to do best with Matlab + toolboxes. Yes, we COULD learn Python and re-create the calculators, etc. that power our analyses, but we’re trying to get a product out the door, and the latter choice just doesn’t seem like a wise use of our limited time.

Second, a defense of our software engineers: Matlab isn’t a great programming language; it HAS grown to be a racket, and people in my line of work probably need to move on to a better toolset.

With that in mind, and having used both a good bit, I want to outline some reasons why a student or an organization in the aerospace GNC world would, or would not, switch from Matlab to Python.

Cons of switching to Python from Matlab:

  • Plotting functionality is much less convenient:

    >> x = 0:0.01:pi;
    >> plot(x, sin(x))

    becomes

    In[0]: import numpy as np
    In[1]: import matplotlib.pyplot as plt
    In[2]: x = np.arange(0, np.pi, 0.01)
    In[3]: plt.plot(x, np.sin(x))
    In[4]: plt.show()
    
  • Additionally, Matplotlib plots are far less interactive; there’s no built-in data cursor, and you can’t select plot elements with the mouse.
  • Matrix manipulation is less convenient.
  • If you’ve already got a code base in Matlab that you depend upon, a switch could be quite painful.
  • There is nothing even close to a replacement for Simulink.
  • The documentation for Python modules is not uniform (though the documentation for the biggest modules and the native pieces is very good).
  • Debugging is harder and requires a shift in mentality toward test-driven development.
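
To make the matrix manipulation point concrete, here is a rough side-by-side sketch of a few everyday operations, with the Matlab equivalent in the comments. The matrices and variable names are made up for illustration, and it assumes NumPy on Python 3.5 or later (for the @ operator).

```python
import numpy as np

# Matlab: A = [1 2; 3 4];  b = [5; 6];
A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])

# Matlab: x = A \ b
x = np.linalg.solve(A, b)

# Matlab: C = A * A'   (in NumPy, * is element-wise; @ is the matrix product)
C = A @ A.T

# Matlab: A(1, :)      (NumPy indexing starts at 0, not 1)
first_row = A[0, :]

# Matlab: [A; b']      (stack a row beneath A)
stacked = np.vstack([A, b])

print(x)
print(C)
print(first_row)
print(stacked.shape)
```

None of this is hard, but every line is a little wordier than its Matlab counterpart, and the zero-based indexing and element-wise * will trip up Matlab muscle memory for a while.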

Pros of switching to Python from Matlab:

  • Python is free and open-source, which means
    • The direct monetary cost of switching is (near) zero.
    • The number of libraries, modules, toolboxes, etc. is astoundingly high, also mostly free and open-source. This includes functionality found in many of the most popular Matlab toolboxes such as the controls toolbox and the image processing toolbox.
    • Aerospace engineers who use Python, especially those working outside Big Aerospace, can integrate their work with other teams far more easily, particularly teams that include software engineers.
    • The quality of Python and its toolboxes isn’t dependent upon a single organization (Mathworks) and instead can draw from a very wide community.
  • Python has features of a “more serious” programming language that make it much more appropriate to scale up for production purposes–for example, namespaces and mature unit testing support.
  • Python starts up faster than Matlab, and small programs typically finish sooner, because it loads only the modules you actually import.
  • Wide variety of development environments: command line, PyCharm, Eclipse, Spyder, IPython Notebook/Jupyter, just to name a few.
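
As a small illustration of the namespace and unit testing point, here is a minimal sketch using the standard library's unittest module. The wrap_angle helper and the test names are hypothetical, invented just for this example; they don't come from any real codebase.

```python
import math
import unittest

import numpy as np


def wrap_angle(angle_rad):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle_rad + math.pi) % (2.0 * math.pi) - math.pi


class WrapAngleTest(unittest.TestCase):
    def test_angle_in_range_is_unchanged(self):
        self.assertAlmostEqual(wrap_angle(0.5), 0.5)

    def test_angle_past_pi_wraps_negative(self):
        self.assertAlmostEqual(wrap_angle(1.5 * math.pi), -0.5 * math.pi)

    def test_namespaces_keep_same_named_functions_apart(self):
        # math.sqrt and np.sqrt coexist without shadowing each other.
        self.assertAlmostEqual(math.sqrt(4.0), float(np.sqrt(4.0)))


# Run the tests programmatically; in a real project a test runner would find them.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WrapAngleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

The point is less the angle wrapping than the workflow: the function, its tests, and the libraries it uses each live in their own namespace, and the tests can run automatically on every change.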

Recommendations:

  • University Aerospace Engineering departments should adopt Python as their language of choice for Computer Science 101 courses, and as the language that supports all their undergraduate and graduate coursework.
    • Cost: while the student version of Matlab is cheap, university licenses are not. Python is free, full stop.
    • Transition ability: It’s much easier to learn to program in Python and switch to Matlab than the other way around, and with the rise of the drone economy, many more aerospace jobs will move to startup-like companies that are much more likely to use Python as their lingua franca.
    • The open source model much more closely mirrors the ideals of scholarship, academic freedom, and information sharing.
  • Everyone should use Matlab for what it’s really good at: things like quick & dirty prototyping, concept development, etc. Avoid trying to scale or productionize Matlab code, especially if it’s not explicitly supported by the organization’s version control and testing systems.

Riddler, 14 July 2017

From FiveThirtyEight:

Congratulations! The Acme Axegrinders, which you own, are the regular season champions of the National Squishyball League (NSL). Your team will now play a championship series against the Boondocks Barbarians, which had the second-best regular season record. You feel good about Acme’s chances in the series because Acme won exactly 60 percent of the hundreds of games it played against Boondocks this season. (The NSL has an incredibly long regular season.) The NSL has two special rules for the playoffs:

The owner of the top-seeded team (i.e., you) gets to select the length of the championship series in advance of the first game, so you could decide to play a single game, a best two out of three series, a three out of five series, etc., all the way up to a 50 out of 99 series.
The owner of the winning team gets $1 million minus $10,000 for each of the victories required to win the series, regardless of how many games the series lasts in total. Thus, if the top-seeded team’s owner selects a single-game championship, the winning owner will collect $990,000. If he or she selects a 4 out of 7 series, the winning team’s owner will collect $960,000. The owner of the losing team gets nothing.
Since Acme has a 60 percent chance of winning any individual game against Boondocks, Rule 1 encourages you to opt for a very long series to improve Acme’s chances of winning the series. But Rule 2 means that a long series will mean less winnings for you if Acme does take the series.

How long a series should you select in order to maximize your expected winnings? And how much money do you expect to win?

This is a probability problem, and a good bit easier than I anticipated. Basically, the expected winnings are (one_million_dollars – price_per_win * num_wins_required) * probability_of_getting_required_wins. Calculating the probability of getting the required wins is simply

P(\text{winning n-game series}) = \sum_{i = \frac{n + 1}{2}}^{n} \begin{pmatrix} n \\ n - i \end{pmatrix} p^i (1 - p)^{n - i}

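where p is the probability of winning a single game and \begin{pmatrix} n \\ n - i \end{pmatrix} is the binomial coefficient, read from row n of Pascal’s triangle. (The sum treats the series as if all n games are played, which gives the same probability as stopping once one team has clinched.)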

I wrote this in Python to compute the expected winnings. The answer is: a 25-game series (first to 13 wins) nets expected winnings of $736,222.

#!/usr/bin/ipython

import numpy as np
import matplotlib.pyplot as plt

def probwin(n, p):
    """Probability of winning a majority of n games (n odd), given per-game win probability p."""
    # Build Pascal's triangle up to row n; entry [n, k] is "n choose k".
    pascals_triangle = np.zeros((n + 1, n + 1))
    for i in range(n + 1):
        pascals_triangle[i, 0] = 1
        pascals_triangle[i, i] = 1
        for j in range(1, i):
            pascals_triangle[i, j] = (
                pascals_triangle[i - 1, j - 1] + pascals_triangle[i - 1, j])
    P = 0.
    minwin = (n + 1) // 2  # integer division so range() gets an int
    maxwin = n
    for i in range(minwin, maxwin + 1):
        P += pascals_triangle[n, n - i] * (p**i * (1 - p)**(n - i))

    return P

def expected_income(n, p):
    # Payout in millions of dollars: $1M less $10k per win required.
    return (1 - 0.01 * ((n + 1) // 2)) * probwin(n, p)

if __name__ == "__main__":

    num_games = range(1, 100, 2)

    J = np.zeros(50)

    prob_individual_win = 0.6

    for i in num_games:
        j = (i + 1) // 2
        J[j - 1] = expected_income(i, prob_individual_win)

    # Index k of J holds the (2k + 1)-game series.
    print(np.argmax(J) * 2 + 1)
    print(J[np.argmax(J)])

    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(np.array(range(1, 100, 2)), J)
    plt.show()

And below is the curve of expected winnings using a modified version of this script for win probabilities of 0.4 through 0.9 (I assumed that if we won less than 40% of our regular season games, we probably wouldn’t make the championship!) The takeaway? If you have a 50% chance of winning any particular game, take a shot at the 1-game series–anything could happen. If you have better than 50/50 odds, you probably want to have a long series–something is better than nothing, and you’re favored to win over the long haul. But as you become more dominant, you need to play fewer games to be assured of a win, so you can expect to win more with shorter series (though it’s not worth risking it all on a 1-game series!).

Expected winnings chart
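
For anyone who wants to reproduce the sweep, here is a minimal sketch of one way to do it. This is not the modified script I actually used; the function names are made up for this example, and it assumes Python 3.8 or later so that math.comb can stand in for the hand-built Pascal's triangle.

```python
from math import comb  # Python 3.8+


def series_win_probability(n_games, p_game):
    """Probability of winning a majority of n_games (n_games odd)."""
    wins_needed = (n_games + 1) // 2
    return sum(
        comb(n_games, k) * p_game**k * (1.0 - p_game)**(n_games - k)
        for k in range(wins_needed, n_games + 1)
    )


def expected_winnings(n_games, p_game):
    """Expected payout in dollars: ($1M less $10k per required win) times P(win)."""
    wins_needed = (n_games + 1) // 2
    payout = 1_000_000 - 10_000 * wins_needed
    return payout * series_win_probability(n_games, p_game)


def best_series_length(p_game):
    """Odd series length from 1 to 99 with the highest expected payout."""
    return max(range(1, 100, 2), key=lambda n: expected_winnings(n, p_game))


for p_game in (0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    n_best = best_series_length(p_game)
    print(f"p = {p_game:.1f}: best of {n_best:2d}, "
          f"expected ${expected_winnings(n_best, p_game):,.0f}")
```

Plotting expected_winnings against series length for each probability should give a family of curves like the one in the chart.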

Riddler Express, 14 July 2017

From FiveThirtyEight:

You and your two older siblings are sharing two extra-large pizzas and decide to cut them in an unusual way. You overlap the pizzas so that the crust of one touches the center of the other (and vice versa since they are the same size). You then slice both pizzas around the area of overlap. Two of you will each get one of the crescent-shaped pieces, and the third will get both of the football-shaped cutouts. Which should you choose to get more pizza: one crescent or two footballs?

The answer is the two footballs. I enjoyed this problem, but I don’t enjoy typing out equations in LaTeX if I don’t have to, so below is my paper scratching showing how to do it. The process is basically this: each football is twice the area of a circular segment, and each crescent is the area of a full circle minus the area of one football. The problem is scale invariant, so it doesn’t matter how big the pizzas are, and the central angle of the segment is \frac{2 \pi}{3} because the two centers and either point of intersection form an equilateral triangle, which puts an angle of \frac{\pi}{3} on each side of the line joining the centers.
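
For a quick numerical check of the same steps, here is a short sketch. It assumes a unit radius (fine, since the problem is scale invariant) and uses the standard circular segment area formula, r^2 (theta - sin(theta)) / 2; the variable names are just for illustration.

```python
import math

radius = 1.0                    # any value works; the comparison is scale invariant
theta = 2.0 * math.pi / 3.0     # angle the shared chord subtends at each center

# Area of one circular segment cut off by the shared chord.
segment = 0.5 * radius**2 * (theta - math.sin(theta))

football = 2.0 * segment                        # overlap region of the two circles
crescent = math.pi * radius**2 - football       # one pizza minus its football
two_footballs = 2.0 * football                  # one football cut from each pizza

print(f"one crescent:  {crescent:.4f}")
print(f"two footballs: {two_footballs:.4f}")
print("take the footballs" if two_footballs > crescent else "take a crescent")
```

With a unit radius the two footballs come to roughly 2.46 against roughly 1.91 for a crescent, so the footballs win by a comfortable margin.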