Python Comes with Pandas

Pandas and Altair: Visualizing data with Python

Clicking the download link drops a 25-megabyte CSV file into the download directory. A double click on it starts LibreOffice, and then you have to wait: the import takes over 30 seconds on a not entirely fresh i5. When the giant table finally appears, you impatiently type the formula for the average into the free column on the far right and then scroll for a minute to copy the formula to all rows. It all feels pretty sluggish. Once you realize that the office suite has to render all of that data in its interface, you notice that the program is not actually slow. It's just not made for tables that big.

To make everything go faster, the graphical interface has to give way. Python takes its place, or more precisely the pandas library. Under the hood, pandas uses NumPy and thus fast C code to store arrays efficiently. Instead of in plain NumPy arrays (which have no column names), everything ends up in objects of the type DataFrame. A DataFrame is, so to speak, a worksheet for programmers that provides many of the same functions as Excel. Familiar to NumPy veterans, but rather unusual for Python: the data in DataFrames is strictly typed. When a DataFrame is created, it must therefore be settled right away whether a column holds floating point numbers, integers, strings or times; pandas guesses the types on import unless you specify them. In addition, columns are generally named, which helps you keep an overview in a large table.
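As a minimal sketch of what this looks like in practice, the following reads a tiny in-memory CSV instead of the 25-megabyte download. The column names (`station`, `date`, `temp_min`, `temp_max`) and the values are invented for illustration; the real file will have its own columns. The `dtype` and `parse_dates` arguments fix the column types at import, and the average that took a minute of scrolling in the spreadsheet becomes a single expression.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV; column names and values are invented.
csv_text = """station,date,temp_min,temp_max
Hannover,2020-01-01,1.5,6.0
Hannover,2020-01-02,-0.5,4.5
Berlin,2020-01-01,0.0,5.0
"""

# Fix the column types at import instead of letting pandas guess them.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"station": "string", "temp_min": "float64", "temp_max": "float64"},
    parse_dates=["date"],
)

# One vectorized expression computes the average for every row at once.
df["temp_mean"] = (df["temp_min"] + df["temp_max"]) / 2

# Each column has exactly one type.
print(df.dtypes)
print(df)
```

For a real file, the `io.StringIO(...)` argument would simply be replaced by the path to the CSV.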

The Altair framework draws a pretty diagram from a DataFrame in a handful of lines. Its syntax is shorter and more logical than that of other Python plotting libraries such as Matplotlib. Altair does not reinvent the wheel but relies internally on the JavaScript plotting library Vega (or rather Vega-Lite), which is why the framework can export diagrams as web pages in addition to PNGs and SVGs without additional effort. The easiest way to take advantage of Altair's built-in web affinity is to write all of the Python code in a Jupyter notebook. If you activate the appropriate renderer there with a single line, the diagrams appear directly in the notebook's web interface.