## A Comparison between Blaze and Pandas

Blaze (https://github.com/ContinuumIO/blaze) is a lightweight interface on top of other data or computing engines such as NumPy or SQLite. Blaze itself doesn’t do any computation; it provides a Pandas-like syntax for interacting with the back-end data.

Below is an example showing how Blaze leverages the computing power of SQLite (https://www.sqlite.org) and outperforms Pandas on a simple filter-and-aggregate task against a SQLite table with about 7 million rows. Since Blaze doesn’t have to load the whole table into memory as Pandas does, the run is much faster: user CPU time drops from about 60 seconds to about 11 seconds, and wall-clock time from about 67 seconds to about 22 seconds.

Pandas

```
import pandas as pda
import pandas.io.sql as psql
import sqlite3 as sql

con = sql.connect('/home/liuwensui/Documents/data/flights.db')
# Pull the entire table into memory as a DataFrame
ds1 = psql.read_sql('select * from tbl2008', con)
ds2 = ds1[ds1.DayOfWeek > 1]
ds3 = ds2.groupby('DayOfWeek', as_index = False)
# Named aggregation (pandas 0.25+); the dict-renaming form of agg() was removed in pandas 1.0
ds4 = ds3.agg(avg_AirTime = ('AirTime', 'mean'))
print(ds4)

#   DayOfWeek  avg_AirTime
#0          2   103.214930
#1          3   103.058508
#2          4   103.467138
#3          5   103.557539
#4          6   107.400631
#5          7   104.864885
#
#real 1m7.241s
#user 1m0.403s
#sys  0m5.278s
```

Blaze

```
import blaze as blz
import pandas as pda

ds1 = blz.Data('sqlite:////home/liuwensui/Documents/data/flights.db::tbl2008')
ds2 = ds1[ds1.DayOfWeek > 1]
ds3 = blz.by(ds2.DayOfWeek, avg_AirTime = ds2.AirTime.mean())
ds4 = blz.into(pda.DataFrame, ds3)
print(ds4)

#   DayOfWeek  avg_AirTime
#0          2   103.214930
#1          3   103.058508
#2          4   103.467138
#3          5   103.557539
#4          6   107.400631
#5          7   104.864885
#
#real 0m21.658s
#user 0m10.727s
#sys  0m1.167s
```
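
To make the source of the speed-up more concrete, here is a minimal sketch of the same filter-and-aggregate step pushed down to SQLite by hand, using only the standard-library `sqlite3` module. The in-memory table and its six rows are made up for illustration, and the SQL is an assumption about the kind of query a Blaze expression like the one above would translate to, not the exact statement Blaze emits.

```python
import sqlite3

# Hypothetical stand-in for the flights table; the values are made up.
rows = [
    (1, 90.0),
    (2, 100.0),
    (2, 110.0),
    (3, 95.0),
    (3, 105.0),
    (3, 100.0),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl2008 (DayOfWeek INTEGER, AirTime REAL)")
con.executemany("INSERT INTO tbl2008 VALUES (?, ?)", rows)

# Filtering and aggregation both run inside SQLite; only the grouped
# result (one row per remaining day) is returned to Python.
query = """
    SELECT DayOfWeek, AVG(AirTime) AS avg_AirTime
    FROM tbl2008
    WHERE DayOfWeek > 1
    GROUP BY DayOfWeek
    ORDER BY DayOfWeek
"""
result = con.execute(query).fetchall()
con.close()

for day, avg_air_time in result:
    print(day, avg_air_time)
```

In the Pandas version every row crosses from SQLite into Python before any filtering happens, whereas here only one row per group does, which is consistent with the lower CPU time reported for Blaze.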