Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

Chain Operations of Pandas DataFrame

Chain operations are an innovative feature of the dplyr package in R (https://statcompute.wordpress.com/2014/07/28/chain-operations-an-interesting-feature-in-dplyr-package). They make data manipulation code more compact and expressive.

In Python, the pandas-ply package brings the same capability to pandas DataFrame objects, as shown below.

In [1]: import pandas as pd

In [2]: df = pd.read_csv('/home/liuwensui/Downloads/2008.csv')

In [3]: ### STANDARD PANDAS OPERATIONS ###

In [4]: grp = df[df.Month < 6].groupby(["Month", "DayOfWeek"])

In [5]: df1 = pd.DataFrame()

In [6]: df1["ArrMedian"] = grp.ArrTime.median()

In [7]: df1["DepMean"] = grp.DepTime.mean()

In [8]: df2 = df1[(df1.ArrMedian > 1515) & (df1.DepMean > 1350)]

In [9]: print(df2)
                 ArrMedian      DepMean
Month DayOfWeek                        
1     7               1545  1375.156487
2     5               1522  1352.657670
      7               1538  1375.605698
3     7               1542  1377.128506
4     7               1536  1366.719829
5     7               1535  1361.323864

In [10]: ### PLY CHAIN OPERATIONS ###

In [11]: from ply import install_ply, X

In [12]: install_ply(pd)

In [13]: df1 = (df
   ....:        .ply_where(X.Month < 6)
   ....:        .groupby(['Month', 'DayOfWeek'])
   ....:        .ply_select(DepMean = X.DepTime.mean(), ArrMedian = X.ArrTime.median())
   ....:        .ply_where(X.ArrMedian > 1515, X.DepMean > 1350))

In [14]: print(df1)
                  ArrMedian      DepMean
Month DayOfWeek                         
1     7                1545  1375.156487
2     5                1522  1352.657670
      7                1538  1375.605698
3     7                1542  1377.128506
4     7                1536  1366.719829
5     7                1535  1361.323864
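For comparison, a similar chain can be sketched with pandas alone, without pandas-ply. This is only a minimal sketch under a few assumptions: it needs a pandas version that supports named aggregation in agg() (0.25 or later), and the small DataFrame is made up for illustration as a stand-in for 2008.csv. Only the column names come from the example above.

```python
import pandas as pd

# Made-up stand-in for the 2008 flight data; the values are illustrative only.
df = pd.DataFrame({
    "Month":     [1, 1, 1, 1, 2, 2, 7, 7],
    "DayOfWeek": [7, 7, 5, 5, 7, 7, 7, 7],
    "ArrTime":   [1500, 1600, 1200, 1300, 1530, 1550, 1700, 1800],
    "DepTime":   [1300, 1450, 1000, 1100, 1340, 1380, 1500, 1600],
})

result = (
    df
    # Keep the first five months only.
    .query("Month < 6")
    .groupby(["Month", "DayOfWeek"])
    # Named aggregation: output column = (input column, aggregation function).
    .agg(ArrMedian=("ArrTime", "median"), DepMean=("DepTime", "mean"))
    # Filter on the aggregated columns, as in the second ply_where() above.
    .query("ArrMedian > 1515 and DepMean > 1350")
)

print(result)
```

The two query() calls play the role of ply_where(), and agg() with keyword arguments plays the role of ply_select() on the grouped data.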

Written by statcompute

January 31, 2015 at 4:26 pm

Posted in PYTHON
