Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

Efficiency of Importing Large CSV Files in R

### size of csv file: 689.4MB (7,009,728 rows * 29 columns) ###

# base R reader
system.time(read.csv('../data/2008.csv', header = TRUE))
#   user  system elapsed 
# 88.301   2.416  90.716

# fread() from data.table
library(data.table)
system.time(fread('../data/2008.csv', header = TRUE, sep = ','))
#   user  system elapsed 
#  4.740   0.048   4.785

# read.big.matrix() from bigmemory (result is a single-type big.matrix)
library(bigmemory)
system.time(read.big.matrix('../data/2008.csv', header = TRUE))
#   user  system elapsed 
# 59.544   0.764  60.308

# read.csv.ffdf() from ff (result is a disk-backed ffdf)
library(ff)
system.time(read.csv.ffdf(file = '../data/2008.csv', header = TRUE))
#   user  system elapsed 
# 60.028   1.280  61.335 

# read.csv.sql() from sqldf (loads the file through a temporary database)
library(sqldf)
system.time(read.csv.sql('../data/2008.csv'))
#   user  system elapsed 
# 87.461   3.880  91.447
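
The timings above depend on the machine and on the 2008.csv file itself, so here is a minimal sketch for trying the same comparison on a small synthetic file. The column names, the row count, and the temporary file are all made up for illustration, and the fread() step only runs if data.table happens to be installed. Numbers from a file this small will not resemble the ones above.

```r
# Sketch only: a small synthetic CSV stands in for 2008.csv.
# n, the column names, and the temp file are illustrative assumptions.
set.seed(1)
n <- 10000
df <- data.frame(id = seq_len(n),
                 x  = rnorm(n),
                 g  = sample(letters, n, replace = TRUE),
                 stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".csv")
write.csv(df, tmp, row.names = FALSE)

# Time the base R reader
t_base <- system.time(
  d1 <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
)

# Time fread() only when data.table is available
if (requireNamespace("data.table", quietly = TRUE)) {
  t_fread <- system.time(
    d2 <- data.table::fread(tmp, header = TRUE, sep = ",")
  )
  # user / system / elapsed, one row per reader
  print(rbind(read.csv = t_base[1:3], fread = t_fread[1:3]))
} else {
  print(t_base[1:3])
}

# Clean up the temporary file
unlink(tmp)
```

To get closer to the comparison in this post, raise n into the millions and add more columns, at the cost of a longer write.csv() step.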

Written by statcompute

February 11, 2014 at 12:07 am

Posted in Big Data, S+/R
