Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

Efficiency of Concatenating Two Large Tables in SAS

In SAS, there are 4 approaches to concatenate two datasets together:
1. SET statement,
2. APPEND procedure (or APPEND statement in DATASETS procedure)
3. UNION ALL in SQL procedure,
4. INSERT INTO in SQL procedure.
Below is an efficiency comparison among these 4 approaches, showing that APPEND procedure, followed by SET statement, is the most efficient in terms of CPU time.

data one two;
  do i = 1 to 2000000;
    x = put(i, best12.);
    if i le 1000 then output one;
    else output two;
  end;
run;

data test1;
  set one two;
run;

data test2;
  set one;
run;

proc append base = test2 data = two;
run;

proc sql;
  create table
    test3 as
  select
    *
  from
    one

  union all

  select
    *
  from
    two;
quit;

data test4;
  set one;
run;

proc sql;
  insert into
    test4
  select
    *
  from
    two;
quit;
Advertisements

Written by statcompute

June 12, 2015 at 11:11 pm

Posted in Big Data, SAS

%d bloggers like this: