# Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

## Kick Off Spark

My first Spark session:

```
scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext

scala> val sdf = spark.read.option("header", "true").csv("Documents/spark/credit_count.txt")
sdf: org.apache.spark.sql.DataFrame = [CARDHLDR: string, DEFAULT: string ... 12 more fields]

scala> sdf.printSchema()
root
|-- CARDHLDR: string (nullable = true)
|-- DEFAULT: string (nullable = true)
|-- AGE: string (nullable = true)
|-- ACADMOS: string (nullable = true)
|-- ADEPCNT: string (nullable = true)
|-- MAJORDRG: string (nullable = true)
|-- MINORDRG: string (nullable = true)
|-- OWNRENT: string (nullable = true)
|-- INCOME: string (nullable = true)
|-- SELFEMPL: string (nullable = true)
|-- INCPER: string (nullable = true)
|-- EXP_INC: string (nullable = true)
|-- SPENDING: string (nullable = true)
|-- LOGSPEND : string (nullable = true)

scala> sdf.createOrReplaceTempView("tmp1")

scala> spark.sql("select count(*) as obs from tmp1").show()
+-----+
|  obs|
+-----+
|13444|
+-----+

```

A PySpark session doing the same thing:

```
In [1]: import pyspark as spark

In [2]: sc = spark.SQLContext(spark.SparkContext())

In [3]: sdf = sc.read.csv("Documents/spark/credit_count.txt", header = True)

In [4]: sdf.printSchema()
root
|-- CARDHLDR: string (nullable = true)
|-- DEFAULT: string (nullable = true)
|-- AGE: string (nullable = true)
|-- ACADMOS: string (nullable = true)
|-- ADEPCNT: string (nullable = true)
|-- MAJORDRG: string (nullable = true)
|-- MINORDRG: string (nullable = true)
|-- OWNRENT: string (nullable = true)
|-- INCOME: string (nullable = true)
|-- SELFEMPL: string (nullable = true)
|-- INCPER: string (nullable = true)
|-- EXP_INC: string (nullable = true)
|-- SPENDING: string (nullable = true)
|-- LOGSPEND : string (nullable = true)

In [5]: sdf.createOrReplaceTempView("tmp1")

In [6]: sc.sql("select count(*) as obs from tmp1").show()
+-----+
|  obs|
+-----+
|13444|
+-----+

```