Russell Spitzer's Blog

Some guy's blog

Folding with Spark

I felt the need to write this post after reading a blog post that did a great job of explaining how fold and foldByKey work. The only thing I thought was missing from that rundown was a bit of detail on how these operations work differently from their Scala counterparts.

Read Post

Exploring Tombstone Behavior with CQL on Cassandra 2.0 and 2.1

##Cassandra 2.1 17:06:16 ➜ ~/repos/ git:(master) ✗ ./

Read Post

Loading a CassandraRDD into a HiveContext in Spark

Spark is awesome and I love it. SparkSQL is also awesome but unfortunately is not fully mature. Although the folks at Databricks have talked about how it will eventually become a full ANSI SQL language, that time is honestly far off. This means that most folks will want to fall back on HiveQL for their more complicated queries on Spark.

Read Post

It isn't fast enough

I’m often confronted with people asking me why a certain technology or program isn’t fast enough. This is a good question, since we should always be interested in making things fast. But usually I hear these questions in response to a perceived slowness that is hard to define or can only be explained in terms of other technologies.

Read Post

Setting up

Setting up with Jekyll

I was a little tired of how difficult it was for me to manage my WordPress-style blog, and I love Git. So after I saw a couple of my friends with their awesome blogs, I knew I needed one as well.

Read Post