Big Data in 2020
Last year, Big Data became a big topic across nearly every area of IT. IDC defines Big Data technologies as a new generation of technologies and architectures, designed to economically extract value...
How to Install CouchDB and Futon on Ubuntu
CouchDB, like Redis, Cassandra, and MongoDB, is a “NoSQL” database. Similar to other databases of its kind, CouchDB stores its information in a non-relational database, keeping its data in separate...
Setting up a Neo4j Cluster on Amazon
There are multiple ways to set up a Neo4j cluster on Amazon Web Services (AWS), and I want to show you one way to do it. Overview: Create a VPC; Launch 1 Instance; Install Neo4j HA; Clone 2 Instances...
Hadoop Hive UDF Tutorial – Extending Hive with Custom Functions
http://blog.matthewrathbone.com/2013/08/10/guide-to-writing-hive-udfs.html
Apache Tez: A New Chapter in Hadoop Data Processing
In this post we introduce the motivation behind Apache Tez (http://incubator.apache.org/projects/tez.html) and provide some background around the basic design principles for the project. As Carter...
Introducing the Spring YARN framework for Hadoop
If you have been following the Hadoop community over the past year or two, you’ve probably seen a lot of discussions around YARN and the next version of Hadoop’s MapReduce called MapReduce v2. YARN...
Pictures Make Sense of Big Data
Visualization technology can turn data into pictures that are far more comprehensible. Most people have trouble recalling strings of numbers that are longer than their phone numbers. So how do we begin...
Realtime Analytics for Big Data: A Facebook Case Study
Knowing what your users are doing on your site in real time, and matching what they do with more targeted information, translates into a better conversion rate and better...