Saturday, October 5, 2013

Overview : Data Week 2013

The first two days of the conference were devoted to bootcamps, with a good spread of topics: machine learning, Hadoop, visualization, NoSQL and R. The rest of the week was filled with exhibitions, talks and presentations by practitioners in the field.

There are a few recurrent themes or trends that I noticed:
  • Everyone is trying to interface with Hadoop and jumping on the Hadoop bandwagon. Yes, I mean everyone, including database, statistics and ETL giants like SAS, Oracle, Informatica and Teradata, among others. Several companies were showing their SQL-on-Hadoop solutions, querying JSON data with SQL, and so on.
  • A lot of established companies are making big data plays: Monsanto's acquisition of Climate Corporation, Home Depot's acquisition of BlackLocus, CSC's acquisition of InfoChimps and eBay's acquisition of Decide, just to name a few. They understand that data is now a strategic asset, and that having talent on hand to analyze it and extract actionable insights is just par for the course.
  • More open source software is being developed at some of the most innovative companies and then released into the wild, like Cassandra (Facebook), Storm (Twitter) and Impala (Cloudera). Obviously this is not new.
  • Democratizing data and insights via easy to use APIs

The underlying theme here is that if you can turn data into useful products that make people's lives easier and/or more efficient, or that help companies understand and/or monetize their customers better, you're on your way to huge valuations and IPO riches (right...)

There were talks by Causes and Civic Data Challenge about collaborating on, analyzing and using data for the public good. You probably didn't know that every time you waste 10 seconds of your life trying to figure out the words on a reCAPTCHA, you're actually helping to digitize books, and that when you try to learn a new language on Duolingo, you're helping to translate the web. Massive online collaborative efforts like these have also helped sites like Wikipedia, which now has an army of proofreaders and writers.

Democratizing Data
There were a lot of vendors showcasing their data, analytics, machine learning and visualization APIs. Democratizing data by getting it into the hands of more people and decision makers, and reducing the time to insight, will only help organizations become more agile.

Evolution of the Data Scientist
The term "data scientist" was officially coined a few years ago, but the role has always been around in various forms. Organizations are still working out how to fit data scientists into the product pipeline. Some of them discussed embedding data scientists with various teams, others deploying the data science team as a skunkworks, and so on.

In all, there was a wide variety of speakers and topics about data. The value you get from these conferences is in discovering new/innovative companies in the space working on cool products/ideas and getting a chance to meet, listen to and converse with most of the names behind the big data tools, packages and utilities you already use.
