Monday, August 26, 2013

Machine Learning as a Service

Several start-ups are working on delivering machine learning as a service. Some of these solutions provide a turn-key data science environment: data prep, munging, models, predictions and visualizations.

Prediction.io - this is an open source machine learning server. It looks like they focus mainly on recommendations, and you can build and fine-tune your own models.

Yhat - build and deploy models with R and Python

BigML - this has a user-friendly interface and great visualizations for non-techies. You can also modify and fine-tune your models, and it handles both classification and regression problems.

wise.io - they seem to tackle a wider variety of problems. They also have their own optimized Random Forest implementation that showed some impressive stats when benchmarked against R, Weka and Scikit-Learn.

Precog - another strong contender in the space. They accept a wider variety of input data types: logs, JSON, NoSQL, etc. They were recently acquired, so their service may be shut down.

Ersatz - their service uses deep neural networks (black box). It looks like they're still in private beta.

Tuesday, August 13, 2013

Some interesting links: Hyperloop, Google n-grams, PredPol

Finally, Elon Musk unveiled some details about the Hyperloop transport system. It would be quite interesting if this makes it past the concept stage and actually gets built. Right now, California is on the verge of spending more than 10x what the Hyperloop system would cost in order to build one of the slowest high-speed rail systems in the world. A nice infographic of the Hyperloop Transport System

A very detailed Data Mining Map

An interesting visual mashup showing the Start-up Universe. This mashup shows everything you'd want to know, including company details, funding, VCs, rounds, etc. The data for this mashup comes from the CrunchBase API.

I recently came across Google's N-gram viewer. It graphs yearly counts of n-grams over the past 200 years. You can literally see when people started using "Donut" over "Doughnut" (link). The rise of "Donut" starts at around the same time breakfast chains like Dunkin' Donuts were founded.
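To make the idea of "yearly counts of n-grams" concrete, here is a minimal sketch in Python. It is not how Google's viewer is implemented; the function name `yearly_ngram_counts` and the tiny three-document corpus are made up purely for illustration, and tokenization is a naive whitespace split.

```python
from collections import Counter, defaultdict


def yearly_ngram_counts(docs, n):
    """Count n-grams per year.

    docs is an iterable of (year, text) pairs (a hypothetical input format).
    Returns a dict mapping year -> Counter of n-gram strings.
    """
    counts = defaultdict(Counter)
    for year, text in docs:
        tokens = text.lower().split()  # naive whitespace tokenization
        for i in range(len(tokens) - n + 1):
            counts[year][" ".join(tokens[i:i + n])] += 1
    return counts


# Toy corpus, invented for illustration only
docs = [
    (1900, "a fresh doughnut and a doughnut"),
    (1950, "a donut and a doughnut"),
    (2000, "a donut and a donut and a donut"),
]

counts = yearly_ngram_counts(docs, 1)
for year in sorted(counts):
    print(year, counts[year]["donut"], counts[year]["doughnut"])
```

Plotting those per-year numbers for each spelling would give a (very small) version of the kind of chart the viewer draws.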

The Atlas learning environment from O'Reilly. You can read a bunch of tech books there, both early-release and published.

PredPol, a system that helps predict crime in real time. They claim to place police officers at the right time and place, giving them the best chance to prevent crimes. The system analyzes historical crime data and assigns probabilities of future events / crimes to regions of space and time. This probably sounds like Tom Cruise's sci-fi thriller, Minority Report, but without psychics and bathtubs. The system predicts the 'when' and the 'where' of a crime but not the 'whom'. I would not be surprised if the 'whom' can be predicted in the near future. PredPol was designed by scientists at UCLA, Santa Clara and UCI.
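As a rough illustration of what "assigning probabilities to regions of space and time" can mean, here is a naive sketch in Python. It is not PredPol's model, which I have not seen; it just buckets past events into grid cells by location and hour and uses relative frequencies. The function name `cell_probabilities`, the `(x, y, hour)` event format and the sample events are all assumptions made up for this example.

```python
from collections import Counter


def cell_probabilities(events, cell_size):
    """Estimate the share of past events in each (grid cell, hour) bucket.

    events is a list of (x, y, hour) tuples (hypothetical format).
    Returns a dict mapping (cell_x, cell_y, hour) -> relative frequency.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size), hour)
        for x, y, hour in events
    )
    total = sum(counts.values())
    return {cell: c / total for cell, c in counts.items()}


# Invented sample events: (x, y, hour of day)
events = [(0.2, 0.3, 22), (0.4, 0.1, 22), (1.5, 0.2, 22), (0.1, 0.9, 9)]

probs = cell_probabilities(events, cell_size=1.0)
for cell, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(cell, p)
```

A real system would need much more than a histogram (recency, neighbouring cells, seasonality), but the output has the same shape: a probability attached to each region of space and time.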

And finally, the road to becoming a Data Scientist can be a long and winding one. Here is Swami Chandrasekaran's take on the Data Science Curriculum.