Monday, March 10, 2014

Week 7: Zipfian Academy - Advanced Machine Learning and Deep Learning

The week started with a detour into more advanced machine learning algorithms. We covered Logistic Regression, SVM, Naive Bayes and k-Nearest Neighbors, and compared them on several datasets to see in which situations one would outperform another. Before now, I would usually pick my favorite algorithms and apply them to a dataset - that's the wrong way to approach things. You need to be more strategic in choosing which algorithm you use (use the right tool for the job). Machine learning algorithms are broadly classified as either generative or discriminative, with models fit by MCMC (generative) and Neural Networks (discriminative) sitting at opposite ends of that spectrum.
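The kind of head-to-head comparison we did can be sketched in a few lines of scikit-learn (the dataset and hyperparameters here are my own illustrative choices, not the ones from the sprint):

```python
# Compare the four classifiers we covered, each in a pipeline with
# feature scaling (which matters for SVM and k-NN), using 5-fold CV.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "k-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Running the same loop over several datasets makes the "right tool for the job" point concrete: the ranking of the four models changes with the data.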

The next day, we moved on to Decision Trees, Random Forests and ensembles. We used BigML to visualize decision trees, and I must say they have one of the best Decision Tree visualizations around - it comes in pretty handy when you're just doing EDA. A single Decision Tree can be slow to build and prone to overfitting, but Random Forests solve a lot of these issues, and since each tree is trained independently, the modeling process can be parallelized. They also tend to give you models with both low bias and low variance.
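A quick sketch of that tree-vs-forest tradeoff (synthetic data and parameters are my own choices for illustration); note `n_jobs=-1`, which is where the parallelism comes in:

```python
# A single decision tree vs. a Random Forest on the same data.
# n_jobs=-1 fits the forest's trees in parallel across all cores,
# which is possible because each tree is trained independently.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=8, random_state=0)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, n_jobs=-1,
                                random_state=0)

print("Single tree:", cross_val_score(tree, X, y, cv=5).mean())
print("Forest:     ", cross_val_score(forest, X, y, cv=5).mean())
```

The forest typically beats the lone tree here: averaging many decorrelated trees cuts the variance that a single deep tree suffers from.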

By mid-week, we delved into Deep Learning and built a Deep Belief Network using this library, stacking several hidden layers (Restricted Boltzmann Machines) to classify digits from the popular MNIST dataset. We had a feed-forward setup with no back propagation. Neural Networks have been around for a while, but they were put back on the map with the advent of Deep Learning about a decade ago. In academic circles, most of the hottest Deep Learning research is going on at places like NYU/Facebook (Yann LeCun), Toronto/Google (Geoff Hinton), Montreal (Yoshua Bengio) and Stanford/Google (Andrew Ng) (..of course, not listed in any particular order).
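This isn't the library we used in class, but scikit-learn's `BernoulliRBM` captures the flavor of one RBM layer: learn features without labels, then hand them to a classifier. A full Deep Belief Network stacks several such layers, greedily pretrained one at a time. I'm using the small built-in digits dataset here rather than full MNIST to keep the sketch fast:

```python
# One RBM layer learning features from digit images, followed by a
# logistic regression classifier on top of those learned features.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel intensities into [0, 1] for the RBM

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=128, learning_rate=0.06,
                         n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=2000)),
])
model.fit(X, y)
print("train accuracy:", model.score(X, y))
```

The RBM is trained without using the labels at all; only the logistic layer on top sees them, which mirrors the unsupervised-pretraining idea behind DBNs.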

Towards the end of the week, we took another detour into time series analysis and worked on some trend and seasonality analysis using pandas. Friday was more of a catch-up day - there were no official sprints. We worked on project proposals and had Deep Learning and git breakout sessions. We're slowly nearing the end of the structured curriculum and everyone is gradually moving into project mode. I'd say there were a lot of 'aha' moments for me this week.
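For the curious, the core of that trend/seasonality work in pandas fits in a few lines. This is a minimal sketch on synthetic monthly data (my own construction, not the sprint's dataset): a centered rolling mean extracts the trend, and grouping the detrended series by month exposes the seasonal pattern.

```python
# Decompose a monthly series: rolling mean for trend, then average
# the detrended values by calendar month to get seasonality.
import numpy as np
import pandas as pd

idx = pd.date_range("2010-01-01", periods=48, freq="MS")  # 4 years, monthly
trend = np.linspace(100, 150, 48)                  # upward drift
seasonal = 10 * np.sin(2 * np.pi * idx.month / 12)  # yearly cycle
ts = pd.Series(trend + seasonal, index=idx)

rolling_trend = ts.rolling(window=12, center=True).mean()
detrended = ts - rolling_trend
monthly_seasonality = detrended.groupby(detrended.index.month).mean()
print(monthly_seasonality.round(1))
```

A 12-month window is the natural choice for monthly data: averaging over one full cycle cancels the seasonal component, leaving just the trend.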

Highlights of the week:
  • We had a guest lecture from Allen on multi-layer perceptrons. He talked about some of the interesting research he worked on at Nike and how they used Neural Networks and other machine learning tools to design better footwear.
  • The Chief Data Scientist at @Personagraph gave an interesting lecture on how they're using machine learning.
  • We closed out the busy week with a Mentor Mixer. A lot of industry practitioners attended. The goal was to match current students with practicing Data Scientists. A variety of mentors showed up: mostly Data Scientists, some Chief Data Scientists, and even a two-time Kaggle Competition Winner (yes, they are a rare breed, but they do exist).
