Machine learning was undoubtedly one of the keywords of 2016 and will continue to be one in 2017 and onwards.
Three familiar cases that anybody can relate to:
- Case 1: search engines such as Google
- Case 2: social media feeds such as Facebook
- Case 3: text suggestions on mobile keyboards
Throughout this year, I aim to understand how machine learning works and how it will affect different industries.
First, let us understand the fundamentals that enabled machine learning, or in other words: why now?
To summarize, four elements are essential:
Cost to Store Data: I credit this as the No. 1 element for machine learning. Throughout the past decade, the cost of manufacturing hardware has continued to fall, while frameworks such as Hadoop allow for distributed processing of large data sets across clusters of computers using simple programming models.
An IDC report suggests that from 2010 to 2015 the cost per unit of data storage fell from US$9 to US$0.2. Cheap storage means more things piling up (that is also why our big house is always full of things we don't need), and this provides the grounding for digital data and the rest.
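To put the IDC figures in perspective, the drop can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: the two prices come from the paragraph above, while the function names and the assumption of a steady year-over-year decline are mine.

```python
def fold_reduction(start_cost, end_cost):
    """How many times cheaper storage became."""
    return start_cost / end_cost


def annual_decline(start_cost, end_cost, years):
    """Average yearly price drop, assuming a constant rate (my assumption)."""
    return 1 - (end_cost / start_cost) ** (1 / years)


# Figures quoted above: US$9 in 2010, US$0.2 in 2015.
START, END, YEARS = 9.0, 0.2, 2015 - 2010

print(f"Storage became about {fold_reduction(START, END):.0f}x cheaper")
print(f"Average decline of about {annual_decline(START, END, YEARS):.0%} per year")
```

Under that constant-rate assumption, the price roughly halved every year over the five-year period.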
Digital Data: we are creating data every second we live, faster than ever before:
- By 2020, 1.7 megabytes of data will be created every second for every person
- By 2020, 1.4 trillion gigabytes of data will exist on Earth
- 31.25M messages are sent per minute on Facebook
- 350M photos are uploaded to Facebook per day, adding to its existing database of 250B photos
- 40,000 search queries are made every second on Google alone, which equals about 1.2T per year
- 300 hours of video are uploaded to YouTube every minute
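These figures use different time units, so converting them is a quick sanity check. The sketch below is only arithmetic on the numbers listed above, and it assumes the Google figure is per second (which is what makes the yearly total work out); the constant and variable names are mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

# Numbers taken from the list above.
google_searches_per_second = 40_000
facebook_photos_per_day = 350_000_000
youtube_hours_per_minute = 300

# Convert each figure to a different time scale.
searches_per_year = google_searches_per_second * SECONDS_PER_DAY * DAYS_PER_YEAR
photos_per_second = facebook_photos_per_day / SECONDS_PER_DAY
video_hours_per_day = youtube_hours_per_minute * 60 * 24

print(f"Google searches per year: {searches_per_year / 1e12:.2f} trillion")
print(f"Facebook photos per second: {photos_per_second:,.0f}")
print(f"YouTube video hours uploaded per day: {video_hours_per_day:,}")
```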
IBM teamed up with researchers from Johns Hopkins University to predict outbreaks of dengue fever and malaria. They look at how changes in rainfall, temperature, and even soil acidity can dramatically affect the populations of wild animals and insects that carry the diseases.
Deep Learning Algorithm: nevertheless, data is only part of what learning requires. Since we were kids, we have known that there must be a way to learn, or the information is nonsense. And we know that first we teach, and then the kid learns by itself.
The way machines learn is through different algorithms that suit different use cases. There will likely never be a "one-size-fits-all" algorithm (let's put religion aside for now). But we can look at the most popular algorithms on the market and how they can be used:
Ironically, there are many "most popular algorithms" lists on the market; in my references you can find four different ones. But as long as you know the problem you are trying to solve, the answer should not be far away.
Oracle offers a good framework to determine what you are trying to solve. It sorts the algorithms into seven categories, and I simply put each name I found into one of the buckets:
- Classification: logistic regression, naïve Bayes, SVM, decision trees, nearest neighbours, etc.
- Regression: multiple regression, SVM, linear regression, PLS
- Attribute importance: MDL, non-negative matrix factorization
- Anomaly detection: one-class SVM
- Clustering: k-means, orthogonal partitioning
- Association: Apriori
- Feature extraction: NNMF, dimensionality reduction, fast singular value decomposition, random forest
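To make one bucket concrete, here is a minimal sketch of the "neighbours" idea from the classification row: a new point gets the label that is most common among its closest known examples. The toy data, the labels and the function name are made up for illustration, and a real project would normally reach for a library instead.

```python
from collections import Counter
from math import dist


def knn_classify(examples, point, k=3):
    """Label `point` by majority vote among its k nearest labelled examples."""
    # Sort the known examples by straight-line distance to the new point.
    nearest = sorted(examples, key=lambda ex: dist(ex[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (height in cm, weight in kg) -> label.
examples = [
    ((25, 4), "cat"),
    ((23, 5), "cat"),
    ((28, 6), "cat"),
    ((60, 30), "dog"),
    ((55, 25), "dog"),
    ((65, 35), "dog"),
]

print(knn_classify(examples, (26, 5)))    # a small animal, close to the cats
print(knn_classify(examples, (58, 28)))   # a large animal, close to the dogs
```

Nothing is "trained" here: the labelled examples are the teaching, and the vote among neighbours is the learning rule.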
To put this in layman's terms: we don't know much about how we learn. We will continue to figure it out, and then use algorithms to copy that learning to machines so they can do specific tasks that we learned to do with our brains.
Computing Power: the GPU, a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display, enables parallel computing, which means we can process large amounts of fragmented data in a very short period of time. With this, we can quickly put our algorithms, which are our findings, to the test and get feedback to refine our models.
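The split-process-combine idea behind parallel computing can be sketched on an ordinary CPU. The example below uses the thread pool from Python's standard `concurrent.futures` module purely as an analogy: it is not GPU code, the chunk size and worker count are arbitrary choices of mine, and Python threads will not actually speed up this kind of arithmetic. The point is only the shape of the computation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Cut a large list into independent fragments."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def sum_of_squares(chunk):
    """The work done on one fragment; it needs nothing from the others."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=1000, workers=4):
    """Process every fragment independently, then combine the partial results."""
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


data = list(range(10_000))
print(parallel_sum_of_squares(data))
```

Because no fragment depends on another, the same work could be spread over thousands of GPU cores instead of four threads.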
These are the four most credited factors that machine learning is based on. I would add another factor that I see as equally critical to the phenomenon:
The Drive of Evolution: our desire to live better, do less and know more. I was going to stop at the fourth point. You see, cost goes down, more product (data) is created, and more tests can be done (computing), so we can learn faster and better (algorithm). It is a complete storyline.
But why do we do this? If there were no why, I would be confident to say we could just as well stop tomorrow. Such things happen everywhere: when the hype goes away, so does everything else. When the task becomes too hard, we divert our attention to something else.
I would say this is not so much the case for machine learning, because of the fifth reason, which I will put as: "there are always a few who are interested in exploring naively and will never stop; there are more who desire to find those few and push their findings to market so they get money, fame or fulfillment in their lives; and there are many who work without knowing why so that they can enjoy a better life."
If I can put them into one sentence, I would say it is the Diversity of Human Value, what we need to fulfill our lives, that drives machine learning.
It is hard to say who will learn more throughout this trip, the machine or us, but for sure, whatever we have done, we can improve.
The opportunities in machine learning apply to machines as well as to society. It is still in its early stage and will stay there for a while. This year, let's look at what has been done and what we think will happen in the near future.
References:
https://dzone.com/articles/comparison-gridcloud-computing
https://www.google.com.hk/amp/www.forbes.com/sites/bernardmarr/2015/09/30/big-data-20-mind-boggling-facts-everyone-must-read/?client=safari
http://www.businessinsider.com/facebook-350-million-photos-each-day-2013-9
https://www.google.com.hk/amp/venturebeat.com/2013/09/29/ibm-uses-big-data-to-predict-outbreaks-of-dengue-fever-and-malaria/amp/?client=safari
https://www.dezyre.com/article/top-10-machine-learning-algorithms/202
https://www.quora.com/What-are-the-top-10-data-mining-or-machine-learning-algorithms
https://www.cs.cmu.edu/~tom/pubs/Science-ML-2015.pdf
http://mp.weixin.qq.com/s/Bg4wkLmOQOJ53Se5uXHqbw