Can we really hold back the evolution of machines?
Updated: May 5, 2020
The term ‘Machine Learning’ refers to the automated detection of meaningful patterns in data. Machine Learning (ML) is sometimes also called ‘Automated Learning’.
In this article, I am going to talk about Data Science, Machine Learning, Statistics and Probability, and other subjects that a Data Scientist uses in their routine work.
When do we need Machine Learning?
Tasks where there can be a defect: Humans perform many tasks routinely; however, they sometimes fail. Ex: Driving. In such cases, if we program efficiently and provide enough training data, the program will learn from its experience and achieve satisfactory results.
Tasks that humans cannot perform: Humans are among the most intelligent species on earth; however, there are tasks where the data is too large or too fine-grained for a human to handle. Ex: Datasets in astronomy, and pharmaceutical and medical knowledge.
Adaptive: A conventional program is rigid; once installed, its behavior does not change. Behavior varies from person to person, and an ML program adapts to the inputs it receives. Ex: Siri or Alexa. The program is trained on datasets, yet it can adapt to almost all users.
Types of Machine Learning:
ML is a vast domain; however, it can broadly be categorized into:
Supervised Machine Learning:
As the name suggests, the program learns under supervision: the training data comes with the correct answers (labels), provided by the user or the environment.
Example: A dataset with 100 labelled pictures of dogs is provided as training input so that an ML program can learn to determine the breed of a dog. In the real world, a picture of the same breed in a new size or shape is then given to the program as test data. The program generalizes to the new picture and predicts the correct breed. In this case, the program learnt from the training data in order to predict on the test data.
The popular algorithms used in Supervised ML are:
Support Vector Machines
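To make the dog-breed example concrete, here is a minimal sketch of supervised learning in plain Python. It does not use a Support Vector Machine; it swaps in a much simpler technique, a 1-nearest-neighbour classifier, purely for illustration. The features (height and weight), the breeds and all the numbers are hypothetical.

```python
import math

# Hypothetical labelled training data: (height_cm, weight_kg) -> breed.
# The numbers are made up for illustration only.
TRAINING_DATA = [
    ((24.0, 3.0), "Chihuahua"),
    ((22.0, 2.5), "Chihuahua"),
    ((58.0, 30.0), "Labrador"),
    ((60.0, 32.0), "Labrador"),
]


def predict_breed(features):
    """Return the label of the closest training example (1-nearest-neighbour)."""
    _, nearest_label = min(
        TRAINING_DATA,
        key=lambda example: math.dist(example[0], features),
    )
    return nearest_label


# A new dog that was not part of the training data.
prediction = predict_breed((56.0, 28.0))
print(prediction)  # Labrador
```

The program never saw a dog measuring 56 cm and 28 kg, yet it predicts a breed because the labelled examples taught it what each breed roughly looks like.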
Un-Supervised Machine Learning:
In Un-Supervised ML, unlike Supervised ML, we do not have any target variable ‘Y’ to predict. Instead, the program tries to group the data into different segments or clusters for better understanding.
Example: Dividing the population based on gender. In banking, customers can be segmented by their spending behavior to identify prospects for other banking products.
The popular algorithms used in Un-Supervised ML are:
Hierarchical Clustering Algorithm
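The banking example above can be sketched with a very small hierarchical (agglomerative) clustering routine. This is a simplified illustration for one-dimensional data using single linkage, not a production implementation; the function name and the spend figures are hypothetical.

```python
def agglomerative_clusters(values, n_clusters):
    """Group 1-D values by repeatedly merging the two closest clusters.

    The distance between two clusters is the smallest gap between any
    two of their members (single linkage).
    """
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > max(n_clusters, 1):
        # In sorted 1-D data the closest pair of clusters is always adjacent.
        gaps = [
            clusters[i + 1][0] - clusters[i][-1]
            for i in range(len(clusters) - 1)
        ]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Hypothetical monthly card spend per customer; note there is no label 'Y'.
monthly_spend = [120, 150, 135, 900, 950, 4000, 4200]
segments = agglomerative_clusters(monthly_spend, n_clusters=3)
print(segments)  # [[120, 135, 150], [900, 950], [4000, 4200]]
```

Nobody told the program which customers are low, medium or high spenders; the three segments emerge from the data alone.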
Artificial Intelligence (AI):
Artificial intelligence came into existence in the 1950s; however, for a long time there was not enough data for AI to learn from. Of late, huge volumes of data are being captured, and analysts would like to leverage AI to learn the patterns in the data and predict the required variables.
There is always some confusion around AI and ML. The picture below clarifies it: Deep Learning is a subset of Machine Learning, which in turn is a subset of AI.
Popular Programming Languages for Machine Learning:
All the above can be accessed from Anaconda Navigator.
Python is my personal favorite for Machine Learning, and I am coming up with another detailed article/blog on Python. So, stay tuned.
Statistics and Probability:
When we talk about Data Science, we need to shed some light on Statistics and Probability. These 2 subjects have always been given lower importance in school. However, Statistics and Probability are always at work in our everyday thinking.
Ex: In a cricket match, based on past data, even a small child can predict whether Sachin is likely to hit a century in a match. Or, while playing Ludo with a die, every player hopes for a ‘six’ but usually does not get one, as the probability of rolling a ‘six’ is only 1/6.
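The 1/6 figure for the die can be checked with a short simulation. This is only a sketch that assumes a fair six-sided die; the seed and the number of rolls are arbitrary choices.

```python
import random
from fractions import Fraction

# Exact probability: one favourable face out of six equally likely faces.
p_six = Fraction(1, 6)

# Estimate the same probability by simulating many rolls of a fair die.
rng = random.Random(42)  # fixed seed so the run is repeatable
n_rolls = 60_000
sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
estimate = sixes / n_rolls

print(f"exact: {p_six} = {float(p_six):.4f}")
print(f"simulated: {estimate:.4f}")
```

With enough rolls, the simulated share of sixes settles close to the exact value of about 0.1667.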
Data refer to facts and statistics collected together for reference or analysis.
Data can be:
Types of Data:
Everyone is surrounded by different sets of data. For instance, when we enter our office, we see a finite number of people using a finite number of gadgets; however, each of them may use resources measured on a continuous scale, like power, water or even Wi-Fi data.
Data can be categorized into:
1. Quantitative Data: As the name suggests, this is related to quantity. Quantitative Data has 2 types:
Discrete: Takes countable values. Ex: Strength of a classroom.
Continuous: Can take any value within a range. Ex: Weight of a person.
2. Qualitative Data: On the other hand, this is related to quality. It can be classified into 2 types:
Nominal: Categories with no inherent order. Ex: Gender of the population (Male/Female/LGBT).
Ordinal: Categories with a meaningful order. Ex: Ranking in the school or CGPA grade (A, B, C).
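One practical reason to tell these types apart is that each one supports a different kind of summary. The sketch below uses Python's standard statistics module; all the values are hypothetical.

```python
import statistics

# Hypothetical values, one list per data type described above.
class_strengths = [30, 32, 28, 30]        # quantitative, discrete
weights_kg = [61.5, 72.3, 58.0, 80.2]     # quantitative, continuous
genders = ["Female", "Male", "Female"]    # qualitative, nominal
grades = ["B", "A", "C", "B", "A"]        # qualitative, ordinal

# Quantitative data supports arithmetic, so a mean is meaningful.
mean_strength = statistics.mean(class_strengths)
mean_weight = statistics.mean(weights_kg)

# Nominal data has no order, so only counting (e.g. the mode) makes sense.
most_common_gender = statistics.mode(genders)

# Ordinal data has an order but no arithmetic: sort by rank, take the middle.
grade_rank = {"A": 1, "B": 2, "C": 3}
median_grade = sorted(grades, key=grade_rank.get)[len(grades) // 2]

print(mean_strength, mean_weight, most_common_gender, median_grade)
```

Averaging genders or adding grades would not mean anything, which is why the data type should be identified before any analysis.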
Statistics is an area of applied mathematics concerned with the collection, analysis, interpretation and presentation of data.
Real time check:
We have 10 years of Sales data for a product, and the business would like to analyze the data and come up with ways to improve the Key Performance Indicators (KPIs).
This is a Statistical problem statement and can be solved by Statistics.
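As a first step on such a problem, simple descriptive statistics already say a lot. The sketch below assumes one sales figure per year and uses year-over-year growth as an example KPI; the numbers are hypothetical and the choice of KPI is an assumption, not something the business case above specifies.

```python
import statistics

# Hypothetical yearly sales figures (units sold) for 10 years.
yearly_sales = [100, 110, 125, 120, 140, 155, 150, 170, 185, 200]

average_sales = statistics.mean(yearly_sales)
spread = statistics.stdev(yearly_sales)  # sample standard deviation

# Year-over-year growth in percent, used here as an example KPI.
growth_pct = [
    (current - previous) / previous * 100
    for previous, current in zip(yearly_sales, yearly_sales[1:])
]

# growth_pct[0] belongs to year 2, so add 2 to get a 1-based year number.
best_year = growth_pct.index(max(growth_pct)) + 2

print(f"average: {average_sales}, spread: {spread:.1f}")
print(f"strongest growth in year {best_year}")
```

From here, the analysis could move on to why growth dipped in some years, which is where interpretation and presentation come in.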