Descriptive Analytics and K-Means Clustering in Big Data
What is Machine Learning?
- By supplying data together with instructions, we can train a machine to process information and carry out tasks in a way that resembles human decision-making.
- Once the machine has learned from that data, it can apply what it learned to new data and new tasks quickly and consistently.
Types of Machine Learning
Unsupervised Learning
- When we cannot give the machine explicit guidance (such as labels) for a dataset, we let it discover important patterns, relationships, and anomalies in the data on its own. This is referred to as unsupervised learning.
Supervised Learning
- When we already know the correct decisions for the data (labelled examples), we can train the machine to make the same decisions accurately and efficiently on new data. This is referred to as supervised learning.
Semi-Supervised Learning
- We give the machine partial guidance (only some of the data is labelled), which is not enough on its own to reach a final decision. The machine therefore also has to find structure in the remaining data by itself, and the combination of both efforts produces the final conclusion.
Reinforcement Learning
- The machine collects observations from its environment in real time and makes decisions dynamically, adjusting them according to the feedback (reward) each decision receives.
Descriptive Analytics
- Descriptive analysis is a type of data analysis that involves summarising and describing the main features of a dataset. It helps to identify patterns, trends, and relationships within the data.
- In short, exploring what happened in an incident is called descriptive analysis.
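As a small illustration of summarising the main features of a dataset, the sketch below computes a few common summary statistics with Python's standard `statistics` module. The `expenses` values are invented for this example and do not come from any real dataset.

```python
import statistics

# Hypothetical monthly expenses; the values are invented for illustration.
expenses = [120, 150, 150, 180, 200, 240, 900]

# Summarise the main features of the data.
summary = {
    "count": len(expenses),
    "mean": statistics.mean(expenses),
    "median": statistics.median(expenses),
    "mode": statistics.mode(expenses),
    "min": min(expenses),
    "max": max(expenses),
    "stdev": statistics.stdev(expenses),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing the mean with the median is one simple way to spot a pattern: here the single large value pulls the mean well above the median, which hints at an outlier.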
Clustering
- It is an unsupervised learning method that groups data objects so that objects with similar characteristics fall into the same group and objects with different characteristics fall into different groups.
Q: Which characteristics differentiate a cat from a dog?
- Length of ears
- Distance between eyes and nose
- Eye shape
- Height
- Shape of head
Applications of Clustering
- Customer Segmentation
- Market Segmentation
- Image Segmentation
K-Means Clustering Algorithm
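It is a simple algorithm that divides data points into a pre-defined number of groups (k). Each group is represented by a centroid, and every data point is assigned to the group whose centroid is nearest to it, which keeps the variance within each group small.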
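The grouping idea described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `k_means`, the random choice of initial centroids, and the sample `points` are all assumptions made for this example.

```python
import math
import random

def k_means(points, k, iterations=100, seed=0):
    """Group 2-D points into k clusters by repeated assign/update steps."""
    rng = random.Random(seed)
    # Assumption: the initial centroids are k distinct points picked at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed group, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a group is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Made-up points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

The value of k has to be chosen before running the algorithm, and a different initial choice of centroids can lead to a different grouping on less clearly separated data.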
Ex: We received 10 data points, each with only two parameters, salary and expenses. They are plotted in the following coordinate plane.