Principal component analysis (PCA) finds a small set of derived variables, each a linear combination of the original features, that explains most of the variation in the data. Let’s see what exactly that means.

It often happens that a data set has many features, with only a small fraction of the information present in each feature or variable. For example, suppose we have a dataset consisting of 50 columns (features). It is almost impossible to visualize all 50 features in the same plot (a 50-dimensional view, or a 2-dimensional plot of every feature vs. the 49 others) and look for insights. So what we do is that…
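As a rough illustration of summarizing several features with fewer directions, the sketch below computes the share of total variance captured by the first principal component of a tiny two-feature dataset. The data values and the helper name `first_component_share` are made up for illustration, and the closed-form eigenvalue formula it uses only applies to the 2x2 case.

```python
import math

def first_component_share(points):
    """Fraction of total variance explained by the first principal component (2-D data only)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form
    trace = sxx + syy
    gap = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    largest = (trace + gap) / 2
    return largest / trace

# Hypothetical data: the second feature is roughly twice the first
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
share = first_component_share(points)
print(f"First component explains {share:.1%} of the variance")
```

Because the two made-up features move almost in lockstep, a single direction carries nearly all of the variation, which is the situation in which reducing dimensions loses little information.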

Support Vector Machine (SVM) is a classification approach built on the concept of a separating hyperplane. It was developed in the 1990s. It is a generalization of a simple and intuitive classifier called the maximal margin classifier.

To study Support Vector Machine (SVM), we first need to understand the maximal margin classifier and the support vector classifier.

In the maximal margin classifier, we use a hyperplane to separate the classes. But **what is a hyperplane?** In a p-dimensional space, a hyperplane is a flat affine subspace of dimension p-1 (affine meaning it does not necessarily pass through the origin). For example, in a…
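A hyperplane in p dimensions can be written as the set of points where β0 + β1·X1 + … + βp·Xp = 0, and the sign of that expression tells us which side of the hyperplane a point falls on. The sketch below checks this for a few points; the coefficients and the function name `hyperplane_side` are hypothetical values chosen only for illustration.

```python
def hyperplane_side(beta0, beta, x):
    """Return +1, -1, or 0 depending on which side of the hyperplane x lies."""
    value = beta0 + sum(b * xi for b, xi in zip(beta, x))
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0  # the point lies exactly on the hyperplane

# Hypothetical hyperplane in 2-D (a line): 1 + 2*X1 + 3*X2 = 0
beta0, beta = 1.0, [2.0, 3.0]
side_a = hyperplane_side(beta0, beta, [1.0, 1.0])    # 1 + 2 + 3 = 6, positive side
side_b = hyperplane_side(beta0, beta, [-2.0, 1.0])   # 1 - 4 + 3 = 0, on the line
side_c = hyperplane_side(beta0, beta, [-1.0, -1.0])  # 1 - 2 - 3 = -4, negative side
print(side_a, side_b, side_c)
```

A classifier based on a separating hyperplane assigns a class according to this sign.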

I often find myself saying yes in situations like buying a product online that I don’t need, or returning a favor even when it is infeasible to do so. Sales operators, automobile dealers, and the e-commerce industry use compliance techniques to influence people. Professionals aren’t the only ones who use these techniques. We all use them and fall victim to them in our interactions with people¹. A basic understanding of the principles that govern these techniques can help us make better decisions.

Your perception of something is affected by the things you encountered just before it. For example, after…

Generalized additive models (GAMs) provide a general framework for extending a standard linear model by allowing nonlinear functions of each of the variables, while maintaining additivity. Let’s see what exactly that means.

Linear models are simple to describe and implement, and they have an advantage over other approaches in terms of interpretation and inference. But they are limited in predictive power, that is, in how accurately we can predict the output. Suppose we have data consisting of p input features (X1, X2, …, Xp) and an output Y. …
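In symbols, the contrast between the two models can be sketched as follows. The coefficient names β, the error term ε, and the function names f_j are notation assumed here for illustration. A standard linear model uses one coefficient per feature, while a GAM replaces each linear term with a (possibly nonlinear) function of that feature and still adds the pieces together:

```latex
% Standard linear model
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \epsilon

% Generalized additive model: each \beta_j X_j becomes f_j(X_j)
Y = \beta_0 + f_1(X_1) + f_2(X_2) + \dots + f_p(X_p) + \epsilon
```

Because the contributions are summed, the effect of each feature on Y can still be examined separately, which is what "maintaining additivity" refers to.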

**In machine learning, the ROC curve is an evaluation tool that visualizes the performance of a classification model, and it is especially useful when the data is skewed.** Let’s see what exactly that means.

Consider the **heart data**, which consists of 13 features such as **age, sex, and chol (cholesterol measurement)**. Our goal is to predict whether an individual has heart disease based on these features. This is a **binary classification problem**, that is, one with only two classes. Suppose we have 100 samples (each sample corresponds to a single patient’s information), of which 90 are positive (have heart disease), so if…
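One way to see why skewed data calls for more than plain accuracy: the sketch below uses the 90/10 split described above together with a hypothetical model that labels every patient as positive. The model, the labels, and the helper name `rates` are assumptions made for illustration. The true positive rate and false positive rate it computes are the two quantities an ROC curve plots against each other.

```python
def rates(y_true, y_pred):
    """Return (accuracy, true positive rate, false positive rate).

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return accuracy, tpr, fpr

# 90 positive samples and 10 negative samples
y_true = [1] * 90 + [0] * 10
# Hypothetical model that predicts "positive" for everyone
y_pred = [1] * 100

accuracy, tpr, fpr = rates(y_true, y_pred)
print(accuracy, tpr, fpr)
```

The accuracy looks respectable even though the model never identifies a healthy patient correctly, and the false positive rate makes that failure visible.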

Jack Bogle was the founder and chief executive of The Vanguard Group, one of the world’s largest investment companies, and he is regarded as the father of index funds. Below are the principles Jack gives in his book Common Sense on Mutual Funds.

**Invest you must**. The biggest risk is the long-term risk of not putting your money to work at a generous return, not the short-term but real risk of price volatility.

**Time is your friend**. Give yourself all the time you can. Begin to invest in your 20s, even if it’s only a small amount, and…

On your first day at school, you meet people you barely know, and after spending a few days together you become friends with some of them based on similarities. Clustering is exactly this.

**It refers to a very broad set of techniques for finding subgroups, or clusters in a data set**. When we cluster the observations of a dataset we seek to partition them into distinct groups so that the **observations within each group are quite similar to each other**, while observations in different groups are quite different. An application of clustering arises in marketing. We may have access to a…
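To make the idea of partitioning observations into similar groups concrete, here is a minimal sketch of k-means, one common clustering technique (chosen here for illustration; the text above does not name a specific method). The data points, the starting centers, and the function name `kmeans` are all hypothetical.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate between assigning points and updating centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center)
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

# Hypothetical observations forming two visibly separate groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(clusters)
```

With these made-up points, the three observations near the origin end up in one group and the three far from it in the other, so observations within a group are similar and observations in different groups are not.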

t-SNE (t-Distributed Stochastic Neighbor Embedding) is a technique that visualizes high-dimensional data by giving each point a location in a two- or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (SNE) that is much easier to optimize and produces significantly better visualizations.

Several other techniques can be used for visualizing high-dimensional data, such as PCA, a linear technique that focuses on keeping the low-dimensional representations of dissimilar data points far apart. For high-dimensional data that lies on or near a low-dimensional nonlinear manifold, it…

A neural network is a means by which a computer learns to perform a task with the help of data. Let’s see this in detail with an example.

Whenever a real estate broker wants to sell a house, he looks for the best price, with a profit, at which the house could be sold. The broker himself figures out the price of the house on the basis of some **features** like size, number of bedrooms, and locality, and his experience helps him find the optimal price. How will a neural network do the same task? Here, what in a human we call ‘experience’…

Interested in Various Domains | Owner of Club Linguistics