What is Logistic Regression
As you may know, Logistic Regression is a Machine Learning algorithm used for binary classification problems. There is also multiclass logistic regression, where we basically re-run the binary classification multiple times (for example, once per class in a one-vs-rest setup).
It is a linear algorithm with a non-linear transform at the output. The input values are combined linearly using weights to predict the output value. The output being modelled is a binary value (0 or 1) rather than a continuous numeric value.
Suppose we have the results of a set of students, where the criterion for passing is a score of 50% or more. Otherwise the student is classified as failed.
We could approach this classification problem using linear regression. But if the data we fit on contains some outliers, they will affect the orientation of the best fit line, and with it the point where we separate pass from fail.
NOTE: Outliers are points that lie far away from the usual group of points. These points can have a strong influence on the orientation of the best fit line.
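To make the outlier effect concrete, here is a minimal sketch in plain Python. The scores and labels are made up for illustration, and `fit_line` and `boundary` are hypothetical helper names, not part of any library. It fits a least-squares line to pass/fail labels (1 = pass, 0 = fail) and reports the score at which the line crosses 0.5, which we treat as the pass/fail boundary.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def boundary(slope, intercept, threshold=0.5):
    """Score at which the fitted line crosses the pass/fail threshold."""
    return (threshold - intercept) / slope


# Made-up scores (in percent) with labels: 1 = pass, 0 = fail.
scores = [30, 35, 40, 45, 55, 60, 65, 70]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

clean_boundary = boundary(*fit_line(scores, labels))

# Add one outlier: a student who scored 100 (still a pass).
outlier_boundary = boundary(*fit_line(scores + [100], labels + [1]))

print(f"boundary without outlier: {clean_boundary:.2f}")
print(f"boundary with outlier:    {outlier_boundary:.2f}")
```

In this toy data the boundary sits at 50 without the outlier and moves to roughly 52.7 once the single score of 100 is added, so a student scoring 52 would be predicted as failed even though they passed.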
So we use Logistic Regression for such use cases.
A sigmoid function is a function which limits the value of the output to the range between 0 and 1. Its graph is an S-shaped curve, and its equation is:
sigmoid(x) = 1 / (1 + e^(-x))
The value x in the equation is the cost value we calculate using the output and the weights. The sigmoid function makes sure that the result stays between 0 and 1 no matter how large x becomes, so a single extreme point cannot dominate.
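As a quick illustration, here is a small sketch of the sigmoid in plain Python, assuming the standard form 1 / (1 + e^(-x)). The function name `sigmoid` is just a convenient choice. It branches on the sign of x so that `math.exp` never receives a large positive argument.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form e^x / (1 + e^x) to avoid overflow.
    z = math.exp(x)
    return z / (1.0 + z)


# Even very large inputs stay inside the [0, 1] range.
for value in (-1000, -2, 0, 2, 1000):
    print(value, sigmoid(value))
```

An input of 0 maps to 0.5, large positive inputs approach 1, and large negative inputs approach 0.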
The cost function in the case of logistic regression is built up as described in the points below. It should be maximized for correct classification to happen.
- Suppose passing students are considered the positive output and failed students the negative output. Likewise, points that fall above the classifying plane are taken as positive and those below it as negative, i.e., the sign of the value obtained with the weight Wi.
- The cost function is determined by the product of these two. Hence, if a point is correctly classified, both have the same sign and the cost function is positive. If it is incorrectly classified, the signs differ and the cost function becomes negative.
- So we have to choose the weight values carefully, to make sure that the points are correctly classified.
- Hence, by looking at the cost function, we can tell whether a particular entry is correctly classified or not.
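The sign check described in the points above can be sketched as follows. This is only an illustration under stated assumptions: labels are encoded as +1 (pass) and -1 (fail), the plane value is taken to be `weight * score + bias`, and the weight, bias, and student data are all made up. `cost_term` is a hypothetical helper name.

```python
def cost_term(label, score, weight, bias):
    """Signed product: label (+1 pass, -1 fail) times the plane value."""
    return label * (weight * score + bias)


# Hypothetical weight and bias that place the plane at a score of 50.
WEIGHT, BIAS = 1.0, -50.0

# (score in percent, true label)
students = [(30, -1), (45, -1), (60, +1), (80, +1)]

good_terms = [cost_term(y, x, WEIGHT, BIAS) for x, y in students]

# A poorly chosen bias moves the plane to a score of 70.
poor_terms = [cost_term(y, x, WEIGHT, -70.0) for x, y in students]

for (x, y), good, poor in zip(students, good_terms, poor_terms):
    print(f"score={x} label={y:+d} good={good:+.1f} poor={poor:+.1f}")
```

With the plane at 50 every term is positive, so every student is correctly classified. With the plane moved to 70, the student who scored 60 gets a negative term, which flags that entry as misclassified.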
Hope you found this post useful. 😊😊
You can find me on Twitter