Dark Mode On/Off

Interactive Learning

C Language course

GO Lang course

Learn JavaScript

Learn HTML

Learn CSS

C Language

C Tutorial

C Programs (100+)

C Compiler

Execute C programs online.

C++ Language

C++ Tutorial

Standard Template Library

C++ Programs (100+)

C++ Compiler

Execute C++ programs online.

Python

Python Tutorial

Python Projects

Python Programs

Python How Tos

Numpy Module

Matplotlib Module

Tkinter Module

Network Programming with Python

Learn Web Scraping

Linear Regression with Gradient Descent

Technology #Linear Regression#Gradient Decent#ML#AI

In a regression problem, the program predicts real-valued output. In linear regression, we have a training set or dataset in which we have to fit the best possible straight line or say the best possible linear relation with the least chance of error.

Regression problem:

Let's say that we have a data set of used vehicles that consists of a relation between kilometers driven and price. We have observed the sale of used vehicles for 6 months and came up with this data, which we will term as our training set. Using the training set a linear relation(straight line) has to be generated with error as low as possible. The linear relation( let's name it hypothesis) will now be able to predict prices for other vehicles whenever we give it the input of kilometers driven.

Linear relation can be represented as:

O = a1 + (a2)*x { or y = mx+c equation of a straight line } ……………..(1)

logistic regression with gradient decent

Our dataset includes the sale statement of 50 used vehicles. So let's denote our number of training examples by T,

T = 50

And, x = input (which we will enter), and

O = output (predicted price by the program)

Now as you can see in the first equation above a1 and a2 are the parameters which will determine the linear relation or in simple words the orientation of the line (like vertical, horizontal etc).

The values of the parameters a1 and a2 should be selected or chosen such that the error possibility is least. Here the sum of squared error will be our first source to evaluate the relation or model, after which we have to work on our hypothesis which generate the best possible line with least error chances.

Here the squaring of error is necessary for eliminating the negative value. And after squaring the error we will get more accurate value.

Our hypothesis will be,

H(x) = a1 + (a2)*x

So our squared error cost function or mean squared error would look like this:

logistic regression with gradient decent

About H(x),

For fixed a2 this is a function of x,
For different values of a2 slope of this function varies (something like this),

About E(a2) {let's assume E for only one parameter for better understanding}

Function of parameter a2,
Controls the slope of the line (as seen below),

Now for minimizing the squared error cost function E, we have an algorithm called Gradient Descent. It is used all over in machine learning. Now let's look at the problem, we have E(a1, a2) and where we have a1 and a2 such that E(a1, a2) is minimum.

So what we are going to do is:

Start with some random values of a1 and a2.
Keep changing a1, a2 to reduce E(a1, a2) until we reach the minimum.

Gradient Descent Algorithm

Repeat until convergence {

logistic regression with gradient decent

}

Now we also have to update the values of our parameters simultaneously.

The correct way is:

logistic regression with gradient decent

a1 := Variable1

a2 := Variable2

here := is the assignment operator and = is considered as truth assertion and ? is the learning rate.

The incorrect way is:

logistic regression with gradient decent

About parameter a2,

If a2 > 0, then x(input or feature or predictor) and O(output or target) have a positive relationship, i.e. if x increase then O will also increase.
If a2 < 0, then x(input or feature or predictor) and O(output or target) have a negative relationship, i.e. if x increase then O will decrease.

About parameter a1,

It determines the intercept of the line generated by our hypothesis.

About learning rate ?,

If the learning rate is too small, gradient descent can be very slow.
If the learning rate is too large then the gradient descent can exceed or overshoot the minimum point in the function. As a result of which it may fail to converge or even diverge in some cases.
Gradient descent can converge to a local minimum, even with the learning rate ? is fixed.
As we approach a local minimum, gradient descent will automatically take a smaller step, hence we don't need to decrease ? overtime. We can see that in the image below,

Putting it all together,

Gradient Descent Algorithm is,

Repeat until convergence {

logistic regression with gradient decent

}

logistic regression with gradient decent

Now lets combine them together, for that simplification we need to do the partial derivative of E(a1, a2) with respect to a1 and a2,

logistic regression with gradient decent

[ substituting the value of H(x)]

After doing the partial derivative we get,

logistic regression with gradient decent

so here's how our algorithm looks like,

Repeat until convergence {

logistic regression with gradient decent

} and we have to update it simultaneously.

logistic regression with gradient decent

Want to learn coding?

Try our new interactive courses.

View All →

C Language Course^NEW

115+ Coding Exercise

GO Language Course

4.5 (50+) | 500+ users

JS Language Course

4.5 (343+) | 6k users

CSS Language Course

4.5 (306+) | 3.3k users

HTML Course

4.7 (2k+ ratings) | 13.5k learners

Over 20,000+ students enrolled.

Shashwat45

Shashwat Pandey is an accomplished technical author with a wealth of experience in writing about computer networks. His passion for the subject extends beyond his writing, and he enjoys exploring new networking technologies to stay on the trend.

C TUTORIAL

C PROGRAMS

INTERVIEW TESTS

EXECUTE CODE

C++ TUTORIAL

C++ PROGRAMS

INTERVIEW TESTS

EXECUTE CODE

PYTHON TUTORIAL

PYTHON HOW TOS

INTERVIEW TESTS

EXECUTE CODE

JAVA TUTORIAL

JAVA CODE EXAMPLES

SPRING TUTORIAL

MORE IN JAVA

COMPUTER ARCHITECTURE

COMPUTER NETWORK

OPERATING SYSTEM

DBMS & SQL

PL/SQL

MongoDB

EXECUTE SQL

ANDROID DEVELOPMENT

GO LANGUAGE

LINUX

DOCKER

HTML TAGS (A to Z)

CSS REFERENCES

SASS/SCSS

KOTLIN

GAME DEVELOPMENT

PHP

GIT GUIDE

JAVASCRIPT

ADVANCED DSA

Linear Regression with Gradient Descent

Table of Contents

Regression problem:

Gradient Descent Algorithm

IF YOU LIKE IT, THEN SHARE IT

RELATED POSTS