What is Linear Regression?
Linear regression is one of the fundamental machine learning algorithms, and it is usually the gateway through which beginners enter the field. It is a statistical technique used to model the relationship between an input (the independent variable) and an output (the dependent variable). The simplest form of the regression equation, with one dependent and one independent variable, is y = mx + c, where y is the estimated dependent variable score, c is the constant (intercept), m is the regression coefficient (slope), and x is the score on the independent variable.
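As a minimal sketch of the y = mx + c form, the slope and intercept can be estimated with the ordinary least-squares formulas; the data below (hours studied vs. exam score) is made up purely for illustration:

```python
import numpy as np

# Hypothetical data: hours studied (x) vs. exam score (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 70.0])

# Closed-form least-squares estimates for y = m*x + c
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
c = y.mean() - m * x.mean()

print(f"slope m = {m:.2f}, intercept c = {c:.2f}")
print(f"predicted score for x = 6: {m * 6 + c:.2f}")
```

The slope formula is the sample covariance of x and y divided by the variance of x, which is exactly what the least-squares criterion yields for one independent variable.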
Three major uses for regression analysis are (1) determining the strength of predictors, (2) forecasting an effect, and (3) trend forecasting.
First, regression can be used to identify the strength of the effect that the independent variable(s) have on the dependent variable. Typical questions are: what is the strength of the relationship between dose and effect, between sales and marketing spending, between age and income, or between area and profit?
Second, it can be used to forecast the effect or impact of changes. That is, regression analysis helps us understand how much the dependent variable changes with a change in one or more independent variables. A typical question is, “how much additional sales income do I get for each additional $1000 spent on marketing?”
Third, regression analysis predicts trends and future values; it can be used to obtain point estimates. A typical question is, “what will the price of gold be in 6 months?”
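A small illustration of uses (2) and (3): with hypothetical marketing-spend and sales figures (both in $1000s, invented for this example), the fitted slope answers the “income per additional $1000” question, and the fitted line gives a point estimate for a new spend level:

```python
import numpy as np

# Hypothetical data: marketing spend in $1000s (x) vs. sales income in $1000s (y)
spend = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
sales = np.array([120.0, 160.0, 210.0, 250.0, 300.0])

# np.polyfit with degree 1 returns [slope, intercept] for a least-squares line
slope, intercept = np.polyfit(spend, sales, 1)

# The slope is the marginal effect: extra sales per extra $1000 of marketing
print(f"each additional $1000 of marketing adds about ${slope * 1000:.0f} in sales")

# Point estimate (trend forecast) for a spend of $60k
print(f"forecast sales at $60k spend: ${(slope * 60 + intercept) * 1000:,.0f}")
```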
Types of Linear Regression
Simple linear regression: 1 independent variable, 1 dependent variable (interval or ratio, i.e. continuous)
Multiple linear regression: 2+ independent variables, 1 dependent variable (interval or ratio, i.e. continuous)
Logistic regression: 2+ independent variables (interval or ratio or dichotomous), 1 dependent variable (dichotomous)
Ordinal regression: 1+ independent variables (nominal or dichotomous), 1 dependent variable (ordinal)
Multinomial regression: 1+ independent variables (interval or ratio or dichotomous), 1 dependent variable (nominal)
Discriminant analysis: 1+ independent variables (interval or ratio), 1 dependent variable (nominal)
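To sketch the multiple-variable case, a multiple linear regression with two independent variables and one continuous dependent variable can be fit with ordinary least squares; the data (floor area, building age, price) is invented for illustration:

```python
import numpy as np

# Hypothetical data: two independent variables (area in m², age in years)
# and one continuous dependent variable (price in $1000s)
X = np.array([[50.0, 10.0],
              [70.0,  5.0],
              [80.0, 20.0],
              [100.0, 2.0],
              [120.0, 15.0]])
y = np.array([150.0, 220.0, 210.0, 330.0, 340.0])

# Add a column of ones so the model includes an intercept term
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution for price = b0 + b1*area + b2*age
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coef
print(f"intercept={b0:.2f}, area coef={b1:.2f}, age coef={b2:.2f}")
```

Each coefficient is interpreted holding the other variables fixed, e.g. b1 is the estimated price change per extra square metre at a given age.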
Is scaling required for linear regression?
It depends on how you solve for the optimal solution. Say you're performing an OLS regression. There are two ways to find the optimal solution: the analytical (closed-form) solution, or an iterative algorithm such as gradient descent.
If you're using the analytical solution, feature scaling won't be of much use. In fact, you may want to refrain from feature scaling so that the coefficients stay easy to interpret in the original units. However, if you are using gradient descent, feature scaling will help the solution converge in a shorter period of time.
Note: both methods arrive at the same solution, since the problem is convex. In higher dimensions (when you have a lot of independent variables), using unscaled features with gradient descent is like rolling a ball down a taco shell: inefficient.
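A rough demonstration of that note, on synthetic data with two features on very different scales: the closed-form solution needs no scaling, while gradient descent on standardized features converges to the same fit because the loss is convex:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with wildly different feature scales
n = 200
x1 = rng.uniform(0, 1, n)       # small-scale feature
x2 = rng.uniform(0, 1000, n)    # large-scale feature
y = 3.0 * x1 + 0.05 * x2 + 2.0 + rng.normal(0, 0.1, n)

X = np.column_stack([np.ones(n), x1, x2])

# Analytical (normal-equation) solution: no scaling needed
w_exact = np.linalg.lstsq(X, y, rcond=None)[0]

def gradient_descent(X, y, lr, steps):
    """Plain batch gradient descent on the mean-squared-error loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * (X.T @ (X @ w - y)) / len(y)
    return w

# Standardize the non-intercept columns so gradient descent converges quickly
mu, sigma = X[:, 1:].mean(axis=0), X[:, 1:].std(axis=0)
Xs = np.column_stack([np.ones(n), (X[:, 1:] - mu) / sigma])
w_scaled = gradient_descent(Xs, y, lr=0.1, steps=2000)

# Both parameterizations span the same column space and minimize the
# same convex loss, so the fitted predictions agree
pred_exact = X @ w_exact
pred_gd = Xs @ w_scaled
print(np.max(np.abs(pred_exact - pred_gd)))
```

Running the same gradient descent on the unscaled X would require a far smaller learning rate (to avoid divergence on the large-scale feature) and many more steps, which is exactly the "taco shell" effect.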
Is linear regression sensitive to outliers?
First, linear regression assumes that the relationship between the independent and dependent variables is linear. It is also important to check for outliers, since linear regression is sensitive to them: an outlier in the data pulls the slope of the fitted line toward itself, which leads to errors in the predictions. Detecting and then removing or otherwise handling outliers is the best approach for getting good accuracy.
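A tiny synthetic example of this sensitivity: five points on the exact line y = 2x + 1 give a slope of 2, but corrupting a single point pulls the fitted slope far away:

```python
import numpy as np

# Data lying exactly on the line y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1

slope_clean, _ = np.polyfit(x, y, 1)

# Introduce one outlier: the last point's y value is wildly off
y_out = y.copy()
y_out[-1] = 40.0   # the true value on the line would be 11

slope_out, _ = np.polyfit(x, y_out, 1)

print(f"slope without outlier: {slope_clean:.2f}")  # 2.00
print(f"slope with one outlier: {slope_out:.2f}")
```

Because ordinary least squares minimizes *squared* residuals, a single extreme point contributes disproportionately to the loss and drags the whole line toward it.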