Simple Linear Regression

Model Training and Prediction Using Derived Formulas

In [1]:
import pandas as pd
import seaborn as sns
import plotly.express as px
import matplotlib.pyplot as plt
import numpy as np

Consider the tips dataset. Each row represents one party that dined at a restaurant. For example, the first row records a party of 2 eating dinner on a Sunday, where the person who paid the check was female and not a smoker. This party had a total bill of \$16.99 and tipped \$1.01.

In [2]:
df = sns.load_dataset("tips")
df.head()
Out[2]:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

In many cultures, it is customary to tip based on the total bill. This data appears to have been collected from such a culture.
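
To make the idea of tipping "based on the total bill" concrete, here is a minimal sketch that computes the tip as a fraction of the bill for the five rows shown above. The rows are retyped by hand so the snippet runs without downloading anything, and the `sample` frame and `tip_rate` column are illustrative names that are not part of the original dataset.

```python
import pandas as pd

# The five rows from the df.head() output above, retyped by hand
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
})

# tip_rate is a hypothetical helper column: tip as a fraction of the bill
sample["tip_rate"] = sample["tip"] / sample["total_bill"]

print(sample.round(3))
```

With the full dataset loaded, the same expression should work on `df` directly, since it has the same `total_bill` and `tip` columns.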

In [3]:
# Scatter plot of tip against total bill, one point per party
px.scatter(df, x="total_bill", y="tip")