import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import plotly.figure_factory as ff
from plotly.subplots import make_subplots
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
In this notebook, we will use basic feature transformations (feature engineering) to model non-linear relationships using linear models.
To make the model-fitting process easy to visualize, we will use a simple synthetic dataset.
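As a minimal sketch of the idea, the snippet below fits a linear model to a one-dimensional quadratic relationship, first on the raw input and then with a squared feature added. The data here are generated on the spot purely for illustration (the quadratic target, noise level, and names such as `X_raw` and `X_feat` are assumptions, not part of the notebook's dataset).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Hypothetical toy data: y depends on x quadratically, plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = x**2 + rng.normal(scale=0.1, size=200)

# Raw feature only: the model can fit nothing but a straight line in x.
X_raw = x.reshape(-1, 1)
# Engineered features: the model is still linear in its parameters,
# but it can now represent a parabola in x.
X_feat = np.column_stack([x, x**2])

model_raw = LinearRegression().fit(X_raw, y)
model_feat = LinearRegression().fit(X_feat, y)

mse_raw = mean_squared_error(y, model_raw.predict(X_raw))
mse_feat = mean_squared_error(y, model_feat.predict(X_feat))
print(f"MSE with raw feature: {mse_raw:.3f}")
print(f"MSE with squared feature added: {mse_feat:.3f}")
```

The model with the engineered feature should reach a much lower training error, even though both fits use the same `LinearRegression` estimator.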
data = pd.read_csv("data/synthetic_data.csv")
data.head()
| | X0 | X1 | Y |
|---|---|---|---|
| 0 | -1.254599 | 4.507143 | 0.669332 |
| 1 | 2.319939 | 0.986585 | 5.430523 |
| 2 | -3.439814 | -3.440055 | 7.640933 |
| 3 | -4.419164 | 3.661761 | -1.836007 |
| 4 | 1.011150 | 2.080726 | 6.148833 |
We can visualize the data in three dimensions:
data_scatter = go.Scatter3d(x=data["X0"], y=data["X1"], z=data["Y"],
                            mode="markers",
                            marker=dict(size=2))
layout = dict(margin=dict(l=0, r=0, t=0, b=0),
              height=600,
              scene=dict(xaxis_title='X0', yaxis_title='X1', zaxis_title='Y'))
go.Figure([data_scatter], layout)