Discrete first-order method vs. Lasso
Comparison of the support recovery of the discrete first-order method and Lasso regression on a synthetic dataset.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from discretefirstorder import DFORegressor
Creating a synthetic dataset with some uninformative features
We create a synthetic dataset with only 10 informative features. We keep the true coefficients for comparison with the estimates.
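A minimal sketch of this step, assuming scikit-learn's `make_regression` with `coef=True` so that the true coefficients are returned alongside the data. Only the 10 informative features come from the text above; the sample count, total feature count, noise level, test fraction, and random seeds are assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# n_informative=10 follows the text; every other value here is an assumption.
X, y, coef = make_regression(
    n_samples=200,
    n_features=30,
    n_informative=10,
    noise=10.0,
    coef=True,  # also return the ground-truth coefficients
    random_state=0,
)

# Hold out part of the data to score both models on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
```

With `coef=True`, the returned `coef` array is zero for every uninformative feature, which is what makes it usable as ground truth for support recovery.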
Lasso
lasso = Lasso()
lasso.fit(X_train, y_train)
lasso_score = lasso.score(X_test, y_test)
lasso_coef_error = np.sqrt(np.sum((lasso.coef_ - coef) ** 2))
DFO
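The plotting code further down expects a fitted estimator `dfo` plus `dfo_score` and `dfo_coef_error`, presumably obtained from `DFORegressor` in the same way as for Lasso. To illustrate the idea behind discrete first-order methods, here is a NumPy-only sketch of projected gradient descent with hard thresholding for best-subset least squares. It is not the `DFORegressor` implementation; the function names, the iteration limit, the tolerance, and the demo data are all hypothetical.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and set the rest to zero."""
    out = np.zeros_like(v)
    if k <= 0:
        return out
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out


def discrete_first_order(X, y, k, n_iter=500, tol=1e-8):
    """Illustrative sketch: minimise 0.5 * ||y - X b||^2 with at most k non-zeros."""
    # Squared spectral norm of X: Lipschitz constant of the least-squares gradient.
    lipschitz = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        # Gradient step followed by projection onto k-sparse vectors.
        new_beta = hard_threshold(beta - grad / lipschitz, k)
        converged = np.linalg.norm(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Small demo problem (hypothetical data, unrelated to the dataset above).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((100, 20))
beta_true = np.zeros(20)
beta_true[:3] = [4.0, -3.0, 2.0]
y_demo = X_demo @ beta_true + 0.1 * rng.standard_normal(100)

beta_hat = discrete_first_order(X_demo, y_demo, k=3)
print(np.flatnonzero(beta_hat))  # indices of the selected features
```

Unlike Lasso, which shrinks coefficients through an L1 penalty, this scheme enforces the sparsity level `k` directly, so the estimate never has more than `k` non-zero entries.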
Comparison of the estimated coefficients
In this case, the discrete first-order method appears to recover the ground-truth coefficients more faithfully than Lasso.
fig, ax = plt.subplots(1, 1, figsize=(9, 12))
index = np.arange(X.shape[1])
bar_width = 0.25
ax.barh(index, coef, bar_width, label="Ground Truth")
ax.barh(
    index + bar_width,
    dfo.coef_,
    bar_width,
    label=f"DFO - R2={dfo_score:.4f}; coef. error={dfo_coef_error:.2f}",
)
ax.barh(
    index + 2 * bar_width,
    lasso.coef_,
    bar_width,
    label=f"Lasso - R2={lasso_score:.4f}; coef. error={lasso_coef_error:.2f}",
)
ax.set_xlabel("Coefficient value")
ax.set_ylabel("Feature")
ax.set_title(
    "Lasso and DFO estimated regression coefficients compared to ground truth"
)
ax.set_yticks(index + bar_width)
ax.set_yticklabels(["feat_" + str(i) for i in range(X.shape[1])])
ax.grid(True)
ax.legend()
plt.show()
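Beyond the bar chart, support recovery can also be summarised numerically by comparing which coefficients are non-zero in the estimate and in the ground truth. The helper below is a hypothetical sketch (its name, the tolerance, and the toy arrays are assumptions, not part of the example); it could be applied to `coef` and any fitted `coef_` array.

```python
import numpy as np


def support_recovery(true_coef, est_coef, tol=1e-8):
    """Precision and recall of the estimated support (non-zero pattern)."""
    true_support = np.abs(true_coef) > tol
    est_support = np.abs(est_coef) > tol
    true_positives = np.sum(true_support & est_support)
    # Guard against division by zero when a support is empty.
    precision = true_positives / max(np.sum(est_support), 1)
    recall = true_positives / max(np.sum(true_support), 1)
    return float(precision), float(recall)


# Toy vectors: the estimate finds feature 0, misses feature 2,
# and wrongly selects feature 1.
true_coef = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
est_coef = np.array([2.5, 0.4, 0.0, 0.0, 0.0])
precision, recall = support_recovery(true_coef, est_coef)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision penalises spurious features, while recall penalises informative features that were dropped, so the two together describe support recovery more directly than the coefficient error alone.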
