python – Sklearn Lasso regression is orders of magnitude worse than Ridge regression?


I have implemented Ridge and Lasso regression using the sklearn.linear_model module.

However, Lasso regression performs about 3 orders of magnitude worse than Ridge on the same dataset!

I'm not sure what the problem is, because mathematically this shouldn't happen. Here is my code:

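import numpy as np
from sklearn import linear_model
# sklearn.cross_validation was removed in later releases; model_selection provides train_test_split
from sklearn.model_selection import train_test_split

def ridge_regression(X_train, Y_train, X_test, Y_test, model_alpha):
    clf = linear_model.Ridge(alpha=model_alpha)
    clf.fit(X_train, Y_train)
    predictions = clf.predict(X_test)
    # Sum of squared errors on the held-out set
    loss = np.sum((predictions - Y_test)**2)
    return loss

def lasso_regression(X_train, Y_train, X_test, Y_test, model_alpha):
    clf = linear_model.Lasso(alpha=model_alpha)
    clf.fit(X_train, Y_train)
    predictions = clf.predict(X_test)
    loss = np.sum((predictions - Y_test)**2)
    return loss

# X and Y are the feature matrix and targets (not shown in the question)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, random_state=0)

for alpha in [0, 0.01, 0.1, 0.5, 1, 2, 5, 10, 100, 1000, 10000]:
    print("Lasso loss for alpha=" + str(alpha) + ": " + str(lasso_regression(X_train, Y_train, X_test, Y_test, alpha)))

for alpha in [1, 1.25, 1.5, 1.75, 2, 5, 10, 100, 1000, 10000, 100000, 1000000]:
    print("Ridge loss for alpha=" + str(alpha) + ": " + str(ridge_regression(X_train, Y_train, X_test, Y_test, alpha)))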

Here is my output:

Lasso loss for alpha=0: 20575.7121727
Lasso loss for alpha=0.01: 19762.8763969
Lasso loss for alpha=0.1: 17656.9926418
Lasso loss for alpha=0.5: 15699.2014387
Lasso loss for alpha=1: 15619.9772649
Lasso loss for alpha=2: 15490.0433166
Lasso loss for alpha=5: 15328.4303197
Lasso loss for alpha=10: 15328.4303197
Lasso loss for alpha=100: 15328.4303197
Lasso loss for alpha=1000: 15328.4303197
Lasso loss for alpha=10000: 15328.4303197
Ridge loss for alpha=1: 61.6235890425
Ridge loss for alpha=1.25: 61.6360790934
Ridge loss for alpha=1.5: 61.6496312133
Ridge loss for alpha=1.75: 61.6636076713
Ridge loss for alpha=2: 61.6776331539
Ridge loss for alpha=5: 61.8206621527
Ridge loss for alpha=10: 61.9883144732
Ridge loss for alpha=100: 63.9106882674
Ridge loss for alpha=1000: 69.3266510866
Ridge loss for alpha=10000: 82.0056669678
Ridge loss for alpha=100000: 88.4479064159
Ridge loss for alpha=1000000: 91.7235727543

Any idea why?

Thanks!

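Solution

Interesting question. I can confirm that this is not a problem with the algorithm's implementation, but a correct response to your input.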

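Here's a thought: from your description, I believe you are not normalizing your data. That can lead to instability, because your features have significantly different orders of magnitude and variance. Lasso is more "all or nothing" than ridge (you have probably noticed that it sets many more coefficients to exactly 0 than ridge does), so that instability gets magnified.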
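The "all or nothing" behaviour can be checked by counting coefficients that are exactly zero. Below is a minimal sketch on synthetic data; `count_zero_coefficients`, `X_demo`, `y_demo` and the chosen `alpha` values are illustrative assumptions, not part of the original question, whose data is not shown.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge


def count_zero_coefficients(model):
    """Return how many fitted coefficients are exactly zero (hypothetical helper)."""
    return int(np.sum(model.coef_ == 0))


# Synthetic data (assumption): only the first two of ten features matter.
rng = np.random.RandomState(0)
X_demo = rng.randn(200, 10)
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y_demo = X_demo @ true_coef + 0.1 * rng.randn(200)

lasso = Lasso(alpha=0.5).fit(X_demo, y_demo)
ridge = Ridge(alpha=0.5).fit(X_demo, y_demo)

lasso_zeros = count_zero_coefficients(lasso)
ridge_zeros = count_zero_coefficients(ridge)
print("Lasso zero coefficients:", lasso_zeros)
print("Ridge zero coefficients:", ridge_zeros)
```

Running the same count on the models fitted to the real `X_train` and `Y_train` would show whether Lasso has dropped most or all of the features there.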

Try normalizing your data and see whether you like the results better.
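One way to try this, sketched under assumptions: standardize each feature with `StandardScaler` inside a pipeline, so that the scaling statistics come from the training split only, then compute the same sum-of-squares loss. The helper `scaled_loss`, the synthetic `X_demo`/`Y_demo` and the `alpha` values are hypothetical stand-ins for the question's own data and settings.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def scaled_loss(estimator, X_train, Y_train, X_test, Y_test):
    """Fit on standardized features and return the test sum of squared errors."""
    model = make_pipeline(StandardScaler(), estimator)
    model.fit(X_train, Y_train)
    predictions = model.predict(X_test)
    return float(np.sum((predictions - Y_test) ** 2))


# Synthetic data (assumption): features on very different scales.
rng = np.random.RandomState(0)
scales = np.array([1.0, 10.0, 100.0, 1000.0, 0.01])
X_demo = rng.randn(300, 5) * scales
Y_demo = X_demo @ (np.array([1.0, -2.0, 0.5, 0.0, 3.0]) / scales) + 0.1 * rng.randn(300)

X_tr, X_te, Y_tr, Y_te = train_test_split(X_demo, Y_demo, test_size=0.1, random_state=0)

lasso_loss = scaled_loss(Lasso(alpha=0.01), X_tr, Y_tr, X_te, Y_te)
ridge_loss = scaled_loss(Ridge(alpha=1.0), X_tr, Y_tr, X_te, Y_te)
print("Lasso loss (standardized):", lasso_loss)
print("Ridge loss (standardized):", ridge_loss)
```

Note that once features are standardized, the useful range of `alpha` for Lasso may differ from the grid in the question, so it is worth re-running the sweep rather than reusing a single value.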

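Another thought: this may be intentional on the part of the Berkeley instructors, to highlight the fundamentally different behaviour of ridge and lasso.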


Original address: http://outofmemory.cn/langs/1197522.html
