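However, Lasso regression seems to do three orders of magnitude worse on the same dataset!
I'm not sure what the problem is, since mathematically this shouldn't happen. Here is my code: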
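    import numpy as np
    from sklearn import linear_model
    # Older scikit-learn releases exposed this as sklearn.cross_validation.train_test_split.
    from sklearn.model_selection import train_test_split

    def ridge_regression(X_train, Y_train, X_test, Y_test, model_alpha):
        # Fit ridge regression and return the summed squared error on the test set.
        clf = linear_model.Ridge(alpha=model_alpha)
        clf.fit(X_train, Y_train)
        predictions = clf.predict(X_test)
        loss = np.sum((predictions - Y_test)**2)
        return loss

    def lasso_regression(X_train, Y_train, X_test, Y_test, model_alpha):
        # Fit lasso regression and return the summed squared error on the test set.
        clf = linear_model.Lasso(alpha=model_alpha)
        clf.fit(X_train, Y_train)
        predictions = clf.predict(X_test)
        loss = np.sum((predictions - Y_test)**2)
        return loss

    # X and Y are the feature matrix and targets of the dataset (not shown in the question).
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, random_state=0)

    for alpha in [0, 0.01, 0.1, 0.5, 1, 2, 5, 10, 100, 1000, 10000]:
        print("Lasso loss for alpha=" + str(alpha) + ": " + str(lasso_regression(X_train, Y_train, X_test, Y_test, alpha)))

    for alpha in [1, 1.25, 1.5, 1.75, 2, 5, 10, 100, 1000, 10000, 100000, 1000000]:
        print("Ridge loss for alpha=" + str(alpha) + ": " + str(ridge_regression(X_train, Y_train, X_test, Y_test, alpha)))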
Here is my output:
    Lasso loss for alpha=0: 20575.7121727
    Lasso loss for alpha=0.01: 19762.8763969
    Lasso loss for alpha=0.1: 17656.9926418
    Lasso loss for alpha=0.5: 15699.2014387
    Lasso loss for alpha=1: 15619.9772649
    Lasso loss for alpha=2: 15490.0433166
    Lasso loss for alpha=5: 15328.4303197
    Lasso loss for alpha=10: 15328.4303197
    Lasso loss for alpha=100: 15328.4303197
    Lasso loss for alpha=1000: 15328.4303197
    Lasso loss for alpha=10000: 15328.4303197
    Ridge loss for alpha=1: 61.6235890425
    Ridge loss for alpha=1.25: 61.6360790934
    Ridge loss for alpha=1.5: 61.6496312133
    Ridge loss for alpha=1.75: 61.6636076713
    Ridge loss for alpha=2: 61.6776331539
    Ridge loss for alpha=5: 61.8206621527
    Ridge loss for alpha=10: 61.9883144732
    Ridge loss for alpha=100: 63.9106882674
    Ridge loss for alpha=1000: 69.3266510866
    Ridge loss for alpha=10000: 82.0056669678
    Ridge loss for alpha=100000: 88.4479064159
    Ridge loss for alpha=1000000: 91.7235727543
Any idea why?
Thanks!
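Solution
Interesting question. I can confirm that this is not a problem with the algorithm implementation but a correct response to your input. Here is one thought: from your description, I believe you are not normalizing your data. That can cause instability, because your features differ significantly in order of magnitude and variance. Lasso is more "all or nothing" than ridge (you have probably noticed that it sets many more coefficients to exactly 0 than ridge does), so the instability gets amplified.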
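The "all or nothing" behavior can be checked by counting exact zeros in the fitted coefficients. The following is only a sketch on synthetic data, not the dataset from the question: the data generator, the choice of `alpha=0.5`, and the variable names are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Hypothetical data: 10 features, of which only the first 3 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# Lasso's L1 penalty can drive coefficients to exactly zero;
# ridge's L2 penalty only shrinks them toward zero.
lasso_zeros = int(np.sum(lasso.coef_ == 0))
ridge_zeros = int(np.sum(ridge.coef_ == 0))

print("Lasso coefficients set to 0:", lasso_zeros)
print("Ridge coefficients set to 0:", ridge_zeros)
```

Running the same count on the `coef_` attribute of the models fitted to the real data would show how many features lasso discarded at each alpha.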
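Try normalizing your data and see whether you like the results better.
Another thought: this may be intentional on the part of the Berkeley instructors, to highlight the fundamentally different behavior of ridge and lasso.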
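One way to try this is to standardize the features inside a pipeline, so that the scaler is fitted on the training split only. This is a minimal sketch under stated assumptions: the synthetic data (three features on very different scales), the helper names `make_synthetic_data` and `squared_error`, and `alpha=0.1` are all hypothetical and stand in for the data and settings from the question.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_data(n_samples=200, seed=0):
    """Hypothetical stand-in data: three features on very different scales."""
    rng = np.random.default_rng(seed)
    scales = np.array([1e-3, 1.0, 1e3])
    X = rng.normal(size=(n_samples, 3)) * scales
    # Each feature contributes equally once its scale is divided out.
    y = X @ (1.0 / scales) + rng.normal(scale=0.1, size=n_samples)
    return X, y


def squared_error(model, X_train, y_train, X_test, y_test):
    """Fit the model and return the summed squared error on the test split."""
    model.fit(X_train, y_train)
    return float(np.sum((model.predict(X_test) - y_test) ** 2))


X, y = make_synthetic_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

# Lasso on the raw features: the penalty hits the small-scale feature hardest,
# because that feature needs a large coefficient to matter.
raw_lasso_loss = squared_error(Lasso(alpha=0.1), X_train, y_train, X_test, y_test)

# Same model, but with every feature rescaled to zero mean and unit variance first.
scaled_lasso = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
scaled_lasso_loss = squared_error(scaled_lasso, X_train, y_train, X_test, y_test)

print("Lasso loss on raw features:   ", raw_lasso_loss)
print("Lasso loss on scaled features:", scaled_lasso_loss)
```

The numbers from this sketch say nothing about the original dataset; the point is only that the same alpha penalizes features very differently depending on their scale, which standardization evens out.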