Just replace:

    # load classifier and predict
    classifier = joblib.load('class.pkl')

    # vectorize/transform the new title then predict
    vectorizer = TfidfVectorizer(sublinear_tf=True, max_df=0.5, ngram_range=(1, 3))
    X_test = vectorizer.transform(title)
    predict = classifier.predict(X_test)
    return predict
with:

    # load the saved pipeline that includes both the vectorizer
    # and the classifier, then predict on the raw title text
    # (the pipeline vectorizes it internally, so no X_test is needed)
    classifier = joblib.load('class.pkl')
    predict = classifier.predict(title)
    return predict
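class.pkl contains the complete pipeline, so there is no need to create a new vectorizer instance. As the error message says, you need to reuse the vectorizer that was originally fitted, because the feature mapping from tokens (string n-grams) to column indices is stored in the vectorizer itself. This mapping is called the "vocabulary".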
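
For context, here is a minimal sketch of how such a pipeline might be built, saved, and reloaded. The training titles, the labels, the `LogisticRegression` classifier, and the temporary file location are all assumptions made for illustration; the original question does not show its training code, and only the vectorizer settings and the `class.pkl` file name are taken from the snippets above.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy training data (not from the original question).
titles = [
    "cheap flights to paris",
    "best hotel deals in rome",
    "python pandas dataframe tutorial",
    "learn python list comprehension",
]
labels = ["travel", "travel", "code", "code"]

# Bundle the vectorizer and the classifier so they are fitted together
# and saved together. The classifier choice is an assumption.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(sublinear_tf=True, max_df=0.5, ngram_range=(1, 3))),
    ("clf", LogisticRegression()),
])
pipeline.fit(titles, labels)

# Persist the whole pipeline; the fitted vocabulary is stored with it.
model_path = os.path.join(tempfile.mkdtemp(), "class.pkl")
joblib.dump(pipeline, model_path)

# Later, possibly in another process: load and predict on raw text.
loaded = joblib.load(model_path)
vocabulary = loaded.named_steps["tfidf"].vocabulary_  # token -> column index
prediction = loaded.predict(["python dataframe tutorial"])
```

Because the fitted `TfidfVectorizer` travels inside the pickled pipeline, the loaded object maps new titles to the same column indices the classifier was trained on, which is what a freshly constructed, unfitted vectorizer cannot do.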