- Error code:

```scala
import org.apache.spark.ml.feature.BucketedRandomProjectionLSH
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val dfA = spark.createDataFrame(Seq(
  (0, Vectors.dense(1.0, 1.0)),
  (1, Vectors.dense(1.0, -1.0)),
  (2, Vectors.dense(-1.0, -1.0)),
  (3, Vectors.dense(-1.0, 1.0))
)).toDF("id", "features")

val dfB = spark.createDataFrame(Seq(
  (4, Vectors.dense(1.0, 0.0)),
  (5, Vectors.dense(-1.0, 0.0)),
  (6, Vectors.dense(0.0, 1.0)),
  (7, Vectors.dense(0.0, -1.0))
)).toDF("id", "features")

val key = Vectors.dense(1.0, 0.0)

val brp = new BucketedRandomProjectionLSH()
  .setBucketLength(2.0)
  .setNumHashTables(3)
  .setInputCol("features")
  .setOutputCol("hashes")

val model = brp.fit(dfA)

// Feature transformation
println("The hashed dataset where hashed values are stored in the column 'hashes':")
model.transform(dfA).show()

// Compute the locality sensitive hashes for the input rows, then perform approximate
// similarity join.
// We could avoid computing hashes by passing in the already-transformed dataset, e.g.
// `model.approxSimilarityJoin(transformedA, transformedB, 1.5)`
println("Approximately joining dfA and dfB on Euclidean distance smaller than 1.5:")
model.approxSimilarityJoin(dfA, dfB, 1.5, "EuclideanDistance")
  .select(col("datasetA.id").alias("idA"),
    col("datasetB.id").alias("idB"),
    col("EuclideanDistance")).show()

// Compute the locality sensitive hashes for the input rows, then perform approximate nearest
// neighbor search.
// We could avoid computing hashes by passing in the already-transformed dataset, e.g.
// `model.approxNearestNeighbors(transformedA, key, 2)`
println("Approximately searching dfA for 2 nearest neighbors of the key:")
model.approxNearestNeighbors(dfA, key, 2).show()
```

- Error message:

```
java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse(Lorg/json4s/JsonInput;Z)
```

- Troubleshooting:
If a dependency is declared in the pom file, and after packaging the jar can actually be found in the expected location, yet the job still throws `NoSuchMethodError`, it is almost always a version mismatch between packages: a version conflict causes this error. So the next step is to check whether my dependency versions conflict:
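A quick way to see which json4s versions Maven actually resolves is the dependency tree. A minimal sketch, assuming Maven is on the `PATH` and the command is run from the project root containing `pom.xml`:

```shell
# Show every resolved org.json4s artifact and which dependency pulled it in.
# -Dincludes filters the tree output to org.json4s artifacts only.
mvn dependency:tree -Dincludes=org.json4s
```

If the tree shows a json4s version different from the one Spark ships, that is the conflict.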
The json4s version in the local IDEA project:
```xml
<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-jackson_2.11</artifactId>
  <version>3.5.3</version>
</dependency>
<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-native_2.11</artifactId>
  <version>3.5.3</version>
</dependency>
```
The Spark version in the environment:
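To find the json4s version the cluster's Spark actually ships, one can list the jars bundled with the Spark installation. A sketch, assuming `$SPARK_HOME` points at the Spark install directory on the cluster:

```shell
# List the json4s jars bundled with Spark; the version in the file name
# is the one the driver and executors load at runtime.
ls "$SPARK_HOME/jars" | grep json4s
```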
After checking, the versions did indeed conflict, so changing the json4s version in the pom file is enough to get the job running.
Possible solutions:

1. Change the jar version in the local pom file to match the one shipped with the environment's Spark.
2. Use shade packaging to relocate the conflicting classes under a custom package name; since the class names then differ from those inside Spark, both copies can coexist without conflict.
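For option 1, the fix is a one-line version change in the pom. The fragment below is a sketch: `3.2.11` is only a placeholder for whatever json4s version the cluster's Spark actually ships (check its jars directory), not a recommendation:

```xml
<!-- Placeholder version: replace 3.2.11 with the json4s version bundled
     with the cluster's Spark distribution. -->
<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-jackson_2.11</artifactId>
  <version>3.2.11</version>
</dependency>
```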
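For option 2, the maven-shade-plugin can relocate the json4s packages at package time so the copy bundled into the application jar no longer clashes with the one inside Spark. A minimal sketch, assuming a standard shade-plugin setup (the `shadedPattern` name is arbitrary):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite org.json4s.* references in our jar into a private
               package so they cannot collide with Spark's own json4s. -->
          <relocation>
            <pattern>org.json4s</pattern>
            <shadedPattern>shaded.org.json4s</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites the bytecode references as well, so the application transparently uses its own shaded json4s copy while Spark keeps using its bundled one.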