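Java's floating-point types (double and float) can lose precision in arithmetic, because most decimal fractions have no exact binary representation.
Elasticsearch (ES) is written in Java, so it has the same problem. The example below shows how to work around it in ES.
ES version: 6.5.4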
docker run --name es6 --net host -e "discovery.type=single-node" docker.io/elasticsearch:6.5.4
Walkthrough
Create the index:
curl -X PUT http://127.0.0.1:9200/index01
Delete the index (only if you need to start over):
curl -XDELETE http://127.0.0.1:9200/index01
Create the mapping:
curl -XPOST 'http://127.0.0.1:9200/index01/type01/_mapping?pretty' -H "Content-Type: application/json" \
-d '
{
"type01": {
"properties": {
"tm": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
},
"name": {
"type": "keyword"
},
"address": {
"type": "text"
},
"price1": {
"type": "double"
},
"price2": {
"type": "scaled_float",
"scaling_factor": 100
}
}
}
}'
View the mapping:
curl -XGET http://127.0.0.1:9200/index01?pretty
Insert data:
curl -XPOST http://127.0.0.1:9200/index01/type01/01?pretty -H "Content-Type: application/json" \
-d '{
"name":"zhangsan",
"price1":1.0,
"price2":1.0,
"tm":"2022-01-01",
"address":"beijing daxing"
}'
curl -XPOST http://127.0.0.1:9200/index01/type01/02?pretty -H "Content-Type: application/json" \
-d '{
"name":"zhangsan",
"price1":20.2,
"price2":20.2,
"tm":"2022-01-01",
"address":"beijing daxing"
}'
curl -XPOST http://127.0.0.1:9200/index01/type01/03?pretty -H "Content-Type: application/json" \
-d '{
"name":"zhangsan",
"price1":300.03,
"price2":300.03,
"tm":"2022-01-01",
"address":"beijing daxing"
}'
Query the data:
curl -XGET http://127.0.0.1:9200/index01/type01/_search?pretty
Aggregation:
Sum aggregation on the double field (price1):
curl -XGET http://127.0.0.1:9200/index01/type01/_search?pretty -H "Content-Type: application/json" \
-d '{
"aggs": {
"sum_price1": {
"sum":{
"field": "price1"
}
}
}
}'
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [
{
"_index" : "index01",
"_type" : "type01",
"_id" : "01",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"price1" : 1.0,
"price2" : 1.0,
"tm" : "2022-01-01",
"address" : "beijing daxing"
}
},
{
"_index" : "index01",
"_type" : "type01",
"_id" : "03",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"price1" : 300.03,
"price2" : 300.03,
"tm" : "2022-01-01",
"address" : "beijing daxing"
}
},
{
"_index" : "index01",
"_type" : "type01",
"_id" : "02",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"price1" : 20.2,
"price2" : 20.2,
"tm" : "2022-01-01",
"address" : "beijing daxing"
}
}
]
},
"aggregations" : {
"sum_price1" : {
"value" : 321.22999999999996
}
}
}
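The value above is not specific to ES. As a minimal sketch, the same sum can be reproduced in any language that uses IEEE 754 doubles; Python is used here as an assumption of convenience, since its float is the same 64-bit double as Java's double. The variable names are illustrative only.

```python
# Python's float is an IEEE 754 double, the same format as Java's double.
prices = [1.0, 20.2, 300.03]

total = 0.0
for p in prices:
    total += p  # each addition rounds to the nearest representable double

print(total)            # 321.22999999999996
print(total == 321.23)  # False
```

Neither 20.2 nor 300.03 can be stored exactly in binary, so the rounding errors accumulate into the last digits of the sum.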
Sum aggregation on the scaled_float field (price2):
curl -XGET http://127.0.0.1:9200/index01/type01/_search?pretty -H "Content-Type: application/json" \
-d '{
"aggs": {
"sum_price2": {
"sum":{
"field": "price2"
}
}
}
}'
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [
{
"_index" : "index01",
"_type" : "type01",
"_id" : "01",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"price1" : 1.0,
"price2" : 1.0,
"tm" : "2022-01-01",
"address" : "beijing daxing"
}
},
{
"_index" : "index01",
"_type" : "type01",
"_id" : "03",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"price1" : 300.03,
"price2" : 300.03,
"tm" : "2022-01-01",
"address" : "beijing daxing"
}
},
{
"_index" : "index01",
"_type" : "type01",
"_id" : "02",
"_score" : 1.0,
"_source" : {
"name" : "zhangsan",
"price1" : 20.2,
"price2" : 20.2,
"tm" : "2022-01-01",
"address" : "beijing daxing"
}
}
]
},
"aggregations" : {
"sum_price2" : {
"value" : 321.23
}
}
}
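Both responses above repeat the full hits list, although only the aggregations section matters here. A minimal sketch of a request body that returns just the aggregation, using the standard "size": 0 search parameter; the helper name sum_agg_body is hypothetical and not part of any ES client:

```python
import json


def sum_agg_body(field, agg_name=None):
    """Build a search body containing a single sum aggregation and no hits."""
    name = agg_name or "sum_" + field
    return {
        "size": 0,  # return no documents, only the aggregations section
        "aggs": {name: {"sum": {"field": field}}},
    }


body = json.dumps(sum_agg_body("price2"))
print(body)
```

The printed JSON can be passed as the -d payload of the curl commands used above.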
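Analysis
- The double field loses precision during arithmetic: the sum comes back as 321.22999999999996 rather than 321.23.
- With a suitable scaling factor, the scaled_float field avoids this precision loss: the sum comes back as exactly 321.23.
Notes:
- You need to know the maximum number of decimal places in the data written to price2. scaling_factor must be at least 10^n for n decimal places (100 for two decimal places, as in this example); otherwise precision may be lost.
- scaled_float is a floating-point type backed by a scaling factor, and scaling_factor must be specified in the mapping.
At index time, ES multiplies the original value by the scaling factor and rounds the result to the nearest long. This new value is what ES stores internally, while _source still returns the original value. For example, a scaled_float field with a scaling_factor of 10 stores 2.34 internally as 23.
At query time, ES likewise multiplies the query parameter by 10, rounds it, and matches the result against 23; on a match, the returned document still shows 2.34.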
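The scale-and-round idea described in these notes can be sketched as follows. This is an illustration of the fixed-point technique under stated assumptions, not ES's actual implementation: the encode and decode helpers are hypothetical, and summing the stored integers is only one way to show why a suitable factor keeps the result exact.

```python
import math


def encode(value, scaling_factor):
    """Scale and round to an integer, like the value stored for scaled_float."""
    # floor(x + 0.5) rounds halves up, similar to Java's Math.round
    return math.floor(value * scaling_factor + 0.5)


def decode(stored, scaling_factor):
    """Convert a stored integer back to a floating-point value."""
    return stored / scaling_factor


prices = [1.0, 20.2, 300.03]

# The example from the notes: factor 10 keeps only one decimal place.
print(encode(2.34, 10))              # 23
print(decode(encode(2.34, 10), 10))  # 2.3

# Factor 100 covers two decimal places, so integer addition is exact.
sum_100 = decode(sum(encode(p, 100) for p in prices), 100)
print(sum_100)                       # 321.23

# Factor 10 is too small for 300.03: the second decimal place is dropped.
sum_10 = decode(sum(encode(p, 10) for p in prices), 10)
print(sum_10)                        # 321.2
```

The last two lines show why the factor has to match the data: with a factor of 10, 300.03 is stored as 3000 and the 0.03 cannot be recovered.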