Problem 1: Storing multiple elements in a JSON file for AWS Athena
I need to rewrite my JSON file as
{"eventId":"1","eventName":"INSERT","eventVersion":"1.0","eventSource":"aws:dynamodb","awsRegion":"us-west-2","image":{"Message":"New item!","Id":101}}
{"eventId":"2","eventName":"MODIFY","eventVersion":"1.0","eventSource":"aws:dynamodb","awsRegion":"us-west-2","image":{"Message":"This item has changed","Id":101}}
{"eventId":"3","eventName":"REMOVE","eventVersion":"1.0","eventSource":"aws:dynamodb","awsRegion":"us-west-2","image":{"Message":"This item has changed","Id":101}}
That means:
Remove the square brackets [] and keep each element on its own single line:
{.....................}
{.....................}
{.....................}
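The rewrite described above can be scripted. What follows is only a minimal sketch in Python, not part of the original answer: it assumes the source file is a single JSON array of objects, and the function name array_to_json_lines and the sample records are made up for illustration. It parses the array and emits one compact object per line, with no enclosing brackets.

```python
import json


def array_to_json_lines(array_text: str) -> str:
    """Convert a JSON array of objects into one JSON object per line.

    Hypothetical helper for illustration; assumes the input is a single
    top-level JSON array.
    """
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators keep every record on exactly one line.
    lines = [
        json.dumps(record, separators=(",", ":"), ensure_ascii=False)
        for record in records
    ]
    return "\n".join(lines) + "\n"


# Sample input: a shortened version of the records shown above.
sample = """[
  {"eventId": "1", "eventName": "INSERT",
   "image": {"Message": "New item!", "Id": 101}},
  {"eventId": "2", "eventName": "MODIFY",
   "image": {"Message": "This item has changed", "Id": 101}}
]"""

print(array_to_json_lines(sample), end="")
```

The output can then be written to a file and uploaded to the S3 location that the table points to.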
Problem 2: Accessing nested (non-flat) JSON attributes
CREATE EXTERNAL TABLE IF NOT EXISTS <tablename> (
  `eventId` string,
  `eventName` string,
  `eventVersion` string,
  `eventSource` string,
  `awsRegion` string,
  `image` struct<`Id`: string, `Message`: string>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = '1',
  "dots.in.keys" = "true"
)
LOCATION 's3://exampletablewithstream-us-west-2/';
Query:
select image.Id, image.message from <tablename>;
References:
http://engineering.skybettingandgaming.com/2015/01/20/parsing-json-in-hive/
https://github.com/rcongiu/Hive-JSON-Serde#mapping-hive-keywords