How do I write the TIMESTAMP logical type (INT96) to Parquet using ParquetWriter?


I figured this out by using this code from Spark SQL as a reference.

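The INT96 binary encoding is split into two parts: the first 8 bytes hold the nanoseconds since midnight, and the last 4 bytes hold the Julian day. Both parts are written in little-endian byte order.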

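import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.temporal.JulianFields;
import java.util.Calendar;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;
import org.apache.parquet.io.api.Binary;

String value = "2019-02-13 13:35:05";

final long NANOS_PER_HOUR = TimeUnit.HOURS.toNanos(1);
final long NANOS_PER_MINUTE = TimeUnit.MINUTES.toNanos(1);
final long NANOS_PER_SECOND = TimeUnit.SECONDS.toNanos(1);

// Parse date (the parser uses the JVM default time zone; parse() throws ParseException on bad input)
SimpleDateFormat parser = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
// Read the parsed instant back as UTC calendar fields
Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
cal.setTime(parser.parse(value));

// Calculate Julian days and nanoseconds in the day
// (Calendar.MONTH is zero-based, hence the +1)
LocalDate dt = LocalDate.of(cal.get(Calendar.YEAR), cal.get(Calendar.MONTH) + 1, cal.get(Calendar.DAY_OF_MONTH));
int julianDays = (int) JulianFields.JULIAN_DAY.getFrom(dt);
long nanos = (cal.get(Calendar.HOUR_OF_DAY) * NANOS_PER_HOUR)
        + (cal.get(Calendar.MINUTE) * NANOS_PER_MINUTE)
        + (cal.get(Calendar.SECOND) * NANOS_PER_SECOND);

// Write INT96 timestamp: 8 bytes of nanoseconds, then 4 bytes of Julian day, little-endian
byte[] timestampBuffer = new byte[12];
ByteBuffer buf = ByteBuffer.wrap(timestampBuffer);
buf.order(ByteOrder.LITTLE_ENDIAN).putLong(nanos).putInt(julianDays);

// This is the properly encoded INT96 timestamp
Binary tsValue = Binary.fromReusedByteArray(timestampBuffer);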
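As a cross-check on the layout described above, here is a minimal sketch that does the same encoding with java.time only and also decodes the 12 bytes back. The class name Int96Timestamps and its two methods are hypothetical helpers written for illustration, not part of Parquet or Spark, and the sketch assumes the input date-time is already expressed in UTC.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.temporal.JulianFields;

// Hypothetical helper for illustration; not a Parquet or Spark class.
final class Int96Timestamps {
    private Int96Timestamps() {}

    // Encodes a date-time (assumed to be in UTC already) as 12 bytes:
    // 8 bytes nanoseconds since midnight, then 4 bytes Julian day, little-endian.
    static byte[] encode(LocalDateTime utc) {
        long nanosOfDay = utc.toLocalTime().toNanoOfDay();
        int julianDay = (int) utc.toLocalDate().getLong(JulianFields.JULIAN_DAY);
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }

    // Reverses encode(): reads the two fields and rebuilds the date-time.
    static LocalDateTime decode(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("INT96 value must be 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        int julianDay = buf.getInt();
        // Adjusting any LocalDate by JULIAN_DAY yields the date for that day number.
        LocalDate date = LocalDate.ofEpochDay(0).with(JulianFields.JULIAN_DAY, julianDay);
        return LocalDateTime.of(date, LocalTime.ofNanoOfDay(nanosOfDay));
    }
}
```

For the sample value 2019-02-13 13:35:05, encode writes 48905000000000 nanoseconds and Julian day 2458528. The resulting array can then be wrapped in a Binary, for example with Binary.fromConstantByteArray, before being handed to the writer.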



Original URL: http://outofmemory.cn/zaji/5623289.html
