Apache Kudu - Known Issues and Limitations: https://kudu.apache.org/docs/known_issues.html#_scale
Apache Kudu - Apache Kudu Scaling Guide: https://kudu.apache.org/docs/scaling_guide.html#memory
However good you are, you're never better than the official docs. The official links are above for reference.
Summary:
What the official docs recommend:
3 masters, 100 tablet servers
Roughly 1000 tablets stored per tablet server
100 servers with roughly 8TB of memory in total, i.e. about 80GB of physical memory per server
Each table has roughly 60 tablets. I'm not sure whether that 60 includes replicas; my guess is that it does. As mentioned in my earlier post, Kudu suggests -- 1M rows with 50 hash partitions = approximately 20,000 rows per partition -- in other words, the assumption is that a typical table holds about a million rows (a quick arithmetic sketch follows below).
Roughly 10GB of memory per tablet server
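A quick back-of-the-envelope sketch of the arithmetic behind those numbers (just plain division; the replication factor of 3 is my assumption, the rest are the figures quoted above):

```python
# Back-of-the-envelope arithmetic for the official example above.
rows = 1_000_000        # suggested table size: ~1M rows
hash_partitions = 50    # 50 hash partitions
print(rows // hash_partitions)              # 20000 rows per partition, as the docs say

tablet_servers = 100
tablets_per_server = 1000                    # recommended ceiling per tablet server
replication = 3                              # assumption: default replication factor
total_replicas = tablet_servers * tablets_per_server
print(total_replicas // replication)         # ~33333 logical tablets cluster-wide
```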
Our real-world configuration:
3 masters, 300 tablet servers
10TB of memory in total, roughly 30GB per server
Each table has 4000+ tablets, i.e. 4000 partitions...
About 50GB of memory per tablet server
Writing this down is already making me nervous... these are clusters for the rich...
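To make the gap concrete, here is the same per-server division for both setups -- nothing official, just the numbers listed above divided out:

```python
# Comparing the official example with our real cluster (figures taken from above).
clusters = {
    "official": {"tablet_servers": 100, "memory_tb": 8,  "tablets_per_table": 60},
    "ours":     {"tablet_servers": 300, "memory_tb": 10, "tablets_per_table": 4000},
}

for name, c in clusters.items():
    mem_per_server_gb = c["memory_tb"] * 1024 / c["tablet_servers"]
    print(f"{name}: ~{mem_per_server_gb:.0f} GB of memory per server, "
          f"{c['tablets_per_table']} tablets per table")
# official: ~82 GB per server with 60 tablets per table;
# ours:     ~34 GB per server with 4000+ tablets per table.
```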
- hot replica: A tablet replica that is continuously receiving writes. For example, in a time series use case, tablet replicas for the most recent range partition on a time column would be continuously receiving the latest data, and would be hot replicas.
- cold replica: A tablet replica that is not hot, i.e. a replica that is not frequently receiving writes, for example, once every few minutes. A cold replica may be read from. For example, in a time series use case, tablet replicas for previous range partitions on a time column would not receive writes at all, or only occasionally receive late updates or additions, but may be constantly read.
- data on disk: The total amount of data stored on a tablet server across all disks, post-replication, post-compression, and post-encoding.
Hot data: data that is constantly being written to and accessed
Cold data: data that is rarely written to or accessed
For example, if I partitioned by celebrity name... Dilraba and Gulnazar would definitely be hot data... Jia Ling and Han Hong would definitely be cold data.
Disk: all of this data lives on the tablet server's local disks; it has nothing to do with HDFS.
Example Workload
The sections below perform sample calculations using the following parameters:
- 200 hot replicas per tablet server
- 1600 cold replicas per tablet server
- 8TB of data on disk per tablet server (about 4.5GB/replica)
- 512MB block cache
- 40 cores per server
- limit of 32000 file descriptors per server
- a read workload with 1 frequently-scanned table with 40 columns
This workload resembles a time series use case, where the hot replicas correspond to the most recent range partition on time.
My reading of it:
Each tablet server holds 200 hot replicas and 1600 cold replicas, i.e. 1800 replicas in total; divide by the replication factor of 3 to get the number of logical tablets.
8TB / 4.5GB ≈ 1777 replicas -- is each replica really that big?
40 cores per server -- 40 cores per machine seems excessive; normally those cores are given to YARN.
512MB block cache -- a note on this one: I asked my team lead, and it is mainly a read cache? I'm also not sure whether our own setting for it is right.
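The replica math above is simple division; here is a small sketch reproducing it (the numbers are just the example-workload parameters, nothing else assumed):

```python
# Reproducing the example-workload arithmetic.
hot_replicas = 200
cold_replicas = 1600
data_on_disk_tb = 8

replicas = hot_replicas + cold_replicas             # 1800 replicas on this tablet server
print(replicas)                                     # 1800
print(round(data_on_disk_tb * 1024 / replicas, 1))  # ~4.6 GB of data per replica

# And the other direction, as in the text: how many ~4.5 GB replicas fit in 8 TB?
print(round(data_on_disk_tb * 1000 / 4.5))          # ~1778
```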
The flag --memory_limit_hard_bytes determines the maximum amount of memory that a Kudu tablet server may use. The amount of memory used by a tablet server scales with data size, write workload, and read concurrency. The following table provides numbers that can be used to compute a rough estimate of memory usage.
--memory_limit_hard_bytes determines the maximum amount of memory a Kudu tablet server can use; note that this includes the block cache.
While I'm at it, let me dig into two memory settings in particular:
--memory_limit_hard_bytes -- the maximum amount of memory the Kudu tablet server may take
--block_cache_capacity_mb -- the amount of memory used for the (read) block cache
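A minimal sketch of how I think about the relationship between the two flags -- the block cache is carved out of the hard limit rather than added on top of it. The check itself and its 50% threshold are my own rule of thumb, not anything from the docs:

```python
def check_memory_flags(memory_limit_hard_bytes: int, block_cache_capacity_mb: int) -> None:
    """Sanity-check the two flags against each other.

    Assumption: the block cache counts toward --memory_limit_hard_bytes (as the
    doc text quoted above says); the 50% threshold is an arbitrary rule of thumb.
    """
    cache_bytes = block_cache_capacity_mb * 1024 * 1024
    share = cache_bytes / memory_limit_hard_bytes
    if share >= 0.5:
        print(f"block cache is {share:.0%} of the hard limit -- consider lowering it")
    else:
        print(f"block cache is {share:.0%} of the hard limit")

check_memory_flags(20 * 1024**3, 5 * 1024)  # the 20 GB tablet server below: 25%
check_memory_flags(10 * 1024**3, 3 * 1024)  # the 10 GB tablet server below: 30%
```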
From the screenshots: one of my tablet servers has a 10G total limit with 3G given to the block_cache, and another has a 20G limit with 5G for the block_cache.
Here is the screenshot of the 20G tablet server, where you can see:
Total consumption = 11.59G
root = 6.46G
block_cache-sharded_lru_cache = 5G -- the read cache
log_cache = 1G -- the consensus log cache, which buffers data being replicated from a to b
server = 468.74M -- this mainly covers tablet reads and writes
where root = block_cache-sharded_lru_cache + server + log_cache
Total consumption - root ≈ 5G -- what is that 5G being used for?
And the screenshot of the tablet server with a 10G limit:
Total consumption = 10G
root = 5.46G
block_cache-sharded_lru_cache = 3G
server = 1.47G
log_cache = 1G
It is basically the same picture as above; the main difference is the server entry, because a lot of this tablet server's tablets are being read.
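Putting the two memory-tracker snapshots side by side, the arithmetic walked through above looks like this (the values are simply the ones read off the screenshots):

```python
# The two memory-tracker snapshots described above, in GB.
tservers = {
    "20 GB limit": {"total": 11.59, "block_cache": 5.0, "log_cache": 1.0, "server": 0.46},
    "10 GB limit": {"total": 10.0,  "block_cache": 3.0, "log_cache": 1.0, "server": 1.47},
}

for name, t in tservers.items():
    root = t["block_cache"] + t["log_cache"] + t["server"]  # root = sum of its children
    untracked = t["total"] - root                            # the unexplained gap (~5 GB here)
    print(f"{name}: root ~= {root:.2f} GB, total - root ~= {untracked:.2f} GB")
```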
To sum it up briefly:
block_cache-sharded_lru_cache stays pretty much full all the time, and log_cache also sits at 1G (I haven't found where that is configured).
Some of Kudu's other components also keep holding memory, so the memory actually available when we read and write Kudu is quite small --
with a 10G limit, only about 1-2G ends up being used for the actual work.
Of course, I should add: block_cache-sharded_lru_cache does not need to be set that large; the official default of 512MB is fine.
Apache Kudu - Apache Kudu Troubleshooting: https://kudu.apache.org/docs/troubleshooting.html
Service unavailable: Soft memory limit exceeded (at 96.35% of capacity)
Anyone who has used Kudu has probably run into this error; simply put, it means there is not enough memory.
Solutions:
1. Increase --memory_limit_hard_bytes.
2. Tune --maintenance_manager_num_threads. This value is tied to the disks the data directories live on: when Kudu flushes in-memory data to disk, you can either add disks or raise this thread count. The docs suggest roughly 1 maintenance thread per 3 data directories; usually 3 or so is enough (see the small sketch after this list). We were a counterexample here...
3. On Kudu before 1.8, check --block_cache_capacity_mb. A higher value gives better read performance; set it too low and frequent flushes kick in, dragging write throughput down.
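For item 2, a tiny helper to pick a thread count from the number of data directories, assuming the roughly 1-thread-per-3-data-directories ratio mentioned above; treat it as a starting point, not a rule:

```python
import math

def maintenance_threads(num_data_dirs: int) -> int:
    """Suggest a --maintenance_manager_num_threads value.

    Assumption: about 1 maintenance thread per 3 data directories,
    and never fewer than 1 thread.
    """
    return max(1, math.ceil(num_data_dirs / 3))

print(maintenance_threads(1))   # 1
print(maintenance_threads(9))   # 3
print(maintenance_threads(12))  # 4
```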
--- I'll dig into this further when I have time.