Apache Kudu - Known Issues and Limitations: https://kudu.apache.org/docs/known_issues.html#_scale
Apache Kudu - Apache Kudu Scaling Guide: https://kudu.apache.org/docs/scaling_guide.html#memory
However good you are, you're never better than the official docs. The official links are above for reference.
Summary:
What the official docs recommend:
3 masters, 100 tablet servers
Roughly 1000 tablets stored per tablet server
100 servers with roughly 8TB of memory in total, i.e. about 80GB of physical memory per server
Each table has roughly 60 tablets. I'm not sure whether that 60 includes replicas; my guess is that it does. As mentioned in my earlier post, Kudu suggests -- 1M rows with 50 hash partitions = approximately 20,000 rows per partition -- in other words, the assumption is that a typical table holds about a million rows (a quick arithmetic sketch follows below).
Roughly 10GB of memory per tablet server
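A quick back-of-the-envelope sketch of the arithmetic behind those numbers (just plain division; the replication factor of 3 is my assumption, the rest are the figures quoted above):

```python
# Back-of-the-envelope arithmetic for the official example above.
rows = 1_000_000        # suggested table size: ~1M rows
hash_partitions = 50    # 50 hash partitions
print(rows // hash_partitions)              # 20000 rows per partition, as the docs say

tablet_servers = 100
tablets_per_server = 1000                    # recommended ceiling per tablet server
replication = 3                              # assumption: default replication factor
total_replicas = tablet_servers * tablets_per_server
print(total_replicas // replication)         # ~33333 logical tablets cluster-wide
```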
Our real-world configuration:
3 masters, 300 tablet servers
10TB of memory in total, roughly 30GB per server
Each table has 4000+ tablets, i.e. 4000 partitions...
About 50GB of memory per tablet server
Writing this down is already making me nervous... these are clusters for the rich...
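To make the gap concrete, here is the same per-server division for both setups -- nothing official, just the numbers listed above divided out:

```python
# Comparing the official example with our real cluster (figures taken from above).
clusters = {
    "official": {"tablet_servers": 100, "memory_tb": 8,  "tablets_per_table": 60},
    "ours":     {"tablet_servers": 300, "memory_tb": 10, "tablets_per_table": 4000},
}

for name, c in clusters.items():
    mem_per_server_gb = c["memory_tb"] * 1024 / c["tablet_servers"]
    print(f"{name}: ~{mem_per_server_gb:.0f} GB of memory per server, "
          f"{c['tablets_per_table']} tablets per table")
# official: ~82 GB per server with 60 tablets per table;
# ours:     ~34 GB per server with 4000+ tablets per table.
```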
- hot replica: A tablet replica that is continuously receiving writes. For example, in a time series use case, tablet replicas for the most recent range partition on a time column would be continuously receiving the latest data, and would be hot replicas.
- cold replica: A tablet replica that is not hot, i.e. a replica that is not frequently receiving writes, for example, once every few minutes. A cold replica may be read from. For example, in a time series use case, tablet replicas for previous range partitions on a time column would not receive writes at all, or only occasionally receive late updates or additions, but may be constantly read.
- data on disk: The total amount of data stored on a tablet server across all disks, post-replication, post-compression, and post-encoding.
Hot data: data that is constantly being written to and accessed
Cold data: data that is rarely written to or accessed
For example, if I partitioned by celebrity name... Dilraba and Gulnazar would definitely be hot data... Jia Ling and Han Hong would definitely be cold data.
Disk: all of this data lives on the tablet server's local disks; it has nothing to do with HDFS.
Example Workload
The sections below perform sample calculations using the following parameters:
- 200 hot replicas per tablet server
- 1600 cold replicas per tablet server
- 8TB of data on disk per tablet server (about 4.5GB/replica)
- 512MB block cache
- 40 cores per server
- limit of 32000 file descriptors per server
- a read workload with 1 frequently-scanned table with 40 columns
This workload resembles a time series use case, where the hot replicas correspond to the most recent range partition on time.
My reading of it:
Each tablet server holds 200 hot replicas and 1600 cold replicas, i.e. 1800 replicas in total; divide by the replication factor of 3 to get the number of logical tablets.
8TB / 4.5GB ≈ 1777 replicas -- is each replica really that big?
40 cores per server -- 40 cores per machine seems excessive; normally those cores are given to YARN.
512MB block cache -- a note on this one: I asked my team lead, and it is mainly a read cache? I'm also not sure whether our own setting for it is right.
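The replica math above is simple division; here is a small sketch reproducing it (the numbers are just the example-workload parameters, nothing else assumed):

```python
# Reproducing the example-workload arithmetic.
hot_replicas = 200
cold_replicas = 1600
data_on_disk_tb = 8

replicas = hot_replicas + cold_replicas             # 1800 replicas on this tablet server
print(replicas)                                     # 1800
print(round(data_on_disk_tb * 1024 / replicas, 1))  # ~4.6 GB of data per replica

# And the other direction, as in the text: how many ~4.5 GB replicas fit in 8 TB?
print(round(data_on_disk_tb * 1000 / 4.5))          # ~1778
```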
The flag --memory_limit_hard_bytes determines the maximum amount of memory that a Kudu tablet server may use. The amount of memory used by a tablet server scales with data size, write workload, and read concurrency. The following table provides numbers that can be used to compute a rough estimate of memory usage.
--memory_limit_hard_bytes determines the maximum amount of memory a Kudu tablet server can use; note that this includes the block cache.
While I'm at it, let me dig into two memory settings in particular:
--memory_limit_hard_bytes -- the maximum amount of memory the Kudu tablet server may take
--block_cache_capacity_mb -- the amount of memory used for the (read) block cache
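A minimal sketch of how I think about the relationship between the two flags -- the block cache is carved out of the hard limit rather than added on top of it. The check itself and its 50% threshold are my own rule of thumb, not anything from the docs:

```python
def check_memory_flags(memory_limit_hard_bytes: int, block_cache_capacity_mb: int) -> None:
    """Sanity-check the two flags against each other.

    Assumption: the block cache counts toward --memory_limit_hard_bytes (as the
    doc text quoted above says); the 50% threshold is an arbitrary rule of thumb.
    """
    cache_bytes = block_cache_capacity_mb * 1024 * 1024
    share = cache_bytes / memory_limit_hard_bytes
    if share >= 0.5:
        print(f"block cache is {share:.0%} of the hard limit -- consider lowering it")
    else:
        print(f"block cache is {share:.0%} of the hard limit")

check_memory_flags(20 * 1024**3, 5 * 1024)  # the 20 GB tablet server below: 25%
check_memory_flags(10 * 1024**3, 3 * 1024)  # the 10 GB tablet server below: 30%
```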
From the screenshots: one of my tablet servers has a 10G total limit with 3G given to the block_cache, and another has a 20G limit with 5G for the block_cache.
Here is the screenshot of the 20G tablet server, where you can see:
Total consumption = 11.59G
root = 6.46G
block_cache-sharded_lru_cache = 5G -- the read cache
log_cache = 1G -- the consensus log cache, which buffers data being replicated from a to b
server = 468.74M -- this mainly covers tablet reads and writes
where root = block_cache-sharded_lru_cache + server + log_cache
Total consumption - root ≈ 5G -- what is that 5G being used for?
And the screenshot of the tablet server with a 10G limit:
Total consumption = 10G
root = 5.46G
block_cache-sharded_lru_cache = 3G
server = 1.47G
log_cache = 1G
It is basically the same picture as above; the main difference is the server entry, because a lot of this tablet server's tablets are being read.
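Putting the two memory-tracker snapshots side by side, the arithmetic walked through above looks like this (the values are simply the ones read off the screenshots):

```python
# The two memory-tracker snapshots described above, in GB.
tservers = {
    "20 GB limit": {"total": 11.59, "block_cache": 5.0, "log_cache": 1.0, "server": 0.46},
    "10 GB limit": {"total": 10.0,  "block_cache": 3.0, "log_cache": 1.0, "server": 1.47},
}

for name, t in tservers.items():
    root = t["block_cache"] + t["log_cache"] + t["server"]  # root = sum of its children
    untracked = t["total"] - root                            # the unexplained gap (~5 GB here)
    print(f"{name}: root ~= {root:.2f} GB, total - root ~= {untracked:.2f} GB")
```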
To sum it up briefly:
block_cache-sharded_lru_cache stays pretty much full all the time, and log_cache also sits at 1G (I haven't found where that is configured).
Some of Kudu's other components also keep holding memory, so the memory actually available when we read and write Kudu is quite small --
with a 10G limit, only about 1-2G ends up being used for the actual work.
Of course, I should add: block_cache-sharded_lru_cache does not need to be set that large; the official default of 512MB is fine.
Apache Kudu - Apache Kudu Troubleshooting: https://kudu.apache.org/docs/troubleshooting.html
Service unavailable: Soft memory limit exceeded (at 96.35% of capacity)
Anyone who has used Kudu has probably run into this error; simply put, it means there is not enough memory.
Solutions:
1. Increase --memory_limit_hard_bytes.
2. Tune --maintenance_manager_num_threads. This value is tied to the disks the data directories live on: when Kudu flushes in-memory data to disk, you can either add disks or raise this thread count. The docs suggest roughly 1 maintenance thread per 3 data directories; usually 3 or so is enough (see the small sketch after this list). We were a counterexample here...
3. On Kudu before 1.8, check --block_cache_capacity_mb. A higher value gives better read performance; set it too low and frequent flushes kick in, dragging write throughput down.
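For item 2, a tiny helper to pick a thread count from the number of data directories, assuming the roughly 1-thread-per-3-data-directories ratio mentioned above; treat it as a starting point, not a rule:

```python
import math

def maintenance_threads(num_data_dirs: int) -> int:
    """Suggest a --maintenance_manager_num_threads value.

    Assumption: about 1 maintenance thread per 3 data directories,
    and never fewer than 1 thread.
    """
    return max(1, math.ceil(num_data_dirs / 3))

print(maintenance_threads(1))   # 1
print(maintenance_threads(9))   # 3
print(maintenance_threads(12))  # 4
```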
--- I'll dig into this further when I have time.