linux – High load average, high wait, dmesg RAID error messages (Debian NFS server)


Debian 6 on an HP ProLiant (2 CPUs) with RAID (2 * 1.5T RAID1 + 2 * 2T RAID1, joined as RAID0 to make 3.5T), mainly running nfs & imapd (plus samba for Windows shares and a local www for previewing web pages); a local Ubuntu desktop client mounts its $HOME from the server, and laptops access imap and the odd file (e.g. videos) via nfs/smb; the boxes are connected by 100baseT or wifi through a home router/switch.

uname -a

Linux prole 2.6.32-5-686 #1 SMP Wed Jan 11 12:29:30 UTC 2012 i686 GNU/Linux

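The setup has been working for months, but it has been prone to intermittently becoming very slow (as experienced on the desktop that mounts $HOME from the server, or on a laptop playing videos). Now it is consistently so bad that I have had to dig into it to try to find out what is wrong(!)

The server seems fine at low load, e.g. a (laptop) client (with $HOME on its local disk) connecting to the server's imapd and NFS-mounting the RAID to access 1 file: top shows load ~0.1 or less, 0 wait.

But when the (desktop) client mounts its $HOME and starts a user KDE session (all of it accessing the server), top shows e.g.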

top - 13:41:17 up  3:43,  3 users,  load average: 9.29, 9.55, 8.27
Tasks: 158 total,   1 running, 157 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.4%us,  0.4%sy,  0.0%ni, 49.0%id, 49.7%wa,  0.0%hi,  0.5%si,  0.0%st
Mem:    903856k total,   851784k used,    52072k free,   171152k buffers
Swap:        0k total,        0k used,        0k free,   476896k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3935 root      20   0  2456 1088  784 R    2  0.1   0:00.02 top
    1 root      20   0  2028  680  584 S    0  0.1   0:01.14 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
    4 root      20   0     0    0    0 S    0  0.0   0:00.12 ksoftirqd/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    7 root      20   0     0    0    0 S    0  0.0   0:00.16 ksoftirqd/1
    8 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    9 root      20   0     0    0    0 S    0  0.0   0:00.42 events/0
   10 root      20   0     0    0    0 S    0  0.0   0:02.26 events/1
   11 root      20   0     0    0    0 S    0  0.0   0:00.00 cpuset
   12 root      20   0     0    0    0 S    0  0.0   0:00.00 khelper
   13 root      20   0     0    0    0 S    0  0.0   0:00.00 netns
   14 root      20   0     0    0    0 S    0  0.0   0:00.00 async/mgr
   15 root      20   0     0    0    0 S    0  0.0   0:00.00 pm
   16 root      20   0     0    0    0 S    0  0.0   0:00.02 sync_supers
   17 root      20   0     0    0    0 S    0  0.0   0:00.02 bdi-default
   18 root      20   0     0    0    0 S    0  0.0   0:00.00 kintegrityd/0
   19 root      20   0     0    0    0 S    0  0.0   0:00.00 kintegrityd/1
   20 root      20   0     0    0    0 S    0  0.0   0:00.02 kblockd/0
   21 root      20   0     0    0    0 S    0  0.0   0:00.08 kblockd/1
   22 root      20   0     0    0    0 S    0  0.0   0:00.00 kacpid
   23 root      20   0     0    0    0 S    0  0.0   0:00.00 kacpi_notify
   24 root      20   0     0    0    0 S    0  0.0   0:00.00 kacpi_hotplug
   25 root      20   0     0    0    0 S    0  0.0   0:00.00 kseriod
   28 root      20   0     0    0    0 S    0  0.0   0:04.19 kondemand/0
   29 root      20   0     0    0    0 S    0  0.0   0:02.93 kondemand/1
   30 root      20   0     0    0    0 S    0  0.0   0:00.00 khungtaskd
   31 root      20   0     0    0    0 S    0  0.0   0:00.18 kswapd0
   32 root      25   5     0    0    0 S    0  0.0   0:00.00 ksmd
   33 root      20   0     0    0    0 S    0  0.0   0:00.00 aio/0
   34 root      20   0     0    0    0 S    0  0.0   0:00.00 aio/1
   35 root      20   0     0    0    0 S    0  0.0   0:00.00 crypto/0
   36 root      20   0     0    0    0 S    0  0.0   0:00.00 crypto/1
  203 root      20   0     0    0    0 S    0  0.0   0:00.00 ksuspend_usbd
  204 root      20   0     0    0    0 S    0  0.0   0:00.00 khubd
  205 root      20   0     0    0    0 S    0  0.0   0:00.00 ata/0
  206 root      20   0     0    0    0 S    0  0.0   0:00.00 ata/1
  207 root      20   0     0    0    0 S    0  0.0   0:00.14 ata_aux
  208 root      20   0     0    0    0 S    0  0.0   0:00.01 scsi_eh_0

dmesg suggests there is a disk problem:

.............. (previous episode)
[13276.966004] raid1:md0: read error corrected (8 sectors at 489900360 on sdc7)
[13276.966043] raid1: sdb7: redirecting sector 489898312 to another mirror
[13279.569186] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[13279.569211] ata4.00: irq_stat 0x40000008
[13279.569230] ata4.00: failed command: READ FPDMA QUEUED
[13279.569257] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
[13279.569262]          res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
[13279.569306] ata4.00: status: { DRDY ERR }
[13279.569321] ata4.00: error: { UNC }
[13279.575362] ata4.00: configured for UDMA/133
[13279.575388] ata4: EH complete
[13283.169224] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[13283.169246] ata4.00: irq_stat 0x40000008
[13283.169263] ata4.00: failed command: READ FPDMA QUEUED
[13283.169289] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
[13283.169294]          res 41/40:00:07:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
[13283.169331] ata4.00: status: { DRDY ERR }
[13283.169345] ata4.00: error: { UNC }
[13283.176071] ata4.00: configured for UDMA/133
[13283.176104] ata4: EH complete
[13286.224814] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[13286.224837] ata4.00: irq_stat 0x40000008
[13286.224853] ata4.00: failed command: READ FPDMA QUEUED
[13286.224879] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
[13286.224884]          res 41/40:00:06:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
[13286.224922] ata4.00: status: { DRDY ERR }
[13286.224935] ata4.00: error: { UNC }
[13286.231277] ata4.00: configured for UDMA/133
[13286.231303] ata4: EH complete
[13288.802623] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[13288.802646] ata4.00: irq_stat 0x40000008
[13288.802662] ata4.00: failed command: READ FPDMA QUEUED
[13288.802688] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
[13288.802693]          res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
[13288.802731] ata4.00: status: { DRDY ERR }
[13288.802745] ata4.00: error: { UNC }
[13288.808901] ata4.00: configured for UDMA/133
[13288.808927] ata4: EH complete
[13291.380430] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[13291.380453] ata4.00: irq_stat 0x40000008
[13291.380470] ata4.00: failed command: READ FPDMA QUEUED
[13291.380496] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
[13291.380501]          res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
[13291.380577] ata4.00: status: { DRDY ERR }
[13291.380594] ata4.00: error: { UNC }
[13291.386517] ata4.00: configured for UDMA/133
[13291.386543] ata4: EH complete
[13294.347147] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[13294.347169] ata4.00: irq_stat 0x40000008
[13294.347186] ata4.00: failed command: READ FPDMA QUEUED
[13294.347211] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
[13294.347217]          res 41/40:00:06:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
[13294.347254] ata4.00: status: { DRDY ERR }
[13294.347268] ata4.00: error: { UNC }
[13294.353556] ata4.00: configured for UDMA/133
[13294.353583] sd 3:0:0:0: [sdc] Unhandled sense code
[13294.353590] sd 3:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[13294.353599] sd 3:0:0:0: [sdc] Sense Key : Medium Error [current] [descriptor]
[13294.353610] Descriptor sense data with sense descriptors (in hex):
[13294.353616]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[13294.353635]         23 05 6a 06
[13294.353644] sd 3:0:0:0: [sdc] Add. Sense: Unrecovered read error - auto reallocate failed
[13294.353657] sd 3:0:0:0: [sdc] CDB: Read(10): 28 00 23 05 6a 00 00 00 08 00
[13294.353675] end_request: I/O error, dev sdc, sector 587557382
[13294.353726] ata4: EH complete
[13294.366953] raid1:md0: read error corrected (8 sectors at 489900544 on sdc7)
[13294.366992] raid1: sdc7: redirecting sector 489898496 to another mirror

And they are happening often enough that I guess they could be what is causing the performance problem(?)

# dmesg | grep mirror

[12433.561822] raid1: sdc7: redirecting sector 489900464 to another mirror
[12449.428933] raid1: sdb7: redirecting sector 489900504 to another mirror
[12464.807016] raid1: sdb7: redirecting sector 489900512 to another mirror
[12480.196222] raid1: sdb7: redirecting sector 489900520 to another mirror
[12495.585413] raid1: sdb7: redirecting sector 489900528 to another mirror
[12510.974424] raid1: sdb7: redirecting sector 489900536 to another mirror
[12526.374933] raid1: sdb7: redirecting sector 489900544 to another mirror
[12542.619938] raid1: sdc7: redirecting sector 489900608 to another mirror
[12559.431328] raid1: sdc7: redirecting sector 489900616 to another mirror
[12576.553866] raid1: sdc7: redirecting sector 489900624 to another mirror
[12592.065265] raid1: sdc7: redirecting sector 489900632 to another mirror
[12607.621121] raid1: sdc7: redirecting sector 489900640 to another mirror
[12623.165856] raid1: sdc7: redirecting sector 489900648 to another mirror
[12638.699474] raid1: sdc7: redirecting sector 489900656 to another mirror
[12655.610881] raid1: sdc7: redirecting sector 489900664 to another mirror
[12672.255617] raid1: sdc7: redirecting sector 489900672 to another mirror
[12672.288746] raid1: sdc7: redirecting sector 489900680 to another mirror
[12672.332376] raid1: sdc7: redirecting sector 489900688 to another mirror
[12672.362935] raid1: sdc7: redirecting sector 489900696 to another mirror
[12674.201177] raid1: sdc7: redirecting sector 489900704 to another mirror
[12698.045050] raid1: sdc7: redirecting sector 489900712 to another mirror
[12698.089309] raid1: sdc7: redirecting sector 489900720 to another mirror
[12698.111999] raid1: sdc7: redirecting sector 489900728 to another mirror
[12698.134006] raid1: sdc7: redirecting sector 489900736 to another mirror
[12719.034376] raid1: sdc7: redirecting sector 489900744 to another mirror
[12734.545775] raid1: sdc7: redirecting sector 489900752 to another mirror
[12734.590014] raid1: sdc7: redirecting sector 489900760 to another mirror
[12734.624050] raid1: sdc7: redirecting sector 489900768 to another mirror
[12734.647308] raid1: sdc7: redirecting sector 489900776 to another mirror
[12734.664657] raid1: sdc7: redirecting sector 489900784 to another mirror
[12734.710642] raid1: sdc7: redirecting sector 489900792 to another mirror
[12734.721919] raid1: sdc7: redirecting sector 489900800 to another mirror
[12734.744732] raid1: sdc7: redirecting sector 489900808 to another mirror
[12734.779330] raid1: sdc7: redirecting sector 489900816 to another mirror
[12782.604564] raid1: sdb7: redirecting sector 1242934216 to another mirror
[12798.264153] raid1: sdc7: redirecting sector 1242935080 to another mirror
[13245.832193] raid1: sdb7: redirecting sector 489898296 to another mirror
[13261.376929] raid1: sdb7: redirecting sector 489898304 to another mirror
[13276.966043] raid1: sdb7: redirecting sector 489898312 to another mirror
[13294.366992] raid1: sdc7: redirecting sector 489898496 to another mirror
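To see at a glance which mirror member the redirects are attributed to, the grep output above can be tallied per device. The following is only a sketch: the function name `count_redirects` is made up for illustration, and it assumes the kernel messages keep the "raid1: <member>: redirecting sector" wording shown in this log.

```shell
# Count "redirecting sector" messages per RAID1 member.
# Reads kernel log text on stdin and prints "<member> <count>", sorted by member name.
# count_redirects is a hypothetical helper name, not part of any standard tool.
count_redirects() {
    # -o prints each match on its own line, -i tolerates odd capitalisation
    grep -io 'raid1: [a-z0-9]*: redirecting sector' \
        | awk '{ sub(":", "", $2); n[$2]++ } END { for (d in n) print d, n[d] }' \
        | sort
}
```

Typical use would be `dmesg | count_redirects`. A member that dominates the tally is a hint, not proof; the ata and sd lines in the full dmesg output are what actually name the drive reporting media errors.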

The arrays are still running on all disks, though; none of the members has been kicked out yet:

# cat /proc/mdstat

Personalities : [raid1] [raid0]
md10 : active raid0 md0[0] md1[1]
      3368770048 blocks super 1.2 512k chunks

md1 : active raid1 sde2[2] sdd2[1]
      1464087824 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb7[0] sdc7[2]
      1904684920 blocks super 1.2 [2/2] [UU]

unused devices: <none>
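In /proc/mdstat, a healthy two-way mirror shows `[UU]`; an underscore in that field (for example `[U_]`) marks a missing member. A small check along these lines can flag the moment md finally drops a disk. It is a sketch only: `degraded_arrays` is a hypothetical helper name, and it assumes the usual layout where the status field follows the array's name line.

```shell
# List md arrays whose member-status field contains "_" (a missing member).
# Reads /proc/mdstat-style text on stdin; prints "<array> <status>" per degraded array.
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the array being described
        match($0, /\[[U_]+\]/) {               # status field such as [UU] or [U_]
            status = substr($0, RSTART, RLENGTH)
            if (index(status, "_") > 0) print name, status
        }
    '
}
```

Run as `degraded_arrays < /proc/mdstat`; no output means every mirror still has all its members, which matches the listing above.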

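So I think I have some idea of what the problem is, but I am not a Linux sysadmin expert by the remotest stretch of the imagination, and I would really appreciate a sanity check of my diagnosis and of what I need to do:

- Obviously I need to source another drive to replace sdc. (I'm guessing I could buy a larger drive if the price is right: I'm thinking that one day I will need to grow the array, and that would be one drive fewer to replace with a larger one.)
- Then use mdadm to fail the existing sdc, remove it, and fit the new drive.
- fdisk the new drive with a partition the same size as the one the array uses.
- Use mdadm to add the new drive to the array.

Sound about right?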

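Solution

Typically, when a disk hits read errors, it stalls for some time while it tries to correct the error itself, and Linux RAID tolerates a certain amount of disk waiting before it marks the disk as bad. These stalls are probably what is causing the slowdown (especially at the error rate you are seeing).

Your plan to replace the drive is right. However, I would not recommend buying a larger drive with the idea of using part of it for the RAID partition and the rest for something else. It is wiser to stay close to the original disks (in size and speed) to keep the array consistent. That said, in theory you could go bigger, give the array a replacement partition of exactly the required size, and then create another partition for something else (even a member of another array).

A side note that may help with debugging: a tool I like to use as a top replacement is atop (http://www.atoptool.nl/). It gives you a better view of disk I/O per disk and per process, and of where the bottleneck is (you will probably notice that the I/O wait is against the specific disk that has the problem).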
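The replacement plan from the question can be sketched as a dry run that only prints the commands, so they can be reviewed before anything touches the array. This is an illustration under assumptions: `replacement_plan` is a made-up helper name, the device names are taken from the mdstat listing in the question, and the new partition may come up under a different name after the swap. The `--fail`, `--remove` and `--add` options are mdadm's standard manage-mode operations.

```shell
# Print (do not run) the commands for swapping one RAID1 member.
# Usage: replacement_plan ARRAY OLD_MEMBER NEW_MEMBER
# replacement_plan is a hypothetical helper; nothing here modifies the system.
replacement_plan() {
    array=$1 old=$2 new=$3
    echo "mdadm $array --fail $old"      # mark the failing member as faulty
    echo "mdadm $array --remove $old"    # detach it from the array
    echo "# power down, swap the drive, partition it so $new is at least as large as $old"
    echo "mdadm $array --add $new"       # add the new partition; the resync starts
    echo "cat /proc/mdstat"              # watch the rebuild progress
}
```

For the array in the question this would be called as `replacement_plan /dev/md0 /dev/sdc7 /dev/sdc7`. Any other partitions on the old drive that belong to other arrays or filesystems would need the same treatment before the drive is pulled.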


Original URL: https://outofmemory.cn/yw/1041284.html
