linux – LSI RAID controller errors during a database import – how to troubleshoot?

We are running a database dump import on an Oracle system (RHEL 5.9, 2.6.18-348.6.1.el5). The import does not complete, and the final error output is:

    ORA-15080: synchronous I/O operation to a disk failed
    WARNING: failed to write mirror side 1 of virtual extent 248 logical extent 0 of file 280 in group 1 on disk 1 allocation unit 986
    Errors in file /u01/app/oracle/diag/rdbms/dbprod/DBPROD/trace/DBPROD_lgwr_24520.trc:
    ORA-00345: redo log write error block 509314 count 2023
    ORA-00312: online log 1 thread 1: '+DATA/dbprod/redo01.log'
    ORA-15081: failed to submit an I/O operation to a disk
    ORA-15081: failed to submit an I/O operation to a disk

There are corresponding errors in the kernel ring buffer and in /var/log/messages.

The drive array holding the import is a 10-disk SAS array in RAID 1+0 using 300GB 10k disks. The RAID controller is an LSI MegaRAID SAS 9260-8i. MegaCli reports no disk or adapter errors.

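- Is this a hardware problem?
- Is there any way to troubleshoot it? The RAID controller status is fine. The disks and logical drives report healthy.
- Is this a Linux OS or tuning issue? I will try a different I/O scheduler. CFQ is the default.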

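Edit:

Other schedulers have been tried, with the same results. There is a third-party (Vormetric) filesystem encryption module running in this setup. Removing it allows the import to complete. So now I'm wondering whether this is a deficiency in the module or whether it is triggering a bad condition in the LSI driver.

During the import we reach 14,000 write IOPS.

On the most recent attempt, the system stopped completely, with the following on the console.

Last output before the freeze: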

    Jun 12 18:54:42 db1-test kernel: megasas: build_ld_io error, sge_count = 51
    Jun 12 18:54:42 db1-test kernel: megasas: Err returned from build_and_issue_cmd
    Jun 12 18:54:42 db1-test kernel: megasas: build_ld_io error, sge_count = 51
    Jun 12 18:54:42 db1-test kernel: megasas: Err returned from build_and_issue_cmd
    Jun 12 18:54:42 db1-test kernel: sd 0:2:1:0: timing out command, waited 360s
    Jun 12 18:54:42 db1-test kernel: sd 0:2:1:0: Unhandled error code
    Jun 12 18:54:42 db1-test kernel: sd 0:2:1:0: SCSI error: return code = 0x…000
    Jun 12 18:54:42 db1-test kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT,SUGGEST_OK

Solution

In the end, Sergey is right: this is a driver problem. But let's check a few things first:

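First, you need to use the deadline I/O scheduler instead of CFQ. As the name implies, deadline tries to make sure every I/O request completes in a timely manner.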
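
Switching schedulers can be sketched with the sysfs interface as follows. The helper names (`active_scheduler`, `show_scheduler`, `set_scheduler`) are made up for illustration, and `sda` is only an assumption about which block device backs the array; check which device the logical drive maps to before using it.

```shell
# Print the active scheduler from a sysfs scheduler line read on stdin,
# e.g. "noop anticipatory deadline [cfq]" -> "cfq".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Show the active scheduler for a block device name such as "sda".
show_scheduler() {
    active_scheduler < "/sys/block/$1/queue/scheduler"
}

# Switch a block device to the given scheduler.
# Needs root; the change is lost at reboot.
set_scheduler() {
    echo "$2" > "/sys/block/$1/queue/scheduler"
}

# Example (as root), assuming the array is sda:
#   show_scheduler sda
#   set_scheduler sda deadline
```

A change made this way lasts only until the next reboot. Booting with `elevator=deadline` on the kernel command line is the usual way to make it the default for all devices.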

Grab the events from the MegaRAID card:

    megacli -adpeventlog -getevents -f /tmp/megaraid-$(date +%F_%T) -aALL

Check the SMART data on the disks (you will need to build a newer smartmontools for this to work):

    # megacli -pdList -a0 | grep 'Device Id'
    Device Id: 10
    Device Id: 9
    # smartctl -a /dev/sda -d megaraid,9
    «…»
    # smartctl -a /dev/sda -d megaraid,10
    «…»
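
With ten disks behind the controller, the per-disk check is easier as a loop. This is a minimal sketch: `device_ids` and `smart_commands` are hypothetical helper names, `/dev/sda` is assumed to be the block device exposed by the controller, and the `Device Id: N` line format is assumed from the output shown above. It prints the `smartctl` commands rather than running them, so they can be reviewed first.

```shell
# Extract physical-disk device IDs from `megacli -pdList` output on stdin.
# Anchoring on "^Device" skips any line that merely contains "Device ID".
device_ids() {
    sed -n 's/^Device I[dD]: *\([0-9][0-9]*\).*/\1/p'
}

# Print one smartctl command per device ID found on stdin.
# $1 is the block device exposed by the RAID controller.
smart_commands() {
    dev="$1"
    device_ids | while read -r id; do
        echo "smartctl -a $dev -d megaraid,$id"
    done
}

# Example: review the commands, then pipe them to sh to run them.
#   megacli -pdList -a0 | smart_commands /dev/sda
#   megacli -pdList -a0 | smart_commands /dev/sda | sh
```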

If everything checks out, go ahead and try the latest driver from LSI.
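
Before and after a driver update it helps to record which driver version is actually in use. The sketch below assumes the controller is driven by the `megaraid_sas` kernel module, which matches the `megasas:` prefix in the kernel log. The function names are hypothetical.

```shell
# Extract the "version:" field from `modinfo` output read on stdin.
# The exact match on the first field skips "srcversion:" and "vermagic:".
driver_version() {
    awk '$1 == "version:" { print $2; exit }'
}

# Report the version of the module file on disk and, when sysfs
# exposes it, the version of the module currently loaded.
report_megaraid_version() {
    modinfo megaraid_sas | driver_version
    if [ -r /sys/module/megaraid_sas/version ]; then
        cat /sys/module/megaraid_sas/version
    fi
}

# Example:
#   report_megaraid_version
```

If the two values differ, the new module is installed but the old one is still loaded, typically because the initrd was not rebuilt or the machine has not been rebooted.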

"There is a third-party (Vormetric) filesystem encryption module running in this setup. Removing it allows the import to complete. So now I'm wondering if this is a deficiency in the module or if it is triggering a bad condition in the LSI driver."

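The Vormetric module may well be doing something incompatible, yes. I would start by talking to them about how their module screws up the system under high load.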

Original article: http://outofmemory.cn/yw/1041510.html