Detecting Device Hardware Errors

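mcelog is a Linux tool for checking hardware errors, particularly memory and CPU errors.
A typical case is a server that reboots for no apparent reason every so often, while the messages log and syslog contain nothing useful.
MCE (Machine Check Exception) errors are usually caused by one of the following:
1. Memory errors or ECC problems
2. Processor overheating
3. System bus errors
4. CPU or hardware cache errors
In general, when an error is reported, look at memory first. However, because the memory controller is now integrated into the CPU, in a few cases the CPU itself is the cause.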
Installing mcelog

yum install mcelog -y
service mcelogd start

Once the service is running, the log shows entries like these:

[root@103.96.77.11 /home]$ less /var/log/mcelog
mcelog: Warning: MCE buffer is overflowed.
Hardware event. This is not a software error.
MCE 0
CPU 31 BANK 11
MISC c900220002021c8c
TIME 1582975847 Sat Feb 29 03:30:47 2020
MCG status:
MCi status:
Error overflow
Corrected error
MCi_MISC register valid
MCA: MEMORY CONTROLLER RD_CHANNEL3_ERR
Transaction: Memory read error
MemCtrl:
STATUS c809fd4e00800093 MCGSTATUS 0
MCGCAP 1000814 APICID 2f SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
Hardware event. This is not a software error.
MCE 1
CPU 30 BANK 11
MISC c900000222221c8c
TIME 1582975847 Sat Feb 29 03:30:47 2020
MCG status:
MCi status:
Error overflow
Corrected error
MCi_MISC register valid
MCA: MEMORY CONTROLLER RD_CHANNEL3_ERR
Transaction: Memory read error
MemCtrl:
STATUS c80316ce00800093 MCGSTATUS 0
MCGCAP 1000814 APICID 2d SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
Hardware event. This is not a software error.
MCE 2
CPU 29 BANK 11
MISC c908002022001c8c
TIME 1582975847 Sat Feb 29 03:30:47 2020
MCG status:
MCi status:
Error overflow
Corrected error
MCi_MISC register valid
MCA: MEMORY CONTROLLER RD_CHANNEL3_ERR
Transaction: Memory read error
MemCtrl:
STATUS c800020e00800093 MCGSTATUS 0
MCGCAP 1000814 APICID 2b SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
Hardware event. This is not a software error.
MCE 3
CPU 28 BANK 11
MISC c900022020201c8c
TIME 1582975847 Sat Feb 29 03:30:47 2020
MCG status:
MCi status:
Error overflow
Corrected error
MCi_MISC register valid
MCA: MEMORY CONTROLLER RD_CHANNEL3_ERR
Transaction: Memory read error
MemCtrl:
STATUS c80273ce00800093 MCGSTATUS 0
MCGCAP 1000814 APICID 29 SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
Hardware event. This is not a software error.
MCE 4
CPU 27 BANK 11
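
The records above can be summarized with a short script. The following is a minimal sketch, not part of mcelog itself: the names `parse_records`, `decode_status`, and `REQUESTS` are hypothetical, and the bit positions are assumptions based on the MCi_STATUS layout in the Intel Software Developer's Manual (bit 63 valid, bit 62 overflow, bit 61 uncorrected, bit 59 MISC valid, bit 58 ADDR valid, and a low 16-bit MCA code of the form 000F 0000 1MMM CCCC for memory controller errors). It assumes the text format shown in this log, which may differ between mcelog versions.

```python
import re
from collections import Counter

# One complete record and one truncated record, copied from the log above.
SAMPLE = """\
Hardware event. This is not a software error.
MCE 0
CPU 31 BANK 11
MISC c900220002021c8c
TIME 1582975847 Sat Feb 29 03:30:47 2020
MCG status:
MCi status:
Error overflow
Corrected error
MCi_MISC register valid
MCA: MEMORY CONTROLLER RD_CHANNEL3_ERR
Transaction: Memory read error
MemCtrl:
STATUS c809fd4e00800093 MCGSTATUS 0
MCGCAP 1000814 APICID 2f SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
Hardware event. This is not a software error.
MCE 4
CPU 27 BANK 11
"""

# Assumed meaning of the MMM field in a memory controller MCA code.
REQUESTS = {
    0: "generic undefined request",
    1: "memory read error",
    2: "memory write error",
    3: "address/command error",
    4: "memory scrubbing error",
}


def decode_status(status):
    """Decode a few fields of a 64-bit MCi_STATUS value (assumed layout)."""
    info = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
    }
    code = status & 0xFFFF
    # Memory controller pattern 000F 0000 1MMM CCCC; bit 12 (F) is ignored.
    if (code & 0xEF80) == 0x0080:
        info["request"] = REQUESTS.get((code >> 4) & 0x7, "reserved")
        channel = code & 0xF
        info["channel"] = None if channel == 0xF else channel
    return info


def parse_records(text):
    """Return one dict per complete record; truncated records are skipped."""
    records = []
    # Every record starts with the "Hardware event." header line.
    for chunk in text.split("Hardware event.")[1:]:
        cpu = re.search(r"^\s*CPU (\d+) BANK (\d+)", chunk, re.M)
        mca = re.search(r"^\s*MCA: (.+)$", chunk, re.M)
        status = re.search(r"^\s*STATUS ([0-9a-fA-F]+)", chunk, re.M)
        socket = re.search(r"SOCKETID (\d+)", chunk)
        if not (cpu and mca and status and socket):
            continue
        records.append({
            "cpu": int(cpu.group(1)),
            "bank": int(cpu.group(2)),
            "socket": int(socket.group(1)),
            "mca": mca.group(1).strip(),
            "status": decode_status(int(status.group(1), 16)),
        })
    return records


if __name__ == "__main__":
    records = parse_records(SAMPLE)
    # Count records per (socket, bank, MCA description).
    summary = Counter((r["socket"], r["bank"], r["mca"]) for r in records)
    for key, count in summary.items():
        print(key, count)
```

To run this against a real log, pass the contents of /var/log/mcelog to `parse_records` instead of `SAMPLE`. In the log above every complete record points at the same socket, bank, and channel; many corrected read errors concentrated on one channel are consistent with the advice to check memory first.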

————————————————
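Copyright notice: This is an original article by CSDN blogger "陌小铠", released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/cy309173854/article/details/79030234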