hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

BioCracker · posted 2003-4-30 13:11:19

OS: Red Hat Linux 9
hda is the only hard disk in the machine.

I enabled S.M.A.R.T. in the BIOS, but no warning appeared at power-on.

The kernel boot log shows:
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

Someone told me the disk is about to die. I have been using it for just over a year...

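北南南北 · posted 2003-4-30 17:11:38

Linux may not support SMART by default; if you want that support, you may have to compile the kernel. SMART monitors whether the disk is reading normally while it runs, and it can detect bad sectors.

Is this the SMART you mean?
If you want to look into how well Linux supports SMART, you could try compiling a kernel.

I don't know much about this, so please see the link below.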

http://www.google.com/search?hl= ... 9C%E7%B4%A2&lr=

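BioCracker · posted 2003-4-30 19:48:04

I only turned SMART on because I had seen this error. With SMART off, I get the same long string of errors.

北南南北 · posted 2003-4-30 19:56:49

I just looked this up. Have a look at the conversation below...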

Hard Disk: BadCRC errors from dma_intr on bootup...

From Karthik Subramanian

Answered By Jay R. Ashworth, Chris Gianakopoulos, Didier Heyden, Johan H

Before I start, many thanks for the good work :-)

(!) [Jay] We try.

Some of us are very trying, but you're expected to not notice. :-)

(?) I have a Samsung SV2042H (20 GB) as my primary master, and an ATAPI CD-ROM of unknown make as my primary slave.

I recently noticed the following messages on bootup: (extract from my /var/log/boot.msg)

<4>Freeing unused kernel memory: 112k freed
<4>hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
<4>hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
<4>hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
<4>hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
<4>hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
<4>hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
<4>hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
<4>hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
<4>hdb: DMA disabled
<4>ide0: reset: success

1) What do the dma_intr messages mean? Does my HDD go to the junk heap, or is it possible for me to continue working with it? I have had no problems with it so far, despite the error messages.

(!) [Jay] I've been seeing something similar; same results, i.e., nothing.

I think the IDE drivers got changed...

(!) [Didier] The DMA interrupt handler in the IDE driver seems to detect a data transfer failure (BadCRC) 4 times consecutively. All drives present on the corresponding IDE interface are then reset; in such a case (at least if you run a 2.4.x kernel), (U)DMA is disabled on both drives, even though you're told so only for your /dev/hdb CDROM (don't ask me why the kernel people have chosen to do so; one would have expected the faulty drive, hda, to be mentioned in a `DMA disabled' message as well).

The fact that everything works fine (?) after the reset (no more awful messages and your system does boot, obviously) is reassuring: if your hard drive is indeed ready for something this is not (yet) for being sold back to your worst enemy ;) Be careful, though...
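
For reference, the hex values in these messages are bit masks read from the drive's status and error registers. The Python sketch below decodes them using the labels the old Linux IDE driver prints. The bit tables are an assumption written from the usual ATA register layout, not something stated in this thread, so check them against the kernel source for your version before relying on them.

```python
# Status register bits, highest bit first (labels as printed by the IDE driver).
# Assumption: standard ATA layout; 0x80 (Busy) is handled separately below.
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(value):
    """Return the flag names set in an ATA status byte."""
    if value & 0x80:
        # While the drive is busy the other bits are not meaningful.
        return ["Busy"]
    return [name for mask, name in STATUS_BITS if value & mask]


def decode_error(value):
    """Return the flag names set in an ATA error byte."""
    names = []
    aborted = bool(value & 0x04)
    if aborted:
        names.append("DriveStatusError")
    if value & 0x80:
        # Bit 7 reads as a transfer CRC error when the command was also
        # aborted, and as a bad sector otherwise.
        names.append("BadCRC" if aborted else "BadSector")
    if value & 0x40:
        names.append("UncorrectableError")
    if value & 0x10:
        names.append("SectorIdNotFound")
    if value & 0x02:
        names.append("TrackZeroNotFound")
    if value & 0x01:
        names.append("AddrMarkNotFound")
    return names


print(decode_status(0x51))
print(decode_error(0x84))
```

Read this way, status=0x51 is three independent flags (drive ready, seek complete, error), and error=0x84 is an aborted command together with a CRC failure on the transfer.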

(!) [Johan] If DMA is enabled on a controller that is not well supported, these errors can appear. (I had it on a VIA KT266a with kernel 2.2. Upgrading to kernel 2.4 fixed it beautifully.)

If you are sure that the IDE controller is supported, the drive is on its way out. You can run fsck with the badblock option turned on to mark these blocks as bad... As a rule, once these errors start, we throw the disk away (this is a high-availability production environment).

If you don't mind that the disk may crash in the near future, make a backup and continue using it; it might work for a long time to come.

If the disk is under guarantee... take it back, it is not worth risking data loss if the drive can be replaced for free.

This is how you hunt for and fix badblocks.

# e2fsck -c /dev/hda1

Make sure that you have a backup; badblocks scans can destroy data when run with certain switches.

# man badblocks && man e2fsck (And read them carefully)

To turn off dma per drive

# hdparm -d0 /dev/hd[a-d]

To list dma settings

# hdparm -d /dev/hd[a-d]

To turn dma on

# hdparm -d1 /dev/hd[a-d]

Where hd[a-d] is hda, hdb, hdc, hdd.
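
The three hdparm invocations above can be wrapped in a small Python helper. This is a sketch only: the function names and the dry-run default are hypothetical, and it uses just the `-d`, `-d0` and `-d1` flags quoted in this answer. It prints the command lines by default rather than running them, since changing DMA settings needs root and can affect a live disk.

```python
import subprocess

# The four classic IDE device names mentioned above.
DRIVES = ["/dev/hda", "/dev/hdb", "/dev/hdc", "/dev/hdd"]


def hdparm_dma_argv(device, enable=None):
    """Build the hdparm command line for one drive.

    enable=None  -> query the current setting (-d)
    enable=True  -> turn DMA on  (-d1)
    enable=False -> turn DMA off (-d0)
    """
    if enable is None:
        flag = "-d"
    else:
        flag = "-d1" if enable else "-d0"
    return ["hdparm", flag, device]


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(argv))
    if dry_run:
        return None
    return subprocess.run(argv, check=True)


# Show the command that would list the DMA setting of every drive.
for drive in DRIVES:
    run(hdparm_dma_argv(drive))
```

Calling `run(hdparm_dma_argv("/dev/hda", enable=False), dry_run=False)` as root would correspond to the `hdparm -d0 /dev/hda` line above.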

(?) 2) I didn't see any options to turn DMA off for the peripherals in my BIOS options, so why/how is DMA being disabled for hdb? (I put an 'hdparm -d1 /dev/hdb' in my /etc/boot.local to enable DMA for hdb.)

(!) [Didier] You can pass an `ide=nodma' option to the boot loader to achieve this. Note that in the present case you'd better remove the `hdparm' line from your bootup script (-d1 is for forcing DMA on). Unfortunately I don't think it can be done on a per-drive basis (nor even on a per-interface basis).

To clarify, (U)DMA at kernel startup can only be globally disabled. You'll then have to fiddle with the hdparm utility to change this for a given drive (at your own risk).

There doesn't seem to be any `hdx=nodma' (x = 'a', 'b', 'c' or 'd') nor `idex=nodma' (x = '0' or '1') kernel options available at present -- the so-called note has been inserted at a wrong place

Apart from this you could try setting your CDROM drive as master on the IDE1 interface.

(?) 3) What does the number 4 prepended to the messages in /var/log/boot.msg (there are other numbers for the other messages) mean?

(!) [Jay] You're running Mandrake, aren't you? :-)

It's got something to do with the "debug level" that produces that particular line of kprintf output, I believe.

(!) [Didier] This number is most probably the log level associated to the given kernel message (<4> is usually the default value and corresponds to the KERN_WARNING level). A log level of <0> is for emergency conditions (system unusable) and <7> for debug messages.
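
As an illustration of the log levels Didier describes, the Python sketch below splits the `<N>` prefix off a line from /var/log/boot.msg and maps it to the kernel's level name. The function name `split_level` is hypothetical; the eight level names are the standard kernel ones.

```python
import re

# Kernel log levels, from most to least severe.
KERN_LEVELS = {
    0: "KERN_EMERG",
    1: "KERN_ALERT",
    2: "KERN_CRIT",
    3: "KERN_ERR",
    4: "KERN_WARNING",
    5: "KERN_NOTICE",
    6: "KERN_INFO",
    7: "KERN_DEBUG",
}

# A single digit 0-7 in angle brackets at the start of the line.
PREFIX = re.compile(r"^<([0-7])>(.*)$")


def split_level(line):
    """Return (level_name, message); level_name is None if there is no prefix."""
    match = PREFIX.match(line)
    if match is None:
        return None, line
    return KERN_LEVELS[int(match.group(1))], match.group(2)


print(split_level("<4>hdb: DMA disabled"))
```

Every line in the extract above carries `<4>`, so all of them were logged at the warning level.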

(!) [Chris] Hi there, I believe that dma_intr implies that a DMA interrupt occurred that is associated with your hard disk controller. You might be getting a Seek Complete error due to a bad CRC. In other words, either your media (the actual sectors of your hard disk platter) might be corrupt, or you might have a problem with your cabling.

Before I trashed the drive, I would unplug and replug the IDE cable from your disk controller AND your hard drive. Your disk controller might reside on your motherboard, and in that case, you would unplug the cable from the motherboard. You might also try a different IDE cable (the 40-pin ribbon cable) between your disk and the disk controller.

I start to worry when I see the BadCRC error messages, because when that happened to me, the hard disk eventually became useless. Make sure that you back up any data that you want to keep.

I saw those error messages on my son's computer when I gave him one of my hard drives that happened to be lying around. It was a 2 GB hard drive. At first, the messages were an annoyance during boot up. As time passed, we could not even get the system to boot up without running through fsck. Finally, things got so bad that fsck couldn't fix the filesystems. The drive is now on display, in parts, so that my son can show off the disk platters to his friends.

Good luck, and don't forget to back up your data.

http://www.linuxgazette.com/issue76/tag/10.html
