LinuxSir.cn,穿越时空的Linuxsir!

 找回密码
 注册
搜索
热搜: shell linux mysql
查看: 3740|回复: 2

[算法]纠错码:海明码如何纠错

[复制链接]
发表于 2006-8-11 11:47:37 | 显示全部楼层 |阅读模式
以前学习海明码的时候,一顿死记硬背,到现在早就记不得了,最近由于研究的相关性,又重新看了一遍海明码,有了一些想法,总感觉相关的书籍还有文档写的太抽象,有的恨不得直接搞到编码理论上去,其实我主要需要明白的是海明码怎么纠的错,而不是为什么能纠错,看了一堆的公式,感觉还是没有程序清楚明白,下面是我写的海明码,编码(encode)和纠错(check)的相关程序。

  1. //java program
  2. 1 int[] encode(int[] data){
  3. 2         int n = Math.floor(Math.log(data.length));//the length of hamming code
  4. 3        int[] res = new int[n];
  5. 4        for(int i = 0; i < n; i++){
  6. 5                for(int j = 0; j < data.length; j++){
  7. 6                        if((1<<i) & (j + 1) != 0)
  8. 7                                res[i] ^= data[j];
  9. 8                }
  10. 9        }
  11. 10        return res;
  12. 11 }
  13. 12 int[] check(int[] data, int[] hamming){
  14. 13        int pos = 0;
  15. 14        for(int i = 0; i < hamming.length; i++){
  16. 15                for(int j = 0; j < data.length; j++){
  17. 16                        if((1<<i) & (j + 1) != 0)
  18. 17                                hamming[i] ^= data[j];
  19. 18                }
  20. 19                if(hamming[i] != 0)
  21. 20                        pos += 1<<i;
  22. 21        }
  23. 22        if(pos > 0)
  24. 23                data[pos - 1] = ~data[pos - 1];
  25. 24        return data;
  26. 25 }
复制代码

海明码的纠错位数是1位,最多检测错位为2位,所需要记录海明码的位数floor(log(N)),N表示的是数据的位数,现在我参照程序来解释一下海明码是怎么纠的1位错,最多能检测2位错。

设数据位数为63位,依此存储在data数组中(data.length = 63),代码1,2行是计算所需要的海明码的位数,并声明存储海明码的数组。代码4-7行就是计算海明码的实际代码,可以看出海明码的计算是根据相应数据位的二进制数来计算的,例如对于7而言,它的二进制代码为000111,所以它对海明码的贡献就是影响了海明码第1,2,3位的值,那么假设在只有一个数据位p错的话,它所导致的校验不一致的海明码的位,就是它二进制代码所影响的海明码的位,反过来,只要我们找出所有校验不一致的海明码的位数(代码14-18行),那么也就知道了相应数据位的二进制的值(代码19-20行),也就知道是第几位出错,从而可以把相应的位取反(代码22,23行),达到纠错的效果。

那么为什么海明码最多能检测2位错呢,现在我们来看看海明码是如何检验出两位的错误的,假设第p1,p2(p1<>p2)位数据错误,可以知道,他们相应的二进制位所影响位必不完全相同,因为p1,p2是不同的数,所以必然导致海明码某个校验位不一致,也就检测出错误,但是我们并不能从所得到的不一致的位来推测出p1,p2的值,例如:第3位与第7位都错,那么最后导致不一致的校验位是第3位,根据程序显然得出的是第4位出错(实质上是x+y=z,由z求x,y的不定方程问题,解是不唯一的)。那么为什么海明码无法检测3位或3位以上的错误呢,这是因为3位或3位以上的错误有可能互相影响使得最后的校验值全部一致,例如:取第3,4,7位出错,就会导致这种情况。

实际上知道了上面的知识,也就无所谓海明码了,本质上就是二进制表示的问题,也不会记不住了吧。

附注:
程序没有实际运行过,可能会存在一些bug,不过并不影响思想的表达,还有就是本身水平有限,很多简单的问题我的认识的很不清楚,盼望不要被牛人bs。
发表于 2006-8-13 17:05:47 | 显示全部楼层
不错! hamming code是RAID2检错的基础之一。

找到一个讲解RAID的网址:
http://www.stor-age.com/zhuanti/htm2004/04033017YS4Y.asp
貌似不错,不过里头讲的hamming code怎么和我以前了解的不一样,将得稀里糊涂
回复 支持 反对

使用道具 举报

发表于 2006-8-13 17:45:26 | 显示全部楼层
计算hamming code 这个很清楚:
http://www.cs.fiu.edu/~downeyt/cop3402/hamming.html  
还有这个: http://www.ee.unb.ca/tervo/ee4253/hamming.htm

---------------------------
Calculating the Hamming Code

The key to the Hamming Code is the use of extra parity bits to allow the identification of a single error. Create the code word as follows:

   1. Mark all bit positions that are powers of two as parity bits. (positions 1, 2, 4, 8, 16, 32, 64, etc.)
   2. All other bit positions are for the data to be encoded. (positions 3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 17, etc.)
   3. Each parity bit calculates the parity for some of the bits in the code word. The position of the parity bit determines the sequence of bits that it alternately checks and skips.
      Position 1: check 1 bit, skip 1 bit, check 1 bit, skip 1 bit, etc. (1,3,5,7,9,11,13,15,...)
      Position 2: check 2 bits, skip 2 bits, check 2 bits, skip 2 bits, etc. (2,3,6,7,10,11,14,15,...)
      Position 4: check 4 bits, skip 4 bits, check 4 bits, skip 4 bits, etc. (4,5,6,7,12,13,14,15,20,21,22,23,...)
      Position 8: check 8 bits, skip 8 bits, check 8 bits, skip 8 bits, etc. (8-15,24-31,40-47,...)
      Position 16: check 16 bits, skip 16 bits, check 16 bits, skip 16 bits, etc. (16-31,48-63,80-95,...)
      Position 32: check 32 bits, skip 32 bits, check 32 bits, skip 32 bits, etc. (32-63,96-127,160-191,...)
      etc.
   4. Set a parity bit to 1 if the total number of ones in the positions it checks is odd. Set a parity bit to 0 if the total number of ones in the positions it checks is even.

Here is an example:

A byte of data: 10011010
Create the data word, leaving spaces for the parity bits: _ _ 1 _ 0 0 1 _ 1 0 1 0
Calculate the parity for each parity bit (a ? represents the bit position being set):

    * Position 1 checks bits 1,3,5,7,9,11:
      ? _ 1 _ 0 0 1 _ 1 0 1 0. Even parity so set position 1 to a 0: 0 _ 1 _ 0 0 1 _ 1 0 1 0
    * Position 2 checks bits 2,3,6,7,10,11:
      0 ? 1 _ 0 0 1 _ 1 0 1 0. Odd parity so set position 2 to a 1: 0 1 1 _ 0 0 1 _ 1 0 1 0
    * Position 4 checks bits 4,5,6,7,12:
      0 1 1 ? 0 0 1 _ 1 0 1 0. Odd parity so set position 4 to a 1: 0 1 1 1 0 0 1 _ 1 0 1 0
    * Position 8 checks bits 8,9,10,11,12:
      0 1 1 1 0 0 1 ? 1 0 1 0. Even parity so set position 8 to a 0: 0 1 1 1 0 0 1 0 1 0 1 0
    * Code word: 011100101010.

Finding and fixing a bad bit

The above example created a code word of 011100101010. Suppose the word that was received was 011100101110 instead. Then the receiver could calculate which bit was wrong and correct it. The method is to verify each check bit. Write down all the incorrect parity bits. Doing so, you will discover that parity bits 2 and 8 are incorrect. It is not an accident that 2 + 8 = 10, and that bit position 10 is the location of the bad bit. In general, check each parity bit, and add the positions that are wrong, this will give you the location of the bad bit.
Try one yourself

Test if these code words are correct, assuming they were created using an even parity Hamming Code . If one is incorrect, indicate what the correct code word should have been. Also, indicate what the original data was.

    * 010101100011
    * 111110001100
    * 000010001010
回复 支持 反对

使用道具 举报

您需要登录后才可以回帖 登录 | 注册

本版积分规则

快速回复 返回顶部 返回列表