Consistent Hashing in Python

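I came across a consistent hashing implementation that someone had written in Java, and with a free afternoon I tried writing my own in Python…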

#!/usr/bin/env python3
# coding: utf-8
import bisect
import zlib, requests, re
from hashlib import md5
from collections import OrderedDict, Counter


class RingHash:
    def __init__(self, nodelist=None, v_nodes=20):
        self.nodelist = []
        self.v_nodes = v_nodes      # number of virtual nodes per real node
        self.ring = OrderedDict()   # real node -> hashes of its virtual nodes
        self._sort_vnode_list = []
        # Register nodes one by one; the caller's list is never mutated.
        for node in nodelist or []:
            self.add_node(node)

    def _rebuild(self):
        # Flatten every virtual-node hash into one sorted list for lookups.
        self._sort_vnode_list = sorted(v_node for inner in self.ring.values()
                                       for v_node in inner)

    def add_node(self, node):
        if node not in self.nodelist:
            self.nodelist.append(node)
        self.ring[node] = [self.gener_key(node + ":" + str(v_node))
                           for v_node in range(self.v_nodes)]
        self._rebuild()

    def remove_node(self, node):
        self.nodelist.remove(node)
        self.ring.pop(node)
        self._rebuild()

    def gener_key(self, node):
        # return int(md5(node.encode('utf-8')).hexdigest(), base=16)
        return zlib.crc32(node.encode('utf-8'))

    def get_node(self, key):
        crc_v = self.gener_key(key)
        # First virtual node strictly greater than the key's hash;
        # wrap around to the start of the ring if there is none.
        idx = bisect.bisect_right(self._sort_vnode_list, crc_v)
        if idx == len(self._sort_vnode_list):
            idx = 0
        return self.get_real_node(self._sort_vnode_list[idx])

    def get_real_node(self, v_node):
        # Map a virtual-node hash back to the real node that owns it.
        for k, v in self.ring.items():
            if v_node in v:
                return k

    def show_server(self):
        print('-'*5+'the consistance hash ring contain server'+'-'*5)
        print(self.ring.keys())
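
The class has a remove_node method that the test below never exercises. As a rough illustration of why it matters, here is a minimal standalone sketch (not the class above; `build_ring`, `lookup`, and the `key-N` sample keys are names made up for this example) that checks the property consistent hashing is meant to provide: when a server is removed, only the keys that lived on that server move.

```python
import bisect
import zlib


def crc32_key(text):
    # In Python 3, zlib.crc32 returns an unsigned 32-bit integer.
    return zlib.crc32(text.encode("utf-8"))


def build_ring(nodes, v_nodes=20):
    # Sort (virtual-node hash, real node) pairs, then split them into
    # two parallel lists so lookups can bisect on the hashes alone.
    pairs = sorted((crc32_key(f"{node}:{i}"), node)
                   for node in nodes for i in range(v_nodes))
    return [h for h, _ in pairs], [n for _, n in pairs]


def lookup(ring, key):
    hashes, owners = ring
    # First virtual node clockwise from the key, wrapping to index 0.
    idx = bisect.bisect_right(hashes, crc32_key(key))
    return owners[idx % len(owners)]


nodes = ["192.168.0.1", "192.168.0.2", "192.168.0.3", "192.168.0.4"]
keys = [f"key-{n}" for n in range(1000)]  # hypothetical sample keys

before = build_ring(nodes)
after = build_ring([n for n in nodes if n != "192.168.0.3"])

moved = [k for k in keys if lookup(before, k) != lookup(after, k)]
# Every key that changed servers was previously on the removed server.
print(len(moved), all(lookup(before, k) == "192.168.0.3" for k in moved))
```

Keys owned by the three remaining servers keep their assignment because their successor virtual node on the ring is still there.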

Test

def get_links_by_html(html):
    # Extract every href value from the <a> tags in the page.
    webpage_regex = re.compile('<a[^>]+href=["\'](.*?)["\']', re.IGNORECASE)
    return webpage_regex.findall(html)


server_list = ['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4']
# Fetch the page once and reuse the same set of links for every ring size.
links = set(get_links_by_html(requests.get('http://zhxfei.com/').text))
for i in range(1, 300, 10):
    ring = RingHash(server_list, i)
    ring.show_server()
    print(Counter(map(ring.get_node, links)))

Results

-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 29, '192.168.0.2': 15, '192.168.0.4': 12, '192.168.0.3': 10})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 23, '192.168.0.2': 18, '192.168.0.4': 13, '192.168.0.3': 12})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 23, '192.168.0.2': 19, '192.168.0.3': 13, '192.168.0.4': 11})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 22, '192.168.0.2': 22, '192.168.0.4': 12, '192.168.0.3': 10})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 25, '192.168.0.2': 19, '192.168.0.4': 16, '192.168.0.3': 6})
...

Something looks off with what I wrote… the distribution doesn't seem very even. LOL

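I suspect the hash function is the problem. At first I planned to use md5 from hashlib, but the hash ring is described as covering 0 to 2**32-1, so I went with the CRC32 algorithm suggested on the official site…
Frustrating…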

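If gener_key is changed to generate an md5 hash instead, the distribution becomes much more even. In the runs shown here, the per-server counts range from 6 to 29 with CRC32 but only from 13 to 19 with md5: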

-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 18, '192.168.0.2': 17, '192.168.0.3': 16, '192.168.0.4': 15})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 19, '192.168.0.2': 16, '192.168.0.3': 16, '192.168.0.4': 15})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.2': 18, '192.168.0.4': 16, '192.168.0.1': 16, '192.168.0.3': 16})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.2': 19, '192.168.0.1': 17, '192.168.0.4': 15, '192.168.0.3': 15})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 19, '192.168.0.2': 18, '192.168.0.4': 15, '192.168.0.3': 14})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.1': 19, '192.168.0.2': 18, '192.168.0.4': 15, '192.168.0.3': 14})
-----the consistance hash ring contain server-----
odict_keys(['192.168.0.1', '192.168.0.2', '192.168.0.3', '192.168.0.4'])
Counter({'192.168.0.2': 19, '192.168.0.1': 18, '192.168.0.4': 16, '192.168.0.3': 13})
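
Both sets of numbers above depend on whatever links a live web page happens to contain. As a rough way to repeat the comparison offline, here is a sketch that runs the same CRC32 versus md5 comparison over synthetic keys. The `distribution` helper and the `example.com` URLs are assumptions invented for this example, and the counts it prints will not match the ones above. One possible explanation for the difference, offered tentatively, is that CRC32 is a checksum built for error detection rather than for spreading similar strings evenly.

```python
import bisect
import zlib
from collections import Counter
from hashlib import md5


def crc32_key(text):
    return zlib.crc32(text.encode("utf-8"))


def md5_key(text):
    # Same expression as the commented-out line in gener_key.
    return int(md5(text.encode("utf-8")).hexdigest(), base=16)


def distribution(hash_fn, nodes, keys, v_nodes):
    # Build a sorted ring of (virtual-node hash, real node) pairs,
    # then count how many keys land on each real node.
    pairs = sorted((hash_fn(f"{node}:{i}"), node)
                   for node in nodes for i in range(v_nodes))
    hashes = [h for h, _ in pairs]
    counts = Counter()
    for key in keys:
        idx = bisect.bisect_right(hashes, hash_fn(key))
        counts[pairs[idx % len(pairs)][1]] += 1
    return counts


nodes = ["192.168.0.1", "192.168.0.2", "192.168.0.3", "192.168.0.4"]
# Synthetic stand-ins for the scraped links.
keys = [f"http://example.com/page/{n}" for n in range(10000)]

for name, fn in (("crc32", crc32_key), ("md5", md5_key)):
    for v_nodes in (1, 51, 201):
        counts = distribution(fn, nodes, keys, v_nodes)
        print(name, v_nodes, sorted(counts.items()))
```

A larger key set than the few dozen links in the original test should make any difference between the two hash functions easier to separate from sampling noise.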

But md5 is 128 bits…
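
One way around the size mismatch, sketched here under the assumption that the ring really must stay within 0 to 2**32-1, is to keep only 32 bits of the md5 digest. The function names `md5_key_32` and `md5_key_mod` are made up for this example. Note also that Python integers have arbitrary precision, so the sorted-list lookup used in this post works with full 128-bit values too; it only needs the hashes to be comparable.

```python
from hashlib import md5

RING_SIZE = 2 ** 32  # ring positions are 0 .. 2**32 - 1


def md5_key_32(text):
    # Keep the first 4 bytes of the 16-byte digest as a big-endian integer.
    digest = md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def md5_key_mod(text):
    # Alternative: reduce the full 128-bit value modulo the ring size,
    # which keeps the last 4 bytes of the digest instead of the first 4.
    return int(md5(text.encode("utf-8")).hexdigest(), 16) % RING_SIZE


for label in ("192.168.0.1:0", "192.168.0.1:1"):
    print(label, md5_key_32(label), md5_key_mod(label))
```

Either function could stand in for the body of gener_key; which 32 bits are kept should not matter much, as long as the same choice is used for both nodes and keys.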
