bloom filter
A Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set.
The price paid for this efficiency is that a Bloom filter is a probabilistic data structure: it tells us that the element either definitely is not in the set or may be in the set.
The base data structure of a Bloom filter is a Bit Vector.
Bloom filter 是一个数据结构,可以快速查询一个元素是否在一个集合中。
它是概率性的数据结构,每次的结果会告诉,这个元素一定不再集合中,或者可能在集合中。
实现它的方法就是用一个bit vector
使用几个hash方程来hash存入的元素,并且将这些hash值再bit vector里面标记。
Bloom Filter 参数
Bloom filter with k hashes
, m bits
in the filter, and n elements
that have been inserted.
Bloom Filter 时间,空间
Given a Bloom filter with m bits and k hashing functions, both insertion and membership testing are O(k). That is, each time you want to add an element to the set or check set membership, you just need to run the element through the k hash functions and add it to the set or check those bits.
Reference:
https://llimllib.github.io/bloomfilter-tutorial/
https://en.wikipedia.org/wiki/Bloom_filter
https://blog.medium.com/what-are-bloom-filters-1ec2a50c68ff