cuckoo-filters: adding some documentation.
parent
eba31d687f
commit
ca05d4cefb
@@ -0,0 +1,28 @@
+USING: byte-arrays checksums help.markup help.syntax kernel ;
+IN: cuckoo-filters
+
+HELP: cuckoo-insert
+{ $values { "bytes" byte-array } { "cuckoo-filter" cuckoo-filter } { "?" boolean } }
+{ $description "Insert the data into the " { $snippet "cuckoo-filter" } ", returning " { $link t } " if the data was inserted." }
+{ $notes "Attempting to insert data twice will result in the hashed fingerprint of the data appearing twice and the " { $link cuckoo-filter } " size being incremented twice." } ;
+
+HELP: cuckoo-lookup
+{ $values { "bytes" byte-array } { "cuckoo-filter" cuckoo-filter } { "?" boolean } }
+{ $description "Look up the data in the " { $snippet "cuckoo-filter" } ", returning " { $link t } " if the data appears to be a member. This is a probabilistic test, meaning there is a possibility of false positives." } ;
+
+HELP: cuckoo-delete
+{ $values { "bytes" byte-array } { "cuckoo-filter" cuckoo-filter } { "?" boolean } }
+{ $description "Remove the data from the " { $snippet "cuckoo-filter" } ", returning " { $link t } " if the data appears to be removed." } ;
+
+ARTICLE: "cuckoo-filters" "Cuckoo Filters"
+"Cuckoo Filters are probabilistic data structures similar to Bloom Filters that provide support for removing elements without significantly degrading space and performance."
+$nl
+"Instead of storing the elements themselves, a Cuckoo Filter stores a fingerprint of each element obtained by using a " { $link checksum } ". This allows for item removal without false negatives (assuming you do not try to remove an item not contained in the filter)."
+$nl
+"For applications that store many items and target low false-positive rates, Cuckoo Filters can have a lower space overhead than Bloom Filters."
+$nl
+"More information is available in the paper by Fan, Andersen, Kaminsky, and Mitzenmacher titled \"Cuckoo Filter: Practically Better Than Bloom\":"
+$nl
+{ $url "http://www.pdl.cmu.edu/PDL-FTP/FS/cuckoo-conext2014.pdf" } ;
+
+ABOUT: "cuckoo-filters"
@@ -54,43 +54,43 @@ TUPLE: cuckoo-filter buckets checksum size ;
 : <cuckoo-filter> ( capacity -- cuckoo-filter )
     <cuckoo-buckets> sha1 0 cuckoo-filter boa ;

-:: cuckoo-insert ( obj cuckoo-filter -- ? )
-    obj cuckoo-filter tag-indices :> ( tag! i1 i2 )
+:: cuckoo-insert ( bytes cuckoo-filter -- ? )
+    bytes cuckoo-filter tag-indices :> ( tag! i1 i2 )
     cuckoo-filter buckets>> :> buckets
-    buckets length :> cuckoo-size
+    buckets length :> n
     {
-        [ tag i1 cuckoo-size mod buckets nth bucket-insert ]
-        [ tag i2 cuckoo-size mod buckets nth bucket-insert ]
+        [ tag i1 n mod buckets nth bucket-insert ]
+        [ tag i2 n mod buckets nth bucket-insert ]
     } 0|| [
         cuckoo-filter [ 1 + ] change-size drop t
     ] [
         cuckoo-filter checksum>> :> checksum
-        { i1 i2 } random :> i!
+        2 random zero? i1 i2 ? :> i!
         max-cuckoo-count [
             drop
-            tag i cuckoo-size mod buckets nth bucket-swap tag!
+            tag i n mod buckets nth bucket-swap tag!
             tag i alt-index i!

-            tag i cuckoo-size mod buckets nth bucket-insert
+            tag i n mod buckets nth bucket-insert
             dup [ cuckoo-filter [ 1 + ] change-size drop ] when
         ] find-integer >boolean
     ] if ;

-:: cuckoo-lookup ( obj cuckoo-filter -- ? )
-    obj cuckoo-filter tag-indices :> ( tag i1 i2 )
+:: cuckoo-lookup ( bytes cuckoo-filter -- ? )
+    bytes cuckoo-filter tag-indices :> ( tag i1 i2 )
     cuckoo-filter buckets>> :> buckets
-    buckets length :> cuckoo-size
+    buckets length :> n
     {
-        [ tag i1 cuckoo-size mod buckets nth bucket-lookup ]
-        [ tag i2 cuckoo-size mod buckets nth bucket-lookup ]
+        [ tag i1 n mod buckets nth bucket-lookup ]
+        [ tag i2 n mod buckets nth bucket-lookup ]
     } 0|| ;

-:: cuckoo-delete ( obj cuckoo-filter -- ? )
-    obj cuckoo-filter tag-indices :> ( tag i1 i2 )
+:: cuckoo-delete ( bytes cuckoo-filter -- ? )
+    bytes cuckoo-filter tag-indices :> ( tag i1 i2 )
     cuckoo-filter buckets>> :> buckets
-    buckets length :> cuckoo-size
+    buckets length :> n
     {
-        [ tag i1 cuckoo-size mod buckets nth bucket-delete ]
-        [ tag i2 cuckoo-size mod buckets nth bucket-delete ]
+        [ tag i1 n mod buckets nth bucket-delete ]
+        [ tag i2 n mod buckets nth bucket-delete ]
     } 0||
     dup [ cuckoo-filter [ 1 - ] change-size drop ] when ;