math.extras: adding normalized compression distance and compression-based dissimilarity.

db4
John Benediktsson 2013-03-25 10:33:41 -07:00
parent 4ffbfc2602
commit c3917cdd02
1 changed file with 15 additions and 6 deletions

@@ -1,12 +1,12 @@
 ! Copyright (C) 2012 John Benediktsson
 ! See http://factorcode.org/license.txt for BSD license
-USING: arrays assocs assocs.extras combinators
-combinators.short-circuit fry grouping kernel locals math
-math.combinatorics math.constants math.functions math.order
-math.primes math.ranges math.ranges.private math.statistics
-math.vectors memoize random sequences sequences.extras sets
-sorting ;
+USING: accessors arrays assocs assocs.extras byte-arrays
+combinators combinators.short-circuit compression.zlib fry
+grouping kernel locals math math.combinatorics math.constants
+math.functions math.order math.primes math.ranges
+math.ranges.private math.statistics math.vectors memoize random
+sequences sequences.extras sets sorting ;
 IN: math.extras
@@ -229,3 +229,12 @@ PRIVATE>
             pick = [ 1 + ] [ 1 - ] if
         ] if
     ] each zero? [ drop f ] when ;
+
+: compression-lengths ( a b -- len(a+b) len(a) len(b) )
+    [ append ] 2keep [ >byte-array compress data>> length ] tri@ ;
+
+: compression-distance ( a b -- n )
+    compression-lengths sort-pair [ - ] [ / ] bi* ;
+
+: compression-dissimilarity ( a b -- n )
+    compression-lengths + / ;
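
For reference, reading the stack effects of the added words suggests they compute the quantities below. The notation is an assumption introduced here and does not appear in the commit: C(s) stands for the length in bytes of the zlib-compressed form of s (what `>byte-array compress data>> length` returns), and ab is `a b append`.

```latex
\[
\mathrm{NCD}(a,b) = \frac{C(ab) - \min\{C(a),\,C(b)\}}{\max\{C(a),\,C(b)\}},
\qquad
\mathrm{CDM}(a,b) = \frac{C(ab)}{C(a) + C(b)}
\]
```

`compression-distance` corresponds to NCD: `sort-pair` orders the two individual lengths so that `[ - ]` subtracts the smaller from C(ab) and `[ / ]` divides by the larger. `compression-dissimilarity` corresponds to CDM. Because all three lengths are integers, Factor's `/` returns an exact ratio rather than a float.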