math.vectors.simd: slightly faster 'sum' on 256-bit vectors: add the two 128-bit halves elementwise, then do a single horizontal add, instead of doing a horizontal add on each half and adding the two scalar results
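
A minimal sketch of why the two strategies give the same result, using plain Factor arrays as stand-ins for the two 128-bit halves of a double-4. It does not touch the SIMD intrinsics, and the literal values are made up for illustration. In general, reordering floating-point additions can change rounding; these values are chosen so that both orderings are exact.

```factor
USING: kernel math math.vectors prettyprint sequences ;

! Hypothetical stand-ins for underlying1 and underlying2.
{ 1.0 2.0 } { 3.0 4.0 }

! Old strategy: reduce each half separately, then add the two scalars.
2dup [ sum ] bi@ + .

! New strategy: add the halves elementwise, then reduce once.
v+ sum .

! Both lines print 10.0
```

The new ordering replaces one of the two horizontal reductions with a single elementwise vector add, which is the source of the small speedup described above.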

2009-09-04 02:23:25 -05:00 · 6494e7a53b
parent c92e54b560
commit 6494e7a53b
3 changed files with 5 additions and 7 deletions
--- a/basis/math/vectors/simd/functor/functor.factor
+++ b/basis/math/vectors/simd/functor/functor.factor
@@ -140,10 +140,8 @@ INSTANCE: A sequence
    [ [ [ underlying2>> ] bi@ A-rep ] dip call ] 3bi
    \ A boa ; inline

-: A-v->n-op ( v1 quot scalar-quot -- v2 )
-    [
-        [ [ underlying1>> A-rep ] dip call ]
-        [ [ underlying2>> A-rep ] dip call ] 2bi
-    ] dip call ; inline
+: A-v->n-op ( v1 combine-quot reduce-quot -- v2 )
+    [ [ [ underlying1>> ] [ underlying2>> ] bi A-rep ] dip call A-rep ]
+    dip call ; inline

 ;FUNCTOR
--- a/basis/math/vectors/simd/simd-docs.factor
+++ b/basis/math/vectors/simd/simd-docs.factor
@@ -139,7 +139,7 @@ M\ actor advance optimized.">
 <" USE: compiler.tree.debugger

 M\ actor advance test-mr mr.">
-} ;
+"An example of a high-performance algorithm that uses SIMD primitives can be found in the " { $vocab-link "benchmark.nbody-simd" } " vocabulary." } ;

 ARTICLE: "math.vectors.simd.intrinsics" "Low-level SIMD primitives"
 "The words in the " { $vocab-link "math.vectors.simd.intrinsics" } " vocabulary are used to implement SIMD support. These words have three disadvantages compared to the higher-level " { $link "math-vectors" } " words:"
--- a/basis/math/vectors/simd/simd.factor
+++ b/basis/math/vectors/simd/simd.factor
@@ -174,7 +174,7 @@ PRIVATE>
    { v/ [ [ (simd-v/) ] double-4-vv->v-op ] }
    { vmin [ [ (simd-vmin) ] double-4-vv->v-op ] }
    { vmax [ [ (simd-vmax) ] double-4-vv->v-op ] }
-    { sum [ [ (simd-sum) ] [ + ] double-4-v->n-op ] }
+    { sum [ [ (simd-v+) ] [ (simd-sum) ] double-4-v->n-op ] }
 } simd-vector-words

 >>