math.vectors.simd: slightly faster 'sum' on 256-bit vectors: add the two components then do horizontal add, instead of doing a horizontal add on each one and adding the results

db4
Slava Pestov 2009-09-04 02:23:25 -05:00
parent c92e54b560
commit 6494e7a53b
3 changed files with 5 additions and 7 deletions
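
The reordering described in the commit message can be modelled outside Factor. The sketch below is a plain-Python illustration, not the Factor implementation: each 128-bit half of a 256-bit vector is represented as a list of lanes, and the names `horizontal_add`, `sum_old`, and `sum_new` are hypothetical helpers introduced only for this example. It assumes a horizontal add reduces one register's lanes to a scalar.

```python
def horizontal_add(lanes):
    """Model of a horizontal add: reduce the lanes of one register to a scalar."""
    total = 0.0
    for x in lanes:
        total += x
    return total


def sum_old(lo, hi):
    # Old strategy: one horizontal add per half, then a scalar add.
    return horizontal_add(lo) + horizontal_add(hi)


def sum_new(lo, hi):
    # New strategy: one lane-wise add of the two halves,
    # then a single horizontal add on the combined register.
    combined = [a + b for a, b in zip(lo, hi)]
    return horizontal_add(combined)


# A double-4 vector split into its two 128-bit halves (two doubles each).
lo, hi = [1.0, 2.0], [3.0, 4.0]
assert sum_old(lo, hi) == sum_new(lo, hi) == 10.0
```

Both strategies add the same four values, but the new one performs only one horizontal add. Because the additions happen in a different order, floating-point results may differ in the last bits for inputs that are not exactly representable.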

@@ -140,10 +140,8 @@ INSTANCE: A sequence
 [ [ [ underlying2>> ] bi@ A-rep ] dip call ] 3bi
 \ A boa ; inline
-: A-v->n-op ( v1 quot scalar-quot -- v2 )
-[
-[ [ underlying1>> A-rep ] dip call ]
-[ [ underlying2>> A-rep ] dip call ] 2bi
-] dip call ; inline
+: A-v->n-op ( v1 combine-quot reduce-quot -- v2 )
+[ [ [ underlying1>> ] [ underlying2>> ] bi A-rep ] dip call A-rep ]
+dip call ; inline
 ;FUNCTOR

@@ -139,7 +139,7 @@ M\ actor advance optimized.">
 <" USE: compiler.tree.debugger
 M\ actor advance test-mr mr.">
-} ;
+"An example of a high-performance algorithm that uses SIMD primitives can be found in the " { $vocab-link "benchmark.nbody-simd" } " vocabulary." } ;
 ARTICLE: "math.vectors.simd.intrinsics" "Low-level SIMD primitives"
 "The words in the " { $vocab-link "math.vectors.simd.intrinsics" } " vocabulary are used to implement SIMD support. These words have three disadvantages compared to the higher-level " { $link "math-vectors" } " words:"

@@ -174,7 +174,7 @@ PRIVATE>
 { v/ [ [ (simd-v/) ] double-4-vv->v-op ] }
 { vmin [ [ (simd-vmin) ] double-4-vv->v-op ] }
 { vmax [ [ (simd-vmax) ] double-4-vv->v-op ] }
-{ sum [ [ (simd-sum) ] [ + ] double-4-v->n-op ] }
+{ sum [ [ (simd-v+) ] [ (simd-sum) ] double-4-v->n-op ] }
 } simd-vector-words
 >>