Slava Pestov
b8f75f404e
cpu.ppc: fix typo
2009-10-15 05:01:20 -05:00
Slava Pestov
9e5a9abfe1
cpu.ppc: updates for write barrier and allocation changes (untested)
2009-10-15 04:54:16 -05:00
Slava Pestov
7a6e2f9b07
cpu.ppc.bootstrap: update for JIT relocation changes
2009-10-15 04:47:54 -05:00
Slava Pestov
22e79e8495
compiler: tweak ##write-barrier-imm
2009-10-15 02:40:23 -05:00
Slava Pestov
bfd1f0d6d2
vm: rt-vm relocation now supports accessing a field directly
2009-10-14 19:24:23 -05:00
Slava Pestov
10ad5cad53
Working on adding support for the new write barrier to optimized code
2009-10-14 02:06:01 -05:00
Joe Groff
b82c8b4416
use TEST reg, reg to compare integer equality with zero
2009-10-10 13:13:53 -05:00
Joe Groff
2577ab83a6
only emit ##alien-vector/##set-alien-vector insns if the rep is available
2009-10-10 12:53:10 -05:00
Joe Groff
1e3c9321ae
don't use MOVSLDUP/MOVSHDUP to do specialized shuffles unless sse3 is available
2009-10-10 12:00:47 -05:00
Joe Groff
a7a77cd03e
fix x86 uchar %scalar>integer
2009-10-10 10:39:23 -05:00
Joe Groff
5158a12d32
rename ##shuffle-vector to ##shuffle-vector-imm, and add a new ##shuffle-vector for dynamic shuffles. have vshuffle use ##shuffle-vector to do word and byte shuffles on x86
2009-10-09 21:26:27 -05:00
Slava Pestov
5f0d4abb4a
cpu.architecture: move dummy -reps words here, from cpu.ppc
2009-10-08 03:48:03 -05:00
Joe Groff
98836a9e2e
break vector compare intrinsics into %compare, %or, and %not instructions that map directly to cpu instructions
2009-10-07 15:27:03 -05:00
Joe Groff
43b51ef2eb
decompose %unpack-vector-head/tail into %compare-vector/%merge-vector-head/tail or %tail>head-vector/%unpack-vector-head insns when there isn't an actual unpack insn; get rid of fake x86 implementations
2009-10-07 14:09:46 -05:00
Joe Groff
5152c3b06d
sse doesn't actually have an unsigned->unsigned pack instruction
2009-10-07 12:00:31 -05:00
Joe Groff
a13e75f4f4
don't generate a ##not-vector instruction if the cpu doesn't have one; instead, fall back to a ##fill-vector/##xor-vector combo. get rid of pretend %not-vector in cpu.x86
2009-10-07 11:59:36 -05:00
Joe Groff
444624e79f
fix x86 %unpack-vector insns
2009-10-06 20:38:51 -05:00
Joe Groff
2edccca0bb
oops...PACKUSDW is sse4 only
2009-10-06 20:09:50 -05:00
Joe Groff
425ea05529
%float>integer-vector should truncate
2009-10-06 13:57:54 -05:00
Joe Groff
84ecb1266d
add insns for vector pack, unpack, integer>float, and float>integer
2009-10-05 22:34:14 -05:00
Slava Pestov
931107397c
compiler.cfg: remove _gc instruction, it doesn't need to exist, and change GC checks to ensure that the right amount of space is available instead of blindly checking for 1Kb
2009-10-05 05:27:49 -05:00
Joe Groff
dca9d3e535
add %merge-vector-head and %merge-vector-tail instructions to back vmerge
2009-10-03 21:48:53 -05:00
Joe Groff
335df20713
add intrinsics for v<=, v<, v=, v>, v>=, vunordered?
2009-10-03 11:29:34 -05:00
Joe Groff
14a97b6a05
Merge branch 'master' of git://factorcode.org/git/factor
2009-10-03 10:02:26 -05:00
Joe Groff
b1ec36a324
extend x86 %compare-vector to cover all comparison codes, sometimes stupidly for now
2009-10-02 23:19:56 -05:00
Slava Pestov
dba965486c
cpu.arm.assembler: dust it off, update to work with contemporary Factor, and clean it up a bit
2009-10-02 20:18:34 -05:00
Joe Groff
e2e75c6b3a
add intrinsic for vnot/vbitnot
2009-10-02 20:04:28 -05:00
Slava Pestov
9ac62dc476
cpu.ppc: remove useless comment
2009-10-02 03:31:53 -05:00
Slava Pestov
aa0359f78d
Merge branch 'reentrantvm' of git://github.com/phildawes/factor
2009-10-02 03:28:21 -05:00
Joe Groff
9d424a1092
Merge branch 'master' of git://factorcode.org/git/factor
Conflicts:
basis/compiler/codegen/codegen.factor
2009-10-01 23:14:16 -05:00
Joe Groff
7b13fa4283
fold test-vector/branch sequences into a test-vector-branch instruction
2009-10-01 19:53:30 -05:00
Joe Groff
228ad950bb
%test-vector instruction for vany?, vall?, vnone?
2009-10-01 15:35:38 -05:00
Joe Groff
94070c11aa
%compare-vector instruction (only does v= for now)
2009-10-01 14:31:37 -05:00
Joe Groff
3ba79be651
Revert "add a %blend-vector intrinsic for v?"
This reverts commit 21e4b28b67.
2009-09-30 23:40:37 -05:00
Joe Groff
37a091a188
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-30 23:04:04 -05:00
Joe Groff
21e4b28b67
add a %blend-vector intrinsic for v?
2009-09-30 23:03:59 -05:00
Slava Pestov
65421b111b
math.vectors.simd: use fallbacks for hlshift, hrshift, vshuffle if parameter is not a literal; element access in int-4 on x86-64 now sign-extends the value; don't throw error at compile time if parameter for vshuffle does not have enough elements
2009-09-30 20:04:37 -05:00
Slava Pestov
de58c3c294
cpu.ppc: update for alien intrinsic changes
2009-09-30 18:22:59 -05:00
Phil Dawes
86593598d0
ppc asm to pass vm pointer: alien + compiled code
2009-09-30 21:23:53 +01:00
Slava Pestov
8e201ca4b7
Various minor compiler tweaks: Combine address calculation with dereferencing in alien accessors; convert SIMD XOR of a vector with itself into an XOR of the destination with itself; convert SIMD unbox of zero vector into XOR of the destination with itself; fix SIMD indexing on x86-64
2009-09-30 05:00:36 -05:00
Slava Pestov
2b13245704
math.vectors.simd: add fast intrinsic for 'nth', replace broadcast primitive with shuffles
2009-09-29 04:48:11 -05:00
Slava Pestov
a6e8277b2c
math.vectors.simd: add vshuffle intrinsic
2009-09-28 23:12:13 -05:00
Slava Pestov
db217295b0
Work in progress
2009-09-28 17:31:34 -05:00
Slava Pestov
e343b46479
cpu.ppc: update for %unary/binary-float-function change
2009-09-28 16:40:52 -05:00
Slava Pestov
49dba53760
cpu.x86: cleanups
2009-09-28 16:38:35 -05:00
Joe Groff
4e2e45b70d
use PSHUFD for longlong-2 broadcast when dst != src to avoid a %copy
2009-09-28 12:04:08 -05:00
Joe Groff
3f90473f09
use MOVDDUP for double-2 broadcast to eliminate a %copy
2009-09-28 12:00:03 -05:00
Joe Groff
467c389948
cpu.x86.assembler: make SSE shuffle instructions accept an array of indexes so they're easier to use
2009-09-28 11:45:45 -05:00
Joe Groff
f7d416a4e4
SSE integer gather and broadcast
2009-09-28 11:24:08 -05:00
Slava Pestov
f08521bf83
Fixing various test failures caused by C type parser change, and clarify C type docs some more
2009-09-28 08:48:39 -05:00
Slava Pestov
1109fb5725
math.vectors.simd: add intrinsic for int-4-boa, uint-4-boa, fix tests for C type parser change, fix software fallback for horizontal shifts
2009-09-28 06:34:22 -05:00
Slava Pestov
dc1b6043dc
cpu.x86: shifts didn't work if dst != src1; re-organize file a bit
2009-09-28 05:39:53 -05:00
Slava Pestov
542dd577d9
cpu.x86.32: fix %unary/binary-float-function on Windows; need to look up symbols in libm and not VM binary
2009-09-28 04:51:53 -05:00
Phil Dawes
6f0d25a8b3
ppc asm to pass vm pointer: initial bootstrap
2009-09-28 07:48:37 +01:00
Slava Pestov
daf8f0ebba
cpu.x86: fix regression: fsqrt intrinsic wasn't used
2009-09-28 02:27:55 -05:00
Slava Pestov
10c5fe5933
math.vectors.simd: add hlshift, hrshift (128-bit shift), vbitandn intrinsics
2009-09-28 02:17:46 -05:00
Slava Pestov
e8cfaccef0
compiler.cfg: nuke ##bignum>integer and ##integer>bignum since they were unused
2009-09-27 20:36:05 -05:00
Slava Pestov
6dd8e4657e
Merge branch 'master' into more_aggressive_coalescing
2009-09-27 19:29:50 -05:00
Slava Pestov
6f2a4eba51
compiler.cfg.linear-scan: fix partial sync point logic in case where dst == src, and clean up spilling code
2009-09-27 19:28:20 -05:00
Slava Pestov
2efab6efad
cpu.x86.32: implement %unary-float-function and %binary-float-function; speeds up partial-sums and struct-arrays benchmarks
2009-09-27 18:06:30 -05:00
Slava Pestov
a267100781
compiler.cfg.ssa.destruction: more aggressive coalescing work in progress
2009-09-27 17:17:26 -05:00
Joe Groff
bf3eef9e2d
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-26 20:38:19 -05:00
Joe Groff
e30819bcac
move alien.inline, alien.cxx, alien.marshall to unmaintained; nuke alien.structs
2009-09-26 20:37:42 -05:00
sheeple
24b27f4c42
Fixing PPC backend for ##slot change
2009-09-26 13:21:42 -05:00
sheeple
2b35f52ed2
Merge branch 'slots' of git://factorcode.org/git/factor into slots
Conflicts:
basis/cpu/x86/x86.factor
2009-09-26 03:12:42 -05:00
Daniel Ehrenberg
01082d743d
An attempt at porting the slot change to PPC
2009-09-26 02:58:18 -05:00
Daniel Ehrenberg
364332bd70
Completing slot and set-slot changes on x86
2009-09-26 01:39:48 -05:00
Daniel Ehrenberg
fb7f6ab455
Making ##slot and ##set-slot not have a temporary parameter
2009-09-26 00:28:14 -05:00
Slava Pestov
2e1be3f513
cpu: cleanups
2009-09-25 21:47:05 -05:00
Phil Dawes
64aa4fba9f
removed %vm-invoke-*-arg completely
2009-09-25 20:03:03 +01:00
Phil Dawes
5b404aae7e
moved %(un)nest-stacks out to cpu specific files to eliminate %vm-invoke from compiler.codegen
2009-09-25 19:32:08 +01:00
Phil Dawes
f9e736c1f0
isolated %vm-invoke-blah-arg crap to 64.factor
2009-09-25 19:02:41 +01:00
Phil Dawes
baa41f451f
removed param-reg-* HOOKs
2009-09-25 18:58:55 +01:00
Phil Dawes
c0957ed908
compiler.codegen passes temp reg to %call-gc
2009-09-25 18:48:13 +01:00
Phil Dawes
aa71248937
made inline_gc a VM_C_API function
2009-09-25 18:29:07 +01:00
Slava Pestov
fab916fb97
Merge branch 'fix_stack_alignment' of git://github.com/phildawes/factor
2009-09-24 19:54:51 -05:00
Phil Dawes
8b005f5b1d
make inline_gc regparm(3) and cleaned up %call-gc stack alignment
2009-09-24 21:45:56 +01:00
Slava Pestov
a562722c4c
cpu.ppc: add representation hooks for shifts
2009-09-24 13:00:12 -05:00
Slava Pestov
2ea0b9da1d
Merge branch 'vm_cleanup' of git://github.com/phildawes/factor
2009-09-24 04:31:55 -05:00
Slava Pestov
1b30310a35
cpu.x86: don't generate SSE2 instructions if only SSE1 is available
2009-09-24 04:07:15 -05:00
Slava Pestov
a702bfa215
cpu.ppc: fix compile errors
2009-09-24 03:55:01 -05:00
Slava Pestov
24039cb56a
math.vectors.simd: add v<< and v>> intrinsics for bitwise shifts on elements
2009-09-24 03:32:39 -05:00
Phil Dawes
c747e39923
x86 bootstrap cleanup: renamed arg to arg1
2009-09-24 08:16:57 +01:00
Phil Dawes
911471c411
removed superfluous whitespace lines
2009-09-24 08:02:14 +01:00
Slava Pestov
a345c26a14
cpu.ppc: make it load
2009-09-24 00:13:27 -05:00
Slava Pestov
7c4632d2b9
cpu.ppc: fix typos
2009-09-23 23:38:17 -05:00
Slava Pestov
3581d0b09b
cpu.x86/ppc: unify register-to-register moves using %copy so that better coalescing can eliminate more moves later
2009-09-23 22:49:54 -05:00
Slava Pestov
5854fa0c03
cpu.ppc: add dummy vector ops
2009-09-23 20:31:12 -05:00
Slava Pestov
165496d2f2
Add longlong-2, ulonglong-2, longlong-4, ulonglong-4 SIMD types, fix int-4 multiplication on SSE2
2009-09-23 20:23:25 -05:00
Slava Pestov
960602059d
cpu.x86.assembler: cleanup
2009-09-23 19:30:36 -05:00
Slava Pestov
34a533d9f4
cpu.x86.features: don't fold away sse-version, instead memoize it and recompute on startup
2009-09-23 05:13:15 -05:00
Slava Pestov
abac963882
math.vectors.simd: new operations: vabs vsqrt vbitand vbitor vbitxor
2009-09-23 02:47:14 -05:00
Slava Pestov
fda8870848
Merge branch 'master' into integer-simd
2009-09-22 20:21:40 -05:00
Slava Pestov
9b26bd059d
cpu.ppc: fix load errors
2009-09-22 05:24:34 -05:00
Slava Pestov
e4872212b1
cpu.x86: fix using list
2009-09-20 23:24:30 -05:00
Slava Pestov
e04fba6bc7
Fix conflict
2009-09-20 23:18:07 -05:00
Slava Pestov
66871995c9
math.vectors.simd: add saturated arithmetic operations
2009-09-20 23:16:02 -05:00
Slava Pestov
78c949b9b7
math.vectors: add v+- word which is accelerated by SSE3
2009-09-20 17:43:16 -05:00
Slava Pestov
dfb43bd2ca
More integer SIMD work
- move generated vocab support from specialized-arrays to vocabs.generated
- add fuzz testing to math.vectors.simd
- add alien type support for integer SIMD vectors
- SIMD: parsing word generates a SIMD type, instead of pre-generating them all in math.vectors.simd
2009-09-20 16:48:17 -05:00
Slava Pestov
0d77efef29
cpu.x86: cleanup
2009-09-20 04:17:34 -05:00
Slava Pestov
fc5fe2bd2a
Merge Phil Dawes' VM work
2009-09-20 03:48:08 -05:00
Slava Pestov
ea2bcd69c7
math.vectors.simd: redesign to be more flexible, integer SIMD work in progress
2009-09-20 02:08:32 -05:00
Joe Groff
4a1422e7fe
move some allocation words that don't really have much to do with c types out of alien.c-types into a new alien.data vocab
2009-09-17 22:36:05 -05:00
Joe Groff
db2eba9b58
disambiguate math:float in cpu.ppc
2009-09-17 19:10:40 -05:00
Joe Groff
ac5ea1769b
get compiler tests loading
2009-09-16 09:20:47 -05:00
Phil Dawes
30b8b98446
small x86 asm cleanup
2009-09-16 08:22:17 +01:00
Phil Dawes
a73886942a
vm passed in primitives as arg0 for x86.64
2009-09-16 08:22:17 +01:00
Phil Dawes
123c6ce703
fixed up some alien boxing (x86 32 & 64)
2009-09-16 08:20:50 +01:00
Phil Dawes
46dac01d50
fixed vm ptr passing to to_value_struct
2009-09-16 08:20:50 +01:00
Phil Dawes
54d8285c7e
fixed vm ptr passing to box_small_struct
2009-09-16 08:20:50 +01:00
Phil Dawes
0841b7ee90
fixed vm ptr passing to box_value_struct
2009-09-16 08:20:50 +01:00
Phil Dawes
898f5be1e0
quick test vocab for mt stuff
2009-09-16 08:20:50 +01:00
Phil Dawes
26586c24f0
added vm passing to some alien/boxing functions and added some vm asserts
2009-09-16 08:20:10 +01:00
Phil Dawes
d7e2f770c0
vm ptr passed to lazy_jit_compile on x86.64
2009-09-16 08:20:10 +01:00
Phil Dawes
44d2d8672e
Primitives now pass vm ptr on 64bit x86
2009-09-16 08:20:09 +01:00
Phil Dawes
f5e6d43e1e
separated vm-1st-arg and vm-3rd-arg asm invoke words (needed for ppc & x86.64)
2009-09-16 08:20:09 +01:00
Phil Dawes
6e5ddc0c33
vm pointer passed to nest_stacks and unnest_stacks (win32)
2009-09-16 08:17:26 +01:00
Phil Dawes
b629429086
Dev checkpoint
2009-09-16 08:17:26 +01:00
Phil Dawes
6c046ec5bf
added vm ptr to x86.32 boxing asm
2009-09-16 08:16:33 +01:00
Phil Dawes
780415b159
added code to pass vm ptr to some unboxers
2009-09-16 08:16:32 +01:00
Phil Dawes
2a1a4ccf27
fixed up getenv compiler intrinsic to use vm struct userenv
2009-09-16 08:16:32 +01:00
Phil Dawes
cb3df86491
moved cards_offset and decks_offset into vm struct (for x86)
2009-09-16 08:16:31 +01:00
Phil Dawes
fd72e140d2
nursery global variable moved into vm
2009-09-16 08:16:31 +01:00
Phil Dawes
6da959ff3b
renamed to vm-field-offset. Slava's better at naming than me
2009-09-16 08:16:31 +01:00
sheeple
3602f86ab1
ppc asm to get stack_chain using vm ptr
2009-09-16 08:16:31 +01:00
Phil Dawes
77a13b1b6a
Added a vm C-STRUCT, using it for struct offsets in x86 asm
2009-09-16 08:16:31 +01:00
Phil Dawes
f9f1031dd8
moved stack_chain into vm struct
2009-09-16 08:16:31 +01:00
Phil Dawes
53aa98902e
throw_impl now forwards the vm ptr
2009-09-16 08:16:30 +01:00
Phil Dawes
60d0300876
passing vm ptr to lazy_jit_compile mostly working
2009-09-16 08:16:30 +01:00
Phil Dawes
1fda8af73b
Added %vm-invoke to pass vm ptr to vm functions (x86.32 only, otherwise uses singleton vm)
2009-09-16 08:16:30 +01:00
Phil Dawes
df37e010d4
vm ptr passed to primitives on X86.32 (other cpus still use singleton vm ptr)
2009-09-16 08:16:30 +01:00
Joe Groff
334e93bbbf
get things to a point where they bootstrap again
2009-09-15 21:43:18 -05:00
Joe Groff
e33857a0c3
Merge branch 'master' into c-type-words
2009-09-15 19:14:41 -05:00
Joe Groff
02b797f11b
struct classes now make their own C type without help from alien.structs. remove alien.structs dependencies from everywhere outside of alien and compiler, and have the FFI handle both alien.structs and classes.struct c-types
2009-09-15 17:38:49 -05:00
Joe Groff
e5145b5a48
convert compiler cpu backends to use c-type words
2009-09-15 16:08:42 -05:00
Slava Pestov
680e6424bc
cpu.ppc: fix %single>double-float and %double>single-float
2009-09-10 13:04:58 -05:00
Joe Groff
687a86fbb7
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-09 17:14:48 -05:00
Joe Groff
54b8f04433
altivec instructions for powerpc assembler
2009-09-09 17:14:36 -05:00
Slava Pestov
c04fb12f4c
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-09 13:56:20 -05:00
Slava Pestov
66f500bdd7
Fix the build
2009-09-09 13:44:54 -05:00
Slava Pestov
9f33d7e0fa
cpu.ppc: fix bootstrap
2009-09-08 23:53:51 -05:00
Slava Pestov
dd56449145
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-08 22:34:17 -05:00
Slava Pestov
19a5f58b53
cpu.x86: tweak SIMD intrinsics
2009-09-08 22:34:01 -05:00
Joe Groff
fe015ce2f0
no really, update ppc for argument order changes
2009-09-08 22:21:00 -05:00
Joe Groff
b71f50ee04
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-08 21:58:25 -05:00
Joe Groff
b64b4a5cd9
update cpu.ppc for argument order changes
2009-09-08 21:58:11 -05:00
Slava Pestov
020e3b5713
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-08 21:51:21 -05:00
Slava Pestov
092b31910d
compiler: separate ##save-context instruction from ##alien-invoke, generate a ##save-context for libm calls, and add a pass to combine multiple context saves within a basic block. Fixes crashes with FP traps thrown by libm functions on x86-32
2009-09-08 21:50:55 -05:00
Joe Groff
f4e574383c
typos in cpu.ppc
2009-09-08 21:44:11 -05:00
Slava Pestov
fe0c137a1b
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-08 19:35:14 -05:00
Slava Pestov
3e90786bc1
Fix various test failures
2009-09-08 19:18:56 -05:00
Doug Coleman
8351100f7e
Merge branch 'master' of git://factorcode.org/git/factor
2009-09-08 17:05:58 -05:00
Joe Groff
025a5b7b15
split unordered and ordered float comparison intrinsics in compiler; generate only unordered comparisons for now
2009-09-08 17:04:26 -05:00
Doug Coleman
74dea1e898
duplicate using
2009-09-08 17:02:31 -05:00
Slava Pestov
6396e901ca
cpu.x86.features: better wording
2009-09-08 14:17:05 -05:00
Slava Pestov
05bffecab7
cpu.x86.features: add -sse-version command-line switch to override SSE detection
2009-09-08 13:56:37 -05:00
Slava Pestov
8eeeeb5c5b
inline alien-vector and set-alien-vector if SIMD is not available for a small speedup
2009-09-08 13:56:17 -05:00
Slava Pestov
ef09991500
Fixes
2009-09-08 00:13:18 -05:00
Slava Pestov
17821626c3
Fix conflicts
2009-09-07 23:51:25 -05:00
Joe Groff
9430fdc4b6
i had comisd/ucomisd backwards on x86
2009-09-04 12:30:30 -05:00
Slava Pestov
09c8175919
fix some typos in cpu.ppc
2009-09-04 11:18:41 -05:00
Slava Pestov
7f0ab1dc1e
Merge branch 'master' of git://factorcode.org/git/factor into ppc-float-compare
2009-09-04 10:58:50 -05:00
Joe Groff
e36700feb0
update powerpc compiler to generate correct float comparisons
2009-09-04 10:51:12 -05:00
Slava Pestov
7571d50bd3
cpu.ppc: fix typo
2009-09-04 06:41:33 -05:00
Slava Pestov
1f5193198b
compiler: clean up code generation for alien boxing/unboxing a bit
2009-09-03 21:22:43 -05:00
Joe Groff
b1ba82c84f
convert comparison branch code in compiler to use locals
2009-09-03 21:19:39 -05:00
Slava Pestov
20dfbf7ac8
More SIMD work
- Rename SIMD types and register representations: <type>-<count> rather than <count><type>-array
- Make a functor to define 256-bit vector types, use it to define float-8 type
- Make SIMD instructions pure-insns so that they participate in value numbering
2009-09-03 20:58:56 -05:00
Joe Groff
0b9e5c034a
add compiler comparison codes for floating-point unordered comparisons; update x86 backend to generate proper code for all floating-point comparisons
2009-09-03 20:32:05 -05:00
Slava Pestov
80ed4bc918
Merge branch 'master' into simd
2009-09-03 03:45:58 -05:00
Slava Pestov
f811208271
Detect SSE version and enable the correct set of SIMD intrinsics
2009-09-03 03:28:38 -05:00
Slava Pestov
52b99c050e
Initial implementation of SSE vector intrinsics:
- cpu.architecture: add SSE vector representations
- compiler.cfg.intrinsics.alien: remove an attempt at optimization that value numbering handles now
- compiler.cfg.representations: support instructions where the representation is set in the 'rep' slot, and support conversions between single and double floats
- alien-float, set-alien-float now use the single float representation, and the conversion is implicit; this fixes a long-standing bug where a register could get clobbered because of how %set-alien-float was defined on x86
- math.vectors.specialization: add support for SIMD specialization (where the vector word's body is replaced by another quotation), also specialize the 'sum' word
- math.vectors.simd: 4float-array, 2double-array, 4double-array types, and specializers for the math.vectors words
2009-09-03 02:33:07 -05:00
Joe Groff
e9a5ed5931
i suck at reading tech docs--those were m64 instructions, not mm instructions
2009-09-02 12:58:35 -05:00
Joe Groff
0ddf19d033
get rid of useless mm->xmm instructions in cpu.x86.assembler, add MOVHLPS and MOVLHPS
2009-09-02 11:06:08 -05:00
Slava Pestov
775b9af2f7
compiler: eliminate boilerplate by centralizing info in declarative INSN: syntax
2009-09-02 06:22:37 -05:00
Slava Pestov
14a063dd92
cpu.ppc: implement fast float function calls; 3x speedup on benchmark.struct-arrays on PowerPC
2009-09-01 15:19:26 -05:00
Slava Pestov
e659203907
cpu.ppc: fix %box-displaced-alien
2009-08-30 20:56:04 -05:00
Slava Pestov
b35a01879e
%box-displaced-alien: fix clobberage found by Doug
2009-08-30 05:11:08 -05:00
Slava Pestov
f6a836d1e9
compiler.cfg.linear-scan now supports partial sync-points where all registers are spilled; taking advantage of this, there are new trigonometric intrinsics which yield a 2x performance boost on benchmark.struct-arrays and a 25% boost on benchmark.partial-sums
2009-08-30 04:52:01 -05:00
Slava Pestov
f30aa5d20e
compiler: add fixnum-min/max intrinsics; ~10% speedup on benchmark.yuv-to-rgb
2009-08-28 19:02:59 -05:00
Slava Pestov
99bf9fadfb
Performance improvements to make struct-arrays benchmark faster
- improved optimization of ##unbox-any-c-ptr on ##box-displaced-alien; convert it to ##unbox-c-ptr where possible using class info stored in the ##bda instruction
- make fcos, fsin, etc inline again; everything in math.libm inline again, except for fsqrt which is an intrinsic
- convert min and max on floats to float-min and float-max
- make min and max not inline, so that the above can work
- struct-arrays: rice a bit so that more fixnums come up
2009-08-28 05:21:16 -05:00
sheeple
8970cbc961
cpu.ppc: fix ##box-displaced-alien
2009-08-27 04:43:45 -05:00
Slava Pestov
9caf3f9248
compiler: new inline intrinsic for <displaced-alien> where the inputs have known types; value numbering now eliminates unnecessary allocation of displaced aliens if the result is immediately unboxed again
2009-08-27 00:06:19 -05:00
Slava Pestov
4fe0257169
cpu.x86: use SQRTSD instruction for math.libm:fsqrt word
2009-08-25 23:22:15 -05:00
Slava Pestov
e44b2eb875
cpu.ppc.assembler: LOAD32 assembler macro was busted
2009-08-25 22:37:10 -05:00
Slava Pestov
9805dde418
basis/cpu: eliminate some usages of rot
2009-08-25 19:38:48 -05:00
Slava Pestov
e9818be8ae
cpu.ppc.assembler: fix FMR and FMR. opcodes
2009-08-25 19:33:35 -05:00
sheeple
90b3921b31
cpu.ppc: integer>fixnum scratch area overlapped with the rest of stack frame, very bad
2009-08-22 20:23:28 -05:00
Slava Pestov
1b8636bad5
Merge branch 'master' of git://factorcode.org/git/factor
2009-08-21 18:48:44 -05:00
Slava Pestov
b1a12f85e4
cpu.ppc: work in progress
2009-08-21 18:48:34 -05:00
Doug Coleman
d1ce837569
Delete empty unit tests files, remove 1- and 1+, reorder IN: lines in a lot of places, minor refactoring
2009-08-13 19:21:44 -05:00
Slava Pestov
2d575d7ec9
compiler.cfg: virtual registers are integers now, and representations are stored off to the side. Fix bug in representation selection that would manifest if a value was used as a float and a fixnum in different branches; cannot globally unbox float in this case
2009-08-08 04:02:18 -05:00
Slava Pestov
4d2160799f
Split off the notion of a register representation from a register class
2009-08-07 17:44:50 -05:00
Slava Pestov
a7e61632d9
cpu.x86.assembler: make some words private
2009-08-05 18:30:42 -05:00
Slava Pestov
203a64f236
cpu.ppc: put spill slots and GC roots in stack frame where subroutine calls can't clobber them
2009-07-31 23:47:07 -05:00
Slava Pestov
ebec4ffd75
Merge branch 'master' of git://factorcode.org/git/factor
2009-07-31 17:59:00 -05:00
Slava Pestov
0e59d29282
cpu.ppc: fix small typos
2009-07-31 17:57:15 -05:00
Doug Coleman
c33343b302
fix using list on win64
2009-07-31 16:27:18 -05:00
Slava Pestov
e32477fd59
cpu.ppc: Updating PowerPC backend for codegen changes over the last two months: new shift intrinsics added, fixnum overflow intrinsics are now treated like conditionals, GC checks are more complex and have a different API
2009-07-30 21:44:22 -05:00
Slava Pestov
db55a031df
Move a bunch of GC check generation logic to platform-independent side
2009-07-30 21:28:27 -05:00
Slava Pestov
d09013b311
Merge branch 'master' of git://factorcode.org/git/factor
2009-07-30 19:11:02 -05:00
Joe Groff
b49fb43b60
Merge branch 'master' of git://factorcode.org/git/factor
2009-07-30 11:05:36 -05:00
Joe Groff
c59c619364
add additional SSE2 packed integer operations
2009-07-30 11:05:12 -05:00
Slava Pestov
99216b8435
compiler.cfg: Get inline GC checks working again, using a dataflow analysis to compute uninitialized stack locations in compiler.cfg.stacks.uninitialized. Re-enable intrinsics which use inline allocation
2009-07-30 09:19:44 -05:00
Slava Pestov
e3c38262ed
Oopsie
2009-07-30 08:27:52 -05:00
Slava Pestov
c9feb6f012
cpu.x86: Fix shuffle bug. Shuffling bugs occurring in code that runs before optimizer/stack checker is online are only caught at runtime during bootstrap, what a pain
2009-07-30 05:12:40 -05:00
Slava Pestov
4842641e75
cpu.x86: fix a bug in small-register logic on 32-bit. Also, on 32-bit, we don't need to do any special register shuffling to work with 16-bit operands since all registers have 16-bit variants. So now only 8-bit operands on x86-32 require special treatment
2009-07-30 05:04:46 -05:00
Slava Pestov
32a3abc9b4
cpu.x86: update non-optimizing compiler backends for assembler vocab split
2009-07-30 02:22:37 -05:00
Slava Pestov
226908d2d2
cpu.x86.assembler: fix extended 8-bit registers (DIL, SIL, SPL, BPL)
2009-07-29 22:32:22 -05:00
Slava Pestov
0899934220
cpu.x86: use full set of 8-bit, 16-bit and 32-bit registers on x86-64 to avoid clumsy save/restore logic
2009-07-29 21:56:37 -05:00
Slava Pestov
7831293fda
cpu.x86.assembler: move operands to operands sub-vocabulary, clean up small-reg-* code in compiler backend
2009-07-29 21:44:08 -05:00
Slava Pestov
c1fd97d515
Merge branch 'dcn'
2009-07-28 12:37:45 -05:00
Joe Groff
4c664a469a
SSE4 opcodes for x86 assembler
2009-07-28 12:19:37 -05:00
Slava Pestov
afd914c808
Merge branch 'master' into dcn
2009-07-28 11:20:43 -05:00
Joe Groff
1fe11f7c87
SSE1–SSSE3 opcodes + branch hints for x86 assembler
2009-07-28 00:22:27 -05:00
Slava Pestov
f0a5ac3fbb
cpu.x86: compile a load of zero, and adds, subs where dst = src1 more efficiently
2009-07-27 22:27:54 -05:00
Slava Pestov
39a70db831
Improve code generation for shift word: add intrinsics for fixnum-shift-fast in the case where the shift count is not constant, transform 1 swap shift into a more overflow check with open-coded fast case, transform bitand into fixnum-bitand in more cases
2009-07-16 23:50:48 -05:00
Slava Pestov
99faf3c79f
Overflowing fixnum intrinsics now expand into several CFG nodes. This speeds up the common case since only the uncommon case is now a stack syncpoint
2009-07-16 18:29:40 -05:00
Slava Pestov
1eae4286cd
compiler.cfg: split off condition codes into a comparisons sub-vocabulary
2009-07-13 14:42:52 -05:00
Slava Pestov
27c0577c91
cpu.x86.32: don't emit sub %esp,0x0 in prologue on Linux and Windows
2009-07-01 18:13:45 -05:00
Slava Pestov
554559c0b1
%dispatch: sometimes the generated sequence is one byte longer, so instead of hard-coding it, compute the right length
2009-06-30 18:11:15 -05:00
Slava Pestov
4782c737ab
cpu.x86: don't clobber src in %dispatch
2009-06-30 16:47:22 -05:00
Slava Pestov
a61a992bfd
cpu.x86.assembler: IMUL2 instruction was busted for immediate operands
When given a register and an immediate, it would generate imul imm,dst,dst however the 64-bit prefix was generated wrong and if dst was an extended register only the first operand would be an extended register. To fix this, change IMUL2 to not work on immediates anymore, and added a new IMUL3 that takes a destination register, source register, and immediate. Also, change compiler.cfg.two-operand to not two-operandize %mul-imm, since this isn't needed anymore.
This fixes the sporadic benchmark.tuple-arrays crash on 64-bit machines.
2009-06-08 21:15:52 -05:00
Slava Pestov
0d265fe016
Remove %dispatch-label since it's the same on all platforms; fix %gc on PowerPC
2009-06-07 21:46:28 -05:00
Slava Pestov
f0b132fa7f
Fix 32-bit bootstrap
2009-06-03 03:23:55 -05:00
Slava Pestov
fd710385e5
cpu.x86: fix small register intrinsics on x86-64
2009-06-03 03:22:46 -05:00
Slava Pestov
7aca076408
GC checks now save and restore registers
2009-06-02 18:23:47 -05:00
Slava Pestov
3de85158de
Merge branch 'master' into global_optimization
2009-06-01 03:12:32 -05:00
Slava Pestov
096803e58f
Redo compiler.codegen.fixup and get %dispatch to work
2009-06-01 02:32:36 -05:00
Slava Pestov
64114947d2
Various improvements aimed at getting local optimization regressions fixed:
- Rename _gc to ##gc
- Absolute labels are now supported
- Generate _dispatch-label
2009-05-31 23:28:08 -05:00
Slava Pestov
e2b8b04d15
cpu.x86.features: add RDTSC support. This is a new vocabulary with words: sse2? instruction-counter count-instructions
2009-05-31 15:02:14 -05:00
Slava Pestov
40949800bf
Fixing various bugs; alias analysis wasn't handling ##phi nodes, stack analysis incorrectly handled height-changing back edges and ##fixnum-*, clean up ##dispatch generation
2009-05-29 01:39:14 -05:00
U-C4\Administrator
9c85bc8ce3
fix duplicate using lines
2009-05-17 20:29:32 -05:00
Slava Pestov
514956537f
Fix cpu.ppc for strict vocabulary search path semantics
2009-05-16 08:58:10 -05:00
Slava Pestov
ba04d5af1e
Update documentation for stricter vocabulary search path semantics
2009-05-16 00:29:21 -05:00
Slava Pestov
3e7269731b
cpu.ppc: really fix bool type
2009-05-10 19:10:20 -05:00
Slava Pestov
9b491d1442
Fix bool type on PowerPC
2009-05-10 19:01:38 -05:00
Slava Pestov
931db821d1
Merge branch 'master' of git://factorcode.org/git/factor
2009-05-07 23:26:37 -05:00
Slava Pestov
e007cb56e8
cpu.ppc: fix alien-indirect
2009-05-07 23:26:33 -05:00
Slava Pestov
5e35f19312
cpu.ppc: bools are 4 bytes on OS X/PowerPC
2009-05-07 23:18:41 -05:00
Slava Pestov
02bd871575
Merge branch 'master' of git://factorcode.org/git/factor
2009-05-07 19:47:42 -05:00
Slava Pestov
cb9d50887c
Update PowerPC %jump and %dispatch-label, and add PIC-related functions to cpu-ppc.hpp
2009-05-07 19:47:38 -05:00
Slava Pestov
e78c043acb
Merge branch 'master' of git://factorcode.org/git/factor
2009-05-07 19:41:54 -05:00
Slava Pestov
b45284421d
cpu.ppc.bootstrap: updates
2009-05-07 19:40:25 -05:00
Slava Pestov
db6ae46c47
Fix x86-64 backend
2009-05-07 16:58:18 -05:00
Slava Pestov
9b419aa0b1
Count megamorphic cache hits
2009-05-07 14:26:08 -05:00
Slava Pestov
74094142fe
Fix tail call PICs on x86-64
2009-05-06 22:44:30 -05:00
Slava Pestov
4f0a1b024e
Clean up bootstrap.image, and implement new calling convention for tail calls; tail call sites now have PICs
2009-05-06 22:04:01 -05:00
Slava Pestov
c1e25f3b43
JIT now supports multiple relocations per code template. This simplifies non-optimizing compiler backends
2009-05-06 20:04:49 -05:00
Slava Pestov
d3b85c14c9
Working on inline caching for tail call sites
2009-05-06 19:22:22 -05:00
Slava Pestov
478d29a175
Better separation of concerns: cpu.{x86,ppc}.assembler no longer depends on compiler.codegen.fixup and cpu.architecture. Rename rt-xt-direct to rt-xt-pic to better explain its purpose
2009-05-06 16:14:53 -05:00