640 Commits (d649daaf4fe78994d98c78c95ce3893dee431342)

Author SHA1 Message Date
Slava Pestov b8f75f404e cpu.ppc: fix typo 2009-10-15 05:01:20 -05:00
Slava Pestov 9e5a9abfe1 cpu.ppc: updates for write barrier and allocation changes (untested) 2009-10-15 04:54:16 -05:00
Slava Pestov 7a6e2f9b07 cpu.ppc.bootstrap: update for JIT relocation changes 2009-10-15 04:47:54 -05:00
Slava Pestov 22e79e8495 compiler: tweak ##write-barrier-imm 2009-10-15 02:40:23 -05:00
Slava Pestov bfd1f0d6d2 vm: rt-vm relocation now supports accessing a field directly 2009-10-14 19:24:23 -05:00
Slava Pestov 10ad5cad53 Working on adding support for the new write barrier to optimized code 2009-10-14 02:06:01 -05:00
Joe Groff b82c8b4416 use TEST reg, reg to compare integer equality with zero 2009-10-10 13:13:53 -05:00
Joe Groff 2577ab83a6 only emit ##alien-vector/##set-alien-vector insns if the rep is available 2009-10-10 12:53:10 -05:00
Joe Groff 1e3c9321ae don't use MOVSLDUP/MOVSHDUP to do specialized shuffles unless sse3 is available 2009-10-10 12:00:47 -05:00
Joe Groff a7a77cd03e fix x86 uchar %scalar>integer 2009-10-10 10:39:23 -05:00
Joe Groff 5158a12d32 rename ##shuffle-vector to ##shuffle-vector-imm, and add a new ##shuffle-vector for dynamic shuffles. have vshuffle use ##shuffle-vector to do word and byte shuffles on x86 2009-10-09 21:26:27 -05:00
Slava Pestov 5f0d4abb4a cpu.architecture: move dummy -reps words here, from cpu.ppc 2009-10-08 03:48:03 -05:00
Joe Groff 98836a9e2e break vector compare intrinsics into %compare, %or, and %not instructions that map directly to cpu instructions 2009-10-07 15:27:03 -05:00
Joe Groff 43b51ef2eb decompose %unpack-vector-head/tail into %compare-vector/%merge-vector-head/tail or %tail>head-vector/%unpack-vector-head insns when there isn't an actual unpack insn; get rid of fake x86 implementations 2009-10-07 14:09:46 -05:00
Joe Groff 5152c3b06d sse doesn't actually have an unsigned->unsigned pack instruction 2009-10-07 12:00:31 -05:00
Joe Groff a13e75f4f4 don't generate a ##not-vector instruction if the cpu doesn't have one; instead, fall back to a ##fill-vector/##xor-vector combo. get rid of pretend %not-vector in cpu.x86 2009-10-07 11:59:36 -05:00
Joe Groff 444624e79f fix x86 %unpack-vector insns 2009-10-06 20:38:51 -05:00
Joe Groff 2edccca0bb oops...PACKUSDW is sse4 only 2009-10-06 20:09:50 -05:00
Joe Groff 425ea05529 %float>integer-vector should truncate 2009-10-06 13:57:54 -05:00
Joe Groff 84ecb1266d add insns for vector pack, unpack, integer>float, and float>integer 2009-10-05 22:34:14 -05:00
Slava Pestov 931107397c compiler.cfg: remove _gc instruction, it doesn't need to exist, and change GC checks to ensure that the right amount of space is available instead of blindly checking for 1Kb 2009-10-05 05:27:49 -05:00
Joe Groff dca9d3e535 add %merge-vector-head and %merge-vector-tail instructions to back vmerge 2009-10-03 21:48:53 -05:00
Joe Groff 335df20713 add intrinsics for v<=, v<, v=, v>, v>=, vunordered? 2009-10-03 11:29:34 -05:00
Joe Groff 14a97b6a05 Merge branch 'master' of git://factorcode.org/git/factor 2009-10-03 10:02:26 -05:00
Joe Groff b1ec36a324 extend x86 %compare-vector to cover all comparison codes, sometimes stupidly for now 2009-10-02 23:19:56 -05:00
Slava Pestov dba965486c cpu.arm.assembler: dust it off, update to work with contemporary Factor, and clean it up a bit 2009-10-02 20:18:34 -05:00
Joe Groff e2e75c6b3a add intrinsic for vnot/vbitnot 2009-10-02 20:04:28 -05:00
Slava Pestov 9ac62dc476 cpu.ppc: remove useless comment 2009-10-02 03:31:53 -05:00
Slava Pestov aa0359f78d Merge branch 'reentrantvm' of git://github.com/phildawes/factor 2009-10-02 03:28:21 -05:00
Joe Groff 9d424a1092 Merge branch 'master' of git://factorcode.org/git/factor; Conflicts: basis/compiler/codegen/codegen.factor 2009-10-01 23:14:16 -05:00
Joe Groff 7b13fa4283 fold test-vector/branch sequences into a test-vector-branch instruction 2009-10-01 19:53:30 -05:00
Joe Groff 228ad950bb %test-vector instruction for vany?, vall?, vnone? 2009-10-01 15:35:38 -05:00
Joe Groff 94070c11aa %compare-vector instruction (only does v= for now) 2009-10-01 14:31:37 -05:00
Joe Groff 3ba79be651 Revert "add a %blend-vector intrinsic for v?" This reverts commit 21e4b28b67. 2009-09-30 23:40:37 -05:00
Joe Groff 37a091a188 Merge branch 'master' of git://factorcode.org/git/factor 2009-09-30 23:04:04 -05:00
Joe Groff 21e4b28b67 add a %blend-vector intrinsic for v? 2009-09-30 23:03:59 -05:00
Slava Pestov 65421b111b math.vectors.simd: use fallbacks for hlshift, hrshift, vshuffle if parameter is not a literal; element access in int-4 on x86-64 now sign-extends the value; don't throw error at compile time if parameter for vshuffle does not have enough elements 2009-09-30 20:04:37 -05:00
Slava Pestov de58c3c294 cpu.ppc: update for alien intrinsic changes 2009-09-30 18:22:59 -05:00
Phil Dawes 86593598d0 ppc asm to pass vm pointer: alien + compiled code 2009-09-30 21:23:53 +01:00
Slava Pestov 8e201ca4b7 Various minor compiler tweaks: Combine address calculation with dereferencing in alien accessors; convert SIMD XOR of a vector with itself into an XOR of the destination with itself; convert SIMD unbox of zero vector into XOR of the destination with itself; fix SIMD indexing on x86-64 2009-09-30 05:00:36 -05:00
Slava Pestov 2b13245704 math.vectors.simd: add fast intrinsic for 'nth', replace broadcast primitive with shuffles 2009-09-29 04:48:11 -05:00
Slava Pestov a6e8277b2c math.vectors.simd: add vshuffle intrinsic 2009-09-28 23:12:13 -05:00
Slava Pestov db217295b0 Work in progress 2009-09-28 17:31:34 -05:00
Slava Pestov e343b46479 cpu.ppc: update for %unary/binary-float-function change 2009-09-28 16:40:52 -05:00
Slava Pestov 49dba53760 cpu.x86: cleanups 2009-09-28 16:38:35 -05:00
Joe Groff 4e2e45b70d use PSHUFD for longlong-2 broadcast when dst != src to avoid a %copy 2009-09-28 12:04:08 -05:00
Joe Groff 3f90473f09 use MOVDDUP for double-2 broadcast to eliminate a %copy 2009-09-28 12:00:03 -05:00
Joe Groff 467c389948 cpu.x86.assembler: make SSE shuffle instructions accept an array of indexes so they're easier to use 2009-09-28 11:45:45 -05:00
Joe Groff f7d416a4e4 SSE integer gather and broadcast 2009-09-28 11:24:08 -05:00
Slava Pestov f08521bf83 Fixing various test failures caused by C type parser change, and clarify C type docs some more 2009-09-28 08:48:39 -05:00
Slava Pestov 1109fb5725 math.vectors.simd: add intrinsic for int-4-boa, uint-4-boa, fix tests for C type parser change, fix software fallback for horizontal shifts 2009-09-28 06:34:22 -05:00
Slava Pestov dc1b6043dc cpu.x86: shifts didn't work if dst != src1; re-organize file a bit 2009-09-28 05:39:53 -05:00
Slava Pestov 542dd577d9 cpu.x86.32: fix %unary/binary-float-function on Windows; need to look up symbols in libm and not VM binary 2009-09-28 04:51:53 -05:00
Phil Dawes 6f0d25a8b3 ppc asm to pass vm pointer: initial bootstrap 2009-09-28 07:48:37 +01:00
Slava Pestov daf8f0ebba cpu.x86: fix regression: fsqrt intrinsic wasn't used 2009-09-28 02:27:55 -05:00
Slava Pestov 10c5fe5933 math.vectors.simd: add hlshift, hrshift (128-bit shift), vbitandn intrinsics 2009-09-28 02:17:46 -05:00
Slava Pestov e8cfaccef0 compiler.cfg: nuke ##bignum>integer and ##integer>bignum since they were unused 2009-09-27 20:36:05 -05:00
Slava Pestov 6dd8e4657e Merge branch 'master' into more_aggressive_coalescing 2009-09-27 19:29:50 -05:00
Slava Pestov 6f2a4eba51 compiler.cfg.linear-scan: fix partial sync point logic in case where dst == src, and clean up spilling code 2009-09-27 19:28:20 -05:00
Slava Pestov 2efab6efad cpu.x86.32: implement %unary-float-function and %binary-float-function; speeds up partial-sums and struct-arrays benchmarks 2009-09-27 18:06:30 -05:00
Slava Pestov a267100781 compiler.cfg.ssa.destruction: more aggressive coalescing work in progress 2009-09-27 17:17:26 -05:00
Joe Groff bf3eef9e2d Merge branch 'master' of git://factorcode.org/git/factor 2009-09-26 20:38:19 -05:00
Joe Groff e30819bcac move alien.inline, alien.cxx, alien.marshall to unmaintained; nuke alien.structs 2009-09-26 20:37:42 -05:00
sheeple 24b27f4c42 Fixing PPC backend for ##slot change 2009-09-26 13:21:42 -05:00
sheeple 2b35f52ed2 Merge branch 'slots' of git://factorcode.org/git/factor into slots; Conflicts: basis/cpu/x86/x86.factor 2009-09-26 03:12:42 -05:00
Daniel Ehrenberg 01082d743d An attempt at porting the slot change to PPC 2009-09-26 02:58:18 -05:00
Daniel Ehrenberg 364332bd70 Completing slot and set-slot changes on x86 2009-09-26 01:39:48 -05:00
Daniel Ehrenberg fb7f6ab455 Making ##slot and ##set-slot not have a temporary parameter 2009-09-26 00:28:14 -05:00
Slava Pestov 2e1be3f513 cpu: cleanups 2009-09-25 21:47:05 -05:00
Phil Dawes 64aa4fba9f removed %vm-invoke-*-arg completely 2009-09-25 20:03:03 +01:00
Phil Dawes 5b404aae7e moved %(un)nest-stacks out to cpu specific files to eliminate %vm-invoke from compiler.codegen 2009-09-25 19:32:08 +01:00
Phil Dawes f9e736c1f0 isolated %vm-invoke-blah-arg crap to 64.factor 2009-09-25 19:02:41 +01:00
Phil Dawes baa41f451f removed param-reg-* HOOKs 2009-09-25 18:58:55 +01:00
Phil Dawes c0957ed908 compiler.codegen passes temp reg to %call-gc 2009-09-25 18:48:13 +01:00
Phil Dawes aa71248937 made inline_gc a VM_C_API function 2009-09-25 18:29:07 +01:00
Slava Pestov fab916fb97 Merge branch 'fix_stack_alignment' of git://github.com/phildawes/factor 2009-09-24 19:54:51 -05:00
Phil Dawes 8b005f5b1d make inline_gc regparm(3) and cleaned up %call-gc stack alignment 2009-09-24 21:45:56 +01:00
Slava Pestov a562722c4c cpu.ppc: add representation hooks for shifts 2009-09-24 13:00:12 -05:00
Slava Pestov 2ea0b9da1d Merge branch 'vm_cleanup' of git://github.com/phildawes/factor 2009-09-24 04:31:55 -05:00
Slava Pestov 1b30310a35 cpu.x86: don't generate SSE2 instructions if only SSE1 is available 2009-09-24 04:07:15 -05:00
Slava Pestov a702bfa215 cpu.ppc: fix compile errors 2009-09-24 03:55:01 -05:00
Slava Pestov 24039cb56a math.vectors.simd: add v<< and v>> intrinsics for bitwise shifts on elements 2009-09-24 03:32:39 -05:00
Phil Dawes c747e39923 x86 bootstrap cleanup: renamed arg to arg1 2009-09-24 08:16:57 +01:00
Phil Dawes 911471c411 removed superfluous whitespace lines 2009-09-24 08:02:14 +01:00
Slava Pestov a345c26a14 cpu.ppc: make it load 2009-09-24 00:13:27 -05:00
Slava Pestov 7c4632d2b9 cpu.ppc: fix typos 2009-09-23 23:38:17 -05:00
Slava Pestov 3581d0b09b cpu.x86/ppc: unify register-to-register moves using %copy so that better coalescing can eliminate more moves later 2009-09-23 22:49:54 -05:00
Slava Pestov 5854fa0c03 cpu.ppc: add dummy vector ops 2009-09-23 20:31:12 -05:00
Slava Pestov 165496d2f2 Add longlong-2, ulonglong-2, longlong-4, ulonglong-4 SIMD types, fix int-4 multiplication on SSE2 2009-09-23 20:23:25 -05:00
Slava Pestov 960602059d cpu.x86.assembler: cleanup 2009-09-23 19:30:36 -05:00
Slava Pestov 34a533d9f4 cpu.x86.features: don't fold away sse-version, instead memoize it and recompute on startup 2009-09-23 05:13:15 -05:00
Slava Pestov abac963882 math.vectors.simd: new operations: vabs vsqrt vbitand vbitor vbitxor 2009-09-23 02:47:14 -05:00
Slava Pestov fda8870848 Merge branch 'master' into integer-simd 2009-09-22 20:21:40 -05:00
Slava Pestov 9b26bd059d cpu.ppc: fix load errors 2009-09-22 05:24:34 -05:00
Slava Pestov e4872212b1 cpu.x86: fix using list 2009-09-20 23:24:30 -05:00
Slava Pestov e04fba6bc7 Fix conflict 2009-09-20 23:18:07 -05:00
Slava Pestov 66871995c9 math.vectors.simd: add saturated arithmetic operations 2009-09-20 23:16:02 -05:00
Slava Pestov 78c949b9b7 math.vectors: add v+- word which is accelerated by SSE3 2009-09-20 17:43:16 -05:00
Slava Pestov dfb43bd2ca More integer SIMD work
- move generated vocab support from specialized-arrays to vocabs.generated
- add fuzz testing to math.vectors.simd
- add alien type support for integer SIMD vectors
- SIMD: parsing word generates a SIMD type, instead of pre-generating them all in math.vectors.simd
2009-09-20 16:48:17 -05:00
Slava Pestov 0d77efef29 cpu.x86: cleanup 2009-09-20 04:17:34 -05:00
Slava Pestov fc5fe2bd2a Merge Phil Dawes' VM work 2009-09-20 03:48:08 -05:00
Slava Pestov ea2bcd69c7 math.vectors.simd: redesign to be more flexible, integer SIMD work in progress 2009-09-20 02:08:32 -05:00
Joe Groff 4a1422e7fe move some allocation words that don't really have much to do with c types out of alien.c-types into a new alien.data vocab 2009-09-17 22:36:05 -05:00
Joe Groff db2eba9b58 disambiguate math:float in cpu.ppc 2009-09-17 19:10:40 -05:00
Joe Groff ac5ea1769b get compiler tests loading 2009-09-16 09:20:47 -05:00
Phil Dawes 30b8b98446 small x86 asm cleanup 2009-09-16 08:22:17 +01:00
Phil Dawes a73886942a vm passed in primitives as arg0 for x86.64 2009-09-16 08:22:17 +01:00
Phil Dawes 123c6ce703 fixed up some alien boxing (x86 32 & 64) 2009-09-16 08:20:50 +01:00
Phil Dawes 46dac01d50 fixed vm ptr passing to to_value_struct 2009-09-16 08:20:50 +01:00
Phil Dawes 54d8285c7e fixed vm ptr passing to box_small_struct 2009-09-16 08:20:50 +01:00
Phil Dawes 0841b7ee90 fixed vm ptr passing to box_value_struct 2009-09-16 08:20:50 +01:00
Phil Dawes 898f5be1e0 quick test vocab for mt stuff 2009-09-16 08:20:50 +01:00
Phil Dawes 26586c24f0 added vm passing to some alien/boxing functions and added some vm asserts 2009-09-16 08:20:10 +01:00
Phil Dawes d7e2f770c0 vm ptr passed to lazy_jit_compile on x86.64 2009-09-16 08:20:10 +01:00
Phil Dawes 44d2d8672e Primitives now pass vm ptr on 64bit x86 2009-09-16 08:20:09 +01:00
Phil Dawes f5e6d43e1e separated vm-1st-arg and vm-3rd-arg asm invoke words (needed for ppc & x86.64) 2009-09-16 08:20:09 +01:00
Phil Dawes 6e5ddc0c33 vm pointer passed to nest_stacks and unnest_stacks (win32) 2009-09-16 08:17:26 +01:00
Phil Dawes b629429086 Dev checkpoint 2009-09-16 08:17:26 +01:00
Phil Dawes 6c046ec5bf added vm ptr to x86.32 boxing asm 2009-09-16 08:16:33 +01:00
Phil Dawes 780415b159 added code to pass vm ptr to some unboxers 2009-09-16 08:16:32 +01:00
Phil Dawes 2a1a4ccf27 fixed up getenv compiler intrinsic to use vm struct userenv 2009-09-16 08:16:32 +01:00
Phil Dawes cb3df86491 moved cards_offset and decks_offset into vm struct (for x86) 2009-09-16 08:16:31 +01:00
Phil Dawes fd72e140d2 nursery global variable moved into vm 2009-09-16 08:16:31 +01:00
Phil Dawes 6da959ff3b renamed to vm-field-offset. Slava's better at naming than me 2009-09-16 08:16:31 +01:00
sheeple 3602f86ab1 ppc asm to get stack_chain using vm ptr 2009-09-16 08:16:31 +01:00
Phil Dawes 77a13b1b6a Added a vm C-STRUCT, using it for struct offsets in x86 asm 2009-09-16 08:16:31 +01:00
Phil Dawes f9f1031dd8 moved stack_chain into vm struct 2009-09-16 08:16:31 +01:00
Phil Dawes 53aa98902e throw_impl now forwards the vm ptr 2009-09-16 08:16:30 +01:00
Phil Dawes 60d0300876 passing vm ptr to lazy_jit_compile mostly working 2009-09-16 08:16:30 +01:00
Phil Dawes 1fda8af73b Added %vm-invoke to pass vm ptr to vm functions (x86.32 only, otherwise uses singleton vm) 2009-09-16 08:16:30 +01:00
Phil Dawes df37e010d4 vm ptr passed to primitives on X86.32 (other cpus still use singleton vm ptr) 2009-09-16 08:16:30 +01:00
Joe Groff 334e93bbbf get things to a point where they bootstrap again 2009-09-15 21:43:18 -05:00
Joe Groff e33857a0c3 Merge branch 'master' into c-type-words 2009-09-15 19:14:41 -05:00
Joe Groff 02b797f11b struct classes now make their own C type without help from alien.structs. remove alien.structs dependencies from everywhere outside of alien and compiler, and have the FFI handle both alien.structs and classes.struct c-types 2009-09-15 17:38:49 -05:00
Joe Groff e5145b5a48 convert compiler cpu backends to use c-type words 2009-09-15 16:08:42 -05:00
Slava Pestov 680e6424bc cpu.ppc: fix %single>double-float and %double>single-float 2009-09-10 13:04:58 -05:00
Joe Groff 687a86fbb7 Merge branch 'master' of git://factorcode.org/git/factor 2009-09-09 17:14:48 -05:00
Joe Groff 54b8f04433 altivec instructions for powerpc assembler 2009-09-09 17:14:36 -05:00
Slava Pestov c04fb12f4c Merge branch 'master' of git://factorcode.org/git/factor 2009-09-09 13:56:20 -05:00
Slava Pestov 66f500bdd7 Fix the build 2009-09-09 13:44:54 -05:00
Slava Pestov 9f33d7e0fa cpu.ppc: fix bootstrap 2009-09-08 23:53:51 -05:00
Slava Pestov dd56449145 Merge branch 'master' of git://factorcode.org/git/factor 2009-09-08 22:34:17 -05:00
Slava Pestov 19a5f58b53 cpu.x86: tweak SIMD intrinsics 2009-09-08 22:34:01 -05:00
Joe Groff fe015ce2f0 no really, update ppc for argument order changes 2009-09-08 22:21:00 -05:00
Joe Groff b71f50ee04 Merge branch 'master' of git://factorcode.org/git/factor 2009-09-08 21:58:25 -05:00
Joe Groff b64b4a5cd9 update cpu.ppc for argument order changes 2009-09-08 21:58:11 -05:00
Slava Pestov 020e3b5713 Merge branch 'master' of git://factorcode.org/git/factor 2009-09-08 21:51:21 -05:00
Slava Pestov 092b31910d compiler: separate ##save-context instruction from ##alien-invoke, generate a ##save-context for libm calls, and add a pass to combine multiple context saves within a basic block. Fixes crashes with FP traps thrown by libm functions on x86-32 2009-09-08 21:50:55 -05:00
Joe Groff f4e574383c typos in cpu.ppc 2009-09-08 21:44:11 -05:00
Slava Pestov fe0c137a1b Merge branch 'master' of git://factorcode.org/git/factor 2009-09-08 19:35:14 -05:00
Slava Pestov 3e90786bc1 Fix various test failures 2009-09-08 19:18:56 -05:00
Doug Coleman 8351100f7e Merge branch 'master' of git://factorcode.org/git/factor 2009-09-08 17:05:58 -05:00
Joe Groff 025a5b7b15 split unordered and ordered float comparison intrinsics in compiler; generate only unordered comparisons for now 2009-09-08 17:04:26 -05:00
Doug Coleman 74dea1e898 duplicate using 2009-09-08 17:02:31 -05:00
Slava Pestov 6396e901ca cpu.x86.features: better wording 2009-09-08 14:17:05 -05:00
Slava Pestov 05bffecab7 cpu.x86.features: add -sse-version command-line switch to override SSE detection 2009-09-08 13:56:37 -05:00
Slava Pestov 8eeeeb5c5b inline alien-vector and set-alien-vector if SIMD is not available for a small speedup 2009-09-08 13:56:17 -05:00
Slava Pestov ef09991500 Fixes 2009-09-08 00:13:18 -05:00
Slava Pestov 17821626c3 Fix conflicts 2009-09-07 23:51:25 -05:00
Joe Groff 9430fdc4b6 i had comisd/ucomisd backwards on x86 2009-09-04 12:30:30 -05:00
Slava Pestov 09c8175919 fix some typos in cpu.ppc 2009-09-04 11:18:41 -05:00
Slava Pestov 7f0ab1dc1e Merge branch 'master' of git://factorcode.org/git/factor into ppc-float-compare 2009-09-04 10:58:50 -05:00
Joe Groff e36700feb0 update powerpc compiler to generate correct float comparisons 2009-09-04 10:51:12 -05:00
Slava Pestov 7571d50bd3 cpu.ppc: fix typo 2009-09-04 06:41:33 -05:00
Slava Pestov 1f5193198b compiler: clean up code generation for alien boxing/unboxing a bit 2009-09-03 21:22:43 -05:00
Joe Groff b1ba82c84f convert comparison branch code in compiler to use locals 2009-09-03 21:19:39 -05:00
Slava Pestov 20dfbf7ac8 More SIMD work
- Rename SIMD types and register representations: <type>-<count> rather than <count><type>-array
- Make a functor to define 256-bit vector types, use it to define float-8 type
- Make SIMD instructions pure-insns so that they participate in value numbering
2009-09-03 20:58:56 -05:00
Joe Groff 0b9e5c034a add compiler comparison codes for floating-point unordered comparisons; update x86 backend to generate proper code for all floating-point comparisons 2009-09-03 20:32:05 -05:00
Slava Pestov 80ed4bc918 Merge branch 'master' into simd 2009-09-03 03:45:58 -05:00
Slava Pestov f811208271 Detect SSE version and enable the correct set of SIMD intrinsics 2009-09-03 03:28:38 -05:00
Slava Pestov 52b99c050e Initial implementation of SSE vector intrinsics:
- cpu.architecture: add SSE vector representations
- compiler.cfg.intrinsics.alien: remove an attempt at optimization that value numbering handles now
- compiler.cfg.representations: support instructions where the representation is set in the 'rep' slot, and support conversions between single and double floats
- alien-float, set-alien-float now use the single float representation, and the conversion is implicit; this fixes a long-standing bug where a register could get clobbered because of how %set-alien-float was defined on x86
- math.vectors.specialization: add support for SIMD specialization (where the vector word's body is replaced by another quotation), also specialize the 'sum' word
- math.vectors.simd: 4float-array, 2double-array, 4double-array types, and specializers for the math.vectors words
2009-09-03 02:33:07 -05:00
Joe Groff e9a5ed5931 i suck at reading tech docs--those were m64 instructions, not mm instructions 2009-09-02 12:58:35 -05:00
Joe Groff 0ddf19d033 get rid of useless mm->xmm instructions in cpu.x86.assembler, add MOVHLPS and MOVLHPS 2009-09-02 11:06:08 -05:00
Slava Pestov 775b9af2f7 compiler: eliminate boilerplate by centralizing info in declarative INSN: syntax 2009-09-02 06:22:37 -05:00
Slava Pestov 14a063dd92 cpu.ppc: implement fast float function calls; 3x speedup on benchmark.struct-arrays on PowerPC 2009-09-01 15:19:26 -05:00
Slava Pestov e659203907 cpu.ppc: fix %box-displaced-alien 2009-08-30 20:56:04 -05:00
Slava Pestov b35a01879e %box-displaced-alien: fix clobberage found by Doug 2009-08-30 05:11:08 -05:00
Slava Pestov f6a836d1e9 compiler.cfg.linear-scan now supports partial sync-points where all registers are spilled; taking advantage of this, there are new trigonometric intrinsics which yield a 2x performance boost on benchmark.struct-arrays and a 25% boost on benchmark.partial-sums 2009-08-30 04:52:01 -05:00
Slava Pestov f30aa5d20e compiler: add fixnum-min/max intrinsics; ~10% speedup on benchmark.yuv-to-rgb 2009-08-28 19:02:59 -05:00
Slava Pestov 99bf9fadfb Performance improvements to make struct-arrays benchmark faster
- improved optimization of ##unbox-any-c-ptr on ##box-displaced-alien; convert it to ##unbox-c-ptr where possible using class info stored in the ##bda instruction
- make fcos, fsin, etc inline again; everything in math.libm inline again, except for fsqrt which is an intrinsic
- convert min and max on floats to float-min and float-max
- make min and max not inline, so that the above can work
- struct-arrays: rice a bit so that more fixnums come up
2009-08-28 05:21:16 -05:00
sheeple 8970cbc961 cpu.ppc: fix ##box-displaced-alien 2009-08-27 04:43:45 -05:00
Slava Pestov 9caf3f9248 compiler: new inline intrinsic for <displaced-alien> where the inputs have known types; value numbering now eliminates unnecessary allocation of displaced aliens if the result is immediately unboxed again 2009-08-27 00:06:19 -05:00
Slava Pestov 4fe0257169 cpu.x86: use SQRTSD instruction for math.libm:fsqrt word 2009-08-25 23:22:15 -05:00
Slava Pestov e44b2eb875 cpu.ppc.assembler: LOAD32 assembler macro was busted 2009-08-25 22:37:10 -05:00
Slava Pestov 9805dde418 basis/cpu: eliminate some usages of rot 2009-08-25 19:38:48 -05:00
Slava Pestov e9818be8ae cpu.ppc.assembler: fix FMR and FMR. opcodes 2009-08-25 19:33:35 -05:00
sheeple 90b3921b31 cpu.ppc: integer>fixnum scratch area overlapped with the rest of stack frame, very bad 2009-08-22 20:23:28 -05:00
Slava Pestov 1b8636bad5 Merge branch 'master' of git://factorcode.org/git/factor 2009-08-21 18:48:44 -05:00
Slava Pestov b1a12f85e4 cpu.ppc: work in progress 2009-08-21 18:48:34 -05:00
Doug Coleman d1ce837569 Delete empty unit tests files, remove 1- and 1+, reorder IN: lines in a lot of places, minor refactoring 2009-08-13 19:21:44 -05:00
Slava Pestov 2d575d7ec9 compiler.cfg: virtual registers are integers now, and representations are stored off to the side. Fix bug in representation selection that would manifest if a value was used as a float and a fixnum in different branches; cannot globally unbox float in this case 2009-08-08 04:02:18 -05:00
Slava Pestov 4d2160799f Split off the notion of a register representation from a register class 2009-08-07 17:44:50 -05:00
Slava Pestov a7e61632d9 cpu.x86.assembler: make some words private 2009-08-05 18:30:42 -05:00
Slava Pestov 203a64f236 cpu.ppc: put spill slots and GC roots in stack frame where subroutine calls can't clobber them 2009-07-31 23:47:07 -05:00
Slava Pestov ebec4ffd75 Merge branch 'master' of git://factorcode.org/git/factor 2009-07-31 17:59:00 -05:00
Slava Pestov 0e59d29282 cpu.ppc: fix small typos 2009-07-31 17:57:15 -05:00
Doug Coleman c33343b302 fix using list on win64 2009-07-31 16:27:18 -05:00
Slava Pestov e32477fd59 cpu.ppc: Updating PowerPC backend for codegen changes over the last two months: new shift intrinsics added, fixnum overflow intrinsics are now treated like conditionals, GC checks are more complex and have a different API 2009-07-30 21:44:22 -05:00
Slava Pestov db55a031df Move a bunch of GC check generation logic to platform-independent side 2009-07-30 21:28:27 -05:00
Slava Pestov d09013b311 Merge branch 'master' of git://factorcode.org/git/factor 2009-07-30 19:11:02 -05:00
Joe Groff b49fb43b60 Merge branch 'master' of git://factorcode.org/git/factor 2009-07-30 11:05:36 -05:00
Joe Groff c59c619364 add additional SSE2 packed integer operations 2009-07-30 11:05:12 -05:00
Slava Pestov 99216b8435 compiler.cfg: Get inline GC checks working again, using a dataflow analysis to compute uninitialized stack locations in compiler.cfg.stacks.uninitialized. Re-enable intrinsics which use inline allocation 2009-07-30 09:19:44 -05:00
Slava Pestov e3c38262ed Oopsie 2009-07-30 08:27:52 -05:00
Slava Pestov c9feb6f012 cpu.x86: Fix shuffle bug. Shuffling bugs occurring in code that runs before optimizer/stack checker is online are only caught at runtime during bootstrap, what a pain 2009-07-30 05:12:40 -05:00
Slava Pestov 4842641e75 cpu.x86: fix a bug in small-register logic on 32-bit. Also, on 32-bit, we don't need to do any special register shuffling to work with 16-bit operands since all registers have 16-bit variants. So now only 8-bit operands on x86-32 require special treatment 2009-07-30 05:04:46 -05:00
Slava Pestov 32a3abc9b4 cpu.x86: update non-optimizing compiler backends for assembler vocab split 2009-07-30 02:22:37 -05:00
Slava Pestov 226908d2d2 cpu.x86.assembler: fix extended 8-bit registers (DIL, SIL, SPL, BPL) 2009-07-29 22:32:22 -05:00
Slava Pestov 0899934220 cpu.x86: use full set of 8-bit, 16-bit and 32-bit registers on x86-64 to avoid clumsy save/restore logic 2009-07-29 21:56:37 -05:00
Slava Pestov 7831293fda cpu.x86.assembler: move operands to operands sub-vocabulary, clean up small-reg-* code in compiler backend 2009-07-29 21:44:08 -05:00
Slava Pestov c1fd97d515 Merge branch 'dcn' 2009-07-28 12:37:45 -05:00
Joe Groff 4c664a469a SSE4 opcodes for x86 assembler 2009-07-28 12:19:37 -05:00
Slava Pestov afd914c808 Merge branch 'master' into dcn 2009-07-28 11:20:43 -05:00
Joe Groff 1fe11f7c87 SSE1–SSSE3 opcodes + branch hints for x86 assembler 2009-07-28 00:22:27 -05:00
Slava Pestov f0a5ac3fbb cpu.x86: compile a load of zero, and adds, subs where dst = src1 more efficiently 2009-07-27 22:27:54 -05:00
Slava Pestov 39a70db831 Improve code generation for shift word: add intrinsics for fixnum-shift-fast in the case where the shift count is not constant, transform 1 swap shift into a more overflow check with open-coded fast case, transform bitand into fixnum-bitand in more cases 2009-07-16 23:50:48 -05:00
Slava Pestov 99faf3c79f Overflowing fixnum intrinsics now expand into several CFG nodes. This speeds up the common case since only the uncommon case is now a stack syncpoint 2009-07-16 18:29:40 -05:00
Slava Pestov 1eae4286cd compiler.cfg: split off condition codes into a comparisons sub-vocabulary 2009-07-13 14:42:52 -05:00
Slava Pestov 27c0577c91 cpu.x86.32: don't emit sub %esp,0x0 in prologue on Linux and Windows 2009-07-01 18:13:45 -05:00
Slava Pestov 554559c0b1 %dispatch: sometimes the generated sequence is one byte longer, so instead of hard-coding it, compute the right length 2009-06-30 18:11:15 -05:00
Slava Pestov 4782c737ab cpu.x86: don't clobber src in %dispatch 2009-06-30 16:47:22 -05:00
Slava Pestov a61a992bfd cpu.x86.assembler: IMUL2 instruction was busted for immediate operands
When given a register and an immediate, it would generate imul imm,dst,dst; however, the 64-bit prefix was generated wrong, and if dst was an extended register only the first operand would be an extended register. To fix this, change IMUL2 to not work on immediates anymore, and add a new IMUL3 that takes a destination register, source register, and immediate. Also, change compiler.cfg.two-operand to not two-operandize %mul-imm, since this isn't needed anymore.
This fixes the sporadic benchmark.tuple-arrays crash on 64-bit machines.
2009-06-08 21:15:52 -05:00
Slava Pestov 0d265fe016 Remove %dispatch-label since it's the same on all platforms; fix %gc on PowerPC 2009-06-07 21:46:28 -05:00
Slava Pestov f0b132fa7f Fix 32-bit bootstrap 2009-06-03 03:23:55 -05:00
Slava Pestov fd710385e5 cpu.x86: fix small register intrinsics on x86-64 2009-06-03 03:22:46 -05:00
Slava Pestov 7aca076408 GC checks now save and restore registers 2009-06-02 18:23:47 -05:00
Slava Pestov 3de85158de Merge branch 'master' into global_optimization 2009-06-01 03:12:32 -05:00
Slava Pestov 096803e58f Redo compiler.codegen.fixup and get %dispatch to work 2009-06-01 02:32:36 -05:00
Slava Pestov 64114947d2 Various improvements aimed at getting local optimization regressions fixed:
- Rename _gc to ##gc
- Absolute labels are now supported
- Generate _dispatch-label
2009-05-31 23:28:08 -05:00
Slava Pestov e2b8b04d15 cpu.x86.features: add RDTSC support. This is a new vocabulary with words: sse2? instruction-counter count-instructions 2009-05-31 15:02:14 -05:00
Slava Pestov 40949800bf Fixing various bugs; alias analysis wasn't handling ##phi nodes, stack analysis incorrectly handled height-changing back edges and ##fixnum-*, clean up ##dispatch generation 2009-05-29 01:39:14 -05:00
U-C4\Administrator 9c85bc8ce3 fix duplicate using lines 2009-05-17 20:29:32 -05:00
Slava Pestov 514956537f Fix cpu.ppc for strict vocabulary search path semantics 2009-05-16 08:58:10 -05:00
Slava Pestov ba04d5af1e Update documentation for stricter vocabulary search path semantics 2009-05-16 00:29:21 -05:00
Slava Pestov 3e7269731b cpu.ppc: really fix bool type 2009-05-10 19:10:20 -05:00
Slava Pestov 9b491d1442 Fix bool type on PowerPC 2009-05-10 19:01:38 -05:00
Slava Pestov 931db821d1 Merge branch 'master' of git://factorcode.org/git/factor 2009-05-07 23:26:37 -05:00
Slava Pestov e007cb56e8 cpu.ppc: fix alien-indirect 2009-05-07 23:26:33 -05:00
Slava Pestov 5e35f19312 cpu.ppc: bools are 4 bytes on OS X/PowerPC 2009-05-07 23:18:41 -05:00
Slava Pestov 02bd871575 Merge branch 'master' of git://factorcode.org/git/factor 2009-05-07 19:47:42 -05:00
Slava Pestov cb9d50887c Update PowerPC %jump and %dispatch-label, and add PIC-related functions to cpu-ppc.hpp 2009-05-07 19:47:38 -05:00
Slava Pestov e78c043acb Merge branch 'master' of git://factorcode.org/git/factor 2009-05-07 19:41:54 -05:00
Slava Pestov b45284421d cpu.ppc.bootstrap: updates 2009-05-07 19:40:25 -05:00
Slava Pestov db6ae46c47 Fix x86-64 backend 2009-05-07 16:58:18 -05:00
Slava Pestov 9b419aa0b1 Count megamorphic cache hits 2009-05-07 14:26:08 -05:00
Slava Pestov 74094142fe Fix tail call PICs on x86-64 2009-05-06 22:44:30 -05:00
Slava Pestov 4f0a1b024e Clean up bootstrap.image, and implement new calling convention for tail calls; tail call sites now have PICs 2009-05-06 22:04:01 -05:00
Slava Pestov c1e25f3b43 JIT now supports multiple relocations per code template. This simplifies non-optimizing compiler backends 2009-05-06 20:04:49 -05:00
Slava Pestov d3b85c14c9 Working on inline caching for tail call sites 2009-05-06 19:22:22 -05:00
Slava Pestov 478d29a175 Better separation of concerns: cpu.{x86,ppc}.assembler no longer depends on compiler.codegen.fixup and cpu.architecture. Rename rt-xt-direct to rt-xt-pic to better explain its purpose 2009-05-06 16:14:53 -05:00