136 Commits (22e79e8495629c4799d1a139e8fab71a42b64a39)

Author SHA1 Message Date
Phil Dawes baa41f451f removed param-reg-* HOOKs 2009-09-25 18:58:55 +01:00
Phil Dawes aa71248937 made inline_gc a VM_C_API function 2009-09-25 18:29:07 +01:00
Phil Dawes 8b005f5b1d make inline_gc regparm(3) and cleaned up %call-gc stack alignment 2009-09-24 21:45:56 +01:00
Slava Pestov 1b30310a35 cpu.x86: don't generate SSE2 instructions if only SSE1 is available 2009-09-24 04:07:15 -05:00
Slava Pestov 24039cb56a math.vectors.simd: add v<< and v>> intrinsics for bitwise shifts on elements 2009-09-24 03:32:39 -05:00
Slava Pestov 3581d0b09b cpu.x86/ppc: unify register-to-register moves using %copy so that better coalescing can eliminate more moves later 2009-09-23 22:49:54 -05:00
Slava Pestov 165496d2f2 Add longlong-2, ulonglong-2, longlong-4, ulonglong-4 SIMD types, fix int-4 multiplication on SSE2 2009-09-23 20:23:25 -05:00
Slava Pestov abac963882 math.vectors.simd: new operations: vabs vsqrt vbitand vbitor vbitxor 2009-09-23 02:47:14 -05:00
Slava Pestov e4872212b1 cpu.x86: fix using list 2009-09-20 23:24:30 -05:00
Slava Pestov e04fba6bc7 Fix conflict 2009-09-20 23:18:07 -05:00
Slava Pestov 66871995c9 math.vectors.simd: add saturated arithmetic operations 2009-09-20 23:16:02 -05:00
Slava Pestov 78c949b9b7 math.vectors: add v+- word which is accelerated by SSE3 2009-09-20 17:43:16 -05:00
Slava Pestov dfb43bd2ca More integer SIMD work
- move generated vocab support from specialized-arrays to vocabs.generated
- add fuzz testing to math.vectors.simd
- add alien type support for integer SIMD vectors
- SIMD: parsing word generates a SIMD type, instead of pre-generating them all in math.vectors.simd
2009-09-20 16:48:17 -05:00
Slava Pestov 0d77efef29 cpu.x86: cleanup 2009-09-20 04:17:34 -05:00
Slava Pestov fc5fe2bd2a Merge Phil Dawes' VM work 2009-09-20 03:48:08 -05:00
Slava Pestov ea2bcd69c7 math.vectors.simd: redesign to be more flexible, integer SIMD work in progress 2009-09-20 02:08:32 -05:00
Phil Dawes f5e6d43e1e separated vm-1st-arg and vm-3rd-arg asm invoke words (needed for ppc & x86.64) 2009-09-16 08:20:09 +01:00
Phil Dawes 6e5ddc0c33 vm pointer passed to nest_stacks and unnest_stacks (win32) 2009-09-16 08:17:26 +01:00
Phil Dawes 780415b159 added code to pass vm ptr to some unboxers 2009-09-16 08:16:32 +01:00
Phil Dawes 2a1a4ccf27 fixed up getenv compiler intrinsic to use vm struct userenv 2009-09-16 08:16:32 +01:00
Phil Dawes cb3df86491 moved cards_offset and decks_offset into vm struct (for x86) 2009-09-16 08:16:31 +01:00
Phil Dawes fd72e140d2 nursery global variable moved into vm 2009-09-16 08:16:31 +01:00
Phil Dawes 6da959ff3b renamed to vm-field-offset. Slava's better at naming than me 2009-09-16 08:16:31 +01:00
Phil Dawes 77a13b1b6a Added a vm C-STRUCT, using it for struct offsets in x86 asm 2009-09-16 08:16:31 +01:00
Phil Dawes f9f1031dd8 moved stack_chain into vm struct 2009-09-16 08:16:31 +01:00
Phil Dawes 1fda8af73b Added %vm-invoke to pass vm ptr to vm functions (x86.32 only, otherwise uses singleton vm) 2009-09-16 08:16:30 +01:00
Joe Groff e5145b5a48 convert compiler cpu backends to use c-type words 2009-09-15 16:08:42 -05:00
Slava Pestov 19a5f58b53 cpu.x86: tweak SIMD intrinsics 2009-09-08 22:34:01 -05:00
Slava Pestov 092b31910d compiler: separate ##save-context instruction from ##alien-invoke, generate a ##save-context for libm calls, and add a pass to combine multiple context saves within a basic block. Fixes crashes with FP traps thrown by libm functions on x86-32 2009-09-08 21:50:55 -05:00
Joe Groff 025a5b7b15 split unordered and ordered float comparison intrinsics in compiler; generate only unordered comparisons for now 2009-09-08 17:04:26 -05:00
Slava Pestov ef09991500 Fixes 2009-09-08 00:13:18 -05:00
Slava Pestov 17821626c3 Fix conflicts 2009-09-07 23:51:25 -05:00
Joe Groff 9430fdc4b6 i had comisd/ucomisd backwards on x86 2009-09-04 12:30:30 -05:00
Slava Pestov 1f5193198b compiler: clean up code generation for alien boxing/unboxing a bit 2009-09-03 21:22:43 -05:00
Joe Groff b1ba82c84f convert comparison branch code in compiler to use locals 2009-09-03 21:19:39 -05:00
Slava Pestov 20dfbf7ac8 More SIMD work
- Rename SIMD types and register representations: <type>-<count> rather than <count><type>-array
- Make a functor to define 256-bit vector types, use it to define float-8 type
- Make SIMD instructions pure-insns so that they participate in value numbering
2009-09-03 20:58:56 -05:00
Joe Groff 0b9e5c034a add compiler comparison codes for floating-point unordered comparisons; update x86 backend to generate proper code for all floating-point comparisons 2009-09-03 20:32:05 -05:00
Slava Pestov f811208271 Detect SSE version and enable the correct set of SIMD intrinsics 2009-09-03 03:28:38 -05:00
Slava Pestov 52b99c050e Initial implementation of SSE vector intrinsics:
- cpu.architecture: add SSE vector representations
- compiler.cfg.intrinsics.alien: remove an attempt at optimization that value numbering handles now
- compiler.cfg.representations: support instructions where the representation is set in the 'rep' slot, and support conversions between single and double floats
- alien-float, set-alien-float now use the single float representation, and the conversion is implicit; this fixes a long-standing bug where a register could get clobbered because of how %set-alien-float was defined on x86
- math.vectors.specialization: add support for SIMD specialization (where the vector word's body is replaced by another quotation), also specialize the 'sum' word
- math.vectors.simd: 4float-array, 2double-array, 4double-array types, and specializers for the math.vectors words
2009-09-03 02:33:07 -05:00
Slava Pestov 775b9af2f7 compiler: eliminate boilerplate by centralizing info in declarative INSN: syntax 2009-09-02 06:22:37 -05:00
Slava Pestov b35a01879e %box-displaced-alien: fix clobberage found by Doug 2009-08-30 05:11:08 -05:00
Slava Pestov f30aa5d20e compiler: add fixnum-min/max intrinsics; ~10% speedup on benchmark.yuv-to-rgb 2009-08-28 19:02:59 -05:00
Slava Pestov 99bf9fadfb Performance improvements to make struct-arrays benchmark faster
- improved optimization of ##unbox-any-c-ptr on ##box-displaced-alien; convert it to ##unbox-c-ptr where possible using class info stored in the ##bda instruction
- make fcos, fsin, etc inline again; everything in math.libm inline again, except for fsqrt which is an intrinsic
- convert min and max on floats to float-min and float-max
- make min and max not inline, so that the above can work
- struct-arrays: rice a bit so that more fixnums come up
2009-08-28 05:21:16 -05:00
sheeple 8970cbc961 cpu.ppc: fix ##box-displaced-alien 2009-08-27 04:43:45 -05:00
Slava Pestov 9caf3f9248 compiler: new inline intrinsic for <displaced-alien> where the inputs have known types; value numbering now eliminates unnecessary allocation of displaced aliens if the result is immediately unboxed again 2009-08-27 00:06:19 -05:00
Slava Pestov 4fe0257169 cpu.x86: use SQRTSD instruction for math.libm:fsqrt word 2009-08-25 23:22:15 -05:00
Doug Coleman d1ce837569 Delete empty unit tests files, remove 1- and 1+, reorder IN: lines in a lot of places, minor refactoring 2009-08-13 19:21:44 -05:00
Slava Pestov 4d2160799f Split off the notion of a register representation from a register class 2009-08-07 17:44:50 -05:00
Slava Pestov db55a031df Move a bunch of GC check generation logic to platform-independent side 2009-07-30 21:28:27 -05:00
Slava Pestov 99216b8435 compiler.cfg: Get inline GC checks working again, using a dataflow analysis to compute uninitialized stack locations in compiler.cfg.stacks.uninitialized. Re-enable intrinsics which use inline allocation 2009-07-30 09:19:44 -05:00
Slava Pestov c9feb6f012 cpu.x86: Fix shuffle bug. Shuffling bugs occurring in code that runs before optimizer/stack checker is online are only caught at runtime during bootstrap, what a pain 2009-07-30 05:12:40 -05:00
Slava Pestov 4842641e75 cpu.x86: fix a bug in small-register logic on 32-bit. Also, on 32-bit, we don't need to do any special register shuffling to work with 16-bit operands since all registers have 16-bit variants. So now only 8-bit operands on x86-32 require special treatment 2009-07-30 05:04:46 -05:00
Slava Pestov 0899934220 cpu.x86: use full set of 8-bit, 16-bit and 32-bit registers on x86-64 to avoid clumsy save/restore logic 2009-07-29 21:56:37 -05:00
Slava Pestov 7831293fda cpu.x86.assembler: move operands to operands sub-vocabulary, clean up small-reg-* code in compiler backend 2009-07-29 21:44:08 -05:00
Slava Pestov f0a5ac3fbb cpu.x86: compile a load of zero, and adds, subs where dst = src1 more efficiently 2009-07-27 22:27:54 -05:00
Slava Pestov 39a70db831 Improve code generation for shift word: add intrinsics for fixnum-shift-fast in the case where the shift count is not constant, transform 1 swap shift into a more overflow check with open-coded fast case, transform bitand into fixnum-bitand in more cases 2009-07-16 23:50:48 -05:00
Slava Pestov 99faf3c79f Overflowing fixnum intrinsics now expand into several CFG nodes. This speeds up the common case since only the uncommon case is now a stack syncpoint 2009-07-16 18:29:40 -05:00
Slava Pestov 1eae4286cd compiler.cfg: split off condition codes into a comparisons sub-vocabulary 2009-07-13 14:42:52 -05:00
Slava Pestov a61a992bfd cpu.x86.assembler: IMUL2 instruction was busted for immediate operands
When given a register and an immediate, it would generate imul imm,dst,dst; however, the 64-bit prefix was generated incorrectly, and if dst was an extended register, only the first operand would be an extended register. To fix this, change IMUL2 to not work on immediates anymore, and add a new IMUL3 that takes a destination register, source register, and immediate. Also, change compiler.cfg.two-operand to not two-operandize %mul-imm, since this isn't needed anymore.
This fixes the sporadic benchmark.tuple-arrays crash on 64-bit machines.
2009-06-08 21:15:52 -05:00
Slava Pestov 0d265fe016 Remove %dispatch-label since it's the same on all platforms; fix %gc on PowerPC 2009-06-07 21:46:28 -05:00
Slava Pestov fd710385e5 cpu.x86: fix small register intrinsics on x86-64 2009-06-03 03:22:46 -05:00
Slava Pestov 7aca076408 GC checks now save and restore registers 2009-06-02 18:23:47 -05:00
Slava Pestov 096803e58f Redo compiler.codegen.fixup and get %dispatch to work 2009-06-01 02:32:36 -05:00
Slava Pestov 64114947d2 Various improvements aimed at getting local optimization regressions fixed:
- Rename _gc to ##gc
- Absolute labels are now supported
- Generate _dispatch-label
2009-05-31 23:28:08 -05:00
Slava Pestov 40949800bf Fixing various bugs; alias analysis wasn't handling ##phi nodes, stack analysis incorrectly handled height-changing back edges and ##fixnum-*, clean up ##dispatch generation 2009-05-29 01:39:14 -05:00
Slava Pestov 74094142fe Fix tail call PICs on x86-64 2009-05-06 22:44:30 -05:00
Slava Pestov d3b85c14c9 Working on inline caching for tail call sites 2009-05-06 19:22:22 -05:00
Slava Pestov 478d29a175 Better separation of concerns: cpu.{x86,ppc}.assembler no longer depends on compiler.codegen.fixup and cpu.architecture. Rename rt-xt-direct to rt-xt-pic to better explain its purpose 2009-05-06 16:14:53 -05:00
Slava Pestov 44bfff7c7b Rename ##load-indirect to ##load-reference since this is more descriptive; value numbering doesn't assign expressions to ##load-reference nodes since this would end up folding literals which were eq? but not = 2009-01-29 01:44:58 -06:00
Slava Pestov 8a8f0c925c Use BSR instruction to implement fixnum-log2 intrinsic 2008-12-06 15:31:17 -06:00
Slava Pestov a56d480aa6 Various optimizations leading to a 10% speedup on compiling empty EBNF parser:
- open-code getenv primitive
- inline tuple predicates in finalization
- faster partial dispatch
- faster built-in type predicates
- faster tuple predicates
- faster lo-tag dispatch
- compile V{ } clone and H{ } clone more efficiently
- add fixnum fast-path to =; avoid indirect branch if two fixnums not eq
- faster >alist on hashtables
2008-12-06 09:16:29 -06:00
Slava Pestov 82cf6530c6 set-string-nth-fast intrinsic was busted 2008-12-05 23:52:09 -06:00
Slava Pestov e256846acd Tweak string representation; high bit indicates if character has high bits in aux vector. Avoids memory access in common case. Split set-string-nth into two primitives; set-string-nth-fast is open-coded by optimizing compiler. 13% improvement on reverse-complement 2008-12-05 06:38:51 -06:00
Slava Pestov e5ed7447ed Removing more >r/r> usages 2008-12-03 08:46:16 -06:00
Slava Pestov e7f4563374 fixnum* intrinsic for x86 2008-11-30 07:26:49 -06:00
Slava Pestov f44506089d More work on overflow instructions: don't need temp register anymore, add -tail variants which don't need stack frame 2008-11-28 06:36:30 -06:00
Slava Pestov 5634becda1 ##fixnum-add, ##fixnum-sub instructions open-code overflow check 2008-11-28 05:33:58 -06:00
Slava Pestov ab689c098b Clean up direct literal code and make a first attempt at PowerPC support 2008-11-24 08:16:14 -06:00
Slava Pestov 2aaf860f47 Experimental optimizations 2008-11-24 06:40:51 -06:00
Slava Pestov 20f5541d35 Refactoring FFI for Win64 2008-11-17 13:34:37 -06:00
Slava Pestov eb05dd3a12 Optimize a ##dispatch that is applied to the result of a ##sub-imm or ##add-imm; this eliminates an instruction from the common 1 fixnum-fast { ... } dispatch and 8 fixnum-fast { ... } dispatch code sequences appearing in generic word expansions 2008-11-13 04:16:08 -06:00
unknown f7fe84e563 Working on Win64 FFI 2008-11-08 21:40:47 -06:00
unknown 7365959f01 Starting work on Win64 port 2008-11-07 20:33:32 -06:00
sheeple d2ec46e38f PowerPC backend almost functional; some new compiler unit tests added,
better compilation of 'f eq?'; f becomes an immediate operand
move aux-offset to compiler.constants
2008-11-06 06:27:27 -06:00
Slava Pestov 53cd75b06c Add string-nth intrinsic 2008-11-06 01:11:28 -06:00
Slava Pestov 8b7c47a68b Clean up x86 backend: move cpu.x86.architecture to cpu.x86, use branchless arithmetic in some intrinsics 2008-11-05 04:15:48 -06:00