Commit Graph

471 Commits (334e93bbbfbbbe143cc28cb712381ce929f74333)

Author SHA1 Message Date
Joe Groff 02b797f11b struct classes now make their own C type without help from alien.structs. remove alien.structs dependencies from everywhere outside of alien and compiler, and have the FFI handle both alien.structs and classes.struct c-types 2009-09-15 17:38:49 -05:00
Slava Pestov 1b5614f974 math.functions: more accurate log10 (fixes problem reported by OneEyed) 2009-09-14 16:19:58 -05:00
Slava Pestov 8c46388272 compiler.cfg.builder: don't run certain tests if float intrinsics are not available 2009-09-13 23:12:47 -05:00
Slava Pestov 427bfb4ab8 math: add unordered comparison operators u< u<= u> u>= which behave exactly like < <= > >= except no floating point exceptions are set if one or both inputs are NaNs; also add efficient intrinsic for unordered? predicate, and fix propagation type functions for abs, absq, and bitnot 2009-09-12 22:20:13 -05:00
Slava Pestov 89d6096130 compiler.cfg.intrinsics: compile float-mod as a ##binary-float-function instead of a primitive call 2009-09-11 21:00:17 -05:00
Slava Pestov 66f500bdd7 Fix the build 2009-09-09 13:44:54 -05:00
Slava Pestov 72eec2c53e compiler.cfg.save-contexts: add new pass 2009-09-08 21:56:28 -05:00
Slava Pestov 092b31910d compiler: separate ##save-context instruction from ##alien-invoke, generate a ##save-context for libm calls, and add a pass to combine multiple context saves within a basic block. Fixes crashes with FP traps thrown by libm functions on x86-32 2009-09-08 21:50:55 -05:00
Joe Groff 025a5b7b15 split unordered and ordered float comparison intrinsics in compiler; generate only unordered comparisons for now 2009-09-08 17:04:26 -05:00
Slava Pestov 17821626c3 Fix conflicts 2009-09-07 23:51:25 -05:00
Slava Pestov e20e9008ea compiler.cfg.value-numbering: update tests for Joe's condition code changes 2009-09-04 03:11:56 -05:00
Slava Pestov 555543faae compiler: tweak generated code 2009-09-04 03:01:18 -05:00
Slava Pestov 1c87486320 math.vectors.simd: allow punning SIMD vectors between types 2009-09-04 02:35:58 -05:00
Slava Pestov 8223715a07 compiler.cfg.intrinsics: fix type detection on the alien type for vector accessors 2009-09-04 02:22:54 -05:00
Slava Pestov 1f5193198b compiler: clean up code generation for alien boxing/unboxing a bit 2009-09-03 21:22:43 -05:00
Slava Pestov 20dfbf7ac8 More SIMD work
- Rename SIMD types and register representations: <type>-<count> rather than <count><type>-array
- Make a functor to define 256-bit vector types, use it to define float-8 type
- Make SIMD instructions pure-insns so that they participate in value numbering
2009-09-03 20:58:56 -05:00
Joe Groff 0b9e5c034a add compiler comparison codes for floating-point unordered comparisons; update x86 backend to generate proper code for all floating-point comparisons 2009-09-03 20:32:05 -05:00
Slava Pestov 9cc705f6ba math.vectors.simd: split off intrinsics into a sub-vocabulary, to avoid loading most of the SIMD code on bootstrap 2009-09-03 03:43:43 -05:00
Slava Pestov 52b99c050e Initial implementation of SSE vector intrinsics:
- cpu.architecture: add SSE vector representations
- compiler.cfg.intrinsics.alien: remove an attempt at optimization that value numbering handles now
- compiler.cfg.representations: support instructions where the representation is set in the 'rep' slot, and support conversions between single and double floats
- alien-float, set-alien-float now use the single float representation, and the conversion is implicit; this fixes a long-standing bug where a register could get clobbered because of how %set-alien-float was defined on x86
- math.vectors.specialization: add support for SIMD specialization (where the vector word's body is replaced by another quotation), also specialize the 'sum' word
- math.vectors.simd: 4float-array, 2double-array, 4double-array types, and specializers for the math.vectors words
2009-09-03 02:33:07 -05:00
Slava Pestov 775b9af2f7 compiler: eliminate boilerplate by centralizing info in declarative INSN: syntax 2009-09-02 06:22:37 -05:00
Slava Pestov 14a063dd92 cpu.ppc: implement fast float function calls; 3x speedup on benchmark.struct-arrays on PowerPC 2009-09-01 15:19:26 -05:00
Slava Pestov 0cf3151216 compiler.cfg.intrinsics: cleanup: the "intrinsic" word property is now a quotation, not a boolean, making this mechanism more extensible 2009-08-30 22:20:49 -05:00
Slava Pestov 43af9b06a4 compiler.cfg.linear-scan.live-intervals: dead-value-error is never thrown anymore 2009-08-30 05:15:18 -05:00
Slava Pestov b35a01879e %box-displaced-alien: fix clobberage found by Doug 2009-08-30 05:11:08 -05:00
Slava Pestov f6a836d1e9 compiler.cfg.linear-scan now supports partial sync-points where all registers are spilled; taking advantage of this, there are new trigonometric intrinsics which yield a 2x performance boost on benchmark.struct-arrays and a 25% boost on benchmark.partial-sums 2009-08-30 04:52:01 -05:00
Slava Pestov fa64522421 compiler.cfg.value-numbering: fix ##box-displaced-alien simplification 2009-08-28 19:05:49 -05:00
Slava Pestov f30aa5d20e compiler: add fixnum-min/max intrinsics; ~10% speedup on benchmark.yuv-to-rgb 2009-08-28 19:02:59 -05:00
Slava Pestov 99bf9fadfb Performance improvements to make struct-arrays benchmark faster
- improved optimization of ##unbox-any-c-ptr on ##box-displaced-alien; convert it to ##unbox-c-ptr where possible using class info stored in the ##bda instruction
- make fcos, fsin, etc inline again; everything in math.libm inline again, except for fsqrt which is an intrinsic
- convert min and max on floats to float-min and float-max
- make min and max not inline, so that the above can work
- struct-arrays: rice a bit so that more fixnums come up
2009-08-28 05:21:16 -05:00
Slava Pestov c9cba1cc00 compiler.cfg.instructions: forgot that ##box-displaced-alien needs a GC check; fixes segfault in benchmark.mandel 2009-08-27 04:09:35 -05:00
Slava Pestov 9caf3f9248 compiler: new inline intrinsic for <displaced-alien> where the inputs have known types; value numbering now eliminates unnecessary allocation of displaced aliens if the result is immediately unboxed again 2009-08-27 00:06:19 -05:00
Slava Pestov 40b522c9d0 compiler.cfg.linear-scan: fix unit tests for new fake-representations 2009-08-26 08:58:00 -05:00
Slava Pestov d5fb53d417 compiler.cfg.debugger: fix fake-representations so that low-level-ir tests can pass on x86 2009-08-25 23:44:01 -05:00
Slava Pestov 4fe0257169 cpu.x86: use SQRTSD instruction for math.libm:fsqrt word 2009-08-25 23:22:15 -05:00
Slava Pestov 009d3a87f6 Add some unit tests 2009-08-22 17:15:10 -05:00
Slava Pestov c15555056e compiler.cfg.dataflow-analysis: when intersecting sets, treat uninitialized sets as universal rather than empty; reduces number of stack instructions generated by 1% 2009-08-20 18:15:41 -05:00
Slava Pestov 54ee3c3d01 compiler.cfg.stacks.local: more accurate local replace set computation; optimizes out 'swap swap' 2009-08-19 22:00:21 -05:00
Slava Pestov 552d069e9f compiler: add unit tests for new bugs 2009-08-19 16:56:26 -05:00
Daniel Ehrenberg d93f6ed1f3 Merge branch 'master' of git://factorcode.org/git/factor 2009-08-14 20:11:54 -05:00
Daniel Ehrenberg 595e3b96cd Improving write barrier elimination; change to compiler.cfg.utilities to support this 2009-08-14 19:41:41 -05:00
Daniel Ehrenberg 54389b5e5c Write barriers are hoisted out of loops when their target is slot-available 2009-08-13 20:26:44 -05:00
Doug Coleman 9f1030030d Merge branch 'master' of git://factorcode.org/git/factor
Conflicts:
	basis/calendar/calendar.factor
2009-08-13 19:40:02 -05:00
Doug Coleman d1ce837569 Delete empty unit tests files, remove 1- and 1+, reorder IN: lines in a lot of places, minor refactoring 2009-08-13 19:21:44 -05:00
Daniel Ehrenberg 25fad6550f Global write barrier elimination tracks newly allocated objects 2009-08-13 15:18:47 -05:00
Daniel Ehrenberg f80416b40e Fixing write-barrier elimination; adding bb as a parameter to join-sets in dataflow analysis 2009-08-12 23:52:29 -05:00
Daniel Ehrenberg 82d20d292c Making write barrier elimination global 2009-08-11 21:21:21 -05:00
Slava Pestov 55d1b76ad7 compiler.tree.escape-analysis: if the output of an #introduce node has an immutable tuple class type declaration, and it is not passed to any subroutine calls, or returned from the word, then unbox it. This speeds up vector arithmetic words on specialized arrays, because the specialized array is unboxed up-front, eliminating an indirection on every loop iteration 2009-08-09 16:29:21 -05:00
Slava Pestov 12ab2b9e9d _gc instruction doesn't need slot to hold GC root area size, since that's just tagged-values>> length 2009-08-09 03:08:13 -05:00
Slava Pestov ca2d989547 compiler.cfg.linearization: change order to fit older unit tests 2009-08-08 23:06:57 -05:00
Slava Pestov f3903e2ac3 compiler.cfg.two-operand: sometimes we can eliminate a copy in the x = y <op> y case 2009-08-08 20:03:42 -05:00
Slava Pestov 38ef8adde0 compiler.cfg.representation: OK to unbox output of ##load-reference globally 2009-08-08 20:03:13 -05:00