factor

Commit Graph

Author	SHA1	Message	Date
Slava Pestov	4d5a4222b6	More SIMD work - Rename SIMD types and register representations: <type>-<count> rather than <count><type>-array - Make a functor to define 256-bit vector types, use it to define float-8 type - Make SIMD instructions pure-insns so that they participate in value numbering	2009-09-03 20:58:56 -05:00
Slava Pestov	ff8c70dbe0	Initial implementation of SSE vector intrinsics: - cpu.architecture: add SSE vector representations - compiler.cfg.intrinsics.alien: remove an attempt at optimization that value numbering handles now - compiler.cfg.representations: support instructions where the representation is set in the 'rep' slot, and support conversions between single and double floats - alien-float, set-alien-float now use the single float representation, and the conversion is implicit; this fixes a long-standing bug where a register could get clobbered because of how %set-alien-float was defined on x86 - math.vectors.specialization: add support for SIMD specialization (where the vector word's body is replaced by another quotation), also specialize the 'sum' word - math.vectors.simd: 4float-array, 2double-array, 4double-array types, and specializers for the math.vectors words	2009-09-03 02:33:07 -05:00
Slava Pestov	85a2bfab6c	compiler: eliminate boilerplate by centralizing info in declarative INSN: syntax	2009-09-02 06:22:37 -05:00
Slava Pestov	2bb6293217	compiler: add fixnum-min/max intrinsics; ~10% speedup on benchmark.yuv-to-rgb	2009-08-28 19:02:59 -05:00
Slava Pestov	d957ae4e44	Performance improvements to make struct-arrays benchmark faster - improved optimization of ##unbox-any-c-ptr on ##box-displaced-alien; convert it to ##unbox-c-ptr where possible using class info stored in the ##bda instruction - make fcos, fsin, etc inline again; everything in math.libm inline again, except for fsqrt which is an intrinsic - convert min and max on floats to float-min and float-max - make min and max not inline, so that the above can work - struct-arrays: rice a bit so that more fixnums come up	2009-08-28 05:21:16 -05:00
Doug Coleman	3f3d57032b	Delete empty unit tests files, remove 1- and 1+, reorder IN: lines in a lot of places, minor refactoring	2009-08-13 19:21:44 -05:00
Slava Pestov	24a50c8006	compiler.cfg.two-operand: sometimes we can eliminate a copy in the x = y <op> y case	2009-08-08 20:03:42 -05:00
Slava Pestov	4b7ba38aab	compiler.cfg: virtual registers are integers now, and representations are stored off to the side. Fix bug in representation selection that would manifest if a value was used as a float and a fixnum in different branches; cannot globally unbox float in this case	2009-08-08 04:02:18 -05:00
Slava Pestov	725280d424	Split off the notion of a register representation from a register class	2009-08-07 17:44:50 -05:00
Slava Pestov	370f4c081d	compiler.cfg: convert code into two-operand form before SSA destruction; SSA destruction now operates on a relaxed SSA form where multiple defs of the same vreg are allowed, but only within a single basic block. This makes linear scan's coalescing redundant, allowing it to be removed completely	2009-08-05 18:57:46 -05:00
Slava Pestov	c1c8424605	Compiler speedups	2009-08-02 09:16:21 -05:00
Slava Pestov	9bde92220b	compiler.cfg.two-operand: if last instruction in a basic block is an overflowing arithmetic op of the form x = y op x, we now convert it correctly. This fixes compiler regression with benchmark.dawes after recent coalescing changes	2009-08-01 23:50:47 -05:00
Slava Pestov	e8cf50ac3e	compiler.cfg.two-operand: make it work in more cases	2009-07-27 22:28:29 -05:00
Slava Pestov	21a012e3d7	compiler.cfg: Major restructuring -- do not compute liveness before local optimization, and instead change local optimizations to be more permissive of undefined values. Now, liveness is only computed once, after phi elimination and before register allocation. This means liveness analysis does not need to take phi nodes into account and can now use the new compiler.cfg.dataflow-analysis framework	2009-07-22 03:08:28 -05:00
Slava Pestov	e88e7f70be	Merge branch 'master' of git://factorcode.org/git/factor	2009-07-17 00:03:13 -05:00
Slava Pestov	3fb4fc1bde	Improve code generation for shift word: add intrinsics for fixnum-shift-fast in the case where the shift count is not constant, transform 1 swap shift into a more overflow check with open-coded fast case, transform bitand into fixnum-bitand in more cases	2009-07-16 23:50:48 -05:00
Daniel Ehrenberg	8ea2996438	Removing two unused words in compiler.cfg.two-operand	2009-07-16 22:59:38 -05:00
Slava Pestov	e76dce8aff	Overflowing fixnum intrinsics now expand into several CFG nodes. This speeds up the common case since only the uncommon case is now a stack syncpoint	2009-07-16 18:29:40 -05:00
Slava Pestov	45a2105449	cpu.x86.assembler: IMUL2 instruction was busted for immediate operands. When given a register and an immediate, it would generate imul imm,dst,dst however the 64-bit prefix was generated wrong and if dst was an extended register only the first operand would be an extended register. To fix this, change IMUL2 to not work on immediates anymore, and added a new IMUL3 that takes a destination register, source register, and immediate. Also, change compiler.cfg.two-operand to not two-operandize %mul-imm, since this isn't needed anymore. This fixes the sporadic benchmark.tuple-arrays crash on 64-bit machines.	2009-06-08 21:15:52 -05:00
Slava Pestov	3a9922d161	Fix compiler errors	2009-06-01 03:00:10 -05:00
Slava Pestov	e04df76f60	Various codegen improvements: - new-insn word to construct instructions - cache RPO in the CFG - re-organize low-level optimizer so that MR is built after register allocation - register allocation now stores instruction numbers in the instructions themselves - split defs-vregs into defs-vregs and temp-vregs	2009-05-29 13:11:34 -05:00
Slava Pestov	6b25e99470	Add summary for heaps more vocabs	2009-02-16 21:05:13 -06:00
Slava Pestov	145b635eb6	More optimization intended to reduce compile time. Another 10% speedup on compiling empty PEG parser - new map-flat combinator replaces usages of 'map flatten' in compiler - compiler.tree.def-use.simplified uses an explicit accumulator instead of flatten - compiler.tree.tuple-unboxing uses an explicit accumulator instead of flatten - fix inlining regression from last time: custom inlining results would sometimes be discarded - compiler.tree's 3each and 3map combinators rewritten to not use flip - rewrite math.partial-dispatch without locals (purely stylistic, no performance increase) - hand-optimize flip for common arrays-of-arrays case - don't run escape analysis and tuple unboxing if there are no allocations in the IR	2008-12-06 11:17:19 -06:00
Slava Pestov	db4db19cd9	Start working on coalescing	2008-10-28 02:38:37 -07:00

24 Commits (01b5430fbfb29aa6000ed90080c4328f127d0b23)