factor

Commit Graph

Author	SHA1	Message	Date
Slava Pestov	e36a0d7ef4	compiler: clean up code generation for alien boxing/unboxing a bit	2009-09-03 21:22:43 -05:00
Joe Groff	c480bec303	convert comparison branch code in compiler to use locals	2009-09-03 21:19:39 -05:00
Slava Pestov	4d5a4222b6	More SIMD work - Rename SIMD types and register representations: <type>-<count> rather than <count><type>-array - Make a functor to define 256-bit vector types, use it to define float-8 type - Make SIMD instructions pure-insns so that they participate in value numbering	2009-09-03 20:58:56 -05:00
Joe Groff	036ff77306	add compiler comparison codes for floating-point unordered comparisons; update x86 backend to generate proper code for all floating-point comparisons	2009-09-03 20:32:05 -05:00
Slava Pestov	906a0d212a	Detect SSE version and enable the correct set of SIMD intrinsics	2009-09-03 03:28:38 -05:00
Slava Pestov	ff8c70dbe0	Initial implementation of SSE vector intrinsics: - cpu.architecture: add SSE vector representations - compiler.cfg.intrinsics.alien: remove an attempt at optimization that value numbering handles now - compiler.cfg.representations: support instructions where the representation is set in the 'rep' slot, and support conversions between single and double floats - alien-float, set-alien-float now use the single float representation, and the conversion is implicit; this fixes a long-standing bug where a register could get clobbered because of how %set-alien-float was defined on x86 - math.vectors.specialization: add support for SIMD specialization (where the vector word's body is replaced by another quotation), also specialize the 'sum' word - math.vectors.simd: 4float-array, 2double-array, 4double-array types, and specializers for the math.vectors words	2009-09-03 02:33:07 -05:00
Slava Pestov	85a2bfab6c	compiler: eliminate boilerplate by centralizing info in declarative INSN: syntax	2009-09-02 06:22:37 -05:00
Slava Pestov	9595be4bf9	%box-displaced-alien: fix clobberage found by Doug	2009-08-30 05:11:08 -05:00
Slava Pestov	2bb6293217	compiler: add fixnum-min/max intrinsics; ~10% speedup on benchmark.yuv-to-rgb	2009-08-28 19:02:59 -05:00
Slava Pestov	d957ae4e44	Performance improvements to make struct-arrays benchmark faster - improved optimization of ##unbox-any-c-ptr on ##box-displaced-alien; convert it to ##unbox-c-ptr where possible using class info stored in the ##bda instruction - make fcos, fsin, etc inline again; everything in math.libm inline again, except for fsqrt which is an intrinsic - convert min and max on floats to float-min and float-max - make min and max not inline, so that the above can work - struct-arrays: rice a bit so that more fixnums come up	2009-08-28 05:21:16 -05:00
sheeple	98f93f799b	cpu.ppc: fix ##box-displaced-alien	2009-08-27 04:43:45 -05:00
Slava Pestov	f662e6403a	compiler: new inline intrinsic for <displaced-alien> where the inputs have known types; value numbering now eliminates unnecessary allocation of displaced aliens if the result is immediately unboxed again	2009-08-27 00:06:19 -05:00
Slava Pestov	0df8aadce2	cpu.x86: use SQRTSD instruction for math.libm:fsqrt word	2009-08-25 23:22:15 -05:00
Doug Coleman	3f3d57032b	Delete empty unit tests files, remove 1- and 1+, reorder IN: lines in a lot of places, minor refactoring	2009-08-13 19:21:44 -05:00
Slava Pestov	725280d424	Split off the notion of a register representation from a register class	2009-08-07 17:44:50 -05:00
Slava Pestov	45770c6250	Move a bunch of GC check generation logic to platform-independent side	2009-07-30 21:28:27 -05:00
Slava Pestov	be363d1a5b	compiler.cfg: Get inline GC checks working again, using a dataflow analysis to compute uninitialized stack locations in compiler.cfg.stacks.uninitialized. Re-enable intrinsics which use inline allocation	2009-07-30 09:19:44 -05:00
Slava Pestov	d71e2f9577	cpu.x86: Fix shuffle bug. Shuffling bugs occurring in code that runs before optimizer/stack checker is online are only caught at runtime during bootstrap, what a pain	2009-07-30 05:12:40 -05:00
Slava Pestov	d81dec5d45	cpu.x86: fix a bug in small-register logic on 32-bit. Also, on 32-bit, we don't need to do any special register shuffling to work with 16-bit operands since all registers have 16-bit variants. So now only 8-bit operands on x86-32 require special treatment	2009-07-30 05:04:46 -05:00
Slava Pestov	8ca17d053c	cpu.x86: use full set of 8-bit, 16-bit and 32-bit registers on x86-64 to avoid clumsy save/restore logic	2009-07-29 21:56:37 -05:00
Slava Pestov	73862a9a03	cpu.x86.assembler: move operands to operands sub-vocabulary, clean up small-reg-* code in compiler backend	2009-07-29 21:44:08 -05:00
Slava Pestov	bfb2a4c1fc	cpu.x86: compile a load of zero, and adds, subs where dst = src1 more efficiently	2009-07-27 22:27:54 -05:00
Slava Pestov	3fb4fc1bde	Improve code generation for shift word: add intrinsics for fixnum-shift-fast in the case where the shift count is not constant, transform 1 swap shift into a more overflow check with open-coded fast case, transform bitand into fixnum-bitand in more cases	2009-07-16 23:50:48 -05:00
Slava Pestov	e76dce8aff	Overflowing fixnum intrinsics now expand into several CFG nodes. This speeds up the common case since only the uncommon case is now a stack syncpoint	2009-07-16 18:29:40 -05:00
Slava Pestov	768e2a5148	compiler.cfg: split off condition codes into a comparisons sub-vocabulary	2009-07-13 14:42:52 -05:00
Slava Pestov	45a2105449	cpu.x86.assembler: IMUL2 instruction was busted for immediate operands. When given a register and an immediate, it would generate imul imm,dst,dst; however, the 64-bit prefix was generated wrong, and if dst was an extended register only the first operand would be an extended register. To fix this, change IMUL2 to not work on immediates anymore, and add a new IMUL3 that takes a destination register, source register, and immediate. Also, change compiler.cfg.two-operand to not two-operandize %mul-imm, since this isn't needed anymore. This fixes the sporadic benchmark.tuple-arrays crash on 64-bit machines.	2009-06-08 21:15:52 -05:00
Slava Pestov	9ad9600764	Remove %dispatch-label since it's the same on all platforms; fix %gc on PowerPC	2009-06-07 21:46:28 -05:00
Slava Pestov	ade5db2405	cpu.x86: fix small register intrinsics on x86-64	2009-06-03 03:22:46 -05:00
Slava Pestov	2d231f066a	GC checks now save and restore registers	2009-06-02 18:23:47 -05:00
Slava Pestov	b389dcf441	Redo compiler.codegen.fixup and get %dispatch to work	2009-06-01 02:32:36 -05:00
Slava Pestov	fc152ef210	Various improvements aimed at getting local optimization regressions fixed: - Rename _gc to ##gc - Absolute labels are now supported - Generate _dispatch-label	2009-05-31 23:28:08 -05:00
Slava Pestov	76d74c16af	Fixing various bugs; alias analysis wasn't handling ##phi nodes, stack analysis incorrectly handled height-changing back edges and ##fixnum-*, clean up ##dispatch generation	2009-05-29 01:39:14 -05:00
Slava Pestov	318552ba60	Fix tail call PICs on x86-64	2009-05-06 22:44:30 -05:00
Slava Pestov	581d017b46	Working on inline caching for tail call sites	2009-05-06 19:22:22 -05:00
Slava Pestov	c93d876075	Better separation of concerns: cpu.{x86,ppc}.assembler no longer depends on compiler.codegen.fixup and cpu.architecture. Rename rt-xt-direct to rt-xt-pic to better explain its purpose	2009-05-06 16:14:53 -05:00
Slava Pestov	44bfff7c7b	Rename ##load-indirect to ##load-reference since this is more descriptive; value numbering doesn't assign expressions to ##load-reference nodes since this would end up folding literals which were eq? but not =	2009-01-29 01:44:58 -06:00
Slava Pestov	8a8f0c925c	Use BSR instruction to implement fixnum-log2 intrinsic	2008-12-06 15:31:17 -06:00
Slava Pestov	a56d480aa6	Various optimizations leading to a 10% speedup on compiling empty EBNF parser: - open-code getenv primitive - inline tuple predicates in finalization - faster partial dispatch - faster built-in type predicates - faster tuple predicates - faster lo-tag dispatch - compile V{ } clone and H{ } clone more efficiently - add fixnum fast-path to =; avoid indirect branch if two fixnums not eq - faster >alist on hashtables	2008-12-06 09:16:29 -06:00
Slava Pestov	82cf6530c6	set-string-nth-fast intrinsic was busted	2008-12-05 23:52:09 -06:00
Slava Pestov	e256846acd	Tweak string representation; high bit indicates if character has high bits in aux vector. Avoids memory access in common case. Split set-string-nth into two primitives; set-string-nth-fast is open-coded by optimizing compiler. 13% improvement on reverse-complement	2008-12-05 06:38:51 -06:00
Slava Pestov	e5ed7447ed	Removing more >r/r> usages	2008-12-03 08:46:16 -06:00
Slava Pestov	e7f4563374	fixnum* intrinsic for x86	2008-11-30 07:26:49 -06:00
Slava Pestov	f44506089d	More work on overflow instructions: don't need temp register anymore, add -tail variants which don't need stack frame	2008-11-28 06:36:30 -06:00
Slava Pestov	5634becda1	##fixnum-add, ##fixnum-sub instructions open-code overflow check	2008-11-28 05:33:58 -06:00
Slava Pestov	ab689c098b	Clean up direct literal code and make a first attempt at PowerPC support	2008-11-24 08:16:14 -06:00
Slava Pestov	2aaf860f47	Experimental optimizations	2008-11-24 06:40:51 -06:00
Slava Pestov	20f5541d35	Refactoring FFI for Win64	2008-11-17 13:34:37 -06:00
Slava Pestov	eb05dd3a12	Optimize a ##dispatch that is applied to the result of a ##sub-imm or ##add-imm; this eliminates an instruction from the common 1 fixnum-fast { ... } dispatch and 8 fixnum-fast { ... } dispatch code sequences appearing in generic word expansions	2008-11-13 04:16:08 -06:00
unknown	f7fe84e563	Working on Win64 FFI	2008-11-08 21:40:47 -06:00
unknown	7365959f01	Starting work on Win64 port	2008-11-07 20:33:32 -06:00
sheeple	d2ec46e38f	PowerPC backend almost functional; some new compiler unit tests added, better compilation of 'f eq?'; f becomes an immediate operand; move aux-offset to compiler.constants	2008-11-06 06:27:27 -06:00
Slava Pestov	53cd75b06c	Add string-nth intrinsic	2008-11-06 01:11:28 -06:00
Slava Pestov	8b7c47a68b	Clean up x86 backend: move cpu.x86.architecture to cpu.x86, use branchless arithmetic in some intrinsics	2008-11-05 04:15:48 -06:00

253 Commits (3e485652feb276a51667843f9a95173f9c862425)