luau/CodeGen
Arseny Kapoulkine cc51e616ce
CodeGen: Optimize vector ops for X64 when the source is computed (#1174)
With the TAG_VECTOR change, we can now confidently distinguish cases
when the .w component contains the TVECTOR tag from cases where it
doesn't: loads and tag ops produce the tag, whereas other instructions
don't.

We now take advantage of this fact and only apply vandps with a mask
when we need to.

It would be possible to use a positive filter (explicitly checking for
source coming from ADD_VEC et al), but there are more instructions to
check this way and this is purely an optimization so it is allowed to be
conservative (as in, the cost of a mistake here is a potential slowdown,
not a correctness issue).

Additionally, this change only performs vandps once when the arguments
are the same instead of doing it twice.
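
The decision described above can be sketched roughly as follows. This is
an illustration only, not the code from this change: SourceKind, Operand,
needsTagMask and maskCountForBinaryOp are hypothetical names, and the
enum is a simplified stand-in for the real IR opcodes. It assumes that
only loads and tag ops leave the TVECTOR tag in .w, and it counts how
many vandps mask operations a binary vector op would need.

```cpp
// Hypothetical, simplified stand-in for the IR opcodes (not the real Luau enum).
enum class SourceKind
{
    Load,      // value loaded from memory: .w holds the TVECTOR tag
    TagVector, // result of a tag op: .w holds the TVECTOR tag
    AddVec,    // computed result: assumed not to carry the tag
    MulVec,    // computed result: assumed not to carry the tag
    Other,     // anything else: assumed not to carry the tag
};

// Negative filter: mask only the sources known to produce the tag.
// Every other source is assumed to be a computed value that needs no mask.
constexpr bool needsTagMask(SourceKind source)
{
    return source == SourceKind::Load || source == SourceKind::TagVector;
}

// Hypothetical operand: 'id' identifies the instruction that produced the value.
struct Operand
{
    int id;
    SourceKind source;
};

// Number of vandps mask operations needed for a binary vector op.
// When both arguments are the same value, it is masked at most once.
constexpr int maskCountForBinaryOp(Operand lhs, Operand rhs)
{
    return lhs.id == rhs.id
        ? (needsTagMask(lhs.source) ? 1 : 0)
        : (needsTagMask(lhs.source) ? 1 : 0) + (needsTagMask(rhs.source) ? 1 : 0);
}
```

Under these assumptions, an op whose arguments both come from ADD_VEC-like
instructions needs no mask at all, and an op that uses the same loaded
value for both arguments needs one mask rather than two.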

On the function that computes a polynomial approximation this change
makes it ~20% faster on Zen4.
2024-03-01 03:32:43 -08:00
..
include CodeGen: Extract all vector tag patching into TAG_VECTOR (#1171) 2024-02-21 07:06:11 -08:00
src CodeGen: Optimize vector ops for X64 when the source is computed (#1174) 2024-03-01 03:32:43 -08:00