Commit Graph

81 Commits

Author SHA1 Message Date
Arseny Kapoulkine
9aa82c6fb9
CodeGen: Improve lowering of NUM_TO_VEC on A64 for constants (#1194)
When the input is a constant, we use a fairly inefficient sequence of
fmov+fcvt+dup or, when the double isn't encodable in fmov,
adr+ldr+fcvt+dup.

Instead, we can use the same lowering as X64 when the input is a
constant, and load the vector from memory. However, if the constant is
encodable via fmov, we can use a vector fmov instead (which is just one
instruction and doesn't need constant space).

Fortunately the bit encoding of fmov for 32-bit floating point numbers
matches that of 64-bit: the decoding algorithm is a little different
because it expands into a larger exponent, but the values are
compatible, so if a double can be encoded into a scalar fmov with a
given abcdefgh pattern, the same pattern should encode the same float;
due to the very limited number of mantissa and exponent bits, all values
that are encodable are also exact in both 32-bit and 64-bit floats.

This strategy is ~same as what gcc uses. For complex vectors, we
previously used 4 instructions and 8 bytes of constant storage, and now
we use 2 instructions and 16 bytes of constant storage, so the memory
footprint is the same; for simple vectors we just need 1 instruction (4
bytes).

clang lowers vector constants a little differently, opting to synthesize
a 64-bit integer using 4 instructions (mov/movk) and then move it to the
vector register - this requires 5 instructions and 20 bytes, vs ours/gcc
2 instructions and 8+16=24 bytes. I tried a simpler version of this that
would be more compact - synthesize a 32-bit integer constant with
mov+movk, and move it to vector register via dup.4s - but this was a
little slower on M2, so for now we prefer the slightly larger version as
it's not a regression vs current implementation.

On the vector approximation benchmark we get:

- Before this PR (flag=false): ~7.85 ns/op
- After this PR (flag=true): ~7.74 ns/op
- After this PR, with 0.125 instead of 0.123 in the benchmark code (to
use fmov): ~7.52 ns/op
- Not part of this PR, but the mov/dup strategy described above: ~8.00
ns/op
2024-03-13 12:56:11 -07:00
aaron
ae459a0197
Sync to upstream/release/616 (#1184)
# What's Changed

* Add a compiler hint to improve Luau memory allocation inlining

### New Type Solver

* Added a system for recommending explicit type annotations to users in
cases where we've inferred complex generic types with type families.
* Marked string library functions as `@checked` for use in new
non-strict mode.
* Fixed a bug with new non-strict mode where we would incorrectly report
arity mismatches when missing optional arguments.
* Implement an occurs check for unifications that would produce
self-recursive types.
* Fix bug where overload resolution would fail when applied to
non-overloaded functions.
* Fix bug that caused the subtyping to report an error whenever a
generic was instantiated in an invariant context.
* Fix crash caused by `SetPropConstraint` not blocking properly.

### Native Code Generation

* Implement optimization to eliminate dead stores
* Optimize vector ops for X64 when the source is computed (thanks,
@zeux!)
* Use more efficient lowering for UNM_* (thanks, @zeux!)

---

### Internal Contributors

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2024-03-08 16:47:53 -08:00
vegorov-rbx
443903aa00
Sync to upstream/release/615 (#1175)
# What's changed?

* Luau allocation scheme was changed to handle allocations in 513-1024
byte range internally without falling back to global allocator
* coroutine/thread creation no longer requires any global allocations,
making it up to 15% faster (vs libc malloc)
* table construction for 17-32 keys or 33-64 array elements is up to 30%
faster (vs libc malloc)

### New Type Solver

* Cyclic unary negation type families are reduced to `number` when
possible
* Class types are skipped when searching for free types in unifier to
improve performance
* Fixed issues with table type inference when metatables are present
* Improved inference of iteration loop types
* Fixed an issue with bidirectional inference of method calls
* Type simplification will now preserve error suppression markers

### Native Code Generation

* Fixed TAG_VECTOR skip optimization to not break instruction use counts
(broken optimization wasn't included in 614)
* Fixed missing side-effect when optimizing generic loop preparation
instruction

---

### Internal Contributors

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
2024-03-01 10:45:26 -08:00
Arseny Kapoulkine
cc51e616ce
CodeGen: Optimize vector ops for X64 when the source is computed (#1174)
With the TAG_VECTOR change, we can now confidently distinguish cases
when the .w component
contains TVECTOR tag from cases where it doesn't: loads and tag ops
produce the tag, whereas
other instructions don't.

We now take advantage of this fact and only apply vandps with a mask
when we need to.

It would be possible to use a positive filter (explicitly checking for
source coming from ADD_VEC
et al), but there are more instructions to check this way and this is
purely an optimization so
it is allowed to be conservative (as in, the cost of a mistake here is a
potential slowdown,
not a correctness issue).

Additionally, this change only performs vandps once when the arguments
are the same instead
of doing it twice.

On the function that computes a polynomial approximation this change
makes it ~20% faster on Zen4.
2024-03-01 03:32:43 -08:00
Vighnesh-V
3b0e93bec9
Sync to upstream/release/614 (#1173)
# What's changed?
Add program argument passing to scripts run using the Luau REPL! You can
now pass `--program-args` (or shorthand `-a`) to the REPL which will
treat all remaining arguments as arguments to pass to executed scripts.
These values can be accessed through variadic argument expansion. You
can read these values like so:
```
local args = {...} -- gets you an array of all the arguments
```
For example if we run the following script like `luau test.lua -a test1
test2 test3`:
```
-- test.lua
print(...)
```
you should get the output:
```
test1 test2 test3
```

### Native Code Generation

* Improve A64 lowering for vector operations by using vector
instructions
* Fix lowering issue in IR value location tracking! 
- A developer reported a divergence between code run in the VM and
Native Code Generation which we have now fixed

### New Type Solver

* Apply substitution to type families, and emit new constraints to
reduce those further
* More progress on reducing comparison  (`lt/le`)type families
* Resolve two major sources of cyclic types in the new solver

### Miscellaneous
* Turned internal compiler errors (ICE's) into warnings and errors

-------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2024-02-23 12:08:34 -08:00
Arseny Kapoulkine
80928acb92
CodeGen: Extract all vector tag patching into TAG_VECTOR (#1171)
Instead of patching the tag component with TVECTOR in every instruction
that produces a vector value, we now use a separate IR instruction to do
this. This reduces implementation redundancy, but more importantly
allows for a class of optimizations:

- NUM_TO_VECTOR previously patched the component unconditionally but the
result was used only in MUL/DIV_VEC instructions that ignore it anyway;
we can now remove this.

- ADD_VEC et al can now forward the source of TAG_VECTOR instruction of
either input; this shortens the latency chain and in the future could
allow us to generate optimal vector instruction sequence once the
temporary stores are marked as dead.

- In the future on X64, ADD_VEC et al will be able to analyze the input
instruction and remove tag masking conditionally. This is not part of
this PR as it requires a decision around expected FP environment and/or
the necessity of the existing masking to begin with.

I've also renamed NUM_TO_VECTOR to NUM_TO_VEC so that "VEC" always
refers to "3 float values" and for consistency with ADD/etc.

Note: ADD_VEC input forwarding is currently performed unconditionally;
it may or may not increase the spills that can't be reloaded from the
stack.

On A64 this makes the Taylor series computation a tiny bit faster
(11.3ns => 11.0ns) as it removes the redundant ins instructions along
the NUM_TO_VEC path. Curiously, the optimization of forwarding
TAG_VECTOR input to arithmetic instructions actually has a small penalty
as without it this PR runs at 10.9 ns. I don't know if this is a
property of the benchmark though, as I just noticed that in this
benchmark type inference actually fails to infer parts of the
computation as a vector op. If desired I will happily omit this part of
the change and we can explore that separately.
2024-02-21 07:06:11 -08:00
Arseny Kapoulkine
c5f4d973d7
Improve A64 lowering for vector operations by using vector instructions (#1164)
This change replaces scalar versions of vector opcodes for A64 with
actual vector instructions.

We take the approach similar to X64: patch last component with zero,
perform the math, patch last component with type tag. I'm hoping that in
the future the type tag will be placed separately (separate IR opcode?)
because right now chains of math operations result in excessive type tag
operations.

To patch the type tag without always keeping a mask in a register,
ins.4s instructions can be used; unfortunately it's only capable of
patching a register in-place, so we need an extra register copy in case
it's not last-use. Usually it's last-use so the patch is free; probably
with IR rework mentioned above all of this can be improved (e.g.
load-with-patch will never need to copy).

~It's not 100% clear if we *have* to patch type tag: Apple does preserve
denormals but we'd need to benchmark this to see if there's an actual
performance impact. But for now we're playing it safe.~

This was tested by running the conformance tests, and new opcode
implementations were checked by comparing the result with
https://armconverter.com/.

Performance testing is complicated by the fact that OSS Luau doesn't
support vector constructor out of the box, and other limitations of
codegen. I've hacked vector constructor/type into REPL and confirmed
that on a test that calls this function in a loop (not inlined):

```
function fma(a: vector, b: vector, c: vector)
        return a * b + c
end
```

... this PR improves performance by ~6% (note that probably most of the
overhead here is the call dispatch; I didn't want to brave testing a
more complex expression). The assembly for an individual operation
changes as follows:

Before:

```
#   %14 = MUL_VEC %12, %13                                    ; useCount: 2, lastUse: %22
 dup         s29,v31.s[0]
 dup         s28,v30.s[0]
 fmul        s29,s29,s28
 ins         v31.s[0],v29.s[0]
 dup         s29,v31.s[1]
 dup         s28,v30.s[1]
 fmul        s29,s29,s28
 ins         v31.s[1],v29.s[0]
 dup         s29,v31.s[2]
 dup         s28,v30.s[2]
 fmul        s29,s29,s28
 ins         v31.s[2],v29.s[0]
```

After:

```
#   %14 = MUL_VEC %12, %13                                    ; useCount: 2, lastUse: %22
 ins         v31.s[3],w31
 ins         v30.s[3],w31
 fmul        v31.4s,v31.4s,v30.4s
 movz        w17,#4
 ins         v31.s[3],w17
```

**edit** final form (see comments):

```
#   %14 = MUL_VEC %12, %13                                    ; useCount: 2, lastUse: %22
 fmul        v31.4s,v31.4s,v30.4s
 movz        w17,#4
 ins         v31.s[3],w17
```
2024-02-16 08:30:35 -08:00
vegorov-rbx
ea14e65ea0
Sync to upstream/release/613 (#1167)
# What's changed?

* Compiler now targets bytecode version 5 by default, this includes
support for vector type literals and sub/div opcodes with a constant on
lhs

### New Type Solver

* Normalizer type inhabitance check has been optimized
* Added ability to reduce cyclic `and`/`or` type families 

### Native Code Generation

* `CodeGen::compile` now returns more specific causes of a code
generation failure
* Fixed linking issues on platforms that don't support unwind frame data
registration

---

### Internal Contributors

Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
2024-02-15 18:04:39 -08:00
Alexander McCord
67ce75e870
Sync to upstream/release/611 (#1160)
# What's changed?

### Native Code Generation

* Fixed an UAF relating to reusing a hash key after a weak table has
undergone some GC.
* Fixed a bounds check on arm64 to allow access to the last byte of a
buffer.

### New Type Solver

* Type states now preserves error-suppression, i.e. `local x: any = 5`
and `x.foo` does not error.
* Made error-suppression logic in subtyping more accurate.
* Subtyping now knows how to reduce type families.
* Fixed function call overload resolution so that the return type
resolves to the correct overload.
* Fixed a case where we attempted to reduce irreducible type families a
few too many times, leading to duplicate errors.
* Type checker needs to type check annotations in function signatures to
be able to report errors relating to those annotations.
* Fixed an UAF from a pointer to stack-allocated data in Subtyping's
`explainReasonings`.

### Nonstrict Type Checker

* Fixed a crash when calling a checked function of the form `math.abs`
with an incorrect argument type.
* Fixed a crash when calling a checked function with a number of
arguments that did not exactly match the number of parameters required.

---

### Internal Contributors

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2024-02-02 13:32:42 -08:00
aaron
9c588be16d
Sync to upstream/release/610 (#1154)
# What's changed?

* Check interrupt handler inside the pattern match engine to eliminate
potential for programs to hang during string library function execution.
* Allow iteration over table properties to pass the old type solver. 

### Native Code Generation

* Use in-place memory operands for math library operations on x64.
* Replace opaque bools with separate enum classes in IrDump to improve
code maintainability.
* Translate operations on inferred vectors to IR.
* Enable support for debugging native-compiled functions in Roblox
Studio.

### New Type Solver

* Rework type inference for boolean and string literals to introduce
bounded free types (bounded below by the singleton type, and above by
the primitive type) and reworked primitive type constraint to decide
which is the appropriate type for the literal.
* Introduce `FunctionCheckConstraint` to handle bidirectional
typechecking for function calls, pushing the expected parameter types
from the function onto the arguments.
* Introduce `union` and `intersect` type families to compute deferred
simplified unions and intersections to be employed by the constraint
generation logic in the new solver.
* Implement support for expanding the domain of local types in
`Unifier2`.
* Rework type inference for iteration variables bound by for in loops to
use local types.
* Change constraint blocking logic to use a set to prevent accidental
re-blocking.
* Add logic to detect missing return statements in functions.

### Internal Contributors

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2024-01-26 19:20:56 -08:00
vegorov-rbx
cdd1a380db
Sync to upstream/release/609 (#1150)
### What's changed?
* Syntax for [read-only and write-only
properties](https://github.com/luau-lang/rfcs/pull/15) is now parsed,
but is not yet supported in typechecking

### New Type Solver
* `keyof` and `rawkeyof` type operators have been updated to match final
text of the [RFC](https://github.com/luau-lang/rfcs/pull/16)
* Fixed issues with cyclic type families that were generated for mutable
loop variables

### Native Code Generation
* Fixed inference for number / vector operation that caused an
unnecessary VM assist

---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2024-01-19 10:04:46 -08:00
Vighnesh-V
f31232d301
Sync to upstream/release/608 (#1145)
# Old Solver:

- Fix a bug in the old solver where a user could use the keyword
`typeof` as the name of a type alias.
- Fix stringification of scientific notation to omit a trailing decimal
place when not followed by a digit e.g. `1.e+20` -> `1e+20`
# New Solver
- Continuing work on the New non-strict mode
- Introduce `keyof` and `rawkeyof` type function for acquiring the type
of all keys in a table or class
(https://github.com/luau-lang/rfcs/pull/16)

---
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2024-01-12 14:25:27 -08:00
Petri Häkkinen
2173938eb0
Add tagged lightuserdata (#1087)
This change adds support for tagged lightuserdata and optional custom
typenames for lightuserdata.

Background: Lightuserdata is an efficient representation for many kinds
of unmanaged handles and resources in a game engine. However, currently
the VM only supports one kind of lightuserdata, which makes it
problematic in practice. For example, it's not possible to distinguish
between different kinds of lightuserdata in Lua bindings, which can lead
to unsafe practices and even crashes when a wrong kind of lightuserdata
is passed to a binding function. Tagged lightuserdata work similarly to
tagged userdata, i.e. they allow checking the tag quickly using
lua_tolightuserdatatagged (or lua_lightuserdatatag).

The tag is stored in the 'extra' field of TValue so it will add no cost
to the (untagged) lightuserdata type.

Alternatives would be to use full userdata values or use bitpacking to
embed type information into lightuserdata on application level.
Unfortunately these options are not that great in practice: full
userdata have major performance implications and bitpacking fails in
cases where full 64 bits are already used (e.g. pointers or 64-bit
hashes).

Lightuserdata names are not strictly necessary but they are rather
convenient when debugging Lua code. More precise error messages and
tostring returning more specific typename are useful to have in practice
(e.g. "resource" or "entity" instead of the more generic "userdata").

Impl note: I did not add support for renaming tags in
lua_setlightuserdataname as I'm not sure if it's possible to free fixed
strings. If it's simple enough, maybe we should allow renaming (although
I can't think of a specific need for it)?

---------

Co-authored-by: Petri Häkkinen <petrih@rmd.remedy.fi>
2023-12-14 15:05:51 -08:00
vegorov-rbx
c26d820902
Sync to upstream/release/606 (#1127)
New Solver
* Improvements to data flow analysis

Native Code Generation
* Block limit is now per-function instead of per-module

Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-12-08 13:50:16 -08:00
Vighnesh-V
c755875479
Sync to upstream/release/605 (#1118)
- Implemented [Require by String with Relative
Paths](https://github.com/luau-lang/rfcs/blob/master/docs/new-require-by-string-semantics.md)
RFC
- Implemented [Require by String with
Aliases](https://github.com/luau-lang/rfcs/blob/master/docs/require-by-string-aliases.md)
RFC with support for `paths` and `alias` arrays in .luarc
- Added SUBRK and DIVRK bytecode instructions to speed up
constant-number and constant/number operations
- Added `--vector-lib`, `--vector-ctor` and `--vector-type` options to
luau-compile to support code with vectors
 
New Solver
- Correctness fixes to subtyping
- Improvements to dataflow analysis

Native Code Generation
- Added bytecode analysis pass to predict type tags used in operations
- Fixed rare cases of numerical loops being generated without an
interrupt instruction
- Restored optimization data propagation into the linear block
- Duplicate buffer length checks are optimized away

Miscellaneous
- Small performance improvements to new non-strict mode
- Introduced more scripts for fuzzing Luau and processing the results,
including fuzzer build support for CMake

Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-12-01 23:46:57 -08:00
Arseny Kapoulkine
89b437bb4e
Add SUBRK and DIVRK bytecode instructions to bytecode v5 (#1115)
Right now, we can compile R\*K for all arithmetic instructions, but K\*R
gets compiled into two instructions (LOADN/LOADK + arithmetic opcode).

This is problematic since it leads to reduced performance for some code.
However, we'd like to avoid adding reverse variants of ADDK et al for
all opcodes to avoid the increase in I$ footprint for interpreter.

Looking at the arithmetic instructions, % and // don't have interesting
use cases for K\*V; ^ is sometimes used with constant on the left hand
side but this would need to call pow() by necessity in all cases so it
would be slow regardless of the dispatch overhead. This leaves the four
basic arithmetic operations.

For + and \*, we can implement a compiler-side optimization in the
future that transforms K\*R to R\*K automatically. This could either be
done unconditionally at -O2, or conditionally based on the type of the
value (driven by type annotations / inference) -- this technically
changes behavior in presence of metamethods, although it might be
sensible to just always do this because non-commutative +/* are evil.

However, for - and / it is impossible for the compiler to optimize this
in the future, so we need dedicated opcodes. This only increases the
interpreter size by ~300 bytes (~1.5%) on X64.

This makes spectral-norm and math-partial-sums 6% faster; maybe more
importantly, voxelgen gets 1.5% faster (so this change does have
real-world impact).

To avoid the proliferation of bytecode versions this change piggybacks
on the bytecode version bump that was just made in 604 for vector
constants; we would still be able to enable these independently but
we'll consider v5 complete when both are enabled.

Related: #626

---------

Co-authored-by: vegorov-rbx <75688451+vegorov-rbx@users.noreply.github.com>
2023-11-28 07:35:01 -08:00
vegorov-rbx
1cda8304bf
Add missing include file to BytecodeSummary.h (#1110) 2023-11-21 08:28:54 -08:00
Andy Friesen
74c532053f
Sync to upstream/release/604 (#1106)
New Solver

* New algorithm for inferring the types of locals that have no
annotations. This
algorithm is very conservative by default, but is augmented with some
control
  flow awareness to handle most common scenarios.
* Fix bugs in type inference of tables
* Improve performance of by switching out standard C++ containers for
`DenseHashMap`
* Infrastructure to support clearer error messages in strict mode

Native Code Generation

* Fix a lowering issue with buffer.writeu8 and 0x80-0xff values: A
constant
  argument wasn't truncated to the target type range and that causes an
  assertion failure in `build.mov`.
* Store full lightuserdata value in loop iteration protocol lowering
* Add analysis to compute function bytecode distribution
* This includes a class to analyze the bytecode operator distribution
per
function and a CLI tool that produces a JSON report. See the new cmake
      target `Luau.Bytecode.CLI`

---------

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-11-17 10:46:18 -08:00
Kostadin
0492ecffdf
Add #include <algorithm> to fix building with gcc 14 (#1104)
With gcc 14 some C++ Standard Library headers have been changed to no
longer include other headers that were used internally by the library.
In luau's case it is the `<algorithm>` header.
2023-11-16 11:51:16 -08:00
Alexander McCord
c2ba1058c3
Sync to upstream/release/603 (#1097)
# What's changed?

- Record the location of properties for table types (closes #802)
- Implement stricter UTF-8 validations as per the RFC
(https://github.com/luau-lang/rfcs/pull/1)
- Implement `buffer` as a new type in both the old and new solvers.
- Changed errors produced by some `buffer` builtins to be a bit more
generic to avoid platform-dependent error messages.
- Fixed a bug where `Unifier` would copy some persistent types, tripping
some internal assertions.
- Type checking rules on relational operators is now a little bit more
lax.
- Improve dead code elimination for some `if` statements with complex
always-false conditions

## New type solver

- Dataflow analysis now generates phi nodes on exit of branches.
- Dataflow analysis avoids producing a new definition for locals or
properties that are not owned by that loop.
- If a function parameter has been constrained to `never`, report errors
at all uses of that parameter within that function.
- Switch to using the new `Luau::Set` to replace `std::unordered_set` to
alleviate some poor allocation characteristics which was negatively
affecting overall performance.
- Subtyping can now report many failing reasons instead of just the
first one that we happened to find during the test.
- Subtyping now also report reasons for type pack mismatches.
- When visiting `if` statements or expressions, the resulting context
are the common terms in both branches.

## Native codegen

- Implement support for `buffer` builtins to its IR for x64 and A64.
- Optimized `table.insert` by not inserting a table barrier if it is
fastcalled with a constant.

## Internal Contributors

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Arseny Kapoulkine <arseny@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-11-10 13:10:07 -08:00
aaron
7105c81579
Sync to upstream/release/602 (#1089)
# What's changed?

* Fixed a bug in type cloning by maintaining persistent types.
* We now parse imprecise integer literals to report the imprecision as a
warning to developers.
* Add a compiler flag to specify the name of the statistics output file.

### New type solver

* Renamed `ConstraintGraphBuilder` to `ConstraintGenerator`
* LValues now take into account the type being assigned during
constraint generation.
* Normalization performance has been improved by 33% by replacing the an
internal usage of `std::unordered_set` with `DenseHashMap`.
* Normalization now has a helper to identify types that are equivalent
to `unknown`, which is being used to fix some bugs in subtyping.
* Uses of the old unifier in the new type solver have been eliminated.
* Improved error explanations for subtyping errors in `TypeChecker2`.

### Native code generation

* Expanded some of the statistics recorded during compilation to include
the number of instructions and blocks.
* Introduce instruction and block count limiters for controlling what
bytecode is translated into native code.
* Implement code generation for byteswap instruction.

### Internal Contributors

Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
2023-11-03 16:45:04 -07:00
Lily Brown
e5ec0cdff3
Sync to upstream/release/601 (#1084)
## What's changed

- `bit32.byteswap` added
([RFC](4f543ec23b/docs/function-bit32-byteswap.md))
- Buffer library implementation
([RFC](4f543ec23b/docs/type-byte-buffer.md))
- Fixed a missing `stdint.h` include
- Fixed parser limiter for recursive type annotations being kind of
weird (fixes #645)

### Native Codegen
- Fixed a pair of issues when lowering `bit32.extract`
- Fixed a narrow edge case that could result in an infinite loop without
an interruption
- Fixed a negative array out-of-bounds access issue
- Temporarily reverted linear block predecessor value propagation

### New type solver
- We now type check assignments to annotated variables
- Fixed some test cases under local type inference
- Moved `isPending` checks for type families to improve performance
- Optimized our process for testing if a free type is sufficiently
solved
- Removed "none ptr" from lea instruction disassembly logging

### Build system & tooling
- CMake configuration now validates dependencies to maintain separation
between components
- Improvements to the fuzzer coverage
- Deduplicator for fuzzed callstacks

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
2023-10-27 14:18:41 -07:00
Micah
011c1afbde
Implement bit32.byteswap (#1075)
I've decided to take a stab at implementing `bit32.byteswap` from the
[recently merged
RFC](https://github.com/Roblox/luau/blob/master/rfcs/function-bit32-byteswap.md).
I asked on Discord for some guidance, but for the sake of posterity:
this is my first time doing this and I am likely to have made some
mistakes.

The biggest gaps in this implementation are the lack of tests and the
lack of native codegen support. I'd appreciate help with those since I'm
not sure what's relevant for me to touch for tests, and I'm told that
relevant assembler instructions don't exist publicly yet. Intuition
tells me that Luau-side tests would go into
`tests/conformance/bitwise.luau` but this is not well documented and I'm
not sure how I'm meant to test built-in implementations.

The current implementation compiles down to `bswap` and `rev` on x86 and
ARM respectively when optimized.

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
2023-10-23 08:00:48 -07:00
Vighnesh-V
fd6250cf9d
Sync to upstream/release/600 (#1076)
### What's Changed

- Improve readability of unions and intersections by limiting the number
of elements of those types that can be presented on a single line (gated
under `FFlag::LuauToStringSimpleCompositeTypesSingleLine`)
- Adds a new option to the compiler `--record-stats` to record and
output compilation statistics
- `if...then...else` expressions are now optimized into `AND/OR` form
when possible.

### VM

- Add a new `buffer` type to Luau based on the [buffer
RFC](https://github.com/Roblox/luau/pull/739) and additional C API
functions to work with it; this release does not include the library.
- Internal C API to work with string buffers has been updated to align
with Lua version more closely

### Native Codegen

- Added support for new X64 instruction (rev) and new A64 instruction
(bswap) in the assembler
- Simplified the way numerical loop condition is translated to IR

### New Type Solver

- Operator inference now handled by type families
- Created a new system called `Type Paths` to explain why subtyping
tests fail in order to improve the quality of error messages.
- Systematic changes to implement Data Flow analysis in the new solver
(`Breadcrumb` removed and replaced with `RefinementKey`)

---
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
2023-10-20 18:10:30 -07:00
Lily Brown
24fdac4c05
Sync to upstream/release/599 (#1069)
## What's Changed

- Improve POSIX compliance in `CLI/FileUtils.cpp` by @SamuraiCrow #1064
- `AstStat*::hasEnd` is deprecated; use `AstStatBlock::hasEnd` instead
- Added a lint for common misuses of the `#` operator
- Luau now issues deprecated diagnostics for some uses of `getfenv` and
`setfenv`
- Fixed a case where we included a trailing space in some error
stringifications

### Compiler

- Do not do further analysis in O2 on self functions
- Improve detection of invalid repeat..until expressions vs continue

### New Type Solver

- We now cache subtype test results to improve performance
- Improved operator inference mechanics (aka type families)
- Further work towards type states
- Work towards [new non-strict
mode](https://github.com/Roblox/luau/blob/master/rfcs/new-nonstrict.md)
continues

### Native Codegen

- Instruction last use locations should follow the order in which blocks
are lowered
- Add a bonus assertion to IrLoweringA64::tempAddr

---

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
2023-10-13 13:20:12 -07:00
Andy Friesen
36e0e64715
Sync to upstream/release/598 (#1063)
* Include `windows.h` rather than `Windows.h` to make things compile on
MinGW.
* Custom implementation of timegm/os.time for all platforms
* Disable builtin constant folding when getfenv/setfenv are used
* Fixes https://github.com/Roblox/luau/issues/1042
* Fixes https://github.com/Roblox/luau/issues/1043

New Type Checker

* Initial work toward type states.
* Rework most overloadable operators to use type families.
* Initial work toward our new nonstrict mode.


Native Codegen

* Fix native code generation for dead loops
* Annotate top-level functions as cold
* Slightly smaller/faster x64 Luau calls
* emitInstCall used to not set savedpc itself, but now it does for
consistency with all other implementations
* Implement cmov support for X64
* Fix assertion in luau-compile when module is empty
* Optimize A64 calls at some code size cost
* Inline constant array index offset into the load/store instruction
* Increase x64 spill slots from 5 to 13

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
2023-10-06 12:02:32 -07:00
Radiant
2a3a030341
Rename Windows.h to windows.h (#1055)
Make it able to compile on MingW

Fixes #873.

Co-authored-by: Radiant <i.like.using.discord@gmail.com>
2023-10-03 06:59:44 -07:00
Alexander McCord
1d0b449181
Sync to upstream/release/597 (#1054)
# New Type Solver

- Implement bidirectional type inference for higher order functions so
that we can provide a more precise type improving the autocomplete's
human factors.
- We seal all tables, so we changed the stringification to make it a
little lighter on users.
- Fixed a case of array-out-of-bound access.
- Type families no longer depends on `TxnLog` and `Unifier`.
- Type refinements now waits until the free types are sufficiently
solved.

# Native Code Generation

- Remove cached slot lookup for `executeSETTABLEKS` function because it
is a fallback in the event of a cache miss, making the cached slot
lookup redundant.
- Optimized repeated array lookups, e.g. `a[3]` in `a[3] = a[3] / 2` is
done once.

# Misc

- On some platforms, it is necessary to use `gmtime_s` with the
arguments reversed to get the current time. You can now define
`DOCTEST_CONFIG_USE_GMTIME_S` to build and run unit tests on those
platforms.

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
2023-09-29 18:13:05 -07:00
aaron
16fbfe912c
Sync to upstream/release/596 (#1050)
- Cleaned up `FFlag::FixFindBindingAtFunctionName`,
`FFlag::LuauNormalizeBlockedTypes`, `FFlag::LuauPCallDebuggerFix`
- Added support for break and continue into control flow analysis
- The old type unification engine will now report a more fine-grained
error at times, indicating that type normalization in particular failed

# New Type Solver

- Refactor of Unifier2, the new unification implementation for Luau
- Completed MVP of new unification implementation
- Dramatically simplified overload selection logic
- Type family reduction can now apply sooner to free types that have
been solved
- Subtyping now supports table indexers
- Generalization now replaces bad generics with unknown

# Native Code Generation

- Reduce stack spills caused by FINDUPVAL and STORE_TAG
- Improve Generate SHL/SHR/SAR/rotates with immediate operands in X64
- Removed redundant case re-check in table lookup fallback

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
2023-09-22 12:12:15 -07:00
Andy Friesen
31a017c5c7
Sync to upstream/release/595 (#1044)
* Rerun clang-format on the code
* Fix the variance on indexer result subtyping. This fixes some issues
with inconsistent error reporting.
* Fix a bug in the normalization logic for intersections of strings

New Type Solver

* New overload selection logic
* Subtype tests now correctly treat a generic as its upper bound within
that generic's scope
* Semantic subtyping for negation types
* Semantic subtyping between strings and compatible table types like
`{lower: (string) -> string}`
* Further work toward finalizing our new subtype test
* Correctly generalize module-scope symbols

Native Codegen

* Lowering statistics for assembly
* Make executable allocation size/limit configurable without a rebuild.
Use `FInt::LuauCodeGenBlockSize` and `FInt::LuauCodeGenMaxTotalSize`.

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
2023-09-15 10:26:59 -07:00
vegorov-rbx
c7c986b996
Sync to upstream/release/594 (#1036)
* Fixed `Frontend::markDirty` not working on modules that were not
typechecked yet
* Fixed generic variadic function unification succeeding when it should
have reported an error

New Type Solver:
* Implemented semantic subtyping check for function types

Native Code Generation:
* Improved performance of numerical loops with a constant step
* Simplified IR for `bit32.extract` calls extracting first/last bits
* Improved performance of NaN checks
2023-09-07 17:13:49 -07:00
Lily Brown
551a43c424
Sync to upstream/release/593 (#1024)
- Updated Roblox copyright to 2023
- Floor division operator `//` (implements #832)
- Autocomplete now shows `end` within `do` blocks
- Restore BraceType when using `Lexer::lookahead` (fixes #1019)

# New typechecker

- Subtyping tests between metatables and tables
- Subtyping tests between string singletons and tables
- Subtyping tests for class types

# Native codegen

- Fixed macOS test failure (wrong spill restore offset)
- Fixed clobbering of non-volatile xmm registers on Windows
- Fixed wrong storage location of SSA reg spills
- Implemented A64 support for add/sub extended
- Eliminated zextReg from A64 lowering
- Remove identical table slot lookups
- Propagate values from predecessor into the linear block
- Disabled reuse slot optimization
- Keep `LuaNode::val` check for nil when optimizing `CHECK_SLOT_MATCH`
- Implemented IR translation of `table.insert` builtin
- Fixed mmap error handling on macOS/Linux

# Tooling

- Used `|` as a column separator instead of `+` in `bench.py`
- Added a `table.sort` micro-benchmark
- Switched `libprotobuf-mutator` to a less problematic version
2023-09-01 10:58:27 -07:00
vegorov-rbx
ce9414cb98
Sync to upstream/release/592 (#1018)
* AST queries at position where function name is will now return
AstExprLocal
* Lexer performance has been slightly improved
* Fixed incorrect string singleton autocomplete suggestions (fixes #858)
* Improved parsing error messages
* Fixed crash on null pointer access in unification (fixes #1017)
* Native code support is enabled by default and `native=1`
(make)/`LUAU_NATIVE` (CMake)/`-DLUA_CUSTOM_EXECUTION` configuration is
no longer required

New typechecker:
* New subtyping check can now handle generic functions and tables
(including those that contain cycles)

Native code generation:
* Loops with non-numeric parameters are now handled by VM to streamline
native code
* Array size check can be optimized away in SETLIST
* On failure, CodeGen::compile returns a reason
* Fixed clobbering of non-volatile xmm registers on Windows
2023-08-25 10:23:55 -07:00
Andy Friesen
e25b0a6275
Sync to upstream/release/591 (#1012)
* Fix a use-after-free bug in the new type cloning algorithm
* Tighten up the type of `coroutine.wrap`. It is now `<A..., R...>(f:
(A...) -> R...) -> ((A...) -> R...)`
* Break `.luaurc` out into a separate library target `Luau.Config`. This
makes it easier for applications to reason about config files without
also depending on the type inference engine.
* Move typechecking limits into `FrontendOptions`. This allows embedders
more finely-grained control over autocomplete's internal time limits.
* Fix stability issue with debugger onprotectederror callback allowing
break in non-yieldable contexts

New solver:

* Initial work toward [Local Type
Inference](0e1082108f/rfcs/local-type-inference.md)
* Introduce a new subtyping test. This will be much nicer than the old
test because it is completely separate both from actual type inference
and from error reporting.

Native code generation:

* Added function to compute iterated dominance frontier
* Optimize barriers in SET_UPVALUE when tag is known
* Cache lua_State::global in a register on A64
* Optimize constant stores in A64 lowering
* Track table array size state to optimize array size checks
* Add split tag/value store into a VM register
* Check that spills can outlive the block only in specific conditions

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-08-18 11:15:41 -07:00
vegorov-rbx
d98256bb80
Sync to upstream/release/590 (#1008)
* Better indentation in multi-line type mismatch error messages
* Error message clone can no longer cause a stack overflow (when
typechecking with retainFullTypeGraphs set to false); fixes
https://github.com/Roblox/luau/issues/975
* `string.format` with %s is now ~2x faster on strings smaller than 100
characters

Native code generation:
* All VM side exits will block return to the native execution of the
current function to preserve correctness
* Fixed executable page allocation on Apple platforms when using
hardened runtime
* Added statistics for code generation (no. of functions compiler,
memory used for different areas)
* Fixed issue with function entry type checks performed more that once
in some functions
2023-08-11 07:42:37 -07:00
Andy Friesen
0b2755f964
Sync to upstream/release/589 (#1000)
* Progress toward a diffing algorithm for types. We hope that this will
be useful for writing clearer error messages.
* Add a missing recursion limiter in `Unifier::tryUnifyTables`. This was
causing a crash in certain situations.
* Luau heap graph enumeration improvements:
    * Weak references are not reported
    * Added tag as a fallback name of non-string table links
* Included top Luau function information in thread name to understand
where thread might be suspended
* Constant folding for `math.pi` and `math.huge` at -O2
* Optimize `string.format` and `%*`
* This change makes string interpolation 1.5x-2x faster depending on the
number and type of formatted components, assuming a few are using
primitive types, and reduces associated GC pressure.

New type checker:

* Initial work toward tracking the upper and lower bounds of types
accurately.

Native code generation (JIT):

* Add IrCmd::CHECK_TRUTHY for improved assert fast-calls
* Do not compute type map for modules without types
* Capture metatable+readonly state for NEW_TABLE IR instructions
* Replace JUMP_CMP_ANY with CMP_ANY and existing JUMP_EQ_INT
* Add support for exits to VM with reentry lock in VmExit

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-08-04 12:18:54 -07:00
vegorov-rbx
76f67e0733
Sync to upstream/release/588 (#992)
Type checker/autocomplete:
* `Luau::autocomplete` no longer performs typechecking internally, make
sure to run `Frontend::check` before performing autocomplete requests
* Autocomplete string suggestions without "" are now only suggested
inside the ""
* Autocomplete suggestions now include `function (anonymous autofilled)`
key with a full suggestion for the function expression (with arguments
included) stored in `AutocompleteEntry::insertText`
* `AutocompleteEntry::indexedWithSelf` is provided for function call
suggestions made with `:`
* Cyclic modules now see each other type exports as `any` to prevent
memory use-after-free (similar to module return type)

Runtime:
* Updated inline/loop unroll cost model to better handle assignments
(Fixes https://github.com/Roblox/luau/issues/978)
* `math.noise` speed was improved by ~30%
* `table.concat` speed was improved by ~5-7%
* `tonumber` and `tostring` now have fastcall paths that execute ~1.5x
and ~2.5x faster respectively (fixes #777)
* Fixed crash in `luaL_typename` when index refers to a non-existing
value
* Fixed potential out of memory scenario when using `string.sub` or
`string.char` in a loop
* Fixed behavior of some fastcall builtins when called without arguments
under -O2 to match original functions
* Support for native code execution in VM is now enabled by default
(note: native code still has to be generated explicitly)
* `Codegen::compile` now accepts `CodeGen_OnlyNativeModules` flag. When
set, only modules that have a `--!native` hot-comment at the top will be
compiled to native code

In our new typechecker:
* Generic type packs are no longer considered to be variadic during
unification
* Timeout and cancellation now works in new solver
* Fixed false positive errors around 'table' and 'function' type
refinements
* Table literals now use covariant unification rules. This is sound
since literal has no type specified and has no aliases
* Fixed issues with blocked types escaping the constraint solver
* Fixed more places where error messages that should've been suppressed
were still reported
* Fixed errors when iterating over a top table type

In our native code generation (jit):
* 'DebugLuauAbortingChecks' flag is now supported on A64
* LOP_NEWCLOSURE has been translated to IR
2023-07-28 08:13:53 -07:00
vegorov-rbx
218159140c
Sync to upstream/release/584 (#977)
* Added support for async typechecking cancellation using a token passed
through frontend options
* Added luaC_enumheap for building debug tools that need a graph of Luau
heap

In our new typechecker:
* Errors or now suppressed when checking property lookup of
error-suppressing unions

In our native code generation (jit):
* Fixed unhandled value type in NOT_ANY lowering
* Fast-call tag checks will exit to VM on failure, instead of relying on
a native fallback
* Added vector type to the type information
* Eliminated redundant direct jumps across dead blocks
* Debugger APIs are now disabled for call frames executing natively
* Implemented support for unwind registration on macOS 14
2023-07-14 11:08:53 -07:00
Andy Friesen
e25de95445
Sync to upstream/release/583 (#974)
* Fixed indexing table intersections using `x["prop"]` syntax:
https://github.com/Roblox/luau/pull/971
* Add console output codepage for Windows:
https://github.com/Roblox/luau/pull/967
* Added `Frontend::parse` for a fast source graph preparation
* luau_load should check GC
* Work toward a type-diff system for nicer error messages

New Solver
* Correctly suppress errors in more cases
* Further improvements to typechecking of function calls and return
statements
* Crash fixes
* Propagate refinements drawn from the condition of a while loop into
the loop body

JIT
* Fix accidental bailout for math.frexp/modf/sign in A64
* Work toward bringing type annotation info in
* Do not propagate Luau IR constants of wrong type into load
instructions
* CHECK_SAFEENV exits to VM on failure
* Implement error handling in A64 reg allocator
* Inline the string.len builtin
* Do not enter native code of a function if arguments don’t match

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-07-07 13:10:48 -07:00
vegorov-rbx
76bea81a7b
Sync to upstream/release/582 (#960)
* Optimized operations like instantiation and module export for very
large types

In our new typechecker:
* Typechecking of function calls was rewritten to handle more cases
correctly
* Fixed a crash that can happen after self-referential type is exported
from a module
* Fixed a false positive error in string comparison
* Added handling of `for...in` variable type annotations and fixed
issues with the iterator call inside
* Self-referential 'hasProp' and 'setProp' constraints are now handled
correctly
 
In our native code generation (jit):
* Added '--target' argument to luau-compile to test multiple
architectures different from host architecture
* GC barrier tag check is skipped if type is already known to be
GC-collectable
* Added GET_TYPE/GET_TYPEOF instructions for type/typeof fast-calls
* Improved code size of interrupt handlers on X64
2023-06-23 23:19:39 -07:00
Andy Friesen
d458d240cd
Sync to upstream/release/581 (#958)
* Definition files can now ascribe indexers to class types.
(https://github.com/Roblox/luau/pull/949)
* Remove --compile support from the REPL. You can just use luau-compile
instead.
* When an exception is thrown during parallel typechecking (usually an
ICE), we now gracefully stop typechecking and drain active workers
before rethrowing the exception.

New solver

* Include more source location information when we hit an internal
compiler error
* Improve the logic that simplifies intersections of tables

JIT

* Save testable type annotations to bytecode
* Improve block placement for linearized blocks
* Add support for lea reg, [rip+offset] for labels
* Unify X64 and A64 codegen for RETURN
* Outline interrupt handlers for X64
* Remove global rArgN in favor of build.abi
* Change A64 INTERRUPT lowering to match X64

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-06-16 10:35:18 -07:00
vegorov-rbx
3ecd3a82ab
Sync to upstream/release/580 (#951)
* Added luau-compile executable target to build/test compilation without
having full REPL included.

In our new typechecker:
* Fixed the order in which constraints are checked to get more
deterministic errors in different environments
* Fixed `isNumber`/`isString` checks to fix false positive errors in
binary comparisons
* CannotCallNonFunction error is reported when calling an intersection
type of non-functions
 
In our native code generation (jit):
* Outlined X64 return instruction code to improve code size
* Improved performance of return instruction on A64
* Added construction of a dominator tree for future optimizations
2023-06-09 10:08:00 -07:00
Andy Friesen
63679f7288
Sync to upstream/release/579 (#943)
A pretty small changelist this week:

* When type inference fails to find any matching overload for a
function, we were declining to commit any changes to the type graph at
all. This was resulting in confusing type errors in certain cases. Now,
when a matching overload cannot be found, we always commit to the first
overload we tried.

JIT

* Fix missing variadic register invalidation in FALLBACK_GETVARARGS
* Add a missing null pointer check for the result of luaT_gettm

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-06-02 12:52:15 -07:00
vegorov-rbx
271c509046
Sync to upstream/release/578 (#939)
* Fixed gcc warning about uninitialized `std::optional`
* Fixed inlining of functions when they are used to compute their own
arguments

In the new type solver:
* Type families that are not part of a function signature cannot be
resolved at instantiation time and will now produce an error. This will
be relaxed in the future when we get constraint clauses on function
signatures (internally)
* `never` type is now comparable
* Improved typechecking of `for..in` statements
* Fixed checks for number type in `Add` type family
* Performance was improved, with particularly large gains on large
projects

And in native code generation (jit):
* We eliminated the call instruction overhead when native code support
is enabled in the VM
* Small optimizations to arm64 lowering
* Reworked LOP_GETIMPORT handling to reduce assembly code size
* Fixed non-deterministic binary output
* Fixed bad code generation caused by incorrect SSA to VM register links
invalidation

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
2023-05-25 14:36:34 -07:00
Andy Friesen
721f6e10fb
Sync to upstream/release/577 (#934)
Lots of things going on this week:

* Fix a crash that could occur in the presence of a cyclic union. We
shouldn't be creating cyclic unions, but we shouldn't be crashing when
they arise either.
* Minor cleanup of `luau_precall`
* Internal change to make L->top handling slightly more uniform
* Optimize SETGLOBAL & GETGLOBAL fallback C functions.
* https://github.com/Roblox/luau/pull/929
* The syntax to the `luau-reduce` commandline tool has changed. It now
accepts a script, a command to execute, and an error to search for. It
no longer automatically passes the script to the command which makes it
a lot more flexible. Also be warned that it edits the script it is
passed **in place**. Do not point it at something that is not in source
control!

New solver

* Switch to a greedier but more fallible algorithm for simplifying union
and intersection types that are created as part of refinement
calculation. This has much better and more predictable performance.
* Fix a constraint cycle in recursive function calls.
* Much improved inference of binary addition. Functions like `function
add(x, y) return x + y end` can now be inferred without annotations. We
also accurately typecheck calls to functions like this.
* Many small bugfixes surrounding things like table indexers
* Add support for indexers on class types. This was previously added to
the old solver; we now add it to the new one for feature parity.

JIT

* https://github.com/Roblox/luau/pull/931
* Fuse key.value and key.tt loads for CEHCK_SLOT_MATCH in A64
* Implement remaining aliases of BFM for A64
* Implement new callinfo flag for A64
* Add instruction simplification for int->num->int conversion chains
* Don't even load execdata for X64 calls
* Treat opcode fallbacks the same as manually written fallbacks

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-05-19 12:37:30 -07:00
Alex Orlenko
da0458bf6e
Add CodeGen C API (#931)
I'd like to experiment with the codegen feature (currently experimental)
and need a public C API for this.

This PR addresses this.
2023-05-18 04:03:29 -07:00
vegorov-rbx
97965c7c0a
Sync to upstream/release/576 (#928)
* `ClassType` can now have an indexer defined on it. This allows custom
types to be used in `t[x]` expressions.
* Fixed search for closest executable breakpoint line. Previously,
breakpoints might have been skipped in `else` blocks at the end of a
function
* Fixed how unification is performed for two optional types `a? <: b?`,
previously it might have unified either 'a' or 'b' with 'nil'. Note that
this fix is not enabled by default yet (see the list in
`ExperimentalFlags.h`)

In the new type solver, a concept of 'Type Families' has been
introduced.
Type families can be thought of as type aliases with custom type
inference/reduction logic included with them.
For example, we can have an `Add<T, U>` type family that will resolve
the type that is the result of adding two values together.
This will help type inference to figure out what 'T' and 'U' might be
when explicit type annotations are not provided.
In this update we don't define any type families, but they will be added
in the near future.
It is also possible for Luau embedders to define their own type families
in the global/environment scope.

Other changes include:
* Fixed scope used to find out which generic types should be included in
the function generic type list
* Fixed a crash after cyclic bound types were created during unification

And in native code generation (jit):
* Use of arm64 target on M1 now requires macOS 13
* Entry into native code has been optimized. This is especially
important for coroutine call/pcall performance as they involve going
through a C call frame
* LOP_LOADK(X) translation into IR has been improved to enable type
tag/constant propagation
* arm64 can use integer immediate values to synthesize floating-point
values
* x64 assembler removes duplicate 64bit numbers from the data section to
save space
* Linux `perf` can now be used to profile native Luau code (when running
with --codegen-perf CLI argument)
2023-05-12 10:50:47 -07:00
Andy Friesen
8453570658
Sync to upstream/release/575 (#919)
* `Luau.Analyze.CLI` now has experimental support for concurrent type
checking. Use the option `-jN` where `N` is the number of threads to
spawn.
* Improve typechecking performance by ~17% by making the function
`Luau::follow` much more efficient.
* Tighten up the type of `os.date`
* Removed `ParseOptions::allowTypeAnnotations` and
`ParseOptions::supportContinueStatement`

New solver

* Improve the reliability of function overload resolution
* More work toward supporting parallel type checking
* Fix a bug in inference of `==` and `~=` which would erroneously infer
that the operands were `boolean`
* Better error reporting when `for...in` loops are used incorrectly.

CodeGen

* Fix unwind registration when libunwind is used on Linux
* Fixed replaced IR instruction use count
* Convert X64 unwind info generation to standard prologue
* Implement A64 unwind info support for Dwarf2
* Live in/out data for linear blocks is now created
* Add side-exit VM register requirements to the IR dump
* Reuse ConstPropState between block chains 
* Remove redundant base update

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-05-05 14:52:49 -07:00
vegorov-rbx
4b267aa5c5
Sync to upstream/release/574 (#910)
* Added a limit on how many instructions the Compiler can safely produce
(reported by @TheGreatSageEqualToHeaven)

C++ API Changes:
* With work started on read-only and write-only properties,
`Property::type` member variable has been replaced with `TypeId type()`
and `setType(TypeId)` functions.
* New `LazyType` unwrap callback now has a `void` return type, all
that's required from the callback is to write into `unwrapped` field.

In our work on the new type solver, the following issues were fixed:

* Work has started to support https://github.com/Roblox/luau/pull/77 and
https://github.com/Roblox/luau/pull/79
* Refinements are no longer applied on l-values, removing some
false-positive errors
* Improved overload resolution against expected result type
* `Frontend::prepareModuleScope` now works in the new solver
* Cofinite strings are now comparable

And these are the changes in native code generation (JIT):

* Fixed MIN_NUM and MAX_NUM constant fold when one of the arguments is
NaN
* Added constant folding for number conversions and bit operations
* Value spilling and rematerialization is now supported on arm64
* Improved FASTCALL2K IR generation to support second argument constant
* Added value numbering and load/store propagation optimizations
* Added STORE_VECTOR on arm64, completing the IR lowering on this target
2023-04-28 12:55:13 -07:00
Andy Friesen
fe7621ee8c
Sync to upstream/release/573 (#903)
* Work toward affording parallel type checking
* The interface to `LazyType` has changed:
* `LazyType` now takes a second callback that is passed the `LazyType&`
itself. This new callback is responsible for populating the field
`TypeId LazyType::unwrapped`. Multithreaded implementations should
acquire a lock in this callback.
* Modules now retain their `humanReadableNames`. This reduces the number
of cases where type checking has to call back to a `ModuleResolver`.
* https://github.com/Roblox/luau/pull/902
* Add timing info to the Luau REPL compilation output

We've also fixed some bugs and crashes in the new solver as we march
toward readiness.
* Thread ICEs (Internal Compiler Errors) back to the Frontend properly
* Refinements are no longer applied to lvalues
* More miscellaneous stability improvements

Lots of activity in the new JIT engine:

* Implement register spilling/restore for A64
* Correct Luau IR value restore location tracking
* Fixed use-after-free in x86 register allocator spill restore
* Use btz for bit tests
* Finish branch assembly support for A64
* Codesize and performance improvements for A64
* The bit32 library has been implemented for arm and x64

---------

Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-04-21 15:14:26 -07:00