Instead of doing the dot product related math in scalar IR, we lift the
computation into a dedicated IR instruction.
On x64, we can use VDPPS which was more or less tailor made for this
purpose. This is better than manual scalar lowering that requires
reloading components from memory; it's not always a strict improvement
over the shuffle+add version (which we never had), but this can now be
adjusted in the IR lowering in an optimal fashion (maybe even based on
CPU vendor, although that'd create issues for offline compilation).
On A64, we can either use naive adds or paired adds, as there is no
dedicated vector-wide horizontal instruction until SVE. Both run at
about the same performance on M2, but paired adds require fewer
instructions and temporaries.
I've measured this using mesh-normal-vector benchmark, changing the
benchmark to just report the time of the second loop inside
`calculate_normals`, testing master vs #1504 vs this PR, also increasing
the grid size to 400 for more stable timings.
On Zen 4 (7950X), this PR is comfortably ~8% faster vs master, while I
see neutral to negative results in #1504.
On M2 (base), this PR is ~28% faster vs master, while #1504 is only
about ~10% faster.
If I measure the second loop in `calculate_tangent_space` instead, I
get:
On Zen 4 (7950X), this PR is ~12% faster vs master, while #1504 is ~3%
faster
On M2 (base), this PR is ~24% faster vs master, while #1504 is only
about ~13% faster.
Note that the loops in question are not quite optimal, as they store and
reload various vectors to dictionary values due to inappropriate use of
locals. The underlying gains in individual functions are thus larger
than the numbers above; for example, changing the `calculate_normals`
loop to use a local variable to store the normalized vector (but still
saving the result to dictionary value), I get a ~24% performance
increase from this PR on Zen4 vs master instead of just 8% (#1504 is
~15% slower in this setup).
### What's new
* Fixed many of the false positive errors in indexing of table unions
and table intersections
* It is now possible to run custom checks over Luau AST during
typechecking by setting `customModuleCheck` in `FrontendOptions`
* Fixed codegen issue on arm, where number->vector cast could corrupt
that number value for the next time it's read
### New Solver
* `error` type now behaves as the bottom type during subtyping checks
* Fixed the scope that is used in subtyping with generic types
* Fixed `astOriginalCallTypes` table often used by LSP to match the old
solver
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed?
- Code refactoring with a new clang-format
- More bug fixes / test case fixes in the new solver
## New Solver
- More precise telemetry collection of `any` types
- Simplification of two completely disjoint tables combines them into a
single table that inherits all properties / indexers
- Refining a `never & <anything>` does not produce type family types nor
constraints
- Silence "inference failed to complete" error when it is the only error
reported
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Dibri Nsofor <dnsofor@roblox.com>
Co-authored-by: Jeremy Yoo <jyoo@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed?
- Mostly stability and bugfixes with the new solver.
## New Solver
- Typechecking with the new solver should respect the no-check hot
comment.
- Record type alias locations and property locations of table
assignments
- Maintain location information for exported table types
- Stability fixes for normalization
- Report internal constraint solver errors.
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed?
- Fix#1137 by appropriately retaining additional metadata from
definition files throughout the type system.
- Improve Frontend for LSPs by appropriately allowing the cancellation
of typechecking while running its destructor.
## New Solver
- Added support for the `rawget` type function.
- Reduced overall static memory usage of builtin type functions.
- Fixed a crash where visitors could mutate a union or intersection type
and fail to invalidate iteration over them in doing so.
- Revised autocomplete functionality to not rely on a separate run of
the type solver when using the new solver.
- Implemented a more relaxed semantic rule for casting.
- Fixed some smaller crashes in the new solver.
## Native Code Generation
- Add additional codegen specialization for `math.sign`
- Cleaned up a large number of outstanding fflags in the code.
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Jeremy Yoo <jyoo@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
### What's new
* Added lint warning for using redundant `@native` attributes on
functions inside a `--!native` module
* Improved typechecking speed in old solver for modules with large types
### New Solver
* Fixed the length type function sealing the table prematurely
* Fixed crashes caused by general table indexing expressions
### VM
* Added support for a specialized 3-argument fast-call instruction to
improve performance of `vector` constructor, `buffer` writes and a few
`bit32` methods
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
### What's new
* A bug in exception handling in GCC(11/12/13) on MacOS prevents our
test suite from running.
* Parser now supports leading `|` or `&` when declaring `Union` and
`Intersection` types (#1286)
* We now support parsing of attributes on functions as described in the
[rfc](https://github.com/luau-lang/rfcs/pull/30)
* With this change, expressions such as `local x = @native function(x)
return x+1 end` and `f(@native function(x) return x+1 end)` are now
valid.
* Added support for `@native` attribute - we can now force native
compilation of individual functions if the `@native` attribute is
specified before the `function` keyword (works for lambdas too).
### New Solver
* Many fixes in the new solver for crashes and instability
* Refinements now use simplification and not normalization in a specific
case of two tables
* Assume that compound assignments do not change the type of the
left-side operand
* Fix error that prevented Class Methods from being overloaded
### VM
* Updated description of Garbage Collector invariant
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
### What's new
* Implemented parsing logic for attributes
* Added `lua_setuserdatametatable` and `lua_getuserdatametatable` C API
methods for a faster userdata metatable fetch compared to
`luaL_getmetatable`. Note that metatable reference has to still be
pinned in memory!
### New Solver
* Further improvement to the assignment inference logic
* Fix many bugs surrounding constraint dispatch order
### Native Codegen
* Add IR lowering hooks for custom host userdata types
* Add IR to create new tagged userdata objects
* Remove outdated NativeState
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
### What's new?
* Remove a case of unsound `table.move` optimization
* Add Luau stack slot reservations that were missing in REPL (fixes
#1273)
### New Type Solver
* Assignments have been completely reworked to fix a case of cyclic
constraint dependency
* When indexing, if the fresh type's upper bound already contains a
compatible indexer, do not add another upper bound
* Distribute type arguments over all type families sans `eq`, `keyof`,
`rawkeyof`, and other internal type families
* Fix a case where `buffers` component weren't read in two places (fixes
#1267)
* Fix a case where things that constitutes a strong ref were slightly
incorrect
* Fix a case where constraint dependencies weren't setup wrt `for ...
in` statement
### Native Codegen
* Fix an optimization that splits TValue store only when its value and
its tag are compatible
* Implement a system to plug additional type information for custom host
userdata types
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
### What's new?
* Removed new `table.move` optimization because of correctness problems.
### New Type Solver
* Improved error messages for type families to describe what's wrong in
more detail, and ideally without using the term `type family` at all.
* Change `boolean` and `string` singletons in type checking to report
errors to the user when they've gotten an impossible type (indicating a
type error from their context).
* Split debugging flags for type family reduction
(`DebugLuauLogTypeFamilies`) from general solver logging
(`DebugLuauLogSolver`).
* Improve type simplification to support patterns like `(number |
string) | (string | number)` becoming `number | string`.
### Native Code Generation
* Use templated `luaV_doarith` to speedup vector operation fallbacks.
* Various small changes to better support arm64 on Windows.
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
### New Type Solver
* Fixed crash in numeric binary operation type families
* Results of an indexing operation are now comparable to `nil` without a
false positive error
* Fixed a crash when a type that failed normalization was accessed
* Iterating on a free value now implies that it is iterable
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
# What's changed?
* Optimize table.maxn. This function is now 5-14x faster
* Reserve Luau stack space for error message.
## New Solver
* Globals can be type-stated, but only if they are already in scope
* Fix a stack overflow that could occur when normalizing certain kinds
of recursive unions of intersections (of unions of intersections...)
* Fix an assertion failure that would trigger when the __iter metamethod
has a bad signature
## Native Codegen
* Type propagation and temporary register type hints
* Direct vector property access should only happen for names of right
length
* BytecodeAnalysis will only predict that some of the vector value
fields are numbers
---
## Internal Contributors
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's changed?
### New Type Solver
- Unification of two fresh types no longer binds them together.
- Replaced uses of raw `emplace` with `emplaceType` to catch cyclic
bound types when they are created.
- `SetIndexerConstraint` is blocked until the indexer result type is not
blocked.
- Fix a case where a blocked type got past the constraint solver.
- Searching for free types should no longer traverse into `ClassType`s.
- Fix a corner case that could result in the non-testable type `~{}`.
- Fix incorrect flagging when `any` was a parameter of some checked
function in nonstrict type checker.
- `IterableConstraint` now consider tables without `__iter` to be
iterables.
### Native Code Generation
- Improve register type info lookup by program counter.
- Generate type information for locals and upvalues
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's changed?
* Support for new 'require by string' RFC with relative paths and
aliases in now enabled in Luau REPL application
### New Type Solver
* Fixed assertion failure on generic table keys (`[expr] = value`)
* Fixed an issue with type substitution traversing into the substituted
parts during type instantiation
* Fixed crash in union simplification when that union contained
uninhabited unions and other types inside
* Union types in binary type families like `add<a | b, c>` are expanded
into `add<a, c> | add<b, c>` to handle
* Added handling for type family solving creating new type families
* Fixed a bug with normalization operation caching types with unsolved
parts
* Tables with uninhabited properties are now simplified to `never`
* Fixed failures found by fuzzer
### Native Code Generation
* Added support for shared code generation between multiple Luau VM
instances
* Fixed issue in load-store propagation and new tagged LOAD_TVALUE
instructions
* Fixed issues with partial register dead store elimination causing
failures in GC assists
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed
## New Type Solver
- Many more fixes to crashes, assertions, and hangs
- Annotated locals now countermand the inferred types of locals, meaning
that for a type `type MyType = number | string`, `local foo : MyType =
5` behaves the same as `local foo = 5 :: MyType`, where before, foo
would be assigned the type of the value on the rhs.
- Type Normalization now respects resource limits.
- Subtyping between classes and cyclic tables now supported
## Native Code Generation
- Work on the Native Code Generation(NCG) allocator continues
---
# Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed
## New Type Solver
- Many fixes to crashes, assertions, and hangs
- Binary type family aliases now have a default parameter
- Added a debug check for unsolved types escaping the constraint solver
- Overloaded functions are no longer inferred
- Unification creates additional subtyping constraints for blocked types
- Attempt to guess the result type for type families that are too large
to resolve timely
## Native Code Generation
- Fixed `IrCmd::CHECK_TRUTHY` lowering in a specific case
- Detailed compilation errors are now supported
- More work on the new allocator
---
# Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: James McNellis <jmcnellis@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
# What's changed
### Debugger
* Values after a 'continue' statement should not be accessible by
debugger in the 'until' condition
### New Type Solver
* Many fixes to crashes and hangs
* Better bidirectional inference of table literal expressions
### Native Code Generation
* Initial steps toward a shared code allocator
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed
* Fix a case where the stack wasn't completely cleaned up where
`debug.info` errored when passed `"f"` option and a thread.
* Fix a case of uninitialized field in `luaF_newproto`.
### New Type Solver
* When a local is captured in a function, don't add a new entry to the
`DfgScope::bindings` if the capture occurs within a loop.
* Fix a poor performance characteristic during unification by not trying
to simplify an intersection.
* Fix a case of multiple constraints mutating the same blocked type
causing incorrect inferences.
* Fix a case of assertion failure when overload resolution encounters a
return typepack mismatch.
* When refining a property of the top `table` type, we no longer signal
an unknown property error.
* Fix a misuse of free types when trying to infer the type of a
subscript expression.
* Fix a case of assertion failure when trying to resolve an overload
from `never`.
### Native Code Generation
* Fix dead store optimization issues caused by partial stores.
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
When lowering LOADK for booleans/numbers/nils, we deconstruct the
operation using STORE_TAG which informs the rest of the optimization
pipeline about the tag of the value. This is helpful to remove various
tag checks.
When the constant is a string or a vector, we just use
LOAD_TVALUE/STORE_TVALUE. For strings, this could be replaced by pointer
load/store, but for vectors there's no great alternative using current
IR ops; in either case, the optimization needs to be carefully examined
for profitability as simply copying constants into registers for
function calls could become more expensive.
However, there are cases where it's still valuable to preserve the tag.
For vectors, doing any math with vector constants contains tag checks
that could be removed. For both strings and vectors, storing them into a
table has a barrier that for vectors could be elided, and for strings
could be simplified as there's no need to confirm the tag.
With this change we now carry the optional tag of the value with
LOAD_TVALUE. This has no performance effect on existing benchmarks but
does reduce the generated code for benchmarks by ~0.1%, and it makes
vector code more efficient (~5% lift on X64 log1p approximation).
When the input is a constant, we use a fairly inefficient sequence of
fmov+fcvt+dup or, when the double isn't encodable in fmov,
adr+ldr+fcvt+dup.
Instead, we can use the same lowering as X64 when the input is a
constant, and load the vector from memory. However, if the constant is
encodable via fmov, we can use a vector fmov instead (which is just one
instruction and doesn't need constant space).
Fortunately the bit encoding of fmov for 32-bit floating point numbers
matches that of 64-bit: the decoding algorithm is a little different
because it expands into a larger exponent, but the values are
compatible, so if a double can be encoded into a scalar fmov with a
given abcdefgh pattern, the same pattern should encode the same float;
due to the very limited number of mantissa and exponent bits, all values
that are encodable are also exact in both 32-bit and 64-bit floats.
This strategy is ~same as what gcc uses. For complex vectors, we
previously used 4 instructions and 8 bytes of constant storage, and now
we use 2 instructions and 16 bytes of constant storage, so the memory
footprint is the same; for simple vectors we just need 1 instruction (4
bytes).
clang lowers vector constants a little differently, opting to synthesize
a 64-bit integer using 4 instructions (mov/movk) and then move it to the
vector register - this requires 5 instructions and 20 bytes, vs ours/gcc
2 instructions and 8+16=24 bytes. I tried a simpler version of this that
would be more compact - synthesize a 32-bit integer constant with
mov+movk, and move it to vector register via dup.4s - but this was a
little slower on M2, so for now we prefer the slightly larger version as
it's not a regression vs current implementation.
On the vector approximation benchmark we get:
- Before this PR (flag=false): ~7.85 ns/op
- After this PR (flag=true): ~7.74 ns/op
- After this PR, with 0.125 instead of 0.123 in the benchmark code (to
use fmov): ~7.52 ns/op
- Not part of this PR, but the mov/dup strategy described above: ~8.00
ns/op
# What's Changed
* Add a compiler hint to improve Luau memory allocation inlining
### New Type Solver
* Added a system for recommending explicit type annotations to users in
cases where we've inferred complex generic types with type families.
* Marked string library functions as `@checked` for use in new
non-strict mode.
* Fixed a bug with new non-strict mode where we would incorrectly report
arity mismatches when missing optional arguments.
* Implement an occurs check for unifications that would produce
self-recursive types.
* Fix bug where overload resolution would fail when applied to
non-overloaded functions.
* Fix bug that caused the subtyping to report an error whenever a
generic was instantiated in an invariant context.
* Fix crash caused by `SetPropConstraint` not blocking properly.
### Native Code Generation
* Implement optimization to eliminate dead stores
* Optimize vector ops for X64 when the source is computed (thanks,
@zeux!)
* Use more efficient lowering for UNM_* (thanks, @zeux!)
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Instead of patching the tag component with TVECTOR in every instruction
that produces a vector value, we now use a separate IR instruction to do
this. This reduces implementation redundancy, but more importantly
allows for a class of optimizations:
- NUM_TO_VECTOR previously patched the component unconditionally but the
result was used only in MUL/DIV_VEC instructions that ignore it anyway;
we can now remove this.
- ADD_VEC et al can now forward the source of TAG_VECTOR instruction of
either input; this shortens the latency chain and in the future could
allow us to generate optimal vector instruction sequence once the
temporary stores are marked as dead.
- In the future on X64, ADD_VEC et al will be able to analyze the input
instruction and remove tag masking conditionally. This is not part of
this PR as it requires a decision around expected FP environment and/or
the necessity of the existing masking to begin with.
I've also renamed NUM_TO_VECTOR to NUM_TO_VEC so that "VEC" always
refers to "3 float values" and for consistency with ADD/etc.
Note: ADD_VEC input forwarding is currently performed unconditionally;
it may or may not increase the spills that can't be reloaded from the
stack.
On A64 this makes the Taylor series computation a tiny bit faster
(11.3ns => 11.0ns) as it removes the redundant ins instructions along
the NUM_TO_VEC path. Curiously, the optimization of forwarding
TAG_VECTOR input to arithmetic instructions actually has a small penalty
as without it this PR runs at 10.9 ns. I don't know if this is a
property of the benchmark though, as I just noticed that in this
benchmark type inference actually fails to infer parts of the
computation as a vector op. If desired I will happily omit this part of
the change and we can explore that separately.
This change replaces scalar versions of vector opcodes for A64 with
actual vector instructions.
We take the approach similar to X64: patch last component with zero,
perform the math, patch last component with type tag. I'm hoping that in
the future the type tag will be placed separately (separate IR opcode?)
because right now chains of math operations result in excessive type tag
operations.
To patch the type tag without always keeping a mask in a register,
ins.4s instructions can be used; unfortunately it's only capable of
patching a register in-place, so we need an extra register copy in case
it's not last-use. Usually it's last-use so the patch is free; probably
with IR rework mentioned above all of this can be improved (e.g.
load-with-patch will never need to copy).
~It's not 100% clear if we *have* to patch type tag: Apple does preserve
denormals but we'd need to benchmark this to see if there's an actual
performance impact. But for now we're playing it safe.~
This was tested by running the conformance tests, and new opcode
implementations were checked by comparing the result with
https://armconverter.com/.
Performance testing is complicated by the fact that OSS Luau doesn't
support vector constructor out of the box, and other limitations of
codegen. I've hacked vector constructor/type into REPL and confirmed
that on a test that calls this function in a loop (not inlined):
```
function fma(a: vector, b: vector, c: vector)
return a * b + c
end
```
... this PR improves performance by ~6% (note that probably most of the
overhead here is the call dispatch; I didn't want to brave testing a
more complex expression). The assembly for an individual operation
changes as follows:
Before:
```
# %14 = MUL_VEC %12, %13 ; useCount: 2, lastUse: %22
dup s29,v31.s[0]
dup s28,v30.s[0]
fmul s29,s29,s28
ins v31.s[0],v29.s[0]
dup s29,v31.s[1]
dup s28,v30.s[1]
fmul s29,s29,s28
ins v31.s[1],v29.s[0]
dup s29,v31.s[2]
dup s28,v30.s[2]
fmul s29,s29,s28
ins v31.s[2],v29.s[0]
```
After:
```
# %14 = MUL_VEC %12, %13 ; useCount: 2, lastUse: %22
ins v31.s[3],w31
ins v30.s[3],w31
fmul v31.4s,v31.4s,v30.4s
movz w17,#4
ins v31.s[3],w17
```
**edit** final form (see comments):
```
# %14 = MUL_VEC %12, %13 ; useCount: 2, lastUse: %22
fmul v31.4s,v31.4s,v30.4s
movz w17,#4
ins v31.s[3],w17
```
# What's changed?
* Compiler now targets bytecode version 5 by default, this includes
support for vector type literals and sub/div opcodes with a constant on
lhs
### New Type Solver
* Normalizer type inhabitance check has been optimized
* Added ability to reduce cyclic `and`/`or` type families
### Native Code Generation
* `CodeGen::compile` now returns more specific causes of a code
generation failure
* Fixed linking issues on platforms that don't support unwind frame data
registration
---
### Internal Contributors
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
# What's changed?
* Check interrupt handler inside the pattern match engine to eliminate
potential for programs to hang during string library function execution.
* Allow iteration over table properties to pass the old type solver.
### Native Code Generation
* Use in-place memory operands for math library operations on x64.
* Replace opaque bools with separate enum classes in IrDump to improve
code maintainability.
* Translate operations on inferred vectors to IR.
* Enable support for debugging native-compiled functions in Roblox
Studio.
### New Type Solver
* Rework type inference for boolean and string literals to introduce
bounded free types (bounded below by the singleton type, and above by
the primitive type) and reworked primitive type constraint to decide
which is the appropriate type for the literal.
* Introduce `FunctionCheckConstraint` to handle bidirectional
typechecking for function calls, pushing the expected parameter types
from the function onto the arguments.
* Introduce `union` and `intersect` type families to compute deferred
simplified unions and intersections to be employed by the constraint
generation logic in the new solver.
* Implement support for expanding the domain of local types in
`Unifier2`.
* Rework type inference for iteration variables bound by for in loops to
use local types.
* Change constraint blocking logic to use a set to prevent accidental
re-blocking.
* Add logic to detect missing return statements in functions.
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Vighnesh <vvijay@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
### What's changed?
* Syntax for [read-only and write-only
properties](https://github.com/luau-lang/rfcs/pull/15) is now parsed,
but is not yet supported in typechecking
### New Type Solver
* `keyof` and `rawkeyof` type operators have been updated to match final
text of the [RFC](https://github.com/luau-lang/rfcs/pull/16)
* Fixed issues with cyclic type families that were generated for mutable
loop variables
### Native Code Generation
* Fixed inference for number / vector operation that caused an
unnecessary VM assist
---
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# Old Solver:
- Fix a bug in the old solver where a user could use the keyword
`typeof` as the name of a type alias.
- Fix stringification of scientific notation to omit a trailing decimal
place when not followed by a digit e.g. `1.e+20` -> `1e+20`
# New Solver
- Continuing work on the New non-strict mode
- Introduce `keyof` and `rawkeyof` type function for acquiring the type
of all keys in a table or class
(https://github.com/luau-lang/rfcs/pull/16)
---
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
- Implemented [Require by String with Relative
Paths](https://github.com/luau-lang/rfcs/blob/master/docs/new-require-by-string-semantics.md)
RFC
- Implemented [Require by String with
Aliases](https://github.com/luau-lang/rfcs/blob/master/docs/require-by-string-aliases.md)
RFC with support for `paths` and `alias` arrays in .luarc
- Added SUBRK and DIVRK bytecode instructions to speed up
constant-number and constant/number operations
- Added `--vector-lib`, `--vector-ctor` and `--vector-type` options to
luau-compile to support code with vectors
New Solver
- Correctness fixes to subtyping
- Improvements to dataflow analysis
Native Code Generation
- Added bytecode analysis pass to predict type tags used in operations
- Fixed rare cases of numerical loops being generated without an
interrupt instruction
- Restored optimization data propagation into the linear block
- Duplicate buffer length checks are optimized away
Miscellaneous
- Small performance improvements to new non-strict mode
- Introduced more scripts for fuzzing Luau and processing the results,
including fuzzer build support for CMake
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Right now, we can compile R\*K for all arithmetic instructions, but K\*R
gets compiled into two instructions (LOADN/LOADK + arithmetic opcode).
This is problematic since it leads to reduced performance for some code.
However, we'd like to avoid adding reverse variants of ADDK et al for
all opcodes to avoid the increase in I$ footprint for interpreter.
Looking at the arithmetic instructions, % and // don't have interesting
use cases for K\*V; ^ is sometimes used with constant on the left hand
side but this would need to call pow() by necessity in all cases so it
would be slow regardless of the dispatch overhead. This leaves the four
basic arithmetic operations.
For + and \*, we can implement a compiler-side optimization in the
future that transforms K\*R to R\*K automatically. This could either be
done unconditionally at -O2, or conditionally based on the type of the
value (driven by type annotations / inference) -- this technically
changes behavior in presence of metamethods, although it might be
sensible to just always do this because non-commutative +/* are evil.
However, for - and / it is impossible for the compiler to optimize this
in the future, so we need dedicated opcodes. This only increases the
interpreter size by ~300 bytes (~1.5%) on X64.
This makes spectral-norm and math-partial-sums 6% faster; maybe more
importantly, voxelgen gets 1.5% faster (so this change does have
real-world impact).
To avoid the proliferation of bytecode versions this change piggybacks
on the bytecode version bump that was just made in 604 for vector
constants; we would still be able to enable these independently but
we'll consider v5 complete when both are enabled.
Related: #626
---------
Co-authored-by: vegorov-rbx <75688451+vegorov-rbx@users.noreply.github.com>
New Solver
* New algorithm for inferring the types of locals that have no
annotations. This
algorithm is very conservative by default, but is augmented with some
control
flow awareness to handle most common scenarios.
* Fix bugs in type inference of tables
* Improve performance of by switching out standard C++ containers for
`DenseHashMap`
* Infrastructure to support clearer error messages in strict mode
Native Code Generation
* Fix a lowering issue with buffer.writeu8 and 0x80-0xff values: A
constant
argument wasn't truncated to the target type range and that causes an
assertion failure in `build.mov`.
* Store full lightuserdata value in loop iteration protocol lowering
* Add analysis to compute function bytecode distribution
* This includes a class to analyze the bytecode operator distribution
per
function and a CLI tool that produces a JSON report. See the new cmake
target `Luau.Bytecode.CLI`
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's changed?
- Record the location of properties for table types (closes#802)
- Implement stricter UTF-8 validations as per the RFC
(https://github.com/luau-lang/rfcs/pull/1)
- Implement `buffer` as a new type in both the old and new solvers.
- Changed errors produced by some `buffer` builtins to be a bit more
generic to avoid platform-dependent error messages.
- Fixed a bug where `Unifier` would copy some persistent types, tripping
some internal assertions.
- Type checking rules on relational operators is now a little bit more
lax.
- Improve dead code elimination for some `if` statements with complex
always-false conditions
## New type solver
- Dataflow analysis now generates phi nodes on exit of branches.
- Dataflow analysis avoids producing a new definition for locals or
properties that are not owned by that loop.
- If a function parameter has been constrained to `never`, report errors
at all uses of that parameter within that function.
- Switch to using the new `Luau::Set` to replace `std::unordered_set` to
alleviate some poor allocation characteristics which was negatively
affecting overall performance.
- Subtyping can now report many failing reasons instead of just the
first one that we happened to find during the test.
- Subtyping now also report reasons for type pack mismatches.
- When visiting `if` statements or expressions, the resulting context
are the common terms in both branches.
## Native codegen
- Implement support for `buffer` builtins to its IR for x64 and A64.
- Optimized `table.insert` by not inserting a table barrier if it is
fastcalled with a constant.
## Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Arseny Kapoulkine <arseny@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's changed?
* Fixed a bug in type cloning by maintaining persistent types.
* We now parse imprecise integer literals to report the imprecision as a
warning to developers.
* Add a compiler flag to specify the name of the statistics output file.
### New type solver
* Renamed `ConstraintGraphBuilder` to `ConstraintGenerator`
* LValues now take into account the type being assigned during
constraint generation.
* Normalization performance has been improved by 33% by replacing the an
internal usage of `std::unordered_set` with `DenseHashMap`.
* Normalization now has a helper to identify types that are equivalent
to `unknown`, which is being used to fix some bugs in subtyping.
* Uses of the old unifier in the new type solver have been eliminated.
* Improved error explanations for subtyping errors in `TypeChecker2`.
### Native code generation
* Expanded some of the statistics recorded during compilation to include
the number of instructions and blocks.
* Introduce instruction and block count limiters for controlling what
bytecode is translated into native code.
* Implement code generation for byteswap instruction.
### Internal Contributors
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
### What's Changed
- Improve readability of unions and intersections by limiting the number
of elements of those types that can be presented on a single line (gated
under `FFlag::LuauToStringSimpleCompositeTypesSingleLine`)
- Adds a new option to the compiler `--record-stats` to record and
output compilation statistics
- `if...then...else` expressions are now optimized into `AND/OR` form
when possible.
### VM
- Add a new `buffer` type to Luau based on the [buffer
RFC](https://github.com/Roblox/luau/pull/739) and additional C API
functions to work with it; this release does not include the library.
- Internal C API to work with string buffers has been updated to align
with Lua version more closely
### Native Codegen
- Added support for new X64 instruction (rev) and new A64 instruction
(bswap) in the assembler
- Simplified the way numerical loop condition is translated to IR
### New Type Solver
- Operator inference now handled by type families
- Created a new system called `Type Paths` to explain why subtyping
tests fail in order to improve the quality of error messages.
- Systematic changes to implement Data Flow analysis in the new solver
(`Breadcrumb` removed and replaced with `RefinementKey`)
---
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
## What's Changed
- Improve POSIX compliance in `CLI/FileUtils.cpp` by @SamuraiCrow #1064
- `AstStat*::hasEnd` is deprecated; use `AstStatBlock::hasEnd` instead
- Added a lint for common misuses of the `#` operator
- Luau now issues deprecated diagnostics for some uses of `getfenv` and
`setfenv`
- Fixed a case where we included a trailing space in some error
stringifications
### Compiler
- Do not do further analysis in O2 on self functions
- Improve detection of invalid repeat..until expressions vs continue
### New Type Solver
- We now cache subtype test results to improve performance
- Improved operator inference mechanics (aka type families)
- Further work towards type states
- Work towards [new non-strict
mode](https://github.com/Roblox/luau/blob/master/rfcs/new-nonstrict.md)
continues
### Native Codegen
- Instruction last use locations should follow the order in which blocks
are lowered
- Add a bonus assertion to IrLoweringA64::tempAddr
---
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
* Include `windows.h` rather than `Windows.h` to make things compile on
MinGW.
* Custom implementation of timegm/os.time for all platforms
* Disable builtin constant folding when getfenv/setfenv are used
* Fixes https://github.com/Roblox/luau/issues/1042
* Fixes https://github.com/Roblox/luau/issues/1043
New Type Checker
* Initial work toward type states.
* Rework most overloadable operators to use type families.
* Initial work toward our new nonstrict mode.
Native Codegen
* Fix native code generation for dead loops
* Annotate top-level functions as cold
* Slightly smaller/faster x64 Luau calls
* emitInstCall used to not set savedpc itself, but now it does for
consistency with all other implementations
* Implement cmov support for X64
* Fix assertion in luau-compile when module is empty
* Optimize A64 calls at some code size cost
* Inline constant array index offset into the load/store instruction
* Increase x64 spill slots from 5 to 13
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
- Cleaned up `FFlag::FixFindBindingAtFunctionName`,
`FFlag::LuauNormalizeBlockedTypes`, `FFlag::LuauPCallDebuggerFix`
- Added support for break and continue into control flow analysis
- The old type unification engine will now report a more fine-grained
error at times, indicating that type normalization in particular failed
# New Type Solver
- Refactor of Unifier2, the new unification implementation for Luau
- Completed MVP of new unification implementation
- Dramatically simplified overload selection logic
- Type family reduction can now apply sooner to free types that have
been solved
- Subtyping now supports table indexers
- Generalization now replaces bad generics with unknown
# Native Code Generation
- Reduce stack spills caused by FINDUPVAL and STORE_TAG
- Improve Generate SHL/SHR/SAR/rotates with immediate operands in X64
- Removed redundant case re-check in table lookup fallback
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
* Rerun clang-format on the code
* Fix the variance on indexer result subtyping. This fixes some issues
with inconsistent error reporting.
* Fix a bug in the normalization logic for intersections of strings
New Type Solver
* New overload selection logic
* Subtype tests now correctly treat a generic as its upper bound within
that generic's scope
* Semantic subtyping for negation types
* Semantic subtyping between strings and compatible table types like
`{lower: (string) -> string}`
* Further work toward finalizing our new subtype test
* Correctly generalize module-scope symbols
Native Codegen
* Lowering statistics for assembly
* Make executable allocation size/limit configurable without a rebuild.
Use `FInt::LuauCodeGenBlockSize` and `FInt::LuauCodeGenMaxTotalSize`.
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
* Fixed `Frontend::markDirty` not working on modules that were not
typechecked yet
* Fixed generic variadic function unification succeeding when it should
have reported an error
New Type Solver:
* Implemented semantic subtyping check for function types
Native Code Generation:
* Improved performance of numerical loops with a constant step
* Simplified IR for `bit32.extract` calls extracting first/last bits
* Improved performance of NaN checks
- Updated Roblox copyright to 2023
- Floor division operator `//` (implements #832)
- Autocomplete now shows `end` within `do` blocks
- Restore BraceType when using `Lexer::lookahead` (fixes#1019)
# New typechecker
- Subtyping tests between metatables and tables
- Subtyping tests between string singletons and tables
- Subtyping tests for class types
# Native codegen
- Fixed macOS test failure (wrong spill restore offset)
- Fixed clobbering of non-volatile xmm registers on Windows
- Fixed wrong storage location of SSA reg spills
- Implemented A64 support for add/sub extended
- Eliminated zextReg from A64 lowering
- Remove identical table slot lookups
- Propagate values from predecessor into the linear block
- Disabled reuse slot optimization
- Keep `LuaNode::val` check for nil when optimizing `CHECK_SLOT_MATCH`
- Implemented IR translation of `table.insert` builtin
- Fixed mmap error handling on macOS/Linux
# Tooling
- Used `|` as a column separator instead of `+` in `bench.py`
- Added a `table.sort` micro-benchmark
- Switched `libprotobuf-mutator` to a less problematic version
* AST queries at position where function name is will now return
AstExprLocal
* Lexer performance has been slightly improved
* Fixed incorrect string singleton autocomplete suggestions (fixes#858)
* Improved parsing error messages
* Fixed crash on null pointer access in unification (fixes#1017)
* Native code support is enabled by default and `native=1`
(make)/`LUAU_NATIVE` (CMake)/`-DLUA_CUSTOM_EXECUTION` configuration is
no longer required
New typechecker:
* New subtyping check can now handle generic functions and tables
(including those that contain cycles)
Native code generation:
* Loops with non-numeric parameters are now handled by VM to streamline
native code
* Array size check can be optimized away in SETLIST
* On failure, CodeGen::compile returns a reason
* Fixed clobbering of non-volatile xmm registers on Windows
* Fix a use-after-free bug in the new type cloning algorithm
* Tighten up the type of `coroutine.wrap`. It is now `<A..., R...>(f:
(A...) -> R...) -> ((A...) -> R...)`
* Break `.luaurc` out into a separate library target `Luau.Config`. This
makes it easier for applications to reason about config files without
also depending on the type inference engine.
* Move typechecking limits into `FrontendOptions`. This allows embedders
more finely-grained control over autocomplete's internal time limits.
* Fix stability issue with debugger onprotectederror callback allowing
break in non-yieldable contexts
New solver:
* Initial work toward [Local Type
Inference](0e1082108f/rfcs/local-type-inference.md)
* Introduce a new subtyping test. This will be much nicer than the old
test because it is completely separate both from actual type inference
and from error reporting.
Native code generation:
* Added function to compute iterated dominance frontier
* Optimize barriers in SET_UPVALUE when tag is known
* Cache lua_State::global in a register on A64
* Optimize constant stores in A64 lowering
* Track table array size state to optimize array size checks
* Add split tag/value store into a VM register
* Check that spills can outlive the block only in specific conditions
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
* Better indentation in multi-line type mismatch error messages
* Error message clone can no longer cause a stack overflow (when
typechecking with retainFullTypeGraphs set to false); fixes
https://github.com/Roblox/luau/issues/975
* `string.format` with %s is now ~2x faster on strings smaller than 100
characters
Native code generation:
* All VM side exits will block return to the native execution of the
current function to preserve correctness
* Fixed executable page allocation on Apple platforms when using
hardened runtime
* Added statistics for code generation (no. of functions compiler,
memory used for different areas)
* Fixed issue with function entry type checks performed more that once
in some functions
* Progress toward a diffing algorithm for types. We hope that this will
be useful for writing clearer error messages.
* Add a missing recursion limiter in `Unifier::tryUnifyTables`. This was
causing a crash in certain situations.
* Luau heap graph enumeration improvements:
* Weak references are not reported
* Added tag as a fallback name of non-string table links
* Included top Luau function information in thread name to understand
where thread might be suspended
* Constant folding for `math.pi` and `math.huge` at -O2
* Optimize `string.format` and `%*`
* This change makes string interpolation 1.5x-2x faster depending on the
number and type of formatted components, assuming a few are using
primitive types, and reduces associated GC pressure.
New type checker:
* Initial work toward tracking the upper and lower bounds of types
accurately.
Native code generation (JIT):
* Add IrCmd::CHECK_TRUTHY for improved assert fast-calls
* Do not compute type map for modules without types
* Capture metatable+readonly state for NEW_TABLE IR instructions
* Replace JUMP_CMP_ANY with CMP_ANY and existing JUMP_EQ_INT
* Add support for exits to VM with reentry lock in VmExit
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Type checker/autocomplete:
* `Luau::autocomplete` no longer performs typechecking internally, make
sure to run `Frontend::check` before performing autocomplete requests
* Autocomplete string suggestions without "" are now only suggested
inside the ""
* Autocomplete suggestions now include `function (anonymous autofilled)`
key with a full suggestion for the function expression (with arguments
included) stored in `AutocompleteEntry::insertText`
* `AutocompleteEntry::indexedWithSelf` is provided for function call
suggestions made with `:`
* Cyclic modules now see each other type exports as `any` to prevent
memory use-after-free (similar to module return type)
Runtime:
* Updated inline/loop unroll cost model to better handle assignments
(Fixes https://github.com/Roblox/luau/issues/978)
* `math.noise` speed was improved by ~30%
* `table.concat` speed was improved by ~5-7%
* `tonumber` and `tostring` now have fastcall paths that execute ~1.5x
and ~2.5x faster respectively (fixes#777)
* Fixed crash in `luaL_typename` when index refers to a non-existing
value
* Fixed potential out of memory scenario when using `string.sub` or
`string.char` in a loop
* Fixed behavior of some fastcall builtins when called without arguments
under -O2 to match original functions
* Support for native code execution in VM is now enabled by default
(note: native code still has to be generated explicitly)
* `Codegen::compile` now accepts `CodeGen_OnlyNativeModules` flag. When
set, only modules that have a `--!native` hot-comment at the top will be
compiled to native code
In our new typechecker:
* Generic type packs are no longer considered to be variadic during
unification
* Timeout and cancellation now works in new solver
* Fixed false positive errors around 'table' and 'function' type
refinements
* Table literals now use covariant unification rules. This is sound
since literal has no type specified and has no aliases
* Fixed issues with blocked types escaping the constraint solver
* Fixed more places where error messages that should've been suppressed
were still reported
* Fixed errors when iterating over a top table type
In our native code generation (jit):
* 'DebugLuauAbortingChecks' flag is now supported on A64
* LOP_NEWCLOSURE has been translated to IR
* Added support for async typechecking cancellation using a token passed
through frontend options
* Added luaC_enumheap for building debug tools that need a graph of Luau
heap
In our new typechecker:
* Errors or now suppressed when checking property lookup of
error-suppressing unions
In our native code generation (jit):
* Fixed unhandled value type in NOT_ANY lowering
* Fast-call tag checks will exit to VM on failure, instead of relying on
a native fallback
* Added vector type to the type information
* Eliminated redundant direct jumps across dead blocks
* Debugger APIs are now disabled for call frames executing natively
* Implemented support for unwind registration on macOS 14
* Fixed indexing table intersections using `x["prop"]` syntax:
https://github.com/Roblox/luau/pull/971
* Add console output codepage for Windows:
https://github.com/Roblox/luau/pull/967
* Added `Frontend::parse` for a fast source graph preparation
* luau_load should check GC
* Work toward a type-diff system for nicer error messages
New Solver
* Correctly suppress errors in more cases
* Further improvements to typechecking of function calls and return
statements
* Crash fixes
* Propagate refinements drawn from the condition of a while loop into
the loop body
JIT
* Fix accidental bailout for math.frexp/modf/sign in A64
* Work toward bringing type annotation info in
* Do not propagate Luau IR constants of wrong type into load
instructions
* CHECK_SAFEENV exits to VM on failure
* Implement error handling in A64 reg allocator
* Inline the string.len builtin
* Do not enter native code of a function if arguments don’t match
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
* Optimized operations like instantiation and module export for very
large types
In our new typechecker:
* Typechecking of function calls was rewritten to handle more cases
correctly
* Fixed a crash that can happen after self-referential type is exported
from a module
* Fixed a false positive error in string comparison
* Added handling of `for...in` variable type annotations and fixed
issues with the iterator call inside
* Self-referential 'hasProp' and 'setProp' constraints are now handled
correctly
In our native code generation (jit):
* Added '--target' argument to luau-compile to test multiple
architectures different from host architecture
* GC barrier tag check is skipped if type is already known to be
GC-collectable
* Added GET_TYPE/GET_TYPEOF instructions for type/typeof fast-calls
* Improved code size of interrupt handlers on X64
* Definition files can now ascribe indexers to class types.
(https://github.com/Roblox/luau/pull/949)
* Remove --compile support from the REPL. You can just use luau-compile
instead.
* When an exception is thrown during parallel typechecking (usually an
ICE), we now gracefully stop typechecking and drain active workers
before rethrowing the exception.
New solver
* Include more source location information when we hit an internal
compiler error
* Improve the logic that simplifies intersections of tables
JIT
* Save testable type annotations to bytecode
* Improve block placement for linearized blocks
* Add support for lea reg, [rip+offset] for labels
* Unify X64 and A64 codegen for RETURN
* Outline interrupt handlers for X64
* Remove global rArgN in favor of build.abi
* Change A64 INTERRUPT lowering to match X64
---------
Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>