Also adjust benchmark runs to use config=profile, and run clang for all benchmarks and gcc for runtime
* Run clang-format
* Add a preliminary implementation of deferred constraint resolution
* Reduce stack usage of some recursive functions
* Fix a bug when smartCloning a BoundTypeVar
* Remove some GC-related flags from the VM