May 20th, 2015
Cross file script inlining
While inlining, all compilers need to play a balancing act. For example, beyond a particular size threshold, the compiler might end up spending more time providing the inlined code’s contextual information to produce optimized JIT code than the time it actually saves by the reduced call overhead. Similarly, when trying to inline functions that are coming from another context or script file, the cost of work that needs to be done (like providing the contextual information) could outweigh the performance benefits that inlining provides.
During the development of Windows 10 and Microsoft Edge, we gathered some data from the existing web to understand how Chakra’s inlining optimization was working on the web. We looked at a randomly selected sample of 3,000 sites from the list of top 10,000 sites and following is what we found:
|Chakra inlined only 31% of function calls; 48% of the functions were not getting inlined as the caller and callee functions lived in separate script files||Using another view, more than half of the sites in our sample were not inlining 60% or more functions|
Speeding up global constants using fixed field optimization
While the usage of const is catching up on the web, a majority of the existing web does not use the const construct today. Instead, in today’s web, most of the times constants are defined once as a variable on the global object and then used everywhere in the code. In one of the experiments we did with 10,000+ sites on the web, we found that more than 20% of the sites had occurrence of such patterns for defining integer constants, and the average was more than 4 such occurrences per site.
In Windows 10 and Microsoft Edge, we’ve started optimizing Chakra’s parser and the JIT compiler to identify non const variable declarations of integers that are defined globally and are never changed during the course of the execution time of the program. Once identified, the JIT’ed code produced by Chakra is able to substantially reduce the lookup cost associated with such globally defined variables that do not change their shape and value during the execution lifetime of the program, thus extending the performance oriented value proposition of the const statement in ECMAScript 6 to how constants are often used in the web as it exists today.
Improving performance of code within try-catch blocks
Using try-catch is very common on the web as it exists today. However, using try-catch is typically not a recommended best practice, especially when the code path is performance sensitive. Try-catch code is hard to optimize because most operations inside the try block can cause an exception and go to the catch. This flow is too expensive to model precisely in a JIT compiler. Different techniques need to be used to model this, which create an additional overhead that needs to be taken care of by the execution machinery.
Until Windows 10, Chakra did not optimize code inside of try-catch blocks. In Windows 10 and Microsoft Edge, Chakra’s compiler now has the capability to abstract out the code defined inside of the try-catch blocks and generate optimized JIT code for it. For cases where an exception is not thrown, Chakra now executes such code inside a try block almost on par with regular JIT’ed code (as if the try-catch never existed).
Minified code now brings size and speed benefits
|Minification on Top 4000 sites|
|95% of the sites had some form of minified code||Out of the 95%, 77% sites had some code that was minified using UglifyJS||Out of the 95%, 47% of the sites used jQuery minified via UglifyJS|
The experiment confirmed that usage of minified code is extremely popular on the web as it exists and amongst others, UglifyJS is very commonly used in today’s web. So in Windows 10 and Microsoft Edge, we’ve added new fast paths, improved inlining and optimized some heuristics in Chakra’s JIT compiler to ensure that minified code runs as fast, if not faster than the non-minified versions. With these changes, the performance of individual code patterns minified using UglifyJS that we tested, improved between 20-50%.
Though we are far away from the ideal world mentioned above, in one of the other of data gathering exercises that we recently did, we tried to measure the most used ECMAScript 5 built-ins on the web as it exists today. The experiment was carried out on a random sample of ~4000 of the top 10,000 sites. The top three hits that we found were: Array#indexOf, Array#map and Array#forEach.
Given the popularity of Array built-ins on the web, in Windows10 and Microsoft Edge, Chakra has optimized how values are retrieved, while the engine traverses a given array. This optimization helps remove the extraneous overhead of visiting the prototype chain and looking up the numeric property corresponding to the index, when holes are encountered in an array. This optimization helps improves the performance of ECMAScript5 Array#indexOf built-in in Chakra and Microsoft Edge by more than 5 times.
So, are we fast yet?
|All measures collected on 64-bit browsers running 64-bit Windows 10 Technical Preview
System Info: HP Compaq 8100 Elite with Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz (4 cores), 12GB RAM
What do these charts tell? Chakra in Microsoft Edge is way faster than IE11. Looking closely, Chakra has made the following improvements across these benchmarks in Microsoft Edge:
|Benchmark||Owned By||Improvements in Microsoft Edge over IE11|
|Jet Stream||Apple||More than 1.6 times faster|
|Octane 2.0||More than 2.25 times faster|
– Gaurav Seth, Principal PM Lead, Chakra