author primiano <primiano@chromium.org> 2016-03-11 03:21:49 -0800
committer Commit bot <commit-bot@chromium.org> 2016-03-11 11:23:01 +0000
commit d76421171daa1327f8e1c10ee710ebf911a90d2f
tree 199f61018c9d4b540286ede8fbb5488da1cd1977
parent 42c563766f6ee364683246a4d363906691772c0e
Improve performance of the malloc shim layer
This fixes the perf regression introduced by crrev.com/1781573002.

As speculated, crbug.com/593344 (NoBarrier_Load being double-fenced) causes a visible perf regression, because it adds two fences to the malloc fast path. This CL adds a workaround that falls back on a raw volatile pointer read on Linux+Clang, relying on the fact that on the architectures we care about a load of an aligned pointer is intrinsically atomic [1,2].

Perf regression:
https://chromeperf.appspot.com/report?sid=1237900c90f9c5e5320a87af8d9e8828fbd1794af07f0a4a335f1fc2a45f120a&start_rev=379480&end_rev=380417

I verified manually that the regression goes away on cc_perftests with this patch.

[1] Chapter 7 of Part 3A - System Programming Guide
    http://download.intel.com/design/processor/manuals/253668.pdf
[2] A3.5.3 Atomicity in the ARM architecture, ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition
    http://liris.cnrs.fr/~mmrissa/lib/exe/fetch.php?media=armv7-a-r-manual.pdf

BUG=550886,593872
TEST=cc_perftestsfast --gtest_filter=*PrepareTiles*

Review URL: https://codereview.chromium.org/1777363002
Cr-Commit-Position: refs/heads/master@{#380600}
-rw-r--r-- base/allocator/allocator_shim.cc 10
1 file changed, 9 insertions, 1 deletion
diff --git a/base/allocator/allocator_shim.cc b/base/allocator/allocator_shim.cc
index cc5097e..f689035 100644
--- a/base/allocator/allocator_shim.cc
+++ b/base/allocator/allocator_shim.cc
@@ -72,8 +72,16 @@ bool CallNewHandler() {
 }
 
 inline const allocator::AllocatorDispatch* GetChainHead() {
+  // TODO(primiano): Just use NoBarrier_Load once crbug.com/593344 is fixed.
+  // Unfortunately due to that bug NoBarrier_Load() is mistakenly fully
+  // barriered on Linux+Clang, and that causes visible perf regressions.
   return reinterpret_cast<const allocator::AllocatorDispatch*>(
-      subtle::NoBarrier_Load(&g_chain_head));
+#if defined(OS_LINUX) && defined(__clang__)
+      *static_cast<const volatile subtle::AtomicWord*>(&g_chain_head)
+#else
+      subtle::NoBarrier_Load(&g_chain_head)
+#endif
+      );
 }
 
 }  // namespace
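
For readers without the Chromium tree at hand, the two load strategies in the hunk above can be sketched in standalone C++. This is a minimal illustration under stated assumptions, not the shim's code: `AllocatorDispatch`, `AtomicWord`, `g_chain_head_word`, `g_chain_head_atomic`, `InstallChainHead`, `LoadRelaxed` and `LoadVolatile` are hypothetical stand-ins, and `std::atomic` with `memory_order_relaxed` is used here in place of `subtle::NoBarrier_Load`. The volatile variant leans on the hardware guarantee cited in the commit message (aligned pointer-sized loads are atomic on the targeted architectures); the C++ standard itself does not promise atomicity for it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for allocator::AllocatorDispatch.
struct AllocatorDispatch {
  int id;
};

// Hypothetical stand-in for subtle::AtomicWord: a pointer-sized integer.
using AtomicWord = std::intptr_t;
static_assert(sizeof(AtomicWord) == sizeof(void*),
              "AtomicWord must be able to hold a pointer");

// Naturally aligned word holding the chain head, read by the volatile variant.
alignas(sizeof(AtomicWord)) static AtomicWord g_chain_head_word = 0;

// Same value held in a std::atomic, read by the portable variant.
static std::atomic<const AllocatorDispatch*> g_chain_head_atomic{nullptr};

// Publishes a new chain head to both globals (single-threaded setup only).
inline void InstallChainHead(const AllocatorDispatch* dispatch) {
  g_chain_head_word = reinterpret_cast<AtomicWord>(dispatch);
  g_chain_head_atomic.store(dispatch, std::memory_order_relaxed);
}

// Portable variant: a relaxed atomic load that requests no ordering.
inline const AllocatorDispatch* LoadRelaxed() {
  return g_chain_head_atomic.load(std::memory_order_relaxed);
}

// Workaround-style variant: a plain volatile read of the aligned word.
// Atomicity here is a hardware assumption, not a language guarantee.
inline const AllocatorDispatch* LoadVolatile() {
  return reinterpret_cast<const AllocatorDispatch*>(
      *static_cast<const volatile AtomicWord*>(&g_chain_head_word));
}
```

Both variants request no ordering in the source, so any fence in the generated code would come from the library or compiler rather than from the call site.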