Improve performance of the malloc shim layer

This fixes the perf regression introduced by crrev.com/1781573002.
As speculated, crbug.com/593344 (NoBarrier_Load being double-fenced) causes a visible perf regression, because it adds two fences to the malloc fast path.
This CL adds a workaround that falls back on a raw volatile pointer read on Linux+Clang, relying on the fact that, on the architectures we care about, a load of an aligned pointer is intrinsically atomic [1,2].

Perf regression: https://chromeperf.appspot.com/report?sid=1237900c90f9c5e5320a87af8d9e8828fbd1794af07f0a4a335f1fc2a45f120a&start_rev=379480&end_rev=380417
I verified manually that the regression goes away on cc_perftests with this patch.

[1] Chapter 7 of Part 3A - System Programming Guide
    http://download.intel.com/design/processor/manuals/253668.pdf
[2] A3.5.3 Atomicity in the ARM architecture, ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition
    http://liris.cnrs.fr/~mmrissa/lib/exe/fetch.php?media=armv7-a-r-manual.pdf

BUG=550886,593872
TEST=cc_perftestsfast --gtest_filter=*PrepareTiles*

Review URL: https://codereview.chromium.org/1777363002
Cr-Commit-Position: refs/heads/master@{#380600}
author: primiano <primiano@chromium.org> 2016-03-11 03:21:49 -0800
committer: Commit bot <commit-bot@chromium.org> 2016-03-11 11:23:01 +0000
commit: d76421171daa1327f8e1c10ee710ebf911a90d2f (patch)
tree: 199f61018c9d4b540286ede8fbb5488da1cd1977
parent: 42c563766f6ee364683246a4d363906691772c0e (diff)
download: chromium_src-d76421171daa1327f8e1c10ee710ebf911a90d2f.zip
chromium_src-d76421171daa1327f8e1c10ee710ebf911a90d2f.tar.gz
chromium_src-d76421171daa1327f8e1c10ee710ebf911a90d2f.tar.bz2
1 files changed, 9 insertions, 1 deletions
diff --git a/base/allocator/allocator_shim.cc b/base/allocator/allocator_shim.cc
index cc5097e..f689035 100644
--- a/base/allocator/allocator_shim.cc
+++ b/base/allocator/allocator_shim.cc
@@ -72,8 +72,16 @@ bool CallNewHandler() {
 }
 
 inline const allocator::AllocatorDispatch* GetChainHead() {
+  // TODO(primiano): Just use NoBarrier_Load once crbug.com/593344 is fixed.
+  // Unfortunately due to that bug NoBarrier_Load() is mistakenly fully
+  // barriered on Linux+Clang, and that causes visible perf regressions.
   return reinterpret_cast<const allocator::AllocatorDispatch*>(
-      subtle::NoBarrier_Load(&g_chain_head));
+#if defined(OS_LINUX) && defined(__clang__)
+      *static_cast<const volatile subtle::AtomicWord*>(&g_chain_head)
+#else
+      subtle::NoBarrier_Load(&g_chain_head)
+#endif
+  );
 }
 
 }  // namespace
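
The read pattern used by the workaround can be sketched in isolation. This is a minimal illustration rather than the Chromium code: `Dispatch`, `g_head_word`, `g_head_atomic`, `StoreHead`, `LoadHeadVolatile` and `LoadHeadRelaxed` are hypothetical names, and `std::atomic` with `memory_order_relaxed` stands in for what `subtle::NoBarrier_Load` is meant to do. As in the commit message, the volatile variant assumes the hardware performs aligned word-sized loads atomically; the C++ standard itself does not guarantee that.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for allocator::AllocatorDispatch.
struct Dispatch {
  int id;
};

// Hypothetical stand-in for g_chain_head: a pointer kept in a word-sized
// integer, naturally aligned because it is a plain global.
static std::intptr_t g_head_word = 0;

// The same pointer kept in a std::atomic, for comparison.
static std::atomic<const Dispatch*> g_head_atomic{nullptr};

// Publishes a new head through both representations.
inline void StoreHead(const Dispatch* d) {
  g_head_word = reinterpret_cast<std::intptr_t>(d);
  g_head_atomic.store(d, std::memory_order_relaxed);
}

// Raw volatile read: the compiler must perform the load and adds no fence.
// Atomicity here is an assumption about the target architecture.
inline const Dispatch* LoadHeadVolatile() {
  return reinterpret_cast<const Dispatch*>(
      *static_cast<const volatile std::intptr_t*>(&g_head_word));
}

// Relaxed atomic load: atomic by the language rules, with no ordering
// constraints on surrounding memory accesses.
inline const Dispatch* LoadHeadRelaxed() {
  return g_head_atomic.load(std::memory_order_relaxed);
}
```

Both loaders return the same pointer after a `StoreHead` call. The difference lies only in who provides the atomicity guarantee: the language for the relaxed load, the hardware for the volatile read.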