Integration of most changes from the GoogleTV project around the convolver/scaler.

This contains the following improvements:

- Adding a few extra convolution filters on top of the existing LANCZOS3
  (used internally in Chrome) and BOX (used in unit tests):
  - LANCZOS2: a variation of LANCZOS3 in which the windowed function is
    limited to the [-2:2] range.
  - HAMMING1: uses a Hamming window over the [-1:1] range.

  If z is the zoom-down factor and w the size of the window, the CPU cost of
  each filter is proportional to (w * 2 * z + 1). When zooming down by a
  factor of 4 (as is common when creating thumbnails), the cost is 25 for
  LANCZOS3, 17 for LANCZOS2, and 9 for HAMMING1. As a result, HAMMING1 can end
  up being roughly three times as fast as the typical LANCZOS3.

  In terms of visual quality, HAMMING1 is visibly worse than filters that
  have a larger window. The motivation for this change is that not all
  processors are equally capable: while LANCZOS3 does provide good quality,
  it is far too slow on slower processors (as found on Google TV), where it
  is worth trading some visual quality for speed.

  Because what is acceptable differs from one platform to another, this
  change adds generic enums describing various trade-offs between quality
  and speed. Depending on the platform, these are then mapped to different
  filters.

  This change does not include the changes to all the call sites that replace
  LANCZOS3 with the appropriate enum. Another CL will have to be checked in
  for the policy definition.

- Improvements in speed by around 10%. The actual speedup depends on the
  parameters of the scale (scale ratios, sizes of images) as well as on the
  processor on which it runs. The 10% was measured when scaling 1920x1080
  images down to 1920/4 x 1080/4 using the LANCZOS3 filter on a 32-bit
  Atom-based machine with image_operations_bench. Numbers for a 64-bit
  processor are discussed below.

  This optimization attempts to eliminate all zeroes on each side of the
  filter_size, since it is very likely that the calculated window will extend
  a fraction of a pixel outside of the window where the function is actually
  non-zero. In many cases, this shortens the convolution by one point, so,
  using the math above, (w * 2 * z + 1) has 1 subtracted. The code is
  generic, though, and will get rid of more points if possible.

- To measure speed, a small utility, image_operations_bench, was added. Its
  purpose is simply to measure the speed of the convolution without any
  regard to the actual data. Run with --help for a list of options. The
  measured number is in MB/s ((source MB + dest MB) / time).

  The following numbers (in MB/s) were found on a 64-bit Release build on a
  z600:

               | zero optimization |
    Filter     |   no    |   yes   |
    Hamming1   |   459   |   495   |
    Lanczos2   |   276   |   294   |
    Lanczos3   |   202   |   207   |

  The command line was:

    for i in HAMMING1 LANCZOS2 LANCZOS3 ; do echo $i; out/Release/image_operations_bench -source 1920x1080 -destination 480x270 -m $i -iter 50 ; done

  The improvements from the zero optimization mentioned above are much more
  pronounced on a 32-bit Atom.

- Commented that there is a half-pixel error inside the code in
  image_operations. Because fixing it would change the results of many
  scales used in the win_layout tests, it would break them. As a result, this
  change only adds comments about what needs to be changed and does not fix
  the issue itself. A subsequent change will remove the comments, enable the
  fix, and add the corrected reference images used for the tests. See bug
  69999: http://code.google.com/p/chromium/issues/detail?id=69999

- Enhanced the convolver to support arbitrary strides instead of the
  hard-coded 4 * width. That value is correct on most platforms, but not on
  GoogleTV, where allocated buffers need to be multiples of 32 pixels to
  exploit HW capabilities.

- Added numerous unit tests to cover the new filters, along with others that
  are more rigorous than the existing ones. One such test is how the
  half-pixel error mentioned above was found.

TEST=This was tested against the existing unit tests and the added unit tests
on a 64-bit Linux platform. The tests were then run under valgrind to check
for possible memory leaks and errors. The tests come out clean (except for
the preexisting file descriptor 'leaks' coming from other tests that are
linked with test_shell_tests).

Credit for most of the actual changes goes to various contributors of the
Google TV team.

Note that there are two types of optimizations possible beyond these changes
that are not done here:

1/ Use the fact that the filter coefficients are periodic, not so much to
   reduce the cost of calculating the coefficients (which is typically in the
   noise), but rather to decrease cache misses on the coefficients when the
   convolution is done. Experiments showed that on an Atom, this can yield a
   5% improvement.
2/ This code is a prime target for the use of SIMD instructions.

BUG=47447, 62820, 69999

Patch by evannier@google.com

Original review http://codereview.chromium.org/5575010/

git-svn-id: svn://svn.chromium.org/chrome/trunk/src@73110 0039d316-1c4b-4281-b951-d872f2087c98
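As a worked illustration of the cost model stated in this message, the sketch below evaluates (w * 2 * z + 1) for a given window size and zoom-down factor. The names `FilterTapCount` and `RelativeCost` are hypothetical and are not part of the change; the code only restates the formula and should be read as a rough model, not a measurement.

```cpp
#include <cassert>

// Hypothetical helper (not part of this change): number of source samples
// combined per output sample under the cost model from the commit message,
// w * 2 * z + 1, with w the window size and z the zoom-down factor.
int FilterTapCount(int window, int zoom_down) {
  assert(window > 0 && zoom_down > 0);
  return window * 2 * zoom_down + 1;
}

// Hypothetical helper: cost of filter A relative to filter B at the same
// zoom-down factor, under the same model.
double RelativeCost(int window_a, int window_b, int zoom_down) {
  return static_cast<double>(FilterTapCount(window_a, zoom_down)) /
         FilterTapCount(window_b, zoom_down);
}
```

For z = 4 this gives 25 (LANCZOS3), 17 (LANCZOS2) and 9 (HAMMING1), so under this model LANCZOS3 costs about 25/9 ≈ 2.8 times as much as HAMMING1.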
author: brettw@chromium.org <brettw@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98> 2011-01-30 17:58:21 +0000
committer: brettw@chromium.org <brettw@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98> 2011-01-30 17:58:21 +0000
commit: 384853b1969a9dcb0ada5131b9f703f9ed30c1c2 (patch)
tree: 4354b70414986911cdec1c984d6a95e689f581f6 /skia/ext/image_operations.cc
parent: 5261606647f05b40bf43b23a292c257c49241b87 (diff)
1 files changed, 119 insertions, 10 deletions
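The zero-elimination optimization described in the commit message can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the code from the convolver: `TrimmedFilter`, `TrimZeroTaps`, and the use of a plain `std::vector<float>` for the coefficients are all hypothetical choices made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the surviving coefficients plus the index (in
// the original filter) of the first coefficient that was kept.
struct TrimmedFilter {
  std::size_t first_index;
  std::vector<float> values;
};

// Drops the run of exact zeroes at each end of a filter so that the
// convolution loop has fewer points to visit. Zeroes located between
// non-zero coefficients are kept. An all-zero filter yields an empty
// result whose first_index equals the original size.
TrimmedFilter TrimZeroTaps(const std::vector<float>& values) {
  auto first = values.begin();
  auto last = values.end();
  while (first != last && *first == 0.0f) ++first;
  while (last != first && *(last - 1) == 0.0f) --last;
  return TrimmedFilter{static_cast<std::size_t>(first - values.begin()),
                       std::vector<float>(first, last)};
}
```

A caller would start the convolution at `first_index` in the source row and iterate over `values.size()` points instead of the full window.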
diff --git a/skia/ext/image_operations.cc b/skia/ext/image_operations.cc
index 67236aa..51d2e4e 100644
--- a/skia/ext/image_operations.cc
+++ b/skia/ext/image_operations.cc
@@ -1,8 +1,9 @@
-// Copyright (c) 2009 The Chromium Authors. All rights reserved.
+// Copyright (c) 2011 The Chromium Authors. All rights reserved.
 // Use of this source code is governed by a BSD-style license that can be
 // found in the LICENSE file.
 
 #define _USE_MATH_DEFINES
+#include <algorithm>
 #include <cmath>
 #include <limits>
 
@@ -59,6 +60,33 @@ float EvalLanczos(int filter_size, float x) {
           sin(xpi / filter_size) / (xpi / filter_size);  // sinc(x/filter_size)
 }
 
+// Evaluates the Hamming filter of the given filter size window for the given
+// position.
+//
+// The filter covers [-filter_size, +filter_size]. Outside of this window
+// the value of the function is 0. Inside of the window, the value is the
+// cardinal sine (sinc) multiplied by a recentered Hamming function. The
+// traditional Hamming formula for a window of size N and n in [0, N-1] is:
+//   hamming(n) = 0.54 - 0.46 * cos(2 * pi * n / (N-1))
+// In our case we want the function centered for x == 0 and at its minimum
+// on both ends of the window (x == +/- filter_size), hence the adjusted
+// formula:
+//   hamming(x) = (0.54 -
+//                 0.46 * cos(2 * pi * (x - filter_size)/ (2 * filter_size)))
+//              = 0.54 - 0.46 * cos(pi * x / filter_size - pi)
+//              = 0.54 + 0.46 * cos(pi * x / filter_size)
+float EvalHamming(int filter_size, float x) {
+  if (x <= -filter_size || x >= filter_size)
+    return 0.0f;  // Outside of the window.
+  if (x > -std::numeric_limits<float>::epsilon() &&
+      x < std::numeric_limits<float>::epsilon())
+    return 1.0f;  // Special case the sinc discontinuity at the origin.
+  const float xpi = x * static_cast<float>(M_PI);
+
+  return ((sin(xpi) / xpi) *  // sinc(x)
+          (0.54f + 0.46f * cos(xpi / filter_size)));  // hamming(x)
+}
+
 // ResizeFilter ----------------------------------------------------------------
 
 // Encapsulates computation and storage of the filters required for one complete
@@ -86,8 +114,16 @@ class ResizeFilter {
       case ImageOperations::RESIZE_BOX:
         // The box filter just scales with the image scaling.
         return 0.5f;  // Only want one side of the filter = /2.
+      case ImageOperations::RESIZE_HAMMING1:
+        // The Hamming filter takes as much space in the source image in
+        // each direction as the size of the window = 1 for Hamming1.
+        return 1.0f;
+      case ImageOperations::RESIZE_LANCZOS2:
+        // The Lanczos filter takes as much space in the source image in
+        // each direction as the size of the window = 2 for Lanczos2.
+        return 2.0f;
       case ImageOperations::RESIZE_LANCZOS3:
-        // The lanczos filter takes as much space in the source image in
+        // The Lanczos filter takes as much space in the source image in
         // each direction as the size of the window = 3 for Lanczos3.
         return 3.0f;
       default:
@@ -116,6 +152,10 @@ class ResizeFilter {
     switch (method_) {
       case ImageOperations::RESIZE_BOX:
         return EvalBox(pos);
+      case ImageOperations::RESIZE_HAMMING1:
+        return EvalHamming(1, pos);
+      case ImageOperations::RESIZE_LANCZOS2:
+        return EvalLanczos(2, pos);
       case ImageOperations::RESIZE_LANCZOS3:
         return EvalLanczos(3, pos);
       default:
@@ -149,6 +189,10 @@ ResizeFilter::ResizeFilter(ImageOperations::ResizeMethod method,
                            const SkIRect& dest_subset)
     : method_(method),
       out_bounds_(dest_subset) {
+  // method_ will only ever refer to an "algorithm method".
+  SkASSERT((ImageOperations::RESIZE_FIRST_ALGORITHM_METHOD <= method) &&
+           (method <= ImageOperations::RESIZE_LAST_ALGORITHM_METHOD));
+
   float scale_x = static_cast<float>(dest_width) /
                   static_cast<float>(src_full_width);
   float scale_y = static_cast<float>(dest_height) /
@@ -157,10 +201,6 @@ ResizeFilter::ResizeFilter(ImageOperations::ResizeMethod method,
   x_filter_support_ = GetFilterSupport(scale_x);
   y_filter_support_ = GetFilterSupport(scale_y);
 
-  SkIRect src_full = { 0, 0, src_full_width, src_full_height };
-  SkIRect dest_full = { 0, 0, static_cast<int>(src_full_width * scale_x + 0.5),
-                        static_cast<int>(src_full_height * scale_y + 0.5) };
-
   // Support of the filter in source space.
   float src_x_support = x_filter_support_ / scale_x;
   float src_y_support = y_filter_support_ / scale_y;
@@ -171,6 +211,17 @@ ResizeFilter::ResizeFilter(ImageOperations::ResizeMethod method,
                  scale_y, src_y_support, &y_filter_);
 }
 
+// TODO(egouriou): Take advantage of periods in the convolution.
+// Practical resizing filters are periodic outside of the border area.
+// For Lanczos, a scaling by a (reduced) factor of p/q (q pixels in the
+// source become p pixels in the destination) will have a period of p.
+// A nice consequence is a period of 1 when downscaling by an integral
+// factor. Downscaling from typical display resolutions is also bound
+// to produce interesting periods as those are chosen to have multiple
+// small factors.
+// Small periods reduce computational load and improve cache usage if
+// the coefficients can be shared. For periods of 1 we can consider
+// loading the factors only once outside the borders.
 void ResizeFilter::ComputeFilters(int src_size,
                                   int dest_subset_lo, int dest_subset_size,
                                   float scale, float src_support,
@@ -201,6 +252,15 @@ void ResizeFilter::ComputeFilters(int src_size,
     fixed_filter_values->clear();
 
     // This is the pixel in the source directly under the pixel in the dest.
+    // Note that we base computations on the "center" of the pixels. To see
+    // why, observe that the destination pixel at coordinates (0, 0) in a 5.0x
+    // downscale should "cover" the pixels around the pixel with *its center*
+    // at coordinates (2.5, 2.5) in the source, not those around (0, 0).
+    // Hence we need to scale coordinates (0.5, 0.5), not (0, 0).
+    // TODO(evannier): this code is therefore incorrect and should read:
+    // float src_pixel = (static_cast<float>(dest_subset_i) + 0.5f) * inv_scale;
+    // I leave it incorrect, because changing it would require modifying
+    // the results for the webkit test, which I will do in a subsequent checkin.
     float src_pixel = dest_subset_i * inv_scale;
 
     // Compute the (inclusive) range of source pixels the filter covers.
@@ -213,14 +273,22 @@ void ResizeFilter::ComputeFilters(int src_size,
     for (int cur_filter_pixel = src_begin; cur_filter_pixel <= src_end;
          cur_filter_pixel++) {
       // Distance from the center of the filter, this is the filter coordinate
-      // in source space.
-      float src_filter_pos = cur_filter_pixel - src_pixel;
+      // in source space. We also need to consider the center of the pixel
+      // when comparing distance against 'src_pixel'. In the 5x downscale
+      // example used above the distance from the center of the filter to
+      // the pixel with coordinates (2, 2) should be 0, because its center
+      // is at (2.5, 2.5).
+      // TODO(evannier): as above (in regards to the 0.5 pixel error),
+      // this code is incorrect, but is left as is for the same reasons.
+      // float src_filter_dist =
+      //     ((static_cast<float>(cur_filter_pixel) + 0.5f) - src_pixel);
+      float src_filter_dist = cur_filter_pixel - src_pixel;
 
       // Since the filter really exists in dest space, map it there.
-      float dest_filter_pos = src_filter_pos * clamped_scale;
+      float dest_filter_dist = src_filter_dist * clamped_scale;
 
       // Compute the filter value at that location.
-      float filter_value = ComputeFilter(dest_filter_pos);
+      float filter_value = ComputeFilter(dest_filter_dist);
       filter_values->push_back(filter_value);
 
       filter_sum += filter_value;
@@ -250,6 +318,35 @@ void ResizeFilter::ComputeFilters(int src_size,
   }
 }
 
+ImageOperations::ResizeMethod ResizeMethodToAlgorithmMethod(
+    ImageOperations::ResizeMethod method) {
+  // Convert any "Quality Method" into an "Algorithm Method"
+  if (method >= ImageOperations::RESIZE_FIRST_ALGORITHM_METHOD &&
+      method <= ImageOperations::RESIZE_LAST_ALGORITHM_METHOD) {
+    return method;
+  }
+  // The call to ImageOperationsGtv::Resize() above took care of
+  // GPU-acceleration in the cases where it is possible. So now we just
+  // pick the appropriate software method for each resize quality.
+  switch (method) {
+    // Users of RESIZE_GOOD are willing to trade a lot of quality to
+    // get speed, allowing the use of linear resampling to get hardware
+    // acceleration (SRB). Hence any of our "good" software filters
+    // will be acceptable, and we use the fastest one, Hamming-1.
+    case ImageOperations::RESIZE_GOOD:
+      // Users of RESIZE_BETTER are willing to trade some quality in order
+      // to improve performance, but are guaranteed not to devolve to a linear
+      // resampling. In visual tests we see that Hamming-1 is not as good as
+      // Lanczos-2, however it is about 40% faster and Lanczos-2 itself is
+      // about 30% faster than Lanczos-3. The use of Hamming-1 has been deemed
+      // an acceptable trade-off between quality and speed.
+    case ImageOperations::RESIZE_BETTER:
+      return ImageOperations::RESIZE_HAMMING1;
+    default:
+      return ImageOperations::RESIZE_LANCZOS3;
+  }
+}
+
 }  // namespace
 
 // Resize ----------------------------------------------------------------------
@@ -369,6 +466,12 @@ SkBitmap ImageOperations::ResizeBasic(const SkBitmap& source,
                                       ResizeMethod method,
                                       int dest_width, int dest_height,
                                       const SkIRect& dest_subset) {
+  // Ensure that the ResizeMethod enumeration is sound.
+  SkASSERT(((RESIZE_FIRST_QUALITY_METHOD <= method) &&
+            (method <= RESIZE_LAST_QUALITY_METHOD)) ||
+           ((RESIZE_FIRST_ALGORITHM_METHOD <= method) &&
+            (method <= RESIZE_LAST_ALGORITHM_METHOD)));
+
   // Time how long this takes to see if it's a problem for users.
   base::TimeTicks resize_start = base::TimeTicks::Now();
 
@@ -382,6 +485,11 @@ SkBitmap ImageOperations::ResizeBasic(const SkBitmap& source,
       dest_width < 1 || dest_height < 1)
     return SkBitmap();
 
+  method = ResizeMethodToAlgorithmMethod(method);
+  // Check that we deal with an "algorithm method" from this point onward.
+  SkASSERT((ImageOperations::RESIZE_FIRST_ALGORITHM_METHOD <= method) &&
+           (method <= ImageOperations::RESIZE_LAST_ALGORITHM_METHOD));
+
   SkAutoLockPixels locker(source);
 
   ResizeFilter filter(method, source.width(), source.height(),
@@ -400,6 +508,7 @@ SkBitmap ImageOperations::ResizeBasic(const SkBitmap& source,
   result.allocPixels();
   BGRAConvolve2D(source_subset, static_cast<int>(source.rowBytes()),
                  !source.isOpaque(), filter.x_filter(), filter.y_filter(),
+                 static_cast<int>(result.rowBytes()),
                  static_cast<unsigned char*>(result.getPixels()));
 
   // Preserve the "opaque" flag for use as an optimization later.
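The half-pixel issue flagged in the TODO comments of this diff can be illustrated in isolation. The sketch below contrasts the mapping currently used (`dest_subset_i * inv_scale`) with the center-based mapping proposed in those comments. The function names are hypothetical, and this is not the fix itself, which the commit message defers to a later change.

```cpp
#include <cassert>

// Mapping as it stands in the diff: scales the destination pixel index
// directly, i.e. treats the pixel's corner as its position.
float SrcPositionFromCorner(int dest_index, float inv_scale) {
  return static_cast<float>(dest_index) * inv_scale;
}

// Mapping proposed in the first TODO: scales the center of the destination
// pixel instead of its corner.
float SrcPositionFromCenter(int dest_index, float inv_scale) {
  return (static_cast<float>(dest_index) + 0.5f) * inv_scale;
}

// Distance from the filter center to the center of a source pixel, as
// proposed in the second TODO.
float SrcCenterDistance(int src_index, float src_position) {
  return (static_cast<float>(src_index) + 0.5f) - src_position;
}
```

For the 5x downscale used as the example in the comments (`inv_scale` of 5.0), destination pixel 0 maps to 0.0 with the corner-based mapping and to 2.5 with the center-based one, and the distance from that position to source pixel 2 is then 0, because its center is also at 2.5.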