Add rewrite rule to unfuse (text . drop) #301

FPtje · 2020-10-14T13:21:23Z

Problem

While profiling some project, the following function turned out to be the largest bottleneck:

{-# INLINE slice #-}
-- | Substring from @offset@ to @offset + len@
slice :: Int -> Int -> Text -> Text
slice offset len = T.take len . T.drop offset

The direct cause of this bottleneck is fusion. The original take and drop functions work by calculating a new offset and length. A new view of Text can be returned without copying the underlying array. However, the slice function is rewritten, and the streaming implementation unnecessarily copies the data. This makes it much slower and more memory consuming.

With take and drop being fast individually, putting them together should not lead to a substantial decrease in performance.

Proposed solution

Add a rewrite rule that specifically rewrites take len . drop offset back to an unfused version, just like the TEXT take -> unfused and TEXT drop -> unfused do for their respective functions individually.

I would argue that a rewrite rule for this very specific pair of functions is justified because it represents the very common substring operation. Since I don't think it would be wise to add substring to the interface of Data.Text, optimizing its de facto implementation would be the next best thing.

Benchmark

A project hosting a benchmark can be found at Channable/haskell-string-slicing-benchmarks. It benchmarks the original function, the rule added by this pull request and three other solutions. See Bench.hs.

Below are the results of the benchmark that showed the most significant difference between fused and unfused:

Benchmark results for above mentioned slice

benchmarking long-5000-1000/naiveSlice
time                 258.8 μs   (257.7 μs .. 259.7 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 260.4 μs   (260.3 μs .. 260.5 μs)
std dev              340.3 ns   (284.6 ns .. 483.1 ns)
allocated:           1.000 R²   (1.000 R² .. 1.000 R²)
  iters              1380479.830 (1380478.707 .. 1380481.163)
  y                  2550.560   (2209.409 .. 2956.324)

Benchmark results when the rule of this PR is included

benchmarking long-5000-1000/sliceWithRule
time                 4.941 μs   (4.937 μs .. 4.945 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.941 μs   (4.938 μs .. 4.944 μs)
std dev              9.959 ns   (8.231 ns .. 12.75 ns)
allocated:           1.000 R²   (1.000 R² .. 1.000 R²)
  iters              64.000     (63.980 .. 64.025)
  y                  2556.215   (2301.293 .. 2812.799)

That's a huge difference, both in runtime and memory.

Open questions

There are many more subject-to-fusion functions that do not need to create a copy (last, tail, init, null to name a few). Any pipeline involving only such functions might be better off not being fused. Could a different approach to fusion improve performance here?
take n . drop m is very a common operation. Are there other common operations that would benefit from such rules?

phadej · 2020-10-15T12:38:00Z

No regression tests, no merge.

EDIT: also, please put the explanations why the change is made into the commit messages.

MaxGabriel · 2020-10-15T13:16:01Z

No regression tests, no merge.

EDIT: also, please put the explanations why the change is made into the commit messages.

Are these requirements documented somewhere? Could they be added to a DEVELOPMENT.md or pull_request_template.md so that potential contributors can learn this up-front?

phadej · 2020-10-15T13:25:11Z

Are these requirements documented somewhere? Could they be added to a DEVELOPMENT.md or pull_request_template.md so that potential contributors can learn this up-front?

Writing good commit messages and adding tests for your changes should be a common practice. But sure, we can be explicit.

The combination of take . drop is commonly used as a substring operation. The original take and drop functions both work by calculating just a new offset and length. When fused, the streaming implementation unnecessarily copies the data. This makes it much slower and more memory consuming.

The combination of (take . drop) should not be fused. This test makes sure they are not by comparing the underlying array before and after applying those functions. See haskell#301.

FPtje · 2020-10-17T13:55:38Z

The original commit has been rewritten to contain a more thorough explanation, and a regression test was added to make sure fusion doesn't happen.

The combination of (take . drop) should not be fused. This test makes sure they are not by comparing the underlying array before and after applying those functions. See haskell#301.

phadej · 2020-10-17T16:09:02Z

I wonder whether take n . drop m . drop p or drop n . take m would need own RULES too, but let someone first run into those problems.

The combination of (take . drop) should not be fused. This test makes sure they are not by comparing the underlying array before and after applying those functions. See haskell#301.

FPtje force-pushed the feat-unfuse-take-drop branch from da3b8cd to f4dff92 Compare October 17, 2020 13:48

Add regression test for (take . drop) fusion

8c0a8da

The combination of (take . drop) should not be fused. This test makes sure they are not by comparing the underlying array before and after applying those functions. See haskell#301.

FPtje force-pushed the feat-unfuse-take-drop branch from f4dff92 to 8c0a8da Compare October 17, 2020 14:50

phadej merged commit 7d3130b into haskell:master Oct 17, 2020

Bodigrim mentioned this pull request Jun 11, 2021

Disable implicit fusion rules #348

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add rewrite rule to unfuse (text . drop) #301

Add rewrite rule to unfuse (text . drop) #301

Uh oh!

FPtje commented Oct 14, 2020

Uh oh!

phadej commented Oct 15, 2020 •

edited

Loading

Uh oh!

MaxGabriel commented Oct 15, 2020

Uh oh!

phadej commented Oct 15, 2020

Uh oh!

FPtje commented Oct 17, 2020

Uh oh!

phadej commented Oct 17, 2020

Uh oh!

Uh oh!

Add rewrite rule to unfuse (text . drop) #301

Add rewrite rule to unfuse (text . drop) #301

Uh oh!

Conversation

FPtje commented Oct 14, 2020

Problem

Proposed solution

Benchmark

Open questions

Uh oh!

phadej commented Oct 15, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MaxGabriel commented Oct 15, 2020

Uh oh!

phadej commented Oct 15, 2020

Uh oh!

FPtje commented Oct 17, 2020

Uh oh!

phadej commented Oct 17, 2020

Uh oh!

Uh oh!

phadej commented Oct 15, 2020 •

edited

Loading