Closed
Description
I recently wanted a reverseConcat
, which led me to the current implementation of concat
and the realization that it is not very efficient:
Lines 1033 to 1046 in e84c7a3
- Allocates a new list of
Text
s (lazily, but it still has a cost) because of the filter - Allocates a list of boxed int lengths to sum them, since
sumP
does not fuse
Here is an alternate straightforward implementation:
concat :: [T.Text] -> T.Text
concat ts0 = case ts0 of
[] -> T.empty
[t] -> t
_ | len == 0 -> T.empty
| otherwise -> T.Text arr 0 len
where
flen acc (T.Text _ _ l)
| acc' > 0 || l == 0 = acc'
| otherwise = concatOverflowError
where
acc' = acc + l
len = foldl' flen 0 ts0
arr = A.run $ do
marr <- A.new len
let loop !i [] = pure marr
loop i (T.Text a o l : ts) = A.copyI l marr i a o *> loop (i+l) ts
loop 0 ts0
This not exactly the same, since the current implementation performs a case match on the list after filtering out null Text
s. But I doubt such a preemptive step helps.
Benchmarks on GHC 9.6.3 with -O, concating a list of all Char
s:
concat: OK
54.8 ms ± 3.1 ms, 174 MB allocated, 52 MB copied, 325 MB peak memory
alt concat: OK
13.1 ms ± 901 μs, 4.2 MB allocated, 773 B copied, 325 MB peak memory
==: OK
Benchmark file
{-# LANGUAGE BangPatterns #-}
import Prelude hiding (concat)
import Data.Foldable (foldl')
import Test.Tasty.Bench
import Test.Tasty.HUnit
import qualified Data.Text as T
import qualified Data.Text.Internal as T
import qualified Data.Text.Array as A
main :: IO ()
main = defaultMain
[ env (pure xs_) $ \xs -> bgroup ""
[ bench "concat" $ whnf T.concat xs
, bench "alt concat" $ whnf concat xs
, testCase "==" $ concat xs @?= T.concat xs
]
]
where
xs_ = map T.singleton [minBound .. maxBound]
concat :: [T.Text] -> T.Text
concat ts0 = case ts0 of
[] -> T.empty
[t] -> t
_ | len == 0 -> T.empty
| otherwise -> T.Text arr 0 len
where
flen acc (T.Text _ _ l)
| acc' > 0 || l == 0 = acc'
| otherwise = concatOverflowError
where
acc' = acc + l
len = foldl' flen 0 ts0
arr = A.run $ do
marr <- A.new len
let loop !i [] = pure marr
loop i (T.Text a o l : ts) = A.copyI l marr i a o *> loop (i+l) ts
loop 0 ts0
concatOverflowError :: a
concatOverflowError = error "Data.Text.concat: size overflow"
Metadata
Metadata
Assignees
Labels
No labels