folly::tape (#2109)
Summary:
Pull Request resolved: #2109

A better version of `vector<vector>` for some scenarios.
It can also serve as a reasonable `vector<string>` when not every string is small enough for the small-string optimization (SSO).

## Basic use case 0: iterating over data that is initially not in cache

```
IterateVecIntsCacheMiss_20_0_15                           449.93ns     2.22M
IterateTapeIntsCacheMiss_20_0_15                          352.49ns     2.84M
IterateVecStringsCacheMiss_20_0_15                        262.35ns     3.81M
IterateTapeStringsCacheMiss_20_0_15                       222.20ns     4.50M
```

Tape is more cache friendly: its data is tightly packed and its access pattern is predictable.
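
To illustrate why a packed layout iterates well, here is a minimal sketch, assuming a tape is stored as one flat element buffer plus a vector of record boundaries. `FlatRecords`, `markers`, and `sum_all` are hypothetical names used only for illustration; this is not folly's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration type, not folly::tape itself.
struct FlatRecords {
  std::vector<int> data;                     // all elements, back to back
  std::vector<std::size_t> markers = {0};    // record i is [markers[i], markers[i + 1])

  std::size_t size() const { return markers.size() - 1; }

  // Append a whole record: elements go to the flat buffer, then one boundary.
  void push_back(const std::vector<int>& record) {
    data.insert(data.end(), record.begin(), record.end());
    markers.push_back(data.size());
  }

  const int* record_begin(std::size_t i) const {
    assert(i < size());
    return data.data() + markers[i];
  }
  const int* record_end(std::size_t i) const {
    assert(i < size());
    return data.data() + markers[i + 1];
  }
};

// Visits every element of every record. Memory is read front to back,
// with no pointer chasing into separately allocated inner vectors.
long long sum_all(const FlatRecords& t) {
  long long total = 0;
  for (std::size_t i = 0; i != t.size(); ++i) {
    for (const int* p = t.record_begin(i); p != t.record_end(i); ++p) {
      total += *p;
    }
  }
  return total;
}
```

With `vector<vector<int>>`, each inner vector owns a separate heap block, so a cold iteration may miss the cache once per record. In the flat layout the records are adjacent, which is a plausible explanation for the gap in the numbers above.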

## Basic use case 1: building `vector<vector>` by pushing one element at a time, without reserve
```
PushBackVecInts_20_0_15                                     1.28us   782.19K
PushBackTapeInts_20_0_15                                  317.49ns     3.15M
```
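
The benchmark builds each tape record through `record_builder()` followed by `push_back` calls. Below is a minimal sketch of how such a builder could work, assuming the record is committed when the builder goes out of scope (the benchmark loop never commits explicitly, which suggests this). `MiniTape`, `RecordBuilder`, and `build_example` are hypothetical names; folly's real builder has more functionality.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mini-tape of ints, for illustration only.
class MiniTape {
 public:
  class RecordBuilder {
   public:
    explicit RecordBuilder(MiniTape* tape) : tape_(tape) {}
    RecordBuilder(RecordBuilder&& other) noexcept : tape_(other.tape_) {
      other.tape_ = nullptr;  // the moved-from builder must not commit
    }
    RecordBuilder(const RecordBuilder&) = delete;
    RecordBuilder& operator=(const RecordBuilder&) = delete;

    // Assumption: leaving scope commits the record by writing one boundary.
    ~RecordBuilder() {
      if (tape_ != nullptr) {
        tape_->markers_.push_back(tape_->data_.size());
      }
    }

    // Elements go straight into the shared flat buffer.
    void push_back(int x) {
      assert(tape_ != nullptr);
      tape_->data_.push_back(x);
    }

   private:
    MiniTape* tape_;
  };

  RecordBuilder record_builder() { return RecordBuilder(this); }

  std::size_t size() const { return markers_.size() - 1; }
  std::size_t record_size(std::size_t i) const {
    assert(i < size());
    return markers_[i + 1] - markers_[i];
  }

 private:
  std::vector<int> data_;
  std::vector<std::size_t> markers_{0};
};

// Usage: one builder per record, one push_back per element.
MiniTape build_example() {
  MiniTape t;
  {
    auto b = t.record_builder();
    b.push_back(1);
    b.push_back(2);
  }
  { auto b = t.record_builder(); }  // an empty record
  return t;
}
```

In this sketch every element lands in the same growing buffer, so the amortized growth cost is paid once for the whole container. With `vector<vector<int>>`, each inner vector grows and reallocates on its own.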

## Worst case - constructing from a known range of SSO strings

Constructing a vector of strings (20 strings). When everything fits in SSO, tape offers no benefit. However, once the strings have to allocate, tape wins clearly (roughly 2x faster in the numbers below).

```
ConstructVecSmallStrings_20_0_15                          122.31ns     8.18M
ConstructTapeSmallStrings_20_0_15                         129.04ns     7.75M
ConstructVecLargeStrings_20_16_32                         319.24ns     3.13M
ConstructTapeLargeStrings_20_16_32                        147.52ns     6.78M
```
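
One way to see why allocation-heavy inputs favor a flat layout is a two-pass construction: measure the total size first, reserve once, then copy. The sketch below assumes forward iterators over `std::string`; `MiniStringTape` and its members are hypothetical names and not folly's `string_tape`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat string container, for illustration only.
struct MiniStringTape {
  std::string data;                         // all characters, back to back
  std::vector<std::size_t> markers = {0};   // string i is [markers[i], markers[i + 1])

  template <typename It>
  MiniStringTape(It first, It last) {
    // Pass 1: count the records and the total number of characters.
    std::size_t records = 0;
    std::size_t flat = 0;
    for (It it = first; it != last; ++it) {
      ++records;
      flat += it->size();
    }
    // One reservation per buffer, regardless of how many strings there are.
    data.reserve(flat);
    markers.reserve(records + 1);
    // Pass 2: copy the characters and record each boundary.
    for (It it = first; it != last; ++it) {
      data.append(it->data(), it->size());
      markers.push_back(data.size());
    }
  }

  std::size_t size() const { return markers.size() - 1; }

  std::string at(std::size_t i) const {
    assert(i < size());
    return data.substr(markers[i], markers[i + 1] - markers[i]);
  }
};
```

Under these assumptions, construction performs two reservations in total, whereas `vector<string>` performs up to one allocation per string that does not fit in SSO. For SSO-only inputs the per-string allocations disappear, which matches the near tie in the small-string rows.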

Worst-case scenario: 2 small strings. The overhead of tape makes this slower. We could tackle this, but it would require very complicated code, and so far this use case has not come up.
```
ConstructVec2SmallStrings                                  20.95ns    47.74M
ConstructTape2SmallStrings                                 38.83ns    25.75M
```

Reviewed By: dmm-fb

Differential Revision: D52260714

fbshipit-source-id: a90e01bc589646a38f2ca22c1b9589b6cd4c7f85
DenisYaroshevskiy authored and facebook-github-bot committed Jan 2, 2024
1 parent 19cc500 commit b4ff536
Showing 7 changed files with 1,365 additions and 5 deletions.
11 changes: 11 additions & 0 deletions folly/Portability.h
@@ -27,6 +27,11 @@
#define FOLLY_CPLUSPLUS __cplusplus
#endif

// On MSVC an incorrect <version> header gets picked up
#if !defined(_MSC_VER) && __has_include(<version>)
#include <version>
#endif

static_assert(FOLLY_CPLUSPLUS >= 201402L, "__cplusplus >= 201402L");

#if defined(__GNUC__) && !defined(__clang__)
@@ -607,3 +612,9 @@ constexpr auto kCpplibVer = 0;
#else
#define FOLLY_HAS_DEDUCTION_GUIDES 0
#endif
// C++20 ranges
#if defined(__cpp_lib_ranges)
#define FOLLY_HAS_RANGES 1
#else
#define FOLLY_HAS_RANGES 0
#endif
174 changes: 174 additions & 0 deletions folly/container/detail/tape_detail.h
@@ -0,0 +1,174 @@
/*
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include <folly/Portability.h>
#include <folly/Range.h>
#include <folly/container/Iterator.h>
#include <folly/lang/Hint.h>
#include <folly/memory/UninitializedMemoryHacks.h>

#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

#if FOLLY_HAS_STRING_VIEW
#include <string_view>
#endif

#if FOLLY_HAS_RANGES
#include <ranges>
#endif

namespace folly {
namespace detail {

template <template <typename...> class templ, typename T>
struct instance_of : std::false_type {};

template <template <typename...> class templ, typename... Args>
struct instance_of<templ, templ<Args...>> : std::true_type {};

template <template <typename...> class templ, typename T>
constexpr bool instance_of_v = instance_of<templ, T>::value;

#if FOLLY_HAS_RANGES
template <typename R>
constexpr bool guaranteed_contigious_range_cpp20_v =
std::ranges::contiguous_range<R>;
#endif

template <typename R>
constexpr bool guaranteed_contigious_range_cpp14_v =
instance_of_v<std::vector, R> || instance_of_v<std::basic_string, R> ||
std::is_pointer_v<typename R::iterator>
#if FOLLY_HAS_STRING_VIEW
|| instance_of_v<std::basic_string_view, R>
#endif
;

#if FOLLY_HAS_RANGES

template <typename R>
constexpr bool guaranteed_contigious_range =
guaranteed_contigious_range_cpp20_v<R>;

#else

template <typename R>
constexpr bool guaranteed_contigious_range =
guaranteed_contigious_range_cpp14_v<R>;

#endif

template <typename Container, bool = guaranteed_contigious_range<Container>>
struct tape_reference_traits {
using iterator = typename Container::const_iterator;
using reference = Range<iterator>;

static reference make(iterator f, iterator l) { return reference{f, l}; }
};

template <typename Container>
struct tape_reference_traits<Container, true> {
using iterator = typename Container::const_iterator;
using value_type = typename std::iterator_traits<iterator>::value_type;
using reference = Range<const value_type*>;

static auto* get_address(iterator it) {
// std::to_address is only available since C++20
if constexpr (std::is_pointer_v<iterator>) {
return it;
} else {
return it.operator->();
}
}

static reference make(iterator f, iterator l) {
return reference{get_address(f), get_address(l)};
}
};

template <typename R>
using get_range_const_iterator_t =
decltype(std::cbegin(std::declval<const R&>()));

struct fake_type {};

template <typename R>
using maybe_range_const_iterator_t =
detected_or_t<fake_type*, get_range_const_iterator_t, R>;

template <typename R>
using maybe_range_value_t =
iterator_value_type_t<maybe_range_const_iterator_t<R>>;

// This is a big function to inline but it's used inside a big function too
template <typename I, typename S>
auto compute_total_tape_len_if_possible(I f, S l) {
using successs = std::pair<std::size_t, std::size_t>;
using failure = fake_type;
if constexpr (!iterator_category_matches_v<I, std::forward_iterator_tag>) {
return failure{};
} else if constexpr (std::is_convertible_v<
iterator_value_type_t<I>,
folly::StringPiece>) {
std::size_t records_size = 0U;
std::size_t flat_size = 0U;

for (I i = f; i != l; ++i) {
++records_size;
flat_size += folly::StringPiece(*i).size();
}
return successs{records_size, flat_size};
} else if constexpr (!range_has_known_distance_v<iterator_value_type_t<I>>) {
return failure{};
} else {
std::size_t records_size = 0U;
std::size_t flat_size = 0U;

for (I i = f; i != l; ++i) {
++records_size;
flat_size +=
static_cast<std::size_t>(std::distance(std::begin(*i), std::end(*i)));
}
return successs{records_size, flat_size};
}
}

template <typename Container, typename I, typename S>
void append_range_unsafe(Container& c, I f, S l) {
if constexpr (
!iterator_category_matches_v<I, std::random_access_iterator_tag> ||
!std::is_trivially_copy_constructible_v<iterator_value_type_t<I>> ||
!(instance_of_v<std::vector, Container> ||
instance_of_v<std::basic_string, Container>)) {
c.insert(c.end(), f, l);
} else {
folly::compiler_may_unsafely_assume(l >= f);
auto old_size = c.size();
detail::unsafeVectorSetLargerSize(c, c.size() + (l - f));
std::copy(f, l, c.begin() + old_size);
}
}

} // namespace detail
} // namespace folly
506 changes: 506 additions & 0 deletions folly/container/tape.h

Large diffs are not rendered by default.

199 changes: 199 additions & 0 deletions folly/container/test/tape_bench.cpp
@@ -0,0 +1,199 @@
/*
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <folly/container/tape.h>

#include <algorithm>
#include <random>
#include <string>
#include <vector>
#include <folly/Benchmark.h>
#include <folly/init/Init.h>

namespace {

using vec_str = std::vector<std::string>;
using st_tape = folly::string_tape;
using vv_int = std::vector<std::vector<int>>;
using tv_int = folly::tape<std::vector<int>>;

template <typename Cont>
struct ContGenerator {
ContGenerator(std::size_t minLen, std::size_t maxLen)
: g(minLen + maxLen), len(minLen, maxLen) {}

const Cont& operator()() {
buf.resize(len(g));
return buf; // we don't care for contents
}

std::mt19937 g;
std::uniform_int_distribution<std::size_t> len;

Cont buf;
};

template <typename Cont, std::size_t n, std::size_t minLen, std::size_t maxLen>
const std::vector<Cont>& generateContainers() {
static const std::vector<Cont> res = [] {
ContGenerator<Cont> gen{minLen, maxLen};
std::vector<Cont> r;
r.reserve(n);
for (std::size_t i = 0; i != n; ++i) {
r.push_back(gen());
}
return r;
}();

return res;
}

constexpr std::size_t kContainerCopies = 100'000;

template <
typename ContainerType,
typename RecordType,
std::size_t n,
std::size_t minLen,
std::size_t maxLen>
const std::vector<ContainerType>& generateContainerCopies() {
static const std::vector<ContainerType> res = [] {
const auto& input = generateContainers<RecordType, n, minLen, maxLen>();
const ContainerType sample(input.begin(), input.end());

std::vector<ContainerType> copies(kContainerCopies, sample);
std::shuffle(copies.begin(), copies.end(), std::mt19937{minLen + maxLen});

return copies;
}();
return res;
}

template <
typename StringContainer,
std::size_t n,
std::size_t minLen,
std::size_t maxLen>
void constructorStrings(std::size_t iters) {
const auto& strings = generateContainers<std::string, n, minLen, maxLen>();

while (iters--) {
StringContainer cont{strings.begin(), strings.end()};
folly::doNotOptimizeAway(cont);
}
}

template <typename T>
void pushBackByOne(
std::vector<std::vector<T>>& cont, const std::vector<std::vector<T>>& in) {
for (const auto& v : in) {
cont.emplace_back();
for (const auto& x : v) {
cont.back().push_back(x);
}
}
}

template <typename T>
void pushBackByOne(
folly::tape<std::vector<T>>& cont, const std::vector<std::vector<T>>& in) {
for (const auto& v : in) {
auto builder = cont.record_builder();
for (const auto& x : v) {
builder.push_back(x);
}
}
}

template <
typename Container,
std::size_t n,
std::size_t minLen,
std::size_t maxLen>
void pushBackInts(std::size_t iters) {
const auto& vecs = generateContainers<std::vector<int>, n, minLen, maxLen>();

while (iters--) {
Container r;
pushBackByOne(r, vecs);
folly::doNotOptimizeAway(r);
}
}

template <typename Container>
void iterateBenchImpl(const std::vector<Container>& copies, std::size_t iters) {
std::size_t i = 0;

while (iters--) {
for (const auto& v : copies[i]) {
for (const auto& x : v) {
auto xCopy = x;
folly::doNotOptimizeAway(x);
folly::doNotOptimizeAway(xCopy);
}
}
++i;
i %= copies.size();
}
}

template <
typename Container,
std::size_t n,
std::size_t minLen,
std::size_t maxLen>
void iterateInts(std::size_t iters) {
const auto& copies =
generateContainerCopies<Container, std::vector<int>, n, minLen, maxLen>();
iterateBenchImpl(copies, iters);
}

template <
typename Container,
std::size_t n,
std::size_t minLen,
std::size_t maxLen>
void iterateStrings(std::size_t iters) {
const auto& copies =
generateContainerCopies<Container, std::string, n, minLen, maxLen>();
iterateBenchImpl(copies, iters);
}

// Disabling clang format for table formatting
// clang-format off
BENCHMARK(IterateVecIntsCacheMiss_20_0_15, n) { iterateInts <vv_int, 20, 0, 15>(n); }
BENCHMARK(IterateTapeIntsCacheMiss_20_0_15, n) { iterateInts <tv_int, 20, 0, 15>(n); }
BENCHMARK(IterateVecStringsCacheMiss_20_0_15, n) { iterateStrings<vec_str, 20, 0, 15>(n); }
BENCHMARK(IterateTapeStringsCacheMiss_20_0_15, n) { iterateStrings<st_tape, 20, 0, 15>(n); }
BENCHMARK_DRAW_LINE();
BENCHMARK(PushBackVecInts_20_0_15, n) { pushBackInts<vv_int, 20, 0, 15>(n); }
BENCHMARK(PushBackTapeInts_20_0_15, n) { pushBackInts<tv_int, 20, 0, 15>(n); }
BENCHMARK_DRAW_LINE();
BENCHMARK(ConstructVecSmallStrings_20_0_15, n) { constructorStrings<vec_str, 20, 0, 15>(n); }
BENCHMARK(ConstructTapeSmallStrings_20_0_15, n) { constructorStrings<st_tape, 20, 0, 15>(n); }
BENCHMARK(ConstructVecLargeStrings_20_16_32, n) { constructorStrings<vec_str, 20, 16, 32>(n); }
BENCHMARK(ConstructTapeLargeStrings_20_16_32, n) { constructorStrings<st_tape, 20, 16, 32>(n); }
BENCHMARK_DRAW_LINE();
BENCHMARK(ConstructVec2SmallStrings, n) { constructorStrings<vec_str, 2, 0, 15>(n); }
BENCHMARK(ConstructTape2SmallStrings, n) { constructorStrings<st_tape, 2, 0, 15>(n); }
// clang-format on

} // namespace

int main(int argc, char** argv) {
folly::Init init(&argc, &argv);
folly::runBenchmarks();
return 0;
}
467 changes: 467 additions & 0 deletions folly/container/test/tape_test.cpp

Large diffs are not rendered by default.

5 changes: 0 additions & 5 deletions folly/portability/Constexpr.h
@@ -23,11 +23,6 @@
#include <cstring>
#include <type_traits>

// On MSVC an incorrect <version> header gets picked up
#if !defined(_MSC_VER) && __has_include(<version>)
#include <version>
#endif

namespace folly {

namespace detail {
8 changes: 8 additions & 0 deletions folly/test/PortabilityTest.cpp
@@ -20,6 +20,10 @@
#include <string_view> // @manual
#endif

#if FOLLY_HAS_RANGES
#include <ranges>
#endif

#include <memory>

#include <folly/portability/GTest.h>
@@ -50,3 +54,7 @@ TEST(Portability, Final) {
EXPECT_EQ(3, fooBase(p.get()));
EXPECT_EQ(3, fooDerived(p.get()));
}

#if FOLLY_HAS_RANGES
static_assert(std::ranges::random_access_range<std::vector<int>>);
#endif
