Showing posts with label Recursive lists. Show all posts
Showing posts with label Recursive lists. Show all posts

Sunday, April 3, 2016

C++ : Streams

In this blog post, types and functions were presented in OCaml for modeling streams. This post takes the action to C++.

First, the type definition for a stream.

struct Nil {};
template <class T> class Cons;

template <class T>
using stream = sum_type <
    Nil
  , recursive_wrapper<Cons<T>>
>;
The definition is in terms of the sum_type<> type from the "pretty good sum" library talked about here.

The definition of Cons<>, will be in terms of "thunks" (suspensions). They're modeled as procedures that when evaluated, compute streams.

template <class T>
using stream_thunk = std::function<stream<T>()>;
To complete the abstraction, a function that given a suspension, "thaws" it.
template <class T> inline 
stream<T> force (stream_thunk<T> const& s) { 
  return s (); 
}

The above choices made, here is the definition for Cons<>.

template <class T>
class Cons {
public:
  using value_type = T;
  using reference = value_type&;
  using const_reference = value_type const&;
  using stream_type = stream<value_type>;

private:
  using stream_thunk_type = stream_thunk<value_type>;

public:
  template <class U, class V>
  Cons (U&& h, V&& t) : 
    h {std::forward<U> (h)}, t {std::forward<V> (t)}
  {}

  const_reference hd () const { return h; }
  stream_type tl () const { return force (t); }

private:
  value_type h;
  stream_thunk_type t;
};

Next, utility functions for working with streams.

The function hd () gets the head of a stream and tl () gets the stream that remains when the head is stripped off.

template <class T>
T const hd (stream<T> const& s) {
  return s.template match<T const&> (
      [](Cons<T> const& l) -> T const& { return l.hd (); }
    , [](otherwise) -> T const & { throw std::runtime_error { "hd" }; }
  );
}

template <class T>
stream<T> tl (stream<T> const& l) {
  return l.template match <stream<T>> (
    [] (Cons<T> const& s) -> stream <T> { return s.tl (); }
  , [] (otherwise) -> stream<T> { throw std::runtime_error{"tl"}; }
  );
}

The function take () returns the the first $n$ values of a stream.

template <class T, class D>
D take (unsigned int n, stream <T> const& s, D dst) {
  return (n == 0) ? dst :
    s.template match<D>(
       [&](Nil const& _) -> D { return  dst; },
       [&](Cons<T> const& l) -> D { 
         return take (n - 1, l.tl (), *dst++ = l.hd ()); }
    );
}

It's time to share a little "hack" I picked up for writing infinite lists.

  • To start, forget about streams;
  • Write your list using regular lists;
  • Ignore the fact that it won't terminate;
  • Rewrite in terms of Cons and convert the tail to a thunk.

For example, in OCaml the (non-terminating!) code

  let naturals = 
    let rec loop x = x :: loop (x + 1) in
  next 0
leads to this definition of the stream of natural numbers.
let naturals =
 let rec loop x = Cons (x, lazy (loop (x + 1))) in
loop 0

Putting the above to work, a generator for the stream of natural numbers can be written like this.

class natural_numbers_gen {
private:
  using int_stream = stream<int>;
    
private:
  int start;

private:
  int_stream from (int x) const {
    return int_stream{
      constructor<Cons<int>>{}, x, [=]() { return this->from (x + 1); }
    };
  }
  
public:
  explicit natural_numbers_gen (int start) : start (start) 
  {}

  explicit operator int_stream() const { return from (start); }
};
The first $10$ (say) natural numbers can then be harvested like this.
std::vector<int> s;
take (10, stream<int> (natural_numbers_gen{0}), std::back_inserter (s));

The last example, a generator of the Fibonacci sequence. Applying the hack, start with the following OCaml code.

  let fibonacci_numbers = 
    let rec fib a b = a :: fib b (a + b) in
    fib 0 1
The rewrite of this code into streams then leads to this definition.
let fibonnaci_sequence = 
  let rec fib a b = Cons (a, lazy (fib b (a + b))) in
fib 0 1
Finally, casting the above function into C++ yields the following.
class fibonacci_numbers_gen {
private:
  using int_stream = stream<int>;
    
private:
  int start;

private:
  int_stream loop (int a, int b) const {
    return int_stream{
      constructor<Cons<int>>{}, a, [=]() {return this->loop (b, a + b); }
    };
  }
    
public:
  explicit fibonacci_numbers_gen () 
  {}

  explicit operator int_stream() const { return loop (0, 1); }
  };

Saturday, February 7, 2015

Recursive lists in C++

Earlier this week, I had a need for a recursive list, that is, a list defined in terms of itself. I think, "back in the day" implementing a data structure of that sort would have been a snap for the everyday C programmer. Today, in this modern C++ world I found myself struggling a little and came to think that maybe the old ways are fading :)

For motivation, here's a couple of examples of the sort of thing I'm talking about.

(1) The list [0; 1; 0; 1; 0; 1; ...] is a list with a cycle in it. In OCaml you'd write that as let rec l = 0 :: 1 :: l.

(2) An interpreter using the technique of environments and closures can require an environment ((string * value) list) to contain a closure where the closure contains the environment. In OCaml you'd write let rec vars = (tag, V_closure (vars, xpr)) :: !env; env := vars.

Of course with pointers, it's not hard to implement recursive structures in C++. The trouble is having to concern yourself with their memory management due to the absence of garbage collection.

Alright, here is what I came up with. The code is pretty short.

#include <boost/variant.hpp>

#include <memory>
#include <stdexcept>

template <class T> struct node;
template <class T> using node_ptr=typename node<T>::node_ptr;
template <class T> using node_weak_ptr=typename node<T>::weak_ptr;
template <class T> using node_shared_ptr=typename node<T>::shared_ptr;
template <class T> struct ptr_t;
template <class T> using list=ptr_t<node<T>>;
template <class T> using list_ref=node_weak_ptr<T>;

template <class T> list<T> nil (); 
template <class T> bool empty (list<T> l);
template <class T> list<T> cons (T val, list<T> l);
template <class T> T& hd (list<T> l);
template <class T> list<T>& tl (list<T> l);
template <class T> list_ref<T> ref (list<T> src);
template <class T> bool is_ref (list<T> src);

The idea behind the implementation is generalize a pointer to node as a union with two variants, a shared pointer or a weak pointer.

template <class T> struct ptr_t :
  boost::variant <std::shared_ptr<T>, std::weak_ptr<T>> {
  typedef boost::variant <std::shared_ptr<T>, std::weak_ptr<T>> base;
  ptr_t () {}
  ptr_t (std::weak_ptr<T> p) : base (p) {}
  ptr_t (std::shared_ptr<T> p) : base (p) {}
};

template <class T>
struct node {
  typedef ptr_t<node> node_ptr;
  typedef std::weak_ptr<node> weak_ptr;
  typedef std::shared_ptr<node> shared_ptr;

  T data;
  node_ptr next;
};

This little bit of implementation detail comes up a couple of times so it's handy to factor it out.

namespace {
  //'get' at the raw pointer in the union of a smart/weak pointer
  template <class T>
  T* get (ptr_t<T> l) {
    if (std::shared_ptr<T>* p=
        boost::get<std::shared_ptr<T>>(&l)) {
      return p->get ();
    }
    return boost::get<std::weak_ptr<T>>(l).lock ().get ();
  }
}//namespace<anonymous>

The rest of the implementation is basically a set of "one-liners".

template <class T> list<T> nil (){ 
  return node_shared_ptr<T> (); 
}

template <class T> bool empty (list<T> l) { 
  return (get (l)) == nullptr; 
}

template <class T> list<T> cons (T val, list<T> l) {
  return node_shared_ptr<T> (new node<T>{val, l});
}

template <class T> T& hd (list<T> l) {   
  if (empty (l))
    throw std::runtime_error ("hd");
  return get (l) -> data;
}

template <class T> list<T>& tl (list<T> l) { 
  if (empty (l))
    throw std::runtime_error ("tl");
  return get (l) -> next; 
}

template <class T> bool is_ref (list<T> src) {
  return boost::get<list_ref<T>>(&src)!=nullptr;
}

template <class T> node_weak_ptr<T> ref (list<T> src)  {
  return node_weak_ptr<T>(boost::get<node_shared_ptr<T>>(src));
}

OK, well, that's about it. Let's see, regarding usage, (1) could be expressed like this

list<int> l = cons (0, cons (1, nil<int> ())); tl (tl (l)) = ref (l);
or, if we assume the existence of a 'last' function with an obvious definition, could be tidied up to read
list<int> l = cons (0, cons (1, nil<int> ())); tl (last (l)) = ref (l);
and (2) can be stated like this
typedef std::pair<std::string, value_t> p_t;
list<p_t> vars = node_shared_ptr<p_t>(new node<p_t>);
hd (vars) = std::make_pair (tag, V_closure {ref (vars), xpr});
tl (vars) = *env; 
*env = vars;