data-findcycle-0.1.1.0: Find cycles in periodic functions (and lists)
Copyright(c) 2025 Florian Ragwitz
LicenseMIT
Safe HaskellSafe-Inferred
LanguageHaskell2010

Data.FindCycle

Description

Any function f :: a -> a where the type a has finitely many values eventually has to be cyclic when iterated from some initial a.

This module provides a number of common algorithms and utilities to identify and work with such cycles.

Synopsis

Typical Usage

The value of iterating someCyclicFunc for \(10^{100}\) times from startingValue, using the brent algorithm for cycle detection:

let fastCyclicFunc = cycleExp brent someCyclicFunc startingValue
fastCyclicFunc (10^100)

The length of the non-repeating prefix and the length of the cycle, as determined using the nivash algorithm:

let (mu, lambda) = findCycle nivash someCyclicFunc startingValue

The same two lengths, plus two lists containing the values of the prefix and cyclic parts of the sequence using the naiveOrd algorithm:

let (mu, lambda, (pre, cyc)) = findCycleExtract naiveOrd someCyclicFunc startingValue

When you already have a list of values created by iterating a cyclic function:

let xs = iterate someCyclicFunc startingValue
let (mu, lambda, (pre, cyc)) = unsafeFindCycleFromList brent xs

CycleFinder type

data CycleFinder a Source #

An algorithm to find the cycle in a function from a to a (or a list of as).

Algorithms

Cycles are typically described with a pair \((\mu, \lambda)\), where \(\mu\) represents the length of the non-cyclic prefix of the sequence, and \(\lambda\) represents the length of the repeating cycle of the sequence.

The cycle finding algorithms provided by this module return such a pair as a result, but some might return an upper bound \(\tilde{\mu}\) instead of the minimal \(\mu\) in order to avoid the computational cost of finding the minimal value. This approximation is acceptable in many practical cases, such as when using cycleExp, which uses the cyclic behavior of a function to efficiently compute \(f^n(x)\) for large \(n\).

When a minimal \(\mu\) is needed, it can be computed from a CycleFinder returning a non-minimal \(\tilde{\mu}\) using minimalMu.

All algorithms always provide a minimal \(\lambda\) as opposed to a multiple of the true cycle length.

In practice, you'll usually want to use nivash, brent, or one of the naive variants. If performance matters and you're not sure what to choose, compare the alternatives by benchmarking for your usecase.

Naive

These algorithms use Map-like structures to store the index of the first occurrence of each value in the sequence until a duplicate is found.

They always produce the minimal \((\mu, \lambda)\).

They never iterate the sequence further than \(\mu + \lambda\) elements.

They never compute an element at a given position in the sequence more than once.

They use memory approximately proportional to \(\mu + \lambda\).

naiveHashable tends to perform slightly better and uses slightly less memory. Both are provided for completeness and for cases where you might not have a Hashable instance or don't want to write one.

naiveOrd :: Ord a => CycleFinder a Source #

Naive cycle finding algorithm using Map.

naiveHashable :: (Eq a, Hashable a) => CycleFinder a Source #

Naive cycle finding algorithm using HashMap.

Constant Memory

These algorithms use a constant amount of memory, at the cost of having to potentially evaluate values in the sequence more than once.

brent is always better than floyd, and the latter is only present for completeness and as a baseline for testing. You shouldn't use floyd.

They always compute a minimal \(\lambda\), but only an upper bound \(\tilde{\mu}\) for the cycle length. Combine with minimalMu if the minimal \(\mu\) is needed.

floyd :: Eq a => CycleFinder a Source #

Floyd's / Tortoise and Hare cycle finding algorithm.

Always worse than brent. Don't use this.

Memory/Time Compromise

nivash :: Ord a => CycleFinder a Source #

Nivash's cycle finding algorithm.

Never computes an element at a given position in the sequence more than once.

Might use memory proportional to \(\mu + \lambda\) in the worst case of an ascending sequence, but commonly uses much less for reasonably "random" sequences.

Can often be faster than brent while not using nearly as much memory as naiveOrd or naiveHashable.

nivashPart :: (Ix k, Bounded k, Ord a) => (a -> k) -> CycleFinder a Source #

Like nivash, but uses multiple independent stacks as determined by a partitioning function.

This trades off some additional memory usage for the ability to detect cycles earlier in the sequence.

Using \(k\) stacks, the cycle will be identified after, on average, a fraction of \(\dfrac{1}{k+1}\) of the cycle length. E.g 50% above the absolute minimum achievable for \(k=1\) (nivash without partitioning), or 1% above that minimum for \(k=99\).

The entire range of k will be used for partitioning, with each value from minBound to maxBound having its own partition. Use a type appropriate for the number of partitions you'd like to use. Use nivashPartWithinBounds if you'd like to explicitly use a continuous subrange of k instead.

>>> import Data.Word (Word8)
>>> let alg256 = nivashPart (fromIntegral :: Integer -> Word8)
>>> import Data.Finite (modulo) -- >= 0.2 for the Ix instance
>>> let alg100 = nivashPart (modulo @100)

nivashPartWithinBounds Source #

Arguments

:: (Ix k, Ord a) 
=> (k, k)

the lower and upper bound (inclusive), respectively, of the partition keys returned by the partition function

-> (a -> k)

the partition function. Returning values outside of the specified bounds will cause run-time errors.

-> CycleFinder a 

Like nivashPart, but allows for a specific continuous subrange of k to be used for partitioning rather than creating a partition for each possible value of k.

>>> let alg100 = nivashPartWithinBounds (0, 99) (`mod` 100)

Running algorithms

findCycle :: CycleFinder a -> (a -> a) -> a -> (Int, Int) Source #

Runs a CycleFinder algorithm for the given function and starting value, returning a pair \((\mu, \lambda)\) representing the length of the non-cyclic prefix and the length of the cycle of the sequence.

findCycleExtract :: CycleFinder a -> (a -> a) -> a -> (Int, Int, ([a], [a])) Source #

Like findCycle, but also returns a third value (pre, cyc) such that

pre ++ cycle cyc == iterate f x

In addition to extracting the prefix and cyclic part of the list, this can also be used to cache some function calls to f which the specified CycleFinder might make, as the results of all calls to f in the sequence are memorised in a lazy list which is later used to extract pre and cyc.

If you're only interested in caching calls to f but don't need the two parts of the list, just don't evaluate the last part of the return value to not pay the cost of those parts being computed.

cycleExp :: CycleFinder a -> (a -> a) -> a -> Integer -> a Source #

Constructs an efficient evaluator for a cyclic function by "exponentiating" it. Given a CycleFinder for a function f and an initial value x, it returns a function of type Integer -> a which computes the nth iterate (i.e. the value of \(f^n(x)\)).

Using the pair \((\mu, \lambda)\) obtained by the CycleFinder, this function computes

[ f^n(x) = begin{cases} f^n(x) & text{if } n < mu, f^{mu + ((n - mu) bmod lambda)}(x) & text{if } n ge mu.

end{cases} ]

which allows \(f^n(x)\) to be computed for very large \(n\) without requiring \(n\) function applications.

Note that this function will use a lazy list generated by iterate f x. This list will only be evaluated up to \(\mu + \lambda\) elements and is shared between the cycle finding phase and the computation of the value after n iterations, but might still require a significant amount of memory. Use cycleExp' if you'd rather re-evaluate f many more times but use less memory at the expense of more time.

The lazy list might also be evaluated further than \(\mu + \lambda\) depending on the cycle finding algorithm chosen (brent, floyd).

>>> -- cycle μ=1, λ=83333 when starting from 23
>>> let f :: Integer -> Integer; f x = x^(42 :: Int) `mod` 1000003
>>> 
>>> g = cycleExp nivash f 23
>>> g 0 -- after 0 iterations
23
>>> -- after a googol iterations, but finishes in less than the current
>>> -- age of the universe
>>> g (10^(100 :: Int))
671872

cycleExp' :: CycleFinder a -> (a -> a) -> a -> Integer -> a Source #

Like cycleExp, but doesn't cache. Probably not very useful in practice.

unsafeFindCycleFromList :: CycleFinder a -> [a] -> (Int, Int, ([a], [a])) Source #

Runs the CycleFinder for a given input list.

This function is provided as a convenience for when you already have a list of values you'd like to find a cycle in. It's referred to as "unsafe", because it might lead to surprising results when the input doesn't satisfy the invariants that different algorithms assume.

All algorithms assume that the sequence they're searching can be constructed by repeated function application from a starting value. Many sequences can't be, such as [1,2,1,3,1,4,1,5,...] (because there can only be one unique successor of 1).

Algorithms also assume the input sequence to be infinite, and they will commonly consume more than \(\mu + \lambda\) (or \(\tilde{\mu} + \lambda\)) elements from it. If you provide a finite input list, cycles might not be identified correctly if the chosen algorithm runs into the end of it, even though the input does technically contain an identifiable cycle.

If an assumption is violated, algorithms might wrongly identify cycles or never terminate. Try to stick to findCycle, findCycleExtract, cycleExp, or cycleExp' if possible, or only pass infinite lists generated via iterate f x (or equivalent) to unsafeFindCycleFromList.

Similar to findCycleExtract, just don't evaluate the last part of the return value if you don't need it and want to avoid the cost of computing it.

Utilities

minimalMu :: Eq a => CycleFinder a -> CycleFinder a Source #

Compute a minimal result \((\mu, \lambda)\) from a partial result \((\tilde{\mu}, \lambda)\).

This involves re-traversing the sequence from the start and from \(\lambda\) which might be expensive for large \(\mu\). This should largely be negligible if you're running the CycleFinder using any of the functions which cache the sequence of values (any but findCycle and cycleExp').