Sunday, June 09, 2013

More (an efficient circle elimination algorithm for arbitrary intervals)


All posts in this series:

  • One Potato: A description of the problem
  • Two potato: The basic equations
  • Three potato: An algebraic formula for the last person left
  • Four: A neat calculation method using binary numbers
  • Five potato: F# functions to check the various formulae
  • Six potato: Using a calculation graph to see the pattern
  • Seven potato: Generalizing to other intervals
  • More: An efficient algorithm for arbitrary intervals
  • Online research: In which I discover that it is called the Josephus problem

Introduction

In my previous post I generated a recursive equation for calculating the last person left in a circle if every $k^{\text{th}}$ person is removed from the circle.

I also wrote an F# program which highlighted a weakness of this algorithm. For circles with more than 60 000 people, the F# algorithm would experience a stack overflow.

In this post I am going to develop a more efficient algorithm to address this flaw.

Warning: Lots of algebra ahead!

The initial formatting of the Mathematical equations can be slow. If so, please be patient.


The recursive equation for arbitrary intervals

To recap, here is the main equation from the previous blog post: $$ f_k(n) = [f_k(n-1) + k - 1 ] \bmod n + 1 \tag{4} $$
[This is very much a continuation of the previous post, so I have gone with the same equation numbering].


Arranging the function outputs in rows

Recall the calculation graph from post 6 in the series:



Notice how $f(n)$ "clocks over" for the first value in each row.

Let's arrange the values of $f_k(n)$ into rows in a similar way. To do this I am going to consider the values of $n$ at which $f_k(n)$ clocks over. Every time this happens, a new row will be started.

Let $r$ be the index of a row in the graph.
Let $n_r$ be the first value of $n$ in row $r$.

Now consider the previous value of $n$: $n_r - 1$. Since the modulo operator clocks over for $n_r$, we can take the argument to this operator in equation 4, and we must have:
$$ \begin{align*} f_k(n_r - 1) + k - 1 &\ge n_r \\ \therefore f_k(n_r - 1) &\ge n_r - (k - 1)\\ \end{align*} $$
However, we also know - by the definition of $f_k(n)$ - that:
$$ f_k(n_r - 1) <= n_r - 1 $$
So it must be that:
$$ \begin{align*} f_k(n_r - 1) &= n_r - j & \text{for some } j \in \mathbb{N} \text{ such that } 1 \le j \le k-1 \tag{5} \end{align*} $$
From our definition of $n_r$ we know that $f_k(n)$ doesn't clock over for any $n \in \left\{n_{r-1}+1, n_{r-1} + 2, ..., n_r - 1\right\}$.

And because it doesn't clock over, we can remove the mod operator in equation 4, which simplifies to: $$ \begin{align*} f_k(n) &= [f_k(n-1) + k - 1 ] \bmod n + 1 & \text{from equation 4}\\ &= [f_k(n-1) + k - 1 ] + 1 \\ &= f_k(n-1) + k & \forall n \in \left\{n_{r-1}+1, n_{r-1} + 2, ..., n_r - 1\right\} \\ \\ \therefore f_k(n_{r-1} + i) &= f_k(n_{r-1}) + ik &\forall i \in \left\{0, 1, \ldots, n_r - n_{r-1}-1 \right\} \tag{6} \\ \\ \text{In particular:} \\ f_k(n_r - 1) &= f_k(n_{r-1}) + k.(n_r - n_{r-1} - 1) \tag{7} \\ \end{align*} $$
Equations 5 and 7 give us two different formulae for $f_k(n_r-1)$, so we can combine them: $$ \begin{align*} n_r - j &= f_k(n_{r-1}) + k.(n_r - n_{r-1} - 1)\\ \Leftrightarrow (k-1).n_r &= k.n_{r-1} + k - j - f_k(n_{r-1}) \tag{8} \end{align*} $$
First we are going to use equation 8 to determine the value of j.

We can re-arrange equation 8 as follows: $$ \begin{align*} j - 1 & = k.n_{r - 1} + k - 1 - f_k(n_{r-1}) - (k-1).n_r \\ & = [(k-1)+1].n_{r - 1} + (k-1) - f_k(n_{r-1}) - (k-1).n_r \\ & = n_{r-1} - f_k(n_{r-1}) + (k-1).[n_{r - 1} + 1 - n_r] \tag{9} \end{align*} $$
Why $j-1$ on the left side of the equation?

Recall from equation 5 that $1 \leq j \leq k-1 $. So $0 \leq j - 1 \leq k-2 < k-1$.

And hence $j - 1 = (j - 1) \bmod (k - 1)$. This allows us to simplify the right hand side of equation 9 by taking the "$\bmod (k-1)$" of both sides.

So: $$ \begin{align*} j - 1 & = [j - 1] \bmod (k - 1) \\ & = [n_{r-1} - f_k(n_{r-1}) + (k-1).[n_{r - 1} + 1 - n_r]] \bmod (k - 1) & \text{from equation 9} \\ & = [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) & \text{since multiples of }k - 1\text{ can be removed} \\ \\ \Rightarrow j & = [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) + 1 \tag{10} \end{align*} $$

Now that we have a formula for j, we can determine both $n_r$ (using equation 8) and $f_k(n_r)$ (using equations 4 and 5).


Let's determine $n_r$ first. Substituting equation 10 into equation 8 gives:

$$ \begin{align*} (k-1).n_r &= k.n_{r-1} + k - j - f_k(n_{r-1}) &\text{from equation 8} \\ &= k.n_{r-1} + k - ([n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) + 1) - f_k(n_{r-1}) &\text{from equation 10} \\ &= [(k-1) + 1].n_{r-1} + (k-1) - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) - f_k(n_{r-1}) \\ &= (k-1).n_{r-1} + n_{r-1} + (k-1) - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) - f_k(n_{r-1}) \\ &= (k-1)[n_{r-1} + 1] + n_{r-1} - f_k(n_{r-1}) - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) \\ \Rightarrow n_r &= n_{r-1} + 1 + \frac{n_{r-1} - f_k(n_{r-1}) - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) }{k-1} \\ \end{align*} $$ This rather complicated quotient can be simplified quite significantly by observing that: $$ \frac{m - m \bmod d}{d} = \lfloor \frac{m}{d} \rfloor $$
This follows by expressing $m$ as: $$ \begin{align*} m &= qd+r & \text{where }0 \leq r < d &\text{i.e. }r = m \bmod d \\ \Rightarrow q & = \frac{m - r}{d} \\ & = \frac{m - m \bmod d}{d} \\ \text{But: } q = \lfloor \frac{m}{d} \rfloor &\text{and the result follows} \end{align*} $$

So substituting $n_{r-1} - f_k(n_{r-1})$ for $m$ and $k-1$ for $d$ we obtain the following simplification:

$$ n_r = n_{r-1} + 1 + \lfloor \frac{n_{r-1} - f_k(n_{r-1}) }{ k - 1 } \rfloor \tag{11} $$


Now let's determine the value of $f_k(n_r)$:

$$ \begin{align*} f_k(n_r) &= [f_k(n_r - 1) + k - 1 ] \bmod n_r + 1 & \text{from equation 4} \\ &= [(n_r - j) + k - 1 ] \bmod n_r + 1 & \text{from equation 5} \\ &= [k - 1 - j] \bmod n_r + 1 \\ &= [k - 1 - ([n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) + 1)] \bmod n_r + 1 & \text{from equation 10} \\ &= [k - 2 - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1)] \bmod n_r + 1 \end{align*} $$
This is already sufficient. However we can simplify things further. When $n_r \ge k-1$, the $ \text{mod }n_r $ can be dropped. This gives the following set of formulae:

$$ \begin{align*} f_k(n_{1}) & = f(1) = 1 \tag{12} \\ f_k(n_r) & = k - 1 - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) & \text{ where } n_r \geq k - 1 \tag{13} \\ f_k(n_r) & = [k - 2 - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1)] \bmod n_r + 1 & \forall n_r \text{ where } r > 1 \tag{14} \\ \end{align*} $$


So equations 11 through to 14 provide formulae for:
  • $n_r$ in terms of $n_{r-1}$ and $f_k(n_{r-1})$
  • $f_k(n_r)$ in terms of $n_r$, $n_{r-1}$ and $f_k(n_{r-1})$
  • $n_1$ and $f_k(n_1)$

With these pieces in place we are ready to create a more efficient algorithm for calculating $f_k(n)$ for arbitrary n.


The row-based algorithm

Given a value $n$ for which we wish to calculate $f_k(n)$, we can't work backwards down to $f_k(1)$ as we have done in the past. The problem is that we don't know which row $r$ it falls within, or what $n_r$ and $f(n_r)$ are.

Instead we are going to have to work our way forwards to find the row that $n$ belongs in. We will start at $n_{1} = 1, f_k(1) = 1$. From this we will derive $n_{2}$ and $f(n_{2})$. We will continue generating this sequence of values for $n_r$ and $n_{r+1}$ until we find a value of $r$ for which $n_r \le n < n_{r+1}$. When we do, we can use equation 6 to determine $f_k(n)$: $$ f_k(n) = f_k(n_r) + k.(n-n_r) \tag{15} $$
While working our way through the rows we will keep track of the values for the next row: $n_{r+1}$ and $f_k(n_{r+1})$. This will allow us to easily calculate $n_{r+2}$ and $f_k(n_{r + 2})$.

Below is an F# function to calculate these values for the next row:

let getNextRowStartSizeAndLabel thisRowStartSize thisRowStartLabel interval =
    let floor 
        = int( System.Math.Floor( double(thisRowStartSize - thisRowStartLabel) 
               / double(interval - 1) ) 
             )
    let nextRowStartSize = thisRowStartSize + 1 + floor
    let nextRowStartLabel = 
        if nextRowStartSize < interval - 1 then
            ( interval - 2 
              - (thisRowStartSize - thisRowStartLabel) % (interval - 1) 
            ) % nextRowStartSize + 1
        else
            interval - 1 - (thisRowStartSize - thisRowStartLabel) % (interval - 1)
    (nextRowStartSize, nextRowStartLabel)

The variables map to the formula as follows:
  • thisRowStartSize $\mapsto n_r$
  • thisRowStartLabel $\mapsto f_k(n_r)$
  • interval $\mapsto k$
  • nextRowStartSize $\mapsto n_{r+1}$
  • nextRowStartLabel $\mapsto f_k(n_{r+1})$

The following F# functions use the function above to calculate $f_k(n)$:
let rec getLabelOfLastLeftInCircleWithIntervalByRow 
        thisRowStartSize thisRowStartLabel 
        nextRowStartSize nextRowStartLabel 
        interval sizeOfCircle =
    if (sizeOfCircle >= thisRowStartSize) && (sizeOfCircle < nextRowStartSize) then
        thisRowStartLabel + interval * (sizeOfCircle - thisRowStartSize)
    else
        let (newNextRowStartSize, newNextRowStartLabel) 
            = getNextRowStartSizeAndLabel nextRowStartSize nextRowStartLabel interval
        getLabelOfLastLeftInCircleWithIntervalByRow 
            nextRowStartSize nextRowStartLabel 
            newNextRowStartSize newNextRowStartLabel 
            interval sizeOfCircle

let calcLastLeftInCircleWithIntervalByRow interval sizeOfCircle =
    if sizeOfCircle < 1 then
        raise (System.ArgumentException("The size of the circle must be positive!"))
    if interval < 2 then
        raise (System.ArgumentException("The interval must be 2 or more!"))
    let (nextRowStartSize, nextRowStartLabel) = getNextRowStartSizeAndLabel 1 1 interval
    getLabelOfLastLeftInCircleWithIntervalByRow 
        1 1 nextRowStartSize nextRowStartLabel interval sizeOfCircle


The efficiency of the row-based algorithm

Look back at equation 11:
$$ n_r = n_{r-1} + 1 + \lfloor \frac{n_{r-1} - f_k(n_{r-1})}{k-1} \rfloor $$

Consider what happens as $n_{r-1}$ starts becoming large compared to $k$. $f_k(n_{r-1}) \le k$ (since a "clock-over" occurs at $n_{r-1}$), so this term will become very small. So as $n_{r-1}$ becomes large:
$$ n_r \approx (1 + \frac{1}{k-1}).n_{r-1} $$

So the $\left\{n_r\right\}$ series is approximately a geometric progression. Since its values are going to grow exponentially, the algorithm will rapidly find the row containing n, even for very large values of n. The size of the stack is going to scale as O(log n), so we are far less likely to reach a value of n which will cause stack overflows.

If we run the latest F# algorithm it is indeed blindingly fast:


It was able to handle some very large 32 bit numbers almost instantaneously.

I would expect it to fail when the next higher $n_r$ value is larger than the maximum 32 bit integer. It might be possible to work around this by inserting a "try catch" block to detect and recover from this situation, since we can still calculate the answer using the previous value of $n_r$.

I thought that another failure condition could be with very large values of $k$, since the exponential growth rate would be very small when $k$ is large. So I tested the algorithm with $k$ and $n$ both equal to a billion (one thousand million). The calculation took slightly under 32 seconds on my i5 laptop.


Displaying the rows in the row-based algorithm

Ideally we would like a closed form solution for arbitrary k. I tried looking for patterns in the rows for various values of k, but I couldn't see a consistent pattern.

For anyone who would like to try, or just if you're curious to see the row structure in action, here is an F# script to allow you to see all $(n, f_k(n))$ pairs arranged in rows. It depends on the function getNextRowStartSizeAndLabel defined earlier, so make sure that function has been defined in your F# interactive session:
let rec showLabelsOfLastLeftInCircleWithIntervalByRow 
        showFullRow thisRowStartSize thisRowStartLabel 
        nextRowStartSize nextRowStartLabel interval 
        sizeOfCircle maxSizeOfCircle =
    if (sizeOfCircle <= maxSizeOfCircle) then
        let result = thisRowStartLabel + interval * (sizeOfCircle - thisRowStartSize)
        if sizeOfCircle = thisRowStartSize then
            System.Console.WriteLine()
            System.Console.Write( "f({0})={1}; ", sizeOfCircle, result)
        elif showFullRow then
            System.Console.Write( "f({0})={1}; ", sizeOfCircle, result)
        
        if sizeOfCircle < nextRowStartSize - 1 then
            showLabelsOfLastLeftInCircleWithIntervalByRow 
                showFullRow thisRowStartSize thisRowStartLabel 
                nextRowStartSize nextRowStartLabel
                interval (sizeOfCircle+1) maxSizeOfCircle 
        else
            let (newNextRowStartSize, newNextRowStartLabel) =
                getNextRowStartSizeAndLabel nextRowStartSize nextRowStartLabel interval
            showLabelsOfLastLeftInCircleWithIntervalByRow 
                showFullRow nextRowStartSize nextRowStartLabel 
                newNextRowStartSize newNextRowStartLabel 
                interval (sizeOfCircle+1) maxSizeOfCircle

let showLastLeftInCircleWithIntervalByRow showFullRow interval maxSizeOfCircle =
    if maxSizeOfCircle < 1 then
        raise 
            ( System.ArgumentException(
                "The maximum size of the circle must be 1 or more!")
            )
    if interval < 2 then
        raise 
            ( System.ArgumentException(
                "The interval must be 2 or more!")
            )
    let (nextRowStartSize, nextRowStartLabel) = 
        getNextRowStartSizeAndLabel 1 1 interval
    showLabelsOfLastLeftInCircleWithIntervalByRow 
        showFullRow 1 1 
        nextRowStartSize nextRowStartLabel interval 
        1 maxSizeOfCircle
    System.Console.WriteLine()

When the showFullRow parameter to the function is false, then only the first pair in each row will be shown. I find this is a lot more useful than seeing the intermediate values as well, which soon leads to each console line overflowing.



Validating the various algorithms against one other

The following F# script can be used to check that the 3 algorithms produce the same answers:
type circleCalculationWithInterval = {
    SizeOfCircle: int;
    Interval: int;
    LastLeftInCircleByBruteForce: int;
    LastLeftInCircleByRecursiveFormula: int;
    LastLeftInCircleByRow: int
}

let getCircleCalculationWithInterval includeBruteForce interval sizeOfCircle =
    // Brute force is slow to calculate. Provide the option to omit it...
    let lastLeftInCircleByBruteForce = 
        match includeBruteForce with 
        | true -> calcLastLeftInCircleWithIntervalByBruteForce interval sizeOfCircle
        | false -> 0
    let lastLeftInCircleByRecursiveFormula = 
        calcLastLeftInCircleWithIntervalByRecursiveFormula interval sizeOfCircle
    let lastLeftInCircleByRow = 
        calcLastLeftInCircleWithIntervalByRow interval sizeOfCircle
    let circleCalc = {
        SizeOfCircle = sizeOfCircle;
        Interval = interval;
        LastLeftInCircleByBruteForce = lastLeftInCircleByBruteForce; 
        LastLeftInCircleByRecursiveFormula = lastLeftInCircleByRecursiveFormula;
        LastLeftInCircleByRow = lastLeftInCircleByRow
    }
    circleCalc

You can then check the results using snippets similar to the following:

// With brute force:
[1..1000] |> List.map (getCircleCalculationWithInterval true 2) |> List.filter (
    fun cc -> cc.LastLeftInCircleByRow <> cc.LastLeftInCircleByBruteForce
           || cc.LastLeftInCircleByRow <> cc.LastLeftInCircleByRecursiveFormula
    );;

[1..100] |> List.map (getCircleCalculationWithInterval true 100) |> List.filter (
    fun cc -> cc.LastLeftInCircleByRow <> cc.LastLeftInCircleByBruteForce
           || cc.LastLeftInCircleByRow <> cc.LastLeftInCircleByRecursiveFormula
    );;

// Without brute force:
[1..10000] |> List.map (getCircleCalculationWithInterval false 2) |> List.filter (
    fun cc -> cc.LastLeftInCircleByRow <> cc.LastLeftInCircleByRecursiveFormula
    );;

[1..10000] |> List.map (getCircleCalculationWithInterval false 100) |> List.filter (
    fun cc -> cc.LastLeftInCircleByRow <> cc.LastLeftInCircleByRecursiveFormula
    );;


A new proof for the formula when removing every second person

We can use the general formulae to provide a new derivation of the closed form solution when every second person is removed from the circle.

The proof ends up being remarkably simple, because $k - 1 = 1$. Terms such as $(...) \bmod (k-1)$ simply fall away, since any integer mod 1 is zero. Also some terms involve a division by $k-1$, and these also simplify very nicely.

But these simplifications only work for $k = 2$. This reduces my confidence of finding a closed form solution for the general case.

Anyway, here's the proof:
$$ \begin{align*} f_2(n_r) & = k - 1 - [n_{r-1} - f_k(n_{r-1})] \bmod (k - 1) & \text{ from equation 13, since } n_r \geq k - 1 = 1 \\ & = 2 - 1 - [n_{r-1} - f_2(n_{r-1})] \bmod 1 & \text{(but any number mod 1 is 0)} \\ & = 1 & \text{ for }n_r > 1 \\ \\ \text{but: } f_2(n_{1}) & = f(1) = 1 & \text{from equation 12} \\ \\ \therefore f_2(n_r) & = 1 & \forall n_r \tag{16} \\ \\ n_r & = n_{r-1} + 1 + \lfloor \frac{n_{r-1} - f_k(n_{r-1}) }{ k - 1 } \rfloor & \text{ from equation 11} \\ & = n_{r-1} + 1 + \lfloor n_{r-1} - f_k(n_{r-1}) \rfloor & \\ & = n_{r-1} + 1 + n_{r-1} - f_k(n_{r-1}) & \text{(since floor has an integer argument)} \\ & = n_{r-1} + 1 + n_{r-1} - 1 & \text{since }f_k(n_{r-1}) = 1 \text{ by equation 16 } \\ & = 2.n_{r-1} & \text{a geometric progression} \tag{17} \\ \\ \therefore n_r & = 2^{r-1} & \text{by equations 16 and 17} \tag{18} \\ \\ f_2(n) & = f_2(n_r) + 2.(n-n_r) & \text{from equation 15, where: } n_r \le n < n_{r+1} \\ & = 1 + 2(n - 2^{r-1}) & \text{ where: } 2^{r-1} \le n \le 2^r \text{(by equation 18)}\\ & = 1 + 2n - 2.2^{r-1} & \text{where: } 2^{r-1} \le 2^{log_2 n} < 2^r \\ & = 1 + 2n - 2.2^{\lfloor log_2 n \rfloor} & \text{since: } 2^{r-1} = 2^{\lfloor log_2 n \rfloor} < 2^{r}\\ & = 2n + 1 + 2^{{\lfloor log_2 n \rfloor} + 1} \tag{19} \\ \end{align*} $$
And this is the same equation derived in blog post three in the series, albeit through a completely different method.


Conclusion

In this blog post and the previous post I derived formulae which are applicable to intervals other than 2.

I was able to use these formulae to find a new proof of the closed form solution for the original interval of 2 (i.e. removing every second person from the circle).

I wasn't able to derive a closed form solution for arbitrary intervals. However I was able to devise an efficient algorithm for calculating the answer.


Next time

I feel that I have progressed as far as I can with the problem on my own. I don't know whether a closed form solution exists. But even if it does, I doubt I'm going to find it.

So now it's time to search the internet to see if this is a problem which other people have solved. In my next blog post, I hope to report back on what I found.

I'm also hoping to share some of the scripts and tools I used to generate the artefacts in this series of blog posts.

But most of all, I'm looking forward to wrapping up this series, so that I can move on to other topics that capture my interest!


No comments: