Fibonacci in O(__builtin_popcount(n)) time

Rafael_sof 2013-09-12 03:30:46 UTC #1

I was asked about my algorithm to find F(n+1) in O(_builtinpopcount(n)) time here. Since I didn't find any web page discussing it (perhaps didn't search enough), I decided to discuss it here.

F(m) = F(m-1) + F(m-2)
F(m) = 2F(m-2) + F(m-3)
F(m) = 3F(m-3) + 2F(m-4)
F(m) = 5F(m-4) + 3F(m-5)
F(m) = 8F(m-5) + 5F(m-6)
...
F(m) = F(t+1)F(m-t) + F(t)F(m-t-1)

Let's do:

m = n + k
t = k - 1

From this we get:

F(n+k) = F(k)F(n+1) + F(k-1)F(n)
F(n+k) = F(k)[F(n-1)+F(n)] + F(k-1)F(n)

It's nice to note that, using k = n:

F(2n) = F(n)[F(n-1)+F(n)] + F(n-1)F(n)
F(2n) = F(n)[2F(n-1)+F(n)]

You are asked to calculate F(m).
Let's say m = 2^i + k.

F(2^i+k) = F(k)[F(2^i-1)+F(2^i)] + F(k-1)F(2^i)

Now, k = 2^j + z.

F(2^j+z) = F(z)[F(2^j-1)+F(2^j)] + F(z-1)F(2^j)

And z = 2^h + x, and so on... and we calculate F(m) recursively. If we pre-calculate F(2^i) and F(2^i-1) for 0 <= i <= lg(MAXN), we can calculate F(x), for any x, in O(lg N) time for worst case.

The algorithm for this is easy to implement and understand:

#define LOGMAXN 31
typedef unsigned long long uint64;
// fib2[i].first  = F((2^i)-1)
// fib2[i].second = F(2^i)
pair<uint64,uint64> fib2[LOGMAXN+1] = {make_pair(0,1)};
void generate_fib2() {
	for (int i = 1; i <= LOGMAXN; i++) {
		fib2[i].second = fib2[i-1].second*((fib2[i-1].first<<1)+fib2[i-1].second); // x<<1 is similar to x*2
		fib2[i].first = fib2[i].second - fib2[i-1].first*((fib2[i-1].second<<1)-fib2[i-1].first);
	}
}

With fib2[] generated, we can recursively calculate F(n):

#define pow2(i) (1<<(i))
uint64 f0, f1, faux;
void fib_rec(int n, int i=LOGMAXN) {
	if (n) {
		// find the index of the most significant 1-bit
		// it could be the least significant, it's your choice...
		while (pow2(i) > n) i--;
    	// calculates f0 = F(n - 2^i - 1) and f1 = F(n - 2^i)
	fib_rec(n - pow2(i), i-1);

	// F(n-1) = F(n-2^i)F(2^i) + F(n-2^i-1)F(2^i-1)
	faux = f1*fib2[i].second + f0*fib2[i].first;

	// f1 = F(n) = F(n-2^i)[F(2^i-1)+F(2^i)] + F(n-2^i-1)F(2^i)
	f1 = f1*(fib2[i].first+fib2[i].second) + f0*fib2[i].second;

	// f0 = F(n-1)
	f0 = faux;
}
}
uint64 fib(int n) {
	if (!n) return 0;
	f0 = 0, f1 = 1;
	fib_rec(n-1);
	return f1;
}

It's clearly an O(lg(MAXN)) algorithm. But using builtin C/C++ functions, we can reduce it to O(numberOf1Bits(n-1)) = O(_builtinpopcount(n-1)).

Instead of:

while (pow2(i) > n) i--;

We can use _builtinclz(n), which retrieves the number of leading 0-bits in n, starting at the most significant bit position. If we do 31 - _builtinclz(n), we get the index of the most significant 1-bit in n.

i = 31 - __builtin_clz(n);

Of course, if you prefer, you can also choose to retrieve the least significant 1-bit, using _builtinctz(n):

i = __builtin_ctz(n);

And since it's simple to make this iterative, there's no reason to not make it:

uint64 fib(int n) { // {0,1,1,2,...}
	if (!n--) return 0;
	uint64 f0 = 0, f1 = 1, faux;
	while (n) {
		int i = __builtin_ctz(n); // __builtin_ctzll(n)
		n -= 1<<i; // 1ULL<<i
		faux = f1*fib2[i].second + f0*fib2[i].first;
		f1 = f1*(fib2[i].first+fib2[i].second) + f0*fib2[i].second;
		f0 = faux;
	}
	return f1;
}

fedelebron 2013-09-12 05:31:01 UTC #2

Hi. You may be interested in this post I wrote on computing Fibonacci fast.
The identity is indeed known, see this page for that and others.

A straightforward Haskell implementation of a similar identity, with the same runtime, is as follows:
bbone=haskell,622 = ((mod 1000000007) .) . (*)
fib :: Int -> (Integer, Integer)
fib 0 = (0, 1)
fib n =
let (a, b) = fib (n div 2)
c = a .*. ((b * 2) - a)
d = (b .. b) + (a .. a)
in if even n
then (c, d)
else (d, c + d)
[/bbone]

Rafael_sof 2013-09-12 08:53:18 UTC #3

I'm out of time to read these references now, but I will certainly do it soon!

Unfortunately, I can't understand functional programming languages, but thanks and I'll try to understand your code when I have more time.

@Edit
I've read your post, and it's certainly the exact same algorithm (not sure about that code in haskell, but the explanation seemed to lead to the same idea).
That's nice to see it elsewhere, because when people talk about "fast fibonacci" they almost always mean "fibonacci using matrix exponentiation", that we all already know about... it's quite boring reading the same thing again and again.

@Edit
Just a note to your post. Binet's formula isn't O(1), since you still need to calculate phi^n, which takes O(lg N) time.

stephenharper 2013-10-22 04:41:54 UTC #4

I read the article, it is really the same algorithms. It's good to see the rest, because when we talk about "Fast Fibonacci" almost always means "field index Fibonacci", which everyone already knows.'s Boring to read the same thing over and over again.