I was asked about my algorithm to find F(n+1) in O(_builtinpopcount(n)) time here. Since I didn't find any web page discussing it (perhaps didn't search enough), I decided to discuss it here.
F(m) = F(m-1) + F(m-2)
F(m) = 2F(m-2) + F(m-3)
F(m) = 3F(m-3) + 2F(m-4)
F(m) = 5F(m-4) + 3F(m-5)
F(m) = 8F(m-5) + 5F(m-6)
...
F(m) = F(t+1)F(m-t) + F(t)F(m-t-1)
Let's do:
m = n + k
t = k - 1
From this we get:
F(n+k) = F(k)F(n+1) + F(k-1)F(n)
F(n+k) = F(k)[F(n-1)+F(n)] + F(k-1)F(n)
It's nice to note that, using k = n:
F(2n) = F(n)[F(n-1)+F(n)] + F(n-1)F(n)
F(2n) = F(n)[2F(n-1)+F(n)]
You are asked to calculate F(m).
Let's say m = 2^i + k.
F(2^i+k) = F(k)[F(2^i-1)+F(2^i)] + F(k-1)F(2^i)
Now, k = 2^j + z.
F(2^j+z) = F(z)[F(2^j-1)+F(2^j)] + F(z-1)F(2^j)
And z = 2^h + x, and so on... and we calculate F(m) recursively. If we pre-calculate F(2^i) and F(2^i-1) for 0 <= i <= lg(MAXN), we can calculate F(x), for any x, in O(lg N) time for worst case.
The algorithm for this is easy to implement and understand:
#define LOGMAXN 31
typedef unsigned long long uint64;
// fib2[i].first = F((2^i)-1)
// fib2[i].second = F(2^i)
pair<uint64,uint64> fib2[LOGMAXN+1] = {make_pair(0,1)};
void generate_fib2() {
for (int i = 1; i <= LOGMAXN; i++) {
fib2[i].second = fib2[i-1].second*((fib2[i-1].first<<1)+fib2[i-1].second); // x<<1 is similar to x*2
fib2[i].first = fib2[i].second - fib2[i-1].first*((fib2[i-1].second<<1)-fib2[i-1].first);
}
}
With fib2[] generated, we can recursively calculate F(n):
#define pow2(i) (1<<(i))
uint64 f0, f1, faux;
void fib_rec(int n, int i=LOGMAXN) {
if (n) {
// find the index of the most significant 1-bit
// it could be the least significant, it's your choice...
while (pow2(i) > n) i--;
// calculates f0 = F(n - 2^i - 1) and f1 = F(n - 2^i)
fib_rec(n - pow2(i), i-1);
// F(n-1) = F(n-2^i)F(2^i) + F(n-2^i-1)F(2^i-1)
faux = f1*fib2[i].second + f0*fib2[i].first;
// f1 = F(n) = F(n-2^i)[F(2^i-1)+F(2^i)] + F(n-2^i-1)F(2^i)
f1 = f1*(fib2[i].first+fib2[i].second) + f0*fib2[i].second;
// f0 = F(n-1)
f0 = faux;
}
}
uint64 fib(int n) {
if (!n) return 0;
f0 = 0, f1 = 1;
fib_rec(n-1);
return f1;
}
It's clearly an O(lg(MAXN)) algorithm. But using builtin C/C++ functions, we can reduce it to O(numberOf1Bits(n-1)) = O(_builtinpopcount(n-1)).
Instead of:
while (pow2(i) > n) i--;
We can use _builtinclz(n), which retrieves the number of leading 0-bits in n, starting at the most significant bit position. If we do 31 - _builtinclz(n), we get the index of the most significant 1-bit in n.
i = 31 - __builtin_clz(n);
Of course, if you prefer, you can also choose to retrieve the least significant 1-bit, using _builtinctz(n):
i = __builtin_ctz(n);
And since it's simple to make this iterative, there's no reason to not make it:
uint64 fib(int n) { // {0,1,1,2,...}
if (!n--) return 0;
uint64 f0 = 0, f1 = 1, faux;
while (n) {
int i = __builtin_ctz(n); // __builtin_ctzll(n)
n -= 1<<i; // 1ULL<<i
faux = f1*fib2[i].second + f0*fib2[i].first;
f1 = f1*(fib2[i].first+fib2[i].second) + f0*fib2[i].second;
f0 = faux;
}
return f1;
}