Here is my code for the problem http://www.spoj.com/problems/REC/

def mul(l, m, mod):
    a, b, c, d = l; p,q,r,s = m;
    return [(a*p + b*r)%mod, (a*q + b*s)%mod, (c*p + d*r)%mod, (c*q + d*s)%mod]
def rem(a, b, n, mod):
    temp = [a, b, 0, 1]
    ans = [1, 0, 0, 1]
    for char in n:
        if char == '1':
            ans = mul(ans, temp, mod)
        temp = mul(temp, temp, mod)
    return ans
n = input()
for i in xrange(n):
    a, b, n, m = [int(x) for x in raw_input().split()]
    if a == 1:
        print (1+n*b)%m
        continue
    n = bin(n)[2:][::-1]
    a = a %m; b = b %m
    #print a, b
    ans = rem(a, b, n, m)
    print (ans[0] + ans[1]) % m

The simple code above gives correct output.. but gives TLE obviously ...(20K test cases take 13 sec on my comp)
Any suggestions/hints for improving the algo/the program itself.
Thanks in advance...