That, indeed, was a silly mistake( had nothing to do with my fluency in python though). Nevertheless, I need to make it run faster. The code computes every test case in sub-linear time, so it boils down to I/O again, right?
The following code also gives TLE.
[bbone=PYTHON,160]import psyco
psyco.full()
import sys
def find( a, b, n, M):
L,I= [a,b,0,1],[1,0,0,1]
while n>0:
l1,l2,l3,l4,i1,i2,i3,i4= L[0],L[1],L[2],L[3],I[0],I[1],I[2],I[3]
if n&1:
I[0],I[1],I[2],I[3]=(i1*l1+i3*l2)%M, (l1*i2+l2*i4)%M, (l3*i1+l4*i3)%M,(l3*i2+l4*i4)%M
L[0],L[1],L[2],L[3]= (l1*l1+l3*l2)%M,(l1*l2+l2*l4)%M, (l3*l1+l4*l3)%M, (l3*l2+l4*l4)%M
n=n/2
return (I[0]+I[1])%M
inp = map(int,sys.stdin.read().split())
t,i=inp[0],1
while t:
a,b,n,M=inp[i],inp[i+1],inp[i+2],inp[i+3]
if n==0:
print 1
elif a==0:
print b%M
else:
print find(a,b,n,M)
t,i=t-1,i+4
[/bbone]
Should I store the answers in a list and print them together in one go? Would that suffice? Is modulo operator slowing the execution down?