Problemset 1

Adrian_Kosowski 2004-10-29 19:06:37 UTC #1

This topic is meant for solutions and hints to problems from problemset 1 of DASM Programming League 2004. You may post any useful questions and resources here, and I will do my best to format them in to one single document for the benifit of all.

PAWNS by Adrian Kuegel

Maybe some people are interested in how I solved problem pawns.
What I thought was most important in this problem was to avoid intermediate game states that can't lead to the given end game state. So how can we recognise such an invalid intermediate game state?
For any intermediate game state, you look at the pawns of one color. Clearly, all pawns of the end game state of this color has to be "matched" with a pawn of the current game state. So we have to figure out for each pawn, which of the pawns in the end game state he can match. This is a hard task to be solved optimally, since it depends on the moves of the other pawns. But I used following simple approximation:
I assumed that pawns can always choose to go forward, left forward or right forward. With this assumption it is easy to check which pawns can be matched. A small improvement can be done by checking if the pawn has to do at least one diagonal move, that in each column between the current column and the end column there is at least a pawn of either color.
So after I got this information, I can regard this as a bipartite graph. I find a maximum matching in this graph. If this matching has smaller size than the number of pawns of that color in the end configuration, the game state is clearly invalid. Otherwise, it may still be an invalid game state, but at least I was able to throw some clearly invalid game states away.
Unfortunately, my maximum matching function needs O(sqrt(V)*E) time with V = number of nodes, E = number of edges. So I can only visit 1200 game states within the time limit. If I would use a simple bfs, this wouldn't lead to good results I guess (I must admit I never tried it).
So I tried a greedy-like dfs: always continue the search with the game state that seems to be closest to the end game state. Again, this is hard to determine exactly. My approximation was:
Take the matching that was calculated to check if game state was valid. Calculate for each pawn how many moves he has to make to reach "his" place in the end configuration. Use the summation of all these values as a guess of the number of steps that are still needed.
If I would have calculated a minimum cost maximum matching, this approximation value would be probably better, but then I would have runtime O(V^3) for this step, and I thought it was not worth it.
Another optimization that I never tried, but that would have good chances to improve results: do a search from both ends, and look where they "meet" in the middle. This is somewhat easy with a bfs, but with my priority_queue searching it is more difficult to decide, when to switch to the search from the other direction and it would make the code much more complicated.

PSPHERE by Tomek Czajka

First I figured out what the test cases for PSHPERE are.
Then, I found some point sets on these webpages:
For n<=130:
research.att.com/~njas/packings/index.html
These are quite good.
For n>130:
research.att.com/~njas/icosahedral.codes/
These are only for "nice" n and for symmetric sets. I chose the closest
n above the input, threw away a few points and submitted the rest.

SOPARADE by Pascal Zimmer

First, the general constraint tells us that a 2 can only be next to a 4, and a 3 can only be next to a 1. This implies that the sequence will be composed of two alternating subsequences 424...424 and 131...131, and we can switch only from one subsequence to another between a 1 and a 4. For symmetry reasons, we can only consider sequences starting with 2 and 4.

If the sequence starts with a 4, all 4s and all 3s must be at odd places, and all 1s and 2s must be at even places (and the converse if we start with a 2). Now, let's consider the specific constraints. If one of them has more than two odd numbers or more than two even numbers, it cannot be satisfied and we reject the entry. Otherwise, we split the constraint into two parts (odd and even numbers, at most 2 for each), and we only need to ensure that they get different ranks.

To do that, I used an algorithm similar to the computation of connected components in a graph. That is, to each place i, I assign a zone number i at the beginning. When I have a constraint between i and j, I ensure that their respective zone numbers are opposite (u and -u), and I propagate this information using an adapted CC style algorithm. If I detect that i and j should get the same zone number, the system cannot be satisfied and we go out.

That way, only by parsing the input, many cases will be rejected. If it goes through, it still does not mean that the system can be satisfied. Now, we try to assign ranks to places using the info we have computed, simply by doing a recursive search. For many places, there will be only one possibility, either because the previous place got a 2 or a 3, or because this place has a zone number that appeared before (or its opposite), so there is no choice. Of course, as soon as we found an assignment for all places, we get out with a success.

On the given inputs, this algorithm seems to be very fast (I also had to use a few C tricks).

CUTSQRS by Adrian Kuegel

You have to realize that this problem is about euclid algorithm.
We can model this as if also a rectangle with one side length zero is allowed. Then the last "rectangle" is a losing position, and it is uniquely determined by gcd(a,b),0.
For example, starting with 36,16
the euclidean algorithm would do
36,16
16,4
4,0
In the described game, there are moves possible that are between the intermediate results of euclidean algorithm, but you can see that the intermediate results of euclidean algorithm are always part of the game.

To decide if the starting player can force a win:
w. l. o. g. assume a<=b
you need to know if the rectangle b mod a,a is a winning position or a losing position.
If the starting player has several possible moves, he can always force a win. If b mod a,a is a losing position, he reduces the rectangle to b mod a,a and lets his opponent lose. Otherwise if b mod a, a is a winning position, he reduces the square to a+(b mod a),a leaving the other player no other choice but to reduce the rectangle to b mod a,a which is a winning position (and then it is the turn of the starting player again).
If the starting player has only one possible move, it all depends on if the position b mod a,a is a winning position or not.
So just evaluate this either with recursion or with bottom up dynamic programming.

Now to the second part of the problem: how many game sequences are there?
As I mentioned before, you can observe, that in any valid game sequence, all intermediate results of the euclidean algorithm are present. So we can consider the number of game sequences between two of these intermediate results separately and multiply these results all together.
So lets examine the example of above again:
36,16
16,4
4,0
between 36,16 and 16,2, there is a possible move to 20,16
between 16,4 and 4,0, there are following possible two moves in between:
12,4 and 8,4.
More generally, if one intermediate result (a,b) is considered (here a>=b), there are a/b - 1 possible moves between (a,b) and (b,a mod b) (/ is meant as integer division ignoring remainder, so for example 5/2 = 2).
For any valid game sequence and every possible move between euclidean intermediate results, this move is either taken or not taken. So there are 2^(a/b-1) possible game sequences between (a,b) and (b,a mod b). As mentioned before, we can multiply all these results together, which is equivalent to add all a/b-1 values and in the end calculating 2^{sum of these values}.
For the third question: how many sequences are there in which the starting player wins: exactly half of all possible game sequences lead to a win (only exception: in the case where only one sequence is possible, but since we need to calculate only the number of digits, we don't care if it is a 1 or a 0).
Now how can we calculate the number of required digits of a large number 2^k?
Use logarithm!
number of digits in base 10 is floor(1.0+log10(2^k)) = floor(1.0+k*log10(2)).

PHRASES by Adrian Kuegel

You build a suffix tree of the word
w_1 $1 w2 $2 ... wn $n
where w_1 .. w_n are the input messages, and $1 .. $n are different separator characters that doesn't occur in any word.
Now, first lets assume there was only one input word.
Create a function with parameter current node that returns the first and the last occurence of the substring that is represented by concatenating the labels of the edges from the root to this node.
To calculate this, call the function recursively with all child nodes and update the first and last occurrence each time.
Whenever first occurrence+(length of substring)<=last occurrence, you have a potential solution and you can update the best substring found so far.
It is not so difficult to see how to generalize this method for more than one string: just check whenever you update a last or first occurrence for which input message you have to do the update.

PHRASES by Tomek Czajka

I did binary search on the maximum length.
For each length I calculate the hashes of all substrings of that length and check for duplicates. If the hashes are same, I assumed that the substrings would be same.

In order to modify the text of the solution, authors are requested to notify me by e-mail or as follow-ups to this post.

zhuzeyuan 2004-11-09 04:18:29 UTC #2

Thank you for the hints. Nice problems

zhuzeyuan 2004-11-09 06:09:00 UTC #3

But for problem SOPARADE, is there any algorithm with fixed complexity? I'm still worried about that recursion algorithm. If the testdatas're quite tricky...?

Such as the first 1000 soldiers got nothing to do with the real conflict which is placed after the first 1000 soldiers.. Will your program run on about 2^1000?

Adrian_Kosowski 2004-11-09 10:40:16 UTC #4

If you treat conflits as edges of the graph, I believe that the recursion described by Pascal can be regarded as DFS in the connected components. If it isn't, then at any rate you can always use DFS if you want to be certain of your algorithm's complexity.

zhuzeyuan 2004-11-09 12:01:28 UTC #5

I misunderstood his explaination. just an O(N) problem.

system 2004-11-09 13:25:02 UTC #6

I'm interested in this algorithm. How did you build your hash function? Thanx. "I calculate the hashes of all substrings of that length " that will take quite a long time I think.

zhuzeyuan 2004-11-09 13:26:43 UTC #7

Faint. The last reply was mine, but treated as guest's. :S

And how about this data:
n=1
ababa
the answer is 2 or 3? (about disjoint substrings, i'm not sure about my understading)

zhuzeyuan 2004-11-09 15:21:16 UTC #8

I wonder what do you mean by "figure out what the test cases for PSHPERE are"? How could you send up your answer for the problem? (50K for source code)

marian_sof 2004-11-13 16:19:47 UTC #9

Is Pascal's solution really O(N)? In fact, I came up with a linear algorithm, but it seems much more complex than what was decribed here.

As Pascal already said, checking if those constraints from input form bipartite graph is not enough, you have to check also the constraint, that two succesive soldiers have rank difference at least 2. I could solve this as a 2-CNF problem (assigning for every bipartite component one logical variable representing the ranks of soldiers), what involves computing strongly connected components. Is there an easier way?

Adrian_Kosowski 2004-11-13 17:02:12 UTC #10

I think it was easiest to notice that the problem has a solution, iff an equivalent problem set with the following rules has a solution:

Soldiers standing at odd positions are of rank 1 or 2.
Soldiers standing at even positions are of rank 3 or 4.
If a soldier of rank X stands at one end of a "conflict edge" connecting positions of the same parity, then the soldier standing at the other end must have rank 2+2*((X-1)/2)-((X-1)%2). Edges connecting positions of different parity can be ignored.
The neighbours of a soldier of rank 2 must be of rank 4.
The neighbours of a soldier of rank 3 must be of rank 1.
If the above rules don't determine ranks for all soldiers, take the leftmost soldier without a rank and try assigning both possible ranks to him (1 or 2 if he is at an odd position, 3 or 4 if he is at an even one). Continue solving the problem for both cases. If either way works, leave it at that. If neither way is possible, the problem has no solution.

I guess that most solutions come down to these rules in the end, it is only a question of making the most of optimisation & other little observations .

marian_sof 2004-11-13 18:05:13 UTC #11

Well, this step is similar to the quadratic algorithm for solving 2-CNF SAT problem (try assigning some value to the variable and see, if it doesn't generate contradiction), so there's a good reason to think that this algorithm is Omega(N^2) too. But probably the nature of this problem is such that this solution actually works pretty fast even in the worst case, but that would require very tricky analysis.

Adrian_Kosowski 2004-11-13 19:09:18 UTC #12

I'm not sure whether this is all that tricky. If you treat the additional conflict conditions as edges, notice that every connected partition of the conflict graph is checked at most twice (the first vertex is assigned 1 or 2 [alternatively: 3 or 4]), hence every edge is traversed at most twice. So the runtime is proportional to the size of input (it is hard to get any better than that ).

zhuzeyuan 2004-11-14 10:58:02 UTC #13

Best no one replys my question about the hashfunction and Points on the Sphere? sigh~

AK, could you lead me the way to contact them? I could see their e-mails in the past (problemsetter), but can't see them now. Thank you very much.

Adrian_Kosowski 2004-11-14 11:46:37 UTC #14

They were flooded in the tide of posts.

As far as test case guessing goes - in problemset one the test data was small and there was no limit on the number of submissions. Now this has been changed. I wouldn't really like to return to that topic .

I guess Tomek could provide you with a few more details about hashes (signatures) of substrings... But I think it is really quite straightforward, provided you perform a binary search with respect to substring length (the hashes of all strings of given length can be easily computed in linear time).

I don't know whether it was a bug then or now (I am not sure what our current policy is, I will find out ). Anyway, I think that I can send you the e-mails of everyone posting in this topic (they are also publicly available on various websites). You can also mail/privatemsg anyone registered in the forum from this page: http://spoj.sphere.pl/forum/memberlist.php.

zhuzeyuan 2004-11-14 12:01:26 UTC #15

Thank you very much. I thought so on the "guessing testdatas"

Speaking of the hash function, I did so as you said, in average of O(1) counting the hash value, but wa..~~~ SO I wonder his great way of counting it both in time and with high probability of exactitude~

Spykaj 2008-01-04 22:05:22 UTC #16

Pewnie wysłałeś w zwykłym C zamiast w C++

m_sasinowski1 2008-01-05 11:03:12 UTC #17

Musisz wysłać program jako c++:
zgłaszanie -> język -> C++ (g++ ...)