Hello, I have problem with counting repeated words from a input.
Im trying to simulate that process by using a file with 100 000 lines, each line looks like :
1. string StringsIWantCount
...
n. string StringsIWantCount
Im using a HashMap as a data structure, I chose it because of O(1) search and put.
Reading from a file or spoj input is done by buffered scanner class.
while reading from a file I check whether the string is in a hashmap or not, If it is I retrieve it value and increase by 1, otherwise Im putting that word into my data structure.
HashMap<String,Integer> Punkty = new HashMap<String, Integer>() ;
while (fs.nextString().length()>0)//odczytanie numeru
{
fs.nextString();//odczytanie nazwiska
String s = fs.nextString().toUpperCase();//odczytanie
if(Punkty.containsKey(s))
Punkty.put(s,Punkty.get(s)+1);
else
Punkty.put(s,1);
}
After reading whole file/spoj input I have to sort my hashmap by value and when values are the same, by key["a" to "z"]
List<Map.Entry<String, Integer>> list = new Vector<Map.Entry<String, Integer>>(Punkty.entrySet());
java.util.Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
public int compare(Map.Entry<String, Integer> e, Map.Entry<String, Integer> e1) {
int t=0;
if(e.getValue().equals(e1.getValue()))
t = e.getKey().compareTo(e1.getKey());
else
t = (e.getValue() < e1.getValue() ? 1 : -1);
return t;
}});
I chopped my program into a few parts, and measured execution time:
//
Number of names: 54
Read and write to HashMap time: 130 ms
Sorting time: 2 ms
output time: 3 ms
Execution time: 135 ms
//
Number of names: 1404
Read and write to HashMap time: 130 ms
Sorting time: 25 ms
output time: 5 ms
Execution time: 160 ms
//
Number of names: 34032
Read and write to HashMap time: 147 ms
Sorting time: 290 ms
output time: 3 ms
Execution time: 440 ms
//
Number of names: 94806
Read and write to HashMap time: 154 ms
Sorting time: 433 ms
output time: 3 ms
Execution time: 590 ms