If this really is an interview question, it is a genuinely tricky one
Source: 海姑娘 2006-07-12 06:14:06
1. Given a file of URLs, how to find the top 10 most popular domains? If the file is really large (containing tens of millions of URLs), and we don't want to know the exact top 10 most popular domains, what is the quickest way to find the approximate top 10 most popular domains?

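A domain name and a URL are not the same thing. This does not look like a simple counting problem: first you have to find the fastest way to extract the domain name from a URL, and then.....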
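
To make the first step concrete, here is a rough sketch in Python, not a definitive answer. It assumes one URL per line, that every URL carries a scheme such as http://, and that "domain" means the full host name (www.example.com and example.com are counted separately). The names extract_domain and top_domains are made up for illustration, and the sample_rate parameter is my own guess at one way to get an approximate answer: count only a random fraction of the lines.

```python
import random
from collections import Counter
from urllib.parse import urlsplit


def extract_domain(url):
    """Return the lowercased host of a URL, or None if it has no host."""
    try:
        return urlsplit(url.strip()).hostname
    except ValueError:
        # urlsplit rejects some malformed inputs; treat them as "no domain".
        return None


def top_domains(urls, k=10, sample_rate=1.0, seed=0):
    """Count hosts and return the k most common as (host, count) pairs.

    With sample_rate < 1.0 only a random fraction of the URLs is counted,
    so the result is approximate (this sampling step is an assumption,
    not something the question prescribes).
    """
    rng = random.Random(seed)
    counts = Counter()
    for url in urls:
        if sample_rate < 1.0 and rng.random() >= sample_rate:
            continue  # skip this line; it is not part of the sample
        host = extract_domain(url)
        if host:
            counts[host] += 1
    return counts.most_common(k)
```

For a real file one could pass the open file object straight in, e.g. top_domains(open("urls.txt"), sample_rate=0.01), where urls.txt is a hypothetical file name. Memory then depends on the number of distinct hosts seen, not on the number of URLs.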

2. How to efficiently store a billion URLs?

A URL itself is made up of a domain name plus a path, so....
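
One possible reading of that hint, sketched in Python under my own assumptions: split each URL into a "scheme plus domain" prefix and the remaining path, store each distinct prefix only once, and keep a small integer id next to every path. The names split_url and UrlStore are hypothetical. This is only an in-memory toy to show the idea, not a design for a billion entries.

```python
def split_url(url):
    """Split a URL into (prefix, rest) such that prefix + rest == url.

    prefix is everything up to the first "/" after "://" (scheme + host);
    rest is the path and anything after it. A URL with no such "/" is
    returned entirely as the prefix.
    """
    start = url.find("://")
    start = start + 3 if start != -1 else 0
    slash = url.find("/", start)
    if slash == -1:
        return url, ""
    return url[:slash], url[slash:]


class UrlStore:
    """Stores URLs as (prefix id, rest), keeping each prefix only once."""

    def __init__(self):
        self._prefix_ids = {}  # prefix string -> integer id
        self._prefixes = []    # integer id -> prefix string
        self._entries = []     # one (prefix id, rest) pair per stored URL

    def add(self, url):
        """Store a URL and return its index."""
        prefix, rest = split_url(url)
        pid = self._prefix_ids.get(prefix)
        if pid is None:
            pid = len(self._prefixes)
            self._prefix_ids[prefix] = pid
            self._prefixes.append(prefix)
        self._entries.append((pid, rest))
        return len(self._entries) - 1

    def get(self, index):
        """Rebuild the original URL stored at the given index."""
        pid, rest = self._entries[index]
        return self._prefixes[pid] + rest

    def prefix_count(self):
        """Number of distinct scheme + domain prefixes stored."""
        return len(self._prefixes)
```

The saving comes from the fact that many URLs share the same domain, so the long prefix string is stored once and each URL pays only for an integer plus its path.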