Saturday, 20 February 2010

Personalized Search

I haven't blogged in a while, so here's an update on what I've been doing since September: I've embarked on a Master's course in Computer Speech, Text and Internet Technology (CSTIT) at the University of Cambridge.

It's a one-year Master's course on the state of the art in speech and language processing and its application to Internet technology. The main aim of the MPhil course is to teach the fundamental theory of speech and natural language processing and its use in a variety of advanced applications, especially those related to the Internet. It covers three areas:

  • Speech Processing: analysis, speech recognition, speech synthesis
  • Language Processing (computational linguistics): syntax, parsing, semantics, discourse
  • Applications: information retrieval, information extraction, dialogue systems, machine translation, question answering

The course combines lectures, practicals, seminars and a substantial research project. It starts with a term of taught material (lectures and structured practicals) covering the foundations of speech and language processing. In the second term, students attend lectures on more advanced topics, take part in a small-group seminar in which they study and present material on a research topic, undertake two longer practicals and start on their research project. The dissertation is submitted in the third week of June, and students present their projects in the last week of June.


My project is about Personalized Search. I'll look at how we can learn a user's interests from data sources such as browsing history and search queries, and how we can use those interests to bias a search engine's results towards the person using it. Because there are no publicly available corpora or data sets for this, I need to gather my own. That's why I need your help!


I've created a Firefox add-on that captures some of that data. It doesn't get in the way of anything and guarantees your privacy. If you want to help me out, please visit http://alterego.caret.cam.ac.uk and install the add-on. The page also contains more information about the project.


You can also win a prize ;)
