Saturday 20 February 2010

Personalized Search

I haven't blogged in a while now, but here's an update on what I've been doing since September. I've embarked on a Masters course in Computer Speech, Text and Internet Technology at the University of Cambridge.

It's a one-year Masters course on the state of the art in Speech and Language Processing and its application to Internet Technology. The main aim of the MPhil course is to teach the fundamental theory of speech and natural language processing and its use in a variety of advanced applications, especially those related to the Internet. It covers three main areas:

  • Speech Processing: analysis, speech recognition, speech synthesis
  • Language Processing (computational linguistics): syntax, parsing, semantics, discourse
  • Applications: information retrieval, information extraction, dialogue systems, machine translation, question answering.

The CSTIT is a one-year postgraduate course, which combines lectures, practicals, seminars and a substantial research project. It starts off with a term of taught material (lectures and structured practicals) covering the foundations of speech and language processing. In the second term, students attend lectures on more advanced topics, participate in a small group seminar in which they study and present material on a research topic, undertake two longer practicals and start on their research project. A dissertation is submitted in the third week in June and students give presentations on their projects in the last week in June.


My project is about Personalized Search. I'll look at how we can learn a user's interests from different data sources, such as browsing history and search queries, and how we can use those interests to bias the output of search engines towards the person using them. Because there are no publicly available corpora or data sets for this, I need to gather my own. That's why I need your help!
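
To give a rough idea of what "learning interests and biasing results" could mean, here is a minimal Python sketch. It is not the method used in the project, just one simple way to do it, assuming the interests are modelled as a bag of words built from browsing history and that results are re-ordered by cosine similarity to that profile. The function names (`build_profile`, `rerank`) and the sample data are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(documents):
    """Sum term counts over all texts the user has viewed or typed."""
    profile = Counter()
    for doc in documents:
        profile.update(tokenize(doc))
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(results, profile):
    """Order result snippets by similarity to the profile, most similar first."""
    return sorted(
        results,
        key=lambda snippet: cosine(Counter(tokenize(snippet)), profile),
        reverse=True,
    )


# Hypothetical browsing history (page titles) and search results for "python".
history = [
    "python tutorial for beginners",
    "python list comprehension examples",
    "django web framework documentation",
]
results = [
    "Python snakes: habitat and diet of large constrictors",
    "Monty Python sketches and films",
    "Python programming language documentation and tutorial",
]

personalized = rerank(results, build_profile(history))
for snippet in personalized:
    print(snippet)
```

With this made-up history, the programming-related result moves to the top for the ambiguous query "python". A real system would at least need stop-word removal and some term weighting, which the sketch leaves out.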


I've created a Firefox add-on that captures some of that data. It doesn't get in the way of anything and guarantees your privacy. If you want to help me out, please visit http://alterego.caret.cam.ac.uk and install the add-on. The page also contains more information about the project.


You can also win a prize ;)