Saturday 20 February 2010

Personalized Search

I haven't blogged in a while now, but here's an update on what I've been doing since September. I've embarked on a Masters course in Computer Speech, Text and Internet Technology at the University of Cambridge.

It's a one-year Masters course on the state-of-the-art in Speech and Language Processing and its application to Internet Technology. The main aim of the MPhil course is to teach the fundamental theory of speech and natural language processing and its use in a variety of advanced applications, especially those related to the Internet.

  • Speech Processing: analysis, speech recognition, speech synthesis
  • Language Processing (computational linguistics): syntax, parsing, semantics, discourse
  • Applications: information retrieval, information extraction, dialogue systems, machine translation, question answering.

The CSTIT is a one-year postgraduate course, which combines lectures, practicals, seminars and a substantial research project. It starts off with a term of taught material (lectures and structured practicals) covering the foundations of speech and language processing. In the second term, students attend lectures on more advanced topics, participate in a small group seminar in which they study and present material on a research topic, undertake two longer practicals and start on their research project. A dissertation is submitted in the third week in June and students give presentations on their projects in the last week in June.


My project is about Personalized Search. I'll look at how we can learn a user's interests from different data sources, such as browsing history and search queries, and how we can use those interests to bias the output of search engines towards the person using them. Because there are no publicly available corpora or data sets for this, I need to gather my own. That's why I need your help!


I've created a Firefox add-on that captures some of that data. It doesn't get in the way of anything and guarantees your privacy. If you want to help me out, please visit http://alterego.caret.cam.ac.uk and install the add-on. The page also contains more information about the project.


You can also win a prize ;)

Wednesday 20 May 2009

Missing radix parameter

JSLint gives this message when you call parseInt() without its second parameter, the radix. This can cause a problem when the string passed to parseInt() starts with a 0, which makes JavaScript interpret the value as an octal number. Likewise, if the string starts with “0x”, parseInt() will read it as a hexadecimal number.

>>> parseInt("8")
8
>>> parseInt("08")
0
>>> parseInt("010") // A juicy mistake, octal numbers.
8
>>> parseInt("0x10") // Probably rare, but possible, hexadecimals.
16
>>> parseInt("08", 10) // Prevent problems, use the radix.
8
>>> parseInt("010", 10)
10


And if the parameter passed to parseInt() is a variable that comes from somewhere else, you cannot be sure that the string does not start with a “0”. So using the radix might save you a lot of headaches.
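One way to make the radix hard to forget is to wrap parseInt() in a small helper. This is only a sketch: the name toDecimalInt is made up for this example and is not part of JavaScript or JSLint.

```javascript
// Hypothetical helper (not a built-in): always parse as base 10,
// so a leading "0" is never taken as an octal prefix.
function toDecimalInt(value) {
    return parseInt(value, 10);
}

// Example: a value that arrives as a string from somewhere else,
// such as a month typed into a form field.
var fromForm = "08";
var month = toDecimalInt(fromForm); // 8
```

Note that with an explicit radix of 10 the “0x” prefix is no longer special either: toDecimalInt("0x10") gives 0, because parsing stops at the “x”.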

Handle global variables in JavaScript

When people write decent JavaScript (read: when they use JavaScript frameworks like jQuery which handle the quirks of JavaScript for them) they will often assume that certain top level variables are there, without explicitly defining them.

e.g.: $("#test").html("test");

This line is assuming that jQuery has been loaded and that the $ object is defined. If jQuery isn't loaded, the code will fail and claim that $ is undefined. Because of this, the JSLint validator will complain when validating this code and will state the following:

Implied global: $ 1
Problem at line 1 character 1: '$' is not defined.

(Note that this will also happen when we call a function that hasn't been defined yet inside a function. Sometimes it's impossible to avoid this.)

This can and should be fixed by adding this to the top of the document:

/*global $ */

Inside this global statement we can define all of the top level variables we expect there to be. This will tell JSLint that it should assume those variables are defined (or will be defined in time), and it will no longer complain. It is also a very nice way of documenting which files and frameworks we need to run a certain JavaScript file, so it's a win-win again.
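As a rough sketch of how this looks in a file, assuming jQuery is meant to be loaded by the page: the directive goes at the top, and code that might run before jQuery is there can check for $ first. The function name setTestText is made up for this example.

```javascript
/*global $ */

// Hypothetical helper: writes text into the #test element,
// but only if the $ object has actually been defined.
function setTestText(text) {
    // typeof never throws, even when $ was not declared anywhere.
    if (typeof $ === "undefined") {
        return false;
    }
    $("#test").html(text);
    return true;
}
```

If a file relies on more than one top level variable, they can be listed in the same statement, separated by commas, e.g. /*global $, jQuery */.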

Tuesday 19 May 2009

MyCamTools, UXI, Sakai 3 and other Sakai’s

A lot has happened in the world of (client side) development for Sakai over the last year. In fact, so much has happened that I haven’t been able to keep up with this blog.

We have managed to get MyCamTools out of the door and it has now been running in production at Cambridge for almost a year. Reactions have been positive and we have had relatively few reported problems.

In August, we started working on the UX Improvement project, based on designs made by Nathan Pearson, aimed at improving the Sakai 2.x UX. We have also introduced some Sakai 3 concepts into those screens. You can check the work at http://mycamtools.caret.cam.ac.uk .
I would like to thank Michigan, Indiana, Berkeley and Georgia Tech for their implementation help.

In March, we started working on Sakai 3 RC 1, which would be a first step towards Sakai 3, and which we’re hoping to bring into production this year. A lot is going on, and it’s a fast moving target right now, so the best way to track the progress is to follow the dev server at http://131.111.21.17:9090/dev/ .

All of this might make more sense if you read through the presentation I recently gave at EuroSakai 2009 in Stockholm, which you can find at SakaiEurope-Sakai3.pdf

Enabling JSLint in Aptana Studio

JSLint (http://www.jslint.com) is a pretty cool tool that allows you to validate your JavaScript code and find common bugs, style issues and pitfalls. Until recently, I was pasting my JavaScript code into JSLint from time to time, fixing some of the issues and then pasting it back into JSLint, and so on.

I have now however found a way of enabling the JSLint Validator inside Aptana Studio, which is the IDE in which I write all of my JavaScript code.

You can enable it as follows:

- Open Aptana Studio
- Go to Window > Preferences
- Go to Aptana > Editors > JavaScript > Validation in the left hand menu
- Check "JSLint JavaScript Validator"
- Hit OK
- Go to Window > Show View > Validation
- You'll see the Validation on the bottom right of the screen
- Now also click Toggle Information and Toggle Warnings (found on the top left corner of the validation pane)
- You're all set. JSLint will warn you whilst writing code

This is super useful, because I can now track JSLint issues while I'm writing the code, which should improve productivity!