Saturday, September 20, 2008

Coding Style

While at Google I got experience with code reviews. Code readability and style are taken very seriously there, which can drag out code reviews. So I looked back at the comments I received in our web-integrated revision control system to see what my common issues were. I found a lot of style points one would usually not notice.

  • Comment grammar. They do check the spelling and grammar in your comments. It gets kind of annoying to edit your sentences when you have to manually break at 80 columns and enter a new comment symbol. It would be cool if there were an IDE that would pop up a text box, let you type with a spell checker and a grammar checker, and then insert the comment into the code for you.
  • 80-character line width. Seems like those 82's always sneak in. Check your code with a linter to find this kind of stuff!
  • Put a good description of the input and output of a program at the top, in a doc comment if you're using Python.
  • Document why you're using < vs <=. This helps avoid off-by-one errors.
  • Variable names: They have to be descriptive, but not too long. It is a definite trade-off. I guess there is a feng shui to this one.
  • Capitalization style: there is usually a convention for the capitalization of variables, global constants, class names, and function names. Some lint tools check this.
  • Whitespace rules for indentation and line continuation. (lintable)
  • Whitespace after , and around binary operators. There is also not supposed to be whitespace at the end of a line. (lintable)
  • Don't hard-code numbers, filenames, or most strings. They should be global constants.
  • Put the units in a variable name if it is a quantity with a unit, such as seconds.
  • Use libraries whenever possible. This is why it is important to know what all the libraries do.
  • Avoid deprecated code.
  • Avoid redundant function calls. For example, calling flush() before close() if close() flushes automatically.
  • Comment almost every line. There is probably some good ratio of comment lines to code lines if you were to compile some stats. Don't explain the language in the comments; explain the program. To do otherwise would violate style rules as well.
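
To make a few of these concrete, here is a small Python sketch of my own (not code from an actual review). The names MAX_AGE_SECONDS and count_fresh_events are made up for illustration. It shows a description of input and output at the top, a global constant instead of a hard-coded number, units in the variable names, and a comment explaining the choice of <= over <.

```python
"""Count how many events are still fresh.

Input:  a list of event ages, in seconds.
Output: the number of events whose age is at most MAX_AGE_SECONDS.
"""

# Global constant instead of a number buried in the code. The unit
# is part of the name so nobody has to guess (hypothetical value).
MAX_AGE_SECONDS = 3600


def count_fresh_events(event_ages_seconds):
    """Return the number of ages that are at most MAX_AGE_SECONDS."""
    fresh_count = 0
    for age_seconds in event_ages_seconds:
        # Use <= rather than <: an event exactly at the limit is
        # still considered fresh. Writing this down guards against
        # off-by-one mistakes later.
        if age_seconds <= MAX_AGE_SECONDS:
            fresh_count += 1
    return fresh_count
```

With the constant above, count_fresh_events([0, 3600, 3601]) counts the first two ages and skips the third.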

MLSS

I had a great time at MLSS, but some people dropped the ball when it came to IT and logistics. So there are some things people need to get right at the next MLSS.

We need to make sure we have:
- IT:
- internet in the rooms where people are staying
- a lot of bandwidth available (at least 50 Mbps total)
- enough wireless routers to handle the load of ~200 connections with some buffer
- a power strip at every table for people to plug their laptops into
- some tables with US power plugs and some with European ones, given there will be a lot of Europeans
- Some tables with Cat 5 for people who have wifi troubles
- A Google Map with placemarks of all the locations we need to know
- message boards
- A transport board set up a few months in advance so participants can arrange transportation plans with each other.
- Video recordings of the lectures for videolectures.net

- Talks:
- Have the slides, references, labs, and code online in advance on the website (not some funky ftp server on someone's laptop)
- All the lectures and activities scheduled in Google Calendar
- Facebook and LinkedIn groups set up beforehand
- A whiteboard for the lecturers to write on (with markers that work. Seems whiteboards always have dead markers.)
- Enough lab space for everyone to fit
- A notebook and a pen in the backpack schwag that is given out. People don't usually remember this when packing.
- More practical sessions, with lectures oriented towards the practical sessions. I like the philosophy of Nando de Freitas: if you can't code it, you don't understand it.
- There should also be feedback forms for the participants to fill out on the lecturers

- Other:
- Maybe a mini-library with a few books around, like Bishop, etc.
- A vending machine around where people are staying in case they get the munchies
- A whereivebeen map showing where everyone is from
- An MLSS reader in one pdf with all the relevant readings that would be useful in preparation for the lectures and practical sessions
- Energy drinks in addition to coffee at the breaks (in Cambridge we'll probably have tea, it being the UK)

Another possibility is for MLSS to experiment with the I-Clicker. My brother had to get one for his freshman physics class. The lecturer can ask multiple-choice questions and get answers from the whole class through the remotes. It tells the lecturers whether they are explaining something in enough detail. It would also make the lectures more interactive, so people would be less inclined to fall asleep.

Saturday, September 13, 2008

Presentations

At MLSS I've seen the spectrum from very good to very bad lectures. There seem to be a few rules that one could follow to avoid the pitfalls of the bad presentations, such as:

Explain things through examples rather than cryptic mathematical definitions.
Don't explain anything too complicated with your hands. Draw everything on a board.
Provide motivation and background on what you want to do.

Being a graphical models person, I can't understand anything without a graphical model anymore. It helps to specify the full Bayesian inference scheme as the objective even if it is intractable. It orients people by giving them a target for what you are trying to achieve. Many people explain non-probabilistic methods without going through this step. However, many of these methods have an implicit graphical model which can be used to help orient people.

It is also good to use examples. People are very good at learning from examples. Once people understand something 90%, it is then appropriate to supply a cryptic mathematical definition to completely disambiguate. For example, the Dirichlet process (DP) is much more understandable with the stick-breaking construction than with the abstract definition. In physics, thermodynamics is much easier to explain if you have one of those particle Java applets that simulates an ideal gas.
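
To show what I mean, here is a minimal Python sketch of the stick-breaking construction, truncated to a finite number of sticks. The function name, the truncation, and the parameter names are my own choices for illustration. Start with a stick of length 1, break off a Beta(1, alpha) fraction of whatever remains, and repeat; the pieces are the mixture weights.

```python
import random


def stick_breaking_weights(alpha, num_sticks, seed=None):
    """Draw the first num_sticks weights of a stick-breaking process.

    alpha is the concentration parameter. Returns the list of weights
    and the length of stick left over after truncation.
    """
    rng = random.Random(seed)
    weights = []
    remaining_length = 1.0  # The whole stick is unbroken at the start.
    for _ in range(num_sticks):
        # Fraction of the remaining stick to break off: Beta(1, alpha).
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining_length * fraction)
        # Whatever was not broken off is left for later pieces.
        remaining_length *= 1.0 - fraction
    return weights, remaining_length


weights, leftover = stick_breaking_weights(alpha=2.0, num_sticks=10, seed=0)
print(weights, leftover)
```

The weights plus the leftover always add up to 1 (up to floating point error). A small alpha tends to break off big pieces early, so a few components dominate; a large alpha spreads the mass over many small pieces.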

At one of the lectures at MLSS I could look back at the audience and see that 95% of people were either surfing the internet, reading a book, or staring off into space. The lecturer seemed pretty oblivious to it. Maybe he didn't even care. Many of us suspected he simply gave us a presentation he had given to his research group. He didn't take the time to make a new presentation even though his current one was very inappropriate.

As for student presentations at the CBL, I've found there are a few reasons they are confusing:
1) The presenter doesn't really understand the topic themselves
2) Going over too much in too little time
3) Language barrier

When naively listening to a confusing presentation, it seems like the person understands the topic so well that no one else can follow at their level. However, in my experience, when you begin asking questions it becomes clear the presenter doesn't really understand the topic.

On the CBL wiki I've made a presentation guide with the following tips:

Technical Things

* Always label graph axes and be clear about what the graph represents.
o Use units if applicable (on graph axes and in general)
* Use page numbers on slides that show the current page number and total number of pages
* Don't forget your video adapter if you have a Mac. Mac folks always seem to forget this. Hugo has one in his desk that people can borrow.
* If you're not presenting on your own computer then you should put the presentation in multiple formats: ppt, pdf, odf. Don't expect everyone else to have OpenOffice, for instance.
* Also, if not using your own computer, make sure that the computer used for the presentation has the necessary software for any demos. This includes Flash Player, Java applet support, any necessary video codecs, etc.
* Don't put too much text on a slide and keep fonts big enough

Before Starting the Talk

* Think about what kind of talk you want to give (rough idea of an algorithm, detailed description of something, ...)
o Depending on this you might not want to use too many equations (even if the slides are then not complete)
o Keep it simple!
* Give at least one test talk

Starting the Talk

* Might be best to start the presentation with material you expect most people already know. This lets you make sure that people are on the same page. Then start to introduce new things.
* It is good to establish the practical importance of whatever you're presenting. Giving an example problem helps give people context. If everything is in the abstract then things become much more confusing.

During the Talk

* Always define what variables represent. Maybe keep them on a whiteboard on the side.
o If necessary, define the dimensions of matrices and, if not obvious, the range of values variables can take (zero to one, positive, ...)
* If presenting a probabilistic model then put its graphical model, in Bishop notation, in your presentation.
* Give an intuitive feel for all equations and variables to the extent possible
o Do this with examples and analogies
* Don't try to convey any important information with your hands alone.
o Never write out equations in the air with your hands (I've had a teacher who does this)
* Don't be afraid to write out and derive equations on the board
* In engineering problems it is always good to explain the input, output, and interface of any given system up front. If this is not clear people will get confused.
* If longer than an hour or so, give breaks for caffeine, snacks, etc.
* Don't rush through the slides. People should be able to read them! Explain what's going on. Depending on your presentation style (more or less tutorial-like), 2-3 minutes per slide (on average) seems good

Voice

* Speak loudly, clearly, and not too fast
o Mumbling technical comments on the side only confuses people

Some Dos and Don'ts

* Don't point your laser pointer at people. It always seems kind of awkward.
* Don't point with your hands. People can't see exactly what you're pointing at. Use the laser pointer.
* Don't point to everything with the laser pointer.
* Do look at the audience
* Do modulate your voice and be interested in your own stuff. It's not trivial to most others!
* Do use examples and demos
o In intro physics they always like to use those Java applets of a spring oscillator and so on. Try to do the same if possible/applicable.

After the Talk

* Post your slides on the Presentation Archive
* Use our template if you can get LaTeX to work

References

* Some hints if you give a short talk
* All the stuff in this guide to Terrible Presentations (…and how to not give one)

Friday, September 12, 2008

MLSS

I have been at MLSS for the last 12 days. We are located here.

There are a lot of good people here from ETH, ParisTech, Max Planck, and so on.

The lectures by Nando de Freitas, Richard Sutton, and Yann LeCun were the most interesting. There was an interesting lecture by Shai Ben-David on the theory of clustering. I am not sure what to think of it yet.

Nando talked about his new search engine Worio. It is supposed to cluster web pages when you enter terms with multiple meanings. It sounds like an interesting idea. He also showed a demo of an image search where you can refine your query by selecting images you like and don't like.

Last Sunday many of us went on a 50 km bike ride here

We've been to the beach several days as well. The water here is about 20 °C, much warmer than the Pacific. I've taken several pictures which I will need to post too.

The internet hasn't been very good here though. We don't have internet in the rooms and the wifi can't handle the load of all the attendees.

Hello World

Hello world test for my new blog of cool machine learning anecdotes.