10-04-06 - 1

I'm assuming it would be a good thing to load the Netflix Prize data into an SQL DB like PostgreSQL. What you want is a cross-index: for each movie, the list of everyone who's rated it and their ratings; for each user, all the ratings they've assigned to movies. You could do this easily in C by threading each {user,movie,rating} data item onto two linked lists, one per user and one per movie. The only reason for me not to just do it in C is that it's a ton of data and might not fit in memory. Plus I guess learning SQL might be useful some day.
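Here's a rough sketch of what the two-linked-list idea could look like in C. It's only an illustration: the names (`Rating`, `RatingIndex`, `index_add`, and so on) are made up for this example, and it assumes users and movies have already been remapped to dense integer ids starting at 0. Each rating node sits on its user's list and its movie's list at the same time, so either direction is a simple list walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One rating, threaded onto two singly linked lists at once.
   All names here are hypothetical, chosen for this sketch. */
typedef struct Rating {
    int user;                      /* dense user id (assumed 0..num_users-1)   */
    int movie;                     /* dense movie id (assumed 0..num_movies-1) */
    int rating;                    /* the rating value itself                  */
    struct Rating *next_by_user;   /* next rating made by the same user        */
    struct Rating *next_by_movie;  /* next rating given to the same movie      */
} Rating;

typedef struct {
    Rating **by_user;    /* by_user[u] is the head of user u's list   */
    Rating **by_movie;   /* by_movie[m] is the head of movie m's list */
    int num_users;
    int num_movies;
} RatingIndex;

/* Allocate the two head tables with every list empty. Returns 0 on success. */
int index_init(RatingIndex *ix, int num_users, int num_movies)
{
    ix->num_users = num_users;
    ix->num_movies = num_movies;
    ix->by_user = malloc((size_t)num_users * sizeof *ix->by_user);
    ix->by_movie = malloc((size_t)num_movies * sizeof *ix->by_movie);
    if (!ix->by_user || !ix->by_movie) {
        free(ix->by_user);
        free(ix->by_movie);
        ix->by_user = NULL;
        ix->by_movie = NULL;
        return -1;
    }
    for (int u = 0; u < num_users; u++) ix->by_user[u] = NULL;
    for (int m = 0; m < num_movies; m++) ix->by_movie[m] = NULL;
    return 0;
}

/* Push one rating onto the front of both lists. Returns 0 on success. */
int index_add(RatingIndex *ix, int user, int movie, int rating)
{
    assert(user >= 0 && user < ix->num_users);
    assert(movie >= 0 && movie < ix->num_movies);

    Rating *r = malloc(sizeof *r);
    if (!r) return -1;
    r->user = user;
    r->movie = movie;
    r->rating = rating;
    r->next_by_user = ix->by_user[user];
    ix->by_user[user] = r;
    r->next_by_movie = ix->by_movie[movie];
    ix->by_movie[movie] = r;
    return 0;
}

/* Walk a movie's list: how many users rated it? */
int count_for_movie(const RatingIndex *ix, int movie)
{
    int n = 0;
    for (const Rating *r = ix->by_movie[movie]; r; r = r->next_by_movie) n++;
    return n;
}

/* Walk a user's list: sum of all ratings that user assigned. */
int sum_for_user(const RatingIndex *ix, int user)
{
    int sum = 0;
    for (const Rating *r = ix->by_user[user]; r; r = r->next_by_user) sum += r->rating;
    return sum;
}

/* Every node is on exactly one user list, so freeing by user frees them all. */
void index_free(RatingIndex *ix)
{
    for (int u = 0; u < ix->num_users; u++) {
        Rating *r = ix->by_user[u];
        while (r) {
            Rating *next = r->next_by_user;
            free(r);
            r = next;
        }
    }
    free(ix->by_user);
    free(ix->by_movie);
    ix->by_user = NULL;
    ix->by_movie = NULL;
}
```

To use it you'd call `index_init` once, `index_add` for every line of the data set, and then walk `by_movie[m]` or `by_user[u]` as needed. Note that the two pointers per node add to the size of every rating, which makes the memory problem worse, not better.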

Anybody who knows Windows: what's the best way to hold an 800 MB data structure on a machine with 1 GB of RAM? I'm assuming if I just hold it in RAM I'm going to page like crazy and get killed, cuz Windoze is Stoopid. I guess I could write my program in plain C and boot to command-line Linux or something ridiculous like that, but that's not fun.
