Tuesday, February 8, 2011

The Social Network: (In)accuracies regarding the Computer Science

Typically, movies that have any relevance to computing or technology tend to irritate me, both in their technological accuracy and in the overall atmosphere and attitude of the characters. One may think that movies involving technology would be my passion, but I've spent so much time with actual technical people that I always find their portrayal in film a bit off.

I recently got a chance to watch 'The Social Network', and I found that overall it was a great film. I expected this much, as it had a great cast, an excellent director (Fincher), and a score produced by Trent Reznor, of all people. I had been apprehensive about seeing it because I knew I would find fault with the way it represented technology in some way. I also felt that Facebook was still too young to warrant its accolades in film, and that this was a cheap way to cash in on its popularity. A movie about a web site that I barely care about didn't seem to have much appeal to me.

I wound up loving the atmosphere it created, and I mostly liked the portrayal of the characters and the programmer mindset. In many ways I was taken back to my recent CS education; I saw many parallels between Mark's character and my own as a student: the time spent in labs, the endless barrage of projects.

However, my initial hesitations proved correct: there were things that did bother me. Much attention has been paid to the movie's inaccuracies in terms of the people and company being portrayed, and I don't want to rehash that. (Though, for all the talk about this movie NOT being a documentary, I was surprised when Mark Zuckerberg's character was actually called 'Mark Zuckerberg'. Similarly, the company was actually Facebook, etc. I incorrectly assumed that the names had all been changed to protect the innocent.) The things that get to me are technical. They're mostly pedantic in nature, and I understand that. Still, in much the way a medical professional cringes when watching anything involving medicine, or the way the screen portrayal of crime & justice is at best a parody of reality, as a programmer it's hard not to focus on what is off in a movie like this instead of just enjoying it.

There was a lot that I did feel they got right, though. I think in a sense they managed to capture the feel of a tech start-up, to the best of my knowledge: that shift from a personal project into a commercial endeavor. They used a (modified) version of Mark's real LiveJournal posting of that day to describe his little Facemash activity. The content of the message felt authentic (because it was), but the delivery perhaps not so. I've never heard anyone talk like that, but I've seen people write like that. Hard to describe to a non-programmer, but the dramatic inflections were just.. off. Overall I liked that it showed how the youngest billionaire started out with that hacker ethic, but they perhaps got his motivation wrong. As Mark Zuckerberg has pointed out, it's hard for people who are not programmers to understand the desire to build something for the sake of creating it, and Hollywood went out of its way to imply that his creations were for the sake of women and money. Never mind that the real Facemash featured both men and women, but I digress.

I paused the screen to see the software that Mark was using, and we got nice KDE desktops on a presumably Linux-based system. Appropriate versions for the era, too. No one set background images, though, which is odd. Mark does mention that he was using Mozilla as a browser, while we see KDE's default Konqueror on his screen. The producers were probably too lazy to find the appropriate antiquated version and have it installed. Plus this flashes by the screen for only a few frames.

Mark uses Emacs (yay!) and we see that throughout. We see a bash session where it looks like he installs a 'php-elisp' package (appropriate) and tries to run Emacs twice. I can't recall the errors but I think they were just console warning messages that would display on run. Now, he does say he's writing Perl, but we can assume he was well versed in it, and only installed Emacs' PHP package as part of his struggle to understand one of the dorm face-books that he complains about.

Most technical things in the film were correct, but again I mention that the delivery of these technical subjects just felt odd to me. "Give me the algorithm!!" isn't really something that I've heard myself, although yes, algorithms are an integral part of Computer Science and it's neat that we saw some emphasis. Mark probably would have just looked it up in his text, or asked a friend politely. The Hollywood way to do it was to write it on a dorm window in wax pencil. It reminded me of the over-the-top depictions of NORAD in film, where everything is written on upright sheets of glowing glass.

There was the scene where Mark asks for an additional LAMP server. In reality I don't think he would have spelled out the acronym the way he did. The line there seemed like a forced attempt at authenticity, but the dramatic reading of "Linux, Apache, MySQL" was off.

Those "He's plugged in!" lines were ridiculous too. It's not the Matrix; I've never heard of software engineers at a system being referred to as being "plugged in". Although it's true - don't interrupt someone when they're writing code - they're on a roll.

I realize I'm being pedantic here, and so I don't really fault the movie for it. They have to make the topic exciting, after all. I just find it personally distracting to see my field portrayed the way it is, and that is why I have a hard time watching movies like this. The Net, Antitrust, Swordfish, it's always a similar story for me. The Social Network is the best so far, but they were still off.

The scene that went over the top was the Hacking competition scene. Now, I've been involved in ACM programming competitions, and I've seen a few hacking competitions. I don't know how they do things at Harvard but that was ridiculous. Real hacking competitions are lengthy, and while they may involve a few beers they are not drinking games. The audiences are also a few notches toned down, or nonexistent. It makes for a poor spectator sport. Milestone results are generally posted onto some bulletin, where people can check in periodically. Programming is an intense intellectual discipline, and few people have the tolerance to maintain enough composure after two shots to get any real work done. Let alone 8 shots. The programming competitions I've been involved in had zero spectators, lasted a few hours, and almost no one spoke. You would have thought this scene was a wrestling match.

The scene that really got to me was the OS lecture scene. The professor asks the class a question, to which no one can reply. Mark leaves the class, is insulted for doing so, and then casually gives the correct response. This is a trope I've seen a number of times before; we get to see a nice contrast between Mark's intellect and his attitude. Of course, when I hear a question being asked in a film, my mind will scramble to come up with an answer. When I could not follow the discussion, this was frustrating. I rewound that scene three times, trying to understand what the actual question being asked was, and how the answer relates. Confused, I turned to the internet to find any technical discussion of this scene. I did not find any (hence the inspiration for this blog post), but I did find the film's original script for the scene. Continued searching revealed the actual Harvard CS161 lecture notes from a 2007 course. Let's compare:

MARK is in his Operating Systems class. This is considered the
hardest class at Harvard and MARK is one of the 50 students
with their laptops open as the professor takes them through an
impossibly difficult lesson.
Okay, let’s look at a sample problem:
Suppose we’re given a computer with a 16-
bit virtual address and a page size of
256 bytes.
A GIRL scribbles something on a piece of paper. Then hands it
to the student next to her and nods that it should be passed
over to MARK. While that’s happening--
The system uses one-level page tables,
that start at address 0x0400. Maybe you
want to have DMA on your 16-bit system,
who knows? The first few pages are
reserved for hardware flags, etc.
MARK opens the note. It reads “U dick”.
He looks over and sees a couple of GIRLS looking at him with
Assume page table entries have eight
status bits.
MARK closes his laptop, gets up and starts to head out of the
The eight status bits would be--
(re: MARK)
And I see we have our first surrender.
Don’t worry, Mr. Zuckerberg, brighter men
than you have tried and failed at this
(calling back)
1 valid bit, 1 modify bit, 1 reference
bit and 5 permission bits.
That is correct. Does everybody see how
he got there?

Lecture Notes:
Computer Science 161: Operating Systems
Processes, Scheduling, VM, Writing a Design Doc
CS161 Course Staff
March 8, 2007
3 VM
VM is one of the most useful things you learn in this course, since probably for the first time you no longer need to be surprised by the “magic” the operating system provides. It’s also one of Matt’s favorite topics, and so will appear on the midterm. This wouldn’t necessarily be a great reason to really learn it, but you also have to implement it so understanding what’s happening sooner rather then later will save us all a lot of trouble.
Key terms: page table, MMU, TLB, TLB miss, page fault, page frame,
internal fragmentation, external fragmentation
A sample problem: Suppose we are given a computer with a 16-bit virtual
addresses, and a page size of 256 bytes. The system uses one-level page tables, which start at address 0x0400. (The first few pages are reserved for hardware flags, etc. Maybe you wanted to have DMA on your 16-bit system, who knows?) Assume page table entries have eight status bits: 1 valid bit, 1 modify bit, 1 reference bit, and 5 permissions bits (this is a very secure system).

*How many pages are there? How much memory do the page tables require?

One of these is clearly derived from the other; obviously the former was lifted straight from the latter. In a way this is cool, to have dialog in a movie taken from the actual course it's supposed to represent, but the delivery was botched. I mostly liked what the scene was supposed to be. There was a slide with an MMU on it, and what appeared to be the mapping of virtual memory pages to real memory. This would have been a very relevant slide for the topic under discussion (Virtual Memory). Now, having seen the original source of the dialog in this scene, the question makes sense. Or rather, it makes sense to me why the question makes no sense at all.

In the lecture notes, the professor is describing a hypothetical system, while providing additional irrelevant commentary in describing it. (Everything in parentheses.) In the film, the entire question is delivered with an inquisitive tone, including the bits that are not inquisitive. The killer is that the actual question about paging is never asked in the film; instead the professor asks about the assumptions in the page table entry. The problem is that the detail that this hypothetical system uses one valid, one modify, one reference, and five permission bits is made up, just as a demonstration. There is no way to infer this from the question being asked. Unless the students had read the mind of the professor, or had the notes printed out, no one could know.

Mark is not giving an answer to a question, and the professor hasn't even asked one. Mark isn't even giving critical details to the question, but rather just the embellishment part of the question. The actual question that would have been asked, is never asked in the film. ("How many pages are there? How much memory do the page tables require?") This is the question I was anticipating, recalling the sorts of problems faced in my own system arch and OS courses regarding paging.

I may as well go ahead and answer it. To the best of my knowledge, a 16-bit virtual address means we have a system capable of addressing 65536 bytes (2^16) of memory. Divided into 256-byte chunks, that means we have 256 pages on the system. (8 + 8 bits, in other words: 8 bits of page number and 8 bits of offset.) Each of these 256 pages needs one page table entry. An entry has to hold the eight status bits plus a frame number, and if we assume physical addresses are also 16 bits (the problem doesn't say), the frame number takes another 8 bits. That makes each entry two bytes and the page table 512 bytes, or two pages, starting at 0x0400, just past the first 1K (four pages) reserved for the system. The 1/1/1/5 breakdown of the 8 status bits is inconsequential for our example; only their total matters. It is mentioned just as a hypothetical. (More so to keep each entry a round number of bytes.)
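For anyone who wants to check the arithmetic, here is a rough Python sketch. The function and parameter names are made up for illustration, and the width of the frame number is an assumption, since the problem never states the physical address size. With zero frame bits an entry is a bare status byte; with eight frame bits it is two bytes.

```python
def paging_summary(virtual_address_bits=16, page_size_bytes=256,
                   status_bits=8, frame_number_bits=8):
    """Return (number of pages, bytes per entry, page table bytes).

    frame_number_bits is an assumption: the sample problem does not
    give the physical address size.
    """
    address_space_bytes = 2 ** virtual_address_bits
    num_pages = address_space_bytes // page_size_bytes
    entry_bits = status_bits + frame_number_bits
    entry_bytes = (entry_bits + 7) // 8  # round up to whole bytes
    return num_pages, entry_bytes, num_pages * entry_bytes


def pages_below(address, page_size_bytes=256):
    """Number of whole pages that sit below the given address."""
    return address // page_size_bytes


if __name__ == "__main__":
    print(paging_summary(frame_number_bits=0))  # status byte only
    print(paging_summary(frame_number_bits=8))  # status byte + frame number
    print(pages_below(0x0400))                  # pages before the page table
```

Either way the page count comes out to 256; only the size of the table depends on how wide each entry is taken to be.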

So long as I'm not overlooking something, this question is actually pretty tame. I suspect it to be a feeler question used to introduce the topic to students in CS161. Even if I'm missing something in my answer, that's beside the point that this dialog was seriously botched.

What's truly frustrating about this scene - and why I don't feel that I'm being pedantic in saying so - is that the entire purpose of this scene is to illustrate Mark's brilliance, by having him give a correct answer to a difficult question. But they got it so wrong that he's uttering nonsense in the context of a non-question. To anyone familiar with the material, no, we *don't* see how he got there. They seem to have worked very hard at creating some level of accuracy, so I'm surprised that no one pointed this out. They certainly had some technical reviewers.

It is acceptable and forgivable when a film has technical inaccuracies hidden in the background that are displayed for a single frame; the fact that they happen makes sense. But gaping holes like this just gut this scene. It hurts personally, because it ruins what could have been the first, best, and last scene in a major motion picture that takes place in a computer science OS lecture room. I appreciate it when a movie features characters outside of those that are familiar to Hollywood (usually menial service industry jobs and the movie industry itself), and it's an opportunity for people to peer into the lives of others. I and most other programmers feel that the discipline is not well understood outside of the field ("You like, know how to fix computers and use Photoshop, right??") and here was the shot to get a glimpse of reality. Perhaps some of the character of the lecture was accurate, but the material was insulted in a way, by not checking its accuracy.

Imagine a professor asking a question in a math course:

Professor -- "One train is leaving City A at 100km/hr, towards City B. A second train leaves City B at.. anybody?"

Student -- "160km/hr!"

Professor -- "That is correct. Does everyone see how he got there?"


Lastly, the OS lecture had way too many girls in class. It's a well-discussed issue among CS educators that for some reason the field is dominated by men; it is a profession that is 95% male. No one really knows why this is so, and why the related field of mathematics has women outnumbering men in many places. If only my classes had that many women in them. (Not for personal reasons, of course, but for egalitarian reasons.) Discussion online has corroborated this: students that have taken CS161 at Harvard agree, mostly dudes.