Thursday, September 27, 2012

Machine Learning With K-Pop

I've finally found some time to write a few blog posts about areas of interest that are too lengthy to cover in a G+ post. This post is about Korean pop music, machine learning, and the Pandora Internet radio station that I recently created, Super K-Pop Radio, for my new favorite music genre.

If you follow my Google+ or Twitter accounts, you may have noticed all the links I've been posting to K-pop music videos on YouTube. From now on, I'm only going to do that when I find something particularly noteworthy, so as to avoid annoying my friends who have different tastes in music. I am sharing a big YouTube playlist of K, J, and C-Pop music videos, so you can check out that playlist if you're interested in all the new videos that I've discovered.

A few months ago I decided to make a serious effort to start learning Korean, primarily for business reasons (there are a few times when I think it would've been helpful for me to know a little Korean when working with one of our OEM partners on previous Nexus phones). It's something I've wanted to do for a while, and once I got started on it seriously (I'm a big fan of the Pimsleur audio CD lessons, so I ordered the Korean I 30-lesson course and am working through that right now), I discovered that I really enjoy the way the language sounds and the abundance of high quality pop music and dramas coming out of South Korea in the past few years.

About 5 years ago, I made a similar decision to learn Japanese, and I managed to learn a thing or two (also using the Pimsleur CDs plus a few books) before losing interest and moving on to other studies. I definitely want to continue studying and improving (or at least not losing) my limited Japanese, but for the moment, it's more exciting for me to work on Korean, and I have many more opportunities to practice hearing the language. There are lots of things I'd like to write about learning a new language, and about aspects of Korean that remind me of things I'd previously learned from other languages, like Japanese and also French (similarities in the vowel sounds and the way the words connect together when spoken), but this post is about music, so I'll save those thoughts for another post.

My current K-obsession really kicked off a few months ago when Google released a native YouTube player for the PlayStation 3. When I dusted off my PS3 to try it out, one of the top-ranking music videos was, you guessed it, PSY's Gangnam Style (technically I believe it was the duet version of the song with Kim Hyuna). Needless to say, I was hooked, and discovered many other cool K-pop videos in short order through the updated PS3 YouTube interface. The original Gangnam Style has over 278 million views on YouTube as I write this, and the duet version has over 54 million. Clearly this is a worldwide phenomenon. So I started to get seriously into K-pop, as it encompasses genres of music that I already enjoy, namely catchy dance-pop with good melodies, interesting rhythms, and R&B and electronica influences.

I've been a fan of J-Pop for many years, but J-pop as a genre doesn't really encompass the entire spectrum of uptempo pop music styles that I like. For example there's not a lot of R&B or funk influence in Pizzicato Five or Puffy AmiYumi songs that I can discern. K-pop seems to draw from a broader base of influences that match my own musical tastes, and I have some interesting data to prove that point, and maybe a few others. In fact, I think I can explain the popularity of PSY's Gangnam Style and Carly Rae Jepsen's Call Me Maybe (nearly 270M views!) quite easily based on their shared musical attributes.

The subject of this post is "machine learning," and one of my favorite methods for discovering new music in my favorite genres is the Pandora Internet radio service. I must apologize here to my international readers, because Pandora is currently only available in the U.S. (with limited access in Australia and New Zealand, according to Wikipedia) due to the ridiculous complexity and expense of today's music licensing landscape.

What makes Pandora different from other Internet radio services is that every song in their archive has been listened to by a trained reviewer, who tags the song in Pandora's database with a set of attributes describing its various musical characteristics (genre, tempo, major/minor key, instrumentation, singing style, etc.) according to well-defined criteria for each genre of music. For more detail, see this 2009 article or this 2011 interview.

This is different from other music-matching systems such as iTunes' "Genius playlist" / "Genius Mix" feature, which creates playlists of similar songs from patterns discovered in the user's own play history and ratings, combined with the listening habits of the millions of other iTunes users on the service. A "Genius mix" is just a larger playlist built around the genres that the service has determined the user listens to frequently, shuffled and looped like a radio station, but stored locally. The user can create any number of Genius playlists based around any single track in their library, but the list of Genius mixes and their contents is determined by the user's listening history and Apple's algorithm, and is not configurable.

Another common approach to playlist creation is taken by Google Play Music's "instant mix" feature, which uses automatic algorithmic analysis of the frequencies and rhythms of the songs in your library to find other songs with similar sonic features (according to the algorithm) to create a playlist around songs with similar themes. I would also recommend a free program called MusicIP Mixer, which is currently unsupported (the company was sold to Gracenote) but still freely available for Windows, Mac OS X, and Linux, that scans your local music collection and then generates playlists that seem to be about as good as Google's instant mixes.

Both approaches (user-behavior analysis and algorithmic analysis) work very well for creating playlists from your own music collection (presumably composed mostly of songs that you already like), but not so much for discovering new songs. With Pandora's approach, you create a new station starting from a specific genre, or using an individual song or artist as a seed (you can have more than one seed, but it's usually best to start with something specific). I tried to do this with a few K-pop artists but then discovered I got a much better result by starting with the entire K-pop genre as the seed. For each song that plays on a station, you can give Pandora a thumbs-up or thumbs-down, and it uses those scores to determine which songs to play on that station in the future.

So we have human reviewers who have cataloged each song in Pandora's library, and then human listeners scoring each song that plays on each of the millions of channels that people create on the service. Where does the machine learning come in? When people think of artificial intelligence, they often think of neural networks and other complex systems designed to simulate the behavior of human brains. That makes for great science fiction, and can actually be useful in certain areas such as speech and vision recognition, but for this application, we don't need anything nearly that complicated.

Instead, the simple approach (simpler and more predictable than a neural net, anyway) is to imagine each song occupying a position in a very high dimensional space (you can't actually visualize this, but try to imagine it conceptually). In this space, instead of the X, Y, and Z dimensions of the real world, you have a dimension to represent each type of musical attribute that is recorded in the system. Because there are so many attributes, there are very many dimensions, but for each dimension, there may be only a very limited number of possible values. In fact, for many dimensions the only values may be 0 or 1: a song either does or does not have an acoustic piano, or a laid-back female vocalist, or whatever, and every single attribute has its own unique dimension in the problem space.

Because there are so many dimensions, each song can be thought of as having its own unique (or nearly unique) position within this high-dimensional space, even if a particular dimension only has 2 or 3 possible values. Instead of an X,Y,Z position in space, each song's "position" is a long list of numbers representing its value for each of the many attributes along which it has been scored by Pandora's reviewers. "Nearby" songs in this high-dimensional space will have close or identical values along many dimensions, while "distant" songs will share few. For each radio station, Pandora has to decide which song to play next based on the feedback given by the user (thumbs up or down) for previous songs that it has played on that channel. Songs with thumbs down are eliminated from the channel, but how can the thumbs-up scores be used to discover similar songs with similar attributes?
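To make the idea concrete, here's a toy sketch in Python. The attribute names and scores are invented for illustration, and Pandora's real attribute list and scoring scale aren't public, so treat this as nothing more than a picture of the concept:

    # Toy sketch: each song is a vector of attribute values in a high-dimensional
    # space. The attributes and scores below are invented for illustration.
    import numpy as np

    ATTRIBUTES = ["acoustic_piano", "laid_back_female_vocal", "synth_textures",
                  "four_on_the_floor_beat", "minor_key", "rnb_influence"]

    songs = {
        "Gangnam Style":       np.array([0, 0, 1, 1, 1, 1]),
        "Call Me Maybe":       np.array([0, 1, 1, 1, 0, 0]),
        "Hypothetical Ballad": np.array([1, 1, 0, 0, 1, 0]),
    }

    def distance(a, b):
        """Euclidean distance between two songs' attribute vectors."""
        return float(np.linalg.norm(a - b))

    for name, vec in songs.items():
        print(name, distance(songs["Gangnam Style"], vec))

"Nearby" songs produce small distances and "distant" songs produce large ones, which is exactly the intuition behind the space described above.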

This post is long enough already and I'm not an expert in the field, but the Wikipedia entry for support vector machines goes into much greater detail on how the math works. Essentially what the Pandora service does (or is likely doing, since I have no inside knowledge of their algorithm) is a whole bunch of matrix arithmetic to construct a hyperplane (or more complex shape) to cut through this high-dimensional space of musical attributes, dividing the songs that are likely to be appropriate to play on that channel (based on the weighted thumbs-up scores of previously played nearby songs) from the ones that are likely not to be appropriate. Thumbs-down scores only eliminate a particular song from the channel; they don't signal a dislike for similar songs in the way that a thumbs-up signals a liking for them, but I think they do signal a slight dislike for other songs from that artist. Similar models are used for tasks such as email spam classification or automated sorting of news stories into categories (e.g. Google News).
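Since I mentioned support vector machines, here's what the general idea looks like in code. This is emphatically not Pandora's actual algorithm (as I said, I have no inside knowledge); it's just a minimal sketch using scikit-learn on the same kind of made-up attribute vectors as above, where a linear SVM learns a hyperplane separating thumbs-up songs from thumbs-down ones:

    # Minimal sketch: fit a separating hyperplane between "liked" and "disliked"
    # songs using made-up attribute vectors. Not Pandora's real algorithm.
    import numpy as np
    from sklearn.svm import SVC

    X = np.array([           # rows are songs, columns are attribute dimensions
        [0, 0, 1, 1, 1, 1],  # thumbs-up
        [0, 1, 1, 1, 0, 1],  # thumbs-up
        [1, 1, 0, 0, 1, 0],  # thumbs-down
        [1, 0, 0, 0, 0, 0],  # thumbs-down
    ])
    y = np.array([1, 1, 0, 0])       # 1 = thumbs-up, 0 = thumbs-down

    clf = SVC(kernel="linear")       # construct the separating hyperplane
    clf.fit(X, y)

    candidate = np.array([[0, 1, 1, 1, 1, 1]])   # an unplayed song
    print("Play it?", bool(clf.predict(candidate)[0]))

A real system would weight recent feedback, handle thousands of dimensions, and cope with far more data, but the core idea of cutting the attribute space with a decision boundary is the same.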

The primary weakness of this approach for music is that there isn't a dimension in the Pandora database to express whether or not a singer or a song is actually any good, by the individual listener's subjective standards of quality. The only way that Pandora could build a system that scales up to its current size is by limiting their list of attributes to reasonably objective attributes for which a relatively large number of human listeners can be trained to give consistent scores across songs and reviewers. Whether a song uses an acoustic piano or an electric guitar is something everyone can agree on, but whether a singer or a song is any good is not so clear-cut. So here's what happened when I created Super K-Pop Radio and started rating songs based on my personal standards of good pop music.

The main problem that people typically have with Pandora is that they always get the same 20 or 30 songs and have a lot of trouble adding more variety to a particular station that's of high quality. For those who aren't familiar with the service, the particular Devil's pact that Pandora had to make with the recording industry in order to be considered an Internet radio station, as opposed to some other kind of music service with an even more exorbitant royalty structure than what they currently have to pay, requires them to limit the number of songs you're allowed to skip: currently the limit is 12 skips across all channels every 24 hours. The limit used to be 6 skips per channel and 72 skips total in any 24 hour interval, but the new system actually works better for me when I'm focused on fine-tuning a single station like I am now.

So this limitation makes creating and tuning your stations a sort of game: when a bad song comes on, you have to choose between listening to the whole thing and rating it thumbs-down after the next song starts (which carries no penalty), or hitting thumbs-down immediately to skip the rest of the song, giving up a chance to skip an even worse song that might come on later.

Because of the limitation on skips, sometimes it's difficult to build up a large thumbs-up list due to the combination of horrible songs in the exact genre that you want, horrible songs in a completely random genre, and songs you've already rated that it's already played for you 5 times in the past 5 hours. Fortunately for the last case, there's a separate "I'm tired of this song" button which will remove that song from all of your channels for some period of time, which works fairly well for me once I have a large enough collection of thumbs-up on a channel.

Pandora really needs to add an "achievements" system like Xbox's to give you trophies for listening to a particular song 20 times or 50 times or 100 times on a channel. It does seem to have a habit of finding 5 or 6 songs that it thinks are particularly representative of that channel and playing them for you as often as possible until you are really and truly tired of them, but once you do mark one "tired", usually a few new songs pop up in the list that might have been hidden by the greater "star power" of the songs it wants to play all the time. As a challenge, I like to listen to those songs as many times as I can absolutely stand before clicking "I'm tired". My playlist is long enough that by the time one comes around again, I'll probably really be wanting to hear it.

So here's the interesting thing that happened with the K-pop channel. Instead of getting the same 10 songs, I actually accumulated a collection of nearly 100 thumbs-up (out of about 200 songs total) before Pandora completely ran out of South Korean bands to play that it thought were closely related. Then it switched completely to songs from non-Korean bands. What I discovered is that the songs I like in the K-pop genre blend seamlessly into songs from R&B, synth pop, and J-Pop, so the pool of songs matching the station's attributes became huge.

But I didn't actually like more than a small number of songs from a small number of the non-Korean bands. The other thing I quickly discovered is how much bad R&B, bad synth pop, bad Latin pop, bad dance pop, etc. there is out there. By this I mean that, in general, music I really like and want to listen to has to meet a very high standard in a few areas:
  1. Very high quality singing. Auto-Tune can conceal a multitude of flaws, but somehow there are so many bad singers (to my ear), including some big names, like Kesha (horrible!), Lady Gaga (I like a few of her songs, when she's singing in tune), Katy Perry, Justin Bieber, etc. that if I didn't have plenty of good K-pop songs to listen to, I'd probably abandon the entire genre.
  2. Equally high quality musical performance and production values.
  3. Interesting and enjoyable composition, melody and rhythm.
It's the third item that is the most challenging to judge, and of course the most subjective, but usually I can come down on one side or the other after one listen. BTW, it amazes me how many Disney child singers and musical soundtracks come up (because they're in exactly the K-Pop style) and how, even when the singing is not bad, the composition of the songs is always just so soulless and boring.

If you're a parent of young children who has to listen to those Disney soundtracks all the time, and you think it's horrible soulless corporate garbage, it's not because you're old and out-of-touch, it's because it really is soulless corporate garbage that doesn't have any reason to justify its existence. Buy your kids some Girls' Generation or Super Junior or the K-Pop version of whatever style they like, and you still might not like that kind of music, but at least it'll have some soul and some actual musicality to it, and they'll be better off for having some quality music to listen to.

So after many hours of listening and 300 or so ratings of non-Korean bands, Pandora is now starting to mix in songs from the K-Pop bands again, and even branching out into J-Pop and some other areas that I also like. But as I write this, I've liked only 124 out of 527 songs: 100 songs from South Korean bands plus 24 songs from the rest of the world.

If you like stats, I actually made a spreadsheet to add up the numbers, which I know marks me as a hopeless geek. I'm continuing to update the spreadsheet whenever I listen to the station, so here's a link to view the current version and here are snapshots at 500 songs and 400 songs ranked (you can open them in different tabs and flip back and forth). BTW, if you use Gmail and are concerned about your email privacy, you may want to open those links in an incognito window or log out of Google before opening them because otherwise your username will probably show up in the list of viewers where any other viewer could see it. Edit: because I made the doc world readable, Google Docs will keep your name and email address private and you'll show up to other users as "Anonymous user #x" if you're signed in.

Column A is the artist, column B is the # of songs from that artist that I rated thumbs-up, column C is the # rated thumbs-down, and column D is 1 for a South Korean band, or 0 otherwise. B2 is the total thumbs-up count, C2 is the total thumbs-down count, D2 is the # of Korean bands in the list, and E2 and F2 are the # of Korean bands I gave thumbs-up and thumbs-down ratings to (columns E and F are formulas I copied down the page, multiplying the cells in that row in columns B and C by the value in D, so the values are set to 0 for the non-Korean bands). G2 and H2 are the non-Korean thumbs-up and thumbs-down values, and in columns H and I, I've put some formulas to provide the interesting stats.
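If spreadsheets aren't your thing, here's the same bookkeeping as a little Python sketch. The artist names and counts are placeholders rather than my actual ratings; the point is just to show how the per-row formulas roll up into the summary cells:

    # Placeholder data: (artist, thumbs_up, thumbs_down, is_korean)
    rows = [
        ("Girls' Generation",     8, 1, 1),
        ("Super Junior",          5, 2, 1),
        ("Carly Rae Jepsen",      1, 0, 0),
        ("Generic Synth-Pop Act", 0, 4, 0),
    ]

    total_up     = sum(r[1] for r in rows)            # like B2
    total_down   = sum(r[2] for r in rows)            # like C2
    korean_bands = sum(r[3] for r in rows)            # like D2
    korean_up    = sum(r[1] * r[3] for r in rows)     # like E2 (column E = B * D)
    korean_down  = sum(r[2] * r[3] for r in rows)     # like F2 (column F = C * D)
    other_up     = total_up - korean_up               # like G2
    other_down   = total_down - korean_down           # like H2

    print(total_up, total_down, korean_bands, korean_up, korean_down,
          other_up, other_down)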

So if you want an explanation for the popularity of both "Gangnam Style" and "Call Me Maybe", I think the simplest one is that they're catchy pop songs with high musical production values and good composition in a genre where it's easy to stand out, because the rest of the pack (in this particular genre) tends to be pretty mediocre. I think it's not a coincidence that PSY, aka Park Jae-Sang, attended Boston University and Berklee College of Music. He knows his music theory, and he had 10 years of experience making records in South Korea before Gangnam Style rocketed to the top of the charts. Edit: the Pandora bio for PSY says he graduated from both schools, but other sources say that he attended but did not graduate.

If there's a distinguishing characteristic of K-Pop as a genre, I would say it's the quality and attention to detail, as opposed to something that's specifically Korean (they're not wearing hanbok and playing the gayageum or anything like that). I'm definitely a bit unusual for being so particular and not really being able to enjoy music that isn't of super high quality; a lot of people don't have a good ear for pitch, so they don't even notice the little things that turn me off. Being this picky definitely limits the number of songs I enjoy listening to, but then again, I had private piano lessons as a child and played in the school band for many years, so I have a certain set of standards that people who didn't play musical instruments as a kid don't have. I wouldn't be surprised if the South Korean school system and society place a much higher value on musical education than the U.S. does.

I'll keep listening to the Pandora station and adding thumbs up and thumbs down, and feel free to listen along if you enjoy the genre as much as I do. I'll try to start on my next blog post in a day or two.

Tuesday, July 10, 2012

Adventures In Analog Audio

I recently started to digitize my parents' vinyl record collection, as well as any interesting LP's I can find at music stores and from friends. Vinyl has made a big resurgence in recent years, so it's a great time to pick up the right equipment to do the job properly. While you're reading this, check out the videos I posted to YouTube of the first three tracks of Breezin' by George Benson, which happened to be the album with the best audio quality in my collection so far. Be sure to set the video quality to 720p HD in order to hear the audio at full quality (384 kbps stereo AAC).
First, a comment on the myth that vinyl sounds "superior" to CD's. I know there are a lot of people who believe this, but it's not literally true. Vinyl imparts a particular sound to music which some people enjoy. I know I have songs in my library where the artists have intentionally inserted the pops and hiss and dynamic compression of a vinyl record into the song itself, but in the best case scenario, CD quality digital audio will offer superior audio quality to even the best mastered LP records.

There's a historical element of truth to the belief that some albums sound better on LP than on CD, because good mastering of LPs has been well understood since the 1970s, while poor mastering of audio CDs was common until the mid-1990s. There are a lot of tricks that needed to be discovered to build up a set of best practices for mastering digital audio, just as there are tricks for mastering LPs for the best quality. Unfortunately, CD's issued in recent years sometimes suffer from the disease of over-compression, a consequence of the Loudness War, which leads to poor audio quality.

Don't get me started on the loss of audio quality caused by the common psychoacoustic (lossy) compression techniques used by MP3, AAC, Ogg Vorbis, and other popular audio codecs to remove portions of the audio that your ears and brain are not supposed to notice are missing. That's a topic for another post. I'm digitizing LPs at DAT quality (48 kHz, 16-bit samples), saved as FLAC (lossless compression which preserves the original bits), from which I can convert to MP3 or AAC if needed, so the appropriate comparison is to CD's converted to FLAC.
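For what it's worth, the final WAV-to-FLAC step doesn't require anything fancy. Audacity can export FLAC directly, but here's a minimal sketch using the pysoundfile library (the filenames are hypothetical):

    # Minimal sketch: save a 48 kHz / 16-bit capture losslessly as FLAC.
    import soundfile as sf

    data, rate = sf.read("side_a_capture.wav")   # hypothetical capture file
    assert rate == 48000
    sf.write("side_a_capture.flac", data, rate, subtype="PCM_16")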

For those who want to believe in some mystical purity of analog audio, you're buying into some voodoo. For more on the psychology of the vinyl vs. CD debate, as well as the thoughts of recording engineers with experience in these matters, see this NPR story, Why Vinyl Sounds Better Than CD, Or Not.

The upshot is that in the very best case, an LP can sound indistinguishable from a CD, but there are a whole bunch of little details that you have to get right in order to get the best sound. For a CD, the only piece that needs to be of high quality is the final digital-to-analog conversion stage, which usually happens in the receiver connected to your speakers, or the audio chip connected to the headphone jack in a portable device. The only requirement for the CD is that it not be so scratched or damaged that the bits can no longer be read from it. For an LP, the quality of the analog signal from the record needle has to be preserved as much as possible until the samples are digitized. This means that the record needs to be clean, the needle needs to be clean, and the turntable has to be well-designed to avoid distortions such as rumble, wow, and flutter.

Traditionally, a turntable will output an analog stereo signal at the unamplified voltage generated from the magnetic fluctuations in the cartridge, measured in millivolts. Then a phono pre-amp is used to amplify the signal to the more common line-level voltage used by analog audio inputs to a stereo receiver. The pre-amp also has the important task of reversing the RIAA equalization that is performed before mastering the disc. RIAA equalization reduces the bass frequencies by up to 20 dB so that the needle doesn't jump out of the groove on tracks with heavy bass. It also boosts the treble frequencies by up to 20 dB so that the background hiss picked up by the needle due to imperfections in the surface is reduced when the record is played back (similar to Dolby noise reduction in cassette tapes). Finally, the analog line-level signal can be digitized by an analog-to-digital converter.
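If you're curious what that equalization curve actually looks like, here's a rough sketch of the standard RIAA playback (de-emphasis) response, computed from the published 3180 µs / 318 µs / 75 µs time constants and normalized to 0 dB at 1 kHz. This is just the textbook curve, not anything specific to my setup:

    # Rough sketch of the standard RIAA playback (de-emphasis) curve.
    import numpy as np

    T1, T2, T3 = 3180e-6, 318e-6, 75e-6      # RIAA time constants, in seconds

    def riaa_playback_db(f):
        w = 2j * np.pi * f
        h = (1 + w * T2) / ((1 + w * T1) * (1 + w * T3))
        return 20 * np.log10(np.abs(h))

    ref = riaa_playback_db(1000.0)           # normalize to 0 dB at 1 kHz
    for f in (20, 100, 1000, 10000, 20000):
        print(f"{f:6d} Hz: {riaa_playback_db(f) - ref:+6.1f} dB")

You should see roughly +19 dB of bass boost at 20 Hz and roughly -20 dB of treble cut at 20 kHz, the mirror image of what was applied when the disc was cut.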

Rather than three separate boxes, I'm using a single turntable that was designed for this application, as well as for DJ's spinning records in clubs: the Stanton T.92 USB. In addition to outputting analog audio at either phono or line-level, it includes an A/D converter and both USB and coax S/PDIF outputs. For digitizing, I've connected it to my MacBook running the free Audacity audio editor. So all of the analog stages are happening completely inside the turntable, which minimizes the chances of introducing interference or reducing audio quality due to cheap cables connecting the components, and eliminates having to set the various volumes to the appropriate levels to avoid clipping.

I had to change two settings in Audacity to get the best audio quality. The first was increasing the sample rate from 44.1 kHz to 48 kHz. This makes only a small difference in practice, especially if your hearing in the upper frequencies isn't good enough to notice, but for this turntable, it sounded to me like the absolute quality at 44.1 kHz is worse than would be expected merely from the lower sample rate. Either way, it's better to capture at the higher sample rate and resample down to 44.1 kHz if necessary than to have never captured that information in the first place.
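If you do need 44.1 kHz output (for burning to CD, say), the ratio works out to exactly 147/160, so a polyphase resampler can do the conversion cleanly. A quick sketch, assuming scipy and pysoundfile (filenames hypothetical):

    # Sketch: downsample a 48 kHz capture to 44.1 kHz (44100/48000 = 147/160).
    import soundfile as sf
    from scipy.signal import resample_poly

    data, rate = sf.read("side_a_capture.wav")      # 48 kHz capture
    resampled = resample_poly(data, up=147, down=160, axis=0)
    sf.write("side_a_44k1.flac", resampled, 44100, subtype="PCM_16")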

The second change was to disable dithering in Preferences: Quality. Dithering is a step that is intended to reduce unwanted high-frequency "aliasing" but in practice, because the samples are originally 16-bit, they don't need any further processing or rounding of values, and any such dithering only reduces the high-frequency aspects of the sound. I've discovered that iTunes on Mac OS appears to automatically dither the output, which leads to slightly reduced quality in my experience (it's especially obvious that this is what's happening if you try to play a high-frequency 16 kHz "mosquito tone" through iTunes, which will play with loud low frequency overtones introduced by the dithering process).
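If you want to hear the effect I'm describing for yourself, it's easy to generate a test tone and play it through different players. Here's a small sketch that writes a ten-second 16 kHz sine wave to a 48 kHz, 16-bit WAV file (again assuming pysoundfile); a clean playback chain should reproduce it as a barely audible whine with nothing underneath it:

    # Sketch: generate a 16 kHz "mosquito tone" test file (48 kHz, 16-bit).
    import numpy as np
    import soundfile as sf

    rate = 48000
    t = np.arange(10 * rate) / rate                  # ten seconds of samples
    tone = 0.5 * np.sin(2 * np.pi * 16000 * t)       # 16 kHz sine at -6 dBFS
    sf.write("mosquito_16khz.wav", tone, rate, subtype="PCM_16")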

Finally, I replaced the cartridge bundled with the T.92 with one designed specifically for digitizing records, the Ortofon Arkiv Concorde. It was a simple plug-in replacement for the head shell bundled with the T.92. One word of caution: the cartridge came with an O-ring to insert between the cartridge and the tone arm, which was sized differently from the O-ring on the bundled head shell. I originally used the O-ring from the Ortofon, and it did not dampen the vibrations sufficiently; the tone arm vibrated enough that the music was clearly audible coming from the tone arm itself, which is clearly going to introduce distortion into the process. Using the O-ring from the Stanton head shell with the Ortofon cartridge eliminated the vibration transfer to the tone arm.

The last point I wanted to make was the importance of cleaning the album before playing it. Vinyl albums can easily build a static electricity charge, which attracts dirt and dust, leading to pops during playback. I was able to acquire a Discwasher cleaning brush from a friend, which works like a large lint brush to pick up the dust from the records. These cleaning kits originally came with isopropyl alcohol-based cleaners, which are not recommended because they can damage the vinyl. Instead, I'm using a fuzzy cleaning cloth from an LCD cleaning kit (the kind used for cleaning glossy LCD screens), along with a bottle of Xtreme Klean screen cleaner from Fry's electronics. The ingredients are listed as deionized water and "proprietary polymers." More important is what it doesn't contain: alcohol or ammonia. It's also antistatic, which is good. Records should be cleaned by brushing in a circular fashion, so as to minimize the chances of scratching the grooves.

I'll add some photos and additional comments later, but I think this covers the basics. Please let me know in comments if you'd like any more info on any of the steps. Happy listening!

Thursday, January 12, 2012

The Myth of Obsolescence

There's an insidious belief held by many people that computer hardware inevitably becomes obsolete and useless after only a few years, at which point it must simply be thrown away as E-waste and replaced with the "next new thing." It's true that the capabilities of computer hardware continue to improve at an exponential rate, but this doesn't imply that the computer that was good enough to perform a task in the past is necessarily no longer adequate today, simply because something else has come along that can perform the same task slightly faster, or otherwise better.

One point I'm trying to make with this series on old computers is that we don't have to buy into the assumptions of planned obsolescence that companies often try to foist on consumers in order to make more money. In fact, with nearly 7 billion people on the planet today, our collective environmental impact, along with peak oil and global climate change, will force us to make more conservative use of our natural resources in the 21st century than the United States and other industrial countries ever had to consider in the last half of the 20th.

If we continue to treat computers and other electronics as merely cheap and disposable products, and don't give any thought to reusing our older equipment, then eventually new products will become more expensive due to higher demand and a lower supply of raw materials. They'll probably be of lower quality too, because when people don't expect something to last for a long time, they're less likely to demand (and be willing to pay the slightly higher price for) products that are designed to last.

Where hardware is concerned, you do often get what you pay for, and high-quality components that were expensive when new are often available used for a tiny fraction of the original price. The Apple IIe (Platinum series) and Apple IIc+ (upgraded to 8MHz) that I recently purchased on eBay for $154.70 and $249.99 respectively were admittedly nostalgia-driven purchases, but they are so solidly built and the keyboards are so nice to type on that I was willing to pay the price. They were also extremely expensive computers when new: I would never have paid $750 in 1988 for an Apple IIc+, or $1400 for an Apple IIe, at a time when an Amiga 500 cost about $550, and was far more powerful, but now that I own one of each, I can almost see why some people were willing to pay such high prices for Apple II gear.

One area I will spend a lot of time covering is the SCSI bus, as this is the interface used to connect hard drives, CD-ROMs, and tape drives to many of the older machines, including my Amiga, Alphas, and VAXstations, and there are a number of tricky details to cover. Some years ago, Seagate released an informative white paper explaining why SCSI drives almost always cost more than the IDE drives of the same era. The differences were all related to the higher performance and reliability requirements of the server applications where SCSI hard drives were used, versus the lower prices that users were willing to pay for IDE drives (accepting lower performance and reliability in exchange).

In the paper they mention an MTBF of 1,000,000 hours for a typical high-end SCSI drive. That's over 114 years of continuous usage! Now I have no idea how manufacturers justify this estimate without a time machine, but the point is that these are the goals that the drives were built to achieve. I had to go through 5 or 6 different 36 GB SCSI drives before I found one that was quiet enough to put in the Amiga: most of them had annoying high-pitched whines, and I have extremely sensitive ears. Fortunately, I live near Weird Stuff, which has an excellent selection of used computer gear and a good return policy, so trying several drives for this project wasn't a problem. According to smartctl, the drive I kept had a lifetime usage of 13,756 hours (about 1.5 years) when I bought it (the drive itself is probably about 10 years old). If I'm lucky, this drive will last another 20-30 years, and hopefully the Amiga itself will also last as long.

Tuesday, January 10, 2012

Restoring Old Computers

Now that I'm fairly well adjusted to my new, happier, life working at Google on Android, I'm returning to blogging with a series of articles on one of my favorite hobbies: restoring and reusing old computers. I'm going to focus on three very different computers: the Commodore Amiga, the DEC Alpha, and the Apple IIe.

What do these systems have in common (besides having names that start with the letter A)? The main feature that attracts me is that none of them uses an Intel processor or runs MS-DOS or Windows (and they're not Macs either). So if your definition of PC is a generic personal computer, then they all qualify, but far more people use the term PC primarily to mean a very specific type of computer, namely one with an Intel (or compatible) "x86" processor that runs Windows (as in Apple's famous Get a Mac ad campaign).

If you're under 25, you might be too young to remember this, but there was a time before Microsoft's relentless, predatory, and sometimes illegal business practices forced nearly the entire computer industry to standardize on their Windows-branded operating systems and the Intel (or Intel-compatible) CPUs that ran the most compatible flavors of Windows. It was a time of great innovation. For example, the first Web browser was invented (along with the early versions of the HTTP and HTML protocols that serve as the backbone of the Web itself) by Tim Berners-Lee on the NeXT computer and OS, Steve Jobs's ill-fated proprietary UNIX-based platform of the 90's. NeXT was failing miserably in the marketplace despite being quite advanced technically, and would have disappeared completely from the mainstream, just as the Amiga, the Alpha, BeOS, OS/2, and many other quirky and often cool platforms disappeared under the onslaught of the MS-DOS/Windows juggernaut, if Mr. Jobs hadn't been canny enough to convince Apple to buy his company, at which point Steve worked his way back to the reins of power and NeXTSTEP became the foundation of the now amazingly successful Mac OS X.

The significant thing about the invention of the Web is that it might never have happened if Mr. Berners-Lee had been working on a PC running Windows, because you simply couldn't build something like a Web browser on that platform without running into all sorts of annoyances and weaknesses. He might have written the first Web browser on an Amiga, or perhaps on a Mac, but I think it's fairly well established that NeXT had the best development environment if you wanted to create powerful and reliable software with an extremely small team (Berners-Lee worked with another computer scientist and an intern) in a short amount of time. Anyway, it took another 10 years before Windows development tools really caught up to what NeXT (itself a very small company) created back in 1990.

Similarly, the PC hardware of the 90's left a lot to be desired. For example, non-autoconfiguring devices, such as cards for the standard 8-bit and 16-bit ISA bus, included the fun activities of setting jumpers or DIP switches on the card to manually select a port range, IRQ, and possibly DMA channels to use, making sure your choices didn't conflict with other cards in the system, and editing your CONFIG.SYS and AUTOEXEC.BAT startup files to inform the OS and programs of your choices. Add to that the time spent trying to get all your MS-DOS drivers to load into "high memory" in order to maximize your "conventional memory" below 640K, and a whole bunch of other annoying stuff that I remember wasting literally hundreds of hours on myself as a young PC hacker, before the PCI bus came along and things slowly started catching up to what the other platforms had been doing right all along.

That's one reason I want to point out the Apple II: Woz had "Plug & Play" functionality in the Apple II in 1977, and IBM didn't bother to make their PC similarly user friendly, even though it came out four years later and had the full weight of IBM's engineering resources behind it (compared to a single geek in a garage). IBM didn't rectify this mistake until 1987, when they introduced a new proprietary computer bus called MCA in the PS/2 series, which other vendors couldn't use in their PC compatibles without paying patent royalties to IBM. This didn't sit well with Compaq, HP, Tandy (remember them?), and the rest of the Gang of Nine, who created their own alternative architecture called EISA, which also didn't catch on. VESA Local Bus was another short-lived standard for video cards, but eventually PCI won out and we're better off for it. But the "PC compatible" architecture still includes a lot of legacy junk from the ISA days, even though it's all emulated inside a single chip these days.

The Apple II has auto-configuring cards, my two Alpha workstations use PCI (including a few slots supporting the rare 64-bit variant), and the Amiga includes a proprietary auto-configuring bus called Zorro. So at least from a usability perspective, they're all as friendly to upgrade as a PC from the year 2000, and far superior to the PC's of just five years earlier. The story is similar on the OS front: MS-DOS and Windows (particularly the 16-bit and CE variants) were broken and pathetic compared to the far more capable OS's of the other platforms, and it wasn't until Windows XP that the PC-compatible industry had an OS that was both easy to use and not completely broken when compared to a typical UNIX system. You could make the same argument about the Mac's OS, which didn't really get good until Mac OS X 10.3 came out in 2003 (in my opinion).

What did UNIX (and other heavy-duty OS's like VMS) do that the wimpier platforms like DOS and Windows could not? Well, quite a number of things. Protected virtual memory that prevented buggy apps from bringing down the system or interfering with each other. Support for multiple users, with file system security so users couldn't access each other's files without permission or modify critical system files. The X Window System, which evolved quickly from simple beginnings because it was extensible enough to support a variety of approaches that made the platform more user friendly and better looking, such as the popular commercial graphical toolkit called Motif (now available as open source), and today's GTK and Qt toolkits that form the basis of the popular GNOME and KDE desktops on Linux and other UNIX systems. X has been so wildly successful that one of the nice features of Mac OS X (for a UNIX geek at least) is Apple's inclusion of an X server with the OS, which makes running any type of UNIX app really easy, compared to Microsoft's half-hearted efforts to provide UNIX compatibility in Windows, which they only grudgingly seem to support.

So I'll be running UNIX on the Amiga and the Alphas, specifically the NetBSD flavor, which is open source, and well supported on a huge number (57 at last count!) of different platforms, including x86 PC's, of course. I've been running the Amiga version of the 5.x stable branch, and it has been rock solid reliable for me. I'll be switching over to the current branch that will become NetBSD 6.0, so that I can help fix bugs and submit a few changes and improvements that I've written to the NetBSD community (the current branch is where active work is done for new features).

I'll have more to say about NetBSD and the Commodore Amiga series in the next post, so stay tuned if you're interested in this sort of stuff. Comments and feedback welcome. I also have a Google+ post about some hacking I did to get CD quality audio out of the Amiga, something that I didn't think possible until I learned about the 14-bit hack, the Paula calibration hack, and the 31.5 kHz video hack.

Tuesday, May 25, 2010

Halloween XII: The Reckoning

I've been working at Google for almost three months now, and enjoying myself immensely. I'm excited about the Android platform and think it has a very bright future. It's also a fun platform to write code for. This post isn't about Android, though.

The title of this post is a reference to the Halloween Documents, the first of which was an internal Microsoft memo leaked by an employee to Eric S. Raymond, who posted it to his website in 1998 (on Halloween), where it was picked up by Slashdot and other sites. The topic of the memo was the growing threat of Linux and the open-source movement, and the ways in which Microsoft could attempt to neutralize the threat to their closed-source locked-in monopoly OS, office suite, and related products. The other Halloween documents are a combination of other leaked memos and commentary on the topic of Microsoft, and specifically their heavy-handed PR (and other) tactics to try to discredit Linux and open source.

I recently reread the Halloween documents, as well as a number of other articles about Microsoft and their dastardly ways, many of them culled from TechRights.org, a site that tirelessly catalogs all of the wrongdoings of Microsoft and related companies. The list is quite lengthy. In a little over five months, it will be the 12th anniversary of the leaking of the first Halloween memo.

My own personal experiences with Microsoft in the past have been mixed. Sometimes their stuff works well, other times it's somewhat buggy, and far too often it's completely broken in one way or another. By and large, I've tried to stay away from their products as much as possible. In July 2005, I was hired as a software engineer at Danger, Inc., a small startup company that had made a splash in certain communities as the designer of the Hiptop messaging phone, better known as the T-Mobile Sidekick.

During my time at Danger, I implemented a number of features and fixed quite a few bugs in the telephony layer of the stack, and I'm quite proud of the work that I did on the six phones that we shipped during my time there: hiptop/Sidekick 3, iD, LX, Slide, 2008, and LX 2009. I would have happily stayed there, except for the unfortunate situation that we were acquired by Microsoft in 2008.

As I described in a previous post, I did not have a very fun time at Microsoft, in any way, shape, or form. It was an incredibly stressful situation, compounded by the fact that I'm a UNIX guy at heart and found the Windows environment we were working in to be woefully primitive and clunky, and this is coming from a guy who actually likes OpenVMS, for God's sake!

I became more and more depressed until I burned out completely and had to quit last summer. I left a few months before Microsoft lost all the T-Mobile Sidekick users' data: I think I would have lost it completely if I'd been around when that happened. I would have probably thrown a chair at Roz and gotten fired or something.

Perhaps you've guessed by now that I was one of the earliest assigned to work on the Kin project (well, it was "Pink" at the time). I have nothing good to say about that phone. There is literally no reason for anyone to purchase one, unless of course you're a Microsoft employee, and you've been brainwashed into "supporting the team".

The sickest thing to me about the Microsoft experience was how incredibly cult-like the company is. Now I know that Apple has been accused of having something of a cult mentality (no non-iPhones allowed, etc.), but if they're the cult of Steve, at least their cult leader has very good taste.

For the record, Google feels very much like a university or research campus, and freedom of thought and opinion is expected, and rewarded. Engineers are an ornery bunch, and we don't have to censor ourselves if we think something isn't right. It's all about making cool stuff that people will enjoy using, and not about "killing" or "dominating" or "cutting off the air supply" of the competition, as Microsoft once liked to talk about. Of course they're not able to do that so much anymore.

One thing that Microsoft seems to do a lot of is "AstroTurfing", particularly commenting on news stories about Microsoft with fawning comments that don't seem like they came from a real person. The big tip-off signs of a comment that obviously came from an MSFT employee are that they refer to the company as MSFT and not MS or M$ or any other abbreviation, they toe the party line, cheerlead for some product or other, and never say anything negative. There is an independent Windows "enthusiast" contingent that likes to pimp Microsoft, especially if they can put down Apple or Google at the same time, but those guys tend to not sound so much like they're from the People's Republic of MSFT. Seriously, if you work at the company, write that in a disclaimer or something... oh wait, you can't do that because your comment would sound even more pathetic. Maybe you shouldn't comment at all then, if you find yourself in that position.

There's a famous quote by John Gilmore of the EFF: "The Net interprets censorship as damage and routes around it." I'd like to think that over the past 10 years, the entire tech community has decided to interpret Microsoft as damage and has quite successfully routed around them in a variety of ways, starting with Firefox, OpenOffice, and the increasing popularity of Linux distributions such as Ubuntu, and more recently with some of the stuff Google has been doing, such as Android, and Chrome. [edit: and of course all the awesome stuff that Apple has been doing since the return of Steve Jobs. Apple has been the huge tech success story of the past decade (well, Apple and Google), and I can't believe I left them off the list.]

Gmail and Google Docs aren't open source, but in terms of "routing around (the brain damage of) MS Office/Exchange/SharePoint", I can say that I'm enjoying the Google Apps experience far more than I did the equivalent Microsoft version. The Google Docs word processor and spreadsheet are still somewhat primitive, but my needs are generally pretty simple, and if I need more power, I can always use OpenOffice. I wouldn't have thought that I could be so happy switching back to Ubuntu, Gmail, vim and IntelliJ, from using the "latest and greatest" Windows, Office, SharePoint, Visual Studio, etc. at Microsoft. Did I mention how much I hated working on Windows Mobile? That stuff is just broken beyond belief.

And it's in the mobile space where Microsoft has most completely fallen down. One thing I learned about myself from the whole Kin ordeal was that I truly have a great deal of identification with the stuff that I'm working on. If I don't think there's a purpose and a meaning behind the code I'm writing, then I become very upset. Now a company like Google would tend to think that being passionate about making the best possible product is probably a good trait to have in a software engineer, but it's a genuine disadvantage at a company like Microsoft, where all is politics, and the middle management is utterly adrift.

Let me just say that I think that Steve Ballmer is a clown and a buffoon, and he has no idea just how utterly pointless the whole "Windows Phone 7" exercise is. Good luck with that, dude, but I don't see it having any more of a chance than Kin did (i.e. slim to none). The rest of the mobile industry has already routed around your flavor of brain damage, and I hate to break it to you, but I'm pretty sure that your remaining handset and carrier partners are pretty much just humoring you at this point.

In the end, I'd say that Paul Graham called it, when he blogged in 2007 that Microsoft has been dead since about 2005. They've been dead in the sense that the other innovators in the market no longer fear that they'll step all over them and screw things up. Well, they could (and did) do that to Danger by acquiring us, and they stepped all over Yahoo by threatening to buy them, so they still have some power to interfere with progress. They can still bully companies who ship Linux products into coughing up royalties for their alleged Linux patents, but that's not really something to be proud of now, is it?

I am so glad I don't work at Microsoft anymore. Actually, I'm extra glad that I landed a job at Google to work on Android. Google is pretty much the anti-Microsoft, and Android is something like the anti-Windows, and since I was so miserable over there, it's no surprise that I would be pretty happy working over here. Plus, the free food in the cafes is pretty tasty.

Wednesday, March 24, 2010

Achievement Unlocked

Working at Google is a lot of fun. The Nexus One is a very impressive phone and Android is a very cool platform to work with. Since I've been a big fan of Gmail, Google search, Blogger, YouTube, and many other Google products for years, it's really amazing to be working at the company that created all of that cool stuff, and to get to play with new features and new products before they are released to the public. The first week I really felt like I had unlocked an achievement in a videogame. It's very cool, and the free food is very tasty as well!

One fairly new Google service that not many people have heard of is Google Public DNS, which is a free public high-performance and secure DNS resolution service that you can use instead of the DNS resolver provided by your Internet provider. If you're not familiar with DNS, it's the service that translates the hostname that you type into your browser, for example "example.com", into a numeric IP address, such as 192.0.32.10 for example.com.
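From a program's point of view, this lookup step is a one-liner: you hand the system's configured resolver a hostname and get back one or more addresses. A minimal Python sketch:

    # Minimal sketch: ask the configured resolver to turn a hostname into addresses.
    import socket

    print(socket.gethostbyname("example.com"))       # one IPv4 address
    for info in socket.getaddrinfo("example.com", 80):
        print(info[4][0])                            # all addresses (IPv4 and IPv6)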

(BTW, Wikipedia's entry for DNS also uses example.com for its example hostname, but I hadn't read it when I wrote the previous sentence. I knew that "example.com" is officially reserved for the purpose of using as an example, and apparently so did the author of the Wikipedia entry!)

The IP address is what's used for the actual routing of data to the server, but hostnames are almost always used in URLs instead of IP addresses so that the URL will continue to point to the correct site no matter where the physical server is located. For a large site such as google.com, there are many thousands of web servers distributed across a number of data centers around the world with different IP addresses that all serve the same content. One of the features of DNS resolution is that the DNS server for a given domain can be configured to return different IP addresses depending on where the requesting server is physically located, so that requests to the actual web site will hopefully be routed to the nearest and least heavily loaded web servers.

Normally your computer will never perform the task of recursive DNS lookup directly (there are a few dozen root servers which in turn point to the DNS servers that are authoritative for a particular domain, and then you must query those servers to get the IP address of the full hostname), but will use a caching DNS server provided by your Internet provider, which will typically be configured for you automatically by DHCP when you connect your computer to your router. The advantage of using a caching DNS server is that it saves a lot of time compared to querying the actual DNS servers, especially if it's a popular site and the answer is already in the cache.

The disadvantage of using your ISP's DNS server, as I discovered a few months ago, is that they can mess around with the DNS server to redirect you to a search page of their choosing if you look up a hostname that doesn't exist (rather than telling the browser that the hostname doesn't exist). Comcast turned that on a while back and I was really annoyed by that behavior. I tried running my own DNS server for a while, but that ended up slower than Comcast's DNS server, which I didn't like. I remembered hearing something about Google starting a DNS service, and sure enough, they had. I configured my Wi-Fi router to use 8.8.8.8 and 8.8.4.4 for DNS and now the websites I visit load more quickly with very little time spent in "Resolving host", and no redirect to some lame page I didn't want to go to if I mistype a URL with a hostname that doesn't exist.
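If you want to query Google Public DNS directly from a script (rather than changing your router settings), the third-party dnspython package makes this easy. A small sketch; note that the method is called query() instead of resolve() in older versions of the library:

    # Sketch: query Google Public DNS (8.8.8.8 / 8.8.4.4) directly,
    # bypassing whatever resolver the ISP's DHCP handed out.
    import dns.resolver

    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = ["8.8.8.8", "8.8.4.4"]

    for record in resolver.resolve("example.com", "A"):
        print(record.address)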

One final point that I think is pretty neat: Google set up something called anycast routing for those two IP addresses (8.8.8.8 and 8.8.4.4) so that they will take you to the nearest Google public DNS server to your location anywhere in the world. Not only does that make DNS lookup even faster, but it's compatible with the geographic load-balancing I mentioned earlier, so when you look up google.com or some other site, the DNS server for that site will return the IP address nearest to the Google DNS server, which will hopefully also be near to you as well. You can read more about the performance and security improvements at the Google Public DNS site.

Next time I might talk about Google Web Toolkit, another cool product from Google that I have been meaning to play around with for some time now, since I'm intrigued by the concept, and since I'd much rather write an AJAX site in Java than JavaScript. I'll also continue with my series on OpenVMS, as well as any interesting Android adventures that I'm allowed to talk about.

Friday, February 26, 2010

Joining the Google Android Team!

I'm pleased to announce that I have accepted a position on the Android team at Google, starting March 8, where I will be applying my expertise in smartphone platform development towards improving an already quite exciting and impressive platform.

Going to work at Google feels like a dream come true, especially after the frustration and pain I experienced trying to fit in at Microsoft and salvage something of value out of their horrible, rotting Windows Mobile codebase, all the while dealing with a truly abusive work environment. I'd rather not say any more about that experience on a public forum, as it would detract from the positive stuff that I'd prefer to focus on in this blog.

I was quite fortunate to have my choice of great companies to work for in the Bay Area. There aren't very many developers in the U.S. with industry experience building great mobile device platforms, and it's a bit of a tight-knit community, rather like the voiceover artists for movie trailers. Many of the original Android developers came from Danger, and a number of them came from Be, Inc. before that. I was an intern at Be in 1997, and a regular contributor to the bedevtalk mailing list around that time period, and so I found out about Danger through a contact from Be when I came up to the Bay Area in 2005 looking for exciting work. So there is a creative lineage from Be to Danger and from there to Apple, Helio, Palm, Google, and others.

If you ever wondered how Apple and Palm were able to build such compelling devices on top of their own software stacks in such a short time with relatively few resources, a lot of it has to do with their hiring of engineers with experience building earlier devices for other companies, including for Danger. In my own time there, I contributed significant new code and many bug fixes to six different hiptop/Sidekick phones, starting with the Sidekick 3 (where I designed and implemented the UI and telephony layer for Danger's Bluetooth HFP implementation among other things) and continuing to the Sidekick LX 2009 (the 3G model). So I was quite flattered, but not particularly surprised, to have been actively recruited by the Google Android folks once they found out that I had left the company. I still had to pass the arduous interview process just like everyone else.

I also want to say a few nice things about Apple, since they were my second choice behind Google, and I'm sure that they would also have been an excellent company for me to work for. I'm a huge fan of Apple's computers, phones, and OS, and I think they are making some of the best products on the market in all of the categories in which they choose to compete. However, I see Apple as something like the BMW or Mercedes-Benz of the computer world: makers of fine luxury products that inspire the rest of the industry, but which aren't necessarily affordable to the masses. Also, since Apple's OS platform is only available to Apple, and their phone is only available through certain mobile operators, that leaves a big opening for Android and other open-source platforms that are available to anyone who wants to use them. That's really cool.

I'm a big fan of open source as a general principle, so I think that my own philosophy more closely meshes with Google than with Apple or any of the other players, and certainly as compared to Microsoft.  In the months since leaving their employ, I've read a great deal about Microsoft's pattern of abuse of their monopoly powers, much of which I knew already, but some of which was new, particularly their ongoing attempts to sabotage OpenDocument in favor of their inferior and defective OOXML "standard" (in quotes because even Microsoft Office doesn't conform to their own proposal!). I think that from a pragmatic engineering standpoint (as opposed to a legal/philosophical one), the biggest problem with Microsoft is that their software is just so bad: badly written, poorly documented, carelessly maintained.

One of my hobbies is my collection of various vintage computers. It's funny to me that my Commodore 64 is still in perfect working order and as useful (at least for playing games) as it ever was. On my desk is a genuine DEC VT320 terminal, assembled on October 23, 1990, which had been sitting in my parents' garage for a number of years. I recently took it home and connected it to my small Alpha OpenVMS cluster (big thank you to Weird Stuff for having the proprietary DECconnect serial cables and adapters that I needed in their warehouse, since the VT320 lacks a standard RS-232 port). To my great joy there is no burn-in whatsoever on the CRT, likely due in part to the built-in screen saver that blanks the screen after 20 minutes of inactivity. The terminal is a bit slow (19200 bps maximum, with no hardware flow control, so it has to send ^S/^Q when it can't keep up with even that relatively modest speed), but the LK201 keyboard has a satisfying clunkiness and the onscreen font is quite sharp. It's also a white phosphor tube as opposed to green or amber, which is nice. I opened it up to adjust an internal knob to make it a bit brighter (following the instructions in the pocket service guide), and other than some carbonization of the insulation around the flyback transformer (which I'd predicted as the likely cause of the reduced brightness), it's in almost pristine condition.
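
As an aside, that ^S/^Q handshake is just XON/XOFF software flow control, and it's easy to honor from the host side of a serial link. Here's a minimal sketch using Python's pyserial library; the device path and message are made-up placeholders for illustration, not my actual setup:

    import serial  # pyserial

    # Hypothetical serial device path, for illustration only.
    port = serial.Serial(
        "/dev/ttyUSB0",
        baudrate=19200,               # the VT320's top speed
        bytesize=serial.EIGHTBITS,
        parity=serial.PARITY_NONE,
        stopbits=serial.STOPBITS_ONE,
        xonxoff=True,                 # pause output when the terminal sends ^S, resume on ^Q
        rtscts=False,                 # no hardware (RTS/CTS) flow control available
    )
    port.write(b"Hello from the host!\r\n")
    port.close()

With xonxoff enabled, the host throttles its output whenever the terminal signals that it's falling behind, which is what keeps a slow display like the VT320 from dropping characters.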

I'll have more to write about my VMS adventures in future posts. I do intend to open up the cluster for semi-public consumption as a BBS, but I need to finish a few more sysadmin tasks first. Just a few days ago I reconfigured the OS, uninstalling DECnet-Plus in favor of the older DECnet Phase IV package, which is much leaner and better suited to my small network. I took advantage of the volume shadowing feature (basically a form of RAID-1 mirroring that you can boot from) to modify the OS on one of the two mirrored SCSI hard drives and bring everything back up; once I was satisfied with the new configuration, I restarted volume shadowing and mirrored the new configuration to the other drive. If anything had gone wrong, I could have easily recovered by booting the saved image on the first drive and mirroring it back to the second. In addition, I first made a full backup to DLT VS160 tape, because one can never be too careful when working on a "mission critical" server environment, even if it's just for fun and practice (more to the point, I didn't want to lose all of the time I'd put into configuration and setup so far). More on that in a future post.

My point in telling those stories in this post is that I think Microsoft as a company has done a lot more harm than good in terms of "training" people to have low expectations for the long-term usability of their investment in computer equipment. How is it that so much 10-20 year old non-Microsoft equipment is perfectly usable today, while the typical Windows PC of half that age ends up encrusted with malware or bloated after too many app installs, too slow to run the latest Windows version, and yet still perfectly capable of running, for example, Linux? I think it's because Windows just isn't very good, and Microsoft used a lot of dirty tactics in the 1990s to cut consumers off from pursuing other avenues, including BeOS. On the server side, I will have more to say about this in future VMS posts (the topic of UNIX vs. NT has already been beaten to death, but I think there is still some benefit to comparing NT with other proprietary server OS's of that era, especially since NT is VMS reimplemented, poorly).

One informative book that I discovered recently, thanks to a comment at Mini-Microsoft, is Barbarians Led by Bill Gates, co-authored by Marlin Eller, a former early Microsoft engineer, and Jennifer Edstrom, daughter of Pam Edstrom, co-founder of Waggener Edstrom, Microsoft's primary PR firm. Thanks in no small part to Mr. Eller and Ms. Edstrom's insider experience, as well as the interviews they conducted with other early employees and insiders, the book paints a vivid picture of just how thoroughly Microsoft is a product of pure public relations and not of any sort of software engineering expertise.

I like to tell people that Bill Gates and company made an excellent 8K ROM BASIC back in the day, but their engineering expertise, such as it was, clearly did not lead them to develop particularly high quality software of any greater complexity. One final story and then I'll wrap up this post. First, a passage from Barbarians:

Slightly disgusted that he had just joined a company [Microsoft] that was shipping defective software, Eller decided to take matters into his own hands. After researching graphics journals and spending nearly two weeks on this complicated problem, Eller finally hacked out a solution and wrote the new flood-fill algorithm. Though it was painfully slow and crawled across the screen, it did enable BASIC to correctly flood-fill.

Eller called his boss into his office once again. Whitten was less than thrilled. He had authorized the work, but Eller had spent two weeks on the flood-fill, ignoring the translator he was supposed to be writing. Undaunted, Eller set out to let others in on the flaw he had discovered and how he had fixed it. He pulled in any random developer he could find. He even pulled in Chairman Gates, whose office was just down the hall.

“Bill, check this out,” Eller said, pointing to his computer screen. “I mean . . . who was the jerk who wrote this brain-dead piece of shit?”

Gates stared at the screen.

“See, now that’s what I call a design flaw,” Eller said. “Now check out my new version. Pretty cool, eh?”

Gates nodded, pushing his glasses up the bridge of his nose.

“Does it work with really complicated things?” Gates asked.

“Sure,” Eller told him. He proceeded to draw a complicated object and flood-fill it.

“See? It works perfectly.”

“Can you prove that this works all the time?”

“Uhh, well umm, kind of,” Eller said. “I mean, I know it always works, but I’m a mathematician. The word ‘prove’ conjures up really ugly ideas.”

Gates told Eller his program was nice, then turned and walked back to his office.

After Gates left, Whitten walked into Eller’s office. He had heard the entire conversation.

“Do you know who wrote the original flood-fill algorithm?” he said, shaking his head.

“Ahhh, nope,” Eller replied. “I don’t believe I do.” Whitten paused, rubbed his finger on his left temple, and shook his head again.

“Bill wrote it,” he said. “Bill was the jerk who wrote this brain-dead piece of shit.”

Of course, Bill is also "the jerk" famous for berating and insulting the intelligence of just about everyone he has ever disagreed with, so reading about him getting his comeuppance was quite amusing.

Even funnier: one of the first questions I was asked during my Google interview was to write, on the whiteboard, an implementation of ... flood-fill! Fortunately, I had halfway thought through the solution while reading that excerpt, so I was able to cobble together a correct approach without too much trouble. I must not have embarrassed myself, because I got the job. :)
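
For anyone curious what a whiteboard-friendly answer looks like: this isn't the exact code I wrote in the interview, just a minimal sketch of the classic queue-based (breadth-first) approach in Python, operating on a simple grid of color values:

    from collections import deque

    def flood_fill(image, start_row, start_col, new_color):
        """Recolor the 4-connected region containing (start_row, start_col)."""
        old_color = image[start_row][start_col]
        if old_color == new_color:
            return image  # nothing to do; also avoids looping forever

        rows, cols = len(image), len(image[0])
        queue = deque([(start_row, start_col)])
        while queue:
            r, c = queue.popleft()
            if 0 <= r < rows and 0 <= c < cols and image[r][c] == old_color:
                image[r][c] = new_color
                # Visit the four neighbors; out-of-bounds and already-filled
                # cells are rejected by the check above.
                queue.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
        return image

Using an explicit queue instead of recursion avoids blowing the call stack on large regions, which is one of the classic pitfalls of a naive recursive version.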