The Summer of Jeff

Python code for WPA stats

Posted in baseball analysis, programming by Jeff on February 7, 2011

A long time ago I put together a Python version of the win expectancy/volatility calculations contained in Studes’s WPA spreadsheet.  Those were the days–if we wanted a post-game WPA graph, we had to do it ourselves :).

I’ve brushed off the cobwebs and published the code.  Click here to see it.

All this does is calculate the win expectancy and volatility (~leverage) in any situation.  It doesn’t calculate WPA on the play.  Of course, if you’re running this on a play-by-play log, it’s trivial to compare the WX of one play and the next.
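A minimal sketch of that play-to-play comparison, assuming you already have a list of win expectancies (from one team's point of view), one per game state. The function name and the sample numbers are hypothetical and are not taken from the published code.

```python
def wpa_per_play(wx_sequence):
    """Return the change in win expectancy between consecutive game states."""
    return [after - before
            for before, after in zip(wx_sequence, wx_sequence[1:])]

# Made-up win expectancies before the first play and after each of three plays.
wx = [0.50, 0.55, 0.48, 0.62]
print([round(delta, 2) for delta in wpa_per_play(wx)])
```

Each difference is the swing on one play. Crediting it to the batter or pitcher is a separate bookkeeping step that this sketch leaves out.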

‘Volatility’ is the difference between the win expectancies that would result from a home run and from a strikeout.  To normalize it so that the average volatility is 1.0, I have this code divide the result by 0.133.  Depending on your dataset, that might not be quite right.  There are more sophisticated ways to measure leverage, though this one is adequate for many purposes.
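As a rough illustration of that definition (not the published code itself), the calculation might look like the following. The function and argument names are assumptions. The two inputs are the win expectancies you would look up for the states following a home run and a strikeout.

```python
AVG_VOLATILITY = 0.133  # normalizing constant quoted above; may not fit every dataset

def volatility(wx_after_hr, wx_after_k, norm=AVG_VOLATILITY):
    """Normalized gap between the win expectancy after a HR and after a strikeout."""
    return (wx_after_hr - wx_after_k) / norm

# Made-up example: a HR would leave the batting team at 70%, a strikeout at 43.4%.
print(round(volatility(0.70, 0.434), 2))
```

Passing a different `norm` is the place to adjust if the average raw volatility in your own play-by-play data is not 0.133.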

Thank you Studes, Tango, and others for publishing all that you have.  As is so often the case, I’m just the code monkey.


2011 Aussie Open Simulation Results

Posted in programming, tennis by Jeff on January 16, 2011

Using my simple ranking-points-based algorithm to determine the odds that each player wins a match, I ran simulations using the 2011 Australian Open draw.

As usual, the keyword is “simple,” and you can easily find all sorts of intuitive reasons to discount the results.  There’s no consideration for surface, so clay-court specialists are generally overrated.  Players returning from injury (Del Potro, especially, and Karlovic) have taken a hit in the rankings, and are thus underrated here, as well.

I’m also publishing the code that I use to generate these sims. It should work for any single-elimination tournament up to 128 competitors, and is easily expandable to handle larger brackets.  The function ‘calcWP’ is specific to my tennis algorithm, but you could swap in something like log5 very easily. I also included the .csv file I used for the draw, so you can see the format, or tinker with the parameters and come up with your own Aussie sim.
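For readers who want the general shape without downloading anything, here is a minimal sketch of a single-elimination simulator with log5 swapped in for ‘calcWP’. It is not the published code: the function names, the four-player draw, and the winning percentages are all invented for illustration, and it assumes the draw size is a power of two with adjacent entries meeting in the first round.

```python
import random

def log5(p_a, p_b):
    """log5 estimate that A beats B, given each side's overall winning percentage."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)

def simulate_once(field, win_prob, rng):
    """Play one bracket; return the list of survivors after each round."""
    survivors_by_round = []
    while len(field) > 1:
        # Adjacent entries in the draw play each other.
        field = [a if rng.random() < win_prob(a, b) else b
                 for a, b in zip(field[0::2], field[1::2])]
        survivors_by_round.append(field)
    return survivors_by_round

def run_sims(draw, win_prob, n_sims=10000, seed=1):
    """Estimate each player's chance of surviving each round."""
    if len(draw) < 2 or len(draw) & (len(draw) - 1):
        raise ValueError("draw size must be a power of two")
    rng = random.Random(seed)
    n_rounds = len(draw).bit_length() - 1
    counts = {player: [0] * n_rounds for player in draw}
    for _ in range(n_sims):
        for rnd, survivors in enumerate(simulate_once(list(draw), win_prob, rng)):
            for player in survivors:
                counts[player][rnd] += 1
    return {player: [c / n_sims for c in row] for player, row in counts.items()}

# Hypothetical four-player draw with made-up winning percentages.
strength = {"A": 0.80, "B": 0.50, "C": 0.60, "D": 0.40}
odds = run_sims(list(strength), lambda a, b: log5(strength[a], strength[b]))
for player, row in odds.items():
    print(player, ["%.1f%%" % (100 * x) for x in row])
```

Any function that takes two players and returns the first one's chance of winning can be passed as `win_prob`, which is where a ranking-points formula would go instead of log5.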

Your 2011 Australian Open…

Player               points    R64    R32    R16     QF     SF      F      W  
Nadal             1   12390  96.9%  92.7%  87.0%  78.1%  66.1%  49.6%  34.5%  
Daniel                  564   3.1%   1.4%   0.5%   0.1%   0.0%   0.0%   0.0%  
Sweeting          Q     486  35.3%   1.6%   0.5%   0.1%   0.0%   0.0%   0.0%  
Gimeno-Traver           844  64.7%   4.3%   1.9%   0.7%   0.2%   0.0%   0.0%  
Tomic             W     239  17.9%   3.1%   0.1%   0.0%   0.0%   0.0%   0.0%  
Chardy                  960  82.1%  39.6%   3.7%   1.4%   0.4%   0.1%   0.0%  
Falla                   540  27.3%  11.3%   0.7%   0.2%   0.0%   0.0%   0.0%  
Lopez F          31    1310  72.7%  46.0%   5.6%   2.6%   0.9%   0.2%   0.0%  
                                                                              
Isner            20    1850  74.0%  56.8%  31.7%   5.8%   2.5%   0.8%   0.2%  
Serra                   711  26.0%  14.0%   4.6%   0.5%   0.1%   0.0%   0.0%  
Stepanek                735  62.1%  20.4%   6.9%   0.6%   0.2%   0.0%   0.0%  
Gremelmayr        Q     469  37.9%   8.8%   2.2%   0.1%   0.0%   0.0%   0.0%  
Machado                 573  41.2%  10.2%   3.3%   0.2%   0.0%   0.0%   0.0%  
Giraldo                 785  58.8%  18.3%   7.2%   0.7%   0.2%   0.0%   0.0%  
Young D           Q     435  14.6%   5.4%   1.4%   0.1%   0.0%   0.0%   0.0%  
Cilic            15    2140  85.4%  66.1%  42.8%   8.7%   4.1%   1.4%   0.4%  
                                                                              
Youzhny          10    2920  85.6%  70.1%  51.9%  29.2%   8.1%   3.3%   1.1%  
Ilhan                   574  14.4%   6.2%   2.2%   0.5%   0.0%   0.0%   0.0%  
Kavcic            Q     552  38.0%   7.1%   2.4%   0.5%   0.0%   0.0%   0.0%  
Anderson K              868  62.0%  16.6%   7.4%   2.1%   0.2%   0.0%   0.0%  
Raonic            Q     351  36.4%   6.8%   1.0%   0.2%   0.0%   0.0%   0.0%  
Phau                    581  63.6%  18.0%   4.1%   0.9%   0.1%   0.0%   0.0%  
Chela                  1070  39.3%  27.8%   9.9%   3.2%   0.4%   0.1%   0.0%  
Llodra           22    1575  60.7%  47.4%  21.0%   8.8%   1.6%   0.4%   0.1%  
                                                                              
Nalbandian       27    1480  64.2%  49.1%  18.4%   8.2%   1.4%   0.4%   0.1%  
Hewitt                  870  35.8%  23.1%   6.1%   2.0%   0.2%   0.0%   0.0%  
Berankis                589  61.1%  19.1%   3.9%   1.0%   0.1%   0.0%   0.0%  
Matosevic         W     392  38.9%   8.8%   1.4%   0.2%   0.0%   0.0%   0.0%  
Russell                 547  67.0%  10.1%   3.6%   0.8%   0.1%   0.0%   0.0%  
Ebden             W     288  33.0%   2.7%   0.6%   0.1%   0.0%   0.0%   0.0%  
Nieminen               1062  20.2%  14.5%   7.7%   2.8%   0.4%   0.1%   0.0%  
Ferrer            7    3735  79.8%  72.7%  58.4%  39.4%  12.6%   5.8%   2.4%  
                                                                              
Soderling         4    5785  87.9%  83.6%  71.9%  58.3%  35.9%  15.6%   7.9%  
Starace                 945  12.1%   8.9%   4.2%   1.7%   0.3%   0.0%   0.0%  
Muller            Q     466  76.9%   6.9%   2.0%   0.5%   0.1%   0.0%   0.0%  
Stadler           Q     155  23.1%   0.7%   0.1%   0.0%   0.0%   0.0%   0.0%  
Istomin                1031  86.2%  41.8%   8.8%   3.6%   0.8%   0.1%   0.0%  
Hernych           Q     196  13.8%   1.9%   0.1%   0.0%   0.0%   0.0%   0.0%  
Mello                   627  30.0%  12.8%   1.9%   0.6%   0.1%   0.0%   0.0%  
Bellucci         30    1355  70.0%  43.5%  11.0%   5.3%   1.4%   0.2%   0.1%  
                                                                              
Gulbis           24    1505  64.3%  41.5%  20.7%   6.3%   1.9%   0.4%   0.1%  
Becker                  870  35.7%  17.9%   6.4%   1.3%   0.2%   0.0%   0.0%  
Dolgopolov              928  53.6%  22.8%   8.6%   1.8%   0.4%   0.0%   0.0%  
Kukushkin               815  46.4%  17.9%   6.3%   1.2%   0.2%   0.0%   0.0%  
Seppi                   900  59.6%  19.2%   8.7%   1.9%   0.4%   0.0%   0.0%  
Clement                 627  40.4%   9.9%   3.5%   0.6%   0.1%   0.0%   0.0%  
Petzschner              839  24.3%  12.6%   5.5%   1.1%   0.2%   0.0%   0.0%  
Tsonga           13    2345  75.7%  58.2%  40.4%  15.8%   6.3%   1.6%   0.5%  
                                                                              
Melzer           11    2785  91.2%  77.7%  54.3%  22.9%  10.4%   3.0%   1.0%  
Millot            Q     334   8.8%   3.3%   0.7%   0.1%   0.0%   0.0%   0.0%  
Ball              W     344  32.5%   4.1%   0.9%   0.1%   0.0%   0.0%   0.0%  
Riba                    672  67.5%  14.8%   5.5%   1.0%   0.2%   0.0%   0.0%  
Sela                    568  77.8%  21.8%   5.0%   0.7%   0.1%   0.0%   0.0%  
Del Potro               180  22.2%   2.4%   0.2%   0.0%   0.0%   0.0%   0.0%  
Zemlja            Q     376  15.1%   6.9%   1.2%   0.1%   0.0%   0.0%   0.0%  
Baghdatis        21    1785  84.9%  68.9%  32.2%  10.7%   3.8%   0.8%   0.2%  
                                                                              
Garcia-Lopez     32    1300  62.1%  44.0%  10.6%   4.2%   1.2%   0.2%   0.0%  
Berrer                  835  37.9%  22.8%   3.9%   1.1%   0.2%   0.0%   0.0%  
Schwank                 580  50.6%  16.9%   2.3%   0.5%   0.1%   0.0%   0.0%  
Mayer L                 572  49.4%  16.3%   2.1%   0.4%   0.1%   0.0%   0.0%  
Marchenko               624  49.3%   5.5%   2.3%   0.6%   0.1%   0.0%   0.0%  
Ramirez Hidalgo         638  50.7%   5.7%   2.4%   0.6%   0.1%   0.0%   0.0%  
Beck K                  543   7.0%   3.2%   1.2%   0.3%   0.0%   0.0%   0.0%  
Murray            5    5760  93.0%  85.5%  75.3%  56.7%  35.5%  15.6%   7.9%  
                                                                              
Berdych           6    3955  96.4%  78.5%  63.1%  42.3%  22.0%   9.6%   3.4%  
Crugnola          Q     194   3.6%   0.5%   0.1%   0.0%   0.0%   0.0%   0.0%  
Kohlschreiber          1215  63.8%  15.2%   8.3%   3.1%   0.9%   0.2%   0.0%  
Kamke                   724  36.2%   5.8%   2.4%   0.7%   0.1%   0.0%   0.0%  
Harrison          W     313  32.3%   6.7%   0.6%   0.1%   0.0%   0.0%   0.0%  
Mannarino               612  67.7%  22.8%   3.9%   0.9%   0.1%   0.0%   0.0%  
Dancevic          Q     172   9.0%   2.2%   0.1%   0.0%   0.0%   0.0%   0.0%  
Gasquet          28    1385  91.0%  68.3%  21.5%   8.8%   2.5%   0.6%   0.1%  
                                                                              
Davydenko        23    1555  60.0%  41.5%  17.1%   6.5%   2.0%   0.5%   0.1%  
Mayer F                1073  40.0%  23.9%   8.0%   2.3%   0.6%   0.1%   0.0%  
Fognini                 855  59.6%  22.7%   6.5%   1.7%   0.3%   0.0%   0.0%  
Nishikori               599  40.4%  12.0%   2.7%   0.5%   0.1%   0.0%   0.0%  
Zverev                  611  38.3%   7.2%   2.4%   0.5%   0.1%   0.0%   0.0%  
Tipsarevic              935  61.7%  16.0%   7.2%   2.0%   0.4%   0.1%   0.0%  
Schuettler              597  13.5%   5.8%   1.9%   0.4%   0.1%   0.0%   0.0%  
Verdasco          9    3240  86.5%  71.1%  54.1%  30.3%  14.2%   5.7%   1.8%  
                                                                              
Almagro          14    2160  84.5%  68.0%  41.9%  15.4%   6.8%   2.2%   0.5%  
Robert            Q     460  15.5%   6.6%   1.8%   0.2%   0.0%   0.0%   0.0%  
Andreev                 622  52.1%  13.7%   4.5%   0.8%   0.1%   0.0%   0.0%  
Volandri                574  47.9%  11.8%   3.6%   0.5%   0.1%   0.0%   0.0%  
Cipolla           Q     190  32.6%   3.4%   0.4%   0.0%   0.0%   0.0%   0.0%  
Paire             W     366  67.4%  12.5%   2.5%   0.3%   0.0%   0.0%   0.0%  
Luczak            W     400  14.7%   8.6%   1.9%   0.2%   0.0%   0.0%   0.0%  
Ljubicic         17    1965  85.3%  75.5%  43.4%  15.1%   6.2%   1.8%   0.4%  
                                                                              
Troicki          29    1385  86.2%  64.4%  16.2%   7.2%   2.4%   0.5%   0.1%  
Tursunov                263  13.8%   4.5%   0.3%   0.0%   0.0%   0.0%   0.0%  
Dabul                   584  58.6%  19.8%   2.7%   0.7%   0.1%   0.0%   0.0%  
Mahut             Q     424  41.4%  11.3%   1.1%   0.2%   0.0%   0.0%   0.0%  
Karlovic                670  52.8%   6.2%   2.5%   0.7%   0.1%   0.0%   0.0%  
Dodig                   606  47.2%   5.0%   2.0%   0.5%   0.1%   0.0%   0.0%  
Granollers              993  11.6%   7.2%   3.6%   1.4%   0.4%   0.1%   0.0%  
Djokovic          3    6240  88.4%  81.6%  71.5%  56.9%  40.2%  21.9%  10.2%  
                                                                              
Roddick           8    3565  88.5%  78.1%  61.4%  42.2%  16.8%   8.1%   2.7%  
Hajek                   560  11.5%   5.8%   2.0%   0.4%   0.0%   0.0%   0.0%  
Przysiezny              590  51.7%   8.5%   2.9%   0.7%   0.1%   0.0%   0.0%  
Kunitsyn                551  48.3%   7.6%   2.5%   0.7%   0.1%   0.0%   0.0%  
Berlocq                 725  47.1%  16.8%   4.0%   1.2%   0.2%   0.0%   0.0%  
Haase                   803  52.9%  20.0%   5.2%   1.7%   0.3%   0.0%   0.0%  
Benneteau               965  38.5%  21.8%   6.3%   2.3%   0.4%   0.1%   0.0%  
Monaco           26    1480  61.5%  41.5%  15.7%   7.2%   1.7%   0.5%   0.1%  
                                                                              
Wawrinka         19    1855  76.7%  52.3%  28.1%  12.8%   3.5%   1.2%   0.2%  
Gabashvili              626  23.3%   9.4%   2.6%   0.6%   0.1%   0.0%   0.0%  
Dimitrov          Q     518  29.5%   7.5%   1.8%   0.4%   0.0%   0.0%   0.0%  
Golubev                1135  70.5%  30.8%  12.7%   4.2%   0.8%   0.2%   0.0%  
Gil                     551  40.2%   8.3%   2.4%   0.5%   0.0%   0.0%   0.0%  
Cuevas                  790  59.8%  16.4%   6.1%   1.6%   0.2%   0.0%   0.0%  
De Bakker               950  25.2%  14.9%   6.2%   1.9%   0.3%   0.1%   0.0%  
Monfils          12    2560  74.8%  60.4%  40.2%  21.5%   7.2%   2.9%   0.7%  
                                                                              
Fish             16    1996  70.1%  52.0%  32.0%   8.2%   3.9%   1.3%   0.3%  
Hanescu                 915  29.9%  16.4%   6.8%   1.0%   0.3%   0.0%   0.0%  
Robredo                 915  65.2%  23.4%   9.9%   1.5%   0.4%   0.1%   0.0%  
Devvarman               514  34.8%   8.2%   2.4%   0.2%   0.0%   0.0%   0.0%  
Stakhovsky              925  64.4%  24.8%  10.2%   1.6%   0.4%   0.1%   0.0%  
Brands                  541  35.6%   9.3%   2.6%   0.3%   0.1%   0.0%   0.0%  
Kubot                   670  24.5%  11.4%   3.9%   0.5%   0.1%   0.0%   0.0%  
Querrey          18    1860  75.5%  54.5%  32.1%   7.8%   3.4%   1.1%   0.2%  
                                                                              
Montanes         25    1495  74.3%  48.4%   8.8%   4.5%   1.7%   0.5%   0.1%  
Brown                   573  25.7%  10.3%   0.9%   0.2%   0.0%   0.0%   0.0%  
Andujar                 683  40.9%  14.9%   1.4%   0.4%   0.1%   0.0%   0.0%  
Malisse                 956  59.1%  26.4%   3.4%   1.4%   0.4%   0.1%   0.0%  
Lu                     1141  53.7%   6.2%   3.2%   1.4%   0.5%   0.1%   0.0%  
Simon                  1005  46.3%   4.8%   2.3%   0.9%   0.3%   0.1%   0.0%  
Lacko                   553   4.3%   1.4%   0.4%   0.1%   0.0%   0.0%   0.0%  
Federer           2    9245  95.7%  87.6%  79.6%  70.0%  56.6%  40.3%  22.4% 

Python Code for Marcel Projections

Posted in baseball analysis, programming by Jeff on January 14, 2011

A while back, I posted retro-Marcel projections for over 100 seasons.  They were generated with some Python code, and now you can play with it.

You’ll also need some Baseball-Databank files.  (Well, you don’t need them, but they will make the process much easier.)

The ‘import’ lines refer to a few utilities that I’ve written.  Those are also available on GitHub.  At some point, I’ll write up a summary of some of my Python utilities.  I’m sure that none of them are original (for instance, turning a 2-d matrix into a .csv, or vice versa), but I use them all the time, and they might come in handy for you, too.
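To give a sense of how small such a utility can be, here is a sketch of the matrix-to-.csv round trip using Python's standard csv module. These are not the utilities referred to above; the function names are made up for illustration.

```python
import csv

def matrix_to_csv(matrix, path):
    """Write a list of rows (each row a list of values) to a .csv file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(matrix)

def csv_to_matrix(path):
    """Read a .csv file back into a list of rows; every value comes back as a string."""
    with open(path, newline="") as f:
        return [row for row in csv.reader(f)]
```

Note that numbers written out come back as strings, so any stat columns need converting after the file is read.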


Python Code for Tennis Markov

Posted in programming, tennis by Jeff on January 13, 2011

I’ve published my code for the tennis Markov project.  You can find it here:

  • Single game outcome. Takes the server’s probability of winning a single point and the current score, returns server’s chance of winning game.
  • Tiebreak outcome. Takes server’s probability of winning a single service point, prob of winning single return point, and current score, returns server’s chance of winning tiebreak.
  • Single set outcome. Takes server’s probability of winning a single service point, prob of winning single return point, and current game score, returns server’s chance of winning set. (Assumes standard tiebreak set.)
  • Match outcome. Takes server’s probability of winning a single service point, prob of winning single return point, current score in points, games, and sets, and number of sets, returns server’s chance of winning match.
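As an illustration of the first item in the list, here is a minimal sketch of the single-game calculation. It uses the textbook recursion with a closed-form value at deuce and is not necessarily how the published code does it. The function name and the 0-1-2-3 point encoding are assumptions.

```python
from functools import lru_cache

def game_prob(p, server_pts=0, returner_pts=0):
    """Chance the server wins the game, given p = chance of winning any one point.

    Points are counted 0, 1, 2, 3 for love, 15, 30, 40; 4 vs. 3 is advantage.
    """
    q = 1.0 - p
    # From deuce the server must win two points in a row before the returner does.
    deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a >= 4 and a - b >= 2:
            return 1.0          # server has won the game
        if b >= 4 and b - a >= 2:
            return 0.0          # returner has won the game
        if a >= 3 and a == b:
            return deuce
        # Otherwise play one more point and weight the two outcomes.
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(server_pts, returner_pts)

print(round(game_prob(0.6), 4))        # from love-all
print(round(game_prob(0.6, 3, 3), 4))  # from deuce
```

The set and match functions can be built the same way, with games (or sets) in place of points and this function supplying the transition probabilities.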

The logic in the tiebreak problem is knotty, and the code reflects that.  I’m sure there’s a better way of doing it, but I didn’t feel like working it out once I got to the answer.

In the other functions, the code is pretty clean, and I’ve commented it more than I otherwise would.  The math gets a little hairy, though.

Roll-your-own blogging software

Posted in programming by Jeff on July 28, 2010

A few years ago, I moved my GMAT Hacks website off of WordPress.  I wrote the code for a basic blogging platform using Python, and since then, I’ve built it out a little more.  A content management system (CMS) does not have to be complicated.  And as Blogger, WordPress and others have shown, the platform is generic; I’ve used almost exactly the same code to drive GMAT Hacks, GRE HQ, and the College Splits blog.

I’m not going to share any code, but I will walk through the process.  It’s very intuitive in Python, and I’m sure it’s similarly straightforward in many other languages.

The various blogging platforms offer much of what you’ll ever need, and they are generally easy to use and modify.  That’s why this very post is on a WordPress blog.  But especially in the case of my GMAT site, I needed more flexibility to automatically update special types of pages and create customized sidebars and footers.

The basics

A do-it-yourself CMS can consist of as few as three files:

  1. A database of some sort that, for each post, stores title, body text, date, and other information, possibly including category, tag(s), and anything else you can dream up.  I think this is simple enough not to require further explanation.
  2. A simple script to add items to the database and edit items already in the database.
  3. A script that uses the site template to generate pages for each post using the database.

Let’s look at the last two in a little more detail.

Add and edit items

This is also pretty simple.  The one aspect worth mentioning is that it’s important to validate everything going in–if you’re ambitious, you may even try to validate the HTML in the posts themselves.  I limit myself to checking that a new post’s category already exists and that the post’s date is valid.  (On some of my sites, I use YYYYMMDDX as a post ID, where X is an index to differentiate multiple posts from the same day.)
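A sketch of those two checks, written for illustration rather than copied from any real site. The function name is hypothetical, and it assumes the index X in YYYYMMDDX is a single letter or digit, which the description above does not actually specify.

```python
from datetime import date

def validate_post(post_id, category, known_categories):
    """Return a list of problems; an empty list means the post can be stored."""
    problems = []
    if category not in known_categories:
        problems.append("unknown category: %s" % category)

    date_part, index_part = post_id[:8], post_id[8:]
    if len(date_part) == 8 and date_part.isdigit():
        try:
            # date() raises ValueError for impossible dates such as February 31.
            date(int(date_part[:4]), int(date_part[4:6]), int(date_part[6:8]))
        except ValueError:
            problems.append("invalid date in post ID: %s" % date_part)
    else:
        problems.append("post ID must start with YYYYMMDD")

    # Assumption: the same-day index is exactly one letter or digit.
    if len(index_part) != 1 or not index_part.isalnum():
        problems.append("post ID must end with a one-character index")
    return problems

print(validate_post("20100728a", "math", {"math", "verbal"}))
print(validate_post("20100231a", "essays", {"math", "verbal"}))
```

Returning a list of problems, rather than stopping at the first one, makes it easier to fix everything in a draft in one go.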

Generate the site

Depending on how thorough you want to be, this script can get fairly complex.  (Mine is currently a bit longer than 400 lines of code.)

At its most basic, it’s just a matter of creating a page for each post and uploading each one.  Here are a few more things it can do:

  1. Uploading some pages to multiple locations.  For example, you might want your most recent post to be the front page on your site.  So the page “category/recent-post.html” might also be uploaded as “index.html.”
  2. Creating tables of contents.  On my GMAT site, I have a chronological TOC, a site-wide TOC with posts sorted by category, and an individual TOC page for each category.  I also have a “recent posts” page with a chronological list of the last 10 posts.  The script creates each one every time I update the site.
  3. Creating an XML feed.  You might include the last 5 or 10 posts, and you have the flexibility to include all, some, or none of the body of the post.
  4. Updating pages outside of the blog hierarchy.  The first page of my GMAT site does not contain a blog post, but the script creates it, so that it always links to the most recent post.
  5. Varying sidebar and footer content.  My footers are generally predictable–they link to the previous post, as well as a category-specific table of contents.  But I also include an ad for one of my books.  (For some posts, I randomly rotate the ads with each site update.)  With full control over the script, I can put an ad for my math book on math-related pages and my verbal book on verbal-related pages.  I also have a few different sidebars for different purposes.  In a few cases, I even drop the footer content altogether.

Unlike the way, say, WordPress does things, every single page on all of my blogs is a flat HTML page.  This ensures that the pages are very fast to load regardless of traffic level.  It takes a little more time to generate and upload the site–for instance, my GMAT site now consists of over 300 pages, and most of them have a ‘recent posts’ box on the sidebar, so they must be updated each time I add a new post.  But with a decent connection, that only takes a couple of minutes.

The way my script works, it sorts the database by date, then goes through the list twice.  The first time, it creates the various TOCs, the XML feed, and the list of recent posts that I use in the sidebar.  The second time, it creates the individual pages.
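The two-pass structure can be sketched in a few lines. This is an illustrative toy, not the 400-line script described above: the post fields ("date", "category", "slug", "title", "body"), the page template, and the function name are all assumptions, and the upload step is omitted.

```python
import html

def build_site(posts, recent_count=10):
    """Return ({path: page_html}, {category: [titles]}) for a list of post dicts."""
    # Dates are assumed to be ISO strings (YYYY-MM-DD), which sort chronologically.
    posts = sorted(posts, key=lambda p: p["date"])

    # Pass 1: gather everything that individual pages depend on.
    toc_by_category = {}
    for post in posts:
        toc_by_category.setdefault(post["category"], []).append(post["title"])
    recent = [p["title"] for p in reversed(posts[-recent_count:])]
    sidebar = "\n".join("<li>%s</li>" % html.escape(t) for t in recent)

    # Pass 2: render one flat page per post, each with the shared sidebar.
    pages = {}
    for post in posts:
        path = "%s/%s.html" % (post["category"], post["slug"])
        pages[path] = "<h1>%s</h1>\n%s\n<ul>\n%s\n</ul>" % (
            html.escape(post["title"]), post["body"], sidebar)

    # The newest post doubles as the front page.
    if posts:
        newest = posts[-1]
        pages["index.html"] = pages["%s/%s.html" % (newest["category"], newest["slug"])]
    return pages, toc_by_category
```

The first pass has to finish before any page is rendered because every page's sidebar depends on the complete list of recent posts.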

If you have questions about the process, feel free to post them in the comments.


Python for baseball

Posted in programming by Jeff on July 16, 2010

A fair number of people are curious enough about my baseball projects (Minor League Splits, College Splits) to inquire about the tools I use.  Here’s the answer.

When I decided to take a crack at collecting minor league split statistics in 2006, I had no programming background at all.  For reasons I don’t recall, I ended up learning Python.  It has proven to be a very good choice–it was extremely easy to learn, and I could start building stuff almost immediately.

In fact, to this day, almost everything I do is written in Python.  The one major exception is that the web interface for Minor League Splits is written in JavaScript.  (Click on “view source” on any MLS player page and you’ll see some ugly, ugly JavaScript.)  Instead of rewriting the site in JS, I probably should have taken the opportunity to learn a Python web framework like Django, but I didn’t, and I haven’t since.

Even though the software that runs College Splits manages some very large databases (by baseball standards, anyway), I don’t use any kind of database-specific language.  I know many statisticians rely on MySQL; there’s a commonly-used API that allows Python scripts to work with MySQL databases.  (There are also APIs for just about any other db format.) But I don’t use it.  I’ve written a fair amount of Python code to simulate some of the power of SQL, but ultimately, my databases sit in CSV files.
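To show what "simulating SQL" over a CSV file can amount to, here is a small sketch built on the standard csv module. It is not the code described above. The `select` function and the sample data are invented for illustration, and an in-memory string stands in for a file on disk.

```python
import csv
import io

def select(rows, where=None, columns=None):
    """Rough analogue of SELECT columns FROM rows WHERE where(row)."""
    for row in rows:
        if where is None or where(row):
            yield {c: row[c] for c in columns} if columns else dict(row)

# Made-up data standing in for a CSV file on disk.
data = io.StringIO("player,team,hr\nSmith,BOS,31\nJones,NYY,12\nLee,BOS,8\n")
rows = list(csv.DictReader(data))

sluggers = list(select(rows,
                       where=lambda r: int(r["hr"]) >= 10,
                       columns=["player", "hr"]))
print(sluggers)
```

Everything read from a CSV is a string, so the `where` test has to convert values (here with `int`) before comparing them.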

Ultimately, there’s just not much that a baseball statistician needs a programming language to do.  From my perspective, the most important tools are those that allow me to do text parsing, getting play-by-play logs from various formats into a standardized version that I use (very much like Retrosheet’s).  Python’s built-in libraries make it very easy to do much of that.

XML parsing also arises quite a bit, especially if you’re grabbing data from MLB.  There are plenty of Python libraries that do that.  (I ended up writing my own.)  Creating and uploading flat HTML files is also a breeze.  For instance, I wrote my own blogging platform in a few hundred lines of code; that’s what runs the College Splits blog, as well as the GMAT Hacks and GRE HQ websites.  (More on that another day.)

I can’t say very much about what makes Python better than other languages, because I don’t have enough experience with other languages to know that it is.  For a beginner, I wouldn’t recommend anything else.  You may discover reasons to end up in another language, but even Python’s detractors acknowledge that it’s about as easy as it gets.

If you do decide to teach yourself Python, I encourage you to start working on “real” projects as quickly as possible.  Don’t bite off too much–you might just work on coding some common baseball stats (OBP, SLG, ERA, etc.), or when you’re ready for more, write a program to compare players given the parameters of a certain fantasy league.  Having a relevant goal makes it a lot easier to stay motivated.  If it hadn’t been for the motivation of Minor League Splits, I probably would never have become skillful enough to try any other major programming projects.

The one thing I wish someone had told me when I was a beginner is this: Anything you’re working on, someone else has probably done.  I’m embarrassed to recall how many functions I wrote that were no more than clumsy replicas of built-in functions.  Some of the tools I’ve written–to work with CSV and XML formats, for instance–I treated as exercises for myself, but there are several options out there.  Even when you’re engrossed in your first project, take some time out to browse through Python’s documentation, or a how-to book or two.  These will remind you that the language does a lot more than you’re aware of, and keep you from spending time on work others have done.

Another word of advice I wish I’d had–don’t pay too much attention to the constant refrain to “comment your code.”  If you get a developer job, comment your code.  If you’re doing this stuff for fun, don’t worry too much about it.  Instead, always think about making your code reusable.  You don’t need to adopt all the trappings of object-oriented programming, but if you’re doing almost anything baseball related, realize you’ll need it again.

For instance, writing a function for something like OPS takes a couple of minutes no matter how good you are with the language…but once you’ve written it, if you keep it separate from other things (for instance, a function should calculate only OPS, not AVG, BABIP, and OPS), you can use it again and again.  I was an awful programmer in 2006, but there is code I wrote in my first few months that I still find myself reusing.
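A sketch of what that separation might look like, using the standard definitions of OBP and SLG. The function and argument names are just one possible choice, and the sample stat line is made up.

```python
def obp(h, bb, hbp, ab, sf):
    """On-base percentage: times on base divided by AB + BB + HBP + SF."""
    denom = ab + bb + hbp + sf
    return (h + bb + hbp) / denom if denom else 0.0

def slg(total_bases, ab):
    """Slugging percentage: total bases per at-bat."""
    return total_bases / ab if ab else 0.0

def ops(h, bb, hbp, ab, sf, total_bases):
    """OPS is just the two pieces added together."""
    return obp(h, bb, hbp, ab, sf) + slg(total_bases, ab)

# Made-up season line: 150 H, 50 BB, 5 HBP, 500 AB, 5 SF, 250 TB.
print(round(ops(150, 50, 5, 500, 5, 250), 3))
```

Because `obp` and `slg` stand alone, either can be reused without dragging the other along, and `ops` stays a one-liner.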