Wednesday, January 20, 2010

Not As Excited










A year ago I wrote about how excited I was to see Barack Obama in the White House. I was happy to see the Bush administration end after eight years and hopeful that things would change for the better.

Some things are better, but I'm disappointed that we haven't gone far enough:


  1. Guantanamo is still open.

  2. Goldman Sachs still runs the Treasury Department.

  3. The Patriot Act, Department of Homeland Security, and other measures put in place to make us safer have not had the desired effect. We have given up liberty for safety and ended up with neither.

  4. Deficits continue to climb. Our debt is reaching the point where we won't be able to turn back.

  5. We're still at war in Afghanistan and Iraq, with no end in sight. Yemen might be next. Sabre-rattling continues with Iran. At this rate we'll be fighting with the whole Middle East soon. How will we know "victory" when we see it? When does a war on terror end? At $1M per year per soldier, how long before we can't afford any more?

  6. Jobs continue to disappear and don't look like they'll be coming back soon. Part of the reason for unrest in the Middle East is a large population of educated young people who can't get jobs and establish themselves as adults. How long before our country finds itself in a similar predicament?

  7. Lobbyists and K Street continue to represent a fourth branch of government that the founders never envisioned.

  8. Glass-Steagall is still repealed.



The economy did not crater, as I feared it might back in September 2008. But it has not recovered much, either. The fundamentals are simply terrible. We cannot continue to spend and consume more than we save and produce. Eventually the Chinese and Japanese won't want any more of our bonds.

We're increasingly a country that indulges in magical thinking. Excessive belief in belief isn't getting us anywhere.

Our ruling class isn't telling us the truth: "We cannot have all-you-can-eat health care and low mortgages and billions on wars and still cut taxes. Hard choices will be forced on us soon if we don't make them ourselves. Let's start the discussion now before it's too late." Instead we have Fox News and talk radio.

Our educational system is falling behind. Our kids are encouraged to spend more time playing sports and game consoles and surfing the Internet and keeping up with the Kardashians and texting on iPhones instead of reading or doing science. Our colleges are turning out plenty of lawyers and MBAs and fewer scientists and engineers. Where do people think that innovations like iPhones and netbooks come from?

I'm still glad that Mr. Obama is in charge. But I hoped for more.

Our problems are bigger than any president. It should not take catastrophe for us to look inside ourselves and decide that we need to reconsider the path we're taking.



Tuesday, January 19, 2010

High Tech










I left a meeting this afternoon and headed towards the elevator to go back to my desk. In my hand was the small Moleskine pocket notebook that I carry with me to jot down notes and reminders.

I suddenly flashed back to the memory of one of my all-time favorite co-workers, a brilliant guy who got a 4.0 GPA in computer science from St. John's University and currently works for Microsoft. Throughout my engineering career I made it a habit to keep a notebook/daily journal in which I scribbled technical details, derivations, sketches, handy tidbits, etc. I would date and index them for "easier" cross-reference, but because they were paper, my ability to look things up again was limited.

I'd write small personal notes in the margins. For example, each time my wife called with news that we were expecting one of our daughters I made a note of it.

After abandoning engineering and embarking on my software development adventure I tossed them all out.

I might have thrown away the now-useless notes when I stopped being an engineer, but I kept the habit. When I met my friend at the employer we had in common for a year, he gushed about the fact that I carried a notebook and wrote stuff down. He told his wife about it, who said it sounded like a great thing to do. I felt "cool" and high-tech, pleased and a little embarrassed by his effusive praise.

Flash forward twelve years. I still have my Moleskine in my hand, but now that I'm surrounded by young bucks with iPhones I don't feel very high-tech or cool anymore.

I felt quaint today. How quickly things change.


Saturday, January 16, 2010

Sacred Text











There's a question over on Stackoverflow about the value of knowing how low-level details work. The questioner cited an article written by one of the founders of Stackoverflow.com, Joel Spolsky. His point is that modern programmers tend to focus on learning high level abstractions like Java and .NET and forget about the byte-level details. He goes on to cite stories about storage of strings in C to bolster his argument. The questioner wants to hear specific examples of how knowing C can make one a better programmer.

Joel is a terrifically smart guy. His degree from Yale is wonderful; he's got a Microsoft pedigree; his Fog Creek software company has been in business for years now; he's one of the best known bloggers about software on the web; he founded Stackoverflow with Jeff Atwood. Seeing how much time I spend there for no more remuneration than reputation points, the thrill of helping others, and learning a few things along the way, I'd say that Joel is a tremendous success in this field by every measure.

I've even agreed with his point on this blog. Who can argue in favor of ignorance? "Please tell the court when you stopped beating your wife, sir." There's no winning that point.

But the answer that I started to write in response to the question was negative. I recommended taking Joel with a grain of salt, since the post was written in 2001. I was surprised by this, because I'm generally in favor of learning regardless of its commercial payoff. So I decided to explore the idea a bit more here.

I learned C while I was still a mechanical engineer. The only language I ever knew was FORTRAN, of course. One day my employer disconnected us from the VAX computer we were all sharing and gave us individual Sun workstations. I had Unix at my fingertips. I was fortunate enough to sit in an aisle with a brilliant guy named Kim Perlotto. He worked in another group that didn't have anything to do with the numerical analysis gang that I ran with, but he was wonderfully smart and terrific to talk to. I didn't appreciate the computer science knowledge that was spewing out of him all the time, because I was so focused on engineering that I was too ignorant to even know what he was talking about. ("Software objects? Since they're 'soft', they must be deformable - maybe viscoplastic. We'll need an appropriate large strain measure, like Green-Lagrange and its energy conjugate stress measure, 2nd Piola-Kirchhoff, maybe a viscoplastic material model by Kevin Walker or Chaboche...")

I was walking by Kim's cube one day when I spied his well-thumbed copy of pre-ANSI K&R sitting on the corner of his desk. I picked it up and asked about it. He smiled and said, "Wanna learn C? You can borrow it if you like."

I struggled through that book. The whole idea of pointers escaped me for a while. I remember the day I figured out how function pointers worked. I was able to change the way a program worked simply by asking a pointer to refer to a new function. Magic! I was so happy when a friend complained about a C routine that was returning nonsense results from the input arrays that were passed in. I suggested that, because C arrays are zero-based, he needed to subtract one from the input pointers, and that saved the day. I was able to bask in glory for an entire afternoon.
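Java, my daily language these days, has no function pointers, but a method reference held in a variable gives much the same feeling. The following is only an analogue of the C trick, not the code I wrote back then; the class name FunctionReferenceSketch and the helper applyToAll are made up for illustration.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

public class FunctionReferenceSketch
{
    // Hypothetical helper: applies whatever function it is handed to every element.
    public static double[] applyToAll(double[] values, DoubleUnaryOperator f)
    {
        double[] result = new double[values.length];
        for (int i = 0; i < values.length; i++)
        {
            result[i] = f.applyAsDouble(values[i]);
        }
        return result;
    }

    public static void main(String[] args)
    {
        double[] input = { 1.0, 4.0, 9.0 };

        // "Point" the variable at one function...
        DoubleUnaryOperator f = Math::sqrt;
        System.out.println(Arrays.toString(applyToAll(input, f))); // prints [1.0, 2.0, 3.0]

        // ...then re-point it at another. The calling code does not change.
        f = x -> x * x;
        System.out.println(Arrays.toString(applyToAll(input, f))); // prints [1.0, 16.0, 81.0]
    }
}
```

The behavior changes only because the variable f refers to a different function, which is the same effect that delighted me in C.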

It was my first step away from engineering and towards software development. When C++ came along, it was close enough to C to entice me to learn it. (Much like you can entice a fruit-loving dog out of a crate with a wedge of apple.) I wrote C++ for a living when I first left engineering, allowing me to dip my toes into the vast ocean of object-orientation. Then Java came along, and now C#.

I'm happy to say that I did learn C and C++ well enough to feel comfortable and conversant with both. But if asked to write either one now I'd have to remember a lot of the syntactical subtleties. It's been eight years since I last wrote in either language.

So when I started to think about the Stackoverflow question, I was hard-pressed to think of a specific example of how knowing C has made me a better programmer. It changed me into a programmer in the first place, but I don't write C anymore.

I'd say I'm a much better programmer now than I was when I first picked up K&R. All those years of learning, context, and experience have helped. I find it impossible to tease my knowledge of C out of that tangle and hold it up to the light.

The follow-up question should be: Are all the layers of abstraction being used in software development harmful? Are the generations of programmers plying their trade today inferior to their predecessors, who were worried about making every byte count? I would say "it depends", in the same way that Brian Cox is both Isaac Newton's inferior and superior in physics. Brian is a brilliant guy who has internalized all that Newton gave us and has gone far beyond it, but he'd be the first to admit that he's standing on the shoulders of giants.

The difference is that it's not possible for Brian to practice physics without a thorough understanding of everything Newtonian. Calculus is the mathematics of dynamic systems. I think it is possible to make a living as a developer and never write C. The cursory knowledge of C to understand pointers and manual memory management that a skim through K&R would give you might be sufficient.

Peering behind the curtain to understand everything beneath the abstractions is a laudable impulse, but it has to be indulged given constraints of energy and time. There are only so many hours in the day, and lots to learn. Economists would tell us to be mindful of opportunity costs. Joel Spolsky is a smart guy, but his blog isn't a sacred text - yet.


Friday, January 8, 2010

Spring Transactional Database Tests










I've had a programming problem that's been bugging me for a while. I'm a Java developer who's a fan of both Spring and unit testing. So whenever I create a data access object (DAO) for persisting objects I like to unit test it to make sure that I've coded everything properly.

One of the special problems with database testing, especially for those cases where you share a relational database with others, is relying on data being present to make your tests pass. What happens if the data that made my tests run at 100% success is removed by someone else?

One solution is to use an isolated database with the identical schema that is completely under your control. This may not be practical for large schemas.

Another solution is to make your tests transactional: start the test, open a transaction, populate the database with data, run your tests, and roll the transaction back. That way you're operating on the real, live schema. You seed the database with data you can rely on. The rollback wipes out your footprints and makes it look like you never modified the database at all.
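For anyone who wants to see the idea without a framework, here is a minimal plain-JDBC sketch of the open, seed, test, roll back cycle. It is not how Spring implements it. The class name RollbackSketch, the DatabaseWork interface, and the runAndRollBack method are names I made up for illustration, and the sketch assumes a Connection to a database that supports transactions.

```java
import java.sql.Connection;
import java.sql.SQLException;

public class RollbackSketch
{
    /** Hypothetical callback: the seeding and assertions to run inside the transaction. */
    public interface DatabaseWork
    {
        void run(Connection connection) throws SQLException;
    }

    /**
     * Runs the work inside a transaction and always rolls it back,
     * so nothing the work writes is left behind.
     */
    public static void runAndRollBack(Connection connection, DatabaseWork work) throws SQLException
    {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false); // open a transaction instead of committing each statement
        try
        {
            work.run(connection);        // seed data, exercise the DAO, make assertions
        }
        finally
        {
            connection.rollback();       // wipe out the footprints, pass or fail
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A test would hand in a callback that inserts its seed rows and checks the results; whatever it wrote is gone once the method returns.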

Spring provides base classes to accomplish exactly this: one for JUnit version 4 and another for TestNG.

I've become a fan of TestNG, but I'm unhappy to report that I couldn't make this ideal situation work for TestNG. I went back to the Spring reference docs in frustration and started again with JUnit version 4. Section 8.3.7.4. "Transaction management" lays it out perfectly. My tests were 100% successful. If I stopped in a debugger and looked at the database, I could see the seed data rows. When the test was completed, the table rolled back to its undisturbed state, as expected.

It should have been as simple as exchanging a single JUnit annotation for its TestNG equivalent, but autowiring of beans wasn't working as it should. When I tried to inject the bean manually from the application context I had another problem. I'll have to dig into this a bit more to see if I can make TestNG work with Spring.

I used a simple model object Product:


package tutorial.model;

import java.io.Serializable;
import java.text.DecimalFormat;

public class Product implements Serializable
{
    private Integer id;
    private String name;
    private double price;
    private int quantity;

    public Product(String name, double price, int quantity)
    {
        this(null, name, price, quantity);
    }

    public Product(Integer id, String name, double price, int quantity)
    {
        this.id = id;
        this.name = name;
        this.price = price;
        this.quantity = quantity;
    }

    public Integer getId()
    {
        return id;
    }

    public void setId(Integer id)
    {
        this.id = id;
    }

    public String getName()
    {
        return name;
    }

    public void setName(String name)
    {
        this.name = name;
    }

    public double getPrice()
    {
        return price;
    }

    public void setPrice(double price)
    {
        this.price = price;
    }

    public int getQuantity()
    {
        return quantity;
    }

    public void setQuantity(int quantity)
    {
        this.quantity = quantity;
    }

    @Override
    public boolean equals(Object o)
    {
        if (this == o)
        {
            return true;
        }
        if (o == null || getClass() != o.getClass())
        {
            return false;
        }

        Product product = (Product) o;

        if (Double.compare(product.price, price) != 0)
        {
            return false;
        }
        if (quantity != product.quantity)
        {
            return false;
        }
        if (id != null ? !id.equals(product.id) : product.id != null)
        {
            return false;
        }
        if (name != null ? !name.equals(product.name) : product.name != null)
        {
            return false;
        }

        return true;
    }

    @Override
    public int hashCode()
    {
        int result;
        long temp;
        result = id != null ? id.hashCode() : 0;
        result = 31 * result + (name != null ? name.hashCode() : 0);
        temp = price != +0.0d ? Double.doubleToLongBits(price) : 0L;
        result = 31 * result + (int) (temp ^ (temp >>> 32));
        result = 31 * result + quantity;
        return result;
    }

    @Override
    public String toString()
    {
        return "Product{" +
            "id=" + id +
            ", name='" + name + '\'' +
            ", price=" + DecimalFormat.getNumberInstance().format(price) +
            ", quantity=" + quantity +
            '}';
    }
}


There's a ProductDao interface:


package tutorial.persistence;

import tutorial.model.Product;

import java.util.List;

public interface ProductDao
{
    List<Product> find();
    Product find(Integer id);
    List<Product> find(String name);
    List<Product> find(double minPrice, double maxPrice);
    void save(Product product);
    void update(Product product);
    void delete(Product product);
    void delete();
}


The ProductDaoImpl uses Spring JDBC:


package tutorial.persistence.jdbc;

import org.springframework.jdbc.core.PreparedStatementCreator;
import org.springframework.jdbc.core.simple.SimpleJdbcDaoSupport;
import org.springframework.jdbc.support.GeneratedKeyHolder;
import org.springframework.jdbc.support.KeyHolder;
import org.springframework.stereotype.Repository;
import tutorial.model.Product;
import tutorial.persistence.ProductDao;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

@Repository
public class ProductDaoImpl extends SimpleJdbcDaoSupport implements ProductDao
{
    public static final String BASE_SELECT = "select id, name, price, quantity from product ";
    public static final String FIND_ALL = BASE_SELECT + " order by id ";
    public static final String FIND_BY_ID = BASE_SELECT + " where id = ? ";
    public static final String FIND_BY_NAME = BASE_SELECT + " where name = ? ";
    public static final String FIND_BY_PRICE_RANGE = BASE_SELECT + " where price between ? and ? ";
    public static final String INSERT_SQL = "insert into product(name, price, quantity) values(?,?,?)";
    public static final String UPDATE_SQL = "update product set name = ?, price = ?, quantity = ? where id = ?";
    public static final String DELETE_ALL_SQL = "delete from product ";
    public static final String DELETE_BY_ID = DELETE_ALL_SQL + " where id = ?";

    // ProductRowMapper (not shown in this post) maps one result set row to a Product.
    public List<Product> find()
    {
        ProductRowMapper productRowMapper = new ProductRowMapper();

        return getSimpleJdbcTemplate().query(FIND_ALL, productRowMapper);
    }

    public Product find(Integer id)
    {
        return this.getSimpleJdbcTemplate().queryForObject(FIND_BY_ID, new ProductRowMapper(), id);
    }

    public List<Product> find(String name)
    {
        return this.getSimpleJdbcTemplate().query(FIND_BY_NAME, new ProductRowMapper(), name);
    }

    public List<Product> find(double minPrice, double maxPrice)
    {
        return this.getSimpleJdbcTemplate().query(FIND_BY_PRICE_RANGE, new ProductRowMapper(), minPrice, maxPrice);
    }

    // Inserts the product and copies the database-generated key back into it.
    public void save(final Product product)
    {
        KeyHolder keyHolder = new GeneratedKeyHolder();

        this.getJdbcTemplate().update(new PreparedStatementCreator()
        {
            public PreparedStatement createPreparedStatement(Connection connection) throws SQLException
            {
                PreparedStatement ps = connection.prepareStatement(INSERT_SQL, new String [] { "id" });
                ps.setString(1, product.getName());
                ps.setDouble(2, product.getPrice());
                ps.setInt(3, product.getQuantity());
                return ps;
            }
        }, keyHolder);

        product.setId(keyHolder.getKey().intValue());
    }

    public void update(final Product product)
    {
        this.getSimpleJdbcTemplate().update(UPDATE_SQL, product.getName(), product.getPrice(), product.getQuantity(), product.getId());
    }

    public void delete(final Product product)
    {
        this.getSimpleJdbcTemplate().update(DELETE_BY_ID, product.getId());
    }

    public void delete()
    {
        this.getSimpleJdbcTemplate().update(DELETE_ALL_SQL);
    }

    // Note: this helper is not called anywhere in this class. The SQL constants above
    // use positional '?' placeholders, so a map of named parameters would only make
    // sense with SQL written using ':name' placeholders.
    private void update(String sql, final Product product)
    {
        Map<String, Object> parameters = new HashMap<String, Object>()
        {{
            put("id", product.getId());
            put("name", product.getName());
            put("price", product.getPrice());
            put("quantity", product.getQuantity());
        }};

        this.getSimpleJdbcTemplate().update(sql, parameters);
    }
}


The Spring application context uses the DataSourceTransactionManager:


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:aop="http://www.springframework.org/schema/aop"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
           http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.5.xsd
           http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.5.xsd">

    <tx:annotation-driven transaction-manager="txManager"/>

    <bean id="dataSourceProperties" class="org.springframework.beans.factory.config.PreferencesPlaceholderConfigurer">
        <property name="location" value="classpath:product-datasource.properties"/>
    </bean>

    <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="${datasource.driver}"/>
        <property name="url" value="${datasource.url}"/>
        <property name="username" value="${datasource.username}"/>
        <property name="password" value="${datasource.password}"/>
    </bean>

    <bean id="productDao" class="tutorial.persistence.jdbc.ProductDaoImpl">
        <property name="dataSource" ref="dataSource"/>
    </bean>

    <bean id="txManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
        <property name="dataSource" ref="dataSource"/>
    </bean>

</beans>


The Spring transactional JUnit 4 unit test has all the annotations from Chapter 8 of the reference manual:


package tutorial.persistence;

import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.test.annotation.Rollback;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.AbstractTransactionalJUnit4SpringContextTests;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.test.context.transaction.AfterTransaction;
import org.springframework.test.context.transaction.BeforeTransaction;
import org.springframework.test.context.transaction.TransactionConfiguration;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.beans.factory.annotation.Autowired;

import tutorial.model.Product;

import java.util.List;

// Note: the plain Java assert statements below are only checked when the JVM
// runs with assertions enabled (-ea).
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "file:resources/product-*.xml" })
@Transactional
@TransactionConfiguration(transactionManager="txManager", defaultRollback=true)
public class ProductDaoTest extends AbstractTransactionalJUnit4SpringContextTests
{
    @Autowired
    private ProductDao productDao;
    Product [] testProducts =
    {
        new Product("Dell", 1000.0, 100),
        new Product("HP", 2000.0, 200),
        new Product("Cisco", 3000.0, 300),
        new Product("Microsoft", 4000.0, 400),
    };
    private static final double TOLERANCE = 1.0E-8;

    // Runs outside the transaction: the table must be empty before each test starts.
    @BeforeTransaction
    public void verifyInitialDatabaseState()
    {
        List<Product> products = this.productDao.find();

        assert products != null && products.size() == 0;
    }

    // Runs inside the transaction: seed rows are rolled back with the test.
    @Before
    public void populateDatabase()
    {
        for (Product product : testProducts)
        {
            productDao.save(product);
        }
    }

    @Test
    @Rollback(true)
    public void testFindAll()
    {
        List<Product> actual = this.productDao.find();

        assert actual != null && actual.size() == testProducts.length;
        for (Product product : testProducts)
        {
            assert actual.contains(product);
        }
    }

    @Test
    @Rollback(true)
    public void testById()
    {
        List<Product> actual = this.productDao.find();

        assert actual != null && actual.size() == testProducts.length;
        for (Product product : testProducts)
        {
            Product byId = productDao.find(product.getId());
            assert byId.equals(product);
        }
    }

    @Test
    @Rollback(true)
    public void testFindByName()
    {
        List<Product> actual = this.productDao.find();

        assert actual != null && actual.size() == testProducts.length;
        for (Product product : testProducts)
        {
            List<Product> byName = productDao.find(product.getName());
            assert byName != null && byName.size() == 1 && byName.get(0).equals(product);
        }
    }

    @Test
    @Rollback(true)
    public void testFindByPriceRange()
    {
        List<Product> actual = this.productDao.find();

        assert actual != null && actual.size() == testProducts.length;
        for (Product product : testProducts)
        {
            double minPrice = product.getPrice() - 10.0;
            double maxPrice = product.getPrice() + 10.0;
            List<Product> byPriceRange = productDao.find(minPrice, maxPrice);
            assert byPriceRange != null && byPriceRange.size() == 1 && byPriceRange.get(0).equals(product);
        }
    }

    @Test
    @Rollback(true)
    public void testUpdate()
    {
        List<Product> actual = this.productDao.find();

        assert actual != null && actual.size() == testProducts.length;
        double priceIncrease = 1000.0;
        for (Product product : testProducts)
        {
            double oldPrice = product.getPrice();
            product.setPrice(oldPrice + priceIncrease);
            productDao.update(product);
            Product byId = productDao.find(product.getId());
            assert Math.abs(byId.getPrice() - (oldPrice+priceIncrease)) < TOLERANCE;
        }
    }

    @Test
    @Rollback(true)
    public void testDelete()
    {
        List<Product> before = this.productDao.find();

        assert before != null && before.size() == testProducts.length;
        productDao.delete(testProducts[0]);

        List<Product> after = this.productDao.find();

        assert after != null && after.size() == (before.size()-1) && !after.contains(testProducts[0]);
    }

    // Runs after the rollback: the table must be empty again.
    @AfterTransaction
    public void verifyFinalDatabaseState()
    {
        List<Product> products = this.productDao.find();

        assert products != null && products.size() == 0;
    }
}


Since all the transactional annotations are Spring, I thought that switching from JUnit 4 to TestNG would be as simple as the following three steps:


  1. Remove the @RunWith annotation calling the JUnit 4 runner

  2. Switch the base class

  3. Replace the JUnit 4 @Before annotation with a TestNG equivalent (I chose @BeforeSuite; @BeforeMethod, which runs before every test method, is the closer match)


Unfortunately, there's some autowiring magic that's lost in the translation. I get a NullPointerException for the ProductDao reference in the populateDatabase method. When I added code to inject the bean from the application context it failed as well.

If anyone has any advice that would get me off the dime with TestNG I'd appreciate hearing it. In the meantime, I know that Spring's transactional database tests work exactly as advertised with JUnit 4.



Tuesday, January 5, 2010

New England Oireachtas 2009 Results










The New England Oireachtas was held in Providence RI the weekend before Thanksgiving. Oireachtas is a Gaelic word that means "gathering of the tribes," but in this case it's the name of the regional championships for Irish dance. Even if you've never seen Irish dance, it's likely that you're familiar with Riverdance, the internationally famous show that brought the art form into the collective consciousness. Or perhaps you've seen send-ups of the dancing on Saturday Night Live.

Riverdance was a revelation. It sparked a huge interest in Irish dance. Schools that had been dormant for years were suddenly overwhelmed by a generation of little girls who decided they wanted to dance like Jean Butler. A dance competition is called a "feis" (pronounced "FESH") in the singular and "feiseanna" in the plural. There was an explosion in the sales of all the paraphernalia that attended it: shoes, dresses, trophies, wigs, etc.

My daughter Erin fell under its spell the first time she saw it on a video. We signed her up at Duffy Academy in East Hartford, where she took instruction every Saturday morning from Mary Duffy. I would drive her over every weekend, bring a book, and read while she practiced. I loved that time with her. She was at the age where her father could do no wrong. I'd do silly, fun things like pretend that I'd forgotten the way to the school and force her to tell me when to turn. When I walked to and from the car I'd hold her hand and give it two quick squeezes, simulating the beating of a heart. She'd pump back twice, and we'd alternate all the way to our destination. If one of us stopped, a panic would ensue and CPR would have to be administered until all was well again.

I wasn't the right parent to fulfill her ambitions. We changed schools when Mary Duffy's health started to fail her. My wife took over chauffeur duties to lessons and competitions. Dresses were made and sold, each more elaborate than the one before.

Erin progressed rapidly when she started competing. She quickly rose to the ranks of the best dancers in New England, placing in the New England and North American championships regularly, but she never ranked high enough to qualify for the World Championships in Ireland.

Until this year.

She had a great day in Providence and placed fifth in her age group, her best showing yet. She qualified to compete in Glasgow at Easter 2010.

It's a great accomplishment, of course. But even better is her humility and graciousness after winning. She's a competitor who still manages to enjoy the dancing and interacting with her fellow dancers. We experienced a spike in our phone bill after the competition, because she was fielding so many congratulatory text messages from the other girls in her age group.

Riverdance is coming to an end this year, but Erin plans to continue dancing for a little longer - at least until the coming Easter.


Sunday, January 3, 2010

Beautiful Data










I finished reading "Beautiful Data" a couple of weeks ago. It's an O'Reilly book edited by Toby Segaran (of "Programming Collective Intelligence" fame) and Jeff Hammerbacher.

I love the topic. We're awash in data, thanks to the Internet, but there will be a greater premium placed on gleaning insight from it all in the future. This book doesn't provide lessons on how to do that, but it does list some wonderful examples. The fields are diverse: biology, social sciences, criminology, space exploration, and others.

I thought the book was a little uneven, as compendiums of essays often are. It's hard to establish and meet a high quality standard consistently when many authors are involved.

Three essays stood out for me.

Peter Norvig is a new technical hero of mine. If a B.S. from Brown in applied math, a Ph.D. in computer science from Cal Berkeley, and Director of Research at Google isn't enough, maybe "Teach Yourself Programming In Ten Years" and "How To Write A Spelling Corrector" will put you over the top. His essay entitled "Natural Language Corpus Data" discusses the trillion word data set published by Thorsten Brants and Alex Franz of Google and its applications. This includes a predecessor of the spelling corrector I cited earlier.

Biology is an up-and-coming field that I'm largely ignorant of. The little I know is being spoon fed to me by my brilliant youngest daughter, but I'm a messy, inattentive eater who gets as much on the floor as I do into my gob. "Life in Data: The Story of DNA" by Matt Wood and Ben Blackburne gives a nice overview with a data slant that I enjoyed.

"Superficial Data Analysis: Exploring Millions of Social Stereotypes" by Brendan O'Connor and Lukas Biewald took data from FaceStat.com, which is down 'temporarily', and mined it for relationships between gender and attractiveness, gender bias in word usage, etc. I thought it was a brilliant use of a public data source.

I'd give an honorable mention to "Building Radiohead's House Of Cards" by Aaron Koblin with Valdean Klump. I love the band. I ran right over to YouTube to see the video after reading the piece. It's brilliant stuff.

There are lots of large data sets that are publicly available. The Stackoverflow.com data is available under a Creative Commons license. It would be a great source of information about the millions of programmers who frequent it.

It doesn't require expensive tools, either. The R statistics package is free to download. The learning curve is steep, but there are resources available here, here, and there to be your sherpa on your way to the summit.

My list of goals for 2010 is growing. If a goal is a dream with a deadline, I need to become much better at setting deadlines for myself. Too many dreams remain unfulfilled.