Thursday, January 14, 2010

Power law curve in surnames

There are purportedly six million unique surnames in the United States.

Think about that. Considering how many Smiths and Johnsons are running about, that means there are millions of surnames being clutched onto by only a handful of survivors. Indeed, the New York Times says that while 151,000 surnames were shared by a hundred or more Americans, four million were held by only one person. For some reason, I suspect that spelling typos are responsible for at least a million of those. How many census records, for example, accidentally point to the surname "Smioth" or "Wikkiams"?

A few years ago, I wanted to know how many Kohses there were in the United States. I suspected there were about 200 or so. I was able to obtain a tally sheet from the United States Census Bureau, and while they did not preserve the exact 1990 census count, I learned at least that "Kohs" was about the 56,229th most frequently occurring surname in America.

Recently, though, I went back to review that file, and it's been updated with more detailed year 2000 census data. They now tally over 150,000 surnames having more than 100 members. The "Kohs" name has dropped to (approximately, since many families share the same estimated number of members) 84,631st place. They estimate about 206 people with that last name in the United States -- very close to my personal guess of "about 200". Funny that I've already collected at least 30 of them on Facebook.

While poking through the first 1,000 surnames, though, I made a fascinating discovery. After I charted the data, it was clear that the top 1,000 American surnames follow a power law curve in terms of distribution.

Naïve as I may be, I momentarily thought I might have made an impressive discovery, but this is not the case, of course. Academic studies have already examined the power law properties of surnames in Puerto Rico, Japan, England and Wales, Korea, and doubtless many other countries, including the United States.
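
For anyone who wants to run the same kind of check on a tally of their own, here is a minimal Python sketch. It is not the method used for the chart described above. It assumes a power law of the form count = scale / rank^alpha, which plots as a straight line when both rank and count are on logarithmic axes, and it fits that line by ordinary least squares. The function name fit_power_law is hypothetical, and the counts are synthetic stand-ins generated with an arbitrary exponent of 0.8. They are not Census Bureau figures. A good straight-line fit on log-log axes is only a rough indication of a power law, not proof of one.

```python
import math


def fit_power_law(counts):
    """Fit log(count) = log(scale) - alpha * log(rank) by least squares.

    `counts` is a list of positive frequencies, one per surname.
    Returns (alpha, scale, r_squared).
    """
    # Rank 1 is the most common name, rank 2 the next, and so on.
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)

    # The slope on log-log axes is negative; alpha is its magnitude.
    return -slope, math.exp(intercept), r_squared


# Synthetic stand-in data (an assumption, NOT real census counts):
# 1,000 "surnames" whose counts fall off exactly as rank ** -0.8.
synthetic_counts = [1_000_000 / rank ** 0.8 for rank in range(1, 1001)]

alpha, scale, r_squared = fit_power_law(synthetic_counts)
print(f"alpha={alpha:.3f} scale={scale:.0f} r_squared={r_squared:.4f}")
```

To use it on real data, replace synthetic_counts with the list of surname counts from whatever tally you have. An r_squared close to 1 means the log-log points sit near a straight line.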



At 10:07 AM, July 24, 2010, Blogger Steve St Clair said...

Hi Gregory,

I enjoyed the surname post. I run a DNA study of the St. Clair family worldwide and, while I know you're writing about Mkt Research, it's very interesting for genealogy as well.

You clearly have carefully considered opinions about Marketing and I enjoy the blog. I am working with a company that has a solution that may be very interesting for Marketing Research companies and I wonder if I could get you and your other followers to tell me if they agree. The company is Earth Class Mail.

Their solution would help with survey returns and list management.

Our understanding is that more and more market research happens by mail because of the Do Not Call Registry.

So let's say you're sending out 5,000 surveys.
1. You'll get 10-15% back with bad addresses
2. You'll pay staff to open and read the results of the surveys.

Small to medium size firms have interns and junior research analysts cleaning the lists and coding the results of the surveys.

With this solution, small to mid-size firms direct the survey returns to Earth Class Mail, which scans them.

They can even automate the returns to go to the right person or groups in the Pacific or India to compile the data, if that's a workflow of interest.

This is a broad video that explains the entire solution more from a consumer angle, but you can imagine the use for outsourcing marketing research back office work -

Please let me know what you think.



At 10:03 PM, July 25, 2010, Blogger Gregory Kohs said...

I published your comment, Steve, if only so that I may respond that while I think the concept of Earth Class Mail is interesting, the rates seem awfully expensive and would undo any cost-benefit advantage that I would have initially supposed of the service.

At 2:03 AM, September 17, 2010, Anonymous Philip Brookes said...

I really do learn something new every day! I was just searching for random discussions on marketing topics and was not in the least expecting a discussion on the statistics behind the distribution of surnames in the general population, but I must say I found it fascinating - I'm curious as to why it works out this way. Is it just what you would expect when there are random distributions in such a large population? Or are people more inclined to marry somebody with an attractive surname and adopt their name, and pass it on to their kids, etc...??? On the face of it, it's baffling to me - and fascinating! Thanks for sharing :-)

