
Sienna, Rihanna, Cameron and Usama: Baby names in England and Wales

If you are a parent, or soon to be a parent, you may already have discovered the US’s Baby Name Voyager. It’s a data-visualization classic, a wonderful way to bring 100 years of American baby names to life. And like (I think) the very best visualizations, it is useful as well as interesting: not only does it reveal broad social trends, but you can hunt for names for your own children.

Recently, for fun, I decided to make a version for the UK, using modern JavaScript (Backbone and D3). The Office for National Statistics only releases 15 years of name data, but I thought that would still be long enough to make a useful tool for British parents, and find some interesting trends. After all, the country has changed plenty since 1996.

So I built a web app called, imaginatively, England & Wales Baby Names. Just like the Voyager, you can look up names for your own children and see naming trends. You can quickly search through the 27,000 names used by parents since 1996, and see the exact number of babies given each name in every year.

I’ve tried to make the tool as easy to use as possible – and if you type slowly, it will show you results letter by letter. So if you’d like a name starting with the letter i, you can search that way. You won’t be alone, because intriguingly, names beginning with i have trebled in popularity since 1996.

Names beginning with i since 1996

The tool also reveals some striking celebrity-related trends – such as the precipitous decline of the name Jordan. In 1996, Jordan was a very popular name, accounting for 5750 boys and 372 girls. From 1996 to 1998, when Ms. Price was a fresh-faced Page 3 girl, there was a small fall for boys, and a jump for girls. But in the following decade, as her chest inflated, parents increasingly avoided the name – only 268 boys (more than 20 times fewer) and 5 girls were named Jordan in 2010.

Trends for the name Jordan since 1996

I analysed the ONS data to find the top rising and falling names over the period – both in absolute terms, and proportionally. (I define absolute rises in a name by taking the highest number of babies with that name recorded in any year, and subtracting the lowest number in any year. And I define proportional rises in a name by taking the highest number of babies with that name (corrected for the birthrate that year) recorded in any year, and dividing by the lowest number in any year.)
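Here is a minimal sketch of those two definitions in Python. It is not the script I actually used: the function names are made up for illustration, and I am assuming that "corrected for the birthrate" means dividing each year's count by the total number of babies recorded that year. The Jordan counts and the birth totals are the ones quoted in this post, for 1996 and 2010 only.

```python
def absolute_change(counts):
    """Highest yearly count minus lowest yearly count."""
    return max(counts.values()) - min(counts.values())


def proportional_change(counts, births):
    """Highest yearly share of births divided by lowest yearly share.

    Assumption: 'corrected for the birthrate' is taken to mean
    count / total births in that year.
    """
    shares = {year: counts[year] / births[year] for year in counts}
    return max(shares.values()) / min(shares.values())


# Boys named Jordan, using only the two years quoted in the post.
jordan_boys = {1996: 5750, 2010: 268}

# Total babies (boys and girls) in the ONS name data for the same years.
births = {1996: 649_488, 2010: 723_165}

print(absolute_change(jordan_boys))              # 5750 - 268
print(proportional_change(jordan_boys, births))  # ratio of the two shares
```

Note that both measures only give the size of the swing. To tell a rise from a fall you would also need to check whether the low year came before the high year.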

Biggest absolute rises (F)

  1. Lily
  2. Grace
  3. Ava
  4. Evie
  5. Amelia
  6. Ellie
  7. Isabella
  8. Olivia
  9. Mia
  10. Maisie
Biggest absolute rises (M)

  1. Oliver
  2. Ethan
  3. Charlie
  4. Lucas
  5. Noah
  6. Archie
  7. Oscar
  8. Riley
  9. Jayden
  10. Logan
Biggest absolute falls (F)

  1. Chloe
  2. Lauren
  3. Rebecca
  4. Shannon
  5. Megan
Biggest absolute falls (M)

  1. Daniel
  2. James
  3. Jordan
  4. Matthew
  5. Thomas
Biggest proportional rises (F)

  1. Lexie
  2. Amelie
  3. Miley
  4. Macy
  5. Macie
  6. Lyla
  7. Nevaeh
  8. Macey
  9. Ava
  10. Zuzanna
Biggest proportional rises (M)

  1. Olly
  2. Jenson
  3. Kayden
  4. Ayaan
  5. Jakub
  6. Kaiden
  7. Kenzie
  8. Kacper
  9. Filip
  10. Rocco
Biggest proportional falls (F)

  1. Brittany
  2. Jordan
  3. Courteney
  4. Lauryn
  5. Kirby
Biggest proportional falls (M)

  1. Macaulay
  2. Grant
  3. Chandler
  4. Jordan
  5. Courtney

I think of the names with the biggest absolute rises and falls as the seismic trends that will come to define the period. Broadly, in recent years, girls’ names have become more flowery and old-fashioned, while Biblical boys’ names are out of favour.

However, names with proportional changes show fast-moving trends more clearly, and haven’t been analysed in detail before (as far as I know). In the rest of this post, I discuss some trends influencing proportional rises and falls.

Celebrity big brother

No surprise that celebrity is a big influence. Pop stars with unusual names really seem to affect the trends: thus Macy, Miley, Olly, and Kenzie are all in the top-10 fastest risers over the whole period. Pixie was the fastest-rising girls’ name from 2005 to 2010 (83 babies in 2010), and Tulisa was the fastest from 2008 to 2010 (34 babies in 2010).

But other homegrown celebrities have also raced up the charts in recent years: I noticed big jumps for Fearne and Alexa in particular. Keira is popular too, though it has fallen since 2004/5.

Celebrity names may also give an insight into public opinion: I enjoyed comparing trends for Jude and Sienna, especially what happens when Jude is exposed as a CHEATING LOVE RAT in 2005 – the popularity of his name dips sharply, but hers continues to rise.


Trends for Keira since 1996

Trends for Jude since 1996

Trends for Sienna since 1996

Celebrities’ children are a big influence: thus Rocco (Madonna’s son), and Lyla (a derivation of Lila, Kate Moss’s daughter) both appear on the top-10 fastest-rising names. Brooklyn (Beckham) was the fastest-rising boy’s name in the first five years of the period, between 1996 and 2001, and is still on the up.

Not all celebrity names catch on, though: even a beautiful, famous, and multi-talented owner can’t popularise a truly terrible name. Sorry, Nigella.

Incidentally, I don’t think we can assume parents always name their babies “after” a particular celebrity: Myla first rose to fame as the name of an expensive lingerie brand, but has still clearly inspired many parents (79 babies in 2010), who presumably aren’t deliberately naming their daughters after posh pants.

You’re toxic, baby

Some names, like Jordan, are chiefly notable for falling out of fashion over the period. Most striking is Britney, who explodes into fame in 1999 and almost as swiftly falls from favour again (killing the previously-popular name Brittany in the process). Courtney also drags down Courteney. Unsurprisingly, both Usama and Osama fall sharply in popularity after 2001.

Sometimes, celebrities just get less famous. For boys, a big hero-to-zero is early-90s child star Macaulay. And Lauryn Hill’s career never recovered after the late 1990s.

This sporting life

Sporting names, delightfully, seem to mirror their owners’ careers even more precisely than celebrity names. Jenson (the second fastest-rising name over the whole period) is a case in point. It first gains traction in 2000 (when Jenson Button became Britain’s youngest-ever F1 driver), zooms ahead in 2004 when he finished in the rankings for the first time, falls back again, then races up in 2009, when he won the World Drivers’ Championship.

I also noticed this being true of Thierry (peaking at 51 babies in 2004, when he was Europe’s top goalscorer) and Rio (peaking at 355 babies in 2008, when United won the double).


Trends for Jenson since 1996

Trends for Thierry since 1996

Trends for Rio since 1996

Royalty on the rise

Perhaps surprisingly, the young royals’ names William, Harry, Zara and Beatrice are all steadily on the up since 1996 – indeed, Harry is now the 3rd most popular boys’ name, up from 17th in 1996. No sign of a Kate/Catherine bump yet, though.

Political poison

Political names almost invariably seem to have negative, if any, effects. There’s a significant drop in the name Cameron in recent years, and from a lower base, Blair post 1997. And Cherie has collapsed as a girls’ name. I only spotted one political exception (it might be influenced by the rise in Polish names, but still): Boris, slowly but surely on the rise.


Trends for Cameron since 1996

Trends for Blair since 1996

Trends for Boris since 1996

Eastern Europe

The ONS data doesn’t include ethnicity, but if you browse the site for any length of time, you’ll spot a big jump in Eastern European names following the expansion of the EU in 2004. This probably accounts for Filip, Kacper and Zuzanna being in the top-10 lists (though they still account for small numbers of babies overall).

I think the archetypal British name of recent years may be Jakub, the fifth fastest-rising boys’ name from 1996 to 2010, and the fastest-rising Polish name. Not only is it Polish, it is also a famous footballer’s name (Borussia Dortmund star Jakub Blaszczykowski).

Art and culture

On the list above, Amelie (the second fastest-rising girls’ name over the whole period) is probably the biggest fictional influence, becoming popular after the 2001 film of the same name. The Matrix is also a big film influence in the late 1990s – both Neo (boys) and Trinity (girls) made it into the top-10 fastest-rising names between 1996 and 2001.

From the book world, a notable new name is Lyra, which Philip Pullman invented for His Dark Materials, and which inspired a whopping 152 sets of parents in 2009. And I’m not sure this counts as either art or culture, but Chardonnay (in various different spellings, from Chardenay to Chardae to Chardonnai) explodes in popularity after Footballers’ Wives.

And you don’t have to be a pop star or a sporting hero to popularise a name: the fastest-rising boy’s name between 2006 and 2010 was Grayson. The only famous Grayson I know is the ceramicist Grayson Perry. So you can be an artist too.


Trends for Trinity since 1996

Trends for Chard- since 1996

Trends for Grayson since 1996

Finally, we began choosing more unusual names during the last decade and a half. In 1996, the ONS reported 8,671 unique names for 649,488 babies, or roughly 75 babies for each name. By 2010, this had risen to 13,421 unique names for 723,165 babies, or roughly 54 babies for each name. (The ONS does not report names only given to 1 or 2 babies in a year, so a mathematician wouldn’t regard this as proof, but the overall trend is clear.)

And parents consistently show more variety when naming their daughters than their sons. In 2010, there were 7,388 unique names for 352,248 girl babies, but just 6,033 unique names for 370,917 boy babies.
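For anyone who wants to check the arithmetic, here is a quick Python sketch of the babies-per-name sums, using only the ONS totals quoted above. The function name and the labels are my own, purely for illustration.

```python
def babies_per_name(babies, unique_names):
    """Average number of babies sharing each reported name."""
    return babies / unique_names


# (babies, unique names) as quoted in this post.
figures = {
    "1996, all babies": (649_488, 8_671),
    "2010, all babies": (723_165, 13_421),
    "2010, girls": (352_248, 7_388),
    "2010, boys": (370_917, 6_033),
}

for label, (babies, names) in figures.items():
    print(f"{label}: {babies_per_name(babies, names):.1f} babies per name")
```

The lower the figure, the more varied the names. The girls' figure for 2010 comes out lower than the boys', which is the same point as above in a single number.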

What trends have I missed? Let me know in the comments.

A note on colour: I really wanted to avoid using pink for female names. I tried green and purple, but the visual contrast was poor, and early testers found it confusing. So I used relatively un-girly dark-red pink. Sorry, pink haters.

Try looking for your own name on the site: England & Wales Baby Names. If you want to do your own analysis, please see the ONS raw data, or my aggregated dataset (reproduced under the Open Government Licence), and check out the script I used to identify trends.

Introducing… What Size Am I

For many women, there are few things more frustrating than trying on clothes. To put it in terms that my (mostly male) coder friends will understand: debugging CSS doesn’t come close to the blood-boiling irritation of trying to work out whether you are a size 8, a size 10, or both. Because, yes, you can be one size for tops and another for skirts, all in the same shop.

It may surprise men reading this to learn that there is no agreement on what makes a size 10. Shops differ. A lot. When I am shopping on the high street, I take each item into the changing room in two or three different sizes. I’m sure you can see that this is even more of a problem when shopping online.

Anyway, here is my attempt to help make sizing a bit easier for female customers, inspired by this New York Times article about the madness. The Times pointed out the problem, but they didn’t turn it into a solution – that’s what I’ve tried to do here, having noticed that most stores do publish their own size details online.

And so: presenting “What Size Am I?”, a web app to help women in the UK and the US find clothes that fit.

Here is a screenshot:

What Size Am I? page

As a female hacker, this combines two of my main interests in life: clothes and nice tech. If you’re using a modern browser with SVG support, you should be able to enter your bust, waist and hip measurements in inches or cm, and see an interactive graph of where you fit, from roomy Jaeger to tiny Reiss. If you’re using IE8 or below, you’ll just see a table (sorry IE-using folks).

I’ve also included the closest fits of all (using an admittedly blunt least-squares metric), because it’s helpful to know a shop or two where you’re guaranteed to find things that fit. Currently that’s the kind of knowledge only gained after a lot of Saturday afternoons struggling with a lot of zips.
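To give a rough idea of what a least-squares match looks like, here is a small Python sketch. It is not the code behind the site: the shops, the size charts and the function names are all invented for illustration, and I am assuming a plain sum of squared differences across bust, waist and hips with no weighting.

```python
def squared_distance(measurements, size_dims):
    """Sum of squared differences between your measurements and one size."""
    return sum((m - s) ** 2 for m, s in zip(measurements, size_dims))


def closest_fits(measurements, charts, top=3):
    """Best-fitting size in each shop, ordered from closest fit to worst."""
    best = []
    for shop, sizes in charts.items():
        label, dims = min(
            sizes.items(),
            key=lambda item: squared_distance(measurements, item[1]),
        )
        best.append((squared_distance(measurements, dims), shop, label))
    return sorted(best)[:top]


# Hypothetical size charts: (bust, waist, hips) in inches. Not real shop data.
charts = {
    "Shop A": {"8": (32, 25, 35), "10": (34, 27, 37), "12": (36, 29, 39)},
    "Shop B": {"8": (33, 26, 36), "10": (35, 28, 38), "12": (37, 30, 40)},
}

me = (35, 28, 37.5)
for distance, shop, label in closest_fits(me, charts):
    print(f"{shop} size {label}: squared distance {distance}")
```

It is blunt because an inch out at the waist counts exactly the same as an inch out at the hips, which is not how clothes feel in practice.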

While working on this, I noticed some interesting trends. Firstly, all stores size in evenly spaced increments – because they are using fitting models rather than individual models for each size – but different stores aim for different markets.

Some retailers seem to cover pretty much every widely available size – in the UK, these include Gap, Marks & Spencer, Monsoon, and Next:

What Size Am I? page

Others are unashamedly aimed at what I call the “fashionable midget” end of the market, like TopShop, Banana Republic and Kate Middleton’s beloved Reiss:

What Size Am I? page

Secondly, I assumed that the fashionable-midget and pricier stores would size smaller, but that’s not actually true. Counter-intuitively, a size 10 in upmarket Whistles, Zara, or Reiss is actually quite a lot larger than a size 10 in ASOS, Monsoon, or M&S.

I think that’s because the “whole of market” stores have larger gaps between their sizes. Or it might be vanity sizing, because Whistles, Reiss et al probably have wealthier, older customers. Who knows?

Thirdly, this is really best shown by comparing sizes with your own body shape, but it’s possible to see the different body types that different shops fit. Compare LK Bennett (light blue) with TopShop (dark blue):

What Size Am I? page

The light blue curves are much, well, curvier than the dark blue. LK Bennett is cut for the strongly hourglass, and slightly pear-shaped: TopShop is more up-and-down.

Broadly and unscientifically speaking, M&S, Karen Millen and French Connection look the most pear-shaped to me: Banana Republic and Warehouse look best for the top-heavy: LK Bennett and Zara are cut for a fitted waist, while Oasis and TopShop appear least curvy overall.

This is pleasing, because it confirms the suspicions I’ve held for a long time. I hope you find the tool useful: if you see anything I could do better, please let me know in the comments.

PS: OMG, D3 FTW

Building this has been an excuse to play with D3.js, the JavaScript library that succeeded Protovis, which I use to draw the chart. D3 is awesome: many thanks to Mike Bostock for building it and making it open source.

Train times v. house prices: the commuter belt, on a graph

We’re house-hunting. And for me, like most coders, house-hunting involves lots and lots and lots of screen-scraping.

As well as crawling Rightmove listings, I’ve been looking at transport and house-price data. Specifically, I’ve scraped travel times to London by train versus house prices, to examine the theory that houses get much cheaper once you escape the commuter belt.

To test this, I gathered mean journey times to London from TrainTimes for every railway station in Great Britain, and mean asking prices for 3-bed houses near each station from Nestoria. Here’s the graph of all stations, with a moving-average line added:

Interactive graph: mean asking price for a 3-bed house against mean journey time to London, by station, with a moving-average line

Thoughts on the graph

  • The sharp initial drop, up to about 30 minutes, must show just how much extra you pay to live in zone 2 rather than zone 6 of London itself. Yikes.
  • Prices do start dropping more steeply about 70 minutes from London, which probably marks the edge of the commuter belt.
  • Once you get to about 150 minutes, prices flatten. Except…
  • …There’s a distinct “Edinburgh bump” at about 270 minutes from London, which I wasn’t expecting at all.
  • There are a few high outliers, presumably where a mansion has skewed the average price. (It’s difficult to tell from the Nestoria data.)
  • But there’s a striking baseline below which house prices near a station never fall. Actually, pretty much the closest thing to an outlier on the downside is poor old Corby.

About the data

For clarity, the graph excludes London stations, and the long tail of stations that are 400-900 mins from the capital, mostly in the Scottish Highlands.

This is roughly what I did:

  • Find and geocode the 2500+ stations in England, Scotland and Wales, from this Guardian version of Office of Rail Regulation station usage data.
  • For each station, find the mean travel time for the first 5 journeys to London after 8am on a weekday, scraped from TrainTimes, Matthew Somerville’s accessible version of National Rail Enquiries.
  • For each station, find the mean asking price for a 3-bed house within 2km in the past 6 months, from the Nestoria API. (Nestoria shows listing prices, rather than transaction prices like Zoopla, so it may contain duplicates and is probably less accurate – but Zoopla isn’t granular enough to search just for 3-bed houses.)
  • Plot the moving average price, with a frame of 100 datapoints.
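The last step can be sketched in a few lines of Python. This is an illustration rather than the code linked below: the function name is made up, the sample data is invented, and I am assuming that stations are sorted by journey time and that each frame is averaged on both axes. Whether the frame should be trailing or centred is a choice the description above doesn't settle.

```python
def moving_average(points, frame=100):
    """Smooth (minutes, price) points with a sliding frame.

    Points are sorted by journey time. Each output point is the mean
    journey time and mean price of `frame` consecutive stations.
    """
    ordered = sorted(points)
    smoothed = []
    for start in range(len(ordered) - frame + 1):
        window = ordered[start:start + frame]
        mean_minutes = sum(minutes for minutes, _ in window) / frame
        mean_price = sum(price for _, price in window) / frame
        smoothed.append((mean_minutes, mean_price))
    return smoothed


# Invented sample data: (minutes to London, mean asking price in pounds).
stations = [(30, 400_000), (10, 600_000), (20, 500_000), (40, 300_000)]

# A frame of 3 keeps the example readable; the graph used 100.
print(moving_average(stations, frame=3))
```

With fewer stations than the frame size this returns an empty list, so a frame of 100 only makes sense with the full set of stations.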

This is the code I used (on GitHub), and the resulting raw data (in Fusion Tables). The next logical step would be to plot distances against house prices, I guess. If I’ve missed anything, let me know.

And with that, back to the screen-scrapers, the mortgage brokers and – God help us – the estate agents.