{"id":70,"date":"2011-10-13T15:16:36","date_gmt":"2011-10-13T15:16:36","guid":{"rendered":"http:\/\/darkgreener.com\/?p=70"},"modified":"2020-09-07T20:21:52","modified_gmt":"2020-09-07T20:21:52","slug":"train-times-v-house-prices-graphing-the-commuter-belt","status":"publish","type":"post","link":"https:\/\/anna.ps\/blog\/train-times-v-house-prices-graphing-the-commuter-belt","title":{"rendered":"Train times v. house prices: the commuter belt, on a graph"},"content":{"rendered":"<p>We&#8217;re house-hunting. And for me, like most coders, house-hunting involves lots and lots and <em>lots<\/em> of screen-scraping.<\/p>\n<p>As well as crawling Rightmove listings, I&#8217;ve been looking at transport and house-price data. Specifically, I&#8217;ve scraped travel times to London by train versus house prices, to examine the theory that houses get much cheaper once you escape the commuter belt.<\/p>\n<p>To test this, I gathered mean journey times to London from <a href=\"http:\/\/traintimes.org.uk\">Traintimes<\/a> for every railway station in the UK, and mean asking prices for 3-bed houses near each station from <a href=\"http:\/\/nestoria.co.uk\">Nestoria<\/a>. Here&#8217;s the graph of all stations, with a moving-average line added:<\/p>\n<div id=\"train_graph\" style=\"width: 100%; height: 400px; margin-bottom: 20px;\">Waiting for graph to load&#8230;<\/div>\n<p data-children-count=\"1\">Mouse over the graph to see data for individual stations. Or type a station name to highlight it on the graph: &nbsp; <input id=\"station_names\" style=\"margin-bottom: 0 ! important;\" size=\"40\" type=\"text\"><\/p>\n<p><strong>Thoughts on the graph<\/strong><\/p>\n<ul>\n<li>The sharp initial drop, up to about 30 minutes, must show just how much extra you pay to live in zone 2 rather than zone 6 of London itself. 
Yikes.<\/li>\n<li>Prices do start dropping more steeply about 70 minutes from London, which probably marks the edge of the commuter belt.<\/li>\n<li>Once you get to about 150 minutes, prices flatten. Except&#8230;<\/li>\n<li>&#8230;There&#8217;s a distinct &#8220;Edinburgh bump&#8221; at about 270 minutes from London, which I wasn&#8217;t expecting at all.<\/li>\n<li>There are a few high outliers, presumably where a mansion has skewed the average price. (It&#8217;s difficult to tell from the Nestoria data.)<\/li>\n<li>But there&#8217;s a striking baseline below which house prices near a station never fall. Actually, pretty much the closest thing to an outlier on the downside is poor old <a href=\"http:\/\/www.youtube.com\/watch?v=zvzPwOsf7fg\">Corby<\/a>.<\/li>\n<\/ul>\n<p><strong>About the data<\/strong><\/p>\n<p>For clarity, the graph excludes London stations, and the long tail of stations that are 400-900 minutes from the capital, mostly in the Scottish Highlands.<\/p>\n<p>This is roughly what I did:<\/p>\n<ul>\n<li>Find and geocode the 2500+ stations in England, Scotland and Wales, from&nbsp;<a href=\"https:\/\/docs.google.com\/spreadsheet\/ccc?key=0AonYZs4MzlZbcktheEZFeF84U1J4dFFvckI5X0VBcEE#gid=4\">this Guardian version<\/a> of&nbsp;<a href=\"http:\/\/www.rail-reg.gov.uk\/server\/show\/nav.1529\">Office of Rail Regulation station usage data<\/a>.<\/li>\n<li>For each station, find the mean travel time for the first 5 journeys to London after 8am on a weekday, scraped from <a href=\"http:\/\/traintimes.org.uk\">TrainTimes<\/a>, Matthew Somerville&#8217;s accessible version of National Rail Enquiries.<\/li>\n<li>For each station, find the mean asking price for a 3-bed house within 2km in the past 6 months, from the <a href=\"http:\/\/www.nestoria.co.uk\/help\/api-metadata\">Nestoria API<\/a>. 
(Nestoria shows listing prices, rather than transaction prices as <a href=\"http:\/\/developer.zoopla.com\/docs\">Zoopla<\/a> does, so it may contain duplicates and is probably less accurate &#8211; but Zoopla isn&#8217;t granular enough to search just for 3-bed houses.)<\/li>\n<li>Plot the moving average price, with a frame of 100 datapoints.<\/li>\n<\/ul>\n<p>This is the <a href=\"https:\/\/github.com\/darkgreener\/house-scraping\/blob\/master\/get_stations.py\">code I used<\/a> (on GitHub), and the <a href=\"https:\/\/www.google.com\/fusiontables\/DataSource?docid=1FpFQL2MHMYkowBgnaZo-WBn3eAHiLSqCUuOm_FQ&amp;hl=en_US\">resulting raw data<\/a> (in Fusion Tables). The next logical step would be to plot distances against house prices, I guess. If I&#8217;ve missed anything, let me know.<\/p>\n<p>And with that, back to the screen-scrapers, the mortgage brokers and &#8211; God help us &#8211; the estate agents. <script type=\"text\/javascript\" src=\"https:\/\/houseprices.darkgreener.com\/jquery-1.6.2.min.js\"><\/script><br \/>\n<script type=\"text\/javascript\" src=\"https:\/\/houseprices.darkgreener.com\/jquery-ui-1.8.16.custom.min.js\"><\/script><br \/>\n<!-- [if IE]><script language=\"javascript\" type=\"text\/javascript\" src=\"https:\/\/houseprices.darkgreener.com\/excanvas.min.js\"><\/script><![endif]--><br \/>\n<script type=\"text\/javascript\" src=\"https:\/\/houseprices.darkgreener.com\/jquery.flot.min.js\"><\/script><br \/>\n<script type=\"text\/javascript\" src=\"https:\/\/houseprices.darkgreener.com\/jquery.csv.min.js\"><\/script><br \/>\n<script type=\"text\/javascript\" src=\"https:\/\/houseprices.darkgreener.com\/chart.js\"><\/script><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We&#8217;re house-hunting. And for me, like most coders, house-hunting involves lots and lots and lots of screen-scraping. As well as crawling Rightmove listings, I&#8217;ve been looking at transport and house-price data. 
Specifically, I&#8217;ve scraped travel times to London by train versus house prices, to examine the theory that houses get much cheaper once you escape [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,8,7],"tags":[],"class_list":["post-70","post","type-post","status-publish","format-standard","hentry","category-life-hacking","category-transport","category-visualisation"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/posts\/70","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/comments?post=70"}],"version-history":[{"count":144,"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/posts\/70\/revisions"}],"predecessor-version":[{"id":1291,"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/posts\/70\/revisions\/1291"}],"wp:attachment":[{"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/media?parent=70"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/categories?post=70"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/anna.ps\/blog\/wp-json\/wp\/v2\/tags?post=70"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}