Mapping £2.6 billion of farm payments

Farmers in England are paid via the Environmental Stewardship Scheme to keep their land in good agricultural and environmental condition. This scheme is smaller than the better-known CAP payments, but still accounts for around £2.6 billion of public funding over the past decade, which isn’t exactly small potatoes (sorry).

I recently discovered there’s open data about the land in England that’s funded in this way. With Brexit hurtling unstoppably towards us, and no-one sure what’s going to happen to farm funding, it seems like a good time to map it.

From talking to farmers, Environmental Stewardship seems broadly positive: it encourages good land management, and provides life support for many small farms. But on the other hand, it’s complex, it subsidises golf courses and grouse moors, and it’s weighted towards bigger landowners. So the current uncertainty could provide an opportunity to simplify and rebalance.

Here, I write about what’s in the data: if you just want to see the map of all payments, go to farmpayments.anna.ps. You can search by payee name, and see the payments near you.

Screenshot of farm payments map

What’s Environmental Stewardship then?

The ES scheme pays farmers to keep their land in good environmental condition, and make it attractive for wildlife. For example, they might leave a strip at the edge of a field unplanted, so that animals and birds can live there, or they might maintain traditional hedges or preserve historic features.

The payments are awarded for entry level, higher level, or organic stewardship. Payments are generally awarded over 5 or 10 years, and the total sum in the dataset (which I think is all current agreements) is £2.64 billion. This compares with annual CAP payments of about £3 billion.

There are just over 26,000 unique ES payments active, to around 20,000 payees. The names in the dataset are the managers of the land, not the owners (otherwise I’d be writing about this over at Guy Shrubsole’s wonderful Who Owns England). Sometimes the managers are individuals, sometimes companies, sometimes LLPs and trusts.

The distribution of plot size is highly unequal: if you rank all the plots by size, the top 10% of payees hold 48% of all the physical land area. The distribution of payments is even more unequal. The bottom 15% of payees receive less than £2,500, while the top 10% each receive more than £259k. About 54% of all the funding – £1.44 billion in total – goes to the top 10%.

(The Gini coefficient – the standard measure of inequality – for the land size of the holdings is 0.62, and for payments is higher at 0.71, which again suggests that the schemes are easier to access for bigger payees.)
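If you want to reproduce this kind of figure yourself, here is a minimal Python sketch of the two measures used above: the Gini coefficient and the share held by the top 10%. It is not the code behind the numbers in this post, and the sample payment values are made up purely for illustration.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means everyone holds the same amount; values near 1 mean
    one holder has almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending data, with ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(values, fraction=0.10):
    """Share of the total held by the largest `fraction` of holders."""
    xs = sorted(values, reverse=True)
    k = max(1, round(len(xs) * fraction))
    return sum(xs[:k]) / sum(xs)


if __name__ == "__main__":
    # Hypothetical payments in pounds, not taken from the real dataset.
    payments = [2000, 2500, 4000, 8000, 15000, 30000, 60000, 90000, 150000, 400000]
    print(f"Gini: {gini(payments):.2f}")
    print(f"Top 10% share: {top_share(payments):.0%}")
```

Run over the real payee totals (one number per payee), the same two functions should give figures comparable to the ones quoted above.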

The total area of farmland covered by current agreements is just under 3.8 million hectares. The total farmed area of England is about 9 million hectares, so wherever you are in the English countryside, there’s a good chance you’re looking at land that’s been physically shaped by this scheme.

Biggest payees

Here are the top 15 payees overall. The top payees are wildlife and heritage trusts, and then some big farming groups, generally in the east of the country:

  1. National Trust £51,213,835.15
  2. RSPB £41,228,907.04
  3. VERDERERS OF THE NEW FOREST £19,131,601.84
  4. Forest of Dartmoor Commoners Association £13,600,496.27
  5. NORFOLK WILDLIFE TRUST £10,499,614.08
  6. Surrey Wildlife Trust £7,451,642.10
  7. The Hampshire and Isle of Wight Wildlife Trust £6,974,996.65
  8. STANFORD SHEEP £6,230,709.63
  9. Moorhouse Commoners Committee £5,296,905.74
  10. LILBURN ESTATES FARMING PARTNERSHIP £4,766,364.45
  11. Sir Richard Sutton Limited £4,292,253.83
  12. YORKSHIRE WILDLIFE TRUST £4,223,676.92
  13. ELVEDEN FARMS LIMITED £4,066,993.72
  14. ALBANWISE LTD £3,985,773.08
  15. Lincolnshire Wildlife Trust £3,939,815.48

Some of these “commoners” committees are actually groups of sheep farmers.

You’ll notice the Lilburn Estates entry at number 10, with nearly £4.8 million of grants – that’s a huge grouse moor in Northumberland. The Guardian recently covered a Friends of the Earth investigation suggesting that grouse moor management was anything but environmental, with large-scale heather burning.

As well as these subsidies, the estate receives more than £1 million per year from CAP. It’s owned by Duncan Davidson, founder of mega-housebuilder Persimmon Homes. Here’s the extent of the estate:

Screenshot of Lilburn Estates

Notable payees

As far as I know, anyone can apply for environmental stewardship funding, as long as they do the work required. You don’t need to be a farmer, and the rest of your use of the land doesn’t need to be environmentally friendly.

As well as grouse moors like the above, there’s at least £3.2 million going to golf clubs around the country, though many would regard the mere existence of golf clubs as environmentally problematic. For example, Sunningdale Golf Club, near Virginia Water, has been granted £348,839 for “higher level stewardship”:

Screenshot of Sunningdale golf club

There are some surprising grantees, who might well be doing environmental work, but who could probably afford to do it without public money. Eton, Winchester, Millfield, and Wellington schools all receive funding, as do Jesus, Caius, and Pembroke at Cambridge. The University of Oxford Botanic Garden gets £55k, and Christ Church Meadows in Oxford receives £33k:

Christ Church meadows
Christ Church meadows, looking poor. Pic by Tejvan Pettinger.

There’s also money going to some of the wealthiest landowners around. You can see by searching the payees list that the City of London Corporation gets over £2 million for large areas outside London, the Duchy of Cornwall gets £68k directly and another £136k for the Duchy Home Farm, and the Royal Farms at Windsor receive over £1 million for ‘organic stewardship’:

Screenshot of Royal Farms

Some of the land for which grants are awarded is held offshore – nothing illegal about that, but we might ask whether we want to subsidise property owned in tax havens. The commercial pheasant shoot at the Downton Estate, west of Ludlow, has been granted £800k. The land on which it runs was bought by an Isle of Man company in 2010, as you can see on the Private Eye map of overseas land ownership that I built.

Sometimes, grants go to landowners who are both extraordinarily wealthy, and use offshore vehicles. The Marquess of Cholmondeley has a net worth of £60m, according to the Sunday Times Rich List. His estate at Harpley in Norfolk is owned in Jersey, and receives £400k per year from CAP, and £500k total from ES.

Or take the Culham Court Estate outside Henley, bought for £32 million in 2006 by Swiss billionaire Urs Schwarzenbach. (Schwarzenbach is the delightful chap who sacked his gardener for getting injured.) The land receives £120k per year from CAP and £250k total for ES: the Eye’s map shows that the land is owned by a British Virgin Islands company.

What happens now?

While some of the above might seem absurd, incentivising environmental management of land is obviously sensible. And there’s no doubt that the ES scheme supports many small family farms.

But many bigger landowners could be asked to carry out environmental work without subsidy. It also seems clear that the schemes are easier for ‘big farmer’ to access, while small farms have it tough. (There’s a great discussion of the context in this London Review of Books article.)

So whether our post-Brexit priorities are to support family farms, produce cheap food, protect heritage, or encourage diverse wildlife, we need to discuss how we fund the countryside. Many post-Brexit discussions would benefit from a bit of data: I hope the payments map will help people working on this problem.

—–

Thanks to Will Perrin, Seb Bacon, Guy Shrubsole, and Charlie Fisher for comments on the first draft of this post.

My year-and-a-bit working on tech-for-good projects

In the past year or so I did a lot of work on public-interest tech and data projects. I was so busy writing code, designing systems and hiring people that I failed to write anything at all about why these projects were worthwhile, and the sort of design and engineering challenges I had to overcome.

If you’re even slightly into projects that use data and coding for public good, I hope you’ll find this write-up at least mildly interesting!

Work for Private Eye

Once in a while, a dream project comes along. This was the case when a Private Eye journalist called Christian Eriksson wrote to say that he’d obtained details of all the UK properties owned by overseas companies via FOI, and wanted help with the data. This is how I came to build the Overseas Property map for Private Eye, which lets you see which of your neighbours own their property through an overseas company. I’ll write more about the tech side of this separately at some point, but essentially the map shows 70,000 properties, two-thirds of which are owned in tax havens.

Detail from the Private Eye offshore map
A detail from the map showing streets in Mayfair – whole blocks are owned by overseas companies.

Christian and fellow Eye hack Richard Brooks wrote more than 30 stories about the arms dealers, money launderers and tax avoiders hiding property via these companies – the stories eventually became a Private Eye Special Report. The map was discussed in Parliament, written up in the FT, and the government eventually released the same data publicly.

This December, the investigation and map were nominated for the British Journalism Awards, in the ‘digital innovation’ and ‘investigation of the year’ categories, so I got to go to a fancy awards party. (Not too fancy – the goodie bag consisted of some nuts and a bottle of Heineken.) We were highly commended in the ‘digital innovation’ category, which was nice.

I also worked on another project for Private Eye. Freelance journalist Dale Hinton spotted that some local councillors (amazingly!) choose not to pay their council tax, and dug out the numbers across the country. Then the Eye’s Rotten Boroughs editor, Tim Minogue, suggested mapping the data. The resulting map just shows the number of rogues in each council. There were some creative excuses from the rogues, but my favourite was the councillor who admitted simply: “I ballsed up”.

Tech lead at Evidence-Based Medicine DataLab

My day job for most of 2016 was as tech lead at the Evidence-Based Medicine DataLab at the University of Oxford. This is a new institution set up by the brilliant Dr Ben Goldacre (of Bad Science fame). Evidence-based medicine uses evidence to inform medical practice, and the Lab aims to extend that by helping doctors use data better. I was the first hire.

As you might expect, this was a fascinating and rewarding job. I led on all the technology projects, collaborated on research, and helped build the team from 2 to 9 full-time staff, so a big chunk of my year was spent recruiting. In many ways 2016 was the year when I stopped being ‘just a coder’, and started to learn what it means to be a CTO. Here are some of the projects I worked on.

OpenPrescribing

I got the job at EBM DataLab on the strength of having been the sole developer on OpenPrescribing, collaborating with Ben and funded by Dr Peter Brindle at West of England Academic Health Sciences Network. This site provides a rapid search interface, dashboards and API to all the prescribing decisions made by GPs in England since 2010. Basically, it makes it easier to see which medicines were prescribed where.

The big challenge on this project was design and UX. I interviewed doctors, prescribing managers and researchers, and we ended up with dashboards to show each organisation where it’s an outlier on various measures – so each GP or group of GPs can quickly see where it could save money or improve patient care.

The charts use percentiles to allow users to compare themselves with similar organisations, e.g. here’s how the group of GPs in central Birmingham used to prescribe many more expensive branded contraceptive pills than similar groups elsewhere, but improved things recently:

Cerazette chart for NHS Birmingham Cross-City CCG
If this group of GPs prescribed branded contraceptives in the same proportion as the median (blue dashed line), they would have spent about £30,000 less in the past six months alone. This is the exact same drug – the only difference is the brand name.
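To make the idea behind these charts concrete, here is a rough Python sketch of comparing one organisation against its peers and estimating what it would save by prescribing at the median. This is an illustration only, not OpenPrescribing’s actual code: the function names, the per-item cost difference and the peer figures are all hypothetical.

```python
from statistics import median


def percentile_rank(value, population):
    """Percentage of the population with a value at or below `value`."""
    at_or_below = sum(1 for v in population if v <= value)
    return 100 * at_or_below / len(population)


def saving_at_median(branded_items, total_items, extra_cost_per_branded_item,
                     peer_proportions):
    """Rough saving if the branded proportion dropped to the peer median.

    Returns 0.0 for organisations already at or below the median.
    """
    target = median(peer_proportions)
    current = branded_items / total_items
    if current <= target:
        return 0.0
    # Items that would have been generic if prescribing matched the median.
    excess_items = (current - target) * total_items
    return excess_items * extra_cost_per_branded_item


if __name__ == "__main__":
    # Hypothetical proportions of branded items for five peer organisations.
    peers = [0.1, 0.2, 0.3, 0.4, 0.5]
    print(percentile_rank(0.4, peers))            # where this organisation sits
    print(saving_at_median(400, 1000, 5.0, peers))  # pounds saved at the median
```

Ranking by percentile rather than raw spend is what lets a small practice and a big one be compared on the same chart.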

There’s also a fast search form for users who know what they’re looking for, and an API that lets researchers query for raw data files. Technically, it’s a Postgres/Django/DRF back-end, and JavaScript front-end with Highcharts to render the graphs (code here).

The raw data files are so unwieldy that previously (we were told) they were only really used by pharma companies, to check where their drugs were being under-prescribed and target marketing accordingly. In fact, we heard that lobbying from pharma was what got the NHS to release the open data in the first place!

OpenPrescribing was also an interesting technical challenge, because the dataset was reasonably large (80GB, 500 million rows), and users need to run fast queries across all of it. Since I didn’t have millions to give to Oracle, which is what the NHS does internally, I used Postgres for our version. With a bit of love and optimisation, it was all performant and scaled well.

Writing papers with BigQuery

As well as building services, EBM DataLab writes original research. Over the year I co-authored three papers, and wrote numerous analyses now in the paper pipeline. I can’t go into detail since these are all pre-publication, but they’re mostly based on the prescribing dataset, about how the NHS manages (or doesn’t) its £10 billion annual prescribing budget.

Probably the most enjoyable technical aspect of last year was setting up the data analysis tools for this – well, I’m not going to call it ‘big data’ because it’s not terabytes, but let’s say it’s reasonably sized data. I set up a BigQuery dataset, which makes querying this huge dataset fast, and as simple as writing SQL. Then I connected the BigQuery dataset to Jupyter notebooks, writing analyses in pandas and visualising data in matplotlib – I highly recommend this setup if you’ve got reasonably sized data to analyse.

Tracking clinical trials

Another project was TrialsTracker, which tracks which universities, hospitals and drug companies aren’t reporting their clinical trial results. This matters because clinical trials are the best way we have to test whether a new medicine is safe and effective, but many trials never report their results – especially trials that find the medicine isn’t effective. In fact, trials with negative results are twice as likely to remain unreported as those with positive results.

The TrialsTracker project tries to fix this by naming and shaming the organisations that aren’t reporting their clinical trials. This was Ben’s idea, and I wrote the code to make it work. It gets details of all trials registered on clinicaltrials.gov that are listed as ‘completed’, and then checks whether their results are published either there or on PubMed using a linked identifier (so a researcher can find them easily). Then it aggregates the results by trial sponsor, showing the organisations with the worst publication record:
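The aggregation step is simple enough to sketch in a few lines of Python. This is not the real TrialsTracker code: the record fields (`sponsor`, `status`, `has_registry_results`, `has_linked_publication`) are hypothetical names chosen for illustration, and the real project has to do a lot more work to get the data into this shape.

```python
from collections import defaultdict


def rank_sponsors(trials):
    """Aggregate completed trials by sponsor, worst reporting record first.

    Returns rows of (sponsor, unreported, completed, unreported_fraction).
    """
    counts = defaultdict(lambda: {"completed": 0, "unreported": 0})
    for trial in trials:
        if trial["status"] != "completed":
            continue
        c = counts[trial["sponsor"]]
        c["completed"] += 1
        # A trial counts as reported if results are on the registry
        # or in a publication linked by the trial identifier.
        if not (trial["has_registry_results"] or trial["has_linked_publication"]):
            c["unreported"] += 1
    rows = [
        (sponsor, c["unreported"], c["completed"], c["unreported"] / c["completed"])
        for sponsor, c in counts.items()
    ]
    # Highest unreported fraction first; ties broken by count, then name.
    rows.sort(key=lambda r: (-r[3], -r[1], r[0]))
    return rows


if __name__ == "__main__":
    # Made-up sample records.
    sample = [
        {"sponsor": "A", "status": "completed", "has_registry_results": False, "has_linked_publication": False},
        {"sponsor": "A", "status": "completed", "has_registry_results": True, "has_linked_publication": False},
        {"sponsor": "B", "status": "completed", "has_registry_results": False, "has_linked_publication": True},
        {"sponsor": "B", "status": "recruiting", "has_registry_results": False, "has_linked_publication": False},
    ]
    for row in rank_sponsors(sample):
        print(row)
```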

Screenshot of TrialsTracker

My approach to this was minimum viable product: it’s a simple, responsive site that clearly lays out the numbers for each organisation, and provides an incentive to publish their unpublished trials (since the data is updated regularly, if they publish past trials, their position in the table will improve over time). We wrote a paper on it in F1000 Research, the open science journal, and the project was covered in the Economist.

The best part of this project was getting numerous mails from researchers saying “this will help me lobby my organisation to publish more”. Yay!

Other projects

I also worked on the alpha of Retractobot (coming soon), a new service to notify researchers when a paper they’ve cited gets retracted. This matters because more and more papers are being retracted, yet they continue to get cited frequently, so bad results go on polluting science long after they’ve been withdrawn. And I built the front-end website for the COMPare project – this is a valiant group of five medical students, led by Ben, who checked for switched outcomes in every trial published in the top five medical journals for six weeks, then tried to get them fixed. (Spoiler: the journals were NOT happy.) Here’s more about COMPare.

Onwards!

After just over a year at EBM DataLab, I decided to move on to pastures new at the end of 2016. I’d had a lot of fun, but the organisation was now stable and mature, and I was keen to explore other interests outside healthcare. I’ve left the tech side of things in the highly capable hands of our developer Seb Bacon, previously CTO at OpenCorporates.

Since then, I’ve been having fun working through a list of about 25 one-day coding and data analysis side projects (of the kind you always want to do, but never have time). These side projects include: several around housing and land, including with Inside Airbnb’s data; statistical methods for conversion funnels; building an Alexa skill; setting up a deep learning project with Keras and TensorFlow to classify fashion images; more work on dress sizing data; and a few data journalism projects.

Longer-term, I’m thinking of joining an early-stage venture as tech lead. If you’d like to chat about the above, or just about anything related to coding, stats or maps, I’m always keen to have coffee with interesting people: drop me a line.

How to use Land Registry data to explore land ownership near you

Land ownership in Britain is secretive, and always has been. About 18% of land in England and Wales is unregistered, and not even the government knows who owns it. Even information about registered land is not freely available – you have to pay Land Registry £3 to find out who holds any piece of land.

But not many people know that you can use Land Registry data to explore land ownership near you, easily and for free. You can’t see who owns what without paying, but you can see the shape of the land that is registered.

Here’s how the data looks for central Oxford. You can see clusters of small plots for houses, much larger areas owned by a single landowner, and big swathes of unregistered land:

Screenshot of all Oxford

You can see the plots for individual houses, which is super useful for house-hunting:

Screenshot at street level

The data you can use to do this is called the INSPIRE Index Polygons. I used it to build the Private Eye map of offshore property ownership.

However, the INSPIRE Polygons come with draconian licensing conditions, imposed not by Land Registry but by Ordnance Survey, the great vampire squid wrapped around the face of UK public-interest technology. So you can’t usually share or republish them without paying huge fees.

As a consequence, no-one has created a convenient way to look at them, and most non-nerds don’t know this data exists. (Well, in theory, there’s some kind of online map viewer on data.gov.uk, which kinda sorta works if you check the checkbox and zoom down to a few streets… but it’s pretty limited.)

So the rest of this post is about how you can legally use this INSPIRE data yourself to explore land ownership near you. No programming knowledge needed.

1. The easier way: use QGIS

This is probably the best approach if the words “edit your PATH variable” don’t fill you with excited anticipation.

First, install QGIS, which is a free GIS desktop tool. Then go to the INSPIRE download page and choose the council you care about. Download the zip file and unzip it.

Open QGIS. Go to Layer > Add Layer > Add Vector Layer. Use “Browse” to find the GML file that you just unzipped, and add it. It may take a little time to import. When it’s imported, you should see something like this:

QGIS data import

Now you want a background map. Go to Plugins > “Manage and Install Plugins”, and search for “Tile Map Scale Plugin”, then install it. Once you’ve installed that, you should see a new panel in the bottom left of the screen. Click on the middle button and add “osm_landscape.xml”. This will hide your INSPIRE layer. In the “Layers” panel, use the mouse to reorder the layers, so the INSPIRE layer is on top:

Screenshot of QGIS with the INSPIRE layer on top of the background map

Bam! Let’s format the INSPIRE layer to make it more useful. Right-click on the “PREDEFINED” layer and open Properties. Drag the transparency slider to about 50%, so you can see the map below each polygon. Click on “simple fill” and adjust the border width to set a thicker border around each polygon. This makes it easier to see individual plots:

QGIS screenshot of INSPIRE ID

And finally let’s show INSPIRE IDs on hover. Back in Properties, click on “Display” and then under “field” choose INSPIREID. Then, from the View menu, make sure “Map Tips” is selected. Now when you hover, you should see the INSPIRE ID of each polygon pop up.

This is useful because if there’s a particular piece of land that interests you, you can search Land Registry by INSPIRE ID and pay your £3 to find out who owns it.

2. The slightly harder way: use CartoDB

CartoDB is basically a geographic database in the cloud. It’s amazing, and easier to use than QGIS, but you’ll have to do some work to get the data into shape first.

First, install GDAL. On OSX, Homebrew is easiest:

brew install gdal

Check that the install worked by typing ogr2ogr in a terminal.

Now change to the directory where the GML file is, and use ogr2ogr to transform the data:

ogr2ogr -f "GeoJSON" inspire.geojson Land_Registry_Cadastral_Parcels.gml -s_srs EPSG:27700 -t_srs EPSG:4326

This transforms the projection of the data from British National Grid to WGS84, and transforms the data format from GML to GeoJSON. This will mean that CartoDB can use it.

(UPDATE: If your final inspire.geojson file is more than 250MB, it’ll be too big for CartoDB’s free tier, and you’ll need to use QGIS instead. Thanks Matthew for reporting that!)
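If you have several councils to convert, you could wrap the same command in a small Python script. This is just a sketch: it runs exactly the ogr2ogr command shown above, and then checks the output against the 250MB limit. The function names are my own, and treating “250MB” as 250 × 1024 × 1024 bytes is an assumption.

```python
import os
import shutil
import subprocess

# Assumption: "250MB" is interpreted as 250 * 1024 * 1024 bytes.
FREE_TIER_LIMIT_BYTES = 250 * 1024 * 1024


def build_ogr2ogr_command(src_gml, dst_geojson):
    """Same arguments as the command above: GML in British National Grid
    (EPSG:27700) to GeoJSON in WGS84 (EPSG:4326)."""
    return [
        "ogr2ogr", "-f", "GeoJSON", dst_geojson, src_gml,
        "-s_srs", "EPSG:27700", "-t_srs", "EPSG:4326",
    ]


def too_big_for_free_tier(path, limit_bytes=FREE_TIER_LIMIT_BYTES):
    """True if the file at `path` is larger than the upload limit."""
    return os.path.getsize(path) > limit_bytes


def convert(src_gml, dst_geojson):
    """Run ogr2ogr and report whether the result fits under the limit."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH; install GDAL first")
    subprocess.run(build_ogr2ogr_command(src_gml, dst_geojson), check=True)
    return not too_big_for_free_tier(dst_geojson)


if __name__ == "__main__":
    ok = convert("Land_Registry_Cadastral_Parcels.gml", "inspire.geojson")
    print("Fits in the free tier" if ok else "Too big: use QGIS instead")
```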

The hard bit is over. Make a free account on CartoDB, then add a new dataset, and upload your new inspire.geojson file:

CartoDB screenshot of upload screen

Again, this may take a while. Once it’s imported, click on “Map View” to see your map:

CartoDB map view screenshot

Wham! Click on “infowindow” in the right-hand menu to show the INSPIRE ID on click or hover, and on “wizards” to change the transparency.

In theory, you could now click “Publish” and create a link to this map to share with family, friends and neighbours. However, under OS’s aggressive INSPIRE terms, you can’t freely use the data for anything except personal non-commercial use, and you mustn’t make the data available to third parties. So that would be highly risky – definitely don’t do that!
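For purely personal use, you can also look up an INSPIRE ID without any map at all, by searching the converted GeoJSON file for the polygon that contains a given point. Here is a minimal Python sketch using a standard ray-casting test. It assumes the ID ends up in a property called INSPIREID (the field name QGIS shows), which I haven’t verified for every file, and it makes no attempt to be fast on large files.

```python
import json


def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Does this edge cross the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def polygon_contains(lon, lat, rings):
    """GeoJSON polygon: first ring is the outline, any others are holes."""
    if not rings or not point_in_ring(lon, lat, rings[0]):
        return False
    return not any(point_in_ring(lon, lat, hole) for hole in rings[1:])


def find_inspire_ids(geojson, lon, lat, id_field="INSPIREID"):
    """Return the IDs of all features whose polygon contains the point."""
    hits = []
    for feature in geojson["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue
        if any(polygon_contains(lon, lat, rings) for rings in polygons):
            hits.append(feature["properties"].get(id_field))
    return hits


def load_geojson(path):
    """Read a GeoJSON file such as the inspire.geojson created earlier."""
    with open(path) as f:
        return json.load(f)
```

Note that GeoJSON coordinates are longitude first, then latitude, so a call looks like `find_inspire_ids(load_geojson("inspire.geojson"), lon, lat)`.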

A word on open data

The government recently announced a consultation on the privatisation of Land Registry. Leaving aside whether or not this is generally a good deal for the taxpayer, it would put Land Registry outside the reach of Freedom of Information.

Land ownership in England & Wales is already incredibly opaque. The government only released this INSPIRE data because of a European directive, which it tried to oppose. Does anyone seriously imagine that transparency over land in Britain will increase after privatisation? No? Thought not. So head over now and respond to the consultation.

UPDATE: David Read points out that the dataset is specifically called the “INSPIRE Index Polygons”. Updated, thanks David!