Ever wondered where your ancestors met their Valentine?

A new website, ‘Named’, created by researchers from UCL’s Department of Geography predicts where lovers met (or could potentially meet) using surnames.

The website, which is part of a wider research project funded by the Economic and Social Research Council (ESRC), invites users to enter two surnames. It then generates a ‘heat map’ of the geographic concentrations of the two names overlain on top of one another, thus identifying areas where the couple most probably met.
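
The overlay idea can be sketched in a few lines of Python. This is an illustration only, not the method the ‘Named’ site uses: the counts, the choice of regions and the `combined_hotspots` function are invented for the example, and multiplying the two surnames' regional shares is an assumed way of combining the two maps.

```python
def regional_shares(counts):
    """Turn raw counts per region into shares that sum to 1."""
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


def combined_hotspots(counts_a, counts_b):
    """Score each region by the product of the two surnames' shares.

    A region scores highly only if both names are concentrated there.
    Scores are rescaled so that they sum to 1 (or are all 0 if the
    two names never share a region).
    """
    shares_a = regional_shares(counts_a)
    shares_b = regional_shares(counts_b)
    regions = set(shares_a) | set(shares_b)
    raw = {r: shares_a.get(r, 0.0) * shares_b.get(r, 0.0) for r in regions}
    total = sum(raw.values())
    if total == 0:
        return {r: 0.0 for r in regions}
    return {r: score / total for r, score in raw.items()}


# Hypothetical bearer counts per region for two surnames.
name_a = {"Blackpool": 80, "Leeds": 15, "Central London": 5}
name_b = {"Blackpool": 30, "Leeds": 50, "Central London": 20}

scores = combined_hotspots(name_a, name_b)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

With these made-up counts the highest score falls on the region where both names are relatively common, which is the intuition behind reading the overlaid heat map.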

Director of the Consumer Data Research Centre, Professor Paul Longley, is leading the project. The data used for the website comes from the Consumer Data Research Centre.

He said: “The website is a quirky start of our research project which is looking into whether our surnames are linked to our geographical locations – something which has been long perceived. It is known that many names remain surprisingly concentrated in specific parts of the UK, and this project helps us extend our understanding of name geography to combinations of names too when we enter relationships.”

Paul said the study so far shows that, on average, surnames have not moved far over the last 700 years.

“Most Anglo Saxon family names came into common usage between the 12th and 14th centuries, and were first coined in particular parts of the country. What is interesting is that most individuals do not move far from their ancestral family homes and so, 700 or more years later, most names can still be associated with particular localities. So if your Valentine is named ‘Rossall’, for example, it is still about 40 times more likely that you met him or her in the environs of Blackpool than in Central London.

“This doesn’t work for all names, however: the geography of many popular family names (like Smith or Brown) is much more evenly spread, although even popular names like Jones, Williams or Davies still have strong regional connotations.

“Different patterns hold for names imported from abroad over the last 60 years or so. Many of these names remain concentrated in major cities and towns, although the overall pattern of such names is becoming more dispersed as migrants assimilate into UK society.”

He added: “With all the current focus on population migration, it is remarkable to see that most individuals and families stay put throughout the generations. As a consequence it is interesting to reflect that names are still often strong indicators of kinship and regional identity.”

Data scientist Oliver O’Brien, who is part of the project team, added: “The maps on our website make predictions based upon geographic patterning, and we are really interested to learn whether we get things right.”

Users of the website are invited to tell the researchers whether the maps really do predict where romance blossomed: email your feedback to n.vij@ucl.ac.uk.

*This article is an edited version of the original press release created by the Economic & Social Research Council. For original press release:

 

How can big data help deliver sustainability strategies?

Save the date: Monday 25th April 13.00 BST – CDRC and Innovation Forum webinar on big data and sustainability.

With: William Young, professor of sustainability and business, University of Leeds; Chris Brown, senior director of sustainable business, Asda; Andy Peloe, concept manager, Callcredit; Wouter van Tol, director of sustainability and citizenship, Samsung. 

Register your interest to attend.


 

The rapid growth of “big data” presents companies with real opportunities to improve business performance. Here are ten.

Over the last few years there has been much talk about how so-called “big data” is the future and if you are not exploiting it your company is losing its competitive advantage.

Well, luckily for sustainability professionals, we have all been using types of big data for a long time, such as energy flows in companies, environmental lifecycle assessments of products, greenhouse gas emissions reporting, waste recycling indicators, resource use balances, transport systems and so on.

Data and information are the day-to-day business of sustainability professionals, and they are well qualified to take advantage of big data to better understand the environmental aspects or risks of their company, product or service.

So what is there in the latest wave of enthusiasm on big data to help organisations achieve sustainability strategies and competitive advantage?

Data growth 

There is better data breakdown and new forms of data such as sales data, loyalty card data, social media, product sensors, new monitors and mobile phone data. There are lots of these data, often in real time and there are many ways to analyse and model them. This is nicely summarised in the famous “four Vs” of big data from IBM (volume, velocity, variety and veracity).

I think there are ten opportunities for sustainability professionals to use big data.

(1) Gaining greater detail behind global sustainability performance indicators. For example, smart meters on production lines, in retailers, on products or in people’s homes can produce a better understanding of energy use in the system.

(2) Accessing supply chain data more readily. There is an opportunity to access data from global suppliers up and down the supply chain more readily, in a timelier fashion and with better accuracy. This will help companies make better decisions about product or service changes, knowing the associated sustainability implications. As climate change impacts global supply chains, this data may help adaptation and resilience of supply.

(3) Gaining an insight into supply chain logistics and customer transport habits. There is now the ability to use mobile phone data to identify patterns in transport networks, giving the opportunity for better planning for more efficient use of fuel and reduced congestion. This may also give consumers better opportunities to change to cleaner forms of transportation.

(4) Predicting changes in behaviour from social media. This is one of the most talked-about aspects of big data and yet the most technically difficult. Much social media data is unstructured and comes as pictures, pixels or abbreviated language. But there are opportunities to see how individuals react to an emerging sustainability issue or a new technology.

(5) Social media is a good way for sustainability professionals to identify up-and-coming sustainability issues from their own stakeholders. These may be key local NGOs, community leaders, political leaders, suppliers, competitors, employees as well as customers. Identifying opinion formers is vital for filtering the volume of social media.

(6) Customer behaviour with products and services. As companies try to influence consumers to reduce the environmental impacts in the use phase of products and services, getting feedback on the effectiveness of these interventions is important for future strategy.

(7) Transparency to customers and NGOs. Access by consumers to the data behind product eco-labels, or working condition audit results from the factories producing their products, is important for confidence. Better presentation, accuracy and timeliness of this is an advantage.

(8) Better marketing or targeting of greener products, services and corporate sustainability programmes. Being able to better segment and directly contact potential customers with personalised promotions is already being developed. This can help in the sustainability arena as well.

(9) Interaction with consumers and stakeholders in the shared or collaborative economy. The growing ability to share resources between companies and consumers has been facilitated by social media. Entrepreneurs are already in this space with apps allowing sharing of food leftovers or power tools. There are great opportunities for this to be developed further, reducing the material flow through society using different business models.

(10) Growing emphasis on smart cities, combined with the development of “mega cities” where the majority of the world population may live. Smart energy, water, waste and transport grids are just one area; buildings that can heat and cool themselves more smartly are another opportunity.

Don’t get lost!

There are some difficulties with big data that sustainability professionals need to be aware of.

Firstly, getting lost in the enormous amount of data is easy, so having objectives or research questions is essential. Secondly, a few big corporations have been quick to jump on correlations between different data sets without common sense kicking in quickly enough to recognise that there cannot be a causal link. Finally, there are the ethics of the privacy of individuals and communities, which need to be protected even if the data is publicly available.

Overall there is much here for sustainability professionals to work on and to improve the sustainability performance of their company’s operations, products, services, supply chains and even customers. However, as much data as possible needs to be open access for consumers, researchers, local communities and innovators for big data to have the biggest benefit for people and planet.


 

This article was originally published on www.innovation-forum.co.uk 

ESRC’s Bruce Jackson on the CDRC

The Economic and Social Research Council (ESRC) is the largest funder of social and economic research in the UK, supporting high-quality, independent research that helps to address some of the key societal challenges we face today. The ESRC Big Data Network brings together a series of highly innovative ESRC investments seeking to facilitate access to a greater range and variety of data to support high-quality social science. In making new forms of data available for research purposes, the ESRC Big Data Network has the potential to fundamentally transform the social sciences.

Led by two world-class research teams based at UCL and the University of Leeds, the Consumer Data Research Centre has created cutting-edge physical and virtual infrastructures which facilitate access to, and linking of, business and local government data in safe and secure settings. The Centre is at the forefront of ESRC’s work to open closed, proprietary datasets for research purposes and is demonstrating and exploring novel forms of partnership working with data owners which both support world-class social science and deliver real value for our partners. By working through the methodological, ethical, legal and technical challenges associated with accessing, curating and analysing these datasets, the Centre is providing a foundation upon which the future of innovative, impactful social science can be built.

James Cheshire live at Cartographic Summit 2016

CDRC’s Assistant Director Dr James Cheshire will be delivering a talk at this year’s ‘Cartographic Summit: The Future of Mapping’, to be held in Redlands from 8 to 10 February.

Titled ‘Lightning Talk on Data’, James’ talk will be live-streamed on Monday 8 February at 17.30 UK time (GMT), 09.30 PST.

To access this and other live streams for the event:
Follow the link here 
Enter the password ‘carto.summ’ when prompted.

The summit aims to explore research, innovation and strategic thinking to support cartographic needs as new data, information and infographics revolutionize mapping. The intent is to set a marker for understanding common challenges from a range of perspectives in and outside the traditional cartographic communities; to draw together different ways of thinking and working; and to build bridges across the many communities in the mapmaking and visualization fields.

For those unable to tune into the live stream, keep track of our website as we will aim to post a video of James’ talk in the future.

For the full agenda click here.

For Twitter coverage, follow the hashtag #cartosummit on twitter.com.


Using Big Data to Identify Outbreaks of Food Poisoning

 

In October 2015, I started a part-time PhD in the Consumer Data Research Centre on the topic of Spatial Data Analytics for Food Safety. This is a White Rose DTC project which forms part of the Big Data and Food Safety Network, funded by the Economic and Social Research Council (ESRC). I’m roughly three months into what is going to be a long (five-year) journey and although I’m trying to keep things general at the moment, I’m beginning to have a vague idea of where this project might take me.

Every year there are around 500 000 known cases of food poisoning in the UK and a further 10 million cases of gastrointestinal illnesses which may be food related. As the NHS recommends that people recover from mild bouts of food poisoning at home, without visiting their GP, the true number of cases is hard to estimate via traditional methods. The Food Standards Agency (FSA) is a case partner for this project and they are particularly interested in how big data and social media can be used to more effectively monitor cases and outbreaks of food poisoning in the UK.

Source: Sadilek et al. (2013)

 

In the US, a number of studies have shown that Twitter and other online data sources can be used to undertake syndromic surveillance and identify whether people are suffering from known symptoms. Although the majority of these studies are influenza-focused, the methodologies are also relevant for monitoring food poisoning. These studies show that a language model can be used to retrieve food-related messages from Twitter and, through natural language processing, identify whether or not the author is suffering from a food-borne illness. In some cases, geo-located Tweets can be traced to a specific restaurant location, which is particularly useful for monitoring outbreaks of food poisoning, where more than one person has been infected at the same origin. Online restaurant reviews are somewhat simpler to process than Tweets, as they are restaurant specific and require less filtering. An author will often specify the type of food eaten, which may be useful for identifying new food pathogen vehicles.

The timely reporting of foodborne illness is an essential component in avoiding a large-scale epidemic.  Social media data and online restaurant reviews can be collected in near-real time and are therefore much timelier than traditional data sources such as GP visits or FSA inspection reports which can take up to two weeks to process.  Despite their timeliness, information reported via Twitter and online reviews needs to be handled carefully. Food-borne pathogens have extremely varied incubation periods (typically between 1 and 28 days) making it difficult to attribute illness to a specific food establishment. For this reason, these data sources may not be suitable for monitoring food-borne illness caused by certain pathogens.
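
To see why a wide incubation range makes attribution hard, here is a minimal Python sketch. It takes the 1 to 28 day range quoted above as its only real input; the function names, the visit list and the dates are all hypothetical, and a real analysis would use pathogen-specific ranges.

```python
from datetime import date, timedelta

# Incubation range quoted in the text (days); real ranges vary by pathogen.
MIN_INCUBATION_DAYS = 1
MAX_INCUBATION_DAYS = 28


def exposure_window(symptom_date,
                    min_days=MIN_INCUBATION_DAYS,
                    max_days=MAX_INCUBATION_DAYS):
    """Return the earliest and latest dates on which exposure could have occurred."""
    earliest = symptom_date - timedelta(days=max_days)
    latest = symptom_date - timedelta(days=min_days)
    return earliest, latest


def plausible_visits(symptom_date, visits):
    """Keep only the visits that fall inside the exposure window."""
    earliest, latest = exposure_window(symptom_date)
    return [(name, day) for name, day in visits if earliest <= day <= latest]


# Hypothetical visits by one person who reports symptoms on 20 January 2016.
visits = [
    ("Cafe A", date(2015, 12, 1)),
    ("Restaurant B", date(2016, 1, 5)),
    ("Takeaway C", date(2016, 1, 19)),
    ("Bar D", date(2016, 1, 20)),
]
candidates = plausible_visits(date(2016, 1, 20), visits)
print([name for name, _ in candidates])
```

Even in this toy case two different establishments remain plausible sources, so a single message or review cannot pin the illness on one of them.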

I’m still knee-deep in literature at the moment but the broad objectives of this project are:

  • To evaluate the availability and quality of data from a variety of sources relating to demographics, movement patterns, social media messages, the quality and performance of facilities, and health outcomes;
  • To construct spatial-temporal models of food safety across a country or region and to assess the effectiveness and robustness of that model;
  • To explore the utility of the models as a means for targeting scarce resources and to suggest other means for the extraction of value from the application of this research.

In the next few months, I hope to start collecting online restaurant review data to carry out some preliminary analysis against the FSA inspection data. Watch this space!

Rachel Oldroyd is a part-time PhD student at the Consumer Data Research Centre. She also teaches the face-to-face and distance-learning MSc GIS courses in her role at the Centre for Spatial Analysis and Policy (CSAP) at the University of Leeds.

 

CDRC in the News

We’ve had a busy few months since we launched the new CDRC website and datastore in October.  You may have seen some of our work in various regional and national newspapers.  Here is a summary of some of the media highlights:

The Guardian
How cities grow: the age of houses – mapped


Time Out
How Londoners commute to work


ESRC Britain in 2016 Magazine
The CDRC featured in the ‘Retail Revival?’ article in the recent Britain in 2016 magazine.  Copies are available to buy at WH Smith.


The Bristol Post
Unique maps chart history of housing development in Bristol


Midlands Express & Star
Average Black Country house prices vary by £200k


 

Where do people cycle to work? Popular cycle commuter routes visualised

Research funded by the Department for Transport illustrates where the most popular commuter routes for cycling are around the country.

CDRC researcher Robin Lovelace is part of the team developing the Propensity to Cycle Tool (PCT), an open source transport planning system developed by academics at the Universities of Leeds, Cambridge, Westminster and the London School of Hygiene and Tropical Medicine.

Maps produced by the PCT are shown below for Leeds, London and Cambridge to demonstrate where the most popular routes are, based on the 2011 Census. Only the top 10 most cycled ‘desire lines’ are shown in the maps below. The tool also provides evidence on where there is most potential for growth (not shown).

The PCT aims to inform decisions about where to build new cycle infrastructure, such as two-way cycle paths in place of a lane for motorised traffic, as is happening in parts of London. This should make cycle commuting a more attractive option nationwide, building on research into scenarios of growth in cycling at the city level.


The maps below, generated by an early prototype version of the PCT, show popular cycling routes in Leeds, London and Cambridge to provide a taster of the types of route that are favoured by commuter cyclists.

The types of routes cycle commuters use vary from city to city and in the future the geographical distribution of cycling to work could shift as cycling grows as a form of transport.

The thickness of each line shows how heavily that desire line was cycled in 2011, and the shade of each zone represents how much cycling there is in that zone: the more yellow the zone, the more people living in that zone cycle to work.
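
The selection and styling described above can be sketched in Python. This is not the PCT's code: the flow records, the coordinates (in kilometres on a flat grid) and all function names are hypothetical, and proportional line widths are an assumed styling rule.

```python
import math


def top_desire_lines(flows, n=10):
    """Return the n origin-destination pairs with the most cycle commuters."""
    return sorted(flows, key=lambda flow: flow["cyclists"], reverse=True)[:n]


def straight_line_km(start, end):
    """Straight-line distance between two (x, y) points given in kilometres."""
    return math.hypot(end[0] - start[0], end[1] - start[1])


def line_widths(lines, max_width=8.0):
    """Scale line thickness in proportion to the number of cyclists."""
    busiest = max(line["cyclists"] for line in lines)
    return [max_width * line["cyclists"] / busiest for line in lines]


# Hypothetical flows between zones and a city centre.
flows = [
    {"route": "Zone A to centre", "cyclists": 40, "start": (0.0, 0.0), "end": (3.0, 4.0)},
    {"route": "Zone B to centre", "cyclists": 95, "start": (0.0, 8.0), "end": (0.0, 0.0)},
    {"route": "Zone C to centre", "cyclists": 12, "start": (1.0, 1.0), "end": (0.0, 0.0)},
]

top = top_desire_lines(flows, n=2)
widths = line_widths(top)
for line, width in zip(top, widths):
    km = straight_line_km(line["start"], line["end"])
    print(line["route"], line["cyclists"], round(km, 1), round(width, 1))
```

A desire line is a straight line between an origin and a destination, so the distance here is the straight-line length and not the distance along the road network.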

Leeds

Leeds is a ‘mono-centric’ city, meaning that the most cycled routes head to the city centre. Otley Road is one of the most cycled routes in the city, as illustrated by the thick line from Far Headingley in the northwest to the city centre and the University of Leeds. Interestingly, the most popular commuter routes in 2011 were all from the north of the city centre, raising the question: what are the barriers to cycling in South Leeds?

[Map: most cycled commuter routes in Leeds]

 

London

Unsurprisingly, in London all the most popular routes head to the City of London. Surprisingly, some of the most popular commutes are more than 5 km in Euclidean distance (or ‘as the crow flies’), with 95 people saying in the 2011 Census that they commute the 8 km from Clapham Junction to the City of London.

[Map: most cycled commuter routes in London]

 

Cambridge

Cambridge is the city with the highest proportion of cycle commuters in the UK, with 18% cycling to work in the 2011 Census. In Cambridge, commutes from the north and east of the city centre are the most popular:

[Map: most cycled commuter routes in Cambridge]

Future versions of the PCT will increase the ‘geographical resolution’ of the tool (making the zones smaller), analyse which road sections should be prioritised for cycling and explore where there is high potential for other types of trip, such as cycling to school.  This is linked to wider research to better understand how to build more cycle friendly cities at the participating universities.

Interested in the age of your property? CDRC Maps might give you a clue

CDRC Maps has launched a new, interactive map that indicates the age of properties across England and Wales.

Using the ‘dwellings ages’ dataset published by the Valuation Office Agency and the house price summaries from the ONS, CDRC researcher Oliver O’Brien has combined both these open datasets* into a record on CDRC Data, then mapped both the prices and ages data on CDRC Maps.

Oliver has made use of a colour scheme to track the changes in age: grey = old, yellow = 1960s, red = very new.

This feature has generated significant interest across social media. Click here to access the map and have a play yourself.

For more information on the methods used to create this, visit Oliver’s blog post.

*Background mapping is based on Ordnance Survey Open Data.

Public dialogues on the (re)use of private sector data for social research

Earlier this year, CDRC Directors Paul Longley and Mark Birkin participated in a series of public dialogue events in order to better understand the public’s views on the use and re-use of private sector data for social research.

The purpose of these dialogues was to inform the work of the CDRC, the Urban Big Data Centre and the Business and Local Government Data Research Centre.

The Aims

The aims of the public dialogues were:

  • to explore public views and related concerns about key aspects of the Data Research Centres’ work towards enabling access to private sector data for social research
  • to start creating a language around private sector data and access to and use of these for research purposes that is meaningful and accessible to the public

The Findings

Overall, the dialogue demonstrated that there is wide public support for the use and re-use of private sector data for social research. Access to information about the Data Centre processes alleviated a lot of the concerns people had initially around security and privacy. An increased appreciation of the benefits of social research for everyone in society meant that a trade-off took place between concerns and perceived risks of the use of private sector data in favour of research that leads to improvements in policy and services.

Next Steps

The CDRC, along with our colleagues at Urban Big Data Centre and the Business and Local Government Data Research Centre, will be working with the ESRC to take forward the findings and recommendations of the public dialogues to improve communication on the work that we do with private sector data for social research that can ultimately benefit society.

Further Information

The full report and further information on the public dialogue process can be found on the ESRC Website.