February 2016 - Consumer Data Research Centre

Big Data & Health – Get Involved

18th February 2016 by Robyn Naisbitt

Michelle Morris
Director of the ESRC Strategic Network for Obesity

The Big Data revolution has been embraced by health researchers and professionals. The new data horizon provides exciting opportunities to utilise new data and methods to ultimately improve the health of our nation – and in fact the world. In the health arena big data analytics has the chance to make a real impact on society, through research which informs policy and improves practice.

I’m thrilled to see so many Big Data & Health related opportunities coming up in the next few months. The events aren’t just restricted to hearing about what is happening either: there are lots of chances for you to get involved, explore the available data and join the fight against the health issues affecting the UK.

Health Innovation Lab – 27 February (Leeds)

Student Datalabs will be running a data-driven innovation lab at the Leeds Institute for Data Analytics later this month. It’s a great hands-on opportunity for students to learn new data skills whilst working on health problems, specifically health inequalities and Type 2 Diabetes in Yorkshire.
Find out more

Obesity Network Seminar – 16 March (Cambridge)

The second meeting of the Obesity Network is open to everyone and will focus on Data, Methods & Models. We’ve got some interesting presentations lined up including:

Modelling and visualising large and complex datasets to guide active travel policies
Interpreting results from analysis with big data: examples from epidemiology
Big Data and the Obesity Epidemic

Find out more

Big Data and the Obesity Epidemic – 17 March (Leeds)

Adam Drewnowski, Professor of Epidemiology at the University of Washington, will be talking about Big Data and the Obesity Epidemic. The seminar will be the first of a series of Big Data seminars held at the Leeds Institute for Data Analytics.
Find out more

Webinar: Exploring diet and obesity in children using geodemographic classifications – 6 April (online)

I’ll be giving an overview of the study I conducted exploring social and spatial determinants of diet and obesity in children at a local area level in two regions of the UK. I’ll discuss the findings and focus on the benefits of combining multiple datasets to generate greater insight into applied societal challenges.
Find out more

New Health Datasets Available

The CDRC has just made two new health-related datasets available to researchers: Staff Health Survey Data from Heart Research UK and a Synthetic Population. Both are available via application to the Centre.

Funded PhD Opportunity

We currently have a funded PhD opportunity available – Generating a Leeds Geodemographic classification: applications in policy, commerce and health. The appointed student will have the chance to work with our partners Callcredit and Leeds City Council. The deadline for applications is 5 March 2016.
Find out more

You can find out more about Michelle’s work with the ESRC Strategic Network for Obesity on the CDRC website or via Twitter.

New Datasets Available

16th February 2016 by Robyn Naisbitt

We are constantly seeking new datasets for you to access via our ‘safeguarded’ and ‘secure’ services.

We have recently added the following to our datastore:

Also coming soon – datasets from Appliances Online and Shop Direct.

Ever wondered where your ancestors met their Valentine?

12th February 2016 by Robyn Naisbitt

A new website, ‘Named’, created by researchers from UCL’s Department of Geography predicts where lovers met (or could potentially meet) using surnames.

The website, which is part of a wider research project funded by the Economic and Social Research Council (ESRC), invites users to enter two surnames. It then generates a ‘heat map’ of the geographic concentrations of the two names overlain on top of one another, thus identifying areas where the couple most probably met.
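One way to picture the overlay is as a product of regional concentrations. The sketch below is illustrative only and is not the method used by the Named website: the region names, populations and counts are made up, and `concentration` and `combined_concentration` are hypothetical helpers. It assumes each surname's concentration is measured as a location quotient (the name's share of a region's population divided by its national share).

```python
from typing import Dict


def concentration(counts: Dict[str, int], population: Dict[str, int]) -> Dict[str, float]:
    """Location quotient per region: regional rate of the surname
    divided by its national rate (1.0 means 'as common as nationally')."""
    national_rate = sum(counts.values()) / sum(population.values())
    return {
        region: (counts.get(region, 0) / population[region]) / national_rate
        for region in population
    }


def combined_concentration(
    counts_a: Dict[str, int],
    counts_b: Dict[str, int],
    population: Dict[str, int],
) -> Dict[str, float]:
    """Overlay two surnames by multiplying their location quotients,
    so a region scores highly only where both names are concentrated."""
    qa = concentration(counts_a, population)
    qb = concentration(counts_b, population)
    return {region: qa[region] * qb[region] for region in population}


# Made-up regions and counts, for illustration only.
population = {"North": 1000, "Central": 4000, "South": 15000}
surname_a = {"North": 40, "Central": 8, "South": 12}
surname_b = {"North": 10, "Central": 20, "South": 10}

overlay = combined_concentration(surname_a, surname_b, population)
best_region = max(overlay, key=overlay.get)
print(best_region)
```

With these invented numbers, both names are over-represented in the "North" region, so that is where the combined score peaks.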

Director of the Consumer Data Research Centre, Professor Paul Longley, is leading the project. The data used for the website comes from the Consumer Data Research Centre.

He said: “The website is a quirky start of our research project which is looking into whether our surnames are linked to our geographical locations – something which has been long perceived. It is known that many names remain surprisingly concentrated in specific parts of the UK, and this project helps us extend our understanding of name geography to combinations of names too when we enter relationships.”

Paul said the study so far shows that, on average, surnames have not moved far in distance over the last 700 years.

“Most Anglo Saxon family names came into common usage between the 12th and 14th centuries, and were first coined in particular parts of the country. What is interesting is that most individuals do not move far from their ancestral family homes and so, 700 or more years later, most names can still be associated with particular localities. So if your Valentine is named ‘Rossall’, for example, it is still about 40 times more likely that you met him or her in the environs of Blackpool than in Central London.

“This doesn’t work for all names, however: the geography of many popular family names (like Smith or Brown) is much more evenly spread, although even popular names like Jones, Williams or Davies still have strong regional connotations.

“Different patterns hold for names imported from abroad over the last 60 years or so. Many of these names remain concentrated in major cities and towns, although the overall pattern of such names is becoming more dispersed as migrants assimilate into UK society.”

He added: “With all the current focus on population migration, it is remarkable to see that most individuals and families stay put throughout the generations. As a consequence it is interesting to reflect that names are still often strong indicators of kinship and regional identity.”

Data scientist Oliver O’Brien, who is part of the project team, added: “The maps on our website make predictions based upon geographic patterning, and we are really interested to learn whether we get things right.”

Users of the website are invited to tell the researchers whether the maps really are able to predict the locations at which romance blossomed – email your feedback to [email protected].

*This article is an edited version of the original press release created by the Economic & Social Research Council.

How can big data help deliver sustainability strategies?

12th February 2016 by Robyn Naisbitt

Save the date: Monday 25th April 13.00 BST – CDRC and Innovation Forum webinar on big data and sustainability.

With: William Young, professor of sustainability and business, University of Leeds; Chris Brown, senior director of sustainable business, Asda; Andy Peloe, concept manager, Callcredit; Wouter van Tol, director of sustainability and citizenship, Samsung.

Register your interest to attend.

The rapid growth of “big data” presents companies with real opportunities to improve business performance. Here are ten

Over the last few years there has been much talk about how so-called “big data” is the future and if you are not exploiting it your company is losing its competitive advantage.

Well, luckily for sustainability professionals, we have all been using types of big data for a long time, such as energy flows in companies, environmental lifecycle assessments of products, greenhouse gas emissions reporting, waste recycling indicators, resource use balances, transport systems and so on.

Data and information are the day-to-day business of sustainability professionals, and they are well qualified to take advantage of big data to better understand the environmental aspects or risks of their companies, products or services.

So what is there in the latest wave of enthusiasm on big data to help organisations achieve sustainability strategies and competitive advantage?

Data growth

There is better data breakdown and new forms of data such as sales data, loyalty card data, social media, product sensors, new monitors and mobile phone data. There are lots of these data, often in real time and there are many ways to analyse and model them. This is nicely summarised in the famous “four Vs” of big data from IBM (volume, velocity, variety and veracity).

I think there are ten opportunities to use big data for sustainability professionals.

(1) Gaining greater detail behind global sustainability performance indicators. For example, smart meters on production lines, in retailers, on products or in people’s homes can produce a better understanding of energy use in the system.

(2) Accessing supply chain data more readily. There is an opportunity from being able to access data from global suppliers up and down the supply chain more readily, in a timelier fashion and with better accuracy. This will help to make better decisions over product/service changes knowing the associated sustainability implications. As climate change impacts global supply chains, this data may help adaptation and resilience of supply.

(3) Gaining an insight into supply chain logistics and customer transport habits. There is now the ability to use mobile phone data to identify patterns in transport networks, giving the opportunity for better planning for more efficient use of fuel and reduced congestion. This may also provide consumers with better opportunities to change to cleaner forms of transportation.

(4) Predicting changes in behaviour from social media. This is one of the most talked about aspects of big data and yet the most technically difficult. Much social media data is unstructured and takes the form of pictures, pixels or abbreviated language. But there are opportunities to see how individuals react to an emerging sustainability issue or a new technology.

(5) Social media is a good way for sustainability professionals to identify up and coming sustainability issues from their own stakeholders. These may be key local NGOs, community leaders, political leaders, suppliers, competitors, employees as well as customers. Identifying opinion formers is vital for filtering the volume of social media.

(6) Customer behaviour with products and services. As companies try to influence consumers to reduce the environmental impacts of the use phase of products and services, getting feedback on the effectiveness of these interventions is important for future strategy.

(7) Transparency to customers and NGOs. Access by consumers to the data behind product eco-labels, or working condition audit results from the factories producing their products, is important for confidence. Better presentation, accuracy and timeliness of this is an advantage.

(8) Better marketing or targeting of greener products, services and corporate sustainability programmes. Being able to better segment and directly contact potential customers with personalised promotions is already being developed. This can help in the sustainability arena as well.

(9) Interaction with consumers and stakeholders in the shared or collaborative economy. The growing ability to share resources between companies and consumers has been facilitated by social media. Entrepreneurs are already in this space with apps allowing sharing of food leftovers or power tools. There are great opportunities for this to be further developed, reducing the material flow through society using different business models.

(10) Growing emphasis on smart cities, combined with the development of “mega cities” where the majority of the world population may live. Smart energy, water, waste and transport grids are just one area; buildings that can heat and cool themselves more smartly are another opportunity.

Don’t get lost!

There are some difficulties with big data that sustainability professionals need to be aware of.

Firstly, getting lost in the enormous amount of data is easy, so having objectives or research questions is essential. Secondly, a few big corporations have been quick to jump on correlations between different data sets without common sense kicking in quickly enough to recognise that there cannot be a causal link. Finally, there are the ethics of the privacy of individuals and communities, which need to be protected even if the data is publicly available.

Overall there is much here for sustainability professionals to work on and to improve the sustainability performance of their company’s operations, products, services, supply chains and even customers. However, as much data as possible needs to be open access for consumers, researchers, local communities and innovators for big data to have the biggest benefit for people and planet.

This article was originally published on www.innovation-forum.co.uk

ESRC’s Bruce Jackson on the CDRC

9th February 2016 by Robyn Naisbitt

The Economic and Social Research Council (ESRC) is the largest funder of social and economic research in the UK, supporting high quality, independent research that helps to address some of the major societal challenges we face today. The ESRC Big Data Network brings together a series of highly innovative ESRC investments seeking to facilitate access to a greater range and variety of data to support high-quality social science. In making new forms of data available for research purposes, the ESRC Big Data Network has the potential to fundamentally transform the social sciences.

Led by two world class research teams based at UCL and the University of Leeds, the Consumer Data Research Centre has created cutting-edge physical and virtual infrastructures which facilitate access to, and linking of, business and local government data in safe and secure settings. The Centre is at the forefront of ESRC’s work to open closed, proprietary datasets for research purposes and is demonstrating and exploring novel forms of partnership working with data owners which both support world class social science and deliver real value for our partners. By working through the methodological, ethical, legal and technical challenges associated with accessing, curating and analysing these datasets, the Centre is providing a fundamental foundation upon which the future of innovative, impactful social science can be built.

James Cheshire live at Cartographic Summit 2016

4th February 2016 by Robyn Naisbitt

CDRC’s Assistant Director Dr James Cheshire will be delivering a talk at this year’s ‘Cartographic Summit: The Future of Mapping’, to be held in Redlands from 8 to 10 February.

Titled ‘Lightning Talk on Data’, James’ talk will be live-streamed on Monday 8 February at 17.30 UK time (GMT), 09.30 PST.

To access this and other live streams for the event:
Follow the link here
Enter the password ‘carto.summ’ when prompted.

The summit aims to explore research, innovation and strategic thinking to support cartographic needs as new data, information and infographics revolutionize mapping. The intent is to set a marker for understanding common challenges from a range of perspectives in and outside the traditional cartographic communities; to draw together different ways of thinking and working; and to build bridges across the many communities in the mapmaking and visualization fields.

For those unable to tune into the live stream, keep track of our website as we will aim to post a video of James’ talk in the future.

For the full agenda click here.

For Twitter coverage visit twitter.com, hashtag #cartosummit.

Using Big Data to Identify Outbreaks of Food Poisoning

4th February 2016 by Robyn Naisbitt

Using New Data and Novel Methods to Undertake Syndromic Surveillance to Identify Outbreaks of Food Poisoning

By Rachel Oldroyd

In October 2015, I started a part-time PhD in the Consumer Data Research Centre on the topic of Spatial Data Analytics for Food Safety. This is a White Rose DTC project which forms part of the Big Data and Food Safety Network, funded by the Economic and Social Research Council (ESRC). I’m roughly three months into what is going to be a long (five-year) journey and, although I’m trying to keep things general at the moment, I’m beginning to have a vague idea of where this project might take me.

Every year there are around 500 000 known cases of food poisoning in the UK and a further 10 million cases of gastrointestinal illnesses which may be food related. As the NHS recommends that people recover from mild bouts of food poisoning at home, without visiting their GP, the true number of cases is hard to estimate via traditional methods. The Food Standards Agency (FSA) is a case partner for this project and they are particularly interested in how big data and social media can be used to more effectively monitor cases and outbreaks of food poisoning in the UK.

Source: Sadilek et al. (2013)

In the US, a number of studies have shown that Twitter and other online data sources can be used to undertake syndromic surveillance and identify whether people are suffering from known symptoms. Although the majority of these studies focus on influenza, the methodologies are also relevant for monitoring food poisoning. These studies show that a language model can be used to retrieve food-related messages from Twitter and, through natural language processing, identify whether or not the author is suffering from a food-borne illness. In some cases, geo-located Tweets can be tracked to a specific restaurant location, which is particularly useful for monitoring outbreaks of food poisoning, where more than one person has been infected at the same origin. Online restaurant reviews are somewhat simpler to process than Tweets, as they are restaurant specific and require less filtering. An author will often specify the type of food eaten, which may be useful for identifying new food pathogen vehicles.
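To make the two-stage idea concrete (first retrieve food-related messages, then decide whether the author seems unwell), here is a minimal sketch. It is not the approach used in the studies mentioned above, which rely on trained language models; simple keyword matching stands in for them here. The word lists, the example messages and the function names are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical vocabularies; a real system would learn these from labelled data.
FOOD_TERMS = {"restaurant", "ate", "eaten", "meal", "takeaway", "dinner", "lunch", "food"}
SYMPTOM_TERMS = {"sick", "vomiting", "nausea", "diarrhoea", "stomach", "poisoning", "ill"}


def tokenize(message: str) -> List[str]:
    """Lower-case the message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def is_food_related(message: str) -> bool:
    """Stage 1: keep only messages that mention food or eating out."""
    return any(token in FOOD_TERMS for token in tokenize(message))


def suggests_illness(message: str, threshold: int = 1) -> bool:
    """Stage 2: count symptom words and compare against a threshold."""
    hits = sum(token in SYMPTOM_TERMS for token in tokenize(message))
    return hits >= threshold


def flag_messages(messages: List[str]) -> List[str]:
    """Return messages that pass both stages."""
    return [m for m in messages if is_food_related(m) and suggests_illness(m)]


# Invented example messages.
messages = [
    "Lovely dinner at the new place in town",
    "Been vomiting all night since that takeaway, stomach is awful",
    "Feeling sick of this weather",
]
flagged = flag_messages(messages)
print(flagged)
```

The third message shows why keyword matching alone is fragile: "sick" appears, but not in a health sense. Here it is filtered out only because it fails the food-related stage, which is part of the reason the studies use natural language processing instead of word lists.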

The timely reporting of foodborne illness is an essential component in avoiding a large-scale epidemic. Social media data and online restaurant reviews can be collected in near-real time and are therefore much timelier than traditional data sources such as GP visits or FSA inspection reports which can take up to two weeks to process. Despite their timeliness, information reported via Twitter and online reviews needs to be handled carefully. Food-borne pathogens have extremely varied incubation periods (typically between 1 and 28 days) making it difficult to attribute illness to a specific food establishment. For this reason, these data sources may not be suitable for monitoring food-borne illness caused by certain pathogens.
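The attribution problem can be seen with some simple date arithmetic. The sketch below assumes the 1 to 28 day incubation range quoted above and uses hypothetical helper names; it works out the window of dates on which exposure could have happened for a given symptom report.

```python
from datetime import date, timedelta
from typing import Tuple


def exposure_window(
    symptom_date: date, min_days: int = 1, max_days: int = 28
) -> Tuple[date, date]:
    """Earliest and latest dates on which exposure could have occurred,
    given an incubation period of min_days to max_days."""
    earliest = symptom_date - timedelta(days=max_days)
    latest = symptom_date - timedelta(days=min_days)
    return earliest, latest


def visit_is_plausible(
    visit_date: date, symptom_date: date, min_days: int = 1, max_days: int = 28
) -> bool:
    """True if a restaurant visit falls inside the exposure window."""
    earliest, latest = exposure_window(symptom_date, min_days, max_days)
    return earliest <= visit_date <= latest


# Invented example: symptoms reported on 4 February 2016.
report = date(2016, 2, 4)
window = exposure_window(report)
print(window)
```

With the full 1 to 28 day range, every meal eaten in almost four weeks before the report is a candidate, which is why a single Tweet or review is weak evidence against any one establishment. Narrowing `min_days` and `max_days` to a specific pathogen's range would shrink the window.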

I’m still knee-deep in literature at the moment but the broad objectives of this project are:

To evaluate the availability and quality of data from a variety of sources relating to demographics, movement patterns, social media messages, the quality and performance of facilities, and health outcomes;
To construct spatial-temporal models of food safety across a country or region and to assess the effectiveness and robustness of that model;
To explore the utility of the models as a means for targeting scarce resources and to suggest other means for the extraction of value from the application of this research.

In the next few months, I hope to start collecting online restaurant review data to carry out some preliminary analysis against the FSA inspection data. Watch this space!

Rachel Oldroyd is a part-time PhD student at the Consumer Data Research Centre. She also teaches the Face to Face and Distance Learning MSc GIS courses in her role at the Centre for Spatial Analysis and Policy (CSAP) at the University of Leeds.
