
Video now available – Dr James Cheshire speaking at the Data Science Festival in May 2017

Here is the video of Dr James Cheshire’s seminar at this year’s Data Science Festival.

This talk showcased how large and complex datasets can be visualised in compelling and informative ways. It draws on a range of examples, covering everything from commuter flows to baboons and from cyclists to songbirds, to demonstrate how maps and data visualisations offer a window into big data. Many of the selected examples started out life in R, so it’s a chance to see how R is great not just for data wrangling but for visualisation as well.

Classification of Westminster Parliamentary constituencies using e-petition data

In a representative democracy it is important that politicians have knowledge of the desires, aspirations and concerns of their constituents. Opportunities to gauge these opinions are, however, limited and, in the era of novel data, thoughts turn to what alternative, secondary data sources may be available to keep politicians informed about local concerns. One such source of data is the signatories to electronic petitions (e-petitions). Such e-petitions have risen greatly in popularity over the past decade and allow members of the public to initiate and sign an e-petition online, with popular e-petitions resulting in media attention, a response from the government or ultimately a debate in parliament. These data are thus novel in their availability and have not yet been widely used for research purposes. In this article we use the e-petition data to show how semantic classes of Westminster Parliamentary constituencies, fitted as Gaussian finite mixture models via the EM algorithm, can be used to typify constituencies. We identify four classes (Domestic Liberals, International Liberals, Nostalgic Brits and Rural Concerns) and illustrate how they map onto electoral results. The findings, and the utility of this approach for incorporating new e-petitions and adapting to changes in electoral geography, are discussed.
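As a rough illustration of what fitting a Gaussian finite mixture model via EM looks like in practice, here is a minimal sketch using scikit-learn's `GaussianMixture`. It is not the paper's code: the data are synthetic, and the array `signature_rates`, the number of constituencies and the number of petition topics are all assumptions made for the example. Only the choice of four components mirrors the four classes reported above.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: 80 hypothetical constituencies,
# each described by standardised signature rates on 3 petition topics.
# Four well-separated groups are planted so there is structure to find.
centres = np.array([[-3, -3, 0], [3, -3, 0], [-3, 3, 0], [3, 3, 0]], dtype=float)
signature_rates = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(20, 3)) for c in centres]
)

# GaussianMixture estimates its parameters with the EM algorithm.
# n_init=5 restarts EM from several initialisations and keeps the best fit.
model = GaussianMixture(n_components=4, covariance_type="full",
                        n_init=5, random_state=0)
labels = model.fit_predict(signature_rates)

# Soft (posterior) class memberships, one row per constituency.
memberships = model.predict_proba(signature_rates)

print(labels[:10])
print(memberships[0].round(3))
```

The soft memberships are one reason a mixture model can be attractive here: a constituency does not have to belong wholly to one class.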

Read the full paper online in EPJ Data Science

Predictive Data Analytics for Urban Footfall

Molly Asher (Leeds Institute for Data Analytics), Simon Brereton (Leeds City Council), and I have recently finished a project that aimed to analyse footfall in Leeds city centre and build computer models (using machine learning) that could estimate footfall given some external conditions (e.g. the weather, time of year, whether it was a holiday, etc.). We would like to use a model like this to help the Council with questions like:

  • If it is going to rain next Tuesday, how busy will the city be?
  • Last Wednesday we organised x. How successful was our event, taking into account that it was cold and rainy?

We’ve yet to compile the final report, but if you’d like more information about the project (including the data we used and the code that Molly wrote), you can find the details on the main GitHub page. This post briefly summarises some of the more interesting findings.

Initial Data Analysis

The first stage was to find and analyse the required input data. We brought together:

  • Footfall camera data: hourly counts of footfall from a number of locations, published by the Data Mill North
  • Weather data: daily temperature, wind, and rainfall data, published by the School of Earth and Environment at the University of Leeds
  • Dates for school, university, and public holidays in Leeds

In the future we could find other data sets that might represent factors that influence footfall, such as car parking availability, train prices, etc., but for now we just used the weather and holiday data.

One of the most interesting findings from the first stage of the data analysis was that the times at which people use the city centre seem to have changed over the years. For example, the figure below shows how the proportion of people visiting the centre during the day, in the evening, and at night has changed since 2009. Since the opening of the Trinity Shopping Centre in March 2013 there has been a substantial increase in the proportion of people coming to the city centre in the evenings. Shops in the Trinity Centre don’t close before 8pm, which is later than shops in the area traditionally closed, so it seems that this has encouraged later attendance. Other shops in the area have probably started to stay open later into the evening as well.
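The calculation behind a chart like this can be sketched in a few lines. The example below is only an illustration: the hour boundaries for day, evening and night in `period_of` are assumptions (the post does not define them), and the records are made-up numbers, not counts from the footfall cameras.

```python
from collections import defaultdict

def period_of(hour):
    """Map an hour (0-23) to a period. Boundaries are assumed, not from the project."""
    if 6 <= hour < 18:
        return "day"
    if 18 <= hour < 22:
        return "evening"
    return "night"

def period_shares(records):
    """records: iterable of (year, hour, count). Returns {year: {period: share}}."""
    totals = defaultdict(lambda: defaultdict(int))
    for year, hour, count in records:
        totals[year][period_of(hour)] += count
    return {
        year: {p: c / sum(periods.values()) for p, c in periods.items()}
        for year, periods in totals.items()
    }

# Made-up hourly counts for two years.
records = [
    (2012, 12, 800), (2012, 19, 150), (2012, 23, 50),
    (2014, 12, 700), (2014, 19, 250), (2014, 23, 50),
]
shares = period_shares(records)
print(f"{shares[2012]['evening']:.2f}")  # 0.15
print(f"{shares[2014]['evening']:.2f}")  # 0.25
```

Working with shares rather than raw counts keeps years with different overall footfall comparable.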

Modelling Footfall with Machine Learning

The main aim of the work was to create a model that could predict levels of footfall given some external conditions. We tested a large number of models using the scikit-learn Python library to see which was the best, and in the end a Random Forest model performed the most strongly. Again, for full details about the methodology, data (training, test, validation, etc.) and the code, see our GitHub page.
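For readers who have not used scikit-learn, the general shape of such a model is sketched below. This is not the project code (that lives on the GitHub page): the data are synthetic, the four input features are a small assumed subset of those described above, and the relationship used to generate footfall is invented purely so the example runs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_days = 500

# Synthetic inputs standing in for the real weather and holiday data.
temperature = rng.uniform(0, 25, n_days)     # mean daily temperature, deg C
rainfall = rng.exponential(2.0, n_days)      # mean daily rainfall, mm
is_saturday = rng.integers(0, 2, n_days)     # 1 if the day is a Saturday
school_holiday = rng.integers(0, 2, n_days)  # 1 if a school holiday

X = np.column_stack([temperature, rainfall, is_saturday, school_holiday])

# Invented relationship plus noise; NOT the relationship found in Leeds.
footfall = (20000 + 400 * temperature - 600 * rainfall
            + 5000 * is_saturday + 1500 * school_holiday
            + rng.normal(0, 1000, n_days))

X_train, X_test, y_train, y_test = train_test_split(
    X, footfall, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Percentage error of each held-out prediction.
predicted = model.predict(X_test)
pct_error = 100 * (predicted - y_test) / y_test
share_within_10 = np.mean(np.abs(pct_error) <= 10)
print(f"{share_within_10:.0%} of held-out days predicted within ±10%")
```

Holding back a test set, as here, is what makes a statement about prediction error meaningful: the errors are measured on days the model never saw during training.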

Model Accuracy
The figure on the right shows how well the model made its predictions. On the whole it behaved reasonably well: although on some days the predictions were very poor (errors of around ±20%), the majority fell within ±10%.
Feature Importance
A benefit of random forest models over some other machine learning techniques is that it is possible to extract information about which input parameters (‘features’) are the most important. This doesn’t tell us whether a feature is linked with more or less footfall, but it does tell us which features are the most useful for predicting footfall. The list below shows the top 10. It is important to note that this list is not definitive: a number of factors can affect the importance, and if we had chosen another model we would have found slightly different results. On the whole, though, the variables below were fairly consistent across all of the models tested. The weather variables appear to be the most important, which isn’t especially surprising, but is still interesting.
Variable                  Relative Importance
Mean daily temperature    1142
Mean daily rainfall       383
Monday                    131
2013                      131
Saturday                  130
2016                      130
After Trinity opened      123
Thursday                  122
Tuesday                   116
School holiday            115
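A ranking like the one above can be read from a fitted scikit-learn random forest through its `feature_importances_` attribute. The sketch below uses synthetic data and assumed feature names. Note that scikit-learn normalises these importances to sum to 1, so the numbers it prints are on a different scale from the table above, whose scaling is not described in this post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n_days = 400

# Hypothetical feature names and synthetic values.
feature_names = ["mean_daily_temperature", "mean_daily_rainfall",
                 "is_monday", "school_holiday"]
X = np.column_stack([
    rng.uniform(0, 25, n_days),
    rng.exponential(2.0, n_days),
    rng.integers(0, 2, n_days),
    rng.integers(0, 2, n_days),
])

# Invented target in which temperature dominates by construction.
y = (20000 + 500 * X[:, 0] - 300 * X[:, 1] + 1000 * X[:, 2]
     + rng.normal(0, 500, n_days))

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Pair each feature with its importance and sort, largest first.
ranked = sorted(zip(feature_names, model.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked:
    print(f"{name:<26}{importance:.3f}")
```

These importances say how much each feature helps the trees split the data, not the direction of its effect, which matches the caveat above.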

Analysing Events
The most useful application of the model is its use as a tool to evaluate how successful previous events in the city were, after taking account of external conditions (day of the week, weather, whether it was a holiday, etc.). For example:

  • For the Tour de France Grand Depart on 5th July 2014, there was 37% more footfall in the city centre than we would have expected otherwise.
  • The Christmas light switch-on (10th Nov 2011) attracted 22% more people than we would have expected.
  • The opening of the Trinity centre on 21st March 2013 attracted 33% more footfall.

At the other end of the scale, the model can also help to explain why some days have very low footfall. This occurs during snow, for example, or where other events such as Leeds Festival actually appear to draw people away from the city.
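The percentages quoted for these events compare the footfall that was actually observed with what the model expected for that day. A minimal sketch of that comparison is below. The function name `percent_uplift` and the footfall numbers are illustrative assumptions; in practice the expected value would come from the trained model's prediction for the day's conditions.

```python
def percent_uplift(observed: float, expected: float) -> float:
    """Percentage by which observed footfall differs from the expected footfall."""
    if expected <= 0:
        raise ValueError("expected footfall must be positive")
    return 100.0 * (observed - expected) / expected

# Made-up numbers: a busy event day and a quiet snowy day.
print(percent_uplift(observed=137_000, expected=100_000))  # 37.0
print(percent_uplift(observed=80_000, expected=100_000))   # -20.0
```

A positive value means the day was busier than the conditions alone would suggest, and a negative value means it was quieter.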

The model discussed here is in its early stages and still needs some work to make it more rigorous, but it is clearly a useful tool and one that could provide valuable insight into the drivers of footfall in city centres.

Tackling Food Waste with Asda

It is estimated that one-third of edible food produced for human consumption is lost or wasted globally each year. In the UK, food waste derived from households accounts for 7.3 million tonnes of total food and drink wasted annually. UK households throw away approximately a third of the food they purchase for consumption. In a bid to tackle this problem, CDRC Co-investigator Professor William Young and his team at the University of Leeds joined with Asda to implement a multichannel initiative aimed at changing customer attitudes and behaviour.

The research team used six national communication channels at Asda (including the in-store magazine, e-newsletter, Asda’s Facebook site, product stickers and in-store demonstrations) to send out standard food waste reduction messages (taken from the WRAP Love Food Hate Waste campaign) during two 4-6 week intervention periods, one in 2014 and the other in 2015. Six national surveys over 21 months tracked customers’ self-reported food waste. Customers answered the online questionnaire a few months before, two weeks after and a few months after each intervention period. Participants were recruited from Asda’s existing customers who had signed up to a market research panel of 30,000 customers.

How was food waste measured?

The degree to which consumers had engaged in food waste behaviours was measured using two items, frequency and quantity. Frequency of waste was measured by asking consumers “How regularly do you think food is thrown away in your household?” Responses were given on a five-point Likert scale (1 = Never, 5 = Most mealtimes).

The quantity of food wasted was measured by asking “Over the past week have you thrown out any of the following items? Please select all that apply.” Participants indicated the types of food wasted from nine product categories, including fruit, vegetables, salad, bakery, dairy, and meat and poultry. After each survey, the cost of food waste was calculated by coding each product type using WRAP’s cost of food waste.

Food waste savings were then calculated as the difference between the figure from the survey conducted before the intervention and the figure from the survey conducted after it. Once the food waste analysis was complete, the results from the sample population were upscaled and applied to the total customer base.
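The calculation described above (code each wasted category to a cost, compare before and after, then upscale) can be sketched as follows. Everything numeric here is invented for illustration: the values in `CATEGORY_COST` are not WRAP's figures, the responses are not survey data, and the customer base of 1,000 is a placeholder.

```python
# Hypothetical cost (GBP) assigned to each wasted category; NOT WRAP's figures.
CATEGORY_COST = {"fruit": 1.0, "vegetables": 1.5, "bakery": 0.5}

def mean_waste_cost(responses):
    """Mean cost per respondent, where each response is the set of categories wasted."""
    totals = [sum(CATEGORY_COST[c] for c in response) for response in responses]
    return sum(totals) / len(totals)

def upscale(saving_per_customer, customer_base):
    """Apply the sample's mean saving to a whole customer base."""
    return saving_per_customer * customer_base

# Made-up "select all that apply" answers before and after an intervention.
before = [{"fruit", "bakery"}, {"vegetables"},
          {"fruit", "vegetables", "bakery"}, set()]
after = [{"fruit"}, set(), {"vegetables", "bakery"}, set()]

saving = mean_waste_cost(before) - mean_waste_cost(after)
print(saving)                 # 0.75
print(upscale(saving, 1000))  # 750.0
```

Upscaling in this simple way assumes the surveyed panel is representative of the wider customer base, which is worth bearing in mind when reading the headline figures.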

Food waste behaviour change:

  1. Three interventions were implemented in 2015. When surveyed, 81% of those who recalled the interventions said they planned to follow the advice provided.
  2. An estimated two million customers are making changes in their homes as a result of the campaign. Examples include using shopping lists to shop smarter, planning meals and using up food that would be otherwise thrown away.
  3. Customers saved on average £57 per annum by applying these changes in their home.

Asda’s Chief Customer Officer, Andy Murray, said: ‘As a major food retailer, we have a responsibility and the ability to bring about large scale change when it comes to tackling food waste. By partnering with the University of Leeds, the team has been able to take our insight and really explore this area, meaning that we now have a greater understanding of customer attitudes and behaviour, helping shape the way we communicate with our customers and ultimately how we do business.’

University of Leeds Professor, William Young, said: ‘Working with a large scale retailer like Asda, and its millions of customers, has been an invaluable experience. Not only have we come away with real, measurable insight from shoppers but we’ve also seen the direct correlation between our recommended actions and tangible behavioural change. While our formal partnership is coming to a close, the legacy of this project will certainly live on in the benefits passed to customers and of course the environment.’

 

Related Papers

Social media is not the ‘silver bullet’ to reducing household food waste, a response to Grainger and Stewart (2017) – C. William Young, Sally V Russell, Ralf Barkemeyer

Bringing habits and emotions into food waste behaviour – Sally V. Russell, C. William Young, Kerrie L. Unsworth, Cheryl Robinson

Can social media be a tool for reducing consumers’ food waste? A behaviour change experiment by a UK retailer – C. William Young, Sally V Russell, Cheryl A. Robinson, Ralf Barkemeyer

 

Research Team

This research was commissioned by Innovate UK (Knowledge Transfer Partnership Scheme) and Asda-Walmart.

The research team:

Professor William Young, Sustainability Research Institute/Consumer Data Research Centre

Dr Sally Russell, Sustainability Research Institute

Dr Phani Kumar Chintakayala, Consumer Data Research Centre

Dr Ralf Barkemeyer, Kedge Business School, France

Cheryl Robinson and Laura Babbs, KTP Associates Asda-Walmart