Measuring Ambient Populations during COVID-19 in Leeds City Centre
(Case Study)

The COVID-19 pandemic led to lockdowns being implemented all over the world, including in the UK. The aims of the project were to investigate relevant data sources for modelling the ambient population of Leeds City Centre during COVID-19 and to analyse the impacts that lockdown policies had on urban footfall. The research builds on previous work undertaken with Leeds City Council by intersecting key dates from the English lockdowns with footfall data and integrating these dates into machine learning models to assess the importance of different aspects of lockdowns. It also predicts what “business as usual” may have been like had there been no pandemic.

Analysis notebooks and scripts can be accessed at https://github.com/tbalbone31

Data and methods

Leeds City Council have been collecting footfall data for more than a decade. The data were wrangled and aggregated to create a history going back to 2008. These data were then analysed alongside key lockdown dates to determine where trends in urban footfall intersected with them, raising questions about which aspects of these policies might have had the most impact. The data cover a relatively small geographical area of Leeds City Centre and only reflect pedestrian traffic going past the camera locations. There are issues with data quality, such as potential double counting, periods of missing data and inconsistent file formatting. However, the data cover a long time span and many of these problems can be worked around.

Google COVID-19 Community Mobility data were analysed as a potential alternative data source to the Council data. They show changes in mobility from a baseline for six different destinations (see the website for more details). The smallest relevant spatial coverage is the Leeds City Region. This was considered too large to isolate any changes affecting the city centre, making comparison of trends difficult.

The Council footfall data were resampled to daily counts, on which the analysis was then conducted. Visual analysis was undertaken to identify footfall trends over the course of the pandemic against key dates for the implementation and lifting of COVID-19 restrictions. These key dates were chosen by researching when major legislation came into force or when government announcements about restrictions were made. The questions generated by this initial analysis were then explored by building a series of machine learning models using Random Forest regression in the Python scikit-learn package.
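As an illustration of the resampling step, the sketch below sums per-camera records into one total per day using pandas. The column names (`timestamp`, `camera`, `count`) and the sample values are assumptions made for this example and are not taken from the Council files. Days with no records at all are left as missing rather than zero, so that gaps in the data are not mistaken for empty streets.

```python
import pandas as pd


def daily_footfall(records: pd.DataFrame) -> pd.Series:
    """Sum all camera counts into one total per calendar day."""
    indexed = records.set_index("timestamp")
    # min_count=1 keeps days without any records as NaN instead of 0.
    return indexed["count"].resample("D").sum(min_count=1)


# Made-up hourly records from two hypothetical cameras.
hourly = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-03-16 09:00", "2020-03-16 10:00",
        "2020-03-17 09:00", "2020-03-19 09:00",
    ]),
    "camera": ["A", "B", "A", "A"],
    "count": [120, 80, 90, 60],
})

daily = daily_footfall(hourly)
print(daily)
```

In this toy input, 18th March has no records, so it appears in the daily series as a missing value.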

The first model included a series of input variables to represent different aspects of society that had restrictions placed on them alongside other external conditions (such as weather, school/bank holidays, day of week, etc).  Variable importance was used to identify what (if any) aspects of lockdown might be significant in predicting future changes in footfall. The second model omitted any lockdown related inputs and was designed to make predictions on what “business as usual” might have been like had the pandemic not happened. 
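The idea behind the first model can be sketched as follows. This is a minimal example on synthetic data, not the project's model: the feature names, the made-up footfall values and the use of scikit-learn's impurity-based `feature_importances_` are all assumptions, since the write-up does not say which importance measure was used.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_days = 400

# Hypothetical daily inputs: two lockdown flags plus two external conditions.
features = pd.DataFrame({
    "non_essential_retail_closed": rng.integers(0, 2, n_days),
    "indoor_entertainment_closed": rng.integers(0, 2, n_days),
    "day_of_week": np.arange(n_days) % 7,
    "rainfall_mm": rng.gamma(2.0, 2.0, n_days),
})

# Synthetic footfall, constructed so that retail closure dominates.
footfall = (
    50_000
    - 30_000 * features["non_essential_retail_closed"]
    - 5_000 * features["indoor_entertainment_closed"]
    + rng.normal(0, 1_000, n_days)
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(features, footfall)

# Impurity-based importances sum to 1 across all input variables.
importance = pd.Series(model.feature_importances_, index=features.columns)
importance = importance.sort_values(ascending=False)
print(importance)
```

Because the synthetic target is built mostly from the retail flag, that flag ranks first here. On real data the ranking is only an indication, as the findings below also stress.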

Because time series data are inherently ordered, both models were validated using “walk-forward validation” rather than the cross-validation utilities provided by scikit-learn and often used with Random Forest ensembles. Walk-forward validation retrains the model after every prediction on the validation dataset, essentially “walking forward” through the time series. This avoids the data leakage that can occur when randomised cross-validation places later observations in the training folds.
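A minimal sketch of walk-forward validation is given below, assuming one-step-ahead predictions and a full refit before each one. The helper name `walk_forward` and the stand-in `MeanModel` are hypothetical and are not taken from the project code. Any object with `fit` and `predict` methods can be used.

```python
from typing import Callable, Protocol, Sequence


class Regressor(Protocol):
    def fit(self, X, y): ...
    def predict(self, X): ...


class MeanModel:
    """Stand-in model: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def walk_forward(X: Sequence, y: Sequence[float], n_test: int,
                 make_model: Callable[[], Regressor]) -> list:
    """Predict the last n_test points one at a time, refitting before each."""
    predictions = []
    first_test = len(y) - n_test
    for t in range(first_test, len(y)):
        # Train only on observations strictly before time t, so nothing
        # from the future leaks into the fit.
        model = make_model()
        model.fit(X[:t], y[:t])
        predictions.append(float(model.predict([X[t]])[0]))
    return predictions


X = [[day % 7] for day in range(10)]  # e.g. day of week as the only input
y = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 28.0]

preds = walk_forward(X, y, n_test=3, make_model=MeanModel)
mae = sum(abs(p - a) for p, a in zip(preds, y[-3:])) / 3
print(preds, mae)
```

To mirror the setup described above, `make_model` could instead be `lambda: RandomForestRegressor(n_estimators=500)`. Refitting a forest at every step is slow on a long series, which is one reason to keep the number of trees modest.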

Key findings

The chart below shows the resampled footfall data intersecting with key dates from COVID-19 restrictions.

Key dates are shown as a dotted line with a number relating to a key.  Red zones indicate “official” lockdowns whilst orange represents periods where a variety of restrictions were in place but in the process of being lifted/introduced individually.  A summary of how this impacted footfall is below:

  • Footfall started to drop immediately after the announcement on 16th March 2020, before any official restrictions were implemented.
  • After non-essential shops and schools reopened on 15th June 2020, footfall started to rise again.
  • Footfall continued to rise through summer until around 22nd September 2020, when some restrictions were announced.
  • Footfall rose whilst Leeds was in tiers 2 and 3, potentially because gatherings were only permitted in public spaces.
  • The second and third lockdowns drove footfall back down until restrictions began to ease again in April 2021.

The first machine learning model was intended to explore whether any lockdown variables would be significant in predicting future changes in footfall.  Variable importance (top 10) is shown below.

The most important lockdown-related features were indoor entertainment and non-essential retail.  Whilst this is only an initial model and not a definitive conclusion, it does help indicate what aspects of lockdown might have impacted pedestrian traffic in the city centre more than others.

The second model was designed to test how useful the data would be in predicting what “business as usual” may have been like.

There was little difference between error scores across different numbers of trees, so a compromise of the best score and least processing power (500 trees) was chosen.  The model predictions using this hyperparameter are shown below.

Results from this initial model are by no means definitive. However, they show that there is potential to quantify how much footfall has been lost. For example:

  • Average daily footfall in the lead up to Christmas (taken as 30th November to 24th December 2020) was approximately 36% lower than predicted.
  • Average daily footfall over the school holidays was approximately 63% lower than predicted.
  • Approximate footfall loss for individual Bank Holidays was also calculated.  Most recorded over 90% lower than predicted values except for the August Bank Holiday which was around 22% lower.
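Percentages like those above can be obtained by comparing mean observed footfall with the mean “business as usual” prediction over the same days. The sketch below assumes that definition, which the write-up does not spell out. The function name is hypothetical and the figures are made up for illustration rather than taken from the project.

```python
def percent_below_prediction(observed, predicted):
    """Return how far mean observed footfall falls below the mean prediction, in %."""
    mean_observed = sum(observed) / len(observed)
    mean_predicted = sum(predicted) / len(predicted)
    return 100.0 * (mean_predicted - mean_observed) / mean_predicted


# Made-up daily totals for a three-day period.
observed = [30_000, 28_000, 32_000]
predicted = [40_000, 38_000, 42_000]

loss = percent_below_prediction(observed, predicted)
print(f"{loss:.0f}% lower than predicted")
```

A negative result would mean that observed footfall was higher than the prediction.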

Value of the research

Initial analysis has already been delivered to Leeds City Council. An aggregated dataset of footfall camera data has been created and is available on the Consumer Data Research Centre (CDRC) Data Store for future research. The initial models can be used and refined by future researchers to develop more accurate predictions, whilst more specialised time series packages can also be explored.


  • Urban footfall and the ambient population were significantly impacted by COVID-19 lockdown policies (as was intended).
  • Closures of indoor entertainment and non-essential retail appear to be the most important lockdown-related factors in predicting footfall change.
  • Consideration must be given to how time series data is processed in classic machine learning models such as Random Forests.

Research theme

Urban analytics


Tom Albone – Data Scientist Intern (LIDA)

Dr Nick Malleson – Professor of Spatial Science

Professor Alison Heppenstall – Professor in Geocomputation

Dr Vikki Houlden – Lecturer in Urban Data Science

Dr Patricia Ternes – Research Fellow


Leeds City Council


Consumer Data Research Centre

What can tweets about contact tracing apps tell us about attitudes towards data sharing for public health? (Part 3)

At the end of my last blog post about Covid-19 apps, I speculated that the UK’s Track and Trace app was unlikely to gain enough public trust and support to be a success. Since that post was published, it was announced that the UK’s app might not be ready until winter, followed by news that the centralised NHSX app has been abandoned for a decentralised alternative developed by Apple/Google.

Many people have reacted to this news on Twitter resulting in a spike of tweets about the Track and Trace app (Figure 1). In this blog post I will present findings from sentiment analysis on these tweets to understand people’s reactions to the new decentralised app and discuss the future of data-sharing post-Covid-19.  

Figure 1: Daily number of tweets about all ‘Covid-19’/‘Coronavirus’ apps from all countries (blue) and only the UK’s ‘Track/Test and Trace’ app (orange), collected 24 April to 16 June 2020 and 17 June to 25 June 2020 respectively.

Holly Clarke

Leeds Institute for Data Analytics

Holly Clarke is an Intern at Leeds Institute for Data Analytics, applying data science solutions to solve complex, real-world challenges. She is working for the LifeInfo project with Michelle Morris, researching attitudes towards novel lifestyle and health data linkages and how access to this information could improve public health. 

The positives and negatives of Covid-19 apps  

This sentiment analysis includes tweets about the UK’s Track and Trace app posted between 17th and 25th June 2020, thereby focusing on recent events. Sentiment analysis identifies the sentiment of tweets by matching the words within them against the common positive and negative words categorised in the “Bing” lexicon, developed by Bing Liu. Overall, this analysis shows that recent tweets about the Track and Trace app contain more negative words than positive ones, indicating that the tweets hold mainly negative content (Figure 2).

Figure 2: Proportion of sentiment words in tweets that are positive and negative for tweets about all Covid-19 apps, collected 24 April to 16 June 2020, and tweets about the UK’s Track/Test and Trace app, collected 16 June to 25 June 2020.  

The nature of these positive and negative words is also very telling. The negative words refer predominantly to the management of the app rather than issues about data privacy and the app itself; “failure”, “incompetence”, “fiasco”, “shambles”, “chaos”, “disaster”, “lying” and “debacle” all feature prominently (see Figure 3).  As a comparison, sentiment analysis on all general tweets from 24 April to 16 June 2020 about Covid-19 apps (Figure 4) shows common negative words to be more technology focused and in line with common concerns about data-sharing – “breach”, “risk”, “concerns”, “issues”.  

Figure 3: 50 most frequently used positive and negative sentiment words used in tweets about the UK’s Track/Test and Trace app, collected 16 June to 25 June 2020.  

The positive words in both datasets of tweets refer to more common topics around data sharing and technology, e.g. “trust”, “protection” and “safe”. This indicates engagement with the topic of data sharing, and a substantial proportion of the sentiment words are positive in both datasets.

As part of the sentiment analysis I have controlled for negation, inverting the positive/negative categorisation if a common negator comes directly before the word (e.g. “not good” or “don’t trust”). In the figures these are shown with the prefix “neg_”. However, linguistic features such as sarcasm, humour and questioning are not easily picked up through sentiment analysis. Some instances of positive words like ‘wow’ or ‘promises’ may also be used in a critical way.
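The lexicon matching and negation handling described above can be sketched in a few lines. This is only an illustration: the six-word lexicon stands in for the full Bing list, the set of negators is an assumption, and the original analysis may have tokenised tweets differently.

```python
import re
from collections import Counter

# Tiny stand-in for the Bing lexicon; the real list has thousands of words.
LEXICON = {
    "trust": "positive", "safe": "positive", "good": "positive",
    "shambles": "negative", "fiasco": "negative", "risk": "negative",
}
# Assumed list of common negators.
NEGATORS = {"not", "no", "never", "don't", "doesn't", "can't", "won't"}
FLIP = {"positive": "negative", "negative": "positive"}


def score_tweet(text: str) -> Counter:
    """Count sentiment words, inverting any that directly follow a negator."""
    tokens = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    counts = Counter()
    # Pair every token with the token immediately before it.
    for previous, word in zip([""] + tokens, tokens):
        sentiment = LEXICON.get(word)
        if sentiment is None:
            continue
        if previous in NEGATORS:
            counts[("neg_" + word, FLIP[sentiment])] += 1
        else:
            counts[(word, sentiment)] += 1
    return counts


def sentiment_share(counts: Counter) -> dict:
    """Return the proportion of sentiment words in each category."""
    totals = Counter()
    for (_, sentiment), n in counts.items():
        totals[sentiment] += n
    total = sum(totals.values())
    return {sentiment: n / total for sentiment, n in totals.items()}


counts = score_tweet("What a shambles. I don’t trust it, and it is not safe.")
print(counts)
print(sentiment_share(counts))
```

In this example “trust” and “safe” are both preceded by a negator, so they are counted as the negative terms “neg_trust” and “neg_safe”. A sarcastic tweet would still be scored at face value, which is the limitation noted above.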

Overall, although both datasets of tweets include more negative than positive words, the recent events around the UK’s Track/Test and Trace app seem to have framed the app more negatively than Covid-19 apps more generally due to “waste” and issues around the development of the app.   

Figure 4: 50 most frequently used positive and negative sentiment words used in tweets about ‘Covid-19’/‘Coronavirus’ apps from all countries, collected 24 April to 16 June 2020.

What will Track and Trace mean for people’s attitudes to data sharing?  

When I began writing this blog series on Covid-19 apps, countries across the world were rapidly launching contact tracing apps to quell the spread of coronavirus through technology, and the UK was poised to trial its app on the Isle of Wight. Two months later, the Track and Trace app journey has certainly not been smooth and the app’s importance has been downgraded from “world beating” to “the cherry on the cake”. But what will this late-stage Apple/Google switch mean for public opinion?

Research on attitudes to data-sharing, as discussed in my last blog post, frequently finds that people’s willingness to share their data is dependent on which actors are involved. People tend to have high trust in the NHS and the lowest trust in private companies. Hence, we might expect the shift from an NHSX app to one involving tech giants Apple and Google to be met with opposition. However, the sentiment analysis indicates conversation about the Track and Trace app mainly focuses on the wastefulness and “shambles” of the switch rather than inherent mistrust in private companies.  

Initial findings from my work with the LifeInfo project, exploring public opinion about linking lifestyle data (e.g. supermarket loyalty card or fitness app data) with health records, may explain this. My analysis highlights that the relationship between data sharing and trust in actors is not as straightforward as might be expected.

Although people generally have high levels of trust in health organisations, respondents repeatedly expressed concerns that their supermarket loyalty card data might be seen by their GP if these data were linked for health research. Many worried their GPs would unfairly judge their diet and lifestyle, and even withhold treatment. Yet, respondents were happy for supermarkets (private companies in which research finds people to have the least trust) to store and use their loyalty card data.  This indicates that attitudes about data sharing are not simply informed by trust in actors but are also influenced by the type of data involved and social norms about how it is currently used.  

In the context of coronavirus apps, this could mean that users are more comfortable with mobile phone providers using data to alert them about exposure to coronavirus than the government or NHS. Many mobile phone users share vast amounts of data with technology companies through everyday use of apps and services which they may be uncomfortable sharing with the government or healthcare providers.  Therefore, a contact tracing system involving Apple and Google, and especially a decentralised one which enables more data privacy, might encourage wider use than the NHSX app.  

The future of data sharing post Covid-19 

The Covid-19 pandemic will undoubtedly create lasting change across many aspects of our lives, including attitudes towards data sharing. The pandemic has led us to consider sharing unprecedented amounts of data; it has also made clear the inadequacies of our medical data sharing systems.

In the context of the LifeInfo study, access to lifestyle data linked to health records could help researchers better understand and prevent diseases such as diabetes, certain cancers and heart disease. The World Health Organization attributes 30% of yearly global deaths to poor diet and physical inactivity, so the challenge is substantial. However, for participants to willingly share their data they must trust organisations to use it safely, responsibly and transparently.

Successful contact tracing apps had the potential to demonstrate that data sharing could help improve health while maintaining personal privacy and data security. Yet technological failings, privacy concerns and government mismanagement in the UK could turn public opinion against data sharing initiatives in the same way that other high-profile failings, such as care.data, did. Above all, the Track and Trace app highlights how detailed consideration of people’s attitudes towards data sharing is vital for initiatives to be successful.