Archives for March 2018

Changing Broadband Speeds in the UK

The CDRC’s Broadband Speed map has proved to be one of our most popular interactive maps to date. In his blog post below, our Research Associate Oliver O’Brien details the nuances behind the map and also highlights key changes between 2016 and 2017.

Broadband Speed – by Oliver O’Brien
The Broadband Speed map is based on data from Ofcom, the UK’s digital connectivity and broadcast media regulator, and I was invited to talk at their Innovation Workshop event, hosted by ODI Leeds, earlier this month. My brief was to demonstrate the Broadband map but also critique Ofcom’s open data offering (which provided the data for the map).

As part of the preparation for the event, I produced a new version of the Broadband map, showing 2017 data from the Connected Nations report (the original was based on the 2016 data). This in turn gave me the opportunity to prepare a third map, showing the change between 2016 and 2017. Note that this shows the change in the average broadband download speed experienced across both business and residential connections: speeds are averaged for each postcode, and the postcode averages are then averaged again across the local output area (which typically contains five postcodes in residential areas, but many more than this in business areas). The population figure displayed when you mouse over each area is therefore the number of business and residential connections – typically 50-150 for the latter.
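The two-stage averaging described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used to build the map: the record layout, the function name `output_area_means` and the sample values are all hypothetical, and it assumes each postcode counts equally in the output-area average, which the description above does not confirm.

```python
from collections import defaultdict
from statistics import mean

def output_area_means(connections):
    """Average speeds per postcode, then average the postcode means per output area.

    `connections` is an iterable of (output_area, postcode, speed_mbit) tuples.
    This layout is an assumption made for illustration.
    """
    by_postcode = defaultdict(list)
    for area, postcode, speed in connections:
        by_postcode[(area, postcode)].append(speed)

    postcode_means_by_area = defaultdict(list)
    for (area, _postcode), speeds in by_postcode.items():
        postcode_means_by_area[area].append(mean(speeds))

    return {area: mean(means) for area, means in postcode_means_by_area.items()}

# Hypothetical sample: three connections in two postcodes of one output area.
sample = [
    ("OA1", "AB1 2CD", 10.0),
    ("OA1", "AB1 2CD", 30.0),
    ("OA1", "AB1 2EF", 100.0),
]
area_means = output_area_means(sample)
flat_mean = mean(speed for _, _, speed in sample)
print(area_means, round(flat_mean, 1))
```

In this sample the postcode means are 20 and 100, so the output-area figure is 60, whereas a flat average over the three connections would be about 46.7. A single fast postcode can therefore move an area's figure a long way.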

The map shows a general light green gradient across the country, showing that broadband connection speeds are gradually increasing as more and more fibre to the cabinet (FTTC) is installed and people organically switch contracts to providers with better service. The places where other colours appear are the interesting results. Large increases are seen in rural Lancashire, near Kendal in the Lake District, as a community-driven ultra-high-speed rural service there continues to roll out. More dramatic improvements are seen just to the east of Cheltenham, again a rural area, with specialist high-technology and defence industries.

Cranham, for example, has seen an 11,000% improvement, from 1.7 Mbit/s to 190 Mbit/s, as new business connections have come online.

Appleton, on the other hand, has seen a 99% decrease, from 540 Mbit/s to 2.3 Mbit/s.
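The percentages for Cranham and Appleton follow from the usual percentage-change formula applied to the speeds quoted above. A minimal sketch (the function name is chosen here for illustration, and the input speeds are the rounded figures from the text, so the results only approximately match the rounded percentages):

```python
def percent_change(old, new):
    """Return the percentage change from old to new."""
    return (new - old) / old * 100.0

cranham = percent_change(1.7, 190.0)    # speeds in Mbit/s, as quoted above
appleton = percent_change(540.0, 2.3)
print(f"Cranham: {cranham:+.0f}%  Appleton: {appleton:+.1f}%")
```

Note the asymmetry: an increase can exceed 100% by any amount, but a decrease can never be larger than 100%.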

In London, the drop around King’s Cross, the previous year’s fastest postcode, is almost certainly not due to a general decrease in available speed, but actually because residential connections have come online, and demonstrates the problem with aggregating by the residentially defined “Output Area” geography. The previous, ultrafast result was likely due to dedicated ultra-high-speed links into Google’s new UK office, and other high-technology businesses opening there. Since then, the residential blocks nearby have opened. These still have pretty nice connections, but not the business-level infrastructure needed. So, it shows as an average fall in London.
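The King’s Cross effect is a change in the mix of connections rather than in any individual connection, and a toy calculation shows how that can happen. The speeds and counts below are entirely hypothetical and are not taken from the Ofcom data; they only illustrate the arithmetic.

```python
from statistics import mean

# Hypothetical speeds in Mbit/s; none of these figures come from the Ofcom data.
business_2016 = [1000.0, 1000.0]      # a few dedicated business links
residential_2017 = [70.0] * 50        # newly opened residential connections

mean_2016 = mean(business_2016)
mean_2017 = mean(business_2016 + residential_2017)
print(mean_2016, round(mean_2017, 1))
```

The business links are exactly as fast in the second year as in the first, yet the area average falls by roughly 90% because fifty slower connections now share the same output area.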

Rotherhithe is always an interesting area:

A traditionally very poorly connected area, in both transport and digital connectivity, it has seen dramatic improvements in many areas, but also big falls in the newest area – again possibly due to an increased residential component in the mix.

Explore the broadband difference interactive map.
Browse Oliver’s Innovation Workshop presentation.

*This blog post has been re-published from http://oobrien.com/.

CDRC host the Annual Roger Tomlinson Prize Lecture

The Consumer Data Research Centre (CDRC) hosted the Annual Roger Tomlinson Prize Lecture at Senate House, University College London (UCL) on 22 March 2018.

Dr Mike Gould, Esri’s Global Education Manager, delivered an engaging talk on ‘Data, Software, and Services: Views from Industry and Academia’ in which he threw open the debate on Geographic Information Systems (GIS) and its use across a range of services.

During the event Mike Gould presented the annual Roger Tomlinson Prize, which was established at UCL by the founder of Esri, Jack Dangermond, and has become an integral feature of the Roger Tomlinson lecture. This year’s prize was awarded to Guy Lansley, Research Associate with the CDRC; an additional prize was presented to Abhinav Mehrotra, Research Associate at UCL.

The event had an excellent turnout of people from across the academic, business and retail sectors. A number of attendees took to social media to share their pleasure at being able to view the original hard copy of Roger Tomlinson’s PhD thesis, submitted in July 1974, with one attendee dubbing it the “first book of GIS”.

Mike Gould (l) presenting CDRC’s Guy Lansley with the Roger Tomlinson Prize 2018

Mike Gould, Global Education Manager for Esri, delivering his talk on ‘Data, Software, and Services: Views from Industry and Academia’

Abhinav Mehrotra (r) holding his Roger Tomlinson Prize; pictured with CDRC Director Prof Paul Longley (l) and colleague Mirco Musolesi

“The first book of GIS” – excerpt from Roger Tomlinson’s PhD thesis submitted in July 1974

The CDRC causal inference summer school is back!

Applications are now open for the much-anticipated Consumer Data Research Centre (CDRC) summer school Causal Inference with Observational Data: challenges and pitfalls (9th-13th July 2018). The school ran in 2017 to huge acclaim from participants, and returns this year with even more content and teaching, as it is now a five-day rather than a four-day school.

Taking place at the Leeds Institute for Data Analytics (LIDA), the summer school comprises state-of-the-art training in the analysis of observational data for causal inference. By exploring the philosophy and utility of directed acyclic graphs (DAGs), participants will learn to recognise and avoid a range of common pitfalls in the analysis of complex causal relationships, including the longitudinal analyses of change, mediation, nonlinearity and statistical interaction.

Specifically, the course will cover the following:

  • Prediction vs causal inference;
  • Advanced use of directed acyclic graphs (DAGs);
  • The role and relevance of covariates in multiple regression;
  • Collider bias in sample selection (including reversal paradox and the Table 2 Fallacy);
  • Conditioning on the outcome (including regression to the mean);
  • Compositional data errors (including mathematical coupling and composite variable bias);
  • Analysis of change and statistical evaluation of longitudinal data;
  • Statistical interaction and model parameterisation issues;
  • Time-varying exposures, time-varying confounding, and G-methods.
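To give a flavour of one item on this list, collider bias in sample selection, here is a small simulation. It is an illustrative sketch only and is not taken from the course materials: the variable names, the selection rule and the sample size are all assumptions. Two variables are generated independently, and a sample is then selected on a condition that depends on both of them.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 20_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]  # generated independently of x

# Selection depends on both variables, so being selected is a collider.
selected = [(a, b) for a, b in zip(x, y) if a + b > 1.0]
sx, sy = zip(*selected)

r_all = pearson(x, y)
r_selected = pearson(sx, sy)
print(f"all: {r_all:+.2f}, selected: {r_selected:+.2f}")
```

In the full sample the correlation is close to zero, as expected for independent variables, while within the selected subset a clear negative correlation appears even though neither variable influences the other.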

The key aims of the summer school are to introduce causal thinking into the statistical modelling of observational data, to debunk some common misconceptions around causal inference using DAGs, and to teach participants how to avoid the common pitfalls of using observational data. Although the training is delivered by three professionals from the School of Medicine, Prof Mark Gilthorpe, Dr Peter Tennant and Dr George Ellison, applications are not limited to those with a research background in health or medical sciences but are open to all social science disciplines.

Last year’s participants gave glowing feedback on the course, saying that it was well structured, and a game-changer in terms of their thought-processes when they approach their research now. Below are some of the comments following the 2017 summer school:

 

“The course was a paradigm shift for me in terms of thinking around causal inference and gave me the tools to think about some important pitfalls in analysis that I would have otherwise missed.”

“I found the discussion of how to construct DAGs and the different causes of bias (e.g. regression to the mean, numerical coupling etc.) the most interesting.”

“It was great to think about my own research in a different, more critical way.”

“The tutors were all extremely knowledgeable, approachable, and oozed enthusiasm for the subject!”

“I would not have had any formal training in causal analysis if it were not for this course, it has made me aware of many issues which I will now be alerted to.”

“Now I’ll think more causally about my analysis and hypotheses, and will definitely use things like DAGs to clarify my thinking and analysis strategy.”

 

Fees and how you can apply

Full details of the summer school can be found here.

Fees are as follows:

£295 (postgraduate students)*

£595 (researchers, academics, public and charitable sector)*

Places for the summer school are limited to 25 so that tutors have good contact time with each participant. These places are expected to fill fast, so apply now using this application form to avoid disappointment. Your application will be reviewed within 1 week and, if you are successful, we will then send you a booking link to pay for your place.

 

*Fees include tuition, refreshments and lunch for the 5 days; accommodation, breakfast and travel are not included.

Tableau Workshop: opinion from Dr Phani Kumar Chintakayala

After the 22nd February Tableau Workshop, hosted by the CDRC at Leeds, we invited Dr Phani Kumar Chintakayala to share his perspective on the training and Tableau version 10.3 …

“It was back in 2015 that I first heard about Tableau visualization software – it was version 9.0 at this point, and I tried out bits and pieces but didn’t explore much further. Then in February 2018, I attended a 1-day workshop on Tableau 10.3 hosted by the Consumer Data Research Centre (CDRC) in the Leeds Institute for Data Analytics (LIDA). Even before the start of the workshop, as soon as I opened the software, I realised that the latest version of Tableau has come a long way and is far superior to the version I tried back in 2015.

The workshop started off with a couple of introductory sessions, the first of which was delivered by Prof. Roy Ruddle of the School of Computing, who is perhaps best known for his research into novel interactive visualization techniques. He highlighted how visualization can act as a tool to better understand data, especially in the current era of Big Data. The second session was delivered by representatives from Tableau, Thierry Driver and Archana Ganeshalingam, who demonstrated and took us through some examples of how Tableau is used by a range of people from researchers to professionals, for visualizing interesting inputs/outputs.

After the introductory sessions we were let loose on Tableau 10.3 to gain hands-on experience with its various features. It was quickly apparent how easy it is to use Tableau. We were given around eight carefully tailored challenges to complete, using real open data in conjunction with Tableau. Each challenge built on the last and we were required to use a range of different Tableau features, putting into practice the training gained in the morning session. Taken together, these challenges really helped us get to know the various tools and features of the software.

As with many other latest software versions, data-loading in Tableau is as easy as simple drag and drop. Tableau supports basic analytics and provides various means of visualizing data. Some features of the software are very versatile and feel unique: for example its ‘dashboard’ feature which allows users to bring a number of plots (generated using the same data) into one window and enables you to link them in the window. This means you can gain better insight into how a change in one aspect will affect the others.

While doing sentiment analysis on a sensitive topic using Twitter data, I used Tableau to visualize the origin of tweets on the issue. I plotted the data on a Tableau map, with the size of each blob representing the volume of tweets originating from a city or town. Below is the amateur map that I managed to develop. Through visualization in Tableau, I was pleased to find I could easily identify the cities that reacted to the issue and the volume of tweets generated – much better and more impactful than presenting the same information in, for example, a table.

Map of India with certain cities and towns highlighted to reflect Twitter data concentrations

 

All in all, the CDRC Tableau Workshop was very useful as it introduced several features of Tableau that will allow me to visualize data in order to better understand it and its implications before proceeding to analyse it. This is a very useful tool, especially when dealing with Big Data/secondary data where understanding the data and their context is very important. Although I have not tried any other visualization software, I believe Tableau is a helpful tool for any analyst wanting to visualize data and thereby gain a better understanding of data and their context.”

Dr Phani Kumar Chintakayala is Senior Research Fellow in the Business School and Leeds Institute for Data Analytics at the University of Leeds. His primary research interest is Behavioural Economics covering consumer behaviour, econometric modelling, sustainability etc. 

We are planning to run this workshop again in Spring 2019. Please register your interest for this course by emailing Kylie Norman.

SmartStreetSensor Footfall Atlas explained

We recently launched our SmartStreetSensor Footfall Atlas dataset and it has proved extremely popular. Read on to find out why.

The Footfall Atlas is a derived product based on the Consumer Data Research Centre (CDRC)-Local Data Company (LDC) SmartStreetSensor Footfall Data. It contains information about every sensor installed by the LDC across the UK between July 2015 and December 2017, along with relevant metadata for the network of sensors.

This unique dataset provides a detailed insight into the hourly flows of people around many different types of retail locations and – unlike the Google popular times feature – these data show actual counts on the y-axis.

With these data users can draw from different fields and produce a wide variety of products, such as retail areas classification, comparisons of different footfall signals across different seasons (summer, winter), different types of retailers (restaurants, charities, mobile shops) and even different types of footfall at various points during the day (morning rush hour, lunch hour, night economy).

For detailed information about the CDRC-LDC SmartStreetSensor project, click here.

The SmartStreetSensor Footfall Atlas dataset can be located here.

For more information on how to apply to access this dataset for use in your own research go to ‘using our data services‘.

We launched the SmartStreetSensor Footfall Atlas dataset via the CDRC newsletter. To subscribe, email info@cdrc.ac.uk.

Leeds Digital Festival 2018

The Leeds Digital Festival is a multi-venue, city-wide festival celebrating digital culture in all its forms, and the CDRC are pleased to be a part of it once again this year. We will be hosting a number of events throughout the two weeks (16-27 April), including the following training courses:

 

Introduction to R – 16th April, 13.00-16.00

R is an increasingly popular open source programming language and can be used in many different ways to manipulate and tidy data. In this half-day course provided by the Consumer Data Research Centre (CDRC) in Leeds, you will have the opportunity to work within the R ‘ecosystem’ to import, clean, manipulate and visualize real world data. No prior experience with R required.

Find out more


Building Simple Smartphone Apps Without Coding (same course running on 19th and 23rd April), 9.00-12.30

Ever wanted to learn more about how mobile applications can be built? Do you have questions about how mobile apps can gather the data you need? Hosted by the Consumer Data Research Centre, this course, presented by Dr Chris Birchall from the University of Leeds School of Media & Communication, will take an introductory look at some of the alternative routes that exist to create mobile content without pre-existing coding skills. Participants will be able to experiment with ways to create the mobile functionality that they need, working with software such as MIT AppInventor and tools such as PhoneGap. No prior experience in digital creation is necessary.

Find out more


R for Transport Applications: Handling Big Data in a Spatial World – 26-27 April 

Delivered by Dr Robin Lovelace of the Leeds Institute for Transport Studies, this two-day course teaches two skill-sets that are fundamental in modern transport research: programming and data analytics, with a focus on spatial data. Combining these enables powerful transport planning and analysis workflows for tackling a wide range of problems, including how to handle large transport datasets effectively and where to locate new transport infrastructure. The first day will focus on how the R language works, general concepts in efficient R programming, and spatial and non-spatial data classes in R; the second day will cover its application to geographical transport datasets.

Find out more

It’s not just us, though: CDRC friends and partners across the city are hosting a range of events which may also be of interest to you:

ODI Leeds – the team at ODI Leeds will be starting the week with the ODI Leeds Showcase, hosting their data science training, and ending the festival with the ODI Leeds Open House. You can find out more about their full programme of events on the ODI Leeds website.

KX Systems – How data & AI are enabling the real-time retail revolution

Trinity McQueen – Research in the tech/digital space and Measuring & defining consumer trust in a changing digital landscape