
‘Tidying’ up Brexit and Trump

2016 was an eventful year. The narrow votes in favour of Brexit in the UK and Trump in the US came as a shock to many. You’ve probably heard commentators speculate about the underlying causes: why did people vote as they did? A familiar caricature pits blue-collar disaffection (Leave and Trump) against liberal, metropolitan values and relative affluence (Remain and Clinton). But is this true of the entirety of the UK and US?

The CDRC is pleased to launch a new training short course at Leeds this year, “Explaining Brexit and Trump with Tidy data graphics”, to tantalise the analytical imaginations of researchers interested in using data to examine the influences on human decision-making within the social sciences. Taking place on the 2nd May, this course will be delivered by Dr Roger Beecham, who will lead an exploration of the story told by the data behind the EU referendum vote and American presidential election. You’ll learn how to develop a family of data graphics (in R), each of which will reveal a bit more of the data puzzle behind the UK referendum and US election results.

A lot of theories have been ventured in popular media as to why both votes went the way they did – some ‘clickbait’, others more intriguing – but in this one-day course you’ll have the opportunity to explore the hard data behind the votes; to look at these data in the specific contexts of socio-demographic variables; and to evaluate area-level variation in the votes. The course is designed to lead you toward more data-grounded answers to the questions of what happened in 2016 and why people voted the way they did.

In addition to understanding a little more about the political phenomena, you will:

  • learn how to wrangle, reshape and curate tidy data in R
  • appreciate some key principles of good data visualisation design
  • confidently generate data graphics using a consistent vocabulary (ggplot2) – see the short sketch below
  • develop an intuitive understanding of statistical modelling procedures
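
To give a flavour of the approach, here is a minimal sketch of the tidy-data workflow, using the tidyr, dplyr and ggplot2 packages; the areas and vote shares below are invented purely for illustration.

    library(tidyr)
    library(dplyr)
    library(ggplot2)

    # A small 'wide' table: one row per area, one column per vote share.
    # These figures are made up for illustration only.
    results <- data.frame(
      area   = c("A", "B", "C"),
      leave  = c(0.58, 0.41, 0.52),
      remain = c(0.42, 0.59, 0.48)
    )

    # Reshape to tidy (long) form: one observation per row
    tidy_results <- results %>%
      gather(key = "vote", value = "share", leave, remain)

    # Plot using ggplot2's consistent grammar
    ggplot(tidy_results, aes(x = area, y = share, fill = vote)) +
      geom_col(position = "dodge")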


Sound like fun? You can find out more and book on the course here.

Want to take part in a Data Challenge on the subject of Brexit? The CDRC is currently challenging those planning to attend the GISRUK 2018 conference to use CDRC datasets to submit responses to the theory put forward in The Economist article, “The immigration paradox: Explaining the Brexit vote” – find out more here.

ArcGIS and R – Uneasy Bedfellows?

We’ve been seeing some interesting trends developing in researchers’ preferences for certain programming software over others.

When it comes to spatial modelling, building maps and visualising data, ArcGIS has long been considered the front-runner. Its intuitive ‘point and click’ interface means that you can build professional-looking outputs from scratch without the need for any programming experience.

Some of the stand-out merits of ArcGIS include:

  • Its ability to handle a large variety of spatial and non-spatial data formats
  • The layout view which allows creation of professional outputs
  • The ‘Add data’ button which recognises tables, rasters and all GIS formats
  • It’s a great tool for easy visualisation
  • The table joins in ArcGIS are intuitive, enabling the linking of spatial and other data
  • It handles Coordinate Reference Systems in a user friendly way
  • It gives you access to ArcGIS Online (AGOL): a great resource for sourcing a whole range of GIS datasets
  • A basic licence in ArcGIS still gives you access to a large number of tools
  • It has a variety of useful purchasable plug-ins, such as Crime Analyst
  • You’re a shoo-in for creating beautiful maps: titles and legends are far easier to add
  • The ArcMap splash screen (i.e. the start-up screen) displays all your recent documents

Of course, it is subscription software, and access to full ArcGIS features is determined by the level of your licence. Although the subscription fee can be expensive, Esri offers award-winning support services and extensive help documentation. At the end of the day, its map-building and visualisation capabilities are vast and user-friendly.

But is there another contender in the race for slick visualisation and professional outputs?

R has been called “a statistical powerhouse programming language”* that combines integrated processing, analysis and modelling in a single framework. It’s perfect for the researcher, as all outputs are easily reproducible, and it is seemingly the most popular tool for data mining in business and academia. However, it has a steep learning curve: its command-driven set-up may not be intuitive for non-programmers, though this is immeasurably helped by the use of RStudio.

Some of the stand-out merits of R include:

  • It’s open source and therefore supported by a vast online community of users happy to share their wisdom for free
  • It’s a superior data analysis tool as it is able to handle large amounts of data
  • Its base package includes all standard statistical tests, models and analyses
  • It’s versatile in allowing you to manipulate data
  • It’s designed to be used with spatial and statistical data
  • You can use R to solve complex data science, machine learning and statistical problems
  • Geographically weighted regression and spatial interaction models can be custom built around your spatial data in R

The key difference between R and ArcGIS when it comes to spatial visualisation and mapping, though, is that R is command-driven: visualisations can only be created and edited by changing code. R has an extensive graphics library, but creating a professional output can be very time-consuming for a beginner. Once you get the hang of it this is great, but it is not necessarily as intuitive as ArcGIS, and there’s no dynamic canvas with which to pan and zoom. On the other hand, it’s free!
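
As an illustration of this command-driven approach, here is a minimal sketch of building a choropleth map in R, assuming the sf and ggplot2 packages are installed; the file path and the population field are hypothetical.

    library(sf)
    library(ggplot2)

    # Read a boundary file; this path is purely illustrative
    wards <- st_read("data/leeds_wards.shp")

    # Every element of the map (fill variable, palette, titles) is
    # specified in code rather than clicked into place
    ggplot(wards) +
      geom_sf(aes(fill = population)) +   # 'population' is a hypothetical column
      scale_fill_viridis_c(name = "Population") +
      labs(title = "Population by ward") +
      theme_minimal()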

Of course, you can combine R with ArcGIS and get the best of both worlds to become an absolute spatial wizard! There is now the Esri R-ArcGIS bridge, which increases the capabilities of analyses across different disciplines. In practical terms, it allows you to transfer your data between ArcGIS and R without losing any functionality or formatting. Learn more here. Or you might prefer to use QGIS as an alternative to ArcGIS – learn more about the relative merits of ArcGIS and QGIS here.
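
As a rough sketch of how the bridge works in practice: the arcgisbinding package that Esri distributes lets an R session read from and write to ArcGIS data sources. The geodatabase path and field names below are invented for illustration.

    library(arcgisbinding)
    arc.check_product()   # binds the R session to the local ArcGIS licence

    # Pull a feature class from a geodatabase into an R data frame
    fc <- arc.open("C:/data/city.gdb/wards")   # hypothetical path
    df <- arc.select(fc, fields = c("ward_id", "population"))

    # ...analyse or model in R as usual, then write the results back...
    df$log_pop <- log(df$population)
    arc.write("C:/data/city.gdb/wards_analysed", df)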

Co-authored by Rachel Oldroyd


Want to learn more about ArcGIS? Book on our training short course on 19th March with expert Rachel Oldroyd to find out more: https://www.cdrc.ac.uk/events/19725/

Want to learn more about R? Book on our training short course on 16th April with expert Richard Hodgett to find out more: https://www.cdrc.ac.uk/events/introduction-to-r-2/

Continue the debate with us! Let us know what you think @CDRC with the hashtag #ArcGISvsR


*citation: https://blogs.esri.com/esri/arcgis/2016/07/21/put-the-r-in-arcgis-2/

Data-Sprint 2018

The Consumer Data Research Centre (CDRC) is proud to announce that we will be supporting the Financial Conduct Authority (FCA) in an upcoming Data-Sprint event. Spread across two days, 20th–21st March 2018, the Data-Sprint will see teams of individuals with varying skill sets generate innovative solutions for tackling organisational challenges. Data for the event will be provided by the FCA through the CDRC data platform and made available to participants the week before.

The Data-Sprint will focus on cases relating to the recent Financial Lives survey. For the published report, questionnaire and data tables, click here.

The FCA has collated an extraordinary volume of consumer information and will experiment with it in order to present findings that are insightful and/or creative. Some of the key questions to be tackled during the event are likely to be:

  • What tools can the FCA use to best visualise these data and make the survey findings both insightful and easy to use?
  • How can the FCA create insights to inform specific organisational decisions? This part of the Data-Sprint will focus on the wealth of data the survey provides on the financial products consumers hold.
  • Can the Financial Lives data be linked with, or reported alongside, other data to create enhanced insight? These other data could be FCA proprietary data, other proprietary data or data in the public domain.

Who can attend?
The FCA would like to invite you along to participate in the Data-Sprint. This will be a great opportunity to help shape the way the FCA thinks about data and collaborates with external experts.

How to apply?
If you’d like to attend, please email Ed Towers and Jimmy Galloway at theanalyticscommunity@fca.org.uk by 3rd March 2018, including a brief description of your current role, your technical capability and what it is about the Data-Sprint that appeals to you.
Places are limited and will be allocated based on skills and availability.

For further information about the event click here.

WICI Specialist Conference: Call for Abstracts Currently Open

The Waterloo Institute for Complexity & Innovation (WICI) has announced the theme of this year’s Specialist Conference: “Modelling complex urban environments”. The conference takes place on the 21st-22nd July 2018 at the University of Waterloo (Ontario, Canada) with the intention of bringing together researchers from multiple disciplines with experience and interest in modelling complex environments, from smart cities to urban planning.

The call for abstracts is now open with a deadline of 1st March 2018 (see below for specific guidelines).

One theme may be of particular interest, organised and led by Alison Heppenstall of the Leeds Institute for Data Analytics:

Integrating “big data” and “smart cities” data with urban modelling

This theme is designed to interrogate the role that increasingly available, so-called ‘big’ data has to play in altering the ways in which spatial, statistical and geographical analysis is conducted. It will look at the new methodologies fostered by the arrival of big data for better understanding how urban systems and infrastructure behave, with an emphasis on how these methodologies may be useful in the drive towards sustainability and efficiency. The goal of the session is to bring together research on the topic in order to produce a journal issue.

Within this theme, WICI are particularly interested in papers that engage with the following:

  • Integrating urban analytics and agent/individual-based modelling
  • Machine learning for urban analytics
  • Innovations in consumer data analytics for understanding urban systems
  • Real-time model calibration and data assimilation
  • Spatio-temporal data analysis
  • New data, case studies, demonstrators, and tools for urban analytics
  • Geographic data mining and visualisation
  • Frequentist and Bayesian approaches to modelling cities

Paper proposals should include:

  • a short but descriptive title
  • a list of all contributing authors and their affiliations
  • an abstract of no more than 250 words
  • a list of 3-5 keywords
  • and an identification of the theme to which the proposal is submitted, if applicable.

Session proposals should include:

  • a short but descriptive title
  • a session abstract of no more than 250 words
  • a list of organizers and their affiliations
  • 3-5 keywords
  • and a list of potential paper contributions, following the format from above.

All proposals should be directed to Noelle Hakim (noelle.valeriotehakim@uwaterloo.ca). For full details on the conference and all proposed themes, see here.

Corporate Responsibility Research Conference: Call for Abstracts Currently Open

The 13th annual Corporate Responsibility Research Conference takes place on the 11th-12th September 2018 with the theme “Engaging Business and Consumers for Sustainable Change”. The conference is being hosted by the Sustainability Research Institute (SRI) and the Business and Organisations for Sustainable Societies (BOSS) research group at the University of Leeds, and promises a dynamic environment in which to experiment with new ideas, test theories and challenge perceived norms in the fields of corporate responsibility and research ethics.

The call for abstracts is now open with a deadline of 30th April 2018 (see below for specific guidelines).

With this year’s focus on sustainability and change, the conference especially encourages papers that challenge the status quo in corporate attitudes towards sustainability, and it poses the question: how do we get business and consumers truly engaged in addressing the grand challenges of the present ecological and sociocultural crisis? Two sub-themes, chaired by researchers from our very own Leeds Institute for Data Analytics, may be of particular interest:

From the CRRC:

“The conference is looking for theoretically informed and practically relevant papers on business and consumer involvement for sustainable change. It welcomes contributions from different disciplines and fields of study, including literatures on corporate responsibility, corporate sustainability, sustainable consumption, sustainable development, business and society, business ethics, ethical consumption, sustainable entrepreneurship, and organisation and the environment.”

The requirements for initial abstracts are as follows:

  • Files should be sent in MS Word format, and the file name should be the first author’s surname. Please include the names, affiliations and contact details of all authors.
  • Please use a maximum of 500 words, answering the following questions:
      • Research Question: What is the research question that the submission aims to answer?
      • Theoretical Framework: What are the main concepts, models or theories used in the paper? Include 3-4 central references.
      • Method: Which method is used for the research work?
      • Findings: What are the main outcomes and results of the paper?
      • Which sub-theme is your paper aimed at, or is it for the open call?
  • Abstracts should be emailed to abstracts@crrconference.org
Abstracts will be reviewed and selected by the scientific committee of the conference, and authors will be notified of acceptance by 15th May, when conference registration opens.
For details on conference costs, see here.

Smart cities need to be more human, so we’re creating Sims-style virtual worlds

Nick Malleson, University of Leeds and Alison Heppenstall, University of Leeds

Huge numbers of networked sensors have appeared in cities across the world in recent years. These include cameras and sensors that count passers-by, devices that sense air quality, traffic flow detectors, and even beehive monitors. There are also large amounts of information about how people use cities on social media services such as Twitter and Foursquare.

Citizens are even making their own sensors – often using smartphones – to monitor their environment and share the information with others; for example, crowd-sourced noise pollution maps are becoming popular. All this information can be used by city leaders to create policies with the aim of making cities “smarter” and more sustainable.

But these data only tell half the story. While sensors can provide a rich picture of the physical city, they don’t tell us much about the social city: how people move around and use the spaces, what they think about their cities, why they prefer some areas over others, and so on. For instance, while sensors can collect data from travel cards to measure how many people travel into a city every day, they cannot reveal the purpose of their trip, or their experience of the city.

With a better understanding of both social and physical data, researchers could begin to answer tough questions about why some communities end up segregated, how areas become deprived, and where traffic congestion is likely to occur.

Difficult questions

Determining how and why such patterns will emerge is extremely difficult. Traffic congestion happens as a result of personal decisions about how to get from A to B, based on factors such as your stage of life, your distance from the workplace, school or shops, your level of income, your knowledge of the roads and so on.

Congestion can build locally at pinch points, placing certain sections of the city’s transport networks under severe strain. This can lead to high levels of air pollution, which in turn has a severe impact on the health of the population. For city leaders, the big question is which actions – imposing congestion charges, pedestrianising areas or improving local infrastructure – would lead to the biggest improvements in both congestion and public health.

Image: We know where – but why? (worldoflard/Flickr, CC BY-NC)

The irony is, although modern technology has the power to collect vast amounts of data, it doesn’t always provide the means to analyse it. This means that scientists don’t have the tools they need to understand how different factors influence the way cities function and grow. Here, the technique of agent-based modelling could come to the rescue.

The simulated city

Agent-based modelling is a type of computer simulation, which models the behaviour of individual people as they move around and interact inside a virtual world. An agent-based model of a city could include virtual commuters, pedestrians, taxi drivers, shoppers and so on. Each of these individuals has their own characteristics and “rules”, programmed by researchers, based on theories and data about how people behave.
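
To make the idea concrete, here is a deliberately simple, hypothetical sketch of an agent-based model in R: a set of “commuter” agents, each with its own speed, moves step by step towards a shared destination, and clustering emerges from the individual rules. Real urban models are far richer than this.

    set.seed(42)

    n_agents <- 100
    agents <- data.frame(
      x     = runif(n_agents, 0, 10),     # starting positions
      y     = runif(n_agents, 0, 10),
      speed = runif(n_agents, 0.1, 0.5)   # each agent has its own characteristics
    )
    centre <- c(x = 5, y = 5)             # a shared destination

    for (step in 1:50) {
      # Rule: each agent moves a fraction of the way towards the centre,
      # with some random noise representing individual variation
      agents$x <- agents$x + agents$speed * (centre["x"] - agents$x) + rnorm(n_agents, 0, 0.05)
      agents$y <- agents$y + agents$speed * (centre["y"] - agents$y) + rnorm(n_agents, 0, 0.05)
    }

    # An emergent outcome: the agents cluster ("congest") around the centre
    plot(agents$x, agents$y, main = "Agent positions after 50 steps")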

After combining vast urban datasets with an agent-based model of people, scientists will have the capacity to tweak and re-run the model until they detect the phenomena they want to study – whether it’s traffic jams or social segregation. When they eventually get the model right, they’ll be able to look back on the characteristics and rules of their virtual citizens to better understand why some of these problems emerge, and hopefully begin to find ways to resolve them.

For example, scientists might use urban data in an agent-based model to better understand the characteristics of the people who contribute to traffic jams – where they have come from, why they are travelling, what other modes of transport they might be willing to take. From there, they might be able to identify some effective ways of encouraging people to take different routes or modes of transport.

Seeing the future

If the model works well in the present, it might also be able to produce short-term forecasts. This would allow scientists to develop ways of reacting to changes in cities in real time. Using live urban data to simulate the city in real time could help to inform the managers of key services during periods of major disruption, such as severe weather, infrastructure failure or evacuation.

Using real-time data adds another layer of complexity. But fortunately, other scientific disciplines have also been making advances in this area. Over decades, the field of meteorology has developed cutting-edge mathematical methods that allow weather and climate models to respond to new observations as they arrive in real time.
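
As a toy illustration of the underlying idea, the sketch below “nudges” a model state part-way towards each new observation as it arrives. This is a crude stand-in for what operational data assimilation methods (such as ensemble Kalman filters) do with far more sophistication, and all of the numbers are invented.

    # A toy nudging scheme: pull the model towards each observation
    model_state <- 20.0                 # e.g. a modelled pedestrian count
    alpha       <- 0.3                  # how strongly observations correct the model

    observations <- c(24, 23, 27, 25)   # invented sensor readings arriving over time
    for (obs in observations) {
      model_state <- model_state + alpha * (obs - model_state)
      cat("observation:", obs, "-> updated model state:", round(model_state, 2), "\n")
    }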

There’s a lot more work to be done before these methods from meteorology can be adapted to work for agent-based models of cities. But if they’re successful, these advancements will allow scientists to build city simulations which are driven by people – and not just the data they produce.

Nick Malleson, Associate Professor of Geographical Information Systems, University of Leeds and Alison Heppenstall, Professor in Geocomputation, University of Leeds

This article was originally published on The Conversation. Read the original article.


The University of Leeds currently has a number of PhD opportunities, in collaboration with The Alan Turing Institute, which are supervised by Prof Alison Heppenstall.

Project 1: Understanding the inner-workings of city-level agent-based models

Project 2: Uncovering hidden patterns and processes in social systems

Find out more and apply online.