Hiscox University Data Challenge

Lawrence Ning Lu, CDRC PhD Student, is leading a team from the University of Leeds in the Hiscox University Data Challenge.

The ‘Hiscox University Data Challenge’ sees three teams from LSE, York and Leeds Universities compete to solve real-world problems, giving Hiscox the opportunity to tap into creative and analytical minds with a different viewpoint to those currently working in the industry. The students, in turn, gain experience of real-world problem solving, access to industry experts, networking opportunities with current graduates on the scheme and potential Hiscox sponsorship.

The Challenges

The team have completed the first challenge, which saw them modelling the causes of railroad accidents in America and identifying factors that may increase liability.

They have now moved on to the second challenge, which asks them to consider which factors contribute to the success of a start-up company. For instance, is there a ‘Silicon Valley effect’? Does the age of the CEO influence a start-up’s success and, if so, how?

The Experience

The team, which includes four students from our MSc Consumer Analytics and Marketing Strategy, recently visited the Hiscox offices in York to present the results of their first challenge to a senior team, and also had the opportunity to network with the underwriters and analysts.

It’s not all hard work, though: there was time for team bonding at Five Guys too.

Good luck to the team, we look forward to seeing their progress in the competition.

To find out more about Leeds Data Science Society, visit their website or follow them on Twitter.

CDRC’s Deputy Director James Cheshire wins National Travel Publishing Prize

CDRC’s Deputy Director James Cheshire has won the London Book Fair Innovation in Travel Publishing Prize in the 2016 Edward Stanford Travel Writing Awards for his publication ‘Where the Animals Go’, co-written with Oliver Uberti.

The book demonstrates how data and technology have changed our understanding of animal movement, and showcases the latest research and data visualisation techniques.

The scientific content, and the focus on animals rather than humans, make the book very different from what readers may think of as travel writing, so it was a real surprise to James and Oliver that they were nominated for the prize in a shortlist of six. Well done, James!

Commenting on the winners, Tony Maher, Managing Director of Edward Stanford Limited, writes:

“As the world grows smaller and in many cases more dangerous, travel writing in all its forms keeps us in touch with our global family. These disparate shortlists have one unifying feature – they are all marvellous examples of what travel writing and publishing does best, which is to show the reader a world far from our own doorsteps, made reachable by these glorious, powerful and unforgettable books.”

CDRC’s Dr James Cheshire will be speaking at Data Science Festival

Dr James Cheshire will be speaking about Spatial Analysis at this year’s Data Science Festival on 29th April.

The Data Science Festival is a free, week-long annual celebration of all things data science. The festival consists of lectures, workshops, demos, code sprints, panel discussions and social events spread across London, and runs from 24 to 30 April 2017.

Further details here: http://www.datasciencefestival.com/speaker/london/2017/james-cheshire/

The Meetup page: https://www.meetup.com/Data-Science-Festival-London/events/237959852/

UCL will be hosting the next London R workshop and networking event

The timings for the 28 March event will be:

  • 2.30pm – 3pm – Registration of workshop attendees
  • 3pm – 5.30pm – Workshop in DLT
  • 5.30pm – 6.20pm – Registration for evening meeting and arrival networking drinks in rooms B05 and B15
  • 6.20pm – 8.00pm – Presentations in DLT
  • 8.00pm – 9.30pm – Networking drinks in rooms B05 and B15

This workshop is free but spaces are limited; please request a place in advance via email to londonR@mango-solutions.com

Shop Direct deliver guest lecture on MSc Consumer Analytics and Marketing Strategy

Students on our MSc in Consumer Analytics and Marketing Strategy were delighted to receive an interactive and informative guest lecture delivered by three speakers from Shop Direct. Shop Direct are the second-largest pure-play digital/e-commerce retailer in the UK and are the company behind the Very.co.uk and Littlewoods.com brands. We were joined by Tony Birch (Business Modelling Manager) and Nicola Dunford (Data Scientist) from their data science and analytics functions, and Louise Utton (Talent Partner) from their HR team. They introduced Shop Direct and their target customer demographic, ‘Miss Very’.

Tony and Nicola gave an excellent overview of the analytics, data science and modelling functions across their business and brands. They gave a clear distinction between day-to-day analytics and reporting versus larger scale innovative data mining and model building. They outlined the use of self-service systems to enable colleagues from across the business to access routine data (e.g. product sales) and gave specific examples of some of their larger model building projects, particularly in relation to assessing their marketing campaigns. They also discussed some of their novel ‘user experience’ and lab testing, all carried out in-house.

Louise introduced students to the working environment at their ‘Skyways House’ Head Office in Liverpool, including their new training and wellbeing venue ‘The Cube’, and discussed the MSc Dissertation Projects that Shop Direct are offering as part of the CDRC Masters Research Dissertation Programme. Students commented very positively on the usefulness of the session, and a number of students have been in further contact with the team in relation to MSc projects and future careers.

Guest lectures are an exceptionally important part of our MSc in Consumer Analytics and Marketing Strategy, giving students direct exposure to the application of their skills in a commercial setting. The interactive nature of this lecture enabled students to have a direct dialogue with managers and data scientists at Shop Direct. We very much look forward to welcoming Shop Direct back for further guest lectures in the future and hope that a number of our students will explore further opportunities to work with Shop Direct.

Webinar: Bringing new forms of data to the study of cities

This introductory webinar is for anyone with an interest in computers, cities and data. Participants learn about the explosion of new sources of data and about the changes currently taking place in cities, as well as the main opportunities and challenges these represent. Special attention is paid to the need to re-think how we approach data in this new landscape to be able to reap the benefits without running into (already solved) problems.

Download Presentation

Tableau Course Review – Nick Malleson

We will be running this course again on Thursday 22 February 2018 – find out more and book online.

I recently attended a 1-day course to learn how to use Tableau visualisation software, hosted by the Consumer Data Research Centre (CDRC) in the Leeds Institute for Data Analytics (LIDA).

On its website, Tableau says that it “helps the world’s largest organizations unleash the power of their most valuable assets: their data and their people”. The shorter version of that is, basically, that Tableau is software to visualise and analyse data. And mostly to visualise at that (for serious analysis you’re probably going to use something else like R or Python). But as for using Tableau as a data visualisation tool, I was very impressed!

The course took us through some examples of how to use Tableau for some increasingly difficult problems. These were interesting and a good way to get the hang of using the software, but I spent most of the time using it on some other data that I’m interested in at the moment. In particular, Leeds City Council have released a load of footfall data from a few cameras that they have dotted around Leeds, and I was interested in trying to look at the flows of people around the city.

It was easy to load the camera data in (just by dragging) and to link it to the camera locations that I had stored as a separate file. Tableau works out which columns represent coordinates and then lets you map the data. The screenshot below shows the camera locations, with the colour and size of the dots determined by the total footfall over the whole time period. The map is pretty rubbish at that scale, as it is designed for regional or national mapping, but you can link to MapBox, which will give you full control over the basemap. I didn’t do this, but imagine that it is a very useful feature (MapBox is great).

I then began to explore the changes in footfall over time, and this is when I was most impressed. Tableau parsed the time data properly (i.e. by not confusing dates and times for something like text), which was nice, but more importantly it made it incredibly easy to either look at trends over time (e.g. footfall per week over the last few years) or to aggregate to specific times (e.g. total counts on Mondays, Tuesdays, etc.). The figure below shows two examples of this. OK, you could do this with lots of other tools, but I was very impressed at how easy it was. There is also a ‘dashboard’ function that lets you combine plots and make images.

Visualising data by different time periods, and creating nice outputs, was really easy.
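For readers without Tableau, the two kinds of time aggregation described above (a trend per week, and totals by day of the week) can be sketched in a few lines of Python. This is only a minimal illustration: the records, camera names and counts below are made up, and the field layout is an assumption rather than the format of the Leeds City Council footfall data.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical hourly records: (timestamp, camera name, people counted).
records = [
    ("2017-03-06 09:00", "Camera A", 120),  # a Monday
    ("2017-03-06 10:00", "Camera A", 150),  # same Monday
    ("2017-03-07 09:00", "Camera A", 90),   # Tuesday
    ("2017-03-13 09:00", "Camera B", 200),  # Monday of the following week
]


def total_by(rows, key):
    """Sum the counts, grouping each row by key(parsed timestamp)."""
    totals = defaultdict(int)
    for stamp, _camera, count in rows:
        # Parse the text into a real datetime so it is not treated as a string.
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        totals[key(when)] += count
    return dict(totals)


# Trend over time: total footfall per ISO (year, week number).
weekly = total_by(records, lambda d: tuple(d.isocalendar())[:2])

# Aggregation to specific times: total footfall per day of the week.
by_weekday = total_by(records, lambda d: DAYS[d.weekday()])

print(weekly)
print(by_weekday)
```

The only difference between the two views is the grouping key, which is roughly what dragging a different date part onto a Tableau chart does.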

To summarise: I was very impressed with Tableau as a data visualisation tool. The one-day course is probably overkill for people who are fairly confident with modelling/visualisation tools already as it was generally pretty easy to use. But it was still nice to have a day messing around with some data. I don’t know what the price of Tableau is – as a lecturer I am lucky enough to have been given a free license – but if it is affordable then I would certainly recommend it as software to quickly do some useful visualisations of data.

Dr Nick Malleson is an Associate Professor in Geographical Information Science and a member of the Centre for Spatial Analysis and Policy (CSAP) at the University of Leeds. His primary research interest is in developing spatial computer models of social phenomena with a particular focus on crime simulation.

Using novel types of data to detect illness caused by contaminated food or drink

CDRC PhD Student Rachel Oldroyd is one of the UK Data Service Data Impact Fellows. Rachel is a quantitative human geographer based at the Consumer Data Research Centre (CDRC) at the University of Leeds, and here discusses how novel types of data are used to detect illness caused by contaminated food or drink.

Affecting an estimated 1 million people at a cost of around £1.5 billion per year (Food Standards Agency, 2011), foodborne illness remains an unacceptably high burden on the UK population and economy. As many victims choose to recover at home without visiting their GP, the number of cases is difficult to measure and severely under-reported in national data.

But what is foodborne illness? The World Health Organisation defines it as an Infectious Intestinal Disease caused by the ingestion of a harmful parasite, virus or bacteria, known as a pathogen. A pathogen can infiltrate any part of the food supply chain and can be hard to detect, but will result in symptoms ranging from mild nausea to death. With around 500 annual deaths in the UK attributed to food poisoning, the Food Standards Agency (FSA) are continually developing methods to support their key objective to reduce its incidence.

I currently teach Geographic Information Systems in the School of Geography and I’m studying towards a PhD in the Consumer Data Research Centre, both at the University of Leeds. My research is focused on data analytics for food safety. In particular, I am exploring the landscape of food safety in the UK and investigating the utility of novel types of data. For the first part of my research I plan to extract geo-demographic variables from the 2011 Census, investigating relationships between these variables and food safety measures (restaurant hygiene scores, hospital admissions and mortality). The second part of the research will focus on the use of novel types of data and Natural Language Processing to detect cases of foodborne illness reported through online reviews and social media. It is hoped that these datasets may provide additional information missed by the traditional GP reporting process.

Many US studies have researched the use of novel types of data for disease detection, reporting timeliness and the inclusion of additional information as key advantages compared to traditional GP data. For example, in a restaurant review, customers may comment on the cleanliness of the restaurant, the quality of the service and/or describe the food they ate. These user-generated comments are extremely useful and are not available from traditional data sources. However, extracting reviews within which customers report illnesses can be difficult. It is not as simple as looking for specific keyword matches, as these will often return false results; for example, searching for ‘sick’ may return ‘I’ve never known anyone get sick here’. This is where Natural Language Processing plays its part. A model can be trained to identify sequences of words which refer to illness and return relevant reviews, ignoring those which do not indicate illness.
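The difference between keyword matching and a model trained on word sequences can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the four training reviews are invented, the labels `ill` and `not_ill` are hypothetical, and a Naive Bayes classifier over pairs of adjacent words is just one simple choice of model, not necessarily the approach used in this research.

```python
import math
from collections import Counter


def bigrams(text):
    """Split a review into pairs of adjacent lower-cased words."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def keyword_match(text, keyword="sick"):
    """Naive approach: flag any review that contains the keyword."""
    return keyword in text.lower().split()


def train(labelled):
    """Count word pairs per label (a tiny multinomial Naive Bayes)."""
    counts, docs = {}, Counter()
    for text, label in labelled:
        counts.setdefault(label, Counter()).update(bigrams(text))
        docs[label] += 1
    vocab_size = len(set().union(*counts.values()))
    return counts, docs, vocab_size


def classify(text, model):
    """Return the label with the highest add-one-smoothed log score."""
    counts, docs, vocab_size = model
    scores = {}
    for label, seen in counts.items():
        total = sum(seen.values())
        score = math.log(docs[label] / sum(docs.values()))
        for pair in bigrams(text):
            score += math.log((seen[pair] + 1) / (total + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented training reviews; a real model would need far more data.
model = train([
    ("i got sick after eating here", "ill"),
    ("was sick all night after the burger", "ill"),
    ("have never known anyone get ill here", "not_ill"),
    ("nobody got sick and the food was lovely", "not_ill"),
])

false_alarm = "I've never known anyone get sick here"
genuine = "I got sick after the burger"

print(keyword_match(false_alarm), classify(false_alarm, model))
print(keyword_match(genuine), classify(genuine, model))
```

Both reviews contain ‘sick’, so the keyword search flags both. The toy model separates them because it scores sequences such as ‘never known’ and ‘sick after’ rather than single words.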

It’s hoped that this research will have a strong policy impact and will be used to inform and improve the current restaurant inspection process in the UK. Throughout the research I plan to liaise with key industry professionals, including those from the FSA and the local authority to keep the research relevant and timely. I’m delighted to be named as one of the UK Data Service Data Impact Fellows and plan to take full advantage of the scholarship by developing impact through presentations at national and international conferences, disseminating the research through suitable publications and holding stakeholder events and public seminars. Watch this space!