Summer School – Causal inference with observational data: challenges and pitfalls

Date(s) - 08/07/2018 - 13/07/2018
All Day

This five-day summer school offers state-of-the-art training in the analysis of observational data for causal inference. By exploring the philosophy and utility of directed acyclic graphs (DAGs), participants will learn to recognise and avoid a range of common pitfalls in the analysis of complex causal relationships, including the longitudinal analyses of change, mediation, nonlinearity and statistical interaction.

This school is run by Prof Mark S Gilthorpe (Leeds Institute for Data Analytics, LIDA, & School of Medicine) and Dr Peter WG Tennant (LIDA, School of Medicine), with input from Dr George TH Ellison (LIDA, School of Medicine), and is based on materials prepared in conjunction with Dr Johannes Textor (Radboud University Medical Center, Nijmegen).

Through a mix of lectures, discussions, and interactive workshops – blending theory with real-world examples – the school aims to provide an essential introduction to the analysis of ‘big data’. Although the examples are primarily taken from the biomedical literature, the topics are relevant to any discipline where non-experimental data is analysed using linear regression models. We therefore welcome researchers from fields across the social sciences, not just those from the health and medical sciences. Please get in touch if you have any questions about the suitability of the course.


The school will cover the following subjects:

  • Prediction vs causal inference;
  • Advanced use of directed acyclic graphs (DAGs);
  • The role and relevance of covariates in multiple regression;
  • Collider bias in sample selection (including reversal paradox and the Table 2 Fallacy);
  • Conditioning on the outcome (including regression to the mean);
  • Compositional data errors (including mathematical coupling and composite variable bias);
  • Analysis of change and statistical evaluation of longitudinal data;
  • Statistical interaction and model parameterisation issues;
  • Time-varying exposures, time-varying confounding, and G-methods.


Learning Objectives:

By the end of the school, participants will be able to:

  • adopt a ‘causal perspective’ for the analysis of observational data, with the aid of directed acyclic graphs (DAGs);
  • adopt a systematic approach to specifying, using, and interpreting DAGs for planning, conducting, and appraising observational research;
  • recognise common, yet poorly recognised, pitfalls and challenges in modelling observational data;
  • understand how various routine analytical approaches can introduce bias, leading to spurious research findings;
  • appreciate the importance of data generation in the building and selection of appropriate statistical models;
  • understand how alternative and emerging methods (such as mediation analysis, g-methods, and the E-step method) can be used to conduct more robust observational analyses;
  • critically appraise the modelling strategies of other researchers; and
  • recognise the importance of, and begin the practice of, THINKING before DOING any statistical modelling of observational data.

Previous participants have been impressed with the breadth and carefully considered nature of the content, and have reported that the course led them to think differently and more constructively about their respective research areas.

One participant said: “The course was a paradigm shift for me in terms of thinking around causal inference and gave me the tools to think about some important pitfalls in analysis that I would otherwise have missed.”


Tutor Biographies

Mark S Gilthorpe, Professor of Statistical Epidemiology

Mark Gilthorpe is Professor of Statistical Epidemiology at the University of Leeds. For most of his career he has focussed on ending poor statistical practices, developing new analytical methods, and promoting the use of robust methods for causal inference with observational data. More recently, this has involved progressing the application of ‘directed acyclic graphs’ (DAGs) to identify and summarise assumptions, examine theory-data consistency, and evaluate causal effects. In addition to his research, Mark has extensive experience of student education, and in 2006 set up the University of Leeds MSc in Statistical Epidemiology (now the MSc in Health Data Analytics) to help train the next generation of health data scientists. The Summer School combines this long record with new teaching innovations to offer a unique summary of his knowledge and perspectives.

Peter WG Tennant, University Academic Fellow in Applied Health Data Analytics

Peter Tennant is University Academic Fellow in Health Data Analytics at the University of Leeds. After spending many years conducting ‘traditional data analysis’, he has developed an interest in methodological research and teaching since moving to Leeds and working alongside Mark. With a background in applied health research, he recently branched into student education, joining the team delivering the University of Leeds MSc in Health Data Analytics. Also new to ‘causal inference’ methods, he provides a relatable perspective on the challenges and lessons within the Summer School syllabus.



Fees include tuition, refreshments and lunches throughout the 5-day school. Travel, breakfast and accommodation are not included in the fee.

£295 (postgraduate student rate)
£595 (researchers, academic staff, and public and charitable sector employees)


Accommodation on campus has been reserved for the summer school – we will provide a link to book this accommodation when you are notified of the success of your application. Individuals are free to arrange alternative accommodation if preferred. Please do not book accommodation until you have received confirmation of your place on the course.



Applications for this School are now closed.