Data Science School: Machine learning applications for life sciences (Online)
THIS EVENT IS NOW FULLY BOOKED!
PLEASE NOTE: The Bioinformatics Team are presently teaching many courses live online, with tutors available to help you work through the course material on a personal copy of the course environment. We aim to simulate the classroom experience as closely as possible, with opportunities for one-to-one discussion with tutors and a focus on interactivity throughout.
This School aims to familiarise biomedical students and researchers with principles of Data Science. Focusing on utilising machine learning algorithms to handle biomedical data, it will cover: effects of experimental design, data readiness, pipeline implementations, machine learning in Python, and related statistics, as well as Gaussian Process models.
Providing practical experience in the implementation of machine learning methods relevant to biomedical applications, including Gaussian processes, we will illustrate best practices that should be adopted in order to enable reproducibility in any data science application.
This event is sponsored by Cambridge Centre for Data-Driven Discovery (C2D3).
Please note that if you are not eligible for a University of Cambridge Raven account, you will need to book or register your interest using the link here.
- Students and researchers from life-sciences or biomedical backgrounds, who have, or will shortly have, the need to apply the techniques presented during the course to biomedical data.
- The course is open to graduate students, postdocs and staff members from the University of Cambridge, affiliated institutions, and other external institutions or individuals.
- Please note that all participants attending this course will be charged a registration fee. Non-members of the University of Cambridge to pay £400. All Members of the University of Cambridge to pay £200. A booking will only be approved and confirmed once the fee has been paid in full.
- Further details regarding eligibility criteria are available here
- The course is intended for those who have basic familiarity with the Python scripting language.
- We recommend either attending An Introduction to Solving Biological Problems with Python or working through the materials of Basic Python before attending this course.
- We suggest refreshing some basic knowledge of probability distributions and linear algebra.
- Recommended reference resources for future reading (not necessarily before the course) are: Pattern Recognition and Machine Learning by Christopher M. Bishop (Chapter 2: Probability distributions) and Mathematics for Machine Learning (Chapter 2: Linear Algebra)
Number of sessions: 4
# | Date | Time | Venue | Trainers |
---|---|---|---|---|
1 | Thu 17 Sep 2020 | 10:00 - 17:30 | Bioinformatics Training Facility - Online LIVE Training | Marta Milo, Dr John C. Thomas |
2 | Fri 18 Sep 2020 | 09:30 - 17:30 | Bioinformatics Training Facility - Online LIVE Training | Marta Milo, Mario Guarracino, Ichcha Manipur |
3 | Mon 21 Sep 2020 | 09:30 - 17:30 | Bioinformatics Training Facility - Online LIVE Training | Javier Gonzalez Hernandez |
4 | Tue 22 Sep 2020 | 09:30 - 13:30 | Bioinformatics Training Facility - Online LIVE Training | Magdalena Strauss, Neil Lawrence |
Bioinformatics, Data handling, Machine learning
After this course you should be able to:
- Identify optimal machine learning methodologies for data analysis
- Apply principles of experimental design to your research project
- Visualise data and apply dimensionality reduction/clustering
- Evaluate the use of Gaussian processes in life science applications
During this course you will learn about:
- Introduction to Data Science and the role of Machine Learning in this field
- Principles of experimental design and impact on downstream data analysis
- Data readiness and its implications in collating, processing and curating data
- Reproducible machine learning workflows
- Learning methods for modelling biomedical data, including Gaussian Processes and latent factors models
- Effective data visualisation and interpretation
Presentations, demonstrations, and practicals
Day 1 | |
10:00 - 10:15 | Introduction of workshop and technical setup |
10:15 - 12:00 | Introduction of Data Science in Life Sciences; Principles of experimental design; Q&A |
12:00 - 13:00 | Lunch (not provided) |
13:00 - 17:30 | Python recap |
17:30 - 18:00 | Q&A (optional) |
Day 2 | |
09:30 - 10:30 | Introduction to Machine Learning for biomedical data analysis in Python |
10:30 - 12:00 | Data Preparation: sources of data, cleaning up your data and preparing data structure |
12:00 - 13:00 | Lunch (not provided) |
13:00 - 14:30 | Case Study continuing from morning session |
14:30 - 17:30 | Introduction to Random Forest Classifiers, Support Vector Machines and Decision trees; Case Study on classification |
17:30 - 18:00 | Q&A (optional) |
Day 3 | |
09:30 - 12:00 | Tutorial in Causal Inference |
12:00 - 13:00 | Lunch (not provided) |
13:00 - 17:30 | Model based experimental design, optimization - practical application with Emukit |
17:30 - 18:00 | Q&A (optional) |
Day 4 | |
09:30 - 11:30 | Introduction to latent factor models and Gaussian processes with applications to scRNA-seq data |
11:30 - 13:00 | Future of AI in biomedical research |
13:00 - 13:30 | Q&A and wrap-up |
- All participants attending this course will be charged a registration fee.
- Non-members of the University of Cambridge to pay 400.00 GBP
- All Members of the University of Cambridge to pay 200.00 GBP.
- A booking will only be approved and confirmed once the fee has been paid in full.
- Further details regarding the charging policy are available here