Analysis of expression proteomics data in R (IN-PERSON)
This workshop focuses on expression proteomics, which aims to characterise the protein diversity and abundance in a particular system. You will learn about the bioinformatic analysis steps involved when working with this kind of data, in particular using several dedicated proteomics packages from Bioconductor, a collection of packages for the R programming language. We will use real-world datasets obtained from label-free quantitation (LFQ) as well as tandem mass tag (TMT) mass spectrometry. We cover the basic data structures used to store and manipulate protein abundance data, how to do quality control and filtering of the data, as well as several visualisations. Finally, we include statistical analysis of differential abundance across sample groups (e.g. control vs. treated) and further evaluation and biological interpretation of the results via gene ontology analysis. By the end of this workshop you should have the skills to make sense of expression proteomics data, from start to finish.
If you do not have a University of Cambridge Raven account please book or register your interest here.
- ♿ The training room is located on the first floor and there is currently no wheelchair or level access.
- Our courses are only free for registered University of Cambridge students. All other participants will be charged according to our charging policy.
- Attendance will be taken on all courses and a charge is applied for non-attendance, including for University of Cambridge students. After you have booked a place, if you are unable to attend any of the live sessions, please email the Bioinfo Team.
- Further details regarding eligibility criteria are available here.
- Guidance on visiting Cambridge and finding accommodation is available here.
- The course is aimed at proteomics practitioners and data analysts or bioinformaticians who would like to learn how to use R to analyse proteomics data.
- Familiarity with mass spectrometry or proteomics in general is desirable but not essential, as we will walk through a typical MS experiment and its data as part of learning about the tools.
- Basic understanding of mass spectrometry.
- Watch this iBiology video for an excellent overview.
- A working knowledge of R and the tidyverse (course registration page).
- If you are not able to attend this prerequisite course, please work through our R materials ahead of the course.
- Familiarity with other Bioconductor data classes, such as those used for RNA-seq analysis, is useful but not required.
Number of sessions: 2
# | Date | Time | Venue | Trainers |
---|---|---|---|---|
1 | Wed 26 Jun | 09:30 - 17:30 | Bioinformatics Training Room, Craik-Marshall Building | Dr Lisa Breckels, Charlotte Hutchings, Charlotte Dawson, Dr Bajuna Salehe |
2 | Thu 27 Jun | 09:30 - 17:30 | Bioinformatics Training Room, Craik-Marshall Building | Dr Lisa Breckels, Charlotte Hutchings, Tom Smith, Charlotte Dawson, Ruhina Laskar, Dr Bajuna Salehe |
Bioinformatics, Biology, Data handling, Data visualisation, Proteomics, Bioconductor
During this course you will learn about:
- How mass spectrometry can be used to quantify protein abundance and some of the methods used for peptide quantitation.
- The bioinformatics steps involved in processing and analysing expression proteomics data.
- How to assess the quality of your data, deal with missing values and summarise peptide-level data to protein-level.
- How to perform differential expression analysis to compare protein abundances between different groups of samples.
After this course, you should be able to:
- Import data into R/Bioconductor, starting from the files produced by third-party software such as Proteome Discoverer, MaxQuant and FragPipe.
- Manipulate protein expression data using dedicated data structures that are used to store these multi-dimensional datasets.
- Produce several visualisations to help assess the quality of your data and explore and communicate your results.
- Recognise the importance of data normalisation and the methods used to achieve it.
- Find differentially expressed proteins between groups of samples and annotate the results using gene ontology analysis.
Presentations and practicals
Day | Time | Topics |
---|---|---|
Day 1 | 09:30 - 09:40 | Welcome |
Day 1 | 09:40 - 10:15 | Introduction |
Day 1 | 10:15 - 11:15 | Import and infrastructure |
Day 1 | 11:15 - 11:30 | Break |
Day 1 | 11:30 - 12:30 | Data cleaning: filtering |
Day 1 | 12:30 - 13:30 | Lunch (not provided) |
Day 1 | 13:30 - 15:00 | Data cleaning: FDR and missing data |
Day 1 | 15:00 - 15:15 | Break |
Day 1 | 15:15 - 17:00 | Data normalisation and aggregation |
Day 2 | 09:30 - 11:00 | Exploration and visualisation of protein data |
Day 2 | 11:00 - 11:15 | Break |
Day 2 | 11:15 - 12:30 | Statistical analysis |
Day 2 | 12:30 - 13:30 | Lunch (not provided) |
Day 2 | 13:30 - 15:00 | Statistical analysis: diagnostics, interpretation and visualisation |
Day 2 | 15:00 - 15:15 | Break |
Day 2 | 15:15 - 17:00 | [if time allows] GO analysis |
- Free for registered University of Cambridge students
- £ 60/day for all University of Cambridge staff, including postdocs, temporary visitors (students and researchers) and participants from Affiliated Institutions. Please note that these charges are recovered by us at the Institutional level.
- It remains the participant's responsibility to acquire prior approval from the relevant group leader, line manager or budget holder to attend the course. Please book only with the agreement of the relevant party, as costs will be charged back to your Lab Head or Group Supervisor.
- £ 60/day for all other academic participants from external Institutions and charitable organizations. These charges must be paid at registration
- £ 120/day for all Industry participants. These charges must be paid at registration
- Further details regarding the charging policy are available here
Duration: 2 days
Frequency: twice per year