Fri 26 May 2023
09:30 - 17:30

Venue: Bioinformatics Training Room, Craik-Marshall Building

Provided by: Bioinformatics


Booking

Bookings cannot be made on this event (Event is completed).



Building Computational Pipelines with Snakemake (IN PERSON)

Description

High-throughput data analyses usually involve many data processing steps, using a range of command line tools and scripts to transform, filter, aggregate and visualise data. Each tool may require its own set of inputs and options and, as we chain multiple tools together, this can become challenging to manage. As analysis pipelines become more complex, and as ever larger amounts of data are collected in research, reproducible and scalable automatic workflow management becomes increasingly important.

The Snakemake workflow management system is a tool for creating reproducible and scalable data analysis pipelines (workflows). Workflows are described in a human-readable, Python-based language. They can be scaled to server, cluster, grid and cloud environments without modifying the workflow definition. Snakemake workflows can also include a description of the required software, which is then deployed automatically to any execution environment.
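To make the idea of scaling without changing the workflow definition more concrete, here is a minimal plain-Python sketch. It is not Snakemake code and does not reflect how Snakemake schedules jobs internally; the job names, thread counts and the `schedule` helper are all hypothetical. It only illustrates that the same job list can be executed on machines with different numbers of cores, with the execution plan adapting to the hardware.

```python
# Hypothetical jobs and the number of threads each would like to use.
# Dependencies between jobs are deliberately ignored in this sketch.
JOBS = {"map_A": 4, "map_B": 4, "qc_A": 1, "qc_B": 1}


def schedule(jobs, cores):
    """Greedily pack jobs into waves that each use at most `cores` threads."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Consider the most demanding jobs first (stable sort keeps name order on ties).
    pending = sorted(jobs.items(), key=lambda item: -item[1])
    waves = []
    while pending:
        free = cores
        wave, rest = [], []
        for name, threads in pending:
            # A job can never use more threads than the machine offers.
            threads = min(threads, cores)
            if threads <= free:
                wave.append(name)
                free -= threads
            else:
                rest.append((name, threads))
        waves.append(wave)
        pending = rest
    return waves


# The job definitions stay the same; only the available cores change.
for cores in (8, 4):
    print(cores, schedule(JOBS, cores))
```

With more cores available the same jobs are packed into fewer waves, while the job definitions themselves are untouched.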

With over 500k downloads on Bioconda, and over 2k citations, Snakemake is a widely used and accepted standard for reproducible data science that has powered numerous research goals and publications.

This 1-day workshop will cover the principles for building workflows using Snakemake, as well as more advanced strategies to fully customise, automate and scale your analysis.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account, you will need to book or register your interest via the link here.

Target audience
  • Graduate students, postdocs and staff members from the University of Cambridge, affiliated institutions and other external institutions, as well as individuals
  • Please be aware that these courses are only free for registered University of Cambridge students. All other participants will be charged a registration fee in some form. Registration fees and further details regarding the charging policy are available here.
  • Attendance is taken on all courses. If, after booking a place, you are unable to attend any of the live sessions and would like to work in your own time, please email the Bioinfo Team. A charge is applied for non-attendance, including for registered University students.
  • Further details regarding eligibility criteria are available here
Prerequisites
  • Basic experience in Python programming.
  • Familiarity with the Unix command line.
  • This course is not aimed at novice analysts. It is most suited for participants who currently run their analysis using ad hoc scripts, and would like to learn how to fully automate their analysis into a more flexible and reproducible workflow.
  • Although the examples used in this course come from the field of bioinformatics, prior knowledge of bioinformatics is not required.
Sessions

Number of sessions: 1

#   Date              Time            Venue                                                     Trainer
1   Fri 26 May 2023   09:30 - 17:30   Bioinformatics Training Room, Craik-Marshall Building     Johannes Köster
Topics covered

Bioinformatics, Data Handling, Workflow Management System

Objectives

After this course you should be able to:

  • Create an analysis pipeline using Snakemake.
  • Specify software and computational resource needs for your analysis steps.
  • Customise your pipeline to accept user-defined configurations.
  • Create reproducible analyses that can be adapted to new data with little effort.
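As a rough illustration of the objectives on user-defined configuration and per-step resource needs, the following plain-Python sketch overlays user settings on pipeline defaults and looks up a sample's input file from the result. Snakemake has its own mechanism for configuration files, which this sketch does not use; the configuration keys, file paths and the helpers `merge_config` and `reads_for` are hypothetical and chosen only for illustration.

```python
# Hypothetical default settings for a pipeline; all keys and values are illustrative.
DEFAULT_CONFIG = {
    "samples": {"A": "reads/A.fastq"},
    "threads": {"map": 4, "sort": 1},
}


def merge_config(defaults, overrides):
    """Recursively overlay user overrides on the defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


def reads_for(sample, config):
    """Custom lookup function: find the input file for a sample in the config."""
    try:
        return config["samples"][sample]
    except KeyError:
        raise KeyError(f"Sample {sample!r} is not listed in the configuration") from None


# A user adds a new sample and requests more threads for the mapping step.
user_config = {"samples": {"B": "reads/B.fastq"}, "threads": {"map": 8}}
config = merge_config(DEFAULT_CONFIG, user_config)

print(reads_for("B", config))
print(config["threads"])
```

Keeping sample lists and resource settings in a configuration, rather than in the analysis steps themselves, is what allows the same pipeline to be pointed at new data with little effort.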
Aims

During this course you will learn about:

  • The syntax of the Snakemake workflow language.
  • How to define a step in a pipeline as a rule, with its inputs, outputs and execution statements.
  • How to generalise rules using wildcards and create a chain of dependency across multiple rules.
  • Advanced pipeline customisation using configuration files and custom-made functions.
  • How to scale workflows to compute servers and clusters while adapting to hardware-specific constraints.
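The ideas of rules, wildcards and dependency chains listed above can be illustrated with a short plain-Python sketch. This is not Snakemake syntax and not how Snakemake is implemented; the rule names, file patterns and the helpers `match_pattern` and `plan` are hypothetical. The sketch shows how a requested output file can be matched against a rule's output pattern, how the wildcard value is extracted, and how the required inputs are traced back through other rules.

```python
import re

# Hypothetical rules: each maps an output pattern to an input pattern.
# "{sample}" plays the role of a wildcard.
RULES = [
    {"name": "sort", "output": "sorted/{sample}.bam", "input": "mapped/{sample}.bam"},
    {"name": "map", "output": "mapped/{sample}.bam", "input": "reads/{sample}.fastq"},
]


def match_pattern(pattern, path):
    """Return the wildcard values if `path` matches `pattern`, else None."""
    # Split "sorted/{sample}.bam" into literal text and wildcard names:
    # even positions are literals, odd positions are wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)
        else:
            regex += f"(?P<{part}>.+)"
    match = re.fullmatch(regex, path)
    return match.groupdict() if match else None


def plan(target, existing_files):
    """Return the ordered (rule name, output file) steps needed to build `target`."""
    if target in existing_files:
        return []  # nothing to do, the file is already available
    for rule in RULES:
        wildcards = match_pattern(rule["output"], target)
        if wildcards is not None:
            # Fill the wildcard values into the rule's input pattern.
            needed_input = rule["input"].format(**wildcards)
            return plan(needed_input, existing_files) + [(rule["name"], target)]
    raise ValueError(f"No rule produces {target!r}")


steps = plan("sorted/A.bam", existing_files={"reads/A.fastq"})
for name, output in steps:
    print(f"{name} -> {output}")
```

Asking for `sorted/A.bam` yields the `map` step followed by the `sort` step, because the sorted file depends on the mapped file, which in turn depends on the raw reads.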
Format

Presentations, demonstrations

Timetable

Day 1 Topics
09:30 - 10:00 Presentation: Introduction to Snakemake
10:00 - 12:00 Tutorial: Basic practical session
12:00 - 13:00 Lunch (not provided)
13:00 - 13:30 Presentation: Advanced usage of Snakemake
13:30 - 17:30 Tutorial: Advanced practical session
Registration fees
  • Free for registered University of Cambridge students
  • £50/day for all University of Cambridge staff, including postdocs, temporary visitors (students and researchers) and participants from affiliated institutions. Please note that these charges are recovered by us at the institutional level.
  • It remains the participant's responsibility to obtain prior approval from the relevant group leader, line manager or budget holder to attend the course. Please book only with the agreement of the relevant party, as costs will be charged back to your Lab Head or Group Supervisor.
  • £50/day for all other academic participants from external institutions and charitable organisations. These charges must be paid at registration.
  • £100/day for all industry participants. These charges must be paid at registration.
  • Further details regarding the charging policy are available here.
Duration

1 day

Frequency

Once a year

Theme
Bioinformatics