Monday 9 December 2024, 1:00 pm - 5:00 pm (AEDT), Online
With the rise of high-throughput sequencing technologies, the volume of omics data has grown exponentially. A major challenge is mining useful knowledge from these heterogeneous collections of data. Analysing complex, high-volume data is not trivial, and classical tools cannot exploit its full potential. Machine Learning (ML), a discipline in which computers learn from data without being explicitly programmed and help humans make sense of large and complex data sets, can thus be very useful for mining large omics datasets to uncover new insights that advance the field of bioinformatics.
This hands-on workshop will introduce participants to the ML taxonomy and the applications of common ML algorithms to health data. The workshop will cover the foundational concepts and common methods used to analyse omics data sets, providing practical context through basic but widely used R libraries. Participants will acquire an understanding of the standard ML processes, as well as practical skills in applying them to familiar problems and publicly available real-world data sets.
By the end of the workshop you should be able to:
Dr Fotis Psomopoulos, Senior Researcher
Institute of Applied Biosciences (INAB)
Center for Research and Technology, Hellas (CERTH)
The workshop will be run as a series of code-along sessions, with some additional activities for participants to complete throughout the sessions. All participants will stay in the main room, unless they are experiencing technical difficulties and require 1:1 support from a trainer.
Date/Time: Monday 9 December 2024, 1 - 5 pm AEDT / 12 - 4 pm AEST / 12:30 - 4:30 pm ACDT / 10 am - 2 pm AWST
Location: Online
This workshop is for Australian researchers who are applying, or will apply, ML to the analysis of omics data as part of their projects. It is suitable for beginners in ML. You must be associated with an Australian organisation for your application to be considered.
No previous knowledge of ML is required or expected (please note that this will be an introductory course on ML). Familiarity with the R programming language is assumed. If you need a refresher on R/RStudio, try the Introduction to R and RStudio section of this online tutorial.
This workshop is free but participation is subject to application with selection.
Applications close at 11:59pm AEST, 24 November 2024.
Applications are reviewed by the organising committee and all applicants will be informed of the status of their application (successful, waiting list, unsuccessful). Successful applicants will be provided with a Zoom meeting link closer to the date. More information on the selection process is provided in our Advice on applying for Australian BioCommons workshops.
Note: this schedule is tentative and will adapt to the trainees' needs and questions, with the exception of start, stop, and break times, which will be scrupulously respected.
Time | Details |
---|---|
13:00 - 13:10 | Course Introduction. - Welcome. - Introduction and Code of Conduct (CoC). - Ways to interact. - Practicalities (agenda, breaks, etc.). - Setup Link to material |
13:10 - 13:20 | Introduction to Machine Learning (theory) Link to material and Link to material |
13:20 - 14:20 | What is Exploratory Data Analysis (EDA) and why is it useful? (hands-on) - Loading omics data - PCA Link to material |
14:20 - 14:30 | Introduction to Unsupervised Learning (theory) |
14:30 - 15:30 | Clustering: k-means (practical) Link to material |
15:30 - 15:40 | Introduction to Supervised Learning (theory) |
15:40 - 16:40 | Building a classifier: decision trees (practical) Link to material |
16:40 - 17:00 | Wrap-up and closing of workshop |
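
For a flavour of what the k-means session in the schedule covers, here is a minimal sketch of the algorithm's two alternating steps (assign each point to its nearest centroid, then move each centroid to the mean of its points). The workshop sessions themselves use R; this sketch uses plain Python purely for illustration. The function name `kmeans`, the toy points and the choice of starting centroids are all assumptions made for this example and are not taken from the workshop material.

```python
# Minimal k-means sketch (Lloyd's algorithm) on made-up 2-D points.
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Cluster `points`, starting from the given initial `centroids`."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return labels, centroids


# Two well-separated toy groups (hypothetical values, not workshop data).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The starting centroids are passed in explicitly so the result is reproducible; in practice they are usually chosen at random, and the outcome can depend on that choice.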
If you finish all the exercises and wish to practice on more examples, here are a couple of good examples to help you get more familiar with the different ML techniques and packages.
The material in the workshop has been based on the following resources:
Relevant literature includes:
This material is made available under the Creative Commons Attribution 4.0 International license. Please see LICENSE for more details.
Wandrille Duchemin, Crhistian Cardona, Pedro L. Fernandes, & Fotis E. Psomopoulos. (2021). Introduction to Machine Learning (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.5752486
Additionally, we would like to acknowledge that this training material draws heavily on:
Shakuntala Baichoo, Wandrille Duchemin, Geert van Geest, Thuong Van Du Tran, Fotis E. Psomopoulos, & Monique Zahn. (2020, July 23). Introduction to Machine Learning (Version v1.0.0). Zenodo. http://doi.org/10.5281/zenodo.3958880