Our funded research projects
Listing our funded projects here
SYNTHIA
- Duration: 2024-2029
- Funder: HORIZON-JU-IHI-2023-05-04
- Description: SYNTHIA is an ambitious collaboration between public and private institutions to facilitate the responsible use of Synthetic Data (SD) in healthcare applications. The project will improve the methodological and technical aspects of SD Generation (SDG) by developing new techniques and advancing established ones for different data modalities, including genomics and imaging, to improve the generation of realistic multimodal and longitudinal data. This project will provide the research community with approaches for transparent benchmarking of alternative SDG methods for specific applications, identify and establish evaluation metrics and methodologies, and contribute to the standardisation of an evaluation assessment framework for SD. Robust evidence of SD applicability in a set of use cases across a broad spectrum of medical conditions will be crucial to demonstrate the potential of SD to accelerate data-driven solutions of equivalent quality to those derived from real patient data. Furthermore, legal and regulatory implications of SD use will be analysed with the aim of delivering an assurance framework to guide secure SD utilization in healthcare. These significant breakthroughs will be implemented through the open SYNTHIA federated platform, facilitating responsible SD use by the health research community. The platform will facilitate users' long-term access to extensively validated, reusable synthetic datasets, as well as to SDG workflows and SD assessment frameworks. The federated infrastructure will rely on extended open-source frameworks for interoperability with other data-sharing infrastructures in the context of the European Health Data Space.
A multidisciplinary collaboration of SDG developers, FAIR data experts, clinical researchers, developers of therapies and data-based tools, legal experts, socio-economic analysts, regulatory, policy advocacy, and communication experts will provide a 360° vision on how to advance healthcare applications through SD use.
EVERSE
European Virtual Institute for Research Software Excellence
- Duration: 2024-2027
- Funder: HORIZON-INFRA-2023-EOSC-01-02
- Description: The EVERSE project aims to create a framework for research software and code excellence, collaboratively designed and championed by the research communities across five EOSC Science Clusters and national Research Software Expertise Centres, in pursuit of building a European network of Research Software Quality and setting the foundations of a future Virtual Institute for Research Software Excellence. This framework for research software excellence will incorporate aspects involving community curation, quality assessment, and best practices for research software. This collective knowledge will be captured in the Research Software Quality toolkit (RSQkit), a knowledge base to gather and curate expertise that will contribute to high-quality software and code across different disciplines.
EOSC4Cancer
A European-wide foundation to accelerate Data-driven Cancer Research
- Duration: 2023-2026
- Funder: HORIZON-INFRA-2021-EOSC-01-06
- Description: Cancer's complex nature requires integration of advanced research data across national boundaries to enable progress. Indeed, the Horizon Europe mission board for cancer has identified access to data, knowledge and digital services - accessible across the European Research Area through federated infrastructures - as a key enabling condition for success. The better we organise cancer data across Europe, the better and faster we can bring the fruits of new biological and technical innovations to the benefit of EU citizens/patients. EOSC4Cancer will make cancer genomics, imaging, medical, clinical, environmental and socio-economic data accessible, using and enhancing existing federated and interoperable systems for securely identifying, sharing, processing and reusing FAIR cancer data across borders, and it will offer them via community-driven analysis environments. EOSC4Cancer's provision of well-curated datasets will be essential for advanced analytics and computational methods to be reproducible and robust, including machine learning and artificial intelligence approaches. EOSC4Cancer use-cases will cover the patient journey from cancer prevention to diagnosis to treatment, laying the foundation of data trajectories and workflows for future cancer mission projects. EOSC4Cancer brings together a comprehensive consortium of cancer research centres, research infrastructures, leading research groups, hospitals and supercomputing centres from 14 European countries. To make the developments sustainable, these will be offered as part of the research infrastructure partners' service portfolios, in connection with the EOSC ecosystem and to serve the European Cancer Mission, which will be possible via the engagement with large international coalitions, e.g. ICGC-Argo, GA4GH, 1+MG/B1MG, Cancer Core Europe, European Cancer Information System, European Network of Cancer Registries, Innovative Partnership for Action Against Cancer Joint Action and patients/survivors associations.
SciLake
Democratising and making sense out of heterogeneous scholarly content
- Duration: 2023-2026
- Funder: HORIZON-INFRA-2021-EOSC-01-04
- Description: SciLake’s mission is to build upon the OpenAIRE ecosystem and EOSC services to (a) facilitate and empower the creation, interlinking and maintenance of Scientific Knowledge Graphs (SKGs) and the execution of data science and graph mining queries on top of them, (b) contribute to the democratization of scholarly content and the related added value services implementing a community-driven management approach, and (c) offer advanced, AI-assisted services that exploit customised perspectives of scientific merit to assist the navigation of the vast scientific knowledge space. In brief, SciLake will develop, support, and offer customisable services to the research community following a two-tier service architecture. First, it will offer a comprehensive, open, transparent, and customisable scientific data-lake-as-a-service (service tier 1), empowering and facilitating the creation, interlinking, and maintenance of SKGs both across and within different scientific disciplines. On top of that, it will build and offer a tier of customisable, AI-assisted services that facilitate the navigation of scholarly content following a scientific merit-driven approach (tier 2), focusing on two merit aspects which are crucial for the research community at large: impact and reproducibility. The services in both tiers will leverage advanced AI techniques (text and graph mining) that are going to exploit and extend existing technologies provided by SciLake’s technology partners. Finally, to showcase the value of the provided services and their capability to address current and anticipated needs of different research communities, four scientific domains (neuroscience, cancer research, transportation, and energy) have been selected to serve as pilots. For each, the developed services will be customised, to accommodate differences in research procedures, practices, impact measures and types of research objects, and will be validated and evaluated through real-world use cases.
ML4NGP
CA21160 - Non-globular proteins in the era of Machine Learning
- Duration: 2022-2026
- Funder: COST - European Cooperation in Science & Technology
- Description: Protein structure prediction has long been considered the “Holy Grail” of structural biology. The recent success of AlphaFold has ushered in a new era of highly accurate structure prediction, bringing to light the secrets hidden in the three-dimensional structures of globular proteins, increasing our understanding about their structural features and molecular function. However, a large proportion of the proteomes from all domains of life are rich in sequences that do not fold into regular structures, commonly known as non-globular proteins (NGPs). NGPs comprise intrinsically disordered regions, repeats, low-complexity sequences, aggregation-prone and phase-separating sequences, and are implicated in a range of age-related diseases. Their heterogeneous structural states and low sequence complexity challenge current experimental structure determination techniques and machine learning (ML) methods for structure prediction, making the molecular understanding of their sequence-structure-dynamics-function relationship difficult. The recent improvements of ML approaches and advances in determining NGP structural ensembles call for a timely re-assessment of the interplay between experiments and computation. The ML4NGP Action aims to establish an interdisciplinary pan-European network to favour this interplay, fostering experimental frameworks designed to provide information to computational methods, and novel computational methods developed, trained and benchmarked with experimental data. ML4NGP will enhance the primary experimental data generation (WG1), promote integrative structural biology approaches (WG2), benchmark the state-of-the-art ML methods (WG3) and improve the functional characterization of NGPs (WG4). The Action will support its scientific objectives through policies that sustain free knowledge exchange, inclusiveness and training of young researchers who will lead future innovations in this field.
GenOptics
Large Scale bio-data Visual analytics platform
- Duration: 2020-2023
- Funder: General Secretariat for Research and Technology (GSRT)
- Description: GenOptics is a 30-month national project (2020-2023) that will implement a new platform for integrating, analyzing and visualizing multi-omics and other clinico-biological data, by extending the functionality of established bioinformatic analysis tools through an interactive visual dashboard. Visual analytics, namely the simultaneous use of computational data analysis tools with interactive visualizations, is a powerful way of combining algorithms with an intuitive, user-friendly interactive environment in order to extract knowledge from large-scale bio-data. Although visual analytics could contribute to significantly reducing the gap between available data and discovery of new knowledge through targeted studies, there is no widely used platform that provides this functionality to the scientific community. GenOptics will allow visual analytics of multi-omics data through a modular system that will integrate multiple interoperable analysis tools into a software platform, leveraging established international technologies and software suites such as Docker and R/Bioconductor.
Gallantries
When Galaxy meets Carpentries to develop and deliver open and scalable training in life sciences
- Duration: 2020-2023
- Funder: Erasmus+
- Description: The Gallantries project has been funded by Erasmus+ to develop modular, broadcast-ready bioinformatics lessons over the next 3 years, aiming in particular for a hybrid form of training delivery. Partners and associated members of the project include several ELIXIR Nodes and Communities, and are dedicated to developing high quality bioinformatics training by combining the experience of Carpentries trainers with the Galaxy platform. Saskia Hiltemann & Helena Rasche (ErasmusMC, ELIXIR-NL), Fotis Psomopoulos (INAB, CERTH, ELIXIR-GR), Bérénice Batut (University of Freiburg, ELIXIR-DE), Anthony Bretaudeau (INRAE, ELIXIR-FR), and Yvan Le Bras (MNHN, ELIXIR-FR) and their teams will be working together to develop, improve, and deliver quality, reusable, modular, and pandemic-friendly lessons.
ELIXIR Commissioned Services
Co-leading the following ELIXIR activities as commissioned services:
- Tools Platform Task 1 “Packaging, containerisation & deployment”
- Tools Platform Task 2 “Performance benchmarking & technical monitoring”
- Tools Platform Task 4 “Software Best Practices”
- Training Platform Task 2 “Gap analysis, training materials development and training delivery”
- Strategic Implementation Study “Making container services integratable, sustainable and widely used”
- Community-led Implementation Study “Improving IDP tools interoperability and integration into ELIXIR”