| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.64 MB | Adobe PDF |
Abstract(s)
Many bioinformatics problems involve large, highly complex volumes of biological data that are often modelled as graphs to allow for a systems-level analysis. Graphs provide a means of encoding biological knowledge in a formal structure that is useful for modelling and analysing relationships in biological systems. Several machine learning (ML) approaches have been developed to deal with data represented as graphs, notably graph neural networks (GNNs), end-to-end representation-learning methods that can learn directly from graph-structured data.
Ontologies are a form of knowledge representation that allows for the conceptualization and specification of domains of interest. In biology, they can be used to represent and structure existing knowledge by its meaning and relationships, allowing large volumes of biological entities to be organised into knowledge graphs. With both biological data and biological knowledge represented as graphs, an opportunity arises to directly enrich the data graph with knowledge about its entities.
Thus, the aim of this work was to explore different approaches to combining a protein-protein interaction (PPI) network with information from the Gene Ontology (GO), in order to leverage the additional biological knowledge in a protein function prediction setting. Two methodologies were proposed for developing these approaches. The first comprises the creation of approaches for merging a PPI network with protein function information, and their combination with different ML models to test whether the models benefit from the additional information during protein function prediction. The second involves the construction of a GNN-based method that learns PPI and GO representations separately and then combines them for a global learning of protein function-related information. The evaluation of the different approaches and experimental conditions on a benchmark dataset showed an overall performance increase.
Description
Master's thesis, Bioinformatics and Computational Biology, 2023, Universidade de Lisboa, Faculdade de Ciências
Keywords
Knowledge graphs; Ontologies; Graph neural networks; Protein-protein interaction networks; Master's theses - 2023
