About the BioGraph service
This data mining service is one part of the broader BioGraph project on text and data mining.
BioGraph: Discovering biomedical relations by unsupervised hypothesis generation
The BioGraph service integrates heterogeneous knowledge bases and allows for the successful automated formulation and ranking of comprehensible functional hypotheses relating biomedical contexts to candidate targets, e.g., for disease-gene relations.
Motivation
For the computational identification of suitable targets among candidate genes in a biomedical context, the intelligence and intelligibility of the method are of vital importance for evaluating the prioritizations. Protein-protein interaction networks are often adopted, but are limited in functional expressivity. Integrating multiple types of biomedical knowledge can enhance the quality of automatically generated functional hypotheses relating contexts, e.g., a disease, to target sets, e.g., a set of candidate genes.
Methods
We propose a data mining framework that allows for the automated formulation of comprehensible functional hypotheses relating a context to targets. The method is based on the integration of heterogeneous biomedical knowledge bases and yields intelligible and literature-supported indirect functional relations. By assessing the plausibility and specificity of these hypothetical functional paths within a user-provided research context, the unsupervised methodology is capable of appraising and ranking research targets, without requiring prior domain knowledge from the user.
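The idea of ranking targets by the paths that connect them to a context can be illustrated with a minimal sketch. This is not the BioGraph implementation: the toy graph, the entity names, the confidence values, and the scoring rule (product of edge confidences as a stand-in for plausibility, divided by the degree of each intermediate node as a stand-in for specificity) are all assumptions made for illustration only.

```python
from collections import defaultdict

# Toy heterogeneous knowledge graph: (source, relation, target, confidence).
# All entities, relations, and confidences are invented for illustration.
EDGES = [
    ("disease_X", "associated_with", "pathway_P", 0.9),
    ("pathway_P", "involves", "gene_A", 0.8),
    ("disease_X", "mentions", "hub_term", 0.9),
    ("hub_term", "mentions", "gene_A", 0.9),
    ("hub_term", "mentions", "gene_B", 0.9),
    ("hub_term", "mentions", "gene_C", 0.9),
    ("hub_term", "mentions", "gene_D", 0.9),
]


def build_graph(edges):
    """Build an undirected adjacency list that keeps relation types."""
    graph = defaultdict(list)
    for src, rel, dst, conf in edges:
        graph[src].append((rel, dst, conf))
        graph[dst].append((rel, src, conf))
    return graph


def simple_paths(graph, start, goal, max_len):
    """Yield simple paths from start to goal as lists of (relation, node, confidence)."""
    stack = [(start, [], {start})]
    while stack:
        node, path, seen = stack.pop()
        if node == goal and path:
            yield path
            continue
        if len(path) == max_len:
            continue
        for rel, nxt, conf in graph[node]:
            if nxt not in seen:
                stack.append((nxt, path + [(rel, nxt, conf)], seen | {nxt}))


def path_score(graph, path):
    """Hypothetical score: edge confidences multiplied, hubs penalized by degree."""
    score = 1.0
    for _rel, _node, conf in path:
        score *= conf
    for _rel, node, _conf in path[:-1]:  # intermediate nodes only
        score /= len(graph[node])
    return score


def rank_targets(edges, context, targets, max_len=3):
    """Rank targets by their best-scoring path to the context; no training data needed."""
    graph = build_graph(edges)
    ranking = []
    for target in targets:
        scored = ((path_score(graph, p), p)
                  for p in simple_paths(graph, context, target, max_len))
        best_score, best_path = max(scored, key=lambda sp: sp[0], default=(0.0, []))
        ranking.append((target, best_score, best_path))
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking


if __name__ == "__main__":
    for target, score, path in rank_targets(EDGES, "disease_X", ["gene_B", "gene_A"]):
        hops = " -> ".join(f"[{rel}] {node}" for rel, node, _ in path)
        print(f"{target}: {score:.3f} via disease_X -> {hops}")
```

In this toy example, gene_A ranks above gene_B because it is reachable through a specific pathway node rather than only through a generic, highly connected term, and the best path returned with each target plays the role of the readable hypothesis.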
Results
Our proposed methodology offers a range of significant improvements over leading bioinformatics platforms for in silico identification of susceptibility genes: highly ranked targets are grounded in intelligible putative functional hypotheses with rich semantics, verifiable by their references in the literature. The method is unsupervised and does not require prior domain knowledge from the user. Beyond disease-gene applications, the method is applicable in various biological research settings requiring the intelligent and intelligible identification of promising research targets.
Authors
Liekens A (1,*), De Knijf J (2), Daelemans W (3), Goethals B (2), De Rijk P (1), Del-Favero J (1).
(1) Applied Molecular Genomics group, VIB Department of Molecular Genetics, Universiteit Antwerpen, Antwerpen, Belgium, (2) Advanced Database Research and Modelling group, Department of Mathematics and Computer Science, Universiteit Antwerpen, Antwerpen, Belgium, (3) Computational Linguistics and Psycholinguistics Research Center, Universiteit Antwerpen, Antwerpen, Belgium.
