2020-DI-18

Do-calculus-based and machine-learning-based causal analysis algorithms in healthcare-related settings

A Project coordinated by IIIA.

Web page:

Principal investigator: 

Collaborating organisations:

AQUAS

Funding entity:

Departament d'Empresa i Coneixement. Generalitat de Catalunya

Funding call:

Funding call URL:

Project #:

2020-DI-18

Funding amount:

33.960,00€

Duration:

2020-10-01 to 2023-10-01

Extension date:

Many studies in the health field are observational, that is, empirical and non-experimental (the researchers do not intervene in the process that generates the data), and the corpus of observational data is growing at high speed. Traditionally, the algorithms used in this type of study have been statistical algorithms that work by searching for correlations in the data. Recently, algorithms based on machine learning, which also work by looking for correlations in data, have gained popularity. This has brought an increase in predictive capacity and a shift in approach from observation to prediction.

These approaches, however, do not explicitly take into account a fundamental property of the data-generating process: causal relationships. These relationships can be of great interest to researchers, since many studies in fact try to answer questions that are primarily causal: “Has the implementation of the protocol of interest caused a change in the variable of interest?” “How will a specific individual react to the application of the protocol, or how would an individual to whom it has been applied have reacted had it not been applied?” “Do genes or eating habits cause this or that disease?” Approaches that ignore causal relationships carry an epistemological limitation, and trying to answer causal questions by using correlation as an approximation of causality remains, to this day, a limiting strategy.

The objectives of this thesis are twofold. The first is to compare and benchmark causal analysis algorithms based on do-calculus and on machine learning, focusing on efficiency and versatility. The second is to develop a general-purpose algorithm (for healthcare) that combines the two types of algorithms mentioned above, under a set of assumptions and conditions. This work will be carried out using open-source programming languages and specific libraries such as the DoWhy library for Python. The output will be tested and validated on several datasets managed by AQuAS and on the GCAT cohort, and will aim to answer causal questions relevant to the health field.
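
To illustrate why correlation can be a poor stand-in for causation in observational health data, the following is a minimal sketch, not the project's algorithm and not the DoWhy API. It simulates a hypothetical cohort in which disease severity confounds the link between a treatment and recovery, then compares a naive correlational estimate with a backdoor-adjusted one (stratifying on the confounder, the simplest identification result that do-calculus yields). All variable names, probabilities, and the assumed true effect of +0.10 are invented for illustration.

```python
import random


def simulate(n=50000, seed=0):
    """Generate hypothetical records (severe, treated, recovered).

    Assumption for illustration: severity is the only confounder.
    Severe patients are treated more often and recover less often.
    The true causal effect of treatment on recovery is +0.10.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.5
        treated = rng.random() < (0.8 if severe else 0.2)
        p_recover = (0.3 if severe else 0.7) + (0.10 if treated else 0.0)
        recovered = rng.random() < p_recover
        rows.append((severe, treated, recovered))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Correlational estimate: E[Y | T=1] - E[Y | T=0]."""
    treated = mean(y for _, t, y in rows if t)
    control = mean(y for _, t, y in rows if not t)
    return treated - control


def backdoor_adjusted(rows):
    """Backdoor adjustment: sum over z of P(z) * (E[Y | T=1, z] - E[Y | T=0, z])."""
    effect = 0.0
    for z in (False, True):
        stratum = [r for r in rows if r[0] == z]
        weight = len(stratum) / len(rows)
        treated = mean(y for _, t, y in stratum if t)
        control = mean(y for _, t, y in stratum if not t)
        effect += weight * (treated - control)
    return effect


rows = simulate()
naive = naive_difference(rows)
adjusted = backdoor_adjusted(rows)
print(f"naive difference:  {naive:+.3f}")
print(f"backdoor adjusted: {adjusted:+.3f}")
```

Under these assumptions the naive difference comes out negative, suggesting that the treatment harms patients, while the adjusted estimate lands close to the simulated effect of +0.10. The adjustment is only valid here because the confounder is observed and known, which is exactly the kind of assumption a causal analysis has to state explicitly.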
No publications uploaded yet
Borja Velasco
PhD Student
Jesus Cerquides
Scientific Researcher
Phone Ext. 228

Josep Lluís Arcos
Scientific Researcher
Phone Ext. 227