2020-DI-18

Do-calculus-based and machine-learning-based causal analysis algorithms in healthcare-related settings

A Project coordinated by IIIA.

Web page:

Principal investigator: 

Collaborating organisations:

AQUAS


Funding entity:

Departament d'Empresa i Coneixement. Generalitat de Catalunya

Funding call:

Funding call URL:

Project #:

2020-DI-18

Total funding amount:

33.960,00€

IIIA funding amount:

33.960,00€

Duration:

01/Oct/2020 to 01/Oct/2023

Extension date:

Many studies in the health field are observational, that is, empirical and non-experimental (the researchers do not intervene in the process that generates the data), and the corpus of observational data is growing at high speed. Traditionally, the algorithms used in this type of study have been statistical algorithms that work by searching for correlations in the data. Recently, algorithms based on machine learning, which also work by looking for correlations in data, have gained popularity. This has meant an increase in predictive capacity, and a shift in approach from observation to prediction.
These approaches, however, do not explicitly take into account a fundamental property of the data-generating process: causal relationships. These relationships can be of great interest to researchers since, in fact, many studies try to answer primarily causal questions: "Has the implementation of the protocol of interest caused a change in the variable of interest?" "How will a specific individual react to the application of the protocol, or how would an individual to whom it has been applied have reacted if it had not been applied?" "Do genes or eating habits cause this or that disease?" Approaches that ignore causal relationships suffer from an epistemological limitation, and trying to answer causal questions by using correlation as an approximation of causality remains, to this day, a limiting strategy.
The objectives of this thesis are twofold: on the one hand, to compare and benchmark causal analysis algorithms based on do-calculus and on machine learning, focusing on efficiency and versatility; on the other hand, to develop a general-purpose algorithm (for healthcare) that combines the two types of algorithms mentioned above, under a series of assumptions and conditions. This task will be carried out using open-source programming languages and specific libraries such as the DoWhy library for Python. The output will be tested and validated on several datasets managed by AQuAS and on the GCAT cohort, and will try to answer causal questions relevant to the health field.
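The gap between correlation and causation described in this summary can be illustrated with a minimal sketch on synthetic data. This is not the project's algorithm and does not use DoWhy; it is a plain-Python example of backdoor adjustment, the simplest identification result that do-calculus yields when a single confounder is observed. The variable names, probabilities, and effect sizes are all assumptions chosen for illustration: a binary severity indicator `z` makes treatment `t` more likely and independently worsens the outcome `y`, while the true causal effect of treatment is set to +1.0.

```python
import random


def simulate(n=100_000, seed=0):
    """Generate hypothetical observational data with one confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0  # confounder, e.g. a severe case
        # Assumption: severe cases receive the treatment far more often.
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0
        # Assumption: true treatment effect is +1.0, severity costs 3.0.
        y = 1.0 * t - 3.0 * z + rng.gauss(0.0, 1.0)
        rows.append((z, t, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Correlational estimate: E[Y | T=1] - E[Y | T=0]."""
    treated = mean(y for _, t, y in rows if t == 1)
    control = mean(y for _, t, y in rows if t == 0)
    return treated - control


def backdoor_adjusted(rows):
    """Backdoor adjustment: sum over z of P(z) * (E[Y|T=1,z] - E[Y|T=0,z])."""
    effect = 0.0
    for z_val in (0, 1):
        stratum = [r for r in rows if r[0] == z_val]
        p_z = len(stratum) / len(rows)
        treated = mean(y for _, t, y in stratum if t == 1)
        control = mean(y for _, t, y in stratum if t == 0)
        effect += p_z * (treated - control)
    return effect


rows = simulate()
naive = naive_difference(rows)
adjusted = backdoor_adjusted(rows)
print(f"naive difference: {naive:.2f}")
print(f"backdoor-adjusted effect: {adjusted:.2f}")
```

Under these assumed parameters the naive difference is, in expectation, 1 - 3(0.8 - 0.2) = -0.8, so a purely correlational reading would suggest the treatment is harmful, whereas the adjusted estimate recovers a value close to the true +1.0. The adjustment is only valid because the confounder is observed and the causal structure is assumed known, which is the kind of assumption the summary refers to.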
2023
Borja Velasco-Regulez & Jesus Cerquides (2023). Hydranet: A Neural Network for the Estimation of Multi-Valued Treatment Effects. Artificial Intelligence Research and Development (pp. 16-27). IOS Press. https://doi.org/10.3233/FAIA230655.
2022
Borja Velasco-Regulez, Jose L. Fernandez-Marquez, Nerea Luqui, Jesus Cerquides, Josep Analia Fukelman, & Josep Perelló (2022). Is the phase of the menstrual cycle relevant when getting the covid-19 vaccine? American Journal of Obstetrics and Gynecology, 227, 913-915. https://doi.org/10.1016/j.ajog.2022.07.052.
Borja Velasco, Jesus Cerquides, & Josep Lluis Arcos (2022). Multi-valued Treatment Effect Estimation for Health Technology Assessment with a Neural Network. NeurIPS 2022 Workshop on Causality for Real-world Impact.
Josep Lluís Arcos
Scientific Researcher
Jesus Cerquides
Scientific Researcher
Phone Ext. 431859

Borja Velasco
Industrial PhD Student