Two-stage ROC curve estimator for cohort studies (original title: Estimador de la curva ROC en dos etapas para estudios de cohorte)
- Díaz Coto, Susana
- Norberto Octavio Corral Blanco (Thesis supervisor)
Defense university: Universidad de Oviedo
Defense date: 23 April 2021
- María Angeles Gil Alvarez (Chair)
- Enrique Artime Carlos (Secretary)
- María José Rodríguez Álvarez (Committee member)
- Alba María Franco Pereira (Committee member)
- Juan Carlos Pardo Fernández (Committee member)
Type: Doctoral thesis
Abstract
The Receiver Operating Characteristic (ROC) curve has become one of the most popular graphical tools for assessing the classification accuracy of continuous markers with respect to a binary characteristic. For each possible cut-off point, it plots the sensitivity, that is, the proportion of subjects with the characteristic of interest who are correctly classified as positive, against the complement of the specificity, that is, the proportion of subjects without the characteristic who are nevertheless declared positive. The ROC curve has been routinely used in both diagnosis and prognosis studies. Given their different nature, ROC curve estimation has been considered separately for binary outcomes (diagnosis) and time-dependent outcomes (prognosis), even when both kinds of study have been conducted under the same type of study design. For binary outcomes, ROC curve estimation has been tackled with parametric, semi-parametric and non-parametric approaches. However, a case-control design is commonly assumed even when the outcome of interest is defined ad hoc, for instance through a subsidiary measure in cohort studies. The main difference between these situations is that in the former (case-control) the researcher decides which type of subject (positive/negative) is included, whereas in the latter (cohort) that selection is random. This can have an important effect on the estimation of the overall distribution of the marker. When the characteristic of interest is a time-dependent outcome, whether a subject is positive or negative is not fixed; it depends on the point during follow-up at which the subject is evaluated. Suitable extensions of the ROC curve have been proposed for this scenario. In their estimation, the main difficulty is the presence of incomplete information. This issue (censoring) may arise because subjects drop out of the study or because the study ends before they have experienced the outcome of interest.
Although several procedures exist for the right-censoring case, the statistical literature is scarce for other censoring patterns, such as interval censoring. In this thesis we present the so-called two-stage Mixed-Subjects (sMS) estimator, which links the diagnosis and prognosis scenarios through a general predictive model (first stage) and the weighted empirical estimator of the cumulative distribution function of the marker (second stage). The predictive model describes the relationship between the marker and the characteristic under study. In the first stage, it is approximated by a suitable probabilistic model, using the information available in the sample. In the second stage, the remaining unknown parameters are replaced by their empirical estimators.
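
As a rough illustration of the second stage, the sketch below computes a ROC curve from weighted empirical distributions of the marker. It is not the thesis's implementation. It assumes that a first-stage model has already assigned each subject a probability of being positive, that larger marker values indicate a positive subject, and that both weight totals are non-zero. The names `weighted_roc` and `auc_trapezoid`, and the example data, are hypothetical. When every probability is 0 or 1, the result coincides with the usual empirical ROC curve.

```python
import numpy as np


def weighted_roc(marker, prob_positive):
    """Weighted empirical ROC points.

    marker        : marker values, one per subject
    prob_positive : assumed first-stage output, P(positive | subject) in [0, 1]
    Returns (fpr, sens) evaluated on a decreasing grid of cut-off points.
    """
    x = np.asarray(marker, dtype=float)
    p = np.asarray(prob_positive, dtype=float)

    # Cut-offs: +inf, the observed values in decreasing order, then -inf,
    # so the curve starts at (0, 0) and ends at (1, 1).
    cutoffs = np.concatenate(([np.inf], np.unique(x)[::-1], [-np.inf]))

    # above[k, i] is True when subject i is declared positive at cut-off k.
    above = x[None, :] > cutoffs[:, None]

    # Each subject contributes to the positive group with weight p
    # and to the negative group with weight 1 - p.
    sens = (above * p).sum(axis=1) / p.sum()
    fpr = (above * (1.0 - p)).sum(axis=1) / (1.0 - p).sum()
    return fpr, sens


def auc_trapezoid(fpr, sens):
    """Area under the ROC points by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (sens[1:] + sens[:-1]) / 2.0))


# Hypothetical example: probabilities as a first-stage model might return them.
marker = [0.8, 1.5, 2.1, 3.0, 3.4]
prob_positive = [0.05, 0.20, 0.50, 0.85, 0.95]
fpr, sens = weighted_roc(marker, prob_positive)
area = auc_trapezoid(fpr, sens)
```

Treating each subject as partly positive and partly negative is what lets a single formula cover both fully observed binary outcomes (weights of 0 or 1) and outcomes whose status is uncertain, for example because of censoring.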