dc.description.abstract |
ABSTRACT Binary logistic regression is a statistical model used to predict the probability of an event and a useful way to clarify the relationship between a set of independent variables and a binary response variable. This type of model is widely used in modeling various real-world problems. In this study, we review the theoretical framework of the logistic regression model and the mathematical equations related to it. Like other types of statistical data, logistic regression data are subject to the existence of outliers and influential observations. This study reviews six methods of detecting outliers in binary logistic regression, namely: Pearson residual, standardized Pearson residual, deviance residual, DFFIT, Cook's distance (CD) and hat value. Furthermore, their performance in the presence of single and multiple outliers is examined via extensive simulation studies based on three different logistic models. Regardless of the nature of the model, for the given sample sizes and contamination levels, the results of the simulation study showed that both DFFIT and CD had the best performance compared to the other methods. The results also showed that the deviance residual and hat value methods are the weakest. In addition, the results showed an inverse relationship between the contamination levels and the CDO at different sample sizes. For illustration purposes, real data on 30 patients with leukemia were modelled by binary logistic regression, and the six detection methods were implemented to detect possible outliers; the analysis showed agreement with the findings of the simulation study. |
en_US |