Title
A Proposed Statistical Model to Study the factors affecting the Recurrence of an Individual’s stroke by Myocardial Infarction with Application on Ain Shams University Hospitals/
Publisher
Ain Shams University.
Author
Hassan, Abeer Ragheb Azazy.
Preparation Committee
Supervisor / عمرو إبراهيم عبد الرحمن الأتربى
Supervisor / دينا حسن عبد الهادى
Examiner / سهير فهمى حجازى
Examiner / طلبة زين الدين
Researcher / Abeer Ragheb Azazy Hassan
Subject
Categorical Data Analysis. Logistic Regression Model. Discriminant Analysis. Cross-Validation.
Publication Date
2013.
Number of Pages
201 p.
Language
English
Degree
Master's
Specialization
Statistics and Probability
Date of Approval
1/1/2013
Place of Approval
Ain Shams University - Faculty of Commerce - Statistics, Mathematics and Insurance.

Abstract

In this thesis, we use classification methods to determine which risk factors, in addition to treatment by catheter, affect the occurrence of a second Myocardial Infarction, and to estimate its probability, for patients who were initially treated for a first Myocardial Infarction and remained Myocardial Infarction-free for at least two years after that treatment. We consider these patients because they are at high risk of developing a second Myocardial Infarction. Catheter treatment and the following risk factors are studied using the Logistic Regression Model and the Discriminant Analysis Model: Age at first Myocardial Infarction, Gender, Angina, Chronic Lung Disease, Hypertension, Peripheral Vascular Disease, Hypercholesterolemia, Diabetes, Cerebrovascular Disease, Cardiogenic Shock, Congestive Heart Failure, Smoking, Family History, and Renal Failure.
We applied the logistic regression (LR) model to estimate the probability of having a second Myocardial Infarction. Odds ratio analysis is used to compare, for each factor, whether the probability of having a second Myocardial Infarction is the same in the two groups. To test the significance of the coefficients, we used the Wald test and the likelihood ratio test. The Hosmer and Lemeshow test and cross-validation are used to assess the fit of the model. Linear Discriminant Analysis (LDA) is used as a comparative method against the logistic regression results.
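As an illustration of the odds ratio and the Wald test for a single dichotomous factor, the following minimal Python sketch works from a 2x2 table. For one binary predictor, the logistic regression coefficient equals the log odds ratio, so the Wald statistic reduces to the form shown. The function name and the idea of passing raw cell counts are assumptions made for illustration; no counts from the thesis data are used.

```python
import math


def odds_ratio_wald(a, b, c, d):
    """Odds ratio and Wald test for a 2x2 table (hypothetical helper).

    a: factor present, second MI      b: factor present, no second MI
    c: factor absent,  second MI      d: factor absent,  no second MI
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    # Standard error of the log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Wald z statistic for H0: log odds ratio = 0 (odds ratio = 1)
    z = log_or / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, z, p_value
```

An odds ratio of 1 means the odds of a second Myocardial Infarction are the same in the two groups; the Wald test asks whether the observed departure from 1 is larger than chance would explain.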
Nature of the Problem:
Sometimes, after therapy, the patient develops a second Myocardial Infarction, which may leave the patient feeling depressed and hopeless about the therapy. We may assume that Age and Smoking are the factors causing the occurrence of a second Myocardial Infarction, but we need to test such assumptions and to determine how far these factors are responsible for it.
Objectives of the Study:
Early detection and evaluation of the risk factors that might cause a second Myocardial Infarction is very important. Identifying these risk factors is an important pivot of the war against Myocardial Infarction: it may help doctors focus on the factors involved, inform patients so that they can avoid them, and give more care to patients predicted to have a second Myocardial Infarction. Using statistical methods to identify risk factors also helps to estimate the probability of a second Myocardial Infarction occurring.
This Study Proposes to:
a. Identify the independent variables that affect membership in the second Myocardial Infarction occurrence group, and propose a statistical model to explain the association between the studied covariates and second Myocardial Infarction occurrence.
b. Establish a classification system using the logistic model to determine group membership from the estimated probability and the chosen cut-point: at a cut-point of 0.5, a patient whose estimated probability exceeds 0.5 is classified as an expected second Myocardial Infarction patient.
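The cut-point rule in item (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the intercept, and the coefficients below are hypothetical placeholders and are not the fitted values from the thesis.

```python
import math


def predicted_probability(intercept, coefficients, covariates):
    """Logistic model probability: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, covariates)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def classify(probability, cut_point=0.5):
    """Return 1 (expected second MI) only if the probability exceeds the cut-point."""
    return 1 if probability > cut_point else 0


# Hypothetical example: two dichotomous risk factors, both present.
example_probability = predicted_probability(-3.0, [1.2, 0.9], [1, 1])
example_group = classify(example_probability)
```

Because the rule uses "exceeds", a patient with an estimated probability of exactly 0.5 is assigned to the no-second-Myocardial-Infarction group in this sketch.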
Logistic regression analysis is used to estimate the optimal model, which helps us estimate the probability of a second Myocardial Infarction. We used the Wald test, the likelihood ratio test, the Hosmer-Lemeshow test, cross-validation, and the ROC curve to verify the model.
Linear Discriminant Analysis (LDA) is used for comparison with the Logistic Regression (LR) results. We show that the results of LDA and LR are close even when the normality assumptions do not hold, and set some guidelines for recognizing these situations.
Source of Data and Variables:
Data for the case study consisted of 2692 patients discharged with a diagnosis of acute myocardial infarction (AMI, or heart attack) from Ain Shams University hospitals, Cairo, Egypt, between 2006 and 2007. All variables used in the current study were either continuous or dichotomous. Dichotomous variables denoted the presence or absence of a specific condition or risk factor. We assumed that the condition was absent unless it was explicitly documented in the patient’s medical record as being present. We excluded as candidate variables those continuous variables that were missing for more than 10% of the subjects. We then excluded those subjects who had missing data for any of the remaining continuous variables. This reduced our sample to 1500 patients who meet the study assumptions, as follows:
a) Patients have had a first Myocardial Infarction.
b) Patients remained Myocardial Infarction-free for at least two years after first Myocardial Infarction treatment.
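The two-step handling of missing continuous data described above (drop variables missing for more than 10% of subjects, then drop subjects missing any remaining variable) can be sketched as follows. The record layout (a list of dictionaries with None marking a missing value) and the function name are assumptions for illustration only; the thesis does not describe how the data were stored.

```python
def filter_complete_cases(records, continuous_vars, max_missing_fraction=0.10):
    """Apply the two exclusion steps to a list of patient records.

    records: list of dicts; a value of None (or an absent key) means missing.
    Returns the retained variable names and the retained records.
    """
    if not records:
        return [], []
    n = len(records)
    # Step 1: keep only variables missing for at most 10% of subjects.
    kept_vars = [
        var
        for var in continuous_vars
        if sum(rec.get(var) is None for rec in records) / n <= max_missing_fraction
    ]
    # Step 2: keep only subjects with no missing value on the kept variables.
    complete_records = [
        rec for rec in records if all(rec.get(var) is not None for var in kept_vars)
    ]
    return kept_vars, complete_records
```

Doing the variable screen first matters: a variable with many gaps would otherwise remove a large share of subjects in the second step.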
Dependent variable: occurrence of a second Myocardial Infarction.
Independent variables:
a) Patient’s Age at first Myocardial Infarction.
b) Patient’s Gender.
c) Patient’s Cardiogenic Shock.
d) Patient’s Angina.
e) Patient’s Chronic Lung Disease.
f) Patient’s Hypertension.
g) Patient’s Cerebrovascular Disease.
h) Patient’s Family History.
i) Patient’s Peripheral Vascular Disease.
j) Patient’s Congestive Heart Failure.
k) Patient’s Smoking.
l) Patient’s Renal Failure.
m) Patient’s Hypercholesterolemia.
n) Patient’s Diabetes.
Most medical risk factors (e.g., previous Percutaneous Coronary Intervention (PCI) and previous Coronary Artery Bypass Graft (CABG)) were not available in the hospital records when the research was conducted.
Results
The Logistic Regression model and the Discriminant Analysis show that patients who have Congestive Heart Failure, patients who have a family history of Myocardial Infarction, and patients with a smoking history are more exposed to a second Myocardial Infarction.
The hit rate for the full binary Logistic Regression model classification is 93.1%, and for the full Discriminant Analysis model classification it is 92.3%. Under cross-validation, the hit rate is 93.1% for the full binary Logistic Regression model and 92.1% for the full Discriminant Analysis model. This means that the model has high classification accuracy and is fit for prediction, so we can depend on the Logistic Regression model for estimating the probability of a second Myocardial Infarction; the significant factors are important and play a role in determining this probability.
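For readers unfamiliar with the terms, the hit rate is the share of patients whose predicted group matches their observed group, and a cross-validated hit rate is computed on cases that were left out when the model was fitted. The sketch below shows a leave-one-out version in plain Python. The abstract does not state which cross-validation scheme was used for each model, so leave-one-out is an assumption here, and the midpoint classifier is a deliberately simple stand-in, not the logistic or discriminant model of the thesis.

```python
def hit_rate(actual, predicted):
    """Proportion of cases whose predicted group equals the observed group."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def leave_one_out_hit_rate(X, y, fit, predict):
    """Refit the model n times, each time predicting the single held-out case."""
    predictions = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        predictions.append(predict(model, X[i]))
    return hit_rate(y, predictions)


# Stand-in classifier (hypothetical): threshold at the midpoint of the two
# group means of the first covariate; assumes group 1 has the larger mean
# and that both groups appear in every training set.
def fit_midpoint(X, y):
    group0 = [x[0] for x, label in zip(X, y) if label == 0]
    group1 = [x[0] for x, label in zip(X, y) if label == 1]
    return (sum(group0) / len(group0) + sum(group1) / len(group1)) / 2


def predict_midpoint(threshold, x):
    return 1 if x[0] > threshold else 0
```

A cross-validated hit rate close to the resubstitution hit rate, as reported here, suggests the classification accuracy is not an artifact of fitting and testing on the same patients.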
The hit rate for the forward likelihood stepwise binary Logistic Regression method classification is 93.1%, and for the Wilks' lambda stepwise Discriminant Analysis method classification it is 92.5%. Under cross-validation, the hit rate is 93.1% for the forward likelihood stepwise binary Logistic Regression method and 92.5% for the Wilks' lambda stepwise Discriminant Analysis method.
This means that the stepwise model, using only the significant factors, has high classification accuracy, no lower than that of the full model, and is fit for prediction. We therefore recommend depending on the stepwise Logistic Regression model for estimating the probability of a second Myocardial Infarction; the significant factors are important and play a role in determining this probability.
Chapter one gives an overview of second Myocardial Infarction occurrence, its causes, and its types. In addition, this chapter presents the importance and the objectives of the study.
In chapter two, we give an introduction to categorical data analysis: its definition, types, properties, and methods of analysis.
In chapter three, we present the logistic regression model, the Wald test, the likelihood ratio test, the Hosmer and Lemeshow test, cross-validation methods, and the ROC curve.
In chapter four, we present an introduction to the discriminant analysis function, Wilks' lambda statistic, the eigenvalue, the canonical correlation, and Press's Q statistic.
In chapter five, we apply binary logistic regression analysis and discriminant analysis to estimate the probability of the occurrence of a second Myocardial Infarction, and give a summarized comparison between the two methods, using SPSS v.19.0 software.
A summary, conclusions, and recommendations for future research are also given in this chapter.
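Press's Q statistic, listed among the chapter four topics, tests whether a classification hit rate is better than chance. A minimal sketch of the usual textbook form, Q = (N - nK)^2 / (N(K - 1)), follows; the function name and the example counts in the comments are hypothetical and are not results from the thesis.

```python
def press_q(n_total, n_correct, n_groups):
    """Press's Q = (N - n*K)^2 / (N * (K - 1)).

    n_total   (N): number of classified cases
    n_correct (n): number of correctly classified cases
    n_groups  (K): number of groups (2 for second MI vs. no second MI)
    """
    if n_total <= 0 or n_groups < 2 or not 0 <= n_correct <= n_total:
        raise ValueError("invalid counts for Press's Q")
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))


# Q is compared with a chi-square critical value with 1 degree of freedom
# (3.84 at the 0.05 level); a larger Q indicates classification better
# than chance.
CHI_SQUARE_CRITICAL_1DF_05 = 3.84
```

For instance, with hypothetical counts of 100 cases, 90 classified correctly, and 2 groups, Q is well above the critical value, whereas 50 correct out of 100 gives Q = 0.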