Secondary abstract: |
In the present thesis I introduce and evaluate a new machine learning method for estimating survival functions from survival analysis data.
First, I describe the field of survival analysis and the problems it addresses. I introduce and define its basic terms, such as the survival function and the survival curve. I also define censored data, a distinctive feature of survival analysis data, and explain their importance and the learning problems they cause. As a reference method I describe the Kaplan-Meier estimator, a well-known statistical method for estimating survival curves that serves as the conceptual basis for the proposed method. I close the introduction with a short overview of the advances of machine learning in survival analysis, concluding that there are so far no well-established machine learning methods in this field.
I continue with an in-depth description of the proposed method and its potential advantages. To test the new method thoroughly, I start with a series of tests on artificially generated data from a physics domain. The new method proves useful and can match the accuracy of the Kaplan-Meier estimator. I discuss the problem of nonmonotonic survival curve estimates, which the proposed method can produce. All the tests are then repeated on a set of real medical data describing the prognostic value of protein markers for the survival of metastatic breast cancer patients. The results further confirm the usefulness of the proposed method. In conclusion, I present possibilities for improving the proposed method and suggest other prospects for using machine learning techniques in survival analysis. |