基于核范数的深度生存分析研究
Alternative Title: Deep Survival Analysis Based on Nuclear Norm
童剑阳
Subtype: Master's
Thesis Advisor: 赵学靖
2021-05-15
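Degree Grantor: Lanzhou University
Place of Conferral: Lanzhou
Degree Name: Master of Science
Degree Discipline: Probability Theory and Mathematical Statistics
Keyword: survival analysis; deep neural network; missing data imputation; nuclear norm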
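Abstract: Survival analysis is the discipline that studies survival status and survival time and their statistical relationships with covariates. Survival data are subject to different types of censoring (left censoring, interval censoring, and right censoring). This thesis studies right-censored survival data from medical statistics. It addresses missing values and the representation of categorical variables in the raw data, brings deep learning methods into survival analysis models, applies the DeepSurv algorithm to a range of datasets, compares it with other widely used, well-performing algorithms, and proposes improvements to it.

For large-scale survival data, this thesis uses MIMIC-III, a classic dataset in medical statistics. Two data-processing issues arise. First, the dataset has many missing values; this thesis applies the nuclear norm method, which is widely used in image denoising, to complete the survival-analysis covariate matrix in pursuit of better imputation. Second, the covariates include many categorical variables; this thesis considers dummy variables and an empirical-Bayes-based representation to handle them in pursuit of more reasonable interpretability.

Training a parametric survival model with a deep neural network can suffer from slow training and overfitting. To address this, the algorithm uses the Scaled Exponential Linear Unit (SELU) activation function, the Adaptive Moment Estimation (Adam) optimization algorithm, learning-rate decay, and Dropout of network units to obtain the final model. Based on these considerations, this thesis proposes a nuclear-norm-based deep survival analysis algorithm (NN-DeepSurv) and applies it to several datasets, including simulated data and MIMIC-III, to test its effectiveness and practicality.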
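The abstract states that a nuclear norm method is used to complete the covariate matrix but does not give the algorithm. As an illustration only, the sketch below uses one common nuclear-norm approach, iterative soft-thresholding of singular values (often called Soft-Impute). The function name `soft_impute`, the penalty `lam`, and the stopping rule are assumptions made for this example, not the thesis's implementation.

```python
import numpy as np

def soft_impute(X, lam=1.0, n_iters=200, tol=1e-6):
    """Fill NaN entries of X using a nuclear-norm-penalized low-rank estimate.

    Illustrative sketch: approximately minimizes
    0.5 * ||P_obs(X - Z)||_F^2 + lam * ||Z||_*
    by repeatedly soft-thresholding the singular values.
    """
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)        # True where a value was observed
    Z = np.zeros_like(X)       # current low-rank estimate
    for _ in range(n_iters):
        # Keep observed entries; take missing ones from the current estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)   # soft-threshold the singular values
        Z_new = (U * s) @ Vt
        change = np.linalg.norm(Z_new - Z)
        Z = Z_new
        if change <= tol * max(np.linalg.norm(Z), 1.0):
            break
    # Observed values are returned unchanged; only missing cells are imputed.
    return np.where(mask, X, Z)
```

A larger `lam` shrinks more singular values to zero and so yields a lower-rank, more heavily smoothed imputation; in practice it would be chosen by validation on held-out observed entries.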
Other Abstract: Survival analysis is a discipline that focuses mainly on survival status, survival time, and their statistical relationships with covariates. Survival data are subject to different kinds of censoring, including left censoring, interval censoring, and right censoring. The data studied in this paper are right-censored survival data from medical statistics. This paper addresses missing data and the representation of categorical attributes, and uses deep learning methods in the survival analysis model to propose a new algorithm for the regression of survival data. It applies the DeepSurv algorithm to different kinds of data, compares it with other well-performing algorithms to assess DeepSurv's performance, and then proposes a new method based on it.

For large-scale survival data, the classical MIMIC-III dataset from medical statistics is selected. The dataset has a large proportion of missing values, so the nuclear norm method, which is commonly used in image denoising, is used to complete the missing data. The data also contain many discrete categorical variables, so the dummy variable method and the empirical Bayes method are considered in order to obtain better interpretability.

When a deep neural network is used to train the parametric survival model, problems such as slow convergence and overfitting may arise. To address them, this paper applies the Scaled Exponential Linear Unit (SELU) activation function, the adaptive moment estimation (Adam) optimization algorithm, learning-rate decay, and Dropout to obtain the final model. In summary, this paper proposes a DeepSurv algorithm based on the nuclear norm (NN-DeepSurv) and applies it to simulated and real data (including MIMIC-III) to illustrate the performance of the proposed algorithm.
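DeepSurv, on which the proposed method builds, trains a network to output a risk score per subject by minimizing the negative Cox log partial likelihood on right-censored data. The record does not include the thesis's code, so the following is a minimal NumPy sketch of that loss only. The function name and argument names are assumptions for this example, tied event times are not handled, and the regularization term is omitted.

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Average negative Cox log partial likelihood (assumes no tied times).

    risk  : predicted log-risk scores h(x_i), e.g. the network output
    time  : observed times (event time or censoring time)
    event : 1 if the event was observed, 0 if right-censored
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    # Sort by descending time so the risk set of subject i
    # (all j with T_j >= T_i) is the prefix ending at i.
    order = np.argsort(-time, kind="stable")
    risk, event = risk[order], event[order]

    # log of the running sum of exp(risk), computed stably.
    log_risk_set = np.logaddexp.accumulate(risk)

    # Only subjects with an observed event contribute a term.
    n_events = max(int(event.sum()), 1)
    return -np.sum((risk - log_risk_set)[event]) / n_events

# Usage: three subjects, the second one right-censored.
loss = cox_neg_log_partial_likelihood(
    risk=[0.0, 0.0, 0.0], time=[3.0, 2.0, 1.0], event=[1, 0, 1]
)
```

Censored subjects still appear in the risk sets of earlier events, which is how right censoring enters the loss without discarding those records.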
Pages: 44
Language: Chinese
Document Type: Thesis
Identifier: https://ir.lzu.edu.cn/handle/262010/459644
Collection: School of Mathematics and Statistics
Affiliation: School of Mathematics and Statistics
First Author Affiliation: School of Mathematics and Statistics
Recommended Citation
GB/T 7714
童剑阳. 基于核范数的深度生存分析研究[D]. 兰州. 兰州大学,2021.
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.