应用智能组合模型预测中国肺结核月发病人数
Alternative Title: Prediction on Monthly Prevalence of Tuberculosis in China Using Intelligent Combination Model
乔贺倩
Thesis Advisor: 李维德
2018-03-20
Degree Grantor: 兰州大学
Place of Conferral: 兰州
Degree Name: Master's degree
Keyword: monthly number of tuberculosis cases; wavelet analysis; singular spectrum analysis; STL decomposition; Elman network
Abstract

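This thesis studies the prediction of the monthly number of tuberculosis (TB) cases in China, with the aim of providing data support for TB control. Because the monthly case series fluctuates periodically, the prediction is approached through series decomposition.

First, wavelet analysis (WA) and singular spectrum analysis (SSA) are used to decompose the original series and extract its periodic components. Wavelet analysis yields a trend series, two periodic series, and a residual series. The trend series and the two periodic series are predicted with an extreme learning machine (ELM) in one model and with a nonlinear filter (NAR) in the other, while the residual series is predicted with support vector regression (SVR). This gives two combined prediction models, WA-ELM-SVR and WA-NAR-SVR. SSA decomposition yields a reconstructed series and a residual series; the reconstructed series is predicted with ELM or NAR and the residual series with SVR, which gives two further combined models, SSA-ELM-SVR and SSA-NAR-SVR. In each combined model, the predicted monthly case count is the sum of the component predictions: the trend, periodic, and residual series for wavelet analysis, or the reconstructed and residual series for SSA. Compared with ELM and NAR models built directly on the original series, the four combined models predict better, which indicates that wavelet analysis and SSA can extract the periodicity of the original series and improve prediction accuracy.

Next, because monthly TB case counts are affected by seasonal factors, the original series has a seasonal cycle: within each year, cases rise steadily from January to June and then gradually fall from July to December. The series is therefore decomposed by Seasonal-Trend Decomposition using LOESS (STL) into a seasonal index, a long-term trend, and a residual term. The seasonal index stays fixed across years and needs no prediction model. A feedback neural network (Elman network) predicts the long-term trend and SVR predicts the residual term, giving the STL-Elman-SVR model, whose final prediction is likewise the sum of the seasonal index and the predicted trend and residual. The predictions are compared with the observed values and with two comparison models, STL-ARIMA and STL-GM(1,1); the comparison shows that STL decomposition accurately extracts a stable seasonal index and supports prediction accuracy.

This thesis introduces wavelet analysis, SSA, and STL decomposition into case-count prediction and uses machine learning algorithms to predict the decomposed components, building combined prediction models. The high-incidence months predicted by the combined models can provide a quantitative basis for improving TB prevention measures in China.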
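
As a rough illustration of the wavelet step, the sketch below splits a series into a smooth approximation plus detail layers that sum back to the original. That additive property is what allows component predictions to be summed into a final prediction. It assumes a Haar wavelet and a series length divisible by 2**levels; the abstract names neither the wavelet family nor the decomposition depth, and `haar_additive` and the toy numbers are hypothetical, not the thesis's code or data.

```python
def haar_additive(series, levels):
    """Additive Haar multiresolution: series = approximation + sum(details).

    Assumption: Haar wavelet, chosen only because it is easy to write out.
    """
    n = len(series)
    if levels < 1 or n % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels")

    current = [float(v) for v in series]
    details = []
    for level in range(1, levels + 1):
        block = 2 ** level
        # The Haar approximation at this level is the mean over each dyadic block.
        smooth = []
        for start in range(0, n, block):
            mean = sum(series[start:start + block]) / block
            smooth.extend([mean] * block)
        # The detail layer is what the coarser approximation leaves out.
        details.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    return current, details


# Toy numbers for illustration only (not TB data).
toy = [1, 3, 5, 7, 2, 4, 6, 8]
approx, details = haar_additive(toy, levels=2)

# Adding the approximation and every detail layer recovers the input.
rebuilt = [a + sum(d[i] for d in details) for i, a in enumerate(approx)]
```

Each returned component could then be forecast by its own model and the forecasts added, in the spirit of the WA-ELM-SVR and WA-NAR-SVR models described above.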

Other Abstract

The main purpose of this thesis is to predict the monthly prevalence of tuberculosis (TB) in China, so that the results can provide data support for TB control. Given the periodic character of the monthly prevalence series, the prediction is based on series decomposition.
First, Wavelet Analysis (WA) and Singular Spectrum Analysis (SSA) are each used to decompose the original series and extract its periodic components. Wavelet analysis yields a trend series, two periodic series, and a residual series. The trend series and the two periodic series are predicted by an Extreme Learning Machine (ELM) in one model and by a Nonlinear Autoregressive filter (NAR) in the other, and the residual series is predicted by Support Vector Regression (SVR). This gives two combined forecasting models, WA-ELM-SVR and WA-NAR-SVR. SSA decomposition yields a reconstructed series and a residual series. The reconstructed series is predicted by ELM or NAR and the residual series by SVR, which gives two further combined models, SSA-ELM-SVR and SSA-NAR-SVR. In each combined model, the predicted monthly prevalence of TB is the sum of the component predictions: the trend, periodic, and residual series for wavelet analysis, or the reconstructed and residual series for SSA. The four combined models predict better than ELM and NAR models built on the original series, which shows that wavelet analysis and SSA can extract the periodic information and improve prediction accuracy.
Second, the monthly prevalence of TB is affected by seasonal factors, so the original series fluctuates with a seasonal cycle: in every year, prevalence rises from January to June and gradually declines from July to December. The original series is therefore decomposed into a seasonal index, a long-term trend, and an error term using the Seasonal-Trend Decomposition using LOESS (STL) method. The seasonal index remains fixed across years. The long-term trend is predicted by a feedback (recurrent) neural network, the Elman network, and SVR is trained to predict the error term, which establishes the STL-Elman-SVR model. According to this model, the predicted monthly prevalence of TB is the sum of the seasonal index and the predicted long-term trend and error term. Comparing the predictions with the actual values and with two contrast models, STL-ARIMA and STL-GM(1,1), leads to the conclusion that STL decomposition extracts the periodic information of the original series effectively and supports prediction accuracy.
In this thesis, wavelet analysis, SSA, and STL are introduced into the prediction of prevalence. Machine learning algorithms are used to predict the decomposed components, and combined forecasting models are established. The months of high TB prevalence can be predicted by the combined models, which provides a quantitative basis for improving preventive measures in China.
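
The seasonal-index, trend, and error-term split can be sketched with a classical additive decomposition. This is a simplified stand-in for STL, assuming a centered moving average for the trend instead of LOESS smoothing and no robustness iterations; `classical_additive_decompose`, `combine_forecast`, and the synthetic series are hypothetical illustrations, not the thesis's code or data.

```python
def classical_additive_decompose(series, period=12):
    """Split a series into trend, a fixed seasonal index, and a residual.

    Simplified stand-in for STL: moving-average trend, no LOESS.
    """
    n = len(series)
    half = period // 2
    if period % 2 != 0 or n < 2 * period:
        raise ValueError("expects an even period and at least two full cycles")

    # Centered 2 x period moving average; undefined (None) near both ends.
    trend = [None] * n
    for t in range(half, n - half):
        interior = series[t - half + 1:t + half]
        total = sum(interior) + 0.5 * (series[t - half] + series[t + half])
        trend[t] = total / period

    # Seasonal index: mean detrended value at each position in the cycle,
    # centered so that the index sums to zero over one cycle.
    buckets = [[] for _ in range(period)]
    for t in range(half, n - half):
        buckets[t % period].append(series[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    seasonal_index = [r - offset for r in raw]

    residual = [
        None if trend[t] is None
        else series[t] - trend[t] - seasonal_index[t % period]
        for t in range(n)
    ]
    return trend, seasonal_index, residual


def combine_forecast(seasonal_index, trend_forecast, residual_forecast, start):
    """Sum the fixed seasonal index with forecast trend and residual values.

    `start` is the time index of the first forecast step.
    """
    period = len(seasonal_index)
    return [
        seasonal_index[(start + h) % period] + tr + re
        for h, (tr, re) in enumerate(zip(trend_forecast, residual_forecast))
    ]


# Synthetic series: linear trend plus a repeating pattern (period 4 for brevity).
pattern = [1.0, -2.0, 3.0, -2.0]
synthetic = [10 + 0.5 * t + pattern[t % 4] for t in range(12)]
trend, seasonal_index, residual = classical_additive_decompose(synthetic, period=4)

# Placeholder component forecasts; in the thesis these would come from
# the Elman network (trend) and SVR (error term).
forecast = combine_forecast(seasonal_index, [16.0, 16.5], [0.1, -0.1], start=12)
```

Because the seasonal index is reused unchanged for future periods, only the trend and error term need a forecasting model, which matches the division of labor the abstract describes.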

Language: Chinese
Document Type: Thesis
Identifier: https://ir.lzu.edu.cn/handle/262010/224390
Collection: 数学与统计学院
Recommended Citation
GB/T 7714
乔贺倩. 应用智能组合模型预测中国肺结核月发病人数[D]. 兰州. 兰州大学,2018.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.