Title | 基于数据预处理和K-均值聚类的支持向量回归预测模型 |
Alternative Title | Data Preprocessing and K-Means Clustering based Support Vector Regression Model
|
Author | 赵伟刚 |
Thesis Advisor | 王建州
|
| 2012-06-02
|
Degree Grantor | Lanzhou University
|
Place of Conferral | Lanzhou
|
Degree Name | Master's
|
Keyword | Data preprocessing methods
Least squares support vector regression
K-means clustering
Data jumps
Outlier handling
EMD-based signal filtering
|
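Abstract | In practice, forecasting certain quantities is of considerable real-world significance, and accuracy is the lifeline of any forecast. How to improve forecasting accuracy has long been a focus of research. The usual approach is to maximize the forecasting model's fitting accuracy on the original series; however, if the data themselves are flawed and fail to reflect the true trend of the series, even a model with excellent fitting accuracy may forecast poorly. To address this, this thesis improves forecasting accuracy through data preprocessing: before forecasting, the original series is checked for abrupt changes, outliers are removed, or noise is reduced. As for the choice of forecasting model, because a training set with high internal similarity can be modeled more effectively, this thesis proposes a new algorithm, the K-means clustering based least squares support vector regression machine (abbreviated K-LSSVR). It first uses K-means clustering to partition the training set into several clusters according to the Euclidean distance between input vectors, then trains a separate LSSVR model on each cluster; at prediction time, the LSSVR model corresponding to the cluster of each input vector is selected to make the forecast. Three simulation experiments show that K-LSSVR generally achieves higher forecasting accuracy than LSSVR (especially when the data contain abrupt changes or outliers), and that data preprocessing improves the accuracy further. |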
Other Abstract | In practice, forecasting is work of great practical significance, and accuracy is its lifeblood. How to improve forecasting accuracy has long been a focus of researchers. The usual approach is to improve the fitting accuracy of the prediction model on the original series, but if the data themselves are flawed and therefore cannot correctly reflect the trend of the series, then no matter how good the fitting accuracy is, the model is likely to forecast poorly. In view of this, this thesis attempts to improve forecasting accuracy through data preprocessing, specifically by detecting data jumps, excluding outliers, or reducing noise prior to forecasting. Furthermore, since a training set with high internal similarity can be modeled more effectively, this thesis introduces a new algorithm, K-means clustering based least squares support vector regression. |
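The K-LSSVR idea summarized in the abstract (cluster the training inputs with K-means, train one LSSVR per cluster, and route each new input to the model of its nearest cluster) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the record does not state the kernel, the hyperparameters, or the number of clusters, so the RBF kernel, the values of `gamma`, `sigma`, and `k`, the plain Lloyd's K-means, and the class names `LSSVR` and `KLSSVR` are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


class LSSVR:
    """Least squares SVR: training reduces to solving one linear system."""

    def __init__(self, gamma=100.0, sigma=1.0):
        self.gamma = gamma  # regularization weight (assumed value)
        self.sigma = sigma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        n = len(X)
        # Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = rbf_kernel(X, X, self.sigma) + np.eye(n) / self.gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        self.b, self.alpha, self.X = sol[0], sol[1:], X
        return self

    def predict(self, X):
        return rbf_kernel(X, self.X, self.sigma) @ self.alpha + self.b


def nearest_center(X, centers):
    """Index of the closest center (Euclidean distance) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns the centers and the training labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_center(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, nearest_center(X, centers)


class KLSSVR:
    """One LSSVR per K-means cluster; inputs are routed to the nearest cluster."""

    def __init__(self, k=2, gamma=100.0, sigma=1.0):
        self.k, self.gamma, self.sigma = k, gamma, sigma

    def fit(self, X, y):
        centers, labels = kmeans(X, self.k)
        used = np.unique(labels)  # drop any cluster that ended up empty
        self.centers = centers[used]
        self.models = [
            LSSVR(self.gamma, self.sigma).fit(X[labels == j], y[labels == j])
            for j in used
        ]
        return self

    def predict(self, X):
        labels = nearest_center(X, self.centers)
        out = np.empty(len(X))
        for i, model in enumerate(self.models):
            mask = labels == i
            if mask.any():
                out[mask] = model.predict(X[mask])
        return out
```

A call such as `KLSSVR(k=2).fit(X_train, y_train).predict(X_new)` expects `X_train` and `X_new` as 2-D arrays and `y_train` as a 1-D array. In practice `k`, `gamma`, and `sigma` would need to be tuned, for example by cross-validation; the defaults here are placeholders only.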
Language | Chinese
|
Document Type | Thesis
|
Identifier | https://ir.lzu.edu.cn/handle/262010/225119
|
Collection | School of Mathematics and Statistics
|
Recommended Citation GB/T 7714 |
赵伟刚. 基于数据预处理和K-均值聚类的支持向量回归预测模型[D]. 兰州: 兰州大学, 2012.
|