
Interpolation method of traffic volume missing data based on improved low-rank matrix completion

Journal of Traffic and Transportation Engineering [ISSN:1671-1637/CN:61-1369/U]

Issue:
2019, No. 05
Page:
180-190
Research Field:
Traffic Information Engineering and Control
Publishing date:

Info

Title:
Interpolation method of traffic volume missing data based on improved low-rank matrix completion
Author(s):
CHEN Xiao-bo (1), CHEN Cheng (1), CHEN Lei (2), WEI Zhong-jie (1), CAI Ying-feng (1), ZHOU Jun-jie (3)
(1. School of Automobile and Traffic Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China; 2. Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China; 3. Chery Automobile Co., Ltd., Wuhu 241009, Anhui, China)
Keywords:
intelligent transportation; least square regression; missing value interpolation; low-rank matrix completion; hierarchical clustering; interpolation error
PACS:
U491.1
DOI:
-
Abstract:
An improved low-rank matrix completion method was proposed for interpolating missing road traffic volume data. In the first round, the missing entries of the traffic volume data matrix were interpolated by low-rank matrix completion based on the nuclear norm. A hierarchical clustering algorithm was then applied to partition the traffic volume data into clusters, so that data within the same cluster were strongly correlated while data in different clusters were weakly correlated. Low-rank matrix completion was applied to each cluster separately to complete the second round of interpolation. To reduce the influence of the number of clusters, a least squares regression ensemble learning approach was proposed to combine the interpolation results obtained under different numbers of clusters, yielding the final traffic volume interpolation results. The interpolation errors of five methods were compared on highway traffic volume data from Portland, Oregon, USA, and the influences of the number of clusters and the distance metric were analyzed. The analysis results show that under the completely random missing pattern, when the missing rate is 10%-60%, the interpolation error is reduced by 5.93%-9.11% compared with the traditional low-rank matrix completion model. Under the random and mixed missing patterns, the interpolation errors are reduced by 8.32%-9.55% and 8.14%-9.20%, respectively. Integrating multiple interpolations obtained under different numbers of clusters reduces the interpolation error by 2.62%-4.76% compared with the results under a single number of clusters. Therefore, under all three missing patterns, the improved low-rank matrix completion method effectively reduces the interpolation error of traffic volume data and improves the effectiveness of the data after interpolation. 3 tabs, 11 figs, 30 refs.
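
The pipeline summarized in the abstract can be sketched in a few dozen lines. The code below is a minimal illustration and not the authors' implementation. It makes several assumptions that the abstract does not state: rows of the matrix are the series being clustered and columns are time intervals; the nuclear-norm step is approximated by iterative singular value soft-thresholding; average linkage is used for the hierarchical clustering; and the least squares weights are fitted on a randomly held-out share of the observed entries. All function names and parameters (`soft_impute`, `clustered_completion`, `ensemble_impute`, `lam`, `holdout`) are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def soft_impute(X, mask, lam=1.0, n_iter=200, tol=1e-5):
    """Nuclear-norm-regularized completion by iterative singular value soft-thresholding.

    X: 2-D array (values at unobserved positions are ignored, may be NaN).
    mask: boolean array, True where X is observed.
    """
    Z = np.where(mask, X, 0.0)
    for _ in range(n_iter):
        # Keep observed entries, fill missing ones with the current estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrink the singular values toward zero to encourage low rank.
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return np.where(mask, X, Z)


def clustered_completion(X, mask, first_pass, k, lam=1.0, metric="euclidean"):
    """Second round: cluster rows of the first-round result, complete each cluster."""
    tree = linkage(first_pass, method="average", metric=metric)
    labels = fcluster(tree, t=k, criterion="maxclust")
    out = first_pass.copy()
    for c in np.unique(labels):
        rows = labels == c
        out[rows] = soft_impute(X[rows], mask[rows], lam=lam)
    return out


def ensemble_impute(X, mask, ks=(2, 3, 4), lam=1.0, holdout=0.1, seed=0):
    """Blend the results for several cluster counts with least squares weights."""
    rng = np.random.default_rng(seed)
    # Hide a share of the observed entries to serve as regression targets.
    val = mask & (rng.random(X.shape) < holdout)

    def candidates(m):
        first = soft_impute(X, m, lam=lam)
        return [clustered_completion(X, m, first, k, lam=lam) for k in ks]

    # Design matrix: one column per cluster count, one row per held-out entry.
    A = np.column_stack([c[val] for c in candidates(mask & ~val)])
    w, *_ = np.linalg.lstsq(A, X[val], rcond=None)
    # Recompute with all observed entries and blend using the learned weights.
    blended = np.stack(candidates(mask), axis=-1) @ w
    return np.where(mask, X, blended), w
```

Calling `ensemble_impute(X, mask)` returns the completed matrix together with the fitted weight for each cluster count. Observed entries are passed through unchanged, so only the missing positions take the blended estimate. The threshold `lam` and the candidate cluster counts `ks` would need tuning on real data.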


Memo

Memo:
-
Last Update: 2019-11-13