[1] HE Li, CHEN Lei, JI Sha-sha, et al. Abnormal Detection of Continuous Water Level Monitoring Data Based on K-shape Clustering[J]. China Water & Wastewater, 2023, 39(11): 56-61.
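Abnormal Detection of Continuous Water Level Monitoring Data Based on K-shape Clustering
China Water & Wastewater[ISSN:1000-4062/CN:12-1073/TU]
Volume:
Vol. 39
Issue:
No. 11, 2023
Pages:
56-61
Column:
Publication Date:
2023-06-01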
- Title:
- Abnormal Detection of Continuous Water Level Monitoring Data Based on K-shape Clustering
- Abstract:
- Because drainage network monitoring started late and monitoring environments are harsh, the quality of urban drainage network operation data is currently poor, which directly limits its effective application and value mining. Anomaly detection, the first step toward using such data effectively, has not yet been effectively carried out in drainage systems. Based on the K-shape clustering algorithm, an anomaly detection process for drainage monitoring data was proposed. First, feature sequences were extracted and clustered to determine the sequences that describe the overall or average features of the time series set, thereby reducing the false positive and false negative rates of anomaly detection. Then, a holistic judgment was made on the identified abnormal sequences to improve the recall of the anomaly detection algorithm. The results showed that the recall and precision of the K-shape-based anomaly detection algorithm for drainage monitoring data reached 0.8917 and 0.8127, respectively. In addition, a comparison with the brute force (BF) algorithm showed that fixed-length time series segmentation increased the false positive and false negative rates, and its performance was inferior to that of the K-shape clustering algorithm.
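- Illustrative sketch (not from the paper):
- The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It uses the shape-based distance (SBD) that K-shape clustering relies on, flags segments whose SBD to a reference shape exceeds a cutoff, and computes precision and recall. The function names, the single `reference` sequence (standing in for a cluster centroid), and the `threshold` value are all hypothetical; the feature extraction and holistic judgment steps described above are not modeled.

```python
import numpy as np


def z_normalize(x):
    """Scale a series to zero mean and unit variance (flat series become zeros)."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std


def sbd(x, y):
    """Shape-based distance: 1 minus the peak normalized cross-correlation."""
    x, y = z_normalize(x), z_normalize(y)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0:
        # A flat series has no shape to compare; treat it as maximally distant.
        return 1.0
    cc = np.correlate(x, y, mode="full")  # cross-correlation at every lag
    return 1.0 - cc.max() / denom


def flag_anomalies(segments, reference, threshold):
    """Return indices of segments whose SBD to the reference exceeds threshold.

    `reference` stands in for a cluster centroid; `threshold` is a
    hypothetical cutoff, not a value taken from the paper.
    """
    return [i for i, s in enumerate(segments) if sbd(s, reference) > threshold]


def precision_recall(flagged, truth):
    """Precision and recall of flagged indices against labeled anomaly indices."""
    flagged, truth = set(flagged), set(truth)
    tp = len(flagged & truth)
    precision = tp / len(flagged) if flagged else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    t = np.linspace(0, 2 * np.pi, 48, endpoint=False)
    reference = np.sin(t)  # synthetic "typical" daily level pattern
    segments = [np.sin(t), 2 * np.sin(t) + 5, np.zeros_like(t)]
    flagged = flag_anomalies(segments, reference, threshold=0.5)
    print(flagged, precision_recall(flagged, truth=[2]))
```

- Because SBD works on z-normalized series and searches over all lags, a segment that differs from the reference only in offset, amplitude, or a small time shift stays close to it, while a flat-lined sensor reading is treated as maximally distant.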
Last Update:
2023-06-01