A disk-aware algorithm for time series motif discovery

Bibliographic details
Published in:	Data mining and knowledge discovery. - 2003. - 22(2011), 1-2 (10 Jan.), pages 73-105
Main author:	Mueen, Abdullah (Author)
Other authors:	Keogh, Eamonn; Zhu, Qiang; Cash, Sydney S; Westover, M Brandon; Bigdely-Shamlo, Nima
Format:	Online article
Language:	English
Published:	2011
Collection access:	Data mining and knowledge discovery
Subjects:	Journal Article; Bottom-up search; Pruning; Random references; Time series motifs

Description
Abstract:	Time series motifs are sets of very similar subsequences of a long time series. They are of interest in their own right, and are also used as inputs in several higher-level data mining algorithms including classification, clustering, rule-discovery and summarization. In spite of extensive research in recent years, finding time series motifs exactly in massive databases is an open problem. Previous efforts either found approximate motifs or considered relatively small datasets residing in main memory. In this work, we leverage off previous work on pivot-based indexing to introduce a disk-aware algorithm to find time series motifs exactly in multi-gigabyte databases which contain on the order of tens of millions of time series. We have evaluated our algorithm on datasets from diverse areas including medicine, anthropology, computer networking and image processing and show that we can find interesting and meaningful motifs in datasets that are many orders of magnitude larger than anything considered before.
Description:	Date revised: 28.09.2020; published: Print-Electronic; citation status: PubMed-not-MEDLINE
ISSN:	1384-5810
DOI:	10.1007/s10618-010-0176-8
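
Illustration:	The abstract refers to pivot-based indexing and the subject terms list "Pruning" and "Random references". The sketch below shows one generic way a randomly chosen reference series can prune an exact closest-pair search through the triangle inequality. It is an in-memory illustration under simplifying assumptions (equal-length series, plain Euclidean distance, a single reference), not the disk-aware algorithm of the article. The function names `euclidean` and `closest_pair_with_reference` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_pair_with_reference(series, seed=0):
    """Return (distance, i, j) for the closest pair in `series`.

    Hypothetical helper: one random reference series is used to skip
    pairs that cannot beat the best distance found so far.
    """
    if len(series) < 2:
        raise ValueError("need at least two series")
    rng = random.Random(seed)
    ref = series[rng.randrange(len(series))]

    # Distance from every series to the reference, sorted ascending.
    ref_dist = sorted((euclidean(s, ref), idx) for idx, s in enumerate(series))

    best = (math.inf, -1, -1)
    n = len(ref_dist)
    for a in range(n):
        da, i = ref_dist[a]
        for b in range(a + 1, n):
            db, j = ref_dist[b]
            # Triangle inequality: d(i, j) >= |d(i, ref) - d(j, ref)| = db - da.
            # The list is sorted, so every later entry has an even larger gap
            # and the inner loop can stop here.
            if db - da >= best[0]:
                break
            d = euclidean(series[i], series[j])
            if d < best[0]:
                best = (d, min(i, j), max(i, j))
    return best
```

Calling `closest_pair_with_reference(list_of_series)` returns the smallest pairwise distance together with the indices of the two series. The result is exact because a pair is skipped only when its lower bound already equals or exceeds the best distance found so far; the reference affects how much work is saved, not the answer.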