New Algorithms for Finding Approximate Frequent Item Sets
Type of publication: | Article |
Citation: | |
Journal: | Soft Computing - A Fusion of Foundations, Methodologies and Applications |
Volume: | 16 |
Number: | 2 |
Year: | 2012 |
Month: | April |
Pages: | 903-917 |
ISSN: | 1432-7643 (Print), 1433-7479 (Online) |
Abstract: | In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains. |
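The relaxed support definition described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it is a naive count, assuming that a transaction supports an item set when at most `max_missing` of the set's items are absent. The function names `subset_size_distribution` and `approximate_support`, the `max_missing` parameter, and the toy transactions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def subset_size_distribution(transactions, item_set):
    """Count, for each m, the transactions containing exactly m items of item_set.

    Illustrative stand-in for the "subset size occurrence distribution"
    named in the abstract; the paper's own computation may differ.
    """
    items = set(item_set)
    return Counter(len(items & set(t)) for t in transactions)


def approximate_support(transactions, item_set, max_missing=1):
    """Number of transactions missing at most max_missing items of item_set.

    Assumption: every missing item is tolerated equally (no weighting).
    With max_missing=0 this reduces to the standard support.
    """
    needed = len(set(item_set)) - max_missing
    dist = subset_size_distribution(transactions, item_set)
    return sum(count for present, count in dist.items() if present >= needed)


# Toy data, made up for this sketch.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "d"},
    {"d"},
]

exact = approximate_support(transactions, {"a", "b", "c"}, max_missing=0)
relaxed = approximate_support(transactions, {"a", "b", "c"}, max_missing=1)
```

On this toy data only one transaction contains all of `a`, `b` and `c`, while three contain at least two of them, so relaxing the requirement lets the group `{a, b, c}` surface as frequent at a threshold where strict support would miss it.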