Abstract: Real-life transactional databases usually contain both item information and hierarchical information in the form of a taxonomy. Mining can traverse the hierarchy in different ways, using either a generalized or a specialized approach. Classical frequent pattern mining algorithms, such as Apriori and FP-growth, can produce a huge number of redundant rules if they are applied at the primitive level. A concept-level hierarchy mining algorithm is therefore needed to mine multi-level concept hierarchies. This thesis tackles two main issues: proposing a hybrid approach to mining concept-level hierarchies, and applying this approach in the biological domain. The Adaptive-H-Struct algorithm is proposed to address the problem of mining concept-level hierarchies. Adaptive-H-Struct is a pattern-growth method, which avoids the candidate generate-and-test approach of Apriori. It produces frequent patterns that are then used for rule generation. The efficiency of the algorithm derives from the high flexibility of FP-growth. To assess scalability, an extensive performance study was carried out on synthetic data, which showed that Adaptive-H-Struct is efficient and scalable. It outperforms the previously proposed algorithms for mining generalized association rules: Cumulate, Prutax, and Ready-and-GO. Across all experiments of the performance study, Adaptive-H-Struct proved to be 38 times faster (on average) than Cumulate, 237 times faster than Prutax, and 48 times faster than Ready-and-GO. However, the data itself should contain correlations.
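To make the problem setting concrete, the following is a minimal brute-force sketch of generalized frequent itemset mining over a taxonomy. It is not the Adaptive-H-Struct algorithm, whose internals are not described here. It only illustrates the baseline idea of extending each transaction with the ancestors of its items before counting. The taxonomy, the transactions, and all function names are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (child -> parent); not taken from the thesis.
TAXONOMY = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
    "shoes": "footwear",
    "hiking boots": "footwear",
}

# Hypothetical transactions at the primitive (leaf) level.
TRANSACTIONS = [
    {"shirt"},
    {"jacket", "hiking boots"},
    {"ski pants", "hiking boots"},
    {"shoes"},
    {"shoes"},
    {"jacket"},
]


def ancestors(item, taxonomy):
    """Return all ancestors of an item by walking up the taxonomy."""
    result = []
    while item in taxonomy:
        item = taxonomy[item]
        result.append(item)
    return result


def extend(transaction, taxonomy):
    """Return the transaction plus every ancestor of each of its items."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def frequent_itemsets(transactions, taxonomy, min_support, max_size=2):
    """Count itemsets of up to max_size items over extended transactions.

    Itemsets that contain an item together with one of its own ancestors
    are skipped, since their support equals that of the itemset without
    the ancestor.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(extend(transaction, taxonomy))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                if any(a in combo for i in combo for a in ancestors(i, taxonomy)):
                    continue
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    for itemset, support in sorted(frequent_itemsets(TRANSACTIONS, TAXONOMY, 2).items()):
        print(itemset, support)
```

With a minimum support of 2 on this toy data, no pair of primitive items is frequent, yet the generalized pair of "hiking boots" and "outerwear" is. This is the kind of pattern that mining across the hierarchy is meant to surface. Enumerating all combinations scales poorly, which is why pattern-growth methods avoid this style of candidate counting.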