Abstract
Data stream classification currently plays a key role in big data analytics because of the enormous growth of such data. Most existing classification methods use ensemble learning, which is reliable, but these methods are ineffective at learning from imbalanced big data and assume that all data are pre-classified. Another weakness of current methods is their long evaluation time when the target data stream contains a large number of features. This thesis provides an overview of big data mining techniques. Its main objective is to develop a new model for incremental learning based on the proposed ant lion fuzzy-generative adversarial network. The proposed model is implemented in the Spark architecture. For each data stream, the class output is computed at the slave nodes by training a generative adversarial network with the back-propagation error based on fuzzy bound computation. The model is implemented in Python; the required software is Python version 3.7, PyCharm version 2020.3.2, Anaconda version 3, and Microsoft Visual Studio Redistributable 2019. The model is implemented and tested using the WebKB dataset, the 20 Newsgroups dataset, and the Reuters dataset. The results indicate that the model overcomes the limitations of existing models: it can classify data streams that are partially or completely unlabeled while providing high scalability and efficiency. The results also show that the proposed model outperforms state-of-the-art methods, with an accuracy of 0.861, a precision of 0.9328, and a minimal mean square error of 0.0416.
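For readers unfamiliar with the three reported metrics, the following is a minimal Python sketch of how accuracy, precision, and mean square error are commonly computed. It is not the thesis's evaluation code: the function names, the toy labels and scores, the 0.5 decision threshold, and the use of binary (single positive class) precision are all assumptions made for illustration.

```python
# Illustrative sketch only: toy data, not results from the thesis.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """True positives divided by all samples predicted as positive."""
    true_of_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not true_of_predicted_pos:
        return 0.0  # no positive predictions; convention chosen for this sketch
    hits = sum(t == positive for t in true_of_predicted_pos)
    return hits / len(true_of_predicted_pos)


def mean_square_error(y_true, y_score):
    """Average squared difference between targets and predicted scores."""
    return sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / len(y_true)


# Hypothetical true labels and classifier scores for eight samples.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8, 0.1, 0.3]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
mse = mean_square_error(y_true, y_score)
print(f"accuracy={acc:.3f} precision={prec:.3f} mse={mse:.3f}")
```

For multi-class text datasets such as those named above, precision is usually averaged over classes (macro or weighted); which averaging the thesis uses is not stated in the abstract.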