The value and volume of information exchanged through dark-web pages are remarkable. Recently, many studies have shown the value of, and interest in, using machine-learning methods to extract useful security-related knowledge from those pages. Within this scope, our research focuses on evaluating the best prediction models for analyzing traffic-level data coming from the dark web. Our results and analysis show that feature selection plays an important role in identifying the best models, and that in some cases the right combination of features increases a model’s accuracy. For some feature set and classifier combinations, Src Port and Dst Port both proved to be important features. When available, they were always selected over most other features. When they were absent, many other features were selected to compensate for the information they would have provided. The Protocol feature was never selected, regardless of whether Src Port and Dst Port were available.
Alhussan, Andrew; Alsmadi, Izzat; Wahbeh, Abdullah; Al-Ramahi, Mohammad A.; and Al-Omari, Ahmad, "Dark Web Analytics: A Comparative Study of Feature Selection and Prediction Algorithms" (2022). Computer Information Systems Faculty Publications. 15.
This is the prepublication (SSRN) version of the conference proceeding:
A. Al-Omari, A. Allhusen, A. Wahbeh, M. Al-Ramahi and I. Alsmadi, "Dark Web Analytics: A Comparative Study of Feature Selection and Prediction Algorithms," 2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA), San Antonio, TX, USA, 2022, pp. 170-175, doi: 10.1109/IDSTA55301.2022.9923042. https://ieeexplore.ieee.org/document/9923042