Comparing Support Vector Machine and Naive Bayes Classifiers for detecting malware

Näytä kaikki kuvailutiedot



Pysyväisosoite

http://urn.fi/URN:NBN:fi-fe201804208658
Julkaisun nimi: Comparing Support Vector Machine and Naive Bayes Classifiers for detecting malware
Tekijä: Davoudi, Amin
Muu tekijä: Helsingin yliopisto, Matemaattis-luonnontieteellinen tiedekunta, Tietojenkäsittelytieteen laitos
Julkaisija: Helsingin yliopisto
Päiväys: 2018
Kieli: eng
URI: http://urn.fi/URN:NBN:fi-fe201804208658
http://hdl.handle.net/10138/273485
Opinnäytteen taso: pro gradu -tutkielmat
Oppiaine: Computer science
Tietojenkäsittelytiede
Datavetenskap
Tiivistelmä: In the Internet age, malware poses a serious threat to information security. Many studies have been conducted on using machine learning for detecting malicious software. Although major breakthroughs have been achieved in this area, the problem has not been completely eradicated. In this thesis, we are going through the concept of utilizing machine learning for malware detection and conduct several experiments with two different classifiers (Support Vector Machine and Naive Bayes) to compare their ability to detect malware based on Port-able Executable (PE) file format headers. A malware classifier dataset built with header field values of portable executable files was obtained from GitHub and used for experimental part of the thesis. We conducted 5 different experiments with several different trial settings. Various statistical methods have been used to assess the significance of the results. The first and second experiment show that using SVM and Naive Bayes classification methods for our dataset can result in high sensitivity rate. In the rest of the experiments, we focus on ac-curacy rate of both classifiers with different settings. The results show that although there were no big differences in the accuracy rates of the classifiers, the value of variance of ac-curacy rates is greater in Naive Bayes than in SVM. The study investigates ability of two different methods to classify information in their distinctive way. It also provides evidences that show that the learning-based approach provides a means for accurate automated analysis of malware behavior which helps in the struggle against malicious software.


Tiedostot

Tiedosto(t) Koko Formaatti Näytä

Tähän julkaisuun ei ole liitetty tiedostoja

Viite kuuluu kokoelmiin:

Näytä kaikki kuvailutiedot