lecture: Comparing Malicious Files
How many freaking names can we have for one malware family?
One critical step in the malware analysis process is attempting to determine which malware family a sample belongs to. Even if one cannot link a file to a family, one should at least try to find similar files and extrapolate information about the sample by comparing it with them. This talk reviews a variety of methods for comparing files, from simple to complex.
The malware analysis process should leave one with answers but also with more questions than one started with. An important avenue of investigation is to try to identify which malware family a particular sample belongs to. This step can lead the analyst to a wealth of previous research and possible courses of action.
Even if one is unable to determine malware family membership, it is still important to find other files that are at least similar to the sample. Performing full analysis on sets of similar files can reveal additional information about the identity of the sample in question.
This talk will review various methods that my team and I use throughout our malware analysis processes. Starting with the simplest, we will cover examining virus scanner results for commonalities and other clues. In addition, there are various freely accessible systems to which one can submit a file for comparison or identification analysis; these include Icewater and Malpedia. The talk will also cover taking features collected during static and dynamic malware analysis and feeding them to easily implemented machine learning algorithms to determine file relatedness. Further, using collected threat intelligence, one can apply the Diamond Model of Intrusion Analysis to find overlap in Infrastructure as well as potential overlap in Adversary and/or Victim. Finally, a bit higher on the learning curve, we will look at control flow graph analysis. Attendees will leave this talk with an understanding of the problem at hand as well as a set of easily implemented and usable solutions.
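To give a flavor of the simpler end of that spectrum, here is a minimal Python sketch of two of the ideas: counting tokens shared across virus scanner labels, and scoring the overlap between two samples' feature sets. Everything in it is an assumption made for illustration and not the tooling used in the talk: the function names, the list of generic tokens, the made-up family name "Foobot", and the sample import lists are all hypothetical. Jaccard similarity stands in here as a very simple relatedness measure, not as the specific machine learning algorithm the talk covers.

```python
import re
from collections import Counter

# Hypothetical list of label tokens that say nothing about the family.
GENERIC_TOKENS = {
    "trojan", "win32", "generic", "malware", "variant",
    "agent", "heur", "malicious", "suspicious",
}


def common_label_tokens(labels):
    """Count, per token, how many scanner labels contain it.

    Short tokens (3 characters or fewer) and generic tokens are ignored.
    """
    counts = Counter()
    for label in labels:
        # Split on anything that is not a lowercase letter or digit.
        tokens = set(re.split(r"[^a-z0-9]+", label.lower()))
        counts.update(
            t for t in tokens if len(t) > 3 and t not in GENERIC_TOKENS
        )
    return counts


def jaccard(features_a, features_b):
    """Jaccard similarity of two feature collections: |A & B| / |A | B|."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up scanner labels for a fictional family called "Foobot".
labels = [
    "Trojan.Win32.Foobot.A",
    "W32/Foobot.gen",
    "Trojan:Win32/Foobot!ml",
    "Generic.Malware.XYZ",
]
top_tokens = common_label_tokens(labels).most_common(1)

# Made-up import lists standing in for static-analysis features.
sample_a = ["kernel32!CreateFileA", "ws2_32!connect", "advapi32!RegSetValueExA"]
sample_b = ["kernel32!CreateFileA", "ws2_32!connect", "wininet!InternetOpenA"]
similarity = jaccard(sample_a, sample_b)

print(top_tokens)
print(similarity)
```

In practice the feature sets could be anything gathered during static or dynamic analysis (imports, strings, contacted hosts), and a threshold on the score would need tuning against known related and unrelated samples.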
Start time: 18:15