Monday, January 28, 2008

you can't win for trying

first, waves upon waves of people complain that signatures are the wrong way to go, that heuristics are too weak and need to be stronger

then, in reaction to that, the vendors make the heuristics stricter and guess what happens - false alarms increase and people think anti-virus is getting 'dumb'...

the truth is anti-virus is no more dumb than it ever was, it's simply less tolerant...
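
to make the "less tolerant, not dumber" point concrete, here's a toy sketch... the file names, scores and labels are all invented for illustration (no real engine is being modelled), and the only thing that changes between runs is the cutoff the hypothetical engine uses to decide what gets flagged...

```python
# toy illustration only: the names, scores and labels below are invented,
# not measured from any real scanner
SAMPLES = [
    # (name, suspicion score from a hypothetical heuristic engine, actually malicious?)
    ("installer.exe", 0.42, False),
    ("packed_game.exe", 0.55, False),
    ("disk_utility.exe", 0.65, False),
    ("old_worm.exe", 0.90, True),
    ("new_trojan.exe", 0.60, True),
    ("stealthy_dropper.exe", 0.45, True),
]


def evaluate(threshold):
    """Return (false positives, false negatives) when every score >= threshold is flagged."""
    false_positives = sum(
        1 for _, score, bad in SAMPLES if score >= threshold and not bad
    )
    false_negatives = sum(
        1 for _, score, bad in SAMPLES if score < threshold and bad
    )
    return false_positives, false_negatives


# the scoring never changes, only how tolerant the cutoff is
for threshold in (0.8, 0.5, 0.4):
    fp, fn = evaluate(threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

with these made-up numbers, lowering the cutoff from 0.8 to 0.4 takes the misses from 2 down to 0 and the false alarms from 0 up to 3... the scoring function is exactly as smart (or dumb) in every run...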

now i'm sure there will be waves upon waves of people calling for other things - among them probably smarter heuristics... to those people i'd like to say this: show me your solution to the halting problem and i'll show you a smarter heuristic...
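
for anyone wondering what the halting problem has to do with it, here's a rough sketch of the usual diagonal argument, assuming a detector is just a function that looks at a program and answers malicious (True) or clean (False)... make_contrary and the two stand-in detectors are hypothetical names made up for this example, and the "payload" is only a string...

```python
def make_contrary(detector):
    """Build a program that misbehaves exactly when the detector calls it clean."""
    def contrary():
        if detector(contrary):
            return "harmless"  # detector said malicious, so do nothing bad
        return "payload"       # detector said clean, so act maliciously (simulated)
    return contrary


# two trivial stand-in detectors; any function that always returns an answer
# without running the program it is given can be dropped in here instead
def always_clean(program):
    return False


def always_malicious(program):
    return True


def wrong_about_own_contrary(detector):
    """True if the detector's verdict disagrees with what its contrary program does."""
    program = make_contrary(detector)
    verdict = detector(program)
    actually_malicious = program() == "payload"
    return verdict != actually_malicious


for detector in (always_clean, always_malicious):
    print(detector.__name__, "is wrong:", wrong_about_own_contrary(detector))
```

both stand-ins come out wrong, and so would any other detector that always answers, because the contrary program is built to do the opposite of whatever verdict it gets... a detector that tries to dodge this by actually running the program never returns an answer at all (in this sketch it would just recurse until python gives up)...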

sure you could make heuristics that are better able to distinguish between past malware and legitimate files without solving the halting problem, and no doubt that is one of the many things designers of heuristic analysis engines are constantly trying to do... unfortunately future malware is the class of malware that most needs alternative detection technologies like heuristics (because past malware is where signatures actually shine)... future malware is, by definition, different from past malware, and since malware is created by intelligent adversaries, optimizing for accuracy on past malware doesn't help with future malware... in fact, there's no guarantee stricter heuristics will help either (at least not against those malware writers who use scanners to help optimize their malware for non-detection), but they do have a better chance and frankly, for a business the cost of a false positive is generally a lot less than the cost of a false negative...
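
here's a toy sketch of the malware writer who tests against the scanner before release... the traits, point values and cutoffs are all hypothetical (a real engine's features and weights are nothing this simple), and the "attacker" just drops the most suspicious-looking trait and rescans until the sample stops being flagged...

```python
# hypothetical traits and point values, invented for this example
WEIGHTS = {
    "packed": 3,
    "writes_to_system_dir": 3,
    "no_version_info": 2,
    "hooks_keyboard": 4,
}
ALL_TRAITS = frozenset(WEIGHTS)


def score(traits):
    """Total suspicion points for a sample with the given traits."""
    return sum(WEIGHTS[t] for t in traits)


def is_flagged(traits, threshold):
    """The toy scanner flags any sample scoring at or above the threshold."""
    return score(traits) >= threshold


def evade(traits, threshold):
    """Use the scanner as an oracle: drop the heaviest trait until it stops flagging.

    Returns (remaining traits, number of changes the attacker had to make).
    """
    remaining = set(traits)
    changes = 0
    while remaining and is_flagged(remaining, threshold):
        # ties are broken by name so the result is deterministic
        heaviest = max(remaining, key=lambda t: (WEIGHTS[t], t))
        remaining.remove(heaviest)
        changes += 1
    return remaining, changes


for threshold in (8, 5):
    remaining, changes = evade(ALL_TRAITS, threshold)
    print(f"threshold={threshold}: {changes} changes, left with {sorted(remaining)}")
```

in this sketch the stricter cutoff (5 instead of 8) forces 3 changes instead of 2 and leaves the sample with less to work with, but the sample still ends up undetected both times... stricter raises the cost for the attacker, it doesn't guarantee a catch...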

as for making heuristics that are better able to distinguish between future malware and legitimate files, not only would such a heuristic engine need to be better able to determine what a file does by analyzing its contents (a necessity due to not knowing beforehand what future malware will look like, and a pursuit where the halting problem has real and significant relevance), it would also need to be better able to determine factors that are actually beyond the scope of computation, such as the context in which such code would be executed (i.e. at what point do you definitively label format.com, possibly under a different filename, as good or bad)... as such, even if we could solve the halting problem we still wouldn't have a perfect heuristic engine, only a smarter one - and without the contextual understanding, smarter might not even turn out to be that much better...
