How Does Anti Virus Work?


I have shellcode written in a simple C program.

Before that program is run and it is scanned by anti-virus, what exactly is happening?

Is the antivirus listening at a firewall level for recently initiated outgoing traffic? I would imagine they aren't doing that... yet.

Is the antivirus somehow scanning the original code? If so, what would it be looking for?

To my understanding, certain ones look for certain packers, like UPX, and deny any program packed with them?

Edited by bobbyb1980

Well, usually an AV looks at two things: it compares the file it's scanning against all known past viruses, and it looks at the file's behavior (like deleting files, etc.). Most AVs scan files when they are written to and when they are executed.

Edited by flyingpoptartcat

OK but what do you mean scan files?

If I have a .exe, is it scanning the "outside" of the .exe for certain types of packing?

Or is it scanning the .asm file that was used to compile for certain behaviour?

Or is it looking in the binary for certain characteristics? Or in the actual code written in C or whatever?

Or searching through pasted in shellcode?


Ok, so obviously the AV scanner doesn't have the original C code or assembly code to look at. All it can see is the binary result in front of it. AV scanners actually use a combination of techniques to determine if a file is a threat.

One technique is to search the actual binary for specific patterns (signatures) that indicate known viruses or shell-code.
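As a rough illustration of that signature search, here is a minimal Python sketch. The signature names and byte patterns are made up for the example and match no real malware, and `scan_bytes` / `scan_file` are hypothetical helper names; real scanners use far larger databases and faster multi-pattern matching.

```python
from __future__ import annotations

# Hypothetical signature database: name -> byte pattern.
# These patterns are invented for illustration only.
SIGNATURES = {
    "Example.TestPattern.A": bytes.fromhex("deadbeef00112233"),
    "Example.TestPattern.B": b"EXAMPLE-MARKER",
}


def scan_bytes(data: bytes) -> list[tuple[str, int]]:
    """Return (signature name, offset) for each signature found in data."""
    hits = []
    for name, pattern in SIGNATURES.items():
        offset = data.find(pattern)  # first occurrence, or -1 if absent
        if offset != -1:
            hits.append((name, offset))
    return hits


def scan_file(path: str) -> list[tuple[str, int]]:
    """Read a file as raw bytes and scan it; no source code is involved."""
    with open(path, "rb") as f:
        return scan_bytes(f.read())
```

Calling `scan_file` on any binary returns an empty list when nothing matches. The point is that the scanner only ever sees the bytes on disk, whatever language produced them.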

In the process of searching for these signatures, if it finds that the binary is compressed or packed by a known packer, it will try to decompress or unpack the binary (or binaries) and scan the decompressed files (and will recurse in this way to a vendor-specific depth). Certain packers are rather closely associated with viruses and malware, so seeing evidence of such a packer might be enough, in some cases, to flag the file as a potential threat.
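The unpack-and-rescan loop can be sketched like this. It is only a toy: zlib stands in for a "known packer", `MAX_DEPTH` is an assumed limit, and `MARKER` is a made-up signature. Real products recognize many packer formats and pick their own depth.

```python
import zlib

MAX_DEPTH = 3                # assumed recursion limit; vendors choose their own
MARKER = b"EXAMPLE-MARKER"   # made-up signature, matches no real malware


def try_unpack(data):
    """Return the decompressed bytes if data is a zlib stream, else None."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return None


def scan_recursive(data, depth=0):
    """Scan data, then unpack and rescan, giving up after MAX_DEPTH layers."""
    if MARKER in data:
        return True
    if depth >= MAX_DEPTH:
        return False  # too deeply nested; stop unpacking
    inner = try_unpack(data)
    return inner is not None and scan_recursive(inner, depth + 1)
```

With these assumptions, a payload wrapped in three layers of compression is still found, while a fourth layer falls outside the depth limit.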

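In addition to decompressing binaries, the AV scanner will try to decode (to some extent) binaries that appear to be encrypted. From the scanner's point of view, compression and encryption pose the same problem: the bytes on disk are a transformed copy of the code, and the transformation has to be reversed before any signature can match.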

Beyond just looking for signatures of known viruses, scanners will also look for specific behaviors which may be associated with viruses or malware (such as code to escalate privileges, hide processes, pack other binaries, etc.). In some cases this may involve some mild disassembly, but mostly they search for matching patterns at the machine-code level.

On top of these, the major AV vendors like to add their own secret sauce techniques to try and catch threats both known and unknown.

For essentially the same reason that the halting problem is undecidable, no AV scanner will ever be able to catch all viruses. Alan Turing and Alonzo Church proved the underlying undecidability results in 1936.
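A toy version of that argument, in the spirit of the diagonal construction behind the halting problem: given any detector, build a program that asks the detector about itself and then does the opposite. The function names are hypothetical and the "behavior" is just a string label, so this is a sketch of the reasoning and not a proof.

```python
def make_contrarian(detector):
    """Build a program that behaves opposite to the detector's verdict on it."""
    def program():
        if detector(program):
            return "harmless"   # flagged as a virus, so it does nothing bad
        return "malicious"      # passed as clean, so it misbehaves
    return program


def detector_is_correct(detector):
    """Check the detector's verdict against what the contrarian really does."""
    program = make_contrarian(detector)
    verdict = detector(program)            # True means "this is a virus"
    actual = program() == "malicious"
    return verdict == actual
```

Whatever deterministic `detector` you pass in, `detector_is_correct` comes back `False`: the detector either raises a false alarm or misses the threat.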


Thank you again for an insightful reply Sitwon.

Ok, so obviously the AV scanner doesn't have the original C code or assembly code to look at. All it can see is the binary result in front of it. AV scanners actually use a combination of techniques to determine if a file is a threat.

Why not? If I can just open a .exe in Immunity and see the assembly code in a few clicks what's to stop an anti virus from doing the same?

One technique is to search the actual binary for specific patterns (signatures) that indicate known viruses or shell-code.

If I understand you correctly, is this where the antivirus would be searching for nops or \x in the shellcode? And how is the antivirus seeing the shellcode if it can't see the original C code? Can you recommend any ways to obfuscate the shellcode (which I assume must be done in the C code before compiling)?

For essentially the same reason that the halting problem is undecidable, no AV scanner will ever be able to catch all viruses. Alan Turing and Alonzo Church proved the underlying undecidability results in 1936.

I'm starting to see that. I'm also starting to see the flip side of the coin, in that no one payload is going to successfully evade all the major antivirus vendors.

I'm having problems evading the same 5 or 6 antivirus vendors, so they must be using the same technique. Based on experimentation, I can conclude that UPX or any packing to do with Metasploit will set them off. I can also conclude that there are 5 or 6 that appear to be able to see shellcode coming from a mile away, regardless of the type. I made shellcode to add a user and that was triggered. This leads me to believe it's somehow reading the assembly instructions? Any ideas for obfuscating that?

Also, the current shellcode I'm using (custom written oo : ) : ) :) has a lot of '\x' characters in it, and I'm currently searching for a way to make it alphanumeric and compile that directly. I mean, the antivirus can't possibly be triggered by alphanumeric characters; if so, I'd imagine many typos would trigger AVs?

Must try harder.

Edited by bobbyb1980

Why not? If I can just open a .exe in Immunity and see the assembly code in a few clicks what's to stop an anti virus from doing the same?

That's not the "original" assembly code. That's just the nearest estimation the disassembler could manage based on the machine language.

In any case, assembly is little more than a human-readable notation for machine language. Since the AV scanner is not a human, it doesn't care about human readability. It can analyze the actual binary machine language just as efficiently as (if not more efficiently than) the assembly representation of the same code. So doing it at the machine language level skips the unnecessary disassembly step.

If I understand you correctly, is this where the antivirus would be searching for nops or \x in the shellcode? And how is the antivirus seeing the shellcode if it can't see the original C code? Can you recommend any ways to obfuscate the shellcode (which I assume must be done in the C code before compiling)?

One heuristic that many AV scanners use is to look for nop sleds or other common patterns of bytes that are more likely to occur in viruses than in legitimate applications. However, most of the time they are just looking for sequences of bytes that are unique to, or identifying of, particular viruses or shellcode. In your C source code, all those \x escapes get converted into raw bytes in the file, and it's those sequences of raw bytes that it's searching for. If it's a shellcode that it's seen before (or substantially similar to one that it's seen before), it will flag it as a possible match.
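Both points can be shown in a few lines of Python. This is a sketch under stated assumptions: `0x90` is the single-byte x86 NOP, but the threshold of 16 is an arbitrary choice for the example, and the function names are hypothetical.

```python
# A \x escape is only source-code notation for one raw byte.
# 0x41, 0x42, 0x43 are the ASCII codes for "A", "B", "C".
ESCAPED = b"\x41\x42\x43"
PLAIN = b"ABC"

NOP = 0x90  # single-byte x86 NOP instruction


def longest_nop_run(data: bytes) -> int:
    """Length of the longest run of consecutive NOP bytes in data."""
    longest = current = 0
    for byte in data:
        current = current + 1 if byte == NOP else 0
        longest = max(longest, current)
    return longest


def looks_like_nop_sled(data: bytes, threshold: int = 16) -> bool:
    """Crude heuristic: flag data containing a long unbroken run of NOPs."""
    return longest_nop_run(data) >= threshold
```

`ESCAPED` and `PLAIN` are the same three bytes, which is why the scanner never sees any "\x" characters: by the time the file exists, only the byte values remain.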

I'm starting to see that. I'm also starting to see the flip side of the coin, in that no one payload is going to successfully evade all the major antivirus vendors.

Actually, it's tough but generally believed to be possible. The AV vendors all tend to have similar batches of signatures and similar heuristics, so if you have something that won't be flagged by a signature check and slides by most of the heuristic techniques, then you might be able to evade everyone's detection (at least until someone finds a sample of it and writes a new signature for it).

I'm having problems evading the same 5 or 6 antivirus vendors, so they must be using the same technique. Based on experimentation, I can conclude that UPX or any packing to do with Metasploit will set them off. I can also conclude that there are 5 or 6 that appear to be able to see shellcode coming from a mile away, regardless of the type. I made shellcode to add a user and that was triggered. This leads me to believe it's somehow reading the assembly instructions? Any ideas for obfuscating that?

UPX and Metasploit are known tools that are highly associated with malware packing. They could potentially be used for legitimate software, but because of the relatively high rate of malware that uses them, they are a strong indicator of potential malware. Basically, with UPX you are assumed malicious unless specifically whitelisted.

Also, UPX compresses your binary, which hides its original bytes from a naive signature match, but it's relatively trivial to unpack as well (UPX doesn't try hard to make that difficult; it's optimizing for size and speed). There are other packers out there that will make it harder for the AV to unpack and analyze your code. There are also other techniques for breaking up how your shellcode is stored in the binary so that it won't match existing signatures (and then you would reassemble it at run-time). There are lots of sneaky tricks you can do. Some of those tricks the AV vendors are wise to, and they have heuristics that will catch them. I'm not going into detail for two reasons.

1. I don't actually have first-hand experience trying to evade AV heuristics. It's just not something that I've ever needed to do.

2. I'm not so sure I really want to discuss these kinds of techniques on a public forum. I'm just not sure if it's ethically sound to lower the bar for creating better malware. It's one of those things that anyone can learn to do, but if you're going to play in that arena you should probably earn your place by learning and understanding all the background and theory. (In general, you're playing with fire so you had better have an intimate understanding of exactly what you're doing. I'm not gonna give you short-cuts to shooting yourself in the face.)

Also, the current shellcode I'm using (custom written oo : ) : ) :) has a lot of '\x' characters in it, and I'm currently searching for a way to make it alphanumeric and compile that directly. I mean, the antivirus can't possibly be triggered by alphanumeric characters; if so, I'd imagine many typos would trigger AVs?

Must try harder.

I think you need a refresher on the fundamentals of computer science and how code works at the machine level. "The Elements of Computing Systems" is a great book/course for introducing these topics.

http://www1.idc.ac.il/tecs/

http://sitwon.github.com/learnproglang/Home.html

http://wiki.hacdc.org/index.php/TECS

Additionally I would also recommend reading "Art of Assembly".

http://nostarch.com/assembly2.htm


Well, just for the record, I can evade anti-virus successfully with Java. As someone who does pentesting, I know that not every machine in the wild has Java installed, i.e. not every pentest is as easy as getting someone to click a link. Have to go custom here, not trying to take over the internet, just a pentest ; )

EDIT - probably shouldn't have pasted that here : )

I would love to get my hands on the book for "The Elements of Computing Systems". Can you recommend any online training courses that would perhaps serve the same purpose? As always, your responses are greatly appreciated.

Edited by bobbyb1980

You can download free PDF versions of most of the chapters online. The book itself is about $20, so probably the cheapest textbook you'll ever buy.

Other resources (mostly free) are listed on that second link I posted. For years, MIT students used "Structure and Interpretation of Computer Programs" to learn essentially the same lessons as TECS (although from a slightly different perspective). TECS can be thought of as computer science from the perspective of Turing Machines (Alan Turing), whereas SICP is computer science from the perspective of Lambda Calculus (Alonzo Church). SICP is a pretty rigorous and math-heavy course.
