-=( ---------------------------------------------------------------------- )=-
-=( Natural Selection Issue #1 -------------- Using Advanced MASM Features )=-
-=( ---------------------------------------------------------------------- )=-

-=( 0 : Contents --------------------------------------------------------- )=-

    0 : Contents
    1 : Why Use MASM
    2 : Debugging with SoftICE
    3 : Using Anonymous Labels
    4 : Flow Control Directives
    5 : Proper Procedure Usage
    6 : Calling API with INVOKE
    7 : Final Thoughts On MASM

-=( 1 : Why Use MASM ----------------------------------------------------- )=-

The choice of MASM over TASM is a personal one. However, MASM provides
various advanced features (some with similar counterparts in TASM) that can
sway your choice, depending on your programming style.

For those who prefer SoftICE as a debugger, MASM debugging symbols allow
source-level debugging. MASM has advanced support for some high-level
commands like IF and WHILE, and it can do automatic type checking when
invoking API and PROCs, along with stack management to create and destroy
local variables automatically. Finally, it doesn't require a .DATA section,
and it is FREE!

-=( 2 : Debugging with SoftICE ------------------------------------------- )=-

SoftICE can do source-level debugging on MASM executables compiled with
Codeview information, using the command line switches below. It's so much
easier stepping through your own .ASM than staring at a screen full of
opcodes and no labels.
    For Compiling
        /c                  : Compile only, we link ourselves
        /Cp                 : Preserve case on all symbols
        /coff               : COFF format
        /Zi                 : Include symbolic debugging information

    For Linking
        /debug              : Include symbolic debugging information
        /debugtype:cv       : Codeview format
        /pdb:none           : Do not generate a PDB, it's not needed
        /subsystem:windows  : Win32 application

    Start SoftICE
        nmsym.exe /translate:source,package,always /source:Virus.ASM /load:execute,break Virus.EXE

If your virus relocates in memory, it's possible to map your source and
symbols to a new address with the SYMLOC command. From within SoftICE, type
MAP32 Virus. It will print a list of sections for that executable and assign
a number to each. Now type SYMLOC CS NUMBER BASE. NUMBER is the number shown
next to the name of the main Virus Code section, and BASE is the delta offset
of your virus start. Now type SRC a few times and you'll be source-level
debugging again.

-=( 3 : Using Anonymous Labels ------------------------------------------- )=-

MASM doesn't support TASM's Local Labels. Instead it has Anonymous Labels,
which have no tag name. You declare one as @@:, use @F to reference the next
@@:, and @B to reference the previous @@:. The only problem is that they
obviously cannot be nested, and if you need their offset, you may have to
load it into a register separately, as MASM doesn't do arithmetic on them.

    Example:    XOR     EAX, EAX
                MOV     ECX, 10
            @@:
                CMP     EAX, 5
                JE      @F
                INC     EAX
                LOOP    @B
                JMP     DROP_OUT
            @@:
                ...

-=( 4 : Flow Control Directives ------------------------------------------ )=-

MASM allows you to block code within directives like a HLL, reducing the
need for many obscure local labels within small loops and tests.

.IF blocks take simple conditional expressions and execute code up to an
.ENDIF directive. Each comparison compiles to a single CMP, so it can compare
a register or memory operand against a register or immediate, but it cannot
compare two memory operands directly. .ELSEIF and .ELSE directives can be
used inside an .IF block. Note that the end of each block has a jump to the
.ENDIF automatically inserted.
    Example:    .IF EAX == 1
                    MOV     EBX, 1
                .ELSEIF EAX == 2
                    MOV     EBX, 2
                .ELSE
                    MOV     EBX, 3
                .ENDIF

.WHILE/.ENDW, .REPEAT/.UNTIL and .REPEAT/.UNTILCXZ are directives for loop
blocks. .WHILE evaluates its condition at the beginning of the loop and
.UNTIL at the end. .UNTILCXZ closes the loop with a LOOP instruction, so it
decrements ECX on each pass and exits the loop once ECX == 0.

    Example:    .WHILE EAX != 12
                    INC     EAX
                .ENDW

                .REPEAT
                    INC     EAX
                .UNTIL EAX == 12

                MOV     ECX, 12
                .REPEAT
                    INC     EAX
                .UNTILCXZ

Loop blocks also accept two more directives, .BREAK and .CONTINUE, which are
necessary to leave or restart the loop as there are no labels available.
.BREAK will completely exit the loop, and .CONTINUE will return to the
condition test of the loop. Both allow an alternate form that takes an .IF
with a condition as a parameter. No .ENDIF is necessary.

    Example:    .WHILE TRUE
                    INC     EAX
                    .BREAK .IF EAX == 10
                .ENDW

    Example:    .REPEAT
                    INC     EAX
                    .CONTINUE .IF EAX == 10
                    INC     EBX
                .UNTIL EAX == 12

Conditional expressions for these directives can be made more complex using
the OR (||), AND (&&), and NOT (!) symbols. You can also do single tests on
the flags register using CARRY?, OVERFLOW?, PARITY?, SIGN? and ZERO?.

    Example:    DEC     EAX
                .IF (ZERO?)
                    DEC     EAX
                .ELSEIF (EAX == 1 && EBX == 0)
                    DEC     EBX
                .ENDIF

-=( 5 : Proper Procedure Usage ------------------------------------------- )=-

PROCs have been extended. Firstly, a USES clause lets you list the registers
your PROC uses, so that they are saved on entry and restored on exit. Note
that the restoration code is inserted before EVERY RET instruction.

    Example:    MYPROC PROC USES EBX ECX
                    .IF EBX == 1
                        RET
                    .ELSE
                        MOV     ECX, 2
                        RET
                    .ENDIF
                MYPROC ENDP

    Compile:    MYPROC PROC
                    PUSH    EBX
                    PUSH    ECX
                    CMP     EBX, 1
                    JNE     @F
                    POP     ECX
                    POP     EBX
                    RET
                @@:
                    MOV     ECX, 2
                    POP     ECX
                    POP     EBX
                    RET
                MYPROC ENDP

Secondly, you can specify names for parameters passed on the stack to your
PROC. References to these parameters are transparently converted to
[EBP][offset] by MASM.
    Example:    MYPROC PROC USES EBX ECX, P1:DWORD, P2:DWORD
                    MOV     EBX, [P1]
                    MOV     ECX, [P2]
                    SUB     EBX, ECX
                    MOV     EAX, EBX
                    RET
                MYPROC ENDP

    Compile:    MYPROC PROC
                    PUSH    EBP
                    MOV     EBP, ESP
                    PUSH    EBX
                    PUSH    ECX
                    MOV     EBX, [EBP][8]
                    MOV     ECX, [EBP][0CH]
                    SUB     EBX, ECX
                    MOV     EAX, EBX
                    POP     ECX
                    POP     EBX
                    LEAVE
                    RET     8
                MYPROC ENDP

Next, you can further set up the stack frame by declaring local variables on
the stack. This is done with the LOCAL directive. Note that these variables
are not initialized; you must do it manually.

    Example:    MYPROC PROC USES EBX ECX, P1:DWORD, P2:DWORD
                    LOCAL   CAT:DWORD
                    MOV     EAX, [CAT]
                    RET
                MYPROC ENDP

    Compile:    MYPROC PROC
                    PUSH    EBP
                    MOV     EBP, ESP
                    ADD     ESP, -4
                    PUSH    EBX
                    PUSH    ECX
                    MOV     EAX, [EBP][-4]
                    POP     ECX
                    POP     EBX
                    LEAVE
                    RET     8
                MYPROC ENDP

-=( 6 : Calling API with INVOKE ------------------------------------------ )=-

INVOKE pushes a list of arguments onto the stack, then CALLs a PROC. It's
very similar to the extended CALL in TASM.

    Example:    INVOKE  MYPROC, [ECX], ADDR CAT

    Compile:    LEA     EAX, [EBP][-4]
                PUSH    EAX
                PUSH    [ECX]
                CALL    MYPROC

There are a few caveats with INVOKE. To forward reference a procedure, it
requires a PROTO (prototype) declaration earlier in the file. Secondly, if
you want to forward reference a symbol as an argument, the symbol needs to be
loaded into a register first, and the register is then used in the INVOKE.

    Example:    MYPROC PROTO :DWORD, :DWORD
                ...
                MOV     EAX, [KITTEN]
                INVOKE  MYPROC, [ECX], EAX
                ...
                KITTEN  DD 0
                ...
                MYPROC PROC USES EBX ECX, P1:DWORD, P2:DWORD

Also, you cannot use the OFFSET operator on a stack variable in an INVOKE.
Use ADDR instead, which will result in EAX (and EDX if you provide something
that is not a DWORD size) being overwritten as shown below. MASM will raise
an error if EAX is used in an argument and has been overwritten.
    Example:    INVOKE  MYPROC, EAX, ADDR CAT

    Compile:    LEA     EAX, [EBP][-4]
                PUSH    EAX
                PUSH    EAX
                CALL    MYPROC

                Error A2133: register value overwritten by INVOKE

However, if the overwritten register is the one that holds the target of the
CALL in the Compile code, then MASM will not warn you even though your code
will be incorrect. It's a small bug to keep in mind. Also, if the TMYPROC PTR
bit confuses you, keep reading; it's explained next.

    Example:    LEA     EAX, [MYPROC]
                INVOKE  TMYPROC PTR EAX, EBX, ADDR CAT

    Compile:    LEA     EAX, [EBP][-4]
                PUSH    EAX
                PUSH    EBX
                CALL    EAX

                MASM will not give you an error for this bad code

INVOKE can't be used 'as is' on Pointers to API/PROC, because it performs
type checking on the arguments and doesn't know which function it needs to
compare against. You cannot use a PROTO in this case; instead you need to
create a TYPEDEF. In this way, INVOKE can be used to call API in a virus
through Pointers, and still do type checking.

    Example:    TMYPROC TYPEDEF PROTO :DWORD, :DWORD

                LEA     EAX, [MYPROC]
                INVOKE  TMYPROC PTR EAX, [ECX], ADDR CAT
                ...
                MYPROC PROC USES EBX ECX, P1:DWORD, P2:DWORD
                    LOCAL   CAT:DWORD
                    ...

Our final lesson with INVOKE is using it to pass data structures. If you
think about it, it's difficult to pass a massive 300-byte structure on the
stack. Instead, you pass a Pointer to the STRUCT. Note that inside the PROC,
you will need to load the Pointer and dereference it properly, as shown
below.

    Example:    TMSPROC TYPEDEF PROTO :DWORD, :DWORD

                DSTRUCT STRUCT
                    ONE     DD 0
                    TWO     DD 0
                DSTRUCT ENDS

                DATA    DSTRUCT {}
                ...
                LEA     EBX, [MSPROC]
                INVOKE  TMSPROC PTR EBX, ADDR DATA, NULL
                ...
                MSPROC PROC DATA:PTR DSTRUCT, OTHER:DWORD
                    MOV     EAX, [DATA]
                    INC     [EAX][DSTRUCT.TWO]
                    ...

-=( 7 : Final Thoughts On MASM ------------------------------------------- )=-

Using a TYPEDEF on every API call reduces time spent chasing bugs caused by
mismatched parameters. It also forces good commenting technique by declaring
what each invocation refers to.
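
As a rough sketch of that habit, the form described in the MASM Programmer's
Guide pairs the TYPEDEF PROTO with a pointer TYPEDEF, so that a variable
holding the address can be INVOKEd by name with its arguments still checked.
The names TADDPROC, PADDPROC, PFN and ADDPROC are made up for illustration,
and a 32-bit flat STDCALL build is assumed.

```asm
            .386
            .MODEL  FLAT, STDCALL

TADDPROC    TYPEDEF PROTO :DWORD, :DWORD    ; argument list INVOKE checks against
PADDPROC    TYPEDEF PTR TADDPROC            ; pointer to a PROC of that type

            .DATA
PFN         PADDPROC ?                      ; hypothetical pointer, filled at run time

            .CODE
ADDPROC     PROC    P1:DWORD, P2:DWORD      ; hypothetical target, returns P1 + P2
            MOV     EAX, [P1]
            ADD     EAX, [P2]
            RET
ADDPROC     ENDP

START:
            MOV     [PFN], OFFSET ADDPROC   ; store the address in the typed pointer
            INVOKE  PFN, 2, 3               ; pushes 3, then 2, then CALLs through PFN
            RET                             ; assumes returning from the entry point is acceptable here
            END     START
```

With the pointer typed this way, an INVOKE through PFN that passes the wrong
number of arguments should be rejected at assembly time, the same as a direct
INVOKE of a PROTOtyped PROC.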

People are generally reluctant to use .IF and .REPEAT directives as they are
not pure assembler code. However, their use expresses the purpose of your
code more clearly than the most explicit comment could, and they do compile
down to a fairly simple format.

We've seen the scene go through lots of changes for favourite assemblers,
from MASM, to TASM, to A86, to NASM and TASM. There's no reason to hold back
on MASM anymore. It's everything you want and more. Think about it.

-=( ---------------------------------------------------------------------- )=-
-=( Natural Selection Issue #1 --------------- (c) 2002 Feathered Serpents )=-
-=( ---------------------------------------------------------------------- )=-