Getting Started
Words of warning
Sizecoding assumes a basic level of assembler knowledge. You should have at least a few regular (non-optimized) assembler programs under your belt before you attempt sizecoding. Also, don't assume sizecoding is normal -- shaving bytes is a black art that should be kept far, far away from normal programming targets. People sizecode for fun, not profit!
Know your environment
Most sizecoders choose to write to mode 13h, a chunky 320x200 graphics mode whose frame buffer starts at A000:0000. Each byte is a pixel, and the graphics buffer is linear, so it is extremely easy to program for. Because it is contained in a single segment, you can be sloppy: an offset that overflows or underflows simply wraps around within the segment and won't damage anything. A naive way to initialize
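To make the "each byte is a pixel" layout concrete, here is a minimal sketch of plotting a single pixel. It assumes NASM syntax and a flat-binary .COM build; the coordinates and the color are arbitrary illustrative values, not something prescribed by this page, and the code is written for clarity rather than size.

```nasm
; Minimal sketch (NASM syntax, .COM file): plot one pixel in mode 13h.
; The coordinates (160,100) and color 15 are arbitrary illustrative values.
        org     100h
        mov     ax,0013h        ; AH=00h (set video mode), AL=13h
        int     10h
        mov     ax,0A000h
        mov     es,ax           ; ES = segment of the mode 13h frame buffer
        mov     di,100*320+160  ; linear offset = y*320 + x
        mov     byte [es:di],15 ; one byte = one pixel; 15 is white in the default palette
        xor     ax,ax           ; AH=00h: wait for a keypress
        int     16h
        mov     ax,0003h        ; restore 80x25 text mode
        int     10h
        ret                     ; exit to DOS
```

Because DI is a 16-bit register, any value it takes still addresses a byte inside segment A000h, which is why sloppy offset arithmetic is harmless here.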
.COM file defaults
Knowing what register values are initialized at program start can save you the trouble of having to set them in your code. On most (but not all) DOS environments, the following registers have these default values:
- AX=0000
- BX=0000
- CX=00FF
- DX=same as the CS register
- SI=0100
- DI=FFFE
- SP=FFFC (DOS) or FFFE (Windows)
Because .COM files only support 64K executables, DS, ES, and SS are all set to the same value as CS. The rest can't be counted on for any specific value, except that BP is mostly 09??h, so you can usually count on the high byte being 09h.
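As one illustration of how these defaults save bytes, the sketch below trusts AX=0000 at start-up so that only AL has to be loaded before the mode-set call. It assumes NASM syntax and a .COM build, and it will misbehave in any environment where AH is not zero on entry, so treat it as a size trick rather than a safe default.

```nasm
; Sketch: trading robustness for size by trusting the start-up value AX=0000.
        org     100h
        mov     al,13h          ; B0 13 (2 bytes) instead of mov ax,0013h (B8 13 00, 3 bytes)
        int     10h             ; sets mode 13h only if AH is still 00h
        xor     ax,ax           ; AH=00h: wait for a keypress
        int     16h
        mov     ax,0003h        ; restore 80x25 text mode
        int     10h
        ret                     ; exit to DOS
```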
Boot sector defaults
Boot sector tinyprogs are occasionally explored, but the BIOS changes every register value as it executes before the boot sequence, so there's not much to count on other than what occurs directly before execution of the boot sector:
- The boot sector is loaded at 0000:7C00
- DL holds the drive number that was booted from, so if booted from a floppy disk in drive A:, it will be 00
- The stack pointer is often 512 bytes beyond the start of the boot sector, so SP is likely 7E00h (7C00h + 200h), though this varies between BIOSes
This is why most sizecoders target .COM files, and is also why Toledo Atomchess is 9 bytes larger if loaded from boot sector than from a .COM file.
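For comparison, here is a minimal boot sector skeleton in NASM syntax. The explicit segment and stack setup and the 0AA55h signature word at offset 510 are not taken from the text above; they are standard PC BIOS boot conventions, added here as assumptions to show the kind of setup overhead a .COM file gets for free. The stack location chosen is just one common option.

```nasm
; Sketch of a boot sector that does not trust any register except DL.
        bits    16
        org     7C00h           ; the BIOS loads the sector at 0000:7C00
start:
        xor     ax,ax
        mov     ds,ax           ; set the segment registers ourselves
        mov     es,ax
        mov     ss,ax
        mov     sp,7C00h        ; stack grows downward from just below the sector
        ; DL still holds the boot drive number here
hang:
        hlt                     ; sleep until the next interrupt
        jmp     hang
        times   510-($-$$) db 0 ; pad the sector to 510 bytes
        dw      0AA55h          ; boot signature (bytes 55h, AAh)
```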
1-byte opcodes
The 80x86 family was originally a CISC design, which is a design philosophy that intentionally attempts to create many instructions that perform multiple steps. In sizecoding, you are trying to perform as much work in as little space as possible, so it is helpful to know (or memorize!) every 1-byte instruction in the 80x86 family. Here's a handy chart (segments and prefixes omitted):
Opcode | Mnemonic | Arch | Description | Notes |
---|---|---|---|---|
37 | AAA | | ASCII adjust AL (carry into AH) after addition | |
3F | AAS | | ASCII adjust AL (borrow from AH) after subtraction | |
98 | CBW | | Convert byte into word (AH = top bit of AL) | |
99 | CDQ | 80386+ | Convert dword to qword (EDX = top bit of EAX) | |
F8 | CLC | | Clear carry flag | |
FC | CLD | | Clear direction flag so SI and DI will increment | |
FA | CLI | | Clear interrupt enable flag; interrupts disabled | |
F5 | CMC | | Complement carry flag | |
A6 | CMPS mb,mb | | Compare bytes [SI] - ES:[DI], advance SI,DI | |
A7 | CMPS mv,mv | | Compare words [SI] - ES:[DI], advance SI,DI | |
A6 | CMPSB | | Compare bytes DS:[SI] - ES:[DI], advance SI,DI | |
A7 | CMPSD | 80386+ | Compare dwords DS:[SI] - ES:[DI], advance SI,DI | |
A7 | CMPSW | | Compare words DS:[SI] - ES:[DI], advance SI,DI | |
99 | CWD | | Convert word to doubleword (DX = top bit of AX) | |
98 | CWDE | 80386+ | Sign-extend word AX to doubleword EAX | |
27 | DAA | | Decimal adjust AL after addition | |
2F | DAS | | Decimal adjust AL after subtraction | |
F4 | HLT | | Halt | Resumes operation if an interrupt occurs; could use this for pacing effects that run too fast |
EC | IN AL,DX | | Input byte from port DX into AL | |
ED | IN eAX,DX | | Input word from port DX into eAX | |
6C | INS rmb,DX | 80186+ | Input byte from port DX into [DI], advance DI | |
6D | INS rmv,DX | 80186+ | Input word from port DX into [DI], advance DI | |
6C | INSB | 80186+ | Input byte from port DX into ES:[DI], advance DI | |
6D | INSD | 80386+ | Input dword from port DX into ES:[DI], advance DI | |
6D | INSW | 80186+ | Input word from port DX into ES:[DI], advance DI | |
CC | INT 3 | | Interrupt 3 (trap to debugger) | If performing very many CALLs to a single procedure, could make it INT 3 |
CE | INTO | | Interrupt 4 if overflow flag is 1 | |
CF | IRET | | Interrupt return (far return and pop flags) | |
CF | IRETD | 80386+ | Interrupt return (pop EIP, CS, EFLAGS) | |
9F | LAHF | | Load: AH = flags SF ZF xx AF xx PF xx CF | |
C9 | LEAVE | 80186+ | Set SP to BP, then POP BP (reverses previous ENTER) | |
AC | LODS mb | | Load byte [SI] into AL, advance SI | |
AD | LODS mv | | Load word [SI] into eAX, advance SI | |
AC | LODSB | | Load byte [SI] into AL, advance SI | |
AD | LODSD | 80386+ | Load dword [SI] into EAX, advance SI | |
AD | LODSW | | Load word [SI] into AX, advance SI | |
A4 | MOVS mb,mb | | Move byte [SI] to ES:[DI], advance SI,DI | |
A5 | MOVS mv,mv | | Move word [SI] to ES:[DI], advance SI,DI | |
A4 | MOVSB | | Move byte DS:[SI] to ES:[DI], advance SI,DI | |
A5 | MOVSD | 80386+ | Move dword DS:[SI] to ES:[DI], advance SI,DI | |
A5 | MOVSW | | Move word DS:[SI] to ES:[DI], advance SI,DI | |
90 | NOP | | No Operation | |
EE | OUT DX,AL | | Output byte AL to port number DX | |
EF | OUT DX,eAX | | Output word eAX to port number DX | |
6E | OUTS DX,rmb | 80186+ | Output byte [SI] to port number DX, advance SI | |
6F | OUTS DX,rmv | 80186+ | Output word [SI] to port number DX, advance SI | |
6E | OUTSB | 80186+ | Output byte DS:[SI] to port number DX, advance SI | |
6F | OUTSD | 80386+ | Output dword DS:[SI] to port number DX, advance SI | |
6F | OUTSW | 80186+ | Output word DS:[SI] to port number DX, advance SI | |
1F | POP DS | | Set DS to top of stack, increment SP by 2 | |
07 | POP ES | | Set ES to top of stack, increment SP by 2 | |
17 | POP SS | | Set SS to top of stack, increment SP by 2 | |
61 | POPA | 80186+ | Pop DI,SI,BP,x,BX,DX,CX,AX (SP value is ignored) | |
61 | POPAD | 80386+ | Pop EDI,ESI,EBP,x,EBX,EDX,ECX,EAX (ESP value is ignored) | |
9D | POPF | | Set flags register to top of stack, increment SP by 2 | |
9D | POPFD | 80386+ | Set EFLAGS register to top of stack, increment SP by 4 | |
0E | PUSH CS | | Set [SP-2] to CS, then decrement SP by 2 | |
1E | PUSH DS | | Set [SP-2] to DS, then decrement SP by 2 | |
06 | PUSH ES | | Set [SP-2] to ES, then decrement SP by 2 | |
16 | PUSH SS | | Set [SP-2] to SS, then decrement SP by 2 | |
60 | PUSHA | 80186+ | Push AX,CX,DX,BX,original SP,BP,SI,DI | |
60 | PUSHAD | 80386+ | Push EAX,ECX,EDX,EBX,original ESP,EBP,ESI,EDI | |
9C | PUSHF | | Set [SP-2] to flags register, then decrement SP by 2 | |
9C | PUSHFD | 80386+ | Set [SP-4] to EFLAGS register, then decrement SP by 4 | |
C3 | RET | | Return to caller (near or far, depending on PROC) | |
CB | RETF | | Return to far caller (pop offset, then seg) | |
C3 | RETN | | Return to near caller (pop offset only) | |
9E | SAHF | | Store AH into flags SF ZF xx AF xx PF xx CF | |
AE | SCAS mb | | Compare bytes AL - ES:[DI], advance DI | |
AF | SCAS mv | | Compare words eAX - ES:[DI], advance DI | |
AE | SCASB | | Compare bytes AL - ES:[DI], advance DI | |
AF | SCASD | 80386+ | Compare dwords EAX - ES:[DI], advance DI | |
AF | SCASW | | Compare words AX - ES:[DI], advance DI | |
36 | SS | | Use SS segment for the following memory reference | |
F9 | STC | | Set carry flag | |
FD | STD | | Set direction flag so SI and DI will decrement | |
FB | STI | | Set interrupt enable flag, interrupts enabled | |
AA | STOS mb | | Store AL to byte [DI], advance DI | |
AB | STOS mv | | Store eAX to word [DI], advance DI | |
AA | STOSB | | Store AL to byte ES:[DI], advance DI | |
AB | STOSD | 80386+ | Store EAX to dword ES:[DI], advance DI | |
AB | STOSW | | Store AX to word ES:[DI], advance DI | |
9B | WAIT | | Wait until floating-point operation is completed | |
D7 | XLAT | | Set AL to memory byte DS:[BX + unsigned AL] | |
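To show a few of these 1-byte opcodes at work, the sketch below fills the whole video segment using CLD, STOSB, and RET from the chart. It assumes NASM syntax and a .COM build; the color formula (low byte of the offset) is an arbitrary choice for illustration, and the surrounding setup is deliberately unoptimized so the 1-byte instructions stand out.

```nasm
; Sketch: fill segment A000h with a repeating color ramp using 1-byte opcodes.
        org     100h
        mov     ax,0013h        ; set mode 13h
        int     10h
        mov     ax,0A000h
        mov     es,ax           ; ES = video segment
        xor     di,di           ; start at offset 0
        cld                     ; FC: make STOSB count upward
fill:
        mov     ax,di           ; AL = low byte of the current offset
        stosb                   ; AA: store AL at ES:[DI], then DI += 1
        test    di,di
        jnz     fill            ; stop once DI wraps back to 0 (64K written)
        xor     ax,ax           ; AH=00h: wait for a keypress
        int     16h
        mov     ax,0003h        ; restore 80x25 text mode
        int     10h
        ret                     ; C3: exit to DOS
```

The single-byte STOSB stands in for a 3-byte `mov [es:di],al` followed by a 1-byte `inc di`, which is the kind of saving the chart is meant to help you spot.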
Tools and Workflows
A sample framework
Want to just dive in and see what happens? Here's a skeleton that sets up Mode 13h, loops until a keypress is detected, then exits.
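start:
        mov     ax,0013h        ; AH=00h (set video mode), AL=13h (320x200, 256 colors)
        int     10h
mainloop:
        ; this is where you do your mega-amazing tiny program
        mov     ah,01h          ; BIOS keyboard service: check for a waiting keystroke
        int     16h             ; ZF=1 if no key is waiting
        jz      mainloop        ; keep looping until a key is pressed
leaveit:
        ret                     ; near return reaches the INT 20h at offset 0 of the PSP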
Where to go from here?
Tips, Tricks, and Techniques can help you with ideas on optimizing your next production, or help you design while you're writing it.
Some Case Studies are provided that illustrate and explain some of the choices made when sizecoding.
Can't find what you need? Check our list of external resources.