Difference between revisions of "Floating-point Opcodes"

From SizeCoding
(Created page with " The FPU offers a lot of operations not available to classic x86 CPU, like <code>SIN</code>, <code>COS</code>, <code>TAN</code>, <code>EXP</code> and so on. Usage and communic...")
 
Line 1: Line 1:
  
The FPU offers a lot of operations not available to classic x86 CPU, like <code>SIN</code>, <code>COS</code>, <code>TAN</code>, <code>EXP</code> and so on. Usage and communication with the FPU is a bit iuncommon and takes a bit to get used to. It's recommended to read the creation of the [[Output#Outputting_in_mode_13h_.28320x200.29|snippet we want to modify]] first, this is how it looks like originally :
+
The FPU offers many operations that the classic x86 integer instruction set lacks, such as <code>SIN</code>, <code>COS</code>, <code>TAN</code>, <code>EXP</code>, <code>SQRT</code> and <code>LN</code>. Communicating with the FPU is somewhat unusual and takes a while to get used to. It is recommended to first read how the [[Output#Outputting_in_mode_13h_.28320x200.29|snippet we want to modify]] was created; this is what it looks like originally:
  
 
<syntaxhighlight lang="nasm">cwd            ; "clear" DX for perfect alignment
 
<syntaxhighlight lang="nasm">cwd            ; "clear" DX for perfect alignment
Line 42: Line 42:
 
ret ; quit program</syntaxhighlight>
 
ret ; quit program</syntaxhighlight>
  
(explanation follows, dinner first ^^)
+
The usual interaction with the FPU is as follows:
 +
* <code>F(N)INIT</code> : Initialization of the FPU
 +
* store register content in memory location(s)
 +
* transfer from memory location onto FPU stack
 +
* actual calculations on the FPU (more on this soon)
 +
* transfer from FPU stack into memory location(s)
 +
* get register from memory location
 +
 
 +
That is a lot of overhead for a single integer addition, but it starts to pay off once more complex floating-point operations are involved. For a more advanced FPU example, let's start from scratch with an unoptimized program that plots the distance of each pixel from the screen center as its color, in 49 bytes.
 +
 
 +
[[File:Distance to center example.png|thumb]]
 +
 
 +
<syntaxhighlight lang="nasm">push 0a000h
 +
pop es ; get start of video memory in ES
 +
mov al,0x13 ; switch to video mode 13h
 +
int 0x10 ; 320 * 200 in 256 colors
 +
fninit ; -
 +
; it's useful to comment what's on the
 +
; stack after each FPU operation
 +
; to not get lost ;) start is : empty (-)
 +
X:
 +
xor dx,dx ; reset the high word before division
 +
mov bx,320 ; 320 columns
 +
mov ax,di ; get screen pointer in AX
 +
div bx ; split the screen pointer: AX = row (Y), DX = column (X)
 +
sub ax,100 ; subtract the origin
 +
sub dx,160 ; = (160,100) ... center of 320x200 screen
 +
mov [si],ax ; move Y into a memory location
 +
fild word [si] ; Y
 +
fmul st0 ; Y²
 +
mov [si],dx ; move X into a memory location
 +
fild word [si] ; X Y²
 +
fmul st0 ; X² Y²
 +
faddp st1,st0 ; X²+Y² (add and pop, so the stack stays balanced)
 +
fsqrt ; R
 +
fistp word [si] ; -
 +
mov ax,[si] ; get R back into AX (AL = color)
 +
stosb ; write to screen (DI) and increment DI
 +
jmp short X ; next pixel</syntaxhighlight>

Revision as of 13:27, 15 August 2016

The FPU offers many operations that the classic x86 integer instruction set lacks, such as SIN, COS, TAN, EXP, SQRT and LN. Communicating with the FPU is somewhat unusual and takes a while to get used to. It is recommended to first read how the snippet we want to modify was created; this is what it looks like originally:

cwd             	; "clear" DX for perfect alignment
mov 	al,0x13
X: 		int 0x10	; set video mode AND draw pixel
mov 	ax,cx		; get column in AH
add		ax,di		; offset by framecounter	          <-- REPLACE THIS WITH FPU CODE
xor 	al,ah		; the famous XOR pattern
and 	al,32+8		; a more interesting variation of it
mov 	ah,0x0C		; set subfunction "set pixel" for int 0x10
loop 	X			; loop 65536 times
inc 	di			; increment framecounter
in 		al,0x60		; check keyboard ...
dec 	al			; ... for ESC
jnz 	X			; rinse and repeat
ret					; quit program

And this is how it looks if we replace that instruction with FPU code:

cwd             	; "clear" DX for perfect alignment
mov 	al,0x13
X: 		int 0x10	; set video mode AND draw pixel
mov 	ax,cx		; get column in AH

fninit				; init FPU first
mov		[si],ax		; write first addend to a memory location
fild	word [si]	; F(pu) I(nteger) L(oa)D: push a WORD from the memory location onto the FPU stack
mov		[si],di		; write second addend to a memory location
fiadd	word [si]	; Directly add the word in the memory location to the top of the FPU stack
fist	word [si]	; F(pu) I(nteger) ST(ore) the result into a memory location
mov		ax,[si]		; Get the word from the memory location into AX

xor 	al,ah		; the famous XOR pattern
and 	al,32+8		; a more interesting variation of it
mov 	ah,0x0C		; set subfunction "set pixel" for int 0x10
loop 	X			; loop 65536 times
inc 	di			; increment framecounter
in 		al,0x60		; check keyboard ...
dec 	al			; ... for ESC
jnz 	X			; rinse and repeat
ret					; quit program

The usual interaction with the FPU is as follows:

  • F(N)INIT : Initialization of the FPU
  • store register content in memory location(s)
  • transfer from memory location onto FPU stack
  • actual calculations on the FPU (more on this soon)
  • transfer from FPU stack into memory location(s)
  • get register from memory location
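
As a minimal sketch of these six steps with a "real" floating-point operation, the following stand-alone COM program scales the sine of an integer angle. The angle 2 and the scale factor 100 are arbitrary values chosen for illustration, [si] serves as scratch memory just like above (assuming SI = 0x100 at program start, as is usual for DOS COM files), and FSIN needs a 387 or later FPU.

<syntaxhighlight lang="nasm">org 100h                ; DOS COM file, assemble with: nasm -f bin
fninit                  ; 1. initialize the FPU                 -
mov     ax,2            ; hypothetical input: an angle of 2 radians
mov     [si],ax         ; 2. store register content in memory
fild    word [si]       ; 3. load it onto the FPU stack         a
fsin                    ; 4. calculate ...                      sin(a)
mov     word [si],100   ;    (hypothetical scale factor, same scratch word)
fimul   word [si]       ;    ...                                100*sin(a)
fistp   word [si]       ; 5. store the rounded result and pop   -
mov     ax,[si]         ; 6. get the result into AX (91 for a = 2)
ret                     ; quit program</syntaxhighlight>

The stack is empty again at the end, because FISTP pops the value that FILD pushed.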

That is a lot of overhead for a single integer addition, but it starts to pay off once more complex floating-point operations are involved. For a more advanced FPU example, let's start from scratch with an unoptimized program that plots the distance of each pixel from the screen center as its color, in 49 bytes.

Distance to center example.png
push 	0a000h			
pop 	es				; get start of video memory in ES
mov 	al,0x13			; switch to video mode 13h
int 	0x10			; 320 * 200 in 256 colors
fninit					; -	
						; it's useful to comment what's on the
						; stack after each FPU operation
						; to not get lost ;) start is : empty (-)
X:
xor 	dx,dx			; reset the high word before division
mov 	bx,320			; 320 columns
mov 	ax,di			; get screen pointer in AX
div 	bx				; split the screen pointer: AX = row (Y), DX = column (X)
sub 	ax,100			; subtract the origin
sub 	dx,160			; = (160,100) ... center of 320x200 screen	
mov 	[si],ax			; move Y into a memory location
fild 	word [si]		; Y
fmul 	st0				; Y²
mov 	[si],dx			; move X into a memory location
fild 	word [si]		; X Y²
fmul 	st0				; X² Y²
faddp 	st1,st0			; X²+Y² (add and pop, so the stack stays balanced)
fsqrt					; R
fistp 	word [si]		; -
mov 	ax,[si]			; get R back into AX (AL = color)
stosb					; write to screen (DI) and increment DI
jmp short X				; next pixel
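
The stack comments list st0 first. As a small sketch of why this bookkeeping matters (the values 1 and pi are only placeholders, and [si] is again assumed to point to usable scratch memory): the x87 stack has just eight registers, so a loop should pop as many values as it pushes, and the popping forms of the instructions (those ending in P) take care of that.

<syntaxhighlight lang="nasm">org 100h                ; DOS COM file, assemble with: nasm -f bin
fninit                  ; -
fld1                    ; 1
fldpi                   ; pi 1
fadd    st0,st1         ; pi+1 1    non-popping: st1 stays on the stack
faddp   st1,st0         ; pi+2      st1 = st1 + st0, then pop
fistp   word [si]       ; -         store as integer and pop
ret                     ; quit program</syntaxhighlight>

Three values are pushed or left behind and three pops remove them, so the stack ends up empty again.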