subps - subtract packed singles (32 bit floating point)

dest[0] = dest[0] - source[0]
dest[1] = dest[1] - source[1]
dest[2] = dest[2] - source[2]
dest[3] = dest[3] - source[3]

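The subps instruction subtracts the 4 source values (second operand) from the 4 values of the destination (an XMM register). The source can be an XMM register or a 128 bit memory location holding 4 floats. For subps a memory operand must be aligned on a 16 byte boundary. On CPUs with AVX instructions there is also vsubps, which allows using 3 XMM registers or 2 XMM registers and a memory location. This can simplify coding since the destination no longer has to be one of the values being subtracted. With YMM registers vsubps subtracts 8 pairs of values. When vsubps is used with an XMM destination, the upper half of the corresponding YMM register is set to 0. The memory operand of vsubps does not need to be aligned.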

        subps   xmm0, xmm1        ; subtract the 4 floats of xmm1 from xmm0
                                  ; leave the rest of ymm0 as is
        subps   xmm0, [x]         ; subtract the first 4 floats of x from xmm0
                                  ; x is an array of floats aligned on a
                                  ; 16 byte boundary
                                  ; leave the rest of ymm0 as is
        subps   xmm0, [x+4*rax]   ; subtract 4 floats of x from xmm0
                                  ; x is an array of floats
                                  ; rax holds the index of the first element
                                  ; (a multiple of 4 keeps the address aligned)
                                  ; leave the rest of ymm0 as is
        vsubps  xmm3, xmm0, xmm15 ; subtract the 4 floats of xmm15 from xmm0
                                  ; store results in xmm3
                                  ; set the rest of ymm3 to 0
        vsubps  ymm3, ymm0, [x]   ; subtract the first 8 floats of x from ymm0
                                  ; store results in ymm3
        vsubps  ymm3, ymm0, [rsi] ; subtract the 8 floats at [rsi] from ymm0
                                  ; rsi contains the address of an array
                                  ; store results in ymm3

flags: none
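
From C the same operation is usually reached through the _mm_sub_ps intrinsic, which compilers typically translate to subps (or vsubps when AVX code generation is enabled). The following is a minimal sketch, not a definitive implementation. The function name sub_floats is hypothetical, and unaligned loads and stores are used so the arrays do not need 16 byte alignment. On a target without SSE only the scalar loop is compiled.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_loadu_ps, _mm_sub_ps, _mm_storeu_ps */
#define HAVE_SSE 1
#endif

/* Hypothetical helper: dest[i] = dest[i] - src[i] for n floats. */
void sub_floats(float *dest, const float *src, size_t n)
{
    size_t i = 0;
#ifdef HAVE_SSE
    /* Process 4 floats per step, as one subps would. */
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_loadu_ps(dest + i);   /* 4 floats of dest */
        __m128 s = _mm_loadu_ps(src + i);    /* 4 floats of src  */
        _mm_storeu_ps(dest + i, _mm_sub_ps(d, s));
    }
#endif
    /* Handle leftover elements (or all of them without SSE). */
    for (; i < n; i++)
        dest[i] = dest[i] - src[i];
}
```

Calling sub_floats(a, b, 6) would handle the first 4 elements with one packed subtraction and the remaining 2 with the scalar loop.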