divps - divide packed singles (32-bit floating point)

dest[0] = dest[0] / source[0]
dest[1] = dest[1] / source[1]
dest[2] = dest[2] / source[2]
dest[3] = dest[3] / source[3]

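The divps instruction divides each of the 4 values of the destination (first operand, an XMM register) by the corresponding value of the source (second operand) and stores the 4 quotients in the destination. The source can be an XMM register or a 128-bit memory location holding 4 floats. For divps a memory source must be aligned on a 16-byte boundary. CPUs with AVX instructions also provide vdivps, which takes 3 operands: a destination register, a first source register, and a second source which can be a register or a memory location. Having a separate destination can simplify coding since neither source is overwritten. With YMM registers vdivps divides 8 pairs of values, and its memory source does not need to be aligned.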

        divps   xmm0, xmm1          ; xmm0 = xmm0 / xmm1 for 4 pairs of values
                                    ; leave the rest of ymm0 as is
        divps   xmm0, [x]           ; xmm0 = xmm0 / x for 4 pairs of values
                                    ; x is an array of 4 floats
                                    ; x must be on a 16 byte boundary
                                    ; leave the rest of ymm0 as is
        vdivps  xmm3, xmm0, xmm15   ; xmm3 = xmm0 / xmm15 for 4 pairs of values
                                    ; xmm0 and xmm15 are unchanged
                                    ; the rest of ymm3 is set to 0
        vdivps  ymm3, ymm0, [x]     ; ymm3 = ymm0 / x for 8 pairs of values
                                    ; x is an array of 8 floats
        vdivps  ymm3, ymm0, [rsi]   ; ymm3 = ymm0 / [rsi] for 8 pairs of values
                                    ; rsi contains the address of an array
                                    ; of 8 floats
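
The same operation can be reached from C through the SSE intrinsic _mm_div_ps, which compilers typically translate to divps (or vdivps when AVX is enabled). The following is a minimal sketch rather than a definitive implementation. The function name div4 is hypothetical, and the scalar loop is only an assumed fallback for targets without SSE. Unaligned loads and stores are used so the arrays do not need to be on a 16-byte boundary.

```c
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>      /* _mm_loadu_ps, _mm_div_ps, _mm_storeu_ps */
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: dest[i] = dest[i] / source[i] for i = 0..3 */
void div4(float dest[4], const float source[4])
{
#if HAVE_SSE
    __m128 d = _mm_loadu_ps(dest);      /* load 4 floats, no alignment needed */
    __m128 s = _mm_loadu_ps(source);    /* load 4 floats, no alignment needed */
    d = _mm_div_ps(d, s);               /* 4 divisions at once */
    _mm_storeu_ps(dest, d);             /* write the 4 quotients back to dest */
#else
    /* Assumed fallback: the same 4 divisions done one at a time */
    for (int i = 0; i < 4; i++)
        dest[i] = dest[i] / source[i];
#endif
}
```

Calling div4 with dest = {8, 9, 10, -3} and source = {2, 3, 4, 2} leaves {4, 3, 2.5, -1.5} in dest, matching the 4 equations at the top of this page.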

flags: none (rflags is not changed, but floating point exception status bits in MXCSR may be set)