andnps - and not packed singles

dest[0] = ~dest[0] & source[0]
dest[1] = ~dest[1] & source[1]
dest[2] = ~dest[2] & source[2]
dest[3] = ~dest[3] & source[3]

The andnps instruction inverts every bit of the destination (an XMM register) and then ands the result with the source (second operand), treating both as 4 packed single-precision values. The source can be an XMM register or a 128 bit memory location, which must be aligned on a 16 byte boundary. On CPUs with AVX instructions there is also vandnps, which allows using 3 XMM registers or 2 XMM registers and a memory location. This can simplify coding, since the destination is separate from the operand being inverted (the first source). With YMM registers vandnps and-nots 8 pairs of values.

There is also andnpd which and-nots packed doubles. Since the operation is bitwise, both instructions produce the same 128 bit result, so pick your favorite.
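
The operation described above can be modeled in portable C. This is only a sketch of the semantics, not the way the instruction is normally reached from C. The names andnps_model and abs4 are hypothetical helpers introduced here for illustration. The second helper shows one common use, assuming IEEE 754 single precision floats: with only the sign bit set in every lane of the destination, and-not clears the sign bit of each source value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: models andnps on four 32 bit lanes.
   dest[i] = ~dest[i] & source[i], applied to the raw bits. */
static void andnps_model(float dest[4], const float source[4])
{
    assert(sizeof(float) == sizeof(uint32_t));
    for (int i = 0; i < 4; i++) {
        uint32_t d, s;
        memcpy(&d, &dest[i], sizeof d);     /* read the float's bit pattern */
        memcpy(&s, &source[i], sizeof s);
        d = ~d & s;                         /* invert dest, then and with source */
        memcpy(&dest[i], &d, sizeof d);
    }
}

/* Hypothetical usage: absolute value of 4 floats.
   Assumes IEEE 754 floats, where bit 31 is the sign bit. */
static void abs4(float out[4], const float in[4])
{
    const uint32_t sign = 0x80000000u;      /* only the sign bit set */
    for (int i = 0; i < 4; i++)
        memcpy(&out[i], &sign, sizeof sign);
    andnps_model(out, in);                  /* out = ~sign & in */
}
```

Note that the mask goes in the destination, because the destination is the operand that gets inverted.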

        andnps   xmm0, xmm1        ; xmm0 = ~xmm0 & xmm1 (4 pairs of values)
                                   ; leave the rest of ymm0 as is
        andnps   xmm0, [x]         ; xmm0 = ~xmm0 & x (4 pairs of values)
                                   ; x is a 16 byte aligned array of floats
                                   ; leave the rest of ymm0 as is
        vandnps  xmm3, xmm0, xmm15 ; xmm3 = ~xmm0 & xmm15 (4 pairs of values)
                                   ; store results in xmm3
                                   ; the rest of ymm3 is set to 0
        vandnps  ymm3, ymm0, [x]   ; ymm3 = ~ymm0 & x (8 pairs of values)
                                   ; store results in ymm3
        vandnps  ymm3, ymm0, [rsi] ; ymm3 = ~ymm0 & [rsi] (8 pairs of values)
                                   ; rsi contains the address of an array
                                   ; store results in ymm3

flags: none