@Timotei Dolean: See the instruction tables in agner.org/optimize. A CPU textbook that discusses microcoding (and hopefully some do) will explain the reasoning.
Commented Oct 26, 2010 at 22:25
@alexstrange: related: "Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?" has some uop counts and throughput numbers for loop on various recent microarchitectures, and some of the history behind how we ended up in this catch-22 situation of: nobody uses it because it's slow / not worth making faster because nobody uses it. If it was fast, it would often save code size, and be great for adc loops (especially on CPUs with partial-flag stalls like Nehalem and earlier.)
Commented May 28, 2018 at 5:51
LOOP decrements ECX and checks whether ECX is nonzero. If it is, it jumps to the specified label; otherwise execution falls through.
LOOPE decrements ECX and checks that ECX is nonzero and ZF is set. If both conditions are met, it jumps to the label; otherwise execution falls through.
LOOPNE is the same as LOOPE, except that it requires ZF to be clear (i.e. zero) to take the jump.
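The three branch conditions described above can be modeled in a few lines of Python. This is only an illustrative sketch, not real machine code: the `Cpu` class and the `loop`, `loope`, and `loopne` function names are made up for this example, and 32-bit mode (ECX as the counter) is assumed. Each function returns `True` when the jump would be taken.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # ECX is a 32-bit register, so the decrement wraps


@dataclass
class Cpu:
    """Hypothetical minimal CPU state: just the counter and the zero flag."""
    ecx: int = 0
    zf: bool = False


def _dec_ecx(cpu: Cpu) -> None:
    # The decrement performed by LOOP/LOOPE/LOOPNE leaves ZF untouched.
    cpu.ecx = (cpu.ecx - 1) & MASK32


def loop(cpu: Cpu) -> bool:
    """LOOP: jump if ECX != 0 after the decrement."""
    _dec_ecx(cpu)
    return cpu.ecx != 0


def loope(cpu: Cpu) -> bool:
    """LOOPE: jump if ECX != 0 after the decrement and ZF is set."""
    _dec_ecx(cpu)
    return cpu.ecx != 0 and cpu.zf


def loopne(cpu: Cpu) -> bool:
    """LOOPNE: jump if ECX != 0 after the decrement and ZF is clear."""
    _dec_ecx(cpu)
    return cpu.ecx != 0 and not cpu.zf
```

One consequence the model makes visible: if ECX is 0 on entry, the decrement wraps to 0xFFFFFFFF and the jump is taken, which is why such loops are often guarded with JECXZ.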
answered Nov 18, 2009 at 14:20 sharptooth
Although it wasn't asked, I'd like to point out that all LOOP instructions are much slower than the equivalent DEC ECX / JNZ sequence. This is intentional: LOOP should nowadays only be used for delay calibration loops in hardware drivers and the like.
Commented Nov 18, 2009 at 15:23
@NilsPipenbrinck: On which processors is it slower? What's your source?
Commented May 18, 2013 at 19:56
@JanusTroelsen, it's slower from the 80486 onwards. On the latest processors it's a lot slower. Source: agner.org/optimize manual #2.
Commented Oct 18, 2013 at 14:14
@sharptooth, speaking of LOOPE: after the decrement, how can ECX be nonzero and ZF be set? Does LOOPE not affect the ZF flag?
Commented Sep 4, 2015 at 18:35
Answering my own question: after checking in gdb, I can confirm that none of the loop instructions (LOOP, LOOPE, LOOPNE) affects the ZF flag when it decrements the ECX counter. Now it makes sense.
Commented Sep 4, 2015 at 18:56
EDIT: Synopsis from link: LOOPE and LOOPNE are essentially LOOP instructions with one additional check. LOOPE loops "while zero flag," meaning it will loop as long as the zero flag ZF is one and the count in ECX has not reached zero, and LOOPNE loops "while not zero flag," meaning it continues the loop as long as ZF is zero and the count in ECX has not reached zero. Keep in mind that neither of these instructions inherently affects the status of ZF.
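Because the counter decrement leaves ZF alone, ZF on loop exit still reflects the last comparison made in the loop body. A rough Python model of a bounded search built from a CMP followed by LOOPNE shows this. The function name `scan_loopne` is hypothetical, and the early return stands in for the JECXZ guard one would assume in real code.

```python
def scan_loopne(data: bytes, target: int):
    """Model of: next: inc esi / cmp byte [esi], al / loopne next.

    Returns the index of the first byte equal to target, or None.
    """
    ecx = len(data)
    if ecx == 0:
        # Stand-in for a JECXZ guard: with ECX = 0 the decrement would wrap.
        return None
    i = -1
    while True:
        i += 1
        zf = data[i] == target          # CMP sets ZF on a match
        ecx = (ecx - 1) & 0xFFFFFFFF    # LOOPNE decrement, ZF unchanged
        if not (ecx != 0 and not zf):   # LOOPNE falls through
            break
    # ZF still holds the result of the last CMP, so it separates
    # "found" from "counter ran out", even when both happen at once.
    return i if zf else None
```

For example, `scan_loopne(b"abc", ord("c"))` ends with the counter at zero and ZF set at the same time, and the model reports index 2 rather than a miss.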
answered Nov 18, 2009 at 14:20 Matthew Jones
I believe that it is best to not only provide a link, but also quote relevant material from the source, should the link ever become invalid.
Commented Nov 18, 2009 at 14:22
The LOOP instructions, as well as JCXZ/JECXZ, are a bit slow; however, they still have their place in modern code.
High speed is not always a concern in loops. For example, if we are executing a loop only once during program init and the iteration count is small, the time required will not be noticed.
Another example is a loop where Windows API functions are called; the time spent in the API call probably makes the LOOP execution time trivial. Again, this applies when the iteration count is small.
Consider these instructions as "another tool in your toolbox"; use the right tool for the job ;)