Prior to the advent of semiconductor main memory, the bulk of the cost of a computer system was in its memory, not its central processor. For example, on the Univac 1108 computer I worked on between 1968 and 1976, a machine equipped with its maximum memory capacity of around one megabyte, the memory cost almost three times as much as the CPU (at 1968 prices).
Such a cost structure motivated designers to create instruction sets that minimised the size of programs, allowing more to fit into scarce and expensive memory. Further, memory was typically slow compared to the electronic speed of the processor, so instructions that performed complex operations in a single memory cycle improved performance. As a result, most mainframe computers had large instruction sets, with dozens or even hundreds of instructions serving various purposes.
Minicomputers radically simplified their instruction sets to reduce cost, often sacrificing performance in the process. The Digital Equipment Corporation PDP-8, for example, had just eight instructions and was famously tricky to program but, in its simplest implementation (the 8/S), was built from just 519 logic gates. It is even possible, as an academic exercise, to design a Turing-complete computer with just a single instruction: instructions don’t need an operation code because they’re all the same! One such design, “subleq”, is sketched below.
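To make the single-instruction idea concrete, here is a minimal sketch of a “subleq” one-instruction computer in Python. The machine, memory layout, and example program are my own illustration, not any particular historical design: every instruction is three operands a, b, c, meaning “subtract mem[a] from mem[b], and branch to c if the result is less than or equal to zero”. With only one operation, no opcode field is needed.

```python
def run_subleq(mem, pc=0):
    """Emulate a "subleq" one-instruction computer.

    Every instruction is three memory operands a, b, c:
        mem[b] -= mem[a]; if mem[b] <= 0, jump to c
    A negative jump target halts the machine.
    """
    while pc >= 0:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
    return mem

# Example: clear cell 9 by subtracting it from itself, then halt (jump to -1).
# The program occupies cells 0-2; cell 9 holds the data value 42.
memory = [9, 9, -1] + [0] * 6 + [42]
print(run_subleq(memory)[9])   # prints 0
```

Despite its austerity, chains of subleq instructions can synthesise copying, addition, and conditional branching, which is what makes the scheme Turing complete (given unbounded memory).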
The Intel x86 architecture was introduced with the 8086 microprocessor in 1978. This was Intel’s first 16 bit processor, with a 16 bit wide data path and the ability to address up to a megabyte of memory. The chip had 29,000 transistors, and a companion co-processor, the 8087, added 32- and 64-bit floating point instructions (implementing a draft of what later became the IEEE 754 standard).
The 8086 was followed by successive generations of upward compatible processors: 80186, 80286, 80386, 80486, the Pentium and its successors, up to the recent announcement of the 12th generation “Alder Lake” processors. While the original 8086 instruction set was rudimentary, comparable (albeit messier) to those of contemporary 16 bit minicomputers, each successive generation added instructions, often in families targeted at applications such as multimedia, virtual machine support, vector and matrix operations, and cryptographic acceleration. The present-day x86 is undoubtedly the most complicated instruction set architecture ever created, with instructions that vary in length from a single byte to 15 bytes.
So, how many different instructions does the x86 have, anyway? Well, that depends… Two attempts to answer this question are discussed below.
Intel publishes an x86 encoder/decoder library called XED, which defines 1,536 “iclasses” and their byte encodings. But is ADD a different instruction from LOCK ADD? Intel considers it so, and likewise counts instructions with the repeat prefix (REP MOVSD, for example) as distinct from the base instruction. But it does not treat the operand size prefix, which, for example, causes a 32 bit instruction to operate on a 16 bit half-register, as creating a separate instruction.
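The distinction is easier to see at the byte level. The sketch below uses the third-party Capstone disassembler’s Python bindings (my choice of tool, not something the XED documentation prescribes) to decode an ADD with and without the LOCK and operand size prefixes, plus a REP MOVSD.

```python
# Decode ADD and MOVSD with and without prefixes (32 bit mode).
# Requires the Capstone disassembly library: pip install capstone
from capstone import Cs, CS_ARCH_X86, CS_MODE_32

md = Cs(CS_ARCH_X86, CS_MODE_32)

variants = [
    bytes([0x01, 0x18]),            # add [eax], ebx       (base instruction)
    bytes([0xF0, 0x01, 0x18]),      # lock add [eax], ebx  (F0 = LOCK prefix)
    bytes([0x66, 0x01, 0x18]),      # add [eax], bx        (66 = operand size prefix)
    bytes([0xF3, 0xA5]),            # rep movsd            (F3 = REP prefix)
]

for code in variants:
    for insn in md.disasm(code, 0):
        print(f"{code.hex():<10} {insn.mnemonic} {insn.op_str}")
```

By XED’s rules these four byte sequences yield three distinct iclasses: the ADDs with and without the 66 prefix count as the same ADD, while LOCK ADD and REP MOVSD each stand alone.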
If you look at the byte encoding of instructions, things get even wilder. While an assembly language programmer may simply write a MOV instruction, what the assembler generates depends upon the source and destination operands, producing thirty-four distinct binary encodings. If you count all of these as different instructions, you end up with 6,000 unique x86 instructions. If you count only the assembler mnemonic and operand types (register/memory/constant, width, etc.), you find “only” 3,683 instructions. Counting only mnemonic and width yields 2,034 instructions.
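To illustrate (again with Capstone, and showing only a handful of the thirty-four forms; the encodings are standard, the selection is mine), here are a few of the byte sequences an assembler may emit for what the programmer writes simply as MOV:

```python
# A few of the distinct binary encodings behind the single mnemonic MOV
# (32 bit mode; decoded with Capstone as above).
from capstone import Cs, CS_ARCH_X86, CS_MODE_32

md = Cs(CS_ARCH_X86, CS_MODE_32)

encodings = [
    bytes([0x89, 0xD8]),                  # mov eax, ebx    (register <- register)
    bytes([0x8B, 0x03]),                  # mov eax, [ebx]  (register <- memory)
    bytes([0xB8, 0x2A, 0, 0, 0]),         # mov eax, 42     (register <- 32 bit constant)
    bytes([0xB0, 0x2A]),                  # mov al, 42      (8 bit register <- constant)
    bytes([0xC7, 0x00, 0x2A, 0, 0, 0]),   # mov [eax], 42   (memory <- 32 bit constant)
]

for code in encodings:
    for insn in md.disasm(code, 0):
        print(f"{code.hex():<14} {insn.mnemonic} {insn.op_str}")
```

Note that the opcode byte differs in every case, even though the programmer-visible mnemonic is the same.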
This is how you end up with instructions with mnemonics like:
- VFMSUBADD132PD (Fused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Values)
- PCMPESTRM (Packed comparison of string data with explicit lengths, generating a mask)
- MASKMOVDQU (Non-Temporal Store of Selected Bytes from an XMM Register into Memory)
- VPTERNLOGQ (Bitwise Ternary Logic; modelled in the sketch after this list)
- PCLMULLQHQDQ (Carry-less multiplication of the low half of the destination register by the high half of the source register)
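For a sense of what “Bitwise Ternary Logic” actually does, here is a small Python model of the per-bit operation as Intel documents it: the immediate byte acts as an arbitrary three-input truth table, indexed at each bit position by the corresponding bits of the three operands.

```python
def vpternlog(a, b, c, imm8, width=64):
    """Model of VPTERNLOGQ's per-bit operation: at each bit position,
    the bits of a, b, c form a 3-bit index (a is the high bit) into
    the imm8 truth table; the indexed imm8 bit is the result bit."""
    result = 0
    for i in range(width):
        idx = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        result |= ((imm8 >> idx) & 1) << i
    return result

# imm8 = 0x96 encodes the truth table for three-way XOR: a ^ b ^ c
a, b, c = 0b1100, 0b1010, 0b0110
assert vpternlog(a, b, c, 0x96) == a ^ b ^ c
```

Because the immediate can encode any Boolean function of three inputs, one VPTERNLOGQ can replace a pair of conventional bitwise operations.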
Here is one programmer’s take on the “Top 10 Craziest [x86] Assembly Language Instructions”.