open poll: what's your favorite weird cpu instruction?

any architecture, any time period.

@tindall wait do you think any Jazelle DBX things can support invokedynamic

@tindall @videogame_hacker ARM kept changing their minds about it, jazelle barely got implemented before they jumped to ThumbEE, then they dropped ThumbEE. a shame honestly

@tindall popcount, but i am a dilettante in this subject and that is literally the only weird cpu instruction i know of

@tindall POWER ISA has an instruction called "Enforce In-order Execution of I/O", abbreviated to the mnemonic..."eieio".

@tindall the AGC’s “EDRUPT” instruction. Nobody is entirely sure what it was for and it never ended up in any flown software

“The EDRUPT instruction is so-called because it was requested by programmer Ed Smally, and was used only by him.”

@tindall maybe the 0x0F3F instruction that sets up the RISC pipeline in those weird embedded-RISC-core x86 VIA processors. that's a good one

@tindall (I mean to be clear the 432 is a bonkers machine that was DOA, but it is an amazing case study in "hold-my-beer-level CISC" -- it has a hardware garbage collector, transparent single-level store, hardware object-capability addressing and a substantial fraction of Ada's language semantics wired into it.)

@graydon The more I read about this the more convinced I am that there's a very fun alternative timeline where it became a mainstream architecture

@tindall Yes an extremely vaporwave space-lasers sort of timeline.

(My favourite aspect of it is I have one of those "this is the future, get ready for it" programming books from the time where the running example is about writing a stock-market portfolio management program. Because of course that's what your early 80s Ada object-capability mini-supercomputer-on-a-chip would be doing! Wall Street here I come! Pew-pew, lasers!)

@tindall yup, the movfuscator is a great time

@tindall the TMS9900 had an instruction, X, which would execute the contents of the given register as an instruction

if it needed any inline operands, they'd be taken from the instruction stream immediately after the X instruction

what could possibly go wrong...? ;-)
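(a rough sketch of the idea for anyone who wants to poke at it: this is a made-up toy machine, not the real TMS9900 encoding. the opcode names `LI`, `X`, `HALT`, the tuple "words" and the register names are all hypothetical. it only shows how an executed-from-register instruction ends up pulling its inline operand from the stream right after the X.)

```python
def run(program):
    """Run a toy program; returns the final register contents."""
    regs = {}
    pc = 0

    def execute(instr):
        nonlocal pc
        op = instr[0]
        if op == "LI":
            # Load immediate: the operand is the next word in the instruction stream.
            regs[instr[1]] = program[pc]
            pc += 1
            return True
        if op == "X":
            # Execute whatever instruction word is sitting in the named register.
            # Any inline operand it needs is fetched from wherever pc points now,
            # i.e. the word right after the X instruction itself.
            return execute(regs[instr[1]])
        if op == "HALT":
            return False
        raise ValueError(f"unknown opcode {op!r}")

    running = True
    while running:
        instr = program[pc]
        pc += 1
        running = execute(instr)
    return regs


# r1 is loaded with the *instruction* ("LI", "r2"); X r1 then executes it,
# and the 42 that follows X in the stream becomes its operand.
demo = [("LI", "r1"), ("LI", "r2"), ("X", "r1"), 42, ("HALT",)]
result = run(demo)
```

the fun part is that the data word after X is only an operand because of what happened to be in the register at the time.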

@tindall perhaps we should all be very grateful that TI failed to persuade IBM to build a personal computer around their shiny new microprocessor after all

@thamesynne @tindall the PIO cores in the RP2040 have this, too, simply called "exec"

@thamesynne @tindall I dunno; the 9900 was actually a nice machine to code for.

idk, basically all of the cpu instructions i've been writing lately are weird

like what does
vblendmpd %zmm6{cdab},%zmm7,%zmm7{%k2}
even mean

(careful! this may look like x86, and it may look like avx512, but it's technically neither!)

(for certain values of k2 it's the same as avx512's vunpckhpd instruction though)

this particular incantation is even darker than it seems on the surface, i fear

@linear @tindall the hubris of meddling with registers beyond our control

@tindall slightly sarcastic answer

floating point javascript convert to signed fixed point rounding towards zero

ARM added an instruction specifically for the webshit
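(the spelled-out name above is the ARMv8.3 instruction FJCVTZS. here's a minimal sketch of the conversion it exists to speed up, i.e. the JavaScript double-to-int32 rule: truncate toward zero, wrap modulo 2^32, reinterpret as signed. the function name `js_to_int32` is made up for illustration, and this models only the result value, not the flags the real instruction sets.)

```python
import math


def js_to_int32(x: float) -> int:
    """Convert a double to a signed 32-bit integer the way JavaScript's `x | 0` does."""
    if math.isnan(x) or math.isinf(x):
        return 0                      # NaN and infinities map to 0
    n = math.trunc(x) % (1 << 32)     # round toward zero, then wrap modulo 2**32
    # Values in the upper half of the unsigned range are negative when read as signed.
    return n - (1 << 32) if n >= (1 << 31) else n
```

the wraparound is the part an ordinary float-to-int convert doesn't do, since those saturate on overflow instead.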

@tindall ENTER on x86
it's slower than doing the equivalent operations manually (push rbp; mov rbp, rsp; sub rsp, whatever) because of microcode bullshit, so compilers never emit it
but compilers do emit LEAVE

@tindall @Patashu rlwinm, every PPC dev's favorite hammer for every nail

@tindall I know it's not really real, but HCF

Just because it involves fire

@SigmaOne @tindall if we're allowed to invoke instructions from 'deployed purely in a virtual environment', Stationeers IC10 *implements* HCF. and it works.

@tindall Knuth's MMIX has a bunch of unusual instructions. The first that comes to mind:

MOR $X,$Y,$Z / MXOR $X,$Y,$Z


Assume that the 64 bits in $Y are numbered as follows (and similar for the bits in register $Z and $X):

[[y00, y01, …, y07] [y10, y11, …, y17], ..., [y70, y71, …, y77]]

Now bit xij in register $X is computed as follows:

MOR: xij = (y0j & zi0) | (y1j & zi1) | ... | (y7j & zi7)
MXOR: xij = (y0j & zi0) ^ (y1j & zi1) ^ ... ^ (y7j & zi7)
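The two formulas above are an 8x8 bit-matrix product, with OR or XOR as the "addition". A minimal sketch of them follows; it assumes row 0 is the most significant byte and column 0 is the most significant bit within a byte, and the helper names (`bit`, `_bit_matrix_product`) are made up for illustration.

```python
def bit(v: int, i: int, j: int) -> int:
    """Bit at row i, column j of a 64-bit value (assumed numbering: MSB first)."""
    return (v >> (63 - (8 * i + j))) & 1


def _bit_matrix_product(y: int, z: int, combine) -> int:
    x = 0
    for i in range(8):
        for j in range(8):
            acc = 0
            for k in range(8):
                # Term k of the formula: y[k][j] & z[i][k]
                acc = combine(acc, bit(y, k, j) & bit(z, i, k))
            x |= acc << (63 - (8 * i + j))
    return x


def mor(y: int, z: int) -> int:
    """MOR: terms combined with OR."""
    return _bit_matrix_product(y, z, lambda a, b: a | b)


def mxor(y: int, z: int) -> int:
    """MXOR: terms combined with XOR."""
    return _bit_matrix_product(y, z, lambda a, b: a ^ b)
```

Under this numbering, putting the anti-diagonal matrix 0x0102040810204080 in $Z reverses the bytes of $Y, so `mor(0x0123456789ABCDEF, 0x0102040810204080)` gives `0xEFCDAB8967452301`.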

@tindall CMPCx (VAX) - compare two character strings with optional filling

@tindall on the ARM7TDMI (and no other ARM chip afaik), in ARM mode, the second least-significant bit is ignored, but is used in Thumb mode, as Thumb instructions are 2 bytes wide, while ARM instructions are 4 bytes wide. So if you jump to 0x02000402 in ARM mode, the code at 0x02000400 will actually get executed.

However, there's one place where this ignored bit actually does pop up when it shouldn't: a pc-relative LDR instruction. On the ARM7TDMI, this will still use the ignored bit while it shouldn't, and thus read data from 2 bytes further than intended (ARM9 and up fix this). So if you jump to 0x02000402, it will execute the same code as when jumping to 0x02000400, *except* pc-relative loads will be messed up.

This behavior is actually used in a Pokémon Emerald speedrun (for the GBA).
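A toy model of the address arithmetic described above, in case it helps. It is only a sketch of the behaviour as stated in this post, not an emulator: the function names and the `keeps_bit1` flag are made up, and it ignores what the memory system does with an unaligned load address. The one outside fact it leans on is that in ARM mode the PC reads as the current instruction's address plus 8.

```python
def arm_fetch_address(target: int) -> int:
    """Address the instruction is actually fetched from in ARM mode (low two bits ignored)."""
    return target & ~0b11


def pc_relative_load_address(target: int, offset: int, keeps_bit1: bool) -> int:
    """Address used by a pc-relative LDR in the instruction reached by jumping to `target`.

    keeps_bit1=True models the ARM7TDMI quirk described above, where the
    supposedly ignored bit still leaks into the PC value used by the load.
    keeps_bit1=False models the fixed behaviour.
    """
    base = target if keeps_bit1 else arm_fetch_address(target)
    return base + 8 + offset  # PC reads as instruction address + 8 in ARM mode


quirky = pc_relative_load_address(0x02000402, 0x10, keeps_bit1=True)
fixed = pc_relative_load_address(0x02000402, 0x10, keeps_bit1=False)
```

Same code executes either way (`arm_fetch_address(0x02000402)` is `0x02000400`), but `quirky` lands 2 bytes past `fixed`.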

@tindall For the name alone, I'm going with EIEIO - Enforce In-order Execution of I/O. It's what IBM Power architecture calls a memory barrier.

@tindall 6809's BRN - Branch Never. Effectively an elaborate NOP.

Pretty much every instruction on the 1802 is a why moment. It takes two instructions to transfer the whole accumulator. And short branches aren't relative, but have to be inside the same 8-bit page. No wonder they sent 1802s into space, maybe to get rid of them

@tindall I like hardware random number generator instructions, because they sure are a handy thing to have lying around for games. I seem to remember the Atari 8-bit having one lying around, although it was in a peripheral, not the 6502. Even x86 these days has

@tindall If names are just cause for consideration, the Motorola 6809 processor has a sign-extension instruction called SEX.

Speaking of SEX, the 1802A from RCA has one too, but it is used to SEt the X register.

Come to think of it, the entire 1802A instruction repertoire is just plain bizarro.

@tindall Just a couple of fun names, from the Burroughs processors:

HEYU: send interrupt to another processor
WHOI: get current processor id

@tindall A thing I like that has happened several times: there's an operation that programmers would ideally like to be able to do in one instruction, but it's complicated to implement in hardware and CPUs of a particular generation can't fit all of the work into one instruction.

but they _can_ have an instruction that will do _part_ of the job

so there will be a library routine do_the_thing that is just N repeats of the do_part_of_the_thing instruction plus possibly some finishing touches

in 1986 with SPARCv7, "the thing" was multiplication (see e.g.;a=b)

in 2001, IA64 did it with division (frcpa "compute reciprocal approximation" followed by a series of fused multiply-add operations that basically use Newton's method to fix up the errors)
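(a rough sketch of that shape, assuming positive finite inputs: `reciprocal_approx` is a made-up stand-in for the hardware approximation step, here just a linear estimate, and the loop is plain Newton refinement. real IA64 sequences use fused multiply-adds and are arranged to give correctly rounded results, which this does not attempt.)

```python
import math


def reciprocal_approx(d: float) -> float:
    """Crude 1/d seed, standing in for a hardware approximation instruction."""
    m, e = math.frexp(d)               # d = m * 2**e with 0.5 <= m < 1
    seed = 48 / 17 - (32 / 17) * m     # linear estimate of 1/m, relative error at most 1/17
    return math.ldexp(seed, -e)        # undo the scaling: 1/d = (1/m) * 2**-e


def divide(a: float, b: float, rounds: int = 4) -> float:
    """Approximate a / b as a * (1/b), refining the seed with Newton's method."""
    r = reciprocal_approx(b)
    for _ in range(rounds):
        # Newton step for f(r) = 1/r - b; each round roughly squares the relative error.
        r = r + r * (1.0 - b * r)
    return a * r
```

four rounds is enough here because the error goes roughly 1/17, then about 3e-3, 1e-5, 1e-10, and then below double precision.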

nowadays x86 and ARM both have "one round of AES encrypt/decrypt" and "part of a SHA2 hash computation" instructions

a big part of what i like about this is that the concept keeps coming back, but with hairier and hairier operations, as we cram more transistors onto chips but also keep thinking of more complicated things for them to do
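(for the curious, the "library routine that is N repeats of a do-part-of-the-thing instruction" pattern looks roughly like this for multiplication. it's a generic shift-and-add step with made-up names `mul_step` and `multiply_u32`; SPARC's actual multiply-step instruction, MULScc, differs in the details.)

```python
def mul_step(acc: int, multiplicand: int, multiplier: int):
    """One partial step: conditionally add, then shift both operands."""
    if multiplier & 1:
        acc += multiplicand
    return acc, multiplicand << 1, multiplier >> 1


def multiply_u32(a: int, b: int) -> int:
    """Unsigned 32x32 -> 64-bit multiply built from 32 repeats of the step."""
    acc, m, n = 0, a & 0xFFFFFFFF, b & 0xFFFFFFFF
    for _ in range(32):
        acc, m, n = mul_step(acc, m, n)
    return acc
```

the hardware only has to do the cheap step; the loop lives in software.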
