diff --git a/Makefile b/Makefile
index 33a705b..ff0be70 100644
--- a/Makefile
+++ b/Makefile
@@ -6,6 +6,7 @@ out/feeds.html \
out/our.html \
out/tui.html \
out/python_recursion.html \
+out/m8trix.html \
# posts end
out/index.html: src/index.sh src/feed.ass
diff --git a/src/feed.ass b/src/feed.ass
index c058348..bbd578e 100644
--- a/src/feed.ass
+++ b/src/feed.ass
@@ -1,5 +1,6 @@
# actually simple syndication
# https://tilde.town/~dzwdz/ass/
+2024-07-31 https://tilde.town/~dzwdz/blog/m8trix.html Dissecting m8trix
2023-08-19 https://tilde.town/~dzwdz/blog/tui.html On TUIs
2023-07-23 https://tilde.town/~dzwdz/blog/our.html /town/our, a tildebrained irc bot
2023-05-25 https://tilde.town/~dzwdz/blog/feeds.html Linear feeds are a dark pattern
diff --git a/src/m8trix.md b/src/m8trix.md
new file mode 100644
index 0000000..d8b0cd9
--- /dev/null
+++ b/src/m8trix.md
@@ -0,0 +1,363 @@
+title: Dissecting m8trix
+date: 2024-07-31
+m8trix is one of my favorite demos.
+It packs a pretty cool Matrix-style effect in only 8 bytes:
+<summary>animated gif (epilepsy warning)</summary>
+<img style="width: 100%" src="//tilde.town/~dzwdz/m8trix3.gif" />
+The author even provided the source with some comments:
+org 100h
+S: les bx,[si] ; sets ES to the screen, assume si = 0x100
+ ; 0x101 is SBB AL,9F and changes the char
+ ; without CR flag, there would be
+ ; no animation ;)
+lahf ; gets 0x02 (green) in the first run
+ ; afterwards, it is not called again
+ ; because of alignment ;)
+stosw ; print the green char ...
+ ; (is also 0xAB9F and works as segment)
+inc di ; and skip one row
+inc di ;
+jmp short S+1 ; repeat on 0x101
+...yeah, I didn't really get it at first either.
+Let's try to actually understand how it works
+(and learn some stuff about DOS along the way).
+Note that **I'll be using hexadecimal numbers "by default"** (without 0x)
+throughout this article to be consistent with DEBUG's output.
+The only tool I'll be using on DOS's side will be DEBUG.
+It's a small debugger that ships with MS-DOS.
+I've personally used the FreeDOS version under DOSBox, as that's what I had handy.
+There's built-in help if you type in `?`. You can also check out
+[this more in-depth guide](https://montcs.bloomu.edu/Information/LowLevel/DOS-Debug.html)
+or
+[this video of someone using it to assemble new binaries](https://www.youtube.com/watch?v=zc-W8xq7L5Q).
+There's a small issue, though.
+m8trix doesn't actually work as-is under DEBUG,
+for reasons I'll explain later.
+## a bad explanation of segmentation
+If you're a bit rusty on how real mode segmentation works, then here's a quick reminder.
+There are a few 16-bit segment registers (`CS`, `DS`, `SS`, `ES`).
+When you reference memory in real mode you always[^ithink] use one of those registers,
+even if it's implicit.
+If you reference `ES:BX`, the real address this maps to is computed as `ES * 0x10 + BX`.
+This means that there are multiple ways to reference one physical memory location
+(even if that is only slightly relevant here).
+As another example, `B800:1234` points to `B9234`.
+[^ithink]: At least I think so, but I'm not sure.
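Here's a minimal Python sketch of that address computation. The `linear` helper is a made-up name for illustration, and the 20-bit wrap-around is an assumption that matches the original 8086. Python needs the `0x` prefix that the rest of this article leaves out.

```python
def linear(segment, offset):
    """Physical address of segment:offset in real mode (hypothetical helper)."""
    # Shift the segment left by one hex digit and add the offset.
    # The mask models a 20-bit address bus (assumption: 8086-style wrap).
    return ((segment << 4) + offset) & 0xFFFFF


# The example from the text.
print(f"B800:1234 -> {linear(0xB800, 0x1234):05X}")

# Two different segment:offset pairs can name the same physical byte.
print(linear(0xB800, 0x0010) == linear(0xB801, 0x0000))
```

The second line printed is `True`, which is the "multiple ways to reference one location" point in action.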
+## the first look
+My comments are prefixed with a semicolon.
+As mentioned, all numbers shown are in hexadecimal.
+<pre><code>C:\M8TRIX>debug M8TRIX.COM
+-U <i>; disassemble the beginning of the program</i>
+073D:0100 C41C <a href="//www.felixcloutier.com/x86/lds:les:lfs:lgs:lss">LES</a> BX,[SI]
+073D:0102 9F <a href="//www.felixcloutier.com/x86/lahf">LAHF</a>
+073D:0103 AB <a href="//www.felixcloutier.com/x86/stos:stosb:stosw:stosd:stosq">STOSW</a>
+073D:0104 47 <a href="//www.felixcloutier.com/x86/inc">INC</a> DI
+073D:0105 47 <a href="//www.felixcloutier.com/x86/inc">INC</a> DI
+073D:0106 EBF9 <a href="//www.felixcloutier.com/x86/jmp">JMP</a> 0101
+-U 101 <i>; disassemble the loop body</i>
+073D:0101 1C9F <a href="//www.felixcloutier.com/x86/sbb">SBB</a> AL,9F
+073D:0103 AB <a href="//www.felixcloutier.com/x86/stos:stosb:stosw:stosd:stosq">STOSW</a>
+073D:0104 47 <a href="//www.felixcloutier.com/x86/inc">INC</a> DI
+073D:0105 47 <a href="//www.felixcloutier.com/x86/inc">INC</a> DI
+073D:0106 EBF9 <a href="//www.felixcloutier.com/x86/jmp">JMP</a> 0101
+-R <i>; look at the registers</i>
+AX=FFFF BX=0000 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
+DS=073D ES=073D SS=073D CS=073D IP=0100 NV UP EI PL ZR NA PE NC
+073D:0100 C41C LES BX,[SI] DS:0000=20CD
+Let's step through this.
+### LES BX,[SI]
+<pre><code>AX=FFFF BX=0000 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
+DS=073D ES=073D SS=073D CS=073D IP=0100 NV UP EI PL ZR NA PE NC
+073D:0100 C41C LES BX,[SI] DS:0000=20CD
+-T <i>; single step and show register state</i>
+AX=FFFF BX=<b>20CD</b> CX=0008 DX=0000 SP=FFFE BP=0000 SI=0000 DI=0000
+DS=073D ES=<b>9FFF</b> SS=073D CS=073D IP=0102 NV UP EI PL ZR NA PE NC
+`LES` loads a far pointer from memory.
+The first two bytes of `[SI]` will be loaded into `BX`,
+and the next two bytes will be loaded into `ES`.
+We're implicitly using the `DS` segment here, which is the segment DOS loaded our program into.
+To be more exact --
+our program was loaded into `DS:0100`,
+whereas `DS:0000` (which `[SI]` points at) contains the
+[Program Segment Prefix](https://en.wikipedia.org/wiki/Program_Segment_Prefix).
+Let's take a look at it:
+<pre><code>-d 0000
+073D:0000 CD 20 FF 9F 00 EA FF FF-AD DE BD 1D 94 01 00 00 . ..............
+-u 0000
+073D:0000 CD20 INT 20
+073D:0002 FF9F00EA CALL FAR [BX+EA00]
+The first two bytes always contain `INT 20`, the instruction that quits your program.
+This means that you can quit your program by jumping to `CS:0000` (`CS` = `DS` = `SS` here).
+DOS also ensures that the word on top of the stack is `0000`, so you can quit with a `RET`.
+It also means that `BX` will always be set to `20CD`, but we don't really care about that.
+The next two bytes hold the segment of the first byte past the memory allocated to the program.
+A .COM program is handed all available memory, so by loading them into `ES` we make it point to the top of conventional memory.
+On most systems that will be `9FFF`.
+This is very convenient, as the
+[mode 13](https://en.wikipedia.org/wiki/Mode_13h)
+framebuffer begins at `A0000`, or `9FFF:0010`.
+This is a
+[well known sizecoding trick](http://www.sizecoding.org/wiki/General_Coding_Tricks#A_smaller_way_to_point_to_Mode_13.27s_screen_segment).
+...except mode 13 is a graphics mode.
+We're in mode 3[^mode3],
+a text mode,
+and the text buffer is located at `B8000` (segment `B800`),
+completely out of reach of `ES`.
+[^mode3]: Run [`MOV AH, 0F; INT 10`](https://en.wikipedia.org/wiki/INT_10H) and look at the registers. `AL` is the current mode.
+Well, DEBUG fooled us.
+When you start a program under DOS, `SI=0100`.
+However, for whatever reason, DEBUG zeroes it out instead.
+You can fix it by running `RSI 0100`[^rsi] before the first instruction.
+This is also why the page I've linked to uses `[BX]`, as you can count on it actually being zero.
+[^rsi]: No, `RSI` doesn't stand for the 64-bit register. `R` is the register command, which accepts `SI` as the argument.
+But let's get back to m8trix.
+If `SI=0100`, then `[SI]` points to the beginning of our program!
+-RSI 0100
+AX=FFFF BX=0000 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0000
+DS=073D ES=073D SS=073D CS=073D IP=0100 NV UP EI PL ZR NA PE NC
+073D:0100 C41C LES BX,[SI] DS:0100=1CC4
+AX=FFFF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0000
+DS=073D ES=AB9F SS=073D CS=073D IP=0102 NV UP EI PL ZR NA PE NC
+073D:0102 9F LAHF
+-U 100
+073D:0100 C41C LES BX,[SI]
+073D:0102 9F LAHF
+073D:0103 AB STOSW
+As you can see, this leaves `BX=1CC4` (the `LES` instruction itself) and `ES=AB9F`.
+`ES` now spans `AB9F0`-`BB9EF`, which includes the entire text buffer!
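To double-check that, here's a small Python sketch that decodes the far pointer `LES` reads from the program's own bytes and works out which physical addresses `ES` can reach. The byte string is copied from the disassembly above. The variable names are mine, and the 80x25 page size is the usual mode 3 layout, assumed rather than measured.

```python
import struct

program = bytes.fromhex("C41C9FAB4747EBF9")  # the 8 bytes of m8trix

# LES reads a far pointer: a little-endian offset word, then a segment word.
bx, es = struct.unpack_from("<HH", program, 0)

window_start = es << 4              # first physical address reachable via ES
window_end = window_start + 0xFFFF  # last one, since DI is 16 bits wide

TEXT_START = 0xB8000                # mode 3 text buffer
TEXT_SIZE = 80 * 25 * 2             # one page: 80x25 cells, 2 bytes per cell

print(f"BX={bx:04X} ES={es:04X}")
print(f"ES window: {window_start:05X}-{window_end:05X}")
print(f"top-left cell is at ES:{TEXT_START - window_start:04X}")
```

So the top-left character of the screen is the word at `ES:C610`, and the whole visible page fits well before `DI` wraps around.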
+### LAHF
+<pre><code>AX=FFFF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0000
+DS=073D ES=AB9F SS=073D CS=073D IP=0102 NV UP EI PL ZR NA PE NC
+073D:0102 9F LAHF
+AX=<b>46</b>FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0000
+DS=073D ES=AB9F SS=073D CS=073D IP=0103 NV UP EI PL ZR NA PE NC
+`LAHF` is pretty straightforward: it loads the low byte of `FLAGS` into `AH`.
+Except, once again, DEBUG doesn't set the `FLAGS` register the way DOS does.
+If we were to run m8trix
+[outside of DEBUG](https://www.fysnet.net/yourhelp.htm),
+the low byte of `FLAGS` would be `02`, and thus this instruction would set `AH=02`.
+This can be fixed in the debugger by running `RAX 02FF`.
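To see where DEBUG's `46` and the expected `02` come from, here's a small sketch that builds the byte `LAHF` copies into `AH`. The bit layout is the standard x86 one. The `lahf` helper and the constant names are made up for illustration.

```python
# Bit positions in the low byte of FLAGS, which is what LAHF copies into AH.
CF, PF, AF, ZF, SF = 0x01, 0x04, 0x10, 0x40, 0x80
ALWAYS_SET = 0x02  # bit 1 is reserved and always reads as 1


def lahf(*flags):
    """Value LAHF would put in AH for the given set flags (hypothetical helper)."""
    value = ALWAYS_SET
    for flag in flags:
        value |= flag
    return value


print(f"{lahf(ZF, PF):02X}")  # DEBUG's state: ZR and PE are set
print(f"{lahf():02X}")        # no status flags set
```

With `ZR` and `PE` set, as in the register dumps above, the byte is `46`. With no status flags set only the reserved bit remains, which is the `02` the demo relies on.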
+### STOSW
+<pre><code>-rax 02FF
+AX=<b>02FF</b> BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=<b>0000</b>
+DS=073D ES=<b>AB9F</b> SS=073D CS=073D IP=0103 NV UP EI PL ZR NA PE NC
+073D:0103 AB STOSW
+AX=02FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=<b>0002</b>
+DS=073D ES=AB9F SS=073D CS=073D IP=0104 NV UP EI PL ZR NA PE NC
+`STOSW` -- "Store (word) string" -- is a bit more complex.
+It writes the word in `AX` to `ES:DI`, and increments[^df] `DI` by two --
+the number of bytes written.
+This instruction will be run over and over again, with `DI` taking on
+every fourth value and wrapping around every once in a while,
+overwriting every other word in `ES` -- including the text buffer -- over and over again.
+Each character in the text buffer is represented by a word,
+so each `STOSW` writes a complete character to the screen.
+`AH=02` sets the color to dark green,
+and `AL` (which changes each iteration) chooses the character.
+[^df]: If the direction flag was set, it would instead decrement it.
+### skipping a column, misaligned jump
+<pre><code>AX=02FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0002
+DS=073D ES=AB9F SS=073D CS=073D IP=0104 NV UP EI PL ZR NA PE NC
+073D:0104 47 INC DI
+AX=02FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=<b>0003</b>
+DS=073D ES=AB9F SS=073D CS=073D IP=0105 NV UP EI PL NZ NA PE NC
+073D:0105 47 INC DI
+AX=02FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0004
+DS=073D ES=AB9F SS=073D CS=073D IP=0106 NV UP EI PL NZ NA PO NC
+073D:0106 EBF9 JMP 0101
+AX=02FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0004
+DS=073D ES=AB9F SS=073D CS=073D IP=0101 NV UP EI PL NZ NA PO NC
+073D:0101 1C9F SBB AL,9F
+We don't want the columns to be packed too tightly together,
+so we skip every other character by adding two bytes to `DI`.
+We then jump to `0101`, uncovering a hidden `SBB`.
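Here's a quick sketch of how `DI` moves through the segment, assuming the loop body shown above (`STOSW`, two `INC DI`, then the jump). The `text_cell` helper is a made-up name. `DI` starts at 0 because that's what DEBUG shows, and the starting value doesn't change the count.

```python
def text_cell(char, attribute):
    """Word STOSW writes for one text-mode cell: attribute high, character low."""
    return (attribute << 8) | char


di = 0
iterations = 0
while True:
    di = (di + 2) & 0xFFFF  # STOSW stores AX and advances DI by 2
    di = (di + 2) & 0xFFFF  # INC DI twice skips the next cell
    iterations += 1
    if di == 0:             # DI wrapped around to where it started
        break

print(f"AX for a green FF character: {text_cell(0xFF, 0x02):04X}")
print(f"iterations per pass over the segment: {iterations:X}")
```

Each trip around the loop moves `DI` forward by 4, so one full pass over the 64K segment takes `4000` (hex) iterations.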
+### misaligned jump, SBB
+<pre><code>AX=02FF BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0004
+DS=073D ES=AB9F SS=073D CS=073D IP=0101 NV UP EI PL NZ NA PO NC
+073D:0101 1C9F SBB AL,9F
+AX=02<b>60</b> BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0004
+DS=073D ES=AB9F SS=073D CS=073D IP=0103 NV UP EI PL NZ NA PE NC
+This is the last instruction, and it's the one that modifies `AL` to animate the character.
+It subtracts `9F` from `AL` with borrow, which is pretty much the grade-school approach.
+That is --
+if it underflows,
+it will "borrow" a bit from the next byte by setting the carry flag.
+The next `SBB` will see that the carry flag is set,
+subtract an additional `1`,
+and unset the carry flag (unless it also underflowed).
+Let's see that in practice:
+<pre><code>-rax 028F
+AX=02<b>8F</b> BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0008
+DS=073D ES=AB9F SS=073D CS=073D IP=0101 NV UP EI PL NZ NA PO <b>NC</b>
+073D:0101 1C9F SBB AL,9F
+AX=02<b>F0</b> BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0008
+DS=073D ES=AB9F SS=073D CS=073D IP=0103 NV UP EI NG NZ NA PE <b>CY</b>
+-rip 0101 <i>; i don't care about the rest of the loop, just run the SBB again</i>
+AX=02<b>50</b> BX=1CC4 CX=0008 DX=0000 SP=FFFE BP=0000 SI=0100 DI=0008
+DS=073D ES=AB9F SS=073D CS=073D IP=0103 NV UP EI PL NZ AC PE <b>NC</b>
+Notice how the second `SBB` subtracted `A0` instead of `9F` because of the carry flag.
+Why does that matter?
+Let's imagine this was a regular `SUB` instead, without a borrow.
+`9F` is odd (coprime to `100`),
+so it would take `100` iterations for `AL` to loop around
+(remember, we're working with hexadecimal here).
+The loop advances `DI` by `4` each time, so it runs for `10000/4=4000` iterations before `DI` repeats,
+and `4000` is divisible by `100`,
+so each pass would have the exact same `AL` values for each character.
+Instead of an animation we'd get a much less impressive static screen.
+With the borrow in play, `AL` repeats every `55` (decimal 85) `SBB` calls,
+which is coprime to `100`,
+so the `AL` values will differ from pass to pass.
+There's probably a way to determine the period by hand but I just used Python.
+Not all operands work for this, but `9F` seems to be one of the good ones.
+To quote the author,
+"without CR flag, there would be no animation ;)".
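For the curious, here's roughly what such a Python check can look like. This is a sketch, not the script used for this article: it models only `AL` and the carry flag, and relies on the fact that `STOSW`, `INC` and `JMP` leave the carry flag alone. All function names are made up.

```python
def sbb(al, cf, operand=0x9F):
    """One SBB AL,operand: returns the new (AL, carry flag)."""
    result = al - operand - cf
    return result & 0xFF, int(result < 0)


def sub(al, cf, operand=0x9F):
    """One SUB AL,operand: the incoming carry is ignored."""
    return sbb(al, 0, operand)


def period(step, al=0xFF, cf=0):
    """Length of the cycle that the (AL, CF) state eventually falls into."""
    seen = {}
    state, n = (al, cf), 0
    while state not in seen:
        seen[state] = n
        state = step(*state)
        n += 1
    return n - seen[state]


print("SUB period:", period(sub))  # in decimal
print("SBB period:", period(sbb))  # in decimal
```

This prints 256 for `SUB` and 85 for `SBB`. A possible by-hand explanation: feeding the borrow back in makes the loop behave like repeatedly adding `60` modulo `FF` instead of modulo `100`, and `FF / gcd(60, FF) = FF / 3 = 55`.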
+### ending remarks
+I think I've explained every aspect of how m8trix works by now.
+I don't think I need to tell you how brilliant it is.
+Notice how the third byte has three different meanings!
+At first it's read as the low byte of the segment loaded into `ES`,
+then it's executed as the `LAHF` instruction,
+and then it's the operand for the `SBB`.
+`STOSW` is not only the perfect instruction for writing characters in text mode,
+its opcode also serves as the high byte of the segment that you need to write those
+characters in the first place.
+Everything fits together so nicely :)
+## m7trix
+Soon after m8trix was published,
+several people tried coming up with ideas to shrink it down even more.
+What follows is the final version HellMood published:
+<pre><code>C:\M8TRIX>debug M7TRIX.COM
+073D:0100 C41C <a href="//www.felixcloutier.com/x86/lds:les:lfs:lgs:lss">LES</a> BX,[SI]
+073D:0102 9F <a href="//www.felixcloutier.com/x86/lahf">LAHF</a>
+073D:0103 AB <a href="//www.felixcloutier.com/x86/stos:stosb:stosw:stosd:stosq">STOSW</a>
+073D:0104 91 <a href="//www.felixcloutier.com/x86/xchg">XCHG</a> AX,CX
+073D:0105 EBFA <a href="//www.felixcloutier.com/x86/jmp">JMP</a> 0101
+-U 101
+073D:0101 1C9F <a href="//www.felixcloutier.com/x86/sbb">SBB</a> AL,9F
+073D:0103 AB <a href="//www.felixcloutier.com/x86/stos:stosb:stosw:stosd:stosq">STOSW</a>
+073D:0104 91 <a href="//www.felixcloutier.com/x86/xchg">XCHG</a> AX,CX
+073D:0105 EBFA <a href="//www.felixcloutier.com/x86/jmp">JMP</a> 0101
+Not only is this version smaller, it also looks better, as it clears the screen!
+It's also simple enough that I won't bother tracing through it again.
+In short -- instead of skipping over every other column,
+we swap `AX` and `CX` back and forth.
+Both are running the same character animation, but, as `CH=00`,
+every other column is rendered as black on black,
+so the characters are invisible.
+This takes care of both skipping columns AND clearing the screen.
+The character cycle is apparently[^apparently] different
+because the carry flag gets reused between odd and even columns,
+but the period still works out to be 85 --
+which I find interesting but I don't really feel like researching why that is.
+[^apparently]: The Python script I'm using for testing says so, but I can't really tell if that's true by just looking at the output.
+## bonus: simplified version
+This is a slightly modified version
+that works under DEBUG and doesn't use misaligned jumps.
+It's easy to experiment with
+as you can just load it into DEBUG,
+use the assembler to change a single instruction,
+and see what happens.
+073D:0100 BB9FAB MOV BX,AB9F
+073D:0103 8EC3 MOV ES,BX
+073D:0105 B402 MOV AH,02
+073D:0107 AB STOSW
+073D:0108 47 INC DI
+073D:0109 47 INC DI
+073D:010A 1C9F SBB AL,9F
+073D:010C EBF9 JMP 0107