z680k: The Tricky Bits
======================

For those following along at home, here's some explanation of the
tricky bits of z680k.

Flags
-----

Computing flags is hard, and can take a long time.  So I avoid doing
it whenever possible.  Flag computation is usually done only when an
instruction asks for it, and then a minimum amount of work is done in
calculating the requested flag.

After every instruction that can influence flags, z680k notes down
what has changed.  It records this information in one of several ways:

1. Simulated F register

   The simplest is the simulated F register, which is composed of
   `flag_byte` and `flag_valid`.  `flag_valid` is a mask indicating
   which bits of `flag_byte` currently hold valid F register values.
   This storage may be fully, partially, or not at all valid.  Where
   it is valid, it is considered the most authoritative source.

   That is, if a particular bit is set in `flag_valid`, then the
   corresponding bit of `flag_byte` is the correct value for that
   flag.

2. Saved 68k Condition Code Register

   After all operations that affect the Sign, Zero, Parity/oVerflow
   (in the oVerflow mode), or Carry flags, the 68k condition code
   register is saved in `f_host_ccr`.  As necessary, this is looked up
   in `lut_ccr` for a mapping to Z80 flags.  The validity mask of this
   table is at most `11000101`: Sign (68k: Negative), Zero,
   Parity/oVerflow (68k: oVerflow), and Carry.

3. Saved Operands

   After arithmetic operations that affect the half-carry (H) flag,
   the operands are saved in `f_tmp_src_b` and `f_tmp_dst_b`.

   Instructions that affect the half-carry flag and operate on words
   use `f_tmp_src_w` and `f_tmp_dst_w` instead.

4. Miscellany

   Parity may be recorded immediately, though I haven't written that
   part yet.  `f_tmp_p_type` records whether the P/V flag holds parity
   or overflow, and whether it has been calculated yet.  Parity is
   looked up in `lut_parity`, a table that was stolen^Wborrowed from
   some other Z80 emulator.  No malice intended; I simply forget which
   it was.

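To make the lazy scheme above concrete, here is a minimal C model of it.
This is only a sketch, not z680k's actual 68k code.  The variable names
come from the text, but their types, the 32-entry size of `lut_ccr`, and
the helpers `build_luts`, `set_flags`, and `get_flags` are assumptions
made for illustration.  The half-carry step also assumes the saved
operands came from an 8-bit ADD, and P/V is treated as being in overflow
mode.

```c
#include <assert.h>
#include <stdint.h>

/* Z80 F register bits and 68k CCR bits (both standard layouts). */
enum { Z80_C = 0x01, Z80_PV = 0x04, Z80_H = 0x10, Z80_Z = 0x40, Z80_S = 0x80 };
enum { CCR_C = 0x01, CCR_V = 0x02, CCR_Z = 0x04, CCR_N = 0x08 };
enum { CCR_MASK = 0xC5 };  /* S, Z, P/V, C: the most lut_ccr can supply */

/* State named after the variables in the text; the types are guesses. */
static uint8_t flag_byte, flag_valid;
static uint8_t f_host_ccr;
static uint8_t f_tmp_src_b, f_tmp_dst_b;
static uint8_t lut_ccr[32];      /* indexed by the low five CCR bits */
static uint8_t lut_parity[256];  /* Z80_PV where the byte has even parity */

/* Hypothetical one-time table setup. */
static void build_luts(void) {
    for (int ccr = 0; ccr < 32; ccr++) {
        uint8_t f = 0;
        if (ccr & CCR_N) f |= Z80_S;
        if (ccr & CCR_Z) f |= Z80_Z;
        if (ccr & CCR_V) f |= Z80_PV;
        if (ccr & CCR_C) f |= Z80_C;
        lut_ccr[ccr] = f;
    }
    for (int v = 0; v < 256; v++) {
        int ones = 0;
        for (int bit = 0; bit < 8; bit++) ones += (v >> bit) & 1;
        lut_parity[v] = (ones % 2 == 0) ? Z80_PV : 0;
    }
}

/* Store freshly computed bits and mark them valid. */
static void set_flags(uint8_t mask, uint8_t value) {
    flag_byte = (uint8_t)((flag_byte & ~mask) | (value & mask));
    flag_valid |= mask;
}

/* Hypothetical accessor: resolve only the flags the caller asks for. */
static uint8_t get_flags(uint8_t wanted) {
    uint8_t missing = (uint8_t)(wanted & ~flag_valid);
    assert((wanted & ~(CCR_MASK | Z80_H)) == 0);  /* sketch covers these only */
    if (missing & CCR_MASK)
        set_flags(missing & CCR_MASK, lut_ccr[f_host_ccr & 0x1F]);
    if (missing & Z80_H) {
        /* Assumes the saved operands came from an 8-bit ADD. */
        unsigned low = (f_tmp_dst_b & 0x0Fu) + (f_tmp_src_b & 0x0Fu);
        set_flags(Z80_H, (low & 0x10) ? Z80_H : 0);
    }
    return flag_byte & wanted;
}
```

An instruction routine in this model would clear the relevant bits of
`flag_valid` and stash the CCR or operands; a later conditional jump
would then call something like `get_flags(Z80_Z)` and pay only for the
one lookup it needs.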

Instruction Dispatch
--------------------

I'll be using [a technique from
Tezxas](http://tezxas.ticalc.org/technica.htm) to perform instruction
dispatch quickly.  It's the fastest I've seen, and deserves exposition
here.

I haven't yet worked it into the system; presently the instruction
fetch is at a fixed location which is jumped to after each instruction
routine is executed.  (This is just to make it easy to set a
breakpoint on every instruction fetch, so I can single-step through
emulated code.)

    01BB80: 1B 5E B1 10 MOVE.B (A6)+,($01BB86)
    01BB84: 4E E4 xx 04 JMP ($xx04,A5)

The Tezxas setup requires instruction routines to begin at 256-byte
intervals within a 64k long block.

The fetch-go routine is two instructions long, ending with a jump to a
16-bit displacement from address register A5.  On emulator
initialization, each of these displacements is initialized to 0x0004
and the MOVE targets are adjusted to point at the appropriate
locations.

The first instruction fetches the next byte to be executed and writes
it into the *second* least significant byte of the jump address
offset.  This has the effect of multiplying it by 256 and adding it to
the base address, but is much faster.

The second instruction takes this offset and jumps to it + A5,
yielding the start address of the next instruction's routine.

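The address arithmetic of the two instructions can be modelled in C as
below.  This is a sketch of the idea rather than the emulator's code:
the struct `fetch_go_t` and the function `fetch_go` are hypothetical
names, and addresses are plain integers.  One detail the model makes
visible is that the 68k sign-extends the 16-bit displacement, so opcodes
0x80 and up land *below* A5; presumably A5 points into the middle of the
64k block, but that is my inference, not something stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one fetch-go site. */
typedef struct {
    uint32_t a5;         /* base address used by the JMP */
    const uint8_t *a6;   /* emulated program counter */
    uint16_t disp;       /* displacement word inside the JMP, starts at 0x0004 */
} fetch_go_t;

/* Returns the address the JMP would transfer control to. */
static uint32_t fetch_go(fetch_go_t *s) {
    uint8_t opcode = *s->a6++;                               /* MOVE.B (A6)+,... */
    s->disp = (uint16_t)((opcode << 8) | (s->disp & 0xFF));  /* patch high byte only */

    /* The 68k sign-extends a 16-bit displacement before adding it. */
    int32_t offset = s->disp;
    if (offset & 0x8000) offset -= 0x10000;
    return (uint32_t)((int32_t)s->a5 + offset);
}
```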
After the emulator jumps away to its next instruction, the opcode is
left in the JMP target field; this is acceptable because it will be
overwritten next time the emulator runs this instruction.

The purpose of the extra offset of 4 is for interrupt handling.  On an
interrupt, the host's interrupt handler will subtract 4 from A5 and
return immediately.  When the next instruction fetch occurs, the jump
will go 4 bytes earlier, hitting a shim put in place to catch
interrupts.  The shim performs the interrupt function, restores A5,
and jumps back whence it came to continue with the next instruction.

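The interrupt trick can be simulated the same way.  In this sketch,
`host_interrupt`, `jump_target`, `run_shim`, and the `serviced` counter
are all hypothetical stand-ins, and the shim is assumed to sit in the 4
bytes immediately before each routine and to continue into that routine
once it has finished.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t a5;      /* dispatch base register */
static int serviced;     /* counts interrupts handled by the shim */

/* What the host's interrupt handler does: nudge A5 and return at once. */
static void host_interrupt(void) {
    a5 -= 4;
}

/* Address the fetch-go JMP lands on for a given opcode. */
static uint32_t jump_target(uint8_t opcode) {
    int32_t disp = (opcode << 8) | 0x04;     /* the $xx04 displacement */
    if (disp & 0x8000) disp -= 0x10000;      /* 68k sign extension */
    return (uint32_t)((int32_t)a5 + disp);
}

/* What the shim does when reached: service, restore A5, carry on. */
static uint32_t run_shim(uint32_t shim_addr) {
    serviced++;              /* stand-in for the real interrupt work */
    a5 += 4;                 /* undo the handler's adjustment */
    return shim_addr + 4;    /* continue at the routine proper */
}
```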
This ensures that an emulated instruction isn't suspended to handle an
interrupt, which is (1) disallowed by the Z80 hardware and (2) an easy
way to mess up registers.