[hatari-devel] m68k instruction decoder

Eero Tamminen eerot at users.berlios.de
Fri Aug 12 21:41:00 CEST 2011


Hi,

On Thursday 11 August 2011, Nicolas Pomarède wrote:
> I can see the purpose of this as a test to manipulate opcodes, but this
> seems over complicated to me to set breakpoint.

Err, it started from my unhappiness with Markus' code for disassembling
and the thought that I could try out what writing a table-based instruction
decoder would need.  Adding command line and breakpoint output to
the table generation Python code was kind of an afterthought.


> Also, it looks like "code" duplication, in the sense that all those
> opcodes are already defined in uae/winuae with more information
> regarding possible address modes and sizes.
>
> For example, it would be much too generic with "move" if you just break
> on "move" without being able to specify source/dest.
> 
> I think there are 2 cases when you want to add a breakpoint on an
> instruction :
>   - you have an example so you can set (pc).w =xxxx

Most of the instructions have different kinds of parameters encoded in them,
so unless the instruction doesn't have any (like RTD/RTE/RTR/RTS), or you
want just that specific version of the instruction, you also need to know
the generic form.


>   - you don't know the opcode, in that case I think it would be better
> to do it the other way : go from the text to the opcode
> 
> I mean, it would be interesting if the user could say "break when you
> encounter 'move.w d0,(a0)' at the disassembled address".
> 
> So, the user would enter some text to check "str1".
> On each instruction (when breakpoints are checked), internally call the
> disasm function to an internal char buffer "str2" (not stdin). And now
> the hard part : try to do a fuzzy match of "str1 = str2" (standardize
> space/tab, case-independent matching, ...). If both strings match, then
> breakpoint is reached.

Monst's instruction search uses a substring match on the disassembler
output.  The user just needs to give the substring in the current
disassembly format.

For example, one could search for "$14(A7" and match:
	move.l D0,$14(A7)
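
A minimal Python sketch of such a substring match, not Monst's or Hatari's
actual code.  The names normalize() and instruction_matches() are made up
for illustration, and folding case and whitespace is an assumption taken
from the fuzzy matching idea quoted above:

```python
def normalize(text):
    """Collapse whitespace runs (spaces/tabs) and fold case."""
    return " ".join(text.split()).lower()


def instruction_matches(needle, disasm_line):
    """Return True if `needle` occurs anywhere in the disassembled line."""
    return normalize(needle) in normalize(disasm_line)


# Hypothetical disassembler output line, tab separated
line = "move.l\tD0,$14(A7)"
print(instruction_matches("$14(A7", line))   # True
print(instruction_matches("$18(A7", line))   # False
```

A breakpoint check would call something like this on the disassembly of the
instruction at PC and stop when it returns True.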

(I would have already added this to Hatari debugger if I were able to
touch Markus' code without an overwhelming urge to fix its indentation.)


> Of course, the user would need to enter the instruction by respecting
> the disassembler usual output convention, especially regarding how to
> express register, address mode and things like that (for simple
> instruction like rts, rol, jmp, ... that don't have a variety of
> parameters, matching should be rather easy to achieve).

If you sometimes need to match e.g. all of the .b, .w and .l forms,
I could add a helper to the Python code that tells you how to mask given
parameters out of a given instruction, e.g. with syntax like this:
	<instruction> ?=<parameter type letters>

In the move case, it would look like this:
------
m68k-instructions: move

Instruction:
  move - 00ssyyymmmmmmxxx
Bits:
  - s: size (byte/word/long)
  - y: destination register
  - m: effective address mode
  - x: source register

m68k-instructions: move ?=s
    $cfff
------

-> would give you the value for masking out the move instruction's
size bits ($cfff).

You could then use:
	b (pc).w & $cfff = "$3080 & $cfff"
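
As a rough sketch of how such a mask could be derived from the bit pattern
string (field_mask() and breakpoint_hits() are hypothetical names, not
functions from the actual script):

```python
def field_mask(pattern, letters):
    """Return a 16-bit mask with the bits of the given field letters cleared.

    `pattern` is a 16-character string such as "00ssyyymmmmmmxxx",
    most significant bit first.
    """
    if len(pattern) != 16:
        raise ValueError("expected a 16-bit pattern")
    mask = 0
    for ch in pattern:
        # masked-out fields get a 0 bit, everything else a 1 bit
        mask = (mask << 1) | (0 if ch in letters else 1)
    return mask


def breakpoint_hits(opcode, example, mask):
    """Same comparison as: (pc).w & mask = example & mask"""
    return (opcode & mask) == (example & mask)


MOVE = "00ssyyymmmmmmxxx"
mask = field_mask(MOVE, "s")
print("$%04x" % mask)                        # $cfff
print(breakpoint_hits(0x1080, 0x3080, mask)) # move.b d0,(a0) -> True
print(breakpoint_hits(0x3081, 0x3080, mask)) # move.w d1,(a0) -> False
```

Note that with the size bits masked out, an opcode whose size bits are 00
(e.g. $0080, which is not a move) passes the same comparison, so the
breakpoint may need an extra condition to exclude it.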


Btw. Any suggestions for which letters to use to have separate ones for
source & destination register op-modes and effective address modes?

These are the letters I currently use:
    's': "size (byte/word/long)",
    'i': "dynamic, not immediate data",
    'x': "source register",
    'y': "destination register",
    'r': "data / address register",
    'v': "data value",
    'o': "op-mode",
    'm': "effective address mode",
    'a': "effective address register",
    't': "instruction type / direction",
    'c': "condition / count",
    'p': "displacement",
    'b': "breakpoint / trap vector",


> So breakpoint on complex opcodes would just be a matter of entering the
> text in a compatible way with the usual disasm output. Then all the
> disasm work would be done as it's already done today, if new opcodes
> (68030,...) are added, it's automatically available.

How many new instructions do the 680[2346]0 have compared to the 680[01]0?
Does ColdFire have new instructions?


> You can also adapt this to the dsp case, once you have a flexible enough
> fuzzy matching function, you don't need to know what cpu you're
> handling, it's just strings comparison and it would work also for dsp
> breakpoints.

Sure.


	- Eero


