I want to document my current search, mainly for myself. But since I am documenting it anyway, I might as well “make it open” – perhaps others can use it…
Yesterday I had a crash with the new “fault-resistant” Vectorblade. This means I reverted all those changes, and Vectorblade is now in the same state as about a week ago.
Reverting was more demanding than one would anticipate: I could not just go back to my old sources – I had already changed too much – so in the end it was easier to take all the “fault handling” stuff out manually.
To fit the fault handling in, I had to “free” quite a lot of memory – and I want to put that “new” memory to good use.
Since I am running out of options – I am going back to my NMI handling.
This time – because I have more space available – I chose to “do it right”.
a) Hardware:

I glued the damn switch to the card – and soldered everything “permanently” – the loose wires were – well, loose!
b) Software:
I wrote an NMI handler that can:
- display stack address
- display the bank where the NMI happened
- display all registers
- display the state of VIA
- browse/display memory and give out “hex” values of memory locations (button 1+2)
I had to keep it short due to memory restrictions – but it still took about 400 bytes.
This is the source:
                    direct   $ff
                    if       NMI_HANDLER = 1
                    struct   NMI_struct
                    ds       ORG_STACK, 2
                    ds       ORG_BANK, 1
                    ds       ORG_VIA_PB, 1
                    ds       ORG_VIA_PA, 1
                    ds       ORG_VIA_DB, 1
                    ds       ORG_VIA_DA, 1
                    ds       ORG_VIA_ACR, 1
                    ds       ORG_VIA_CNTL, 1
                    ds       ORG_VIA_IF, 1
                    ds       ORG_VIA_IE, 1
                    end struct
                    struct   NMI_Stack_struct
                    ds       ORG_CC, 1
                    ds       ORG_D, 2
                    ds       ORG_DP, 1
                    ds       ORG_X, 2
                    ds       ORG_Y, 2
                    ds       ORG_U, 2
                    ds       ORG_PC, 2
                    end struct
NMI_RAM             =        playershotobject_list
NMI_STACK           =        enemyobject_list_end
                    bss
                    org      NMI_RAM
nmi_struct          ds       NMI_struct
nmi_ypos            ds       1
nmi_xpos            ds       1
nmi_print_buffer    ds       7
nmi_adress_mon      ds       2
nmi_tmp             ds       1
                    code
NMI_HANDLER_FUNCTION
NMI_HANDLER_STACK_DUMMY
                    cmps     #NMI_HANDLER_STACK_DUMMY-12  ; minus stored regs length
                    beq      bouncy_bouncy
                    sts      nmi_struct + ORG_STACK       ; game_stack_ptr ;game stack save upon first entry
bouncy_bouncy
                    lds      #NMI_HANDLER_STACK_DUMMY
                    ldx      #60000
bounce_delay
                    leax     -1,x
                    bne      bounce_delay
                    ldy      #NMI_RAM
                    lda      VIA_DDR_b                    ; if $40 set than bit 1 switch is 0 (this is bit 0 of switch)
                    ldb      VIA_int_flags                ; if $80 set than bit 1 switch is 0 (this is bit 1 of switch)
                    sta      ORG_VIA_DB,y
                    stb      ORG_VIA_IF,y
; bank in "plain" sight
                    clr      ORG_BANK, y
                    bita     #$40
                    bne      noBit0                       ; if != 0, bit is set, than jump (bit 0 = 0)
                    inc      ORG_BANK, y                  ; bs switch bit 0 = 1
noBit0
                    bitb     #$80
                    bne      noBit1                       ; if != 0, bit is set, than jump (bit 1 = 0)
                    inc      ORG_BANK, y                  ; bs switch bit 1 = 1
                    inc      ORG_BANK, y
noBit1
; bit as in "bankswitch bit" 00, 01, 10, 11 (banks)
                    ldd      VIA_port_b
                    std      ORG_VIA_PB,y
                    lda      VIA_DDR_a
                    sta      ORG_VIA_DA,y
                    ldd      VIA_aux_cntl
                    std      ORG_VIA_ACR,y
                    lda      VIA_int_enable
                    sta      ORG_VIA_IE,y
                    lds      #NMI_STACK                   ; stack frame for nmi handler
                    ldd      #$c800
                    std      nmi_adress_mon
                    lda      #$30
                    sta      Vec_Text_Width
                    CLR      Vec_Music_Flag               ; no music is playing ->0 (is placed in rottis!
                    JSR      Init_Music_Buf               ; shadow regs
                    JSR      Do_Sound                     ; ROM function that does the sound playing, here used to clear all regs
                    jsr      Init_VIA
nmi_msg
                    JSR      Wait_Recal                   ; Vectrex BIOS recalibration
                    JSR      Intensity_5F                 ; Sets the i
                    JSR      Read_Btns
                    ldu      #nmi_output_start
                    ldy      #nmi_struct
                    bsr      outPutFrame
                    ldy      ORG_STACK,y
                    bsr      outPutFrame
                    lda      #16
                    sta      nmi_tmp
                    ldd      nmi_adress_mon
                    ldu      #nmi_print_buffer
                    jsr      dhex_to_uascii
                    ldd      #':'*256 + $80
                    std      ,u+
                    ldd      #$b080
                    std      nmi_ypos
                    ldu      #nmi_print_buffer
                    jsr      Print_Str_d
                    ldd      #$a080-16
                    std      nmi_ypos
                    ldy      nmi_adress_mon
monLoop
                    lda      ,y+
                    ldu      #nmi_print_buffer
                    bsr      ahex_to_uascii
                    ldd      #' '*256 + $80
                    std      ,u
                    ldd      nmi_ypos
                    addd     #16
                    std      nmi_ypos
                    ldu      #nmi_print_buffer
                    jsr      Print_Str_d
                    dec      nmi_tmp
                    bne      monLoop
                    lda      Vec_Btn_State
                    bita     #1
                    beq      noPlus_1
                    ldd      nmi_adress_mon
                    addd     #16
                    std      nmi_adress_mon
noPlus_1
                    lda      Vec_Btn_State
                    bita     #2
                    beq      noMinus_1
                    ldd      nmi_adress_mon
                    subd     #16
                    std      nmi_adress_mon
noMinus_1
                    jmp      nmi_msg
outPutFrame
outputContinue
                    lda      ,u+
                    beq      nmi_msg1_done
                    deca
                    beq      nmi_do_1byte
                    jsr      out_2
                    bra      outputContinue
nmi_do_1byte
                    jsr      out_1
                    bra      outputContinue
nmi_msg1_done
                    rts
AHEX_TOUASCII       macro
                    pshs     a
                    lsra
                    lsra
                    lsra
                    lsra
                    adda     #'0'
                    cmpa     #'9'
                    ble      ok1\?
                    adda     #('A'-'0'-10)
ok1\?
                    sta      ,u+
                    lda      ,s
                    anda     #$f
                    adda     #'0'
                    cmpa     #'9'
                    ble      ok2\?
                    adda     #('A'-'0'-10)
ok2\?
                    sta      ,u+
                    leas     1,s
                    endm
BHEX_TOUASCII       macro
                    pshs     b
                    lsrb
                    lsrb
                    lsrb
                    lsrb
                    addb     #'0'
                    cmpb     #'9'
                    ble      ok3\?
                    addb     #('A'-'0'-10)
ok3\?
                    stb      ,u+
                    ldb      ,s
                    andb     #$f
                    addb     #'0'
                    cmpb     #'9'
                    ble      ok4\?
                    addb     #('A'-'0'-10)
ok4\?
                    stb      ,u+
                    leas     1,s
                    endm
DHEX_TOUASCII       macro
                    AHEX_TOUASCII
                    BHEX_TOUASCII
                    endm
ahex_to_uascii
                    AHEX_TOUASCII
                    rts
out_1
                    ldd      ,u++
                    pshs     d
                    jsr      Print_Str_d
                    lda      ,u+
                    pshs     u
                    ldu      #nmi_print_buffer
                    lda      a,y
                    bsr      ahex_to_uascii
                    ldd      #' '*256 + $80
                    sta      ,u+
                    bra      entryOut1
out_2
                    ldd      ,u++
                    pshs     d
                    jsr      Print_Str_d
                    lda      ,u+
                    pshs     u
                    ldu      #nmi_print_buffer
                    ldd      a,y
                    bsr      dhex_to_uascii
                    ldd      #' '*256 + $80
entryOut1
                    std      ,u++
                    ldd      2,s
                    addd     #$30
                    ldu      #nmi_print_buffer
                    jsr      Print_Str_d
                    puls     u
                    puls     d,pc
dhex_to_uascii
                    DHEX_TOUASCII
                    rts
; format:
; XX byte count or exit 1,2 or 0
; y,x pos
; String
; pointer to data that is output (offset to Y reg)
nmi_output_start
                    db       2, $70, $a0, "S :", $80, ORG_STACK
                    db       1, $60, $a0, "BS:", $80, ORG_BANK
                    db       1, $40, $10, "PB:", $80, ORG_VIA_PB
                    db       1, $30, $10, "PA:", $80, ORG_VIA_PA
                    db       1, $20, $10, "DB:", $80, ORG_VIA_DB
                    db       1, $10, $10, "DA:", $80, ORG_VIA_DA
                    db       1, $00, $10, "AC:", $80, ORG_VIA_ACR
                    db       1, $f0, $10, "CN:", $80, ORG_VIA_CNTL
                    db       1, $e0, $10, "IF:", $80, ORG_VIA_IF
                    db       1, $d0, $10, "IE:", $80, ORG_VIA_IE
                    db       0
nmi_output_start2
                    db       1, $40, $a0, "CC:", $80, ORG_CC
                    db       2, $30, $a0, "D :", $80, ORG_D
                    db       1, $20, $a0, "DP:", $80, ORG_DP
                    db       2, $10, $a0, "X :", $80, ORG_X
                    db       2, $00, $a0, "Y :", $80, ORG_Y
                    db       2, $f0, $a0, "U :", $80, ORG_U
                    db       2, $e0, $a0, "PC:", $80, ORG_PC
                    db       0
                    endif
(The debounce code was Thomas’ idea – and it works well!)
I played VB for a couple of hours again today, to provoke the crash.
First crash that happened – the NMI did not respond. WHAT THE F***!
(corrupt NMI handler address?)
Then I had the game “reset” during play – this never happened before – new bug?
Then finally – I had a crash where the NMI worked!

The NMI was executed in Bank 3, with PC at $8429 – which, well, at first looks sort of “ok”.
All VIA registers look “normal”, or at least not suspicious.
AC = $98 is the default, CN = $CE means the beam can move. The data direction registers are ok (DB = $9f has bit 6 ($40) clear, which together with bit 7 of IF not being set means -> bank 3)… etc
What is slightly suspicious is Stack = $cac4:

The “location” seems to suggest that S is somewhere “in” an enemy object – that is not so strange, since within the behaviour routines I reference the enemy data with the stack pointer. What IS strange, though, is that I ALWAYS leave the stack pointer at position 4 of an enemy structure – in the above “enemyobjectc” – this should be $cabb.
The stack position is one, that should NOT happen within my program!
CORRECTION!
The stack is “ok” – at 1:30 in the night I did not take into account that the displayed stack pointer also INCLUDES the 12 bytes pushed by the NMI. Taking that into account, the stack pointer should actually read $cad0 ($cac4 + 12), which is the proper stack address for an entry to enemyObjectD!
But that would also suggest that we actually were able to “reach” enemyObjectD…
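The correction could also be done by the handler itself, so the displayed value is the stack pointer the game really had. A minimal sketch, assuming the `nmi_struct` and `ORG_STACK` names from the listing above; `NMI_FRAME_SIZE` and `nmi_game_stack` are made-up names, and the latter would need two more bytes of RAM:

```asm
; 6809 NMI pushes CC, A, B, DP (1 byte each) and X, Y, U, PC (2 bytes each)
NMI_FRAME_SIZE      =        12
                    ldd      nmi_struct + ORG_STACK       ; S as saved on NMI entry (registers already pushed)
                    addd     #NMI_FRAME_SIZE              ; undo the push, e.g. $cac4 + 12 = $cad0
                    std      nmi_game_stack               ; hypothetical variable: S before the NMI
```

The saved `ORG_STACK` value itself has to stay untouched, since the handler uses it to find the pushed registers for the second output frame.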
Since we are “obviously” near the enemy handling – I chose to investigate further.
Enemies are kept in memory with a “list” structure.
The head of the list is located at $c888 (enemylist_objects_head). Using the “browsing” of the NMI handler I investigated that, and it pointed to “enemyobjectc” – which is ok: the list addresses get garbled during execution, so the list does not start with object 1.
That enemyobject is located at $CACC – again I browsed to that location:

The last 4 bytes shown are the first 4 bytes of enemyobjectc.
This is the corresponding part of the enemyStructure:

So in the above image you see that enemyC has the position y=$6d and x=$91 – and that its “behaviour” starts at address $8429.
Wait – WHAT?
$8429 is NOT a behaviour routine of mine!

$8429 is right in the middle of a “shot”-“collision handling” routine.
And since registers X/Y are certainly not set to the right value – the four marked lines constitute an endless loop!
That endless loop is right where I pressed my NMI button. Somehow the “stack” and the RAM of my poor enemyC had been corrupted!
In conclusion:
- the crash still only occurs on the live system not the emulator
- the crash seems to have to do with memory (RAM) corruption
- those two taken together mean – I still do not have a clue!
What kind of RAM corruption can happen, that is not emulated?
Usually RAM corruption is a typical coding bug!
to be continued…
Stupid question… Have you tried using totally different hardware, to decide if it’s a hardware or software problem? Maybe it’s a faulty RAM? Also, not sure it’s useful, but have you considered utilising the NoICE debugger? The licence for 6809 is free. https://www.noicedebugger.com/index.html
You’d need a serial port, but you could use the one on a VecFever, if you can make that part of your test setup. Or add an FT232 or (better) FT245R to get an I/O port.
The problem occurs on at least 3 different Vectrex.
I haven’t tried any external debugger yet – I will consider the options, when I run out of my own :-)…
With a different cartridge and flashrom? Maybe the Zif socket is dodgy?
yep, different cards…
This has to be immensely frustrating for you. I really sympathize.
Any chance that the reason it works in the emulator is because the emulator isn’t 100% accurate with some obscure corner case? The sort of thing like pushing a 2-byte value at FFFF where it wraps around. Not that specific case but that kind of thing – something where the emulator takes a shortcut because ‘no-one would ever do that so it’s not worth checking for’ but eventually someone actually does do it…?
I don’t suppose you can get your hands on a logic analyser/data logger and record the full execution in a cyclic buffer using hardware external to the CPU? Then find a way to stop recording when the crash happens before the old trace is cycled out of the buffer? I can’t help but think that if you can’t replicate the fault in an emulator, then you’re going to need hardware support to finally track this down. The NMI button was a clever idea and a good start but I have a feeling this is really going to need something like a logic analyzer or a bus sniffer of some description. Maybe you could program a bare-metal PI to sniff the bus and log everything on bus clock transitions?
We’re all rooting for you!
G
So if the problem is that some data in RAM is being overwritten, it might be possible to come up with hardware support that can trap when that ram address is being written to. You would need some mechanism to signal when writes were legitimate and when they were unexpected – some sort of hardware semaphore that you would set before a legit write and unset afterwards. Is that at all feasible?
Do you have any free RAM? If you do, this might work: put unused bytes of ram between objects as a tombstone/land mine. If they are ever written to, you’ve experienced corruption and have a chance of either debugging it or – with hardware support – examining a trace of the most recent instruction execution.
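The land-mine idea could be sketched roughly as follows. This is only an illustration under assumptions: `CANARY`, `canary_0` and `canary_1` are made-up names, the two canary bytes are assumed to sit in spare RAM between objects, and the check is assumed to be called once per frame (for example right after Wait_Recal).

```asm
; arbitrary marker value (assumption)
CANARY              =        $a5
; call once at start-up
init_canaries
                    lda      #CANARY
                    sta      canary_0                     ; hypothetical spare byte between two objects
                    sta      canary_1                     ; hypothetical spare byte between two other objects
                    rts
; call once per frame
check_canaries
                    lda      canary_0
                    cmpa     #CANARY
                    bne      canary_tripped
                    lda      canary_1
                    cmpa     #CANARY
                    bne      canary_tripped
                    rts
canary_tripped
                    bra      canary_tripped               ; park here, so pressing the NMI button shows the state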
Aha! You can get that trace with Thomas’s help! Since the Vecfever is supplying the ROM data, have the Vecfever log to a cyclic buffer the address whenever an instruction fetch is made (ie a read from ROM within a certain range). It doesn’t have to be a huge buffer in the vecfever RAM, just large enough that you can see back to what caused the corruption at the point the corruption is detected. You could even do the corruption detection in software at some safe point (like on WaitRecal) and know that at most 30,000 cycles have been executed since the corruption happened. Then if Thomas has enough ram to store 30,000 addresses in a cyclic buffer, you can be sure that the history is sufficient.
Do you think that might work and could Thomas be able to support it?
Graham
Thx for your suggestions. I will keep that in the back of my mind. Although at the moment I have exactly 0 bytes RAM free.
If all else fails… I need to make room and try “desperate” measures, ehm, I mean – even more desperate measures…
I’ve just now suggested adding data logging to the Vextreme cart – even if not useful or in time for your project, it’ll be good to have in the future. https://github.com/technobly/vextreme/issues/51 – the logging part is definitely worth having, it’s the triggering mechanism that is problematic for using it to debug your program. If you can think of any other way of detecting the memory corruption…
No, no current brilliant ideas :-).
Other than providing a list of allowed values, or min/max…
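Such a min/max check could look roughly like this for the behaviour pointer of an enemy object. A sketch only: the offset `BEHAVIOUR` and the labels `behaviours_start` / `behaviours_end` are assumptions, it assumes all behaviour routines lie in one contiguous ROM range, and it assumes U points at the enemy object.

```asm
; assumed offset of the behaviour pointer inside an enemy object
BEHAVIOUR           =        2
; in: U -> enemy object
check_behaviour
                    ldx      BEHAVIOUR,u
                    cmpx     #behaviours_start            ; hypothetical label: first behaviour routine
                    blo      behaviour_corrupt            ; unsigned: pointer below the allowed range
                    cmpx     #behaviours_end              ; hypothetical label: just past the last behaviour routine
                    bhs      behaviour_corrupt            ; unsigned: pointer at or above the end of the range
                    rts
behaviour_corrupt
                    bra      behaviour_corrupt            ; park here, the NMI handler can then show U and X
```

A range check only catches pointers that leave the range; a value like $8429 would be caught only if the collision routine lies outside it. A list of allowed values would be stricter, at the cost of more ROM and cycles.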
How often does it crash? Does it crash in attract mode/self play mode? If you cut the whole thing down to fit into 32k and use a standard cart & ROM, would it still crash? I would always suspect custom hardware. Are you still using the spare I/O line? (PB6?) Maybe you could make a cart with a couple of latches to extend address space instead of using the spare I/O line, might make it a bit more robust.
Like I said, it seems to be not predictable. Sometimes after minutes, sometimes after a dozen hours. (Minutes very seldom!)
No, can’t cut it. There is no really custom hardware. It’s just a 256kb flash and 2 lines for bankswitching, PB6 and IRQ.
Have you only run this on your own carts or can you run it on a vecfever and has it shown the bug on both platforms?
If it only happens on your own cartridges, is this an area worth looking at?:
“The 6809 actually latches the address bus and R/W at a particular time after the falling edge of E. A real 6809 is combinatorial, but the latching minimizes the transition effect of one value to another. This actually isn’t guaranteed in the MC6809 datasheet, but some designs depend on it. (Depending on it is somewhat dangerous. You might very rarely see a problem, but during a transition you might have an instant of ‘write’ to an address other than what is intended – even with the latching.) The correct design strategy is to only depend on the Address bus, the Data bus, and the R/W signal (as well as signals such as BS and BA) when they’re guaranteed to be valid (if E and Q are both low, they’re in a state of transition).”
The crash happens on both versions!
Old VIAs depend on this behavior.
WDC VIA fixed it.
I recall there was a sound routine I was working on that worked OK in VIDE but not on real hardware. It corrupted RAM because I was ignorantly writing to some bits that were listed as being “ignored” but really were “undefined” and caused undefined behavior, which sounds similar to your issue here. Is there anywhere you are writing to a mapped memory location that uses fewer than 8 bits of a byte, and you are somehow writing more into it?
Hm – can’t think of any “dubious” regions, other than the ones where VIA and RAM are shadowed…
Any hints, of what that might have been?
I seem to remember that at one stage you sent me something that was not working 100% in Vide… wasn’t that Joystick related?
It affected the joystick but was from doing something stupid with a sound routine, I think? I have to dig, man, sorry.
No harm done. You did send it to me – I remember, I just also put it somewhere safe :-).
In case you’re not a member, there’s a guy posting in the Facebook group “6502 CPU Family” an in depth look at the 6522, a few logic analyzer screen dumps and commentary on the signals, e.g. PB6. I haven’t read it all yet, but looks quite interesting.
I am not a member… but heading over to it now – thx for the tip!
Is this relevant? tip #2 in http://forum.6502.org/viewtopic.php?p=2310 – analog problems with the interrupt pin…
Probably not… I have some timing buffer, when using the IRQ.
Something I found extremely helpful when I first wrote tailgunner was to record all inputs at the point where they were read by the game, and replay them with cycle-perfect accuracy on re-runs of the game. If it is not a truly random occurrence, but just one that is hard to duplicate, this should make it possible to duplicate. Unfortunately you will need some sort of hardware support to save those input values.
If you look at the recent posts in the Vectorblade Forum…
This is exactly what I am doing right now!
> If you look at the recent posts in the Vectorblade Forum…
> This is exactly what I am doing right now!
Ha! Yes, those posts were after I had read that page. I’m currently working backwards (on page 14 now, there’s a lot to read). Peer is right, run-length encoding of button changes is the most efficient storage. But you need to either store the entire game, or a save state plus all changes since the save-state.
One more off-the-wall idea… would there be any benefit in running on the Pitrex, with the version of the emulator that accesses the VIA? If it didn’t fail I don’t think that would tell you anything but if it did fail it would point strongly at the VIA.
Doubt it – emulation is not so good – it would not be comparable …
Last comment for today: is peepholing turned on? For example could a “ldd” followed by “sta” and “stb” be turning one of them into a CLR if the constant for ldd was 0x00nn ? Do you have enough rom space to compile with peepholing disabled?
For ASM, peepholing is never active.
Hi Malban, I have bolted an FT245R UART onto my Vectrex. It mostly works, but for reading incoming bytes it has an interrupt line that needs to be connected to the CPU, e.g. on NMI or IRQ, with a handler. Looking at the above code, I can’t make out where you set up the interrupt; it looks like it is only the code that runs when you trigger the interrupt. The Vectrex interrupt handlers are in ROM and jump to CBF2-CBFB, where there are 3 bytes for each interrupt. I’m guessing those 3 bytes are for a JMP with a handler address? I also looked for any interrupt handler setup code in the BIOS code and, apart from the hardware vectors, there isn’t any. Any chance you could share your interrupt handler setup code for the above?
Thanks in advance!
Contacted you on Discord…