This whole thread started with the observation that, by modern definitions, an N-bit ISA is one which has N-bit pointers, and the only exception to this is that the so-called 8 bit processors actually had 16 bit pointers.
That wouldn't be my definition at all. To me it's about the ALU and data processing, not the width of the address bus.
Having fewer wires going into the ALU directly affects processing power and speed of execution; having more or fewer address lines makes no difference at all.
So by your argument the IBM 360 Models 30, 40, 50, 60, 62, and 70 which were announced on the same day in 1964 are completely different 8, 16, and 32 bit computers, despite the fact they all run exactly the same programs? Indeed, the higher-end ones have 64 bit data busses to memory.
Having more or fewer bits in address registers directly affects what programs you can run.
Having different widths of ALU only affects the speed at which programs run, a comparatively trivial matter which can be compensated for (or exaggerated) by varying the clock speed.
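To make the speed-versus-capability point concrete, here is a rough sketch (not taken from any real machine's microcode) of a 32-bit add done with nothing wider than an 8-bit adder. The function names add8, add32_on_8bit_alu and add32_on_32bit_alu are made up for illustration. The narrow version needs four passes instead of one, but the answer is bit-for-bit identical.

```python
def add8(a, b, carry_in):
    """One pass through a hypothetical 8-bit ALU: returns (sum byte, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add32_on_8bit_alu(x, y):
    """Add two 32-bit values one byte at a time, least significant byte first."""
    result = 0
    carry = 0
    for i in range(4):  # four passes through the narrow ALU
        a = (x >> (8 * i)) & 0xFF
        b = (y >> (8 * i)) & 0xFF
        s, carry = add8(a, b, carry)
        result |= s << (8 * i)
    return result  # final carry is dropped, so the sum wraps modulo 2**32


def add32_on_32bit_alu(x, y):
    """The same add done in a single pass on a full-width ALU."""
    return (x + y) & 0xFFFFFFFF
```

Same program-visible result either way; only the number of ALU passes changes.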
the only exception to this is that the so-called 8 bit processors actually had 16 bit pointers.
Not true. The 8086/80286 had 16-bit registers, a 16-bit ALU and 20-bit pointers (24-bit pointers on the 80286).
Plus: Many, many 32 and 64 bit processors don't have the same number of address lines as bits in the address registers. For example, are the Motorola 68000 and Intel 80386 24-bit processors? I don't think you'll find many people arguing that case.
The 8086/286 have 16 bit pointer registers. The total address space is 20 bits because of a segmentation scheme. Such schemes were used at least as far back as the PDP-11, which let you put a lot of RAM (megabytes) in a computer but gave each program (convenient) access to 64 KB of it at a time. It is still a 16 bit machine.
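For anyone who hasn't met the scheme, here is a minimal sketch of how the 8086 forms a 20-bit physical address in real mode from two 16-bit values: the segment is shifted left by four bits and the offset is added. The function name physical_address is just for illustration, and the wrap at 1 MB models the 8086's 20 address lines (the 286 and later behave differently at that boundary).

```python
def physical_address(segment, offset):
    """8086 real-mode address: (segment * 16 + offset), wrapped to 20 bits."""
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF


# Any single pointer register only spans 64 KB from its segment base.
base = physical_address(0x1234, 0x0000)
top = physical_address(0x1234, 0xFFFF)
print(hex(base), hex(top), top - base + 1)
```

The program only ever holds 16-bit quantities; reaching beyond the current 64 KB window means reloading a segment register, which is exactly the kind of gymnastics in question.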
The number of address lines coming out of the chip is irrelevant. It can be smaller than the size of a pointer (e.g. all x86_64 machines so far) or larger than the size of a pointer (e.g. 8086, bigger PDP-11).
The relevant thing is the address space a program can conveniently use. That determines the size of problem a program can solve without doing gymnastics.