Author Topic: Superscalar 68000, have you seen the Apollo core ? What do you think about ?  (Read 43304 times)


Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Quote
APOLLO CPU

The Apollo CPU is a modern CISC CPU design. Apollo is code compatible with the Motorola M68K and ColdFire families. The CPU is fully pipelined and superscalar supporting execution of up to 2 integer instructions per cycle. The CPU features two address calculation engines and two integer execution engines.

The size efficient variable length instruction encoding provides market leading code density and optimal cache utilization.

The CPU features a full internal Harvard architecture with separate multiway data and instruction caches. The instruction and data caches are designed to support concurrent instruction fetch, operand read, and operand write references on every clock. The operand data cache permits simultaneous read and write access each clock. The caches come with write combining, as well as memory stream detection and automatic memory prefetching. The combination of these features enables the core to be very efficient in memory and data manipulation tasks.

The branch prediction and branch folding makes the core ideally suited for execution of control flow code.

Optionally, a fully pipelined, double precision FPU is available to be included in the Core.

The Core is fully written in VHDL and can also be synthesized for an FPGA device. When synthesized in an FPGA, the core offers a good combination of moderate FPGA area consumption and excellent performance. The core can reach up to 200 MHz / 400 MIPS in a consumer-grade Cyclone FPGA, and up to 400 MHz / 800 MIPS in an enterprise-grade FPGA. Clock for clock the core performs very well and scores better in many benchmarks than several ColdFire, ARM and PowerPC cores.

Features
  • Fully User-Code Compatible with MC68000
  • Superscalar Implementation of M68000 Architecture
  • Dual Integer Instruction Execution Improves Performance
  • Branch Cache Reduces Branches to Zero Cycles
  • Separate Data and Instruction Caches
  • Full Harvard Architecture allows Simultaneous Access to both caches
  • Data Cache allows Read and Write Access on Each Clock
  • Bus Snooping
  • 32bit Address bus
  • Optimized to Achieve Very High Performance Using DDR DRAM Memory
  • 128 bit Deep Store Buffer and One Deep Push Buffer to Maximize Write Bandwidth
  • Automatic memory stream detection and prefetching
  • Several memory loads can be held in flight in parallel to maximize bandwidth


for more info, see the apollo-core project page
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
personally I don't believe it is really possible, at least not for hobby purposes  :-//

guys, what do you think about it?
 

Offline Zad

  • Super Contributor
  • ***
  • Posts: 1013
  • Country: gb
    • Digital Wizardry, Analogue Alchemy, Software Sorcery
Or use a $10 ARM cored chip which has a huge support ecosystem? People do seem to get carried away with the "because we can" side of things.

Offline paulie

  • Frequent Contributor
  • **
  • !
  • Posts: 849
  • Country: us
  • Superscalar Implementation of M68000 Architecture
  • Full Harvard Architecture allows Simultaneous Access to both caches

68k was Von Neumann, definitely not Harvard. How does that work?

I will say that those who do actual programming (aka assembly, aka Real Men) would find 68k a breath of fresh air compared to ARM.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8275
  • Superscalar Implementation of M68000 Architecture
  • Full Harvard Architecture allows Simultaneous Access to both caches

68k was Von Neumann, definitely not Harvard. How does that work?
They're talking about the cache: http://en.wikipedia.org/wiki/Modified_Harvard_architecture#Split_cache_architecture
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
68k was Von Neumann, definitely not Harvard. How does that work?

it's a common approach in FPGAs, so that the fetch stage and the load/store stage don't stall each other
don't worry about that, it's just an implementation detail; it doesn't affect the ISA, which is still the 68000
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Or use a $10 ARM cored chip which has a huge support ecosystem? People do seem to get carried away with the "because we can" side of things.

this soft core could be fun for things like a classic Amiga
 

Offline paulie

  • Frequent Contributor
  • **
  • !
  • Posts: 849
  • Country: us
Hmmmm... Not really being an fpga freak I didn't realize MCU can "go both ways". Nice thing about this site... you learn something just about every day.
 

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2616
  • Country: 00
    • My random blog.
400MHz is inside $1K virtex
100-200MHz is inside $100 cyclone

it was made with amiga accelerator "market"  in mind (that is ~100 people running sysinfo) so you can forget about it, it will never see the light of day. Authors probably hope to get a job offer at IBM (HAHA good luck, IBM is getting rid of its chip business at the moment).

Reminds me of http://www.majsta.com , that dude supposedly  joined 'apollo team' whatever that means.
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
400MHz is inside $1K virtex
100-200MHz is inside $100 cyclone

yes, it's an unbelievable story, an unbelievable project: see TINA



personally I don't believe it will ever exist!
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch


that diagram was discussed 1 year ago, and it is still under discussion  :palm:
the same goes for the superscalar 68000 core :palm: :palm:
 

Offline chickenHeadKnob

  • Super Contributor
  • ***
  • Posts: 1055
  • Country: ca
The choice to call this new soft core "Apollo" or "68020" is very confusing, if those are the names they chose. Back in the day, when the 68020 was a real part number from Motorola, Mentor Graphics (the chip CAD/EDA company) was selling a line of workstations called Apollo. That line used 68020 processors. Worse yet, Mentor Graphics implemented a CPU board with a 68020 simulated using PALs and other small-scale integrated parts because they couldn't get real 68020s in time; I think Motorola was having production trouble or was delayed at the time. That emulated 68020 CPU board was a mess of bodge wires and revisions, a real banjo board. Needed constant repair :palm:
 

Offline theoldwizard1

  • Regular Contributor
  • *
  • Posts: 172
68k was Von Neumann, definitely not Harvard. How does that work?

it's a common approach in fpga, in order not to stall the fetch and the Load/Store stage
don't worry about that, it's just an implementation detail, it doesn't matter with the ISA, which is 68000

First, anyone who is "into" the specific implementation of a given Instruction Set Architecture should have already read Computer Architecture: A Quantitative Approach (now in its fifth edition).  I think mine is the 1st edition, but it is all still relevant.

Harvard versus von Neumann IS an implementation detail, but it is an extremely important one!

The biggest performance "bottleneck" is accessing memory for instructions and data.  For years, bigger caches and multiple cache levels have been the solution.  Pretty much all Harvard architecture machines are "folded" back into a single address space.

IMHO, the best way to improve performance is to improve the CPU to memory (cache or main memory) access and the fastest way to do this is wider buses.  IIRC, the Digital Equipment Corporation Alpha architecture chips used a 256 bit wide bus.  Of course this causes all sorts of other issues.

I am a bit surprised that this is an implementation of the full M68K instead of the Coldfire.  The Coldfire was designed for small size and performance.

The last M68k chip, the 68060, had many interesting features.  Fast instruction decoding and parallel Effective Address calculation.
« Last Edit: December 25, 2014, 12:22:05 am by theoldwizard1 »
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
I very much doubt they can get a core to run at >400MHz inside an FPGA. Sure FPGAs can run at these speeds but you can only have a tiny bit of simple logic. As soon as logic gets more complex routing and logic delays severely hamper the maximum clock frequency.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Surprised that this is an implementation of the full M68K instead of the Coldfire.  The Coldfire was designed for small size and performance.

Coldfire is not 100% compatible with 68k, so it may be a problem for a 68k system like Amiga.


the 68060 had many interesting features.  Fast instruction decoding and parallel Effective Address calculation.

Inside there is a RISC-like superscalar design, which makes things much more complex to implement, especially the pipeline around the EA calculation for the complex addressing modes (which are much simpler on the 68000)
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
I very much doubt they can get a core to run at >400MHz inside an FPGA

I have a lot of doubts, like yours; I have never used something so fast, and never seen a >400 MHz FPGA.

Also, both of my toy soft cores (1) run at half the physical frequency provided to the FPGA clock, i.e. in my case softcore_clock = fpga_clock / 2. I am using a pair of Xilinx FPGAs, a Spartan-3 @ 50 MHz and a Spartan-6 @ 100 MHz, so my max soft-core clocks are 25 MHz and 50 MHz.

The "Apollo 68k soft core" (2) seems to have fpga_clock equal to softcore_clock, so a 400 MHz clock; that is unbelievable to me, and unfortunately there are no sources, nothing released yet, so :-//


(1) the first one is MIPS3K compatible; the second is called "ponoku" and is a tiny RISC ISA I have been developing, similar to MIPS2K but not compatible. They are both multi-cycle, not pipelined, Harvard-style without any cache, because I want them to be as simple as possible.

(2) I have discovered that it was previously called the "Natami 68070"; I think that is another big source of confusion :palm:
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Is this CPU core a hobby project, or something they are trying to sell?

good question, I can't figure it out  :-//

What i can say:
  • the TINA project is related to the mobo
  • this mobo uses FPGAs in order to implement all of the features of an old Amiga 1200 (video, sound, processor, and so on)
  • one of these FPGAs may use
    • the OpenCores TG68K soft core, which is nothing special, simply a 68000-compatible soft core
    • the Apollo soft core, which is superscalar, a super CPU, everything short of taking you to Mars

About Apollo I have NO information; I only know this project was previously called "Natami".
About TINA it seems there is a small company behind the scenes, but they have chosen to hide the company name.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
here is another project which uses the TG68K; it is called "Vampire", an accelerator project for the Amiga, and it seems someone has improved the old soft core, trying to provide more features which make it partially compatible with the 68020

just another ball of confusion from the Amiga community :palm:

it seems the sources are downloadable from here; you can take a look if you want

 

Offline BloodyCactus

  • Frequent Contributor
  • **
  • Posts: 482
  • Country: us
    • Kråketær
I believe this came about because people wanted something besides the TG68k softcore for other Amiga + AtariST projects. Right now the TG68k core has some issues, so of course people spin up other projects for 68k in fpga.

-- Aussie living in the USA --
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
yeah, maybe; what I can't believe is the superscalar version of the 68020, the Apollo core (formerly the Natami core), or whatever they want to call it: it is too complex to be made for hobby purposes, very hard to validate, and it consumes a lot of human resources.

Look at the date of the first Natami news: it was announced back in 2007; we are now in 2014 (2015 in a few days) and … no Natami core, dead project, dead code, dead hardware, everything is dead and nothing has been released, so they have changed the project name and restarted the game, again :-DD

The TG68K is a 68000 core with a pipeline, not superscalar, and it has cost a lot of resources: it was announced in 2007 and was only ready a few years ago. Now they want to improve it into a 68020-compliant core, but that is still light years away from a superscalar approach, so … I can't believe in the Apollo/Natami/whatever core!
 

Offline BloodyCactus

  • Frequent Contributor
  • **
  • Posts: 482
  • Country: us
    • KrÃ¥ketær
natami. lol. let's take everyone's design ideas, multiply by 10000, include the kitchen sink! at least it's 'officially' dead now.
-- Aussie living in the USA --
 

Offline theoldwizard1

  • Regular Contributor
  • *
  • Posts: 172
Surprised that this is an implementation of the full M68K instead of the Coldfire.  The Coldfire was designed for small size and performance.

Coldfire is not 100% compatible with 68k, so it may be a problem for a 68k system like Amiga.
True, but when Coldfire was introduced, Motorola had a set of macros that were added to the assembler and it became assembly language compatible !

Much of what was left out were instructions and/or addressing modes that were very seldom used.
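
To give a feel for how such macros work, here is a hand-rolled sketch (not Motorola's actual macro set; the macro names and expansions are my own, and they do not reproduce the original condition-code behaviour). It rewrites two instructions that ColdFire dropped, EXG and ROL, using only ColdFire-legal operations:

EXG_D   MACRO               ; EXG Dx,Dy without the EXG instruction
        eor.l   \1,\2       ; classic three-XOR register swap;
        eor.l   \2,\1       ; EOR.L between data registers is still legal on ColdFire
        eor.l   \1,\2
        ENDM

ROL1_L  MACRO               ; ROL.L #1,Dn without the ROL instruction
        lsl.l   #1,\1       ; shift left, old bit 31 goes to the carry flag
        bcc.s   nowrap\@    ; nothing to wrap if it was 0
        addq.l  #1,\1       ; otherwise wrap it around into bit 0
nowrap\@:                   ; \@ gives a unique label per expansion in Devpac/vasm-style assemblers
        ENDM

; old 68000 source, reassembled for ColdFire:
;       EXG_D   d0,d1
;       ROL1_L  d2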
 

Offline theoldwizard1

  • Regular Contributor
  • *
  • Posts: 172
I very much doubt they can get a core to run at >400MHz inside an FPGA. Sure FPGAs can run at these speeds but you can only have a tiny bit of simple logic. As soon as logic gets more complex routing and logic delays severely hamper the maximum clock frequency.
Very TRUE!  Unless multiple parts of the design are ASYNCHRONOUS, clock distribution at higher speeds is CRITICAL!

If you look at the die photos of the DEC Alpha chip you can clearly see the clock line.  Why should something as insignificant as the clock show up on a die photo?  Because they had to make it HUGE and drive it hard so that no part of the chip would have to re-drive the signal and introduce timing variability!

If parts of the design are physically going to be on different FPGAs, they had better be asynchronous!
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
True, but when Coldfire was introduced, Motorola had a set of macros that were added to the assembler and it became assembly language compatible !

Much of what was left out were instruction and/or addressing modes that were very seldom used.

understood, but how do you run legacy binary software, e.g. the AmigaOS kernel and Amiga applications?
I think such macros and fixes are only usable if you can rebuild from sources. Am I wrong?
 

Offline theoldwizard1

  • Regular Contributor
  • *
  • Posts: 172
True, but when Coldfire was introduced, Motorola had a set of macros that were added to the assembler and it became assembly language compatible !

Much of what was left out were instruction and/or addressing modes that were very seldom used.

understood, but how do you run legacy binary software, e.g. the AmigaOS kernel and Amiga applications?
I think such macros and fixes are only usable if you can rebuild from sources. Am I wrong?
You are 100% correct !  I just ASSUMED someone had the sources !


Designing a general purpose CPU for multitasking is difficult because there are no "simple" benchmarks that are really representative of the application that is going to be run.  As I stated before, wider data paths, especially off-chip data paths, are the simplest solution but have many technical issues in implementation.

My memory of the M68K instruction set is sketchy, but I seem to recall that there are instructions that modify memory contents directly.  These instructions are going to cause inherent delays (remember, memory accesses are always the slowest thing a CPU can do) as well as causing the write-back cache to be dumped to memory.
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
understood, but how do you run legacy binary software, e.g. the AmigaOS kernel and Amiga applications?
I think such macros and fixes are only usable if you can rebuild from sources. Am I wrong?
AmigaOS 4 and MorphOS managed decent binary compatibility for well-behaved software using dynamic recompilation on PPC hardware. I never had an Amiga, but IIRC even back then some applications needed hacks, e.g. disabling caches when using accelerator cards with '030 and higher CPUs.

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
I notice a 68k core (TG68K) is mentioned in this thread. Is it any good?
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
TG68k scales badly compared to a real 68SEC000 (an ASIC part, used in Minimig v1, the Amiga 500 FPGA re-implementation)

  • 68SEC000 scales 1:10 -> 1 MIPS per 10 MHz, stable up to 50 MHz -> max 5 MIPS
  • TG68K v2 manages 2.75 MIPS @ 87.5 MHz, so it scales ~1:30

Apollo/N050 is a new superscalar design (yes, more confusion about the name), but closed source, claimed to provide 800 MIPS @ 400 MHz
Viper/Phenix is another superscalar attempt, derived from Natami, also closed source, and it claims something similar, but no sources, no party
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
personally I would like to see the next Minimig built with a real 68040V (instead of the current 68SEC000); this '040 version is 3.3V, so it is compatible with common FPGA I/O without any voltage level shifter. It should be far superior in performance, and far better validated, than any soft core

I mean, I do not trust any Natami/Apollo/Viper/Phenix or whatever
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch


this is an AGA Amiga on Fpga
 

Offline jaxbird

  • Frequent Contributor
  • **
  • Posts: 778
  • Country: 00


this is an AGA Amiga on Fpga

Dude, sorry, distinguished Sir, where can I buy this? the Amiga has always been my favorite 16bit machine and a classic from my youth.

Analog Discovery Projects: http://www.thestuffmade.com
Youtube random project videos: https://www.youtube.com/user/TheStuffMade
 

Offline jaxbird

  • Frequent Contributor
  • **
  • Posts: 778
  • Country: 00
Back in the days I got a genuine offer from one of my friends that he would let me sleep with his girlfriend if he could have my amiga 500 in exchange, I declined, while cute I did not believe the offer would be a fair exchange of such magnificent power for a simple sexual transaction. Anyway I managed to attract my own personal female just a few months after this exchange, no question the power of Amiga 500 helped me muster the required confidence for this awkward task.

I mean, cmon, they named it Amiga for a reason.
Analog Discovery Projects: http://www.thestuffmade.com
Youtube random project videos: https://www.youtube.com/user/TheStuffMade
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
where can I buy this?

I don't know  :-//
try contacting the guy at www.fpgaarcade.com
and ask for price and availability
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
btw, the question is: guys, do you believe a superscalar 68k soft core will ever be possible as a hobby project?
do you believe it could have features like SMP, bus snooping, and very high performance?

do you believe that, or do you really think that a 68040V + FPGA (with AGA & co. inside) is a more practical choice?
 

Offline Alex Eisenhut

  • Super Contributor
  • ***
  • Posts: 3338
  • Country: ca
  • Place text here.


this is an AGA Amiga on Fpga

Dude, sorry, distinguished Sir, where can I buy this? the Amiga has always been my favorite 16bit machine and a classic from my youth.

...16 bit????
Hoarder of 8-bit Commodore relics and 1960s Tektronix 500-series stuff. Unconventional interior decorator.
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
btw, the question is: guys, do you believe a superscalar 68k soft core will ever be possible as a hobby project?
do you believe it could have features like SMP, bus snooping, and very high performance?

do you believe that, or do you really think that a 68040V + FPGA (with AGA & co. inside) is a more practical choice?

The Atmel 68040V is very expensive. Even my Sega Genesis has a 68000 on it.

Maybe the Freescale MC683xx variants are a better bet, but I'm not sure how compatible they are with the original.

Maybe this might help as a fast core:
https://code.google.com/p/fpgagen/source/browse/#svn%2Ftrunk%2Fsrc

It's a SEGA Megadrive/Genesis console in an FPGA, developed for a Terasic/Altera DE1 board with VGA output.

Also let me let you in on the PACE project, but be warned: once you delve into this source code, there is no coming back.

https://svn.pacedev.net/repos/pace/sw/

Their forum is broken, I had to email the developer directly, but no one can post because the forum is still broken after a year or so.
http://pacedev.net/index.php?pageid=home

The developers old main website (hard to find btw)
http://members.iinet.net.au/~msmcdoug/

The developer now is working on something else:

http://ngpace.blogspot.com.au/

But I did warn you, there is so much in there that it's easy to get sidetracked at every turn.
The good thing is that he uses VHDL :)

Edit: And I forgot to point you to the PACE (Programmable Arcade Circuit Emulation) CPUs, with three 68K core variants, at least two of which are based on or the same as the TG68
https://svn.pacedev.net/repos/pace/sw/src/component/cpu/
« Last Edit: December 26, 2014, 11:27:07 pm by miguelvp »
 

Offline jaxbird

  • Frequent Contributor
  • **
  • Posts: 778
  • Country: 00
...16 bit????

Sure the 68000 based Amiga (girlfriend) was a hardcore 16 bit based design with several helper GPU and sound chips making it far superior to the Atari ST.

Analog Discovery Projects: http://www.thestuffmade.com
Youtube random project videos: https://www.youtube.com/user/TheStuffMade
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
The Atmel 68040V is very expensive

how much is MC68040V ?

Maybe the Freescale MC683xx variants are a better bet, but I'm not sure how compatible they are with the original.

some MC683xx microcontrollers use the CPU32 core; others, such as the MC68302 family and the MC68306, use the 68EC000 core
but there is no advantage in using an MC683xx, at least not in terms of performance
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Sure the 68000 based Amiga (girlfriend) was a hardcore 16 bit based design with several helper GPU and sound chips making it far superior to the Atari ST.

it's a confusing definition: the 68k is a 32-bit CPU architecture, the data bus of the 68000 is 16-bit, the CPU registers are 32-bit while the chipset registers are 16-bit, so … it depends on what you are talking about
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
The Atmel 68040V is very expensive

how much is MC68040V ?
The only ones I can find in stock are north of $2K
http://www.findchips.com/search/ts68040

Quote
Maybe the Freescale MC683xx variants are a better bet, but I'm not sure how compatible they are with the original.

some MC683xx microcontrollers use the CPU32 core; others, such as the MC68302 family and the MC68306, use the 68EC000 core
but there is no advantage in using an MC683xx, at least not in terms of performance

Price, they are available and way cheaper :)

 

Offline jaxbird

  • Frequent Contributor
  • **
  • Posts: 778
  • Country: 00
Sure the 68000 based Amiga (girlfriend) was a hardcore 16 bit based design with several helper GPU and sound chips making it far superior to the Atari ST.

it's a confusing definition: the 68k is a 32-bit CPU architecture, the data bus of the 68000 is 16-bit, the CPU registers are 32-bit while the chipset registers are 16-bit, so … it depends on what you are talking about

While the MC68K does have some 32-bit support, it is primarily for marketing purposes, with the CPU being mostly 16-bit internally. As far as I remember it's got 32-bit registers, but most instructions are 16-bit based.

Analog Discovery Projects: http://www.thestuffmade.com
Youtube random project videos: https://www.youtube.com/user/TheStuffMade
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
TS68040MF33A

the idea behind the MC68040V is that it has a 3.3V core and 3.3V I/O (1); it seems to me the TS68040MF33A is 5V
an FPGA needs 3.3V CPUs and devices; if you use 5V devices you need (bidirectional) 3.3V <-> 5V voltage level translators

Quote
Price, they are available and way cheaper :)

yep, but the 68SEC000 is 3.3V, you can buy a new one for less than 10 USD, and it runs (stable) up to 50 MHz
the problem is that it scales 1:10, so with a 50 MHz clock you get about 5 million instructions executed per second (5 MIPS)
that is not enough for my purposes  :D


(1) while the MC68060 has a 3.3V core and 5V I/O (I do not know if the I/O can also work at 3.3V)
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
p.s. the 68060 is not exactly 100% compatible with the 68040, so the Amiga needs special patches to be applied; these patches are called:

Quote
MC68060 SOFTWARE PACKAGE

The purpose of the M68060 software package (M68060SP) is to supply, for a target operating system, system exception handlers and user library software that provide:

  • Software emulation for integer instructions not implemented in MC68060 hardware via the new unimplemented integer instruction exception
  • System V ABI-compliant library subroutines to help avoid using unimplemented integer instructions
  • IEEE floating-point support for the on-chip floating-point unit (FPU) as well as software emulation of floating-point instructions, data types, and addressing modes not implemented in MC68060 hardware
  • System V ABI-compliant library subroutines to help avoid using unimplemented floating-point instructions
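
Roughly, the trick is that each instruction the '060 dropped takes an exception, and the M68060SP handler emulates it in software. A very rough sketch of the idea in 68k assembly (the handler name and code are made up for illustration; this is not the actual M68060SP source, and the vector number should be checked against the 68060 User's Manual):

        ; install a handler on the 68060 unimplemented-integer-instruction
        ; exception (vector 61, offset $F4) -- supervisor mode required
        lea     UnimpIntStub(pc),a0
        movec   vbr,a1              ; read the vector base register
        move.l  a0,$F4(a1)          ; hook vector 61

UnimpIntStub:
        ; the real M68060SP decodes the faulting opcode from the exception
        ; stack frame, emulates it (64-bit MULx.L/DIVx.L, MOVEP, CMP2, CAS2, ...),
        ; advances the saved PC, and returns to the interrupted program
        rte                         ; stub only: a real handler must fix the
                                    ; saved PC first or this will loop forever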
 

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2616
  • Country: 00
    • My random blog.
True, but when Coldfire was introduced, Motorola had a set of macros that were added to the assembler and it became assembly language compatible !

Much of what was left out were instruction and/or addressing modes that were very seldom used.

understood, but how do you run legacy binary software, e.g. the AmigaOS kernel and Amiga applications?
I think such macros and fixes are only usable if you can rebuild from sources. Am I wrong?

They trapped the unhandled instructions as exceptions and resolved the problem in the trap handler

You want hardcore? Here is some dude's diary of getting an Atari port working on a ColdFire dev board:
http://didierm.pagesperso-orange.fr/ct60/ctpci-e.htm
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
that coldfire hack is very very interesting, thank you  :-+
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
guys, has anybody seen a 68060 evaluation board around? I'm looking for something with just the CPU, RAM, serial, and an expansion bus
if so, have you seen a full 3.3V system? (both 68060 Vcore & I/O powered at 3.3V)
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
this is a nice discussion group about m68k boards and CPUs
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
about 3.3V 68k parts, I am reading some interesting things

  • MC68EC000AA10, M680x0 32-Bit, Speed: 10MHz, Voltage: 3.3V, 5V, 64-QFP
  • MC68EC000AA12,  M680x0 32-Bit, 12MHz, Voltage: 3.3V, 5V, 64-QFP
  • MC68EC000CAA10, M680x0 32-Bit, 10MHz, 3.3V, 5V, 64-QFP
  • MC68EC000EI12, M680x0 32-Bit, 12MHz, 3.3V, 5V, 68-LCC (J-Lead)
  • MC68HC000CRC10, M680x0 32-Bit, 10MHz, 3.3V, 5V, 68-BCPGA
  • MC68HC000CRC12, M680x0 32-Bit, 12MHz, 3.3V, 5V
  • MC68SEC000, M680x0 32-Bit, 20 MHz (overclockable up to 50 MHz), 3.3V, 5V, 64-pin QFP / 64-pin LQFP

it seems there is no 68EC020 at 3.3V, am I right?

a brief summary list
 

Offline theoldwizard1

  • Regular Contributor
  • *
  • Posts: 172
Personally, I think "binary compatibility" is a "fool's quest".  Too many limitations to truly maximize performance.

I have been a big fan of ColdFire since the first one hit the market!  There is a very interesting "urban legend" about the ColdFire development.  Some of it is true because it did get published.

Once the lead designer had completed the 68060 design, he was not re-assigned to any specific project.  While working on the '060 he realized there were some interesting "short cuts" that could be made that would result in a smaller (cheaper) and faster replacement for the '020.  Because he was a "senior" designer, he actually designed and built a prototype chip even though it was never targeted for production.

The rumor was HP was about to jump ship on their next generation laser printer because the '020 did not have enough "power" and anything above that was too expensive.  The engineer mentioned above got wind of this and said, "Hey, I've got this chip I have been working on ...".  Some assembly language macros were thrown together to handle the missing instructions and addressing modes and the rest was history.


Clock speed became "king" and for whatever reason ColdFire could not keep up.  At the same time Motorola had "hitched their wagon" to PowerPC (dumping their 88000 product line) so all efforts were placed there.  It looked like a good move until Apple went to Intel (word on the street was Apple got a HUGE discount on Intel chips for a number of years).  Motorola did design and build PowerPC chips for embedded (automotive) applications, but that seems to have died out also.


ARM is the new "darling" of the CPU world, especially the latest 64-bit version.  It is not "screaming fast" like Intel's desktop/server line, but its low cost and low power make it attractive.  New implementations are only a step or two behind Intel on die shrinks.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Personally, I think "binary compatibility" is a "fool's quest".  Too many limitation to truly maximize performance.

Yeah, of course, but the problem with things like Minimig is AmigaOS & the binary applications! With the AROS & co. (kernel and applications) sources you can do whatever you want: patch things, port to ColdFire, PowerPC, x86, whatever (in theory).

I think that is the reason why guys have started projects like the Natami/Apollo/Vampire/Phenix superscalar CPUs in an FPGA :-//

I mean, I think they wanted performance and full 100% compatibility with the real hardware (in this case we are speaking about the 68000 and 68020, as mounted in the Amiga 500 and Amiga 1200)


Motoorola did design and build a PowerPC chip for embedded (automotive) application, but that seems to have died out also.

my job is related to automotive; I have a junior profile, but I can say our customers and all four of my bosses are addicted to quad-core PowerPC, e.g. the e500 series used in Formula 1 and MotoGP  :-//

Also avionics is using PowerPC, made by IBM or AMCC, I mean the PPC440, PPC460, or something like that (it may also be the PPC405); I think they are built around the PowerPC 603 core
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
(word on the street was Apple got a HUGE discount on Intel chips for a number of years)
They probably did, but the main reason Apple switched to Intel is that neither IBM nor Motorola could make the laptop chips Apple needed.

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2616
  • Country: 00
    • My random blog.
guys, has anybody seen a 68060 evaluation board around? I'm looking for something with just the CPU, RAM, serial, and an expansion bus
if so, have you seen a full 3.3V system? (both 68060 Vcore & I/O powered at 3.3V)

ebay old nortel telephone switch boxes ~$50
http://www.ebay.com/sch/i.html?_nkw=NORTEL+NT5D10EA+CP+68060
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Quote
Differences between 680x0 family and ColdFire

Although the ColdFire architecture is closely related to the 680x0, there are many simplifications to the instruction set. Nearly all of the differences are omissions from the 680x0 instruction set and addressing modes. This means that (with a few important exceptions detailed below), a 680x0 instruction which is implemented in ColdFire behaves in exactly the same way under the two architectures. In fact, almost all user-level (and much supervisor-level) ColdFire code can be run unchanged on a 68020 or later 680x0 processor (apart from new instructions introduced in the Version 4 ColdFire core). The converse, however, is not the case.

In outline, the main omissions fall into six categories:

Missing addressing modes
Missing instructions
Non-availability of word- and byte-forms of nearly all arithmetic and logical instructions
Many instructions act only on registers, not on memory
Restrictions on available addressing modes for particular instructions
Simplification of the supervisor-level programming model

Principles behind the differences

In order to understand the ColdFire instruction set in relation to that of the 680x0, it helps to have an appreciation of why the simplifications have been made. The philosophy behind ColdFire is influenced by the success of RISC processors in providing high performance - for a given degree of chip complexity - by eliminating seldom-used instructions and complex addressing modes, and by regularizing the instruction set to make it easier for the hardware to optimize despatch of the instruction stream.

However, standard RISC processors such as the PowerPC achieve high performance at the expense of low code density, in part because all instructions are the same width (generally 4 bytes) and also because only very simple addressing modes are available. In addition, RISC processors do not allow direct modification of memory locations; all memory reads and writes have to go via registers. This all means that programs compiled for RISC processors tend to be substantially larger than those compiled for CISC architectures such as the 680x0. This penalty does not greatly matter for desktop systems or servers with 32MB or more of RAM, but for embedded applications it can be a significant disadvantage, both in terms of system cost and power consumption.

The ColdFire architecture - which Freescale Semiconductor characterizes as "Variable-Length RISC" - aims to share many of the speed advantages of RISC, without losing too much of the code density advantages of the 680x0 family. Like most modern processor architectures, it is optimized for code written in C or C++, and instructions which are not frequently generated by compilers are amongst those removed from the instruction set. Some of the complex addressing modes - again not important for compilers - are eliminated, and the additional hardware complexities involved in supporting arithmetic operations on bytes and words also disappear. In order to regularize the instruction stream, all ColdFire instructions are either 2, 4 or 6 bytes wide; this is why certain combinations of source and destination operands are not available.

Missing addressing modes

The ColdFire addressing modes are quite similar to those of the original 68000, i.e. without the extensions introduced in the 68020 and later processors, but with some differences in indexed addressing. Compared with a 68020 or later processor, the comparison is as follows:

Fully supported:

Data Register Direct
Address Register Direct
Address Register Indirect
Post-increment
Pre-decrement
Displacement (16-bit displacement)
PC Displacement (16-bit displacement)
Absolute Short
Absolute Long
Immediate

Partially supported:

Indexed
PC Indexed

The restrictions on these two modes are:

The displacement constant is 8-bit only;
"Zero-suppressed" registers are not supported;
The Index register can only be handled as a Long. Word-length index registers are not supported.
The scale factor must be 1, 2, or 4. Scale factors of 8 are not supported.

Not implemented at all:

Memory-indirect post-indexed
Memory-indirect pre-indexed
PC-indirect post-indexed
PC-indirect pre-indexed
Note that further restrictions may be imposed on the addressing modes supported by particular instructions, even if a particular addressing mode is itself available on ColdFire.

Missing instructions

A number of instructions are not implemented at all under ColdFire. These include:

DBcc, EXG, RTR, RTD, CMPM,

ROL, ROR, ROXL, ROXR, MOVE16

ABCD, SBCD, NBCD

BFCHG, BFCLR, BFEXTS, BFEXTU

BFFFO, BFINS, BFSET, BFTST

CALLM, RTM, PACK, UNPK

CHK, CHK2, CMP2, CAS, CAS2, TAS (restored in V4 core),

BKPT, BGND, LPSTOP, TBLU, TBLS, TBLUN, TBLSN

TRAPV, TRAPcc, MOVEP, MOVES, RESET

ORI to CCR, EORI to CCR, ANDI to CCR

In addition, DIVS and DIVU (with some differences from the 680x0 equivalents) are available on some ColdFire processors but not others. MULU and MULS producing a 64-bit result are not implemented, but 16 x 16 producing 32-bit, and 32 x 32 producing (truncated) 32-bit, are available.

Long-word forms only

Most arithmetic and logical instructions can act on Long words only. This applies to:

ADD, ADDA, ADDI, ADDQ, ADDX, AND, ANDI, ASL, ASR

CMP, CMPI (word/byte forms re-introduced in version 4 core)

CMPA, EOR, EORI, LSL, LSR,

NEG, NEGX, NOT, OR, ORI,

SUB, SUBA, SUBI, SUBQ, SUBX

MOVEM.W has also been removed from the instruction set.

In fact, the only instructions which do act on the full set of byte, word and long operands are CLR, MOVE and TST (and CMP and CMPI in the version 4 core). EXT.W, EXTB.L and EXT.L survive, as do MULx.W and MULx.L

Instructions which act only on registers, not on memory

Some arithmetic instructions cannot act directly on memory - the destination must be a register. This applies to:

ADDI, ADDX, ANDI, CMPI, ASL, ASR, LSL, LSR,

NEG, NEGX, NOT, EORI, ORI, SUBI, SUBX, Scc

Note that ADDQ and SUBQ can act directly on memory.

Restrictions on addressing modes for particular instructions

Even where a particular memory addressing mode does exist in ColdFire, some instructions are subject to further restrictions. Often, this is because of the limit of six bytes as the maximum length of a single instruction. Specific restrictions include:

Some combinations of addressing modes for MOVE are disallowed. If the source addressing mode is Displacement or PC Displacement, the destination addressing mode cannot be Indexed or Absolute. If the source addressing mode is Indexed, PC-Indexed, Absolute or Immediate, the destination addressing mode cannot be Indexed, Displacement, or Absolute.
The addressing modes for MOVEM are restricted to only Displacement and Indexed - no Pre-decrement or Post-increment!
For BTST, BSET, BCLR and BCHG, if the source operand is a static bit number, the destination cannot be Indexed or Absolute memory.

Miscellaneous Omissions

There are a few miscellaneous omissions for specific instructions:

LINK.L is not supported
MOVE to CCR/SR: Source must be Immediate or Data Register
MOVE from CCR/SR: Destination must be data register
BSR and Bcc accept only an 8- or 16-bit displacement in version 2 and version 3 cores (32-bit displacements are reintroduced in version 4)

Instructions which behave differently from the 680x0 equivalent

In most cases, an instruction/addressing mode which does exist in ColdFire behaves exactly like its 680x0 equivalent, which makes it easy for experienced 680x0 programmers to understand ColdFire code. It also means that user-mode code written for ColdFire can generally run unchanged on a 680x0 processor, provided the new ColdFire-only instructions are not used.

However, there are a few subtle cases where the ColdFire instruction is not exactly the same as its 680x0 counterpart. The most important of these is that multiply instructions (MULU and MULS) do not set the overflow bit. This means that a 680x0 code sequence which checks for overflow on multiply may assemble and run under ColdFire, but give incorrect results.

ASL and ASR also differ in that they do not set the overflow bit - but this is less likely to cause problems for real programs!

Simplification of the supervisor programming model

Various members of the 68000 family have different register sets available at the supervisor level. The most important simplification in ColdFire's supervisor-level model is that there is only one stack pointer, shared for all code including interrupts, supervisor-level services, and user code. It follows from this that, on ColdFire, it is never safe to write below the stack, since any interrupt which occurs would overwrite the stored data. (Writing below the stack, though not recommended, is possible in some 680x0 systems in user mode, because interrupts cause a switch to the Interrupt or Supervisor Stack Pointer). A further issue is that ColdFire processors automatically align the stack to a four-byte boundary when an exception occurs, which can cause problems if code is reading or writing at a fixed offset from the stack pointer. In fact, it is strongly recommended (for performance reasons) that the ColdFire stack should be kept long-word aligned at all times.

New features in ColdFire Version 4 core

Version 4 of the ColdFire core architecture re-introduces some familiar 680x0 instructions, and also adds some new instructions. The main changes are:

Reintroduced:

32-bit displacement forms of BSR, Bcc and BRA
Byte and Word forms of CMP and CMPI
Slight relaxation of restrictions in addressing modes for MOVE
Restoration of the TAS instruction

New:

MOV3Q for moving immediate values in the range -1 to 7 to destination
MVS moves and sign-extends in one operation
MVZ moves and zero-extends in one operation
SATS Saturate register if overflow set
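
Stepping back from the V4 additions: as a concrete illustration of what the porting effort described above looks like in practice, here is a small hand-worked sketch of my own (not taken from any official document). A trivial 68000 loop that uses a byte-sized ADD to memory and DBRA, neither of which exists on ColdFire, is rewritten using only ColdFire-legal forms:

; original 68000 code -- byte add directly to memory, DBcc loop
loop:   add.b   d1,(a0)+        ; not allowed on ColdFire (byte size, memory destination)
        dbra    d0,loop         ; DBcc was removed from ColdFire

; ColdFire-legal rewrite -- long-only arithmetic, register destinations
cfloop: move.b  (a0),d2         ; MOVE still supports byte size
        add.l   d1,d2           ; arithmetic only in long form; stale upper bits
                                ; in d2 do not affect the stored low byte
        move.b  d2,(a0)+        ; store the low byte back and bump the pointer
        subq.l  #1,d0           ; replace DBRA with SUBQ + Bcc
        bpl.s   cfloop          ; note: this tests all 32 bits of d0, DBRA only tests d0.w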
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
I have taken a look at what is still in production in the 68k family and it seems there is not much else besides the legendary 68000. Of those, the 68SEC000 version seems the best choice. It is based on the freshest manufacturing process, and it comes in a small, low-inductance package

Also, the SEC should be easier to overclock while working at 3.3V. Using 3.3V might be beneficial for:

– being able to find other components (FPGAs) at lower prices
– lower power drain
– lower noise on the bus

Other options fall off for one reason or another:

  • 68010, the whole point of it was to fix a few 68000 bugs so it could work properly with an external MMU. Since an MMU brings extra wait states, and since the 68010 is hopelessly obsolete for a classic OS with virtual memory, this option makes no sense
  • 680(EC){020,030}, nice but not in production anymore
  • 68040, there are some 3.3V models still in production, but they are outrageously expensive
  • 68060, not in production anymore
  • ColdFire, not really 68000 compatible
 

Offline theoldwizard1

  • Regular Contributor
  • *
  • Posts: 172
What surprised me most about ColdFire was the fact that it did not make a big impact in the automotive world.  GM had already been using the 68332 (IIRC, the CPU32 core was not compatible with all 68k family members).  With its minimized instructions and addressing modes the ColdFire should have outperformed the CPU32.  Additionally, ColdFire's variable-length instructions, which do require a bit more decoding than a fixed-length RISC instruction set, should be a big benefit for applications that execute out of flash memory.

TPU and eTPU were criticized by some as too complex, and in the early days the lack of development/debugging tools was a major issue.  Motorola set themselves up to be the supplier of TPU firmware, BUT their solutions did not meet all customer requirements.


Designing an embedded system chip is quite different from designing a general purpose CPU chip.  Advanced designs that might otherwise have been dismissed, such as two banks of general purpose/address registers (user, interrupt), are extremely beneficial in reducing context switch time.  Harvard architectures look like a good way of improving performance until you realize that much embedded code has literally thousands of constants.  The processor stalls while extra cycles are required to access the single bank of flash memory.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
GM had already been using the 68332

…

TPU and eTPU were criticized by some as too complex and in the early days lack of development/debugging tools were a major issue.  Motorola set themselves
up to be the supplier of TPU firmware, BUT their solutions did not meet all customer requirements.

I have bought a 68332EVS board (with a debug processor on it + the original x86 debugger software; it needs DOS 6.22 and does not run on XP/NT/2K)

the TPU is amazing; I coded TPU-UART and TPU-PWM tasks very easily for micro-robotics purposes  :D
 


Offline IanP

  • Newbie
  • Posts: 4
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #57 on: September 24, 2015, 06:45:00 am »
The Apollo developers recently got their hands on the Vampire V2b card http://www.kipper2k.com/vampire/top.jpg. This is not the final V2 production card, which will have double the RAM and double the data bus width. The much larger FPGA on the V2 card will allow a full Apollo implementation, unlike the severely restricted, cut-down Phoenix core on the V1 board, which already performs pretty well. There's a lot still to do to have the full implementation of the Apollo core working on the V2 at its full potential, but work is progressing well http://www.apollo-core.com/bringup/index2.htm
 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #58 on: September 24, 2015, 08:23:53 am »
I will say that those who do actual programming (aka assembly, aka Real Men) would find 68k a breath of fresh air compared to ARM.

That's the point of ARM's RISC concept: the Acorn people realized that a CISC CPU intended to be programmed in assembly had to implement very complex/powerful instructions to simplify the assembly programmer's work. The RISC concept, in contrast, implements very basic instructions and pushes the complexity into the compiler. This makes the hardware much simpler, which lowers the cost and complexity (allowing Acorn to build their own CPU for the BBC Micro series computers). RISCs are simply not intended to be programmed in assembly.

Acorn RISC Machines then became ARM, which now licenses the RISC concept to lots of companies.

https://www.google.com/?gws_rd=ssl#q=Acorn+risc+machines+
Pocket-Sized USB 2.0 LS/FS/HS Protocol Analyzer Model 1480A with OTG decoding.
Pocket-sized PCI Express 1.1 Protocol Analyzer Model 2500A. 2.5 Gbps with x1, x2 and x4 lane widths.
https://www.internationaltestinstruments.com
 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #59 on: September 24, 2015, 08:27:09 am »
I very much doubt they can get a core to run at >400MHz inside an FPGA. Sure FPGAs can run at these speeds but you can only have a tiny bit of simple logic. As soon as logic gets more complex routing and logic delays severely hamper the maximum clock frequency.

NIOS II can obtain some 150 MIPS in a Cyclone V FPGA; the SoC version (hard CPU core) achieves some 4000 MIPS. That's a pretty good FPGA vs. ASIC speed illustration.
Pocket-Sized USB 2.0 LS/FS/HS Protocol Analyzer Model 1480A with OTG decoding.
Pocket-sized PCI Express 1.1 Protocol Analyzer Model 2500A. 2.5 Gbps with x1, x2 and x4 lane widths.
https://www.internationaltestinstruments.com
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #60 on: September 29, 2015, 10:03:45 am »
The guys at NatAmi did develop some boards based on an MC68060 in a QFP package that is unknown to me. It is labeled MC68060FE133, and it doesn't show up in any list of produced parts...:

http://www.natami.net/gfx/NAe60F/NAe60F_2.jpg

The discussion about a faster (and with new instructions) 68k core goes years back... sad that it is closed source :(
 

Offline BloodyCactus

  • Frequent Contributor
  • **
  • Posts: 482
  • Country: us
    • Kråketær
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #61 on: September 29, 2015, 01:20:21 pm »
it was never proved that the 68060FE133s were real. Supposedly they were 133 MHz '060s, but these chips have no FPU or MMU; it's a 68EC060 processor. The chip mask gives it away as an MC68LC060ZU66, i.e. an overclocked 66 MHz EC/LC-mask chip.

AFAIK, after Thomas found these in China they were never found by anyone else again.

From memory, the speculation was that someone in China found some uncommon FEs and remarked them as 133 MHz.  I think the final conclusion, from around 2009, was that it was an MC68EC060FE75 chip.

i stopped paying attention to natami back then lol
-- Aussie living in the USA --
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #62 on: September 29, 2015, 07:59:40 pm »
i stopped paying attention to natami back then lol

me too; it's a never-ending project, and they have now switched to the Apollo 68K :popcorn:
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
re:Apollo machines

My first job happened to be programming on the Apollo workstation. This was before SUN (RIP) was huge, so there were dozens, if not hundreds, of 68K-based workstations popping up everywhere. The joke was that a workstation vendor would go to the Unix guys (I forget which ones now), who would hand them a copy of the Unix-du-jour on a tape, and they would release that.

Anyway, Apollo machines were made by Apollo Inc., not Mentor Graphics as a poster mentioned. They were 68K based, so they used two 68K cores executing the same code, and when one fell over due to memory paging, the other would check the contents of the stack so the whole machine could recover. The 68020 finally solved that by putting more machine context onto the fault stack so all instructions could be restarted.

Apollo was bought by HP back in the days, around 1986 I believe.

Motorola/Freescale sold A LOT of Coldfires because that's the device family in the original Laserjets.
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
Anyway, Apollo machines were made by Apollo Inc., not Mentor Graphics as a poster mentioned. They were 68K based, so they used 2 68K core executing the same code, and when one fell over due to memory paging, the other would check the content of the stack so the whole machine can recover. The 68020 solved that finally by putting more machine context onto the fault stack so all instructions can be restarted.

Apollo was bought by HP back in the days, around 1986 I believe.
I wrote a bit of stuff for the Apollo after the HP acquisition, as my university comp sci department had a few. Odd things in some ways, and I spent rather more time swearing at them than I would have liked

The trick with two 68010's was also used by Sun - they didn't quite "execute the same code", one CPU was just behind the other so that when the first encountered the page fault the second could be stopped while the page fault was handled, then restarted - that CPU would never see the fault. Not sure how they synced register contents after the faulting instruction - possibly just execute that instruction, then copy from one CPU to the other? Anyone know?
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Yes of course they didn't execute the same code at the same time. Otherwise, they would just fall down together  >:D  I used to know the details, but too many things have happened between the 1980s and 2015 :-)

What was pretty amusing to me, though, was that on the production CPU board, which is over 1 foot x 1 foot, maybe even more like 18" square, the back side was full of jumper wires fixing hardware problems, and those were second-generation production machines or something like that.

It also ran Aegis, Apollo's brand of not-quite-Unix, because they wanted to write it in Pascal - I think. Again, too long ago now.

My first job was to rewrite the byte code interpreter from C to 68K asm. Took me a few days and sped up the runtime by 20-50x. This was for a semiconductor test company that had its own test language for the "next-gen" test machines. Those things ran at megahertz, blazingly fast at that time.

There were a lot of 68K workstation companies then, not unlike the PC clones later. In the last year of school, the lucky few of us got to play with a Unix-based "personal" workstation from Pixel. Think Steve Jobs' NeXTcube, but 10 years earlier and with lower resolution. One of my friends got a job at that place before she graduated, but then the company went under before she started!


I wrote a bit of stuff for the Apollo after the HP acquisition, as my university comp sci department had a few. Odd things in some ways, and I spent rather more time swearing at them than I would have liked

The trick with two 68010's was also used by Sun - they didn't quite "execute the same code", one CPU was just behind the other so that when the first encountered the page fault the second could be stopped while the page fault was handled, then restarted - that CPU would never see the fault. Not sure how they synced register contents after the faulting instruction - possibly just execute that instruction, then copy from one CPU to the other? Anyone know?
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
The original LaserJets are from the early '80s; there were no ColdFires at that time. They probably had 68Ks. At least the LaserJet 2 had a 68EC000 in a PLCC package. Later machines, post Series 6, had them. Some had i960s...
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Yes of course they didn't execute the same code at the same time. Otherwise, they would just fall down together  >:D  I used to know the details, but too many things have happened between the 1980s and 2015 :-)

What was pretty amusing to me, though, was that on the production CPU board, which is over 1 foot x 1 foot, maybe even more like 18" square, the back side was full of jumper wires fixing hardware problems, and those were second-generation production machines or something like that.

never heard of that before, thank you for the news. Is there any paper/AN/article/whatever (paper or digital copy) that explains it in detail?
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Correct, the original LaserJet had plain 68Ks; the later ones had ColdFires. Kept that Motorola unit alive for a long time! :-)

re: the 68K page fault problem. You can get some links by searching for "68K page fault", but most are hand-waving descriptions ("one processor executes behind the other one...") that are a bit short on detail. Thinking about it, it must go roughly like this: when the primary processor falls over, the secondary examines the state of that CPU and figures out what address it was having trouble accessing (the whole point being to implement virtual memory), then fetches the memory block and restarts the primary CPU. I know there are more in-depth explanations, as I have read them before, so if I find them again I will post a link here.
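
For what it's worth, the recovery sequence described above could be sketched in C roughly like this. This is only a conceptual sketch of the idea as I understand it - not taken from any actual Apollo or Sun design - and every function and structure name in it is made up:

Code:
/* Conceptual sketch only: "the service CPU fixes the fault, then the
 * executing CPU is restarted". All names here are hypothetical. */
#include <stdint.h>

struct cpu_state {
    uint32_t pc;          /* where the faulted program should resume    */
    uint32_t fault_addr;  /* logical address that caused the bus error  */
};

/* Hypothetical helpers the glue hardware / pager kernel would provide. */
struct cpu_state examine_faulted_cpu(void);       /* inspect the stalled CPU   */
void             page_in(uint32_t logical_addr);  /* load page, fix page table */
void             restart_cpu(struct cpu_state s); /* resume from a good state  */

/* Runs on the "other" CPU whenever the executing CPU hits a missing page. */
void service_page_fault(void)
{
    struct cpu_state s = examine_faulted_cpu();
    page_in(s.fault_addr);
    restart_cpu(s);   /* the executing CPU resumes from a clean, pre-fault state */
}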
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
I got the "SUN 68000 Board User's Manual" where the MMU of the 68000-Based SUNs is described, but it is only single processor. I think I got it from bitsavers. Maybe there is something apollo related there... I'll have a look now...

Another interesting document is this one: http://bitsavers.informatik.uni-stuttgart.de/pdf/sun/sun2/800-1185-01_2-120_CPU_Engr_Sep84.pdf

But nothing on those 2-CPU systems so far... Sun no, HP no, Apollo also nothing... any other name?
« Last Edit: October 06, 2015, 04:06:58 pm by ale500 »
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
thank you guys  :-+
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
The trick with two 68010's was also used by Sun - they didn't quite "execute the same code", one CPU was just behind the other so that when the first encountered the page fault the second could be stopped while the page fault was handled, then restarted - that CPU would never see the fault. Not sure how they synced register contents after the faulting instruction - possibly just execute that instruction, then copy from one CPU to the other? Anyone know?

This is garbled... Apollo built a system with 2 68000s because the 68010 wasn't out yet.  As soon as it was available, they switched to a single 68010.

Sun never built a dual 68000 board, they waited until the 68010 was available to turn on demand paging.
« Last Edit: October 06, 2015, 04:48:28 pm by edavid »
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
Quote
This is garbled... Apollo built a system with 2 68000s because the 68010 wasn't out yet.  As soon as it was available, they switched to a single 68010.

Sun never built a dual 68000 board, they waited until the 68010 was available to turn on demand paging.
I confess I never encountered such a beast, so it is an old, second-hand memory - I'm not surprised that it is inaccurate :) I must admit I thought Sun had used the dual-68000 trick; obviously not.

Writing 68010 when I meant to say 68000 was just a typo.

 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Doing some reading, it's coming back to me. What is fairly certain is this: 2 CPUs, with one running one instruction behind. When one falls over, a few things happen:

  • one CPU examines the CPU state and pages in the memory in question
  • one CPU resumes the program, now without the memory access issue
  • normal execution resumes

What is not clear is which CPU does which step. Some people claim that the 2 CPUs swap roles. Obviously, in normal execution, the CPU executing "behind" cannot have its memory writes actually committed.

It would be an interesting exercise to reverse engineer this. This hack, though, must have been obvious to the hardware engineers of the time: once one company did it, numerous other companies just went ahead and followed. This was before the sue-happy patent days, I guess.
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
This hack though, must have been obvious to the hardware engineers at that time, that when one company did it, numerous other companies just went ahead and followed.

I don't think the "numerous" part is right - I don't think it was common at all.  For one thing, 68000s were very expensive, so no one wanted to pay for 2 of them, when Motorola was promising the 68010 very soon.  It made more sense to put the money into RAM.  Also, most of the early 68K systems shipped with a (crappy) Unisoft Unix port, which didn't even support demand paging.

As for the way it worked, what I remember (which may be wrong) was that the second CPU was run far enough behind that it could be interrupted and its pre-fault stack puke saved.  Then, when the main CPU was ready to resume the faulted process, it would reload that good stack puke and resume at the instruction that had faulted.
« Last Edit: October 06, 2015, 09:07:30 pm by edavid »
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us


I don't think the "numerous" part is right - I don't think it was common at all.  For one thing, 68000s were very expensive, so no one wanted to pay for 2 of them, when Motorola was promising the 68010 very soon.  It made more sense to put the money into RAM.  Also, most of the early 68K systems shipped with a (crappy) Unisoft Unix port, which didn't even support demand paging.

As for the way it worked, what I remember (which may be wrong) was that the second CPU was run far enough behind that it could be interrupted and its pre-fault stack puke saved.  Then, when the main CPU was ready to resume the faulted process, it would reload that good stack puke and resume at the instruction that had faulted.

The story was certainly passed around A LOT at that time (1984/85). The web forgets, and now we only have truncated memories :-/ I suppose it has to have been more than one company, but who knows how many.

Unisoft was the company I was thinking of. They just handed out "build-of-the-day" releases to workstation vendors as official releases.

There was also a non-AT&T Unix vendor that claimed its system was independently developed, until a couple of Bell Labs guys did a dump of the kernel files at a trade show and found the Bell Labs copyright messages. Afterward, they quietly dropped some claims from their advertising.
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
I could only dig up that Stratus had a possibly similar 2-CPU design; whether that was just for fault tolerance or for page faulting is not clear to me. But no documents so far.
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
The Apollo Domain definitely had 2 CPUs, and I am almost certain the Pixel workstation did also.
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
As for the way it worked, what I remember (which may be wrong) was that the second CPU was run far enough behind that it could be interrupted and its pre-fault stack puke saved.  Then, when the main CPU was ready to resume the faulted process, it would reload that good stack puke and resume at the instruction that had faulted.
The more I think about this, the more I think this one-instruction-behind business can't be correct.

I don't think the 2nd CPU could be more than 1 instruction behind, otherwise you would have all sorts of problems keeping the CPUs executing the same code due to the effect of, or lack of, writes to memory. Even then it might be hard; the lead CPU would probably have to have its writes ignored - e.g. what happens if the lead CPU executes a test-and-set (an atomic R-M-W operation on the 68000)? If the write is ignored, fine, the 2nd CPU will see the unchanged value and take the same path, but if not it will possibly take a different branch (since test-and-set is usually followed by a branch).

However, if writes from the lead CPU are ignored, think about what happens if code reads a value, modifies it, writes it back and then immediately reads it again. You probably wouldn't write code like that by hand, but it could be generated by a compiler - especially one without much optimisation. The lead CPU's write gets ignored, so the read will get the old value.

You might fix the above by prioritising lag-CPU writes to main memory ahead of lead-CPU reads within the same cycle, but what about instructions with wildly differing timings? E.g. a register op followed by a DIV. The lead CPU will do the register op then start the DIV; at the same time the lag CPU will be on the register op but will move on to the DIV while the lead CPU is still executing it. At that point the lead CPU would no longer be one instruction ahead, it would just be a few clock cycles ahead.

We haven't even thought about arbitrating the bus between the two CPUs.

In short, I can't see that the idea of executing the same code works.

What would work, however, is that one CPU executes the code and the other just handles page faults. The CPU which is executing code is simply delayed - perhaps by not asserting DTACK until the memory operation can complete.

Looking at a few of the online notes, that is how the scheme is described.
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
I don't think the 2nd CPU could be more than 1 instruction behind, otherwise you would have all sorts of problems keeping the CPUs executing the same code due to the effect of, or lack of, writes to memory. Even then it might be hard; the lead CPU would probably have to have its writes ignored - e.g. what happens if the lead CPU executes a test-and-set (an atomic R-M-W operation on the 68000)? If the write is ignored, fine, the 2nd CPU will see the unchanged value and take the same path, but if not it will possibly take a different branch (since test-and-set is usually followed by a branch).
The lead CPU does all the memory accesses.  The 2nd CPU doesn't read live memory data, it gets the same data that the lead CPU saw (delayed by a FIFO).

I think that in some cases you have to decode what instruction caused the fault, and back out any pre-fault side effects.
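
A crude way to picture that data path (purely a hedged C model of the FIFO idea - the depth and interface are invented for illustration, and it has nothing to do with the real Sun/Apollo hardware):

Code:
/* The lead CPU touches live memory; every word it reads is also pushed
 * into a replay FIFO. The trailing CPU is fed from the FIFO instead of
 * live memory, so it always holds clean pre-fault state. No full/empty
 * handling shown - this is only a model. */
#include <stdint.h>

#define REPLAY_DEPTH 64u              /* "deep enough", per the discussion */

static uint16_t replay_fifo[REPLAY_DEPTH];
static unsigned wr_idx, rd_idx;

/* Called for every successful bus read performed by the lead CPU. */
void lead_cpu_read_observed(uint16_t data)
{
    replay_fifo[wr_idx++ % REPLAY_DEPTH] = data;
}

/* Called when the trailing CPU performs the "same" bus read later. */
uint16_t trailing_cpu_read(void)
{
    return replay_fifo[rd_idx++ % REPLAY_DEPTH];
}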

Quote
What would work, however is that one CPU executes the code and the other just handles page faults. The CPU which is executing code is just delayed - perhaps by simply not asserting DTACK until the memory operation can complete.
If that was how they did it, they wouldn't have needed an expensive 68000 CPU for the service processor, they could have used a Z80 or whatever.

However, synchronous page faults are so slow that it's not really worth building a system that way.
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
The lead CPU does all the memory accesses.  The 2nd CPU doesn't read live memory data, it gets the same data that the lead CPU saw (delayed by a FIFO).

OK, but the lead CPU, by definition, can't complete the faulting instruction so you have to be able to have either CPU in the lead CPU role.

And putting a buffer/FIFO in there does not sound simple

Quote
I think that in some cases you have to decode what instruction caused the fault, and back out any pre-fault side effects.
Which doesn't sound do-able in the general case.

Quote
Quote
What would work, however is that one CPU executes the code and the other just handles page faults. The CPU which is executing code is just delayed - perhaps by simply not asserting DTACK until the memory operation can complete.
If that was how they did it, they wouldn't have needed an expensive 68000 CPU for the service processor, they could have used a Z80 or whatever.
True but having two identical processors simplifies system design - both talk to memory the same way, you only need to write for one ISA etc.

I honestly remembered the story as two CPUs in lock-step but the more I think about it the more I think it is so much harder to do that than just having a system where one CPU does the paging while holding the other mid instruction. Occam's razor and all that.
 

Offline Bassman59

  • Super Contributor
  • ***
  • Posts: 2501
  • Country: us
  • Yes, I do this for a living
Anyway, Apollo machines were made by Apollo Inc., not Mentor Graphics as a poster mentioned. They were 68K-based, so they used two 68K cores executing the same code, and when one fell over due to memory paging, the other would check the contents of the stack so the whole machine could recover. The 68020 finally solved that by putting more machine context onto the fault stack so all instructions could be restarted.

At the bomb factory I worked for after college, we had a bunch of Apollo machines in an R&D lab. (Networked with token ring!) They ran Mentor Graphics Boardstation (I think!), and were used for analog board design and simulation. They were frightfully expensive (but, hey, bomb factory $$$). As I remember, yes, the hardware was made by Apollo, but the system, both software and hardware, was sold by Mentor.

It was a big deal when the machines were upgraded to 68040 motherboards. (Or maybe they were 030s? This was a long time ago.)

Quote
Apollo was bought by HP back in the days, around 1986 I believe.

It was later than 1986, because I started at the bomb factory in 1988, and the machines were Apollo, not HP-Apollo.
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
The lead CPU does all the memory accesses.  The 2nd CPU doesn't read live memory data, it gets the same data that the lead CPU saw (delayed by a FIFO).
OK, but the lead CPU, by definition, can't complete the faulting instruction so you have to be able to have either CPU in the lead CPU role.

Sure it can... after the fault is resolved, the lead CPU is restarted at the faulting instruction.
 

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2616
  • Country: 00
    • My random blog.
You can restart as long as you are able to fire the interrupt on the trailing CPU before the 'bad' instruction.
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us

At the bomb factory I worked for after college, we had a bunch of Apollo machines in an R&D lab. (Networked with token ring!) They ran Mentor Graphics Boardstation (I think!), and were used for analog board design and simulation. They were frightfully expensive (but, hey, bomb factory $$$). As I remember, yes, the hardware was made by Apollo, but the system, both software and hardware, was sold by Mentor.

It was a big deal when the machines were upgraded to 68040 motherboards. (Or maybe they were 030s? This was a long time ago.)

Quote
Apollo was bought by HP back in the days, around 1986 I believe.

It was later than 1986, because I started at the bomb factory in 1988, and the machines were Apollo, not HP-Apollo.

Cool. I worked for LTX, the test machine company (Left Teradyne at Xmas :-) ), but only for 9 months. A spinout on Rt. 128, doing a 360 on a snow day, was enough for me to say "THANK GOD I AM ALIVE. I AM QUITTING ASAP!" :-), and I went to work for Whitesmiths, the company that produced the first commercial C compiler outside of AT&T.

So... back to the OP, this superscalar 68K core is... dead? Too bad. It has a lot going for it. I mean the 68K architecture.

OTOH, even the mighty Motorola, back in the day, could not successfully bring out a 16/32-bit processor in the form of the M-CORE. It looked and sounded great, but everyone was/is scrambling to move to ARM...
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline DJohn

  • Regular Contributor
  • *
  • Posts: 103
  • Country: gb
I honestly remembered the story as two CPUs in lock-step but the more I think about it the more I think it is so much harder to do that than just having a system where one CPU does the paging while holding the other mid instruction. Occam's razor and all that.

Lock-step is the story I remember hearing, but it has to be urban legend.  It's too hard to make it work (if it's possible at all - what happens if there are two or more faults in the same instruction?), and the alternative is both simple and obvious.

The 68K will sit waiting for ever if you don't assert /DTACK.  When the MMU sees a request for a page that isn't in physical memory, all it has to do is interrupt the other processor, wait for the signal that the data is loaded (and the page tables updated), then tell the first processor to continue.  There's no need to restart any instructions.  The rest of the time, your second processor can be running useful code (anything that can't fault).
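
To make that concrete, here's how the glue logic's behaviour might look, written as a C behavioural model rather than HDL; every name is invented for illustration, it's not anyone's actual design:

Code:
/* Behavioural model of the /DTACK-stall scheme: on a page miss the
 * bus cycle is simply left unacknowledged while the second CPU pages
 * the data in, then the cycle completes as if nothing happened. */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical primitives provided by the MMU / glue logic. */
bool mmu_translate(uint32_t logical, uint32_t *physical);  /* false = page miss      */
void interrupt_pager_cpu(uint32_t logical);                /* second 68K pages it in */
bool pager_done(void);
void assert_dtack(void);

/* One bus cycle from the executing CPU, as seen by the glue logic. */
void bus_cycle(uint32_t logical_addr)
{
    uint32_t phys;

    while (!mmu_translate(logical_addr, &phys)) {
        /* /DTACK withheld: the 68000 just inserts wait states here,
         * mid-cycle but with the clock still running. */
        interrupt_pager_cpu(logical_addr);
        while (!pager_done())
            ;                        /* wait for the pager CPU */
    }
    assert_dtack();                  /* memory cycle now completes normally */
}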
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
The 2nd CPU doesn't read live memory data, it gets the same data that the lead CPU saw (delayed by a FIFO).
Not being funny but how does the FIFO know how deep to be?

I'm not overly familiar with 68k assembler (having done all my coding on such systems in C) but AFAICS a 68K instruction can do multiple reads - certainly up to three if both the source and destination operands live in memory. So if we could somehow lock-step instructions we would need to buffer a variable number of reads.
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
The 2nd CPU doesn't read live memory data, it gets the same data that the lead CPU saw (delayed by a FIFO).
Not being funny but how does the FIFO know how deep to be?

I don't think it matters as long as it's deep enough.

I should probably stop trying to remember this stuff though, it's just been too long and it makes my head hurt :(
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
The 68K will sit waiting for ever if you don't assert /DTACK.  When the MMU sees a request for a page that isn't in physical memory, all it has to do is interrupt the other processor, wait for the signal that the data is loaded (and the page tables updated), then tell the first processor to continue.  There's no need to restart any instructions.  The rest of the time, your second processor can be running useful code (anything that can't fault).

This is a common misconception about demand paging systems... in practice, they are never built this way.  Demand paging only gives a benefit if you can overlap other processing with paging.
 

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2616
  • Country: 00
    • My random blog.
The 68K will sit waiting for ever if you don't assert /DTACK.  When the MMU sees a request for a page that isn't in physical memory, all it has to do is interrupt the other processor, wait for the signal that the data is loaded (and the page tables updated), then tell the first processor to continue.  There's no need to restart any instructions.  The rest of the time, your second processor can be running useful code (anything that can't fault).

You are in the middle of an instruction; AFAIK you can't hold the CPU there (maybe the static variants can?), so the load will fail and corrupt state.
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2905
  • Country: gb
Quote
you are in the middle of instruction, afaik you cant hold cpu there (maybe can static variants?), load will fail and corrupt
It should be fine - by not asserting DTACK the CPU will insert wait states, so it isn't halted or being held without a running clock. It's just waiting for the memory operation to complete.
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
Going back to the superscalar 68000 topic... if someone (else) wants to (also) undertake this, one possibility would be to make a code-morphing hybrid core/software package. That would also be quite an undertaking... I'm tempted to do exactly that, but for the simpler (though not by much) 6809, just as an exercise, because I think there is not really much point wasting time on something this old and little used... or is there?
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
code-morphing hybrid core/software package

can you explain it ?
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
You have a piece of software that converts a stream of 68k instructions into a stream of instructions for your super-fast-but-really-simple processor. Your simple, super-fast core can natively do in a few instructions what the 68k does in one, but faster :).

There are a couple of possibilities regarding, for instance, flag calculation: the flags don't need to be calculated every time if they are not used (see the sketch below). Such a system has a bit of latency, but once it gets going the throughput should be good...
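
Something like this, say (just a hedged C sketch of the lazy-flags trick; the structure and names are invented, not lifted from any real translator):

Code:
/* Translated guest ADD records its operands but computes no flags;
 * the condition codes are only materialised if a branch asks for them. */
#include <stdint.h>
#include <stdbool.h>

static struct {
    uint32_t src, dst, result;
    enum { OP_ADD, OP_SUB } op;
} last_alu;

/* Host-side helper emitted for a guest ADD instruction. */
uint32_t emu_add(uint32_t dst, uint32_t src)
{
    last_alu.src = src;
    last_alu.dst = dst;
    last_alu.result = dst + src;
    last_alu.op = OP_ADD;
    return last_alu.result;          /* note: no flag computation here */
}

/* Only called when a translated Bcc/Scc actually needs the flags. */
bool flag_zero(void)  { return last_alu.result == 0; }
bool flag_neg(void)   { return (int32_t)last_alu.result < 0; }
bool flag_carry(void)
{
    return (last_alu.op == OP_ADD) ? last_alu.result < last_alu.dst
                                   : last_alu.dst    < last_alu.src;
}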
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
You have a piece of software that converts a stream of 68k instructions into a stream of instructions for your super-fast-but-really-simple processor. Your simple, super-fast core can natively do in a few instructions what the 68k does in one, but faster :).

There are a couple of possibilities regarding, for instance, flag calculation: the flags don't need to be calculated every time if they are not used. Such a system has a bit of latency, but once it gets going the throughput should be good...

x86 has done this since the late 90s. IMHO it's not worth it for any new design - there isn't a whole lot of existing 68K code anymore (except for the old Amiga, Mac, ST, etc.), so if you are going to do a fast 68K, then do a fast 68K. If you want to do a fast RISC, do a fast RISC.

Transmeta tried that too, RIP
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline IanP

  • Newbie
  • Posts: 4
  • Country: gb
Just a little update on the progress of the Apollo core and Vampire 2. The Vampire 2 production board for the Amiga 600 began shipping 1 month ago (18th January 2016) to those that pre-ordered it. The boards are shipping with the Silver 1 version of the core, described as "stable and fast" but not guaranteed 100% bug free. The Silver 2 version is expected to be released in the next few days. Vampire 2 users will be able to update to the Silver 2 core using an easy software procedure on the Amiga to field-program the new core (no need to disassemble the computer). Silver 2 will fix some issues and likely boost performance a little.

Next month is the target for production of the Vampire 2 for the Amiga 500 to begin, pending successful tests of the prototype boards. The specifications for the Amiga 500 version of the Vampire 2 are the same as the Amiga 600 version, apart from the addition of a 44-pin IDE connector (like on the Amiga 600 and A1200 motherboards). Although targeted at the popular A500/A500+, this version of the Vampire 2 should also work in the original A1000, the A2000 and its variants, and possibly the CDTV. It is hoped that the Gold version of the Apollo core (including new features) will be ready in time to be installed prior to the first shipments of the Vampire 500 V2.

Modern Vintage Gamer reviewed and tested the Vampire 600 V2 with the Silver 1 Apollo core.

Lots more information available on the Apollo forums http://www.apollo-core.com/knowledge.php?b=0 where you'll find links to more videos and discussions about the Apollo core and the Vampire boards.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
interesting  :D
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
IMPRESSIVE!!!
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Offline sleary78

  • Supporter
  • ****
  • Posts: 43
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #98 on: September 02, 2016, 01:13:43 pm »
So is the Apollo core available as verilog/VHDL or is it just an unlicensed rip off of the 68K series?
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #99 on: September 02, 2016, 02:56:17 pm »
So is the Apollo core available as verilog/VHDL or is it just an unlicensed rip off of the 68K series?

What a weird question.

It's a softcore, but it doesn't seem to be available unless you buy the "Vampire" board.

Why would you call it a "rip off"?  Since when do you need a license to build a CPU emulator?  What kind of license are you even talking about?
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #100 on: September 02, 2016, 03:23:44 pm »
It's a softcore, but it doesn't seem to be available unless you buy the "Vampire" board.

Vampire is a fork, a branch, of the original project, and buying a board doesn't get you the sources.
It's a commercial, private project; be aware of that.
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #101 on: September 02, 2016, 03:37:35 pm »
It's a softcore, but it doesn't seem to be available unless you buy the "Vampire" board.

Vampire is a fork, a branch, of the original project, and buying a board doesn't get you the sources.
It's a commercial, private project; be aware of that.

Is the original project source available?
 

Offline sleary78

  • Supporter
  • ****
  • Posts: 43
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #102 on: September 02, 2016, 04:46:43 pm »
So is the Apollo core available as verilog/VHDL or is it just an unlicensed rip off of the 68K series?

What a weird question.

It's a softcore, but it doesn't seem to be available unless you buy the "Vampire" board.

Why would you call it a "rip off"?  Since when do you need a license to build a CPU emulator?  What kind of license are you even talking about?

Weird how? I've worked on many HDL CPU cores... this is just another one, but closed source.

A soft core? If it's implemented in HDL then it's not what I'd call a soft core. HDL can be fabbed into ASICs. If it's some sort of RISC CPU running a software emulation, then fair enough.

ARM Ltd will threaten to sue you if you release ARM HDL cores/clones. Many of them have been taken off GitHub and, in the past, Google Code. I expect Freescale/NXP would have a similar policy.

IMHO this is a 680x0 clone, since it can be delivered in hardware form and can replace a real 68K. Hence it's ripping off a commercially available chip. It's also probably breaking the licensing owned by whoever has the hardware IP from Amiga Inc if it's shipping an AGA chipset clone too. So it's a rip-off on two fronts.

 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #103 on: September 02, 2016, 05:12:01 pm »
A soft core? If its implemented in HDL then its not what i'd call a soft core.
OK, but be aware that you have a different definition of soft core than everyone else.

Quote
ARM Ltd will threaten to sue you if you release ARM HDL cores/clones. Many of them have been taken off GitHub and, in the past, Google Code.
That's because of their (stupid) patents.  It doesn't apply to 68K.

Quote
IMHO this is a 680x0 clone, since it can be delivered in hardware form and can replace a real 68K. Hence it's ripping off a commercially available chip.
What 68K CPU is commercially available?

Anyway, why is releasing a compatible CPU a ripoff?  Are all the 8051 clones ripoffs?
 

Offline sleary78

  • Supporter
  • ****
  • Posts: 43
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #104 on: September 02, 2016, 06:05:29 pm »
A soft core? If its implemented in HDL then its not what i'd call a soft core.
OK, but be aware that you have a different definition of soft core than everyone else.

I think we may be talking at cross purposes. The binary that's getting flashed into the Vampire II board is a softcore. However, the HDL sources (if they exist) aren't. They're something you can create a physical ASIC from. This is what I am referring to. The topic title refers to the Apollo core, not the Vampire core. There is a difference, albeit a subtle one.

Quote
ARM Ltd will threaten to sue you if you release ARM HDL cores/clones. Many of them have been taken off github and in the past google code.
That's because of their (stupid) patents.  It doesn't apply to 68K.

OK, that's probably true, since the 68K is used in military applications and needs to be available from more than one source. I'd still be surprised if there were no patents involved.

I know I got nervous about releasing my Archimedes core with an updated Amber ARM core in it. The patent on the ARM2 had expired, but I was still nervous.

Quote
IMHO this is a 680x0 clone since it can be delivered in hardware form and can replace a real 68K. Hence its ripping off a commercially available chip.
What 68K CPU is commercially available?

Anyway, why is releasing a compatible CPU a ripoff?  Are all the 8051 clones ripoffs?

Ripoff in this case implies clone. Not necessarily illegal or negative.

My issue with this project is that the HDL 68Ks out there so far are mostly GPL, and I'm sceptical about a closed-source core that may be building on GPL'd work. If it were at least open source (but commercial) I'd give it less of a hard time.

 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #105 on: September 02, 2016, 06:17:55 pm »
Is the original project source available?

last time I checked: - no -
and I am afraid it's still - no -

you can contact the ex-NATAMI team (Apollo-team)
and ask them for their sources
they might agree IF you have a valid purpose

e.g. because you are willing to develop a new-amiga platform
or to improve/test their core
or something similar

but keep in mind their code is NOT opensource

don't ask me why, I am not affiliated with them
I had some interest in supporting the 68K
so I started to look around and came across their project

currently I am definitely not interested in 68k
since I am busy with my own toy-RISC project
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #106 on: September 02, 2016, 06:22:16 pm »
What 68K CPU is commercially available?
AFAIK the M68SEC000 is still manufactured, as the last of its kind.

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #107 on: September 02, 2016, 06:25:08 pm »
What 68K CPU is commercially available?

what I know:

68SEC000 at 3.3V, very good for FPGAs, used by Texas Instruments
to build their TI-89/TI-92/Voyage 200 CAS calculators
(even if now replaced by the Nspire calculators, which are ARM-based)

68332 at 5V, used in military applications
and in some automotive applications, e.g. Ford Racing,
mainly due to legacy code and due to the TPU unit
 

Offline sleary78

  • Supporter
  • ****
  • Posts: 43
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #108 on: September 02, 2016, 08:38:31 pm »
Is the original project source available?

last time I checked: - no -
and I am afraid it's still - no -

you can contact the ex-NATAMI team (Apollo-team)
and ask them for their sources
they might agree IF you have a valid purpose

e.g. because you are willing to develop a new-amiga platform
or to improve/test their core
or something similar


My interest would be getting the core running on the MiST platform, which is probably not in the interest of the Apollo Team. Sigh.

 

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2616
  • Country: 00
    • My random blog.
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #109 on: September 02, 2016, 09:44:43 pm »
Is the original project source available?

No; last time I checked (~2 years ago?) the authors were still deluding themselves into thinking they had something commercially viable on their hands, and were trying to peddle this 30-year-old technology to IBM of all places (claiming it could scale to >1 GHz in an ASIC and be competitive against PowerPC).
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 

Offline Tomorokoshi

  • Super Contributor
  • ***
  • Posts: 1212
  • Country: us
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #110 on: September 03, 2016, 02:01:02 am »
I turbocharged an Amiga 1000 by replacing the 68000 with a 68010 from an Apollo workstation.

For the most part it worked. Needed to run the "vbr" program to set the Vector Base Register. Some of the games and demos that did particularly egregious hardware banging had trouble running. Otherwise there was a nice speed improvement.
 

Offline edavid

  • Super Contributor
  • ***
  • Posts: 3383
  • Country: us
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #111 on: September 03, 2016, 02:09:00 am »
I turbocharged an Amiga 1000 by replacing the 68000 with a 68010 from an Apollo workstation.

For the most part it worked. Needed to run the "vbr" program to set the Vector Base Register. Some of the games and demos that did particularly egregious hardware banging had trouble running. Otherwise there was a nice speed improvement.

Why was it faster?  The 68010 is not any faster than the 68000 except for "loop mode", which hardly makes a difference.
 

Offline Tomorokoshi

  • Super Contributor
  • ***
  • Posts: 1212
  • Country: us
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #112 on: September 03, 2016, 02:53:11 am »
I turbocharged an Amiga 1000 by replacing the 68000 with a 68010 from an Apollo workstation.

For the most part it worked. Needed to run the "vbr" program to set the Vector Base Register. Some of the games and demos that did particularly egregious hardware banging had trouble running. Otherwise there was a nice speed improvement.

Why was it faster?  The 68010 is not any faster than the 68000 except for "loop mode", which hardly makes a difference.

If I recall, some of the instructions were optimized to use fewer clock cycles. The number I remember from the time is about a 10% improvement.

I found out about it from either newsgroups or magazines; besides, otherwise there wouldn't have been a need for the "VBR" program.
 

Offline sleary78

  • Supporter
  • ****
  • Posts: 43
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #113 on: September 03, 2016, 10:54:29 am »
I've just sent off for some boards to be made up of an 020 accelerator that fits in a 68K socket. Mostly i'm wanting to remove the need for PALs and move to a CPLD design. So its proof of concept work rather than production.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #114 on: September 24, 2016, 11:20:38 am »
Quote
Hi All,

i want to let everyone know in more detail why pricing has gone up, There was no possible way that i could get it done, I have spent hours locked up in my room trying to get these completed, as well, i have fallen way behind in other orders as people can testify and this is having an impact wit my sanity and patience. The amount of work and co-ordination required to get this all done is probably harder than what most people think. Add to this the constant flaming about people complaining that this should be handed over to a board house to get this done, and now we have people are complaining about the increased price.

Whenever you import parts bit by bit you can sometimes get them in under the radar, whenever you import expensive items the customs people rub their hands with glee and will gladly accept visa and your first born. Some people are cancelling their orders now and TBH i can understand. When we started this project and decided on the price we wanted to make this available to the masses and expected a small demand for the A600. I will never believe people who say they dont like the A600 as it appears there are at least 1000 active users of the A600. we were naive to think we could get this done at a lower price and people were excited to see this great product at a basement price, (Which it was). Once we made the decision to get the boards done professionally (at the request of a lot of people) we have now offended people at the other end of the spectrum. I have been receiving hate mail from people who are cancelling their orders and being accused of gouging, and multiple other reasons that we have used to increase the price.

I always said when it is no longer fun what i am doing then it is time to stop, So i will stop, Once the Vampire/Phoenix project is complete then it will be time to me to sell off existing bits and pieces and disappear into the woodwork. I am getting too old for this stress,

Thanks to all who supported me and to the rest, goodbye.


to the above poster, show me the $16 fpga we use from digikey ?

unfortunately the above drama happened here  :palm: :palm: :palm: :palm:
 

Offline CJay

  • Super Contributor
  • ***
  • Posts: 4136
  • Country: gb
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #115 on: September 27, 2016, 12:56:24 pm »
I turbocharged an Amiga 1000 by replacing the 68000 with a 68010 from an Apollo workstation.

For the most part it worked. Needed to run the "vbr" program to set the Vector Base Register. Some of the games and demos that did particularly egregious hardware banging had trouble running. Otherwise there was a nice speed improvement.

Why was it faster?  The 68010 is not any faster than the 68000 except for "loop mode", which hardly makes a difference.

If I recall, some of the instructions were optimized to use fewer clock cycles. The number I remember from the time is about a 10% improvement.

I found out about it from either newsgroups or magazines; besides, otherwise there wouldn't have been a need for the "VBR" program.

That, and it was easier to get 16 MHz 68010 chips IIRC; you could then overclock it as well.
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #116 on: September 28, 2016, 07:03:43 am »
The code for the Vampire board is available; the version I saw had a TG68K core inside.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #117 on: September 28, 2016, 11:56:45 am »
The code for the Vampire board is available; the version I saw had a TG68K core inside.

where?
the TG68k is not the Apollo-core

 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #118 on: September 28, 2016, 12:42:32 pm »
oh, I see a new commit of their web page  :o :o :o
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #119 on: September 28, 2016, 12:43:55 pm »
Quote
Info: About recent events
Amiga FPGA accelerator

As you probably know participation in Apollo-team project was 90Eur, later on price went to 120Eur because design change. For that price you were getting Vampire 600 V2 accelerator card and opportunity to have fastest and most compatible Amiga accelerator ever produced, Apollo-core ported to Vampire card with unique features and with more than 100MIPS with latest published cores. From initial core release we went up for more than 20MIPS with adding more and more instructions, more compatible video output drivers and support for MicroSD. Everyone got opportunity to participate in this amazing project seeing it as a next best thing after minimig. Lot of people realized that we inside Apollo-team could bring something to the market, beyond wildest dreams of Amiga enthusiasts who waited for over 20 years for something like this to happen. As a person who worked on hardware design for last 6 years I was under constant pressure helping the team best I could in various areas with my modest knowledge. At the time when money was needed Amiga community recognized potential of this project and placed first pre-orders helping a lot financially. Without those people none of this would happen and that's the truth. They were helping me also in other things, getting parts, equipment. There was so many energy in the air send by them. That gave me the strength to work constantly more than 10 hours per day, every single day. For two years I didn't go to fishing properly or did something else like spending some time with the family.Don't know, just go somewhere with them. I didn't find time to do any other things, this project took it all. Project was frustrating, from the point that design needed change, five times, to the fact that in process of testings I had to buy parts who are now forgotten in those changes.

When you solder by hand about 3200 0402 capacitors and loads of other parts sacrificing in front of everything my health and then when you see that someone else is making money from your hard work then you start to wonder what are you doing wrong. From the start money wasn't my motivation. If that was the case I could sell this design number of times or I could increase the price more. Why solder by hand you may wonder. Because that was the only option to keep prices low and affordable to everyone, at least it was me who was talking constantly that no one should benefit from nostalgic feelings of retro computers enthusiasts. That's why I have opensourced first version of the Vampire 600 so anyone can continue work on it. If this was about the money then I wouldn't end up in the situation where I need to find money for future projects. I was forcing kipper2k to keep prices of the accelerators low with minimum profit or close to zero and each time one FPGA dies he is left with loses. And yes those things happen, I have loads of overheated or ESD damaged parts.

Lot of questions are in my mind lately, could we get even better results if we had money to pay someone for the drivers, demos, for the test cases, instead everyone inside team was working for free. With recent Intel purchases prices of the parts went even higher, yet we kept prices down because we had stock of the parts ordered before those Intel's takeovers. Same time people started to put their Vampire 600V2 cards on eBay selling it for anywhere between 300-800Eur. You may say that's not illegal and they can do whatever they want with their cards but please be in my position to work on this so hard and still don't have money for tooling fees needed for card edge connectors. Some people are adding themselves to our waiting list because pure profit they can make by selling the card five times more than they paid for. I was at the edge to block those cards from future core uploads but I didn't do it and I won't.

Since I have enough parts only to cover initial pre-orders placed on Amibay without need for more funds and because of recent eBay events I m forced to increase the price anywhere between 230-250Eur for various models. Without this decision we are dead in the water and we will hardly find money to finance Vampire 1200. I was idealist, thinking that I should play fair and give anyone opportunity to get this piece of hardware not respecting my time or energy and knowledge rest of the team put into this. I was an idiot thinking that everyone will keep their cards and follow our progress, they didn't even bother to try latest cores, instead they went for quick money. In the darkest corner of my mind I couldn't predict that someone will sell the card instantly when he gets it... Thank you eBay sellers for opening my eyes. World turns when money talks. Finally I learned that.

( link )
« Last Edit: September 28, 2016, 12:51:57 pm by legacy »
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #120 on: September 28, 2016, 12:59:57 pm »
2015, Vampire V2, Gunnar von Boehn
(Apollo Team Member)
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #121 on: September 28, 2016, 01:03:36 pm »
Quote
If you compare Coldfire V4 and Apollo then you see that:
1) Apollo can execute twice as many instructions per clock cycle.
2) Apollo can read/write twice as much data from Caches.
3) Apollo can process per instruction 64bit - so it can do twice as much work per instruction.
4) Apollo has automatic memory stream detection, and automatic hardware prefetching.
 
Each routine is different of course.
But for new code in general Apollo has the potential to be 4 times faster than Coldfire V4 at the same clock.
 
 
 
For old 68k Legacy code the situation is different.
As you know Coldfire does unfortunately _NOT_ support many 68K instruction and data sizes.
For example BYTE or WORD operations are not supported by Coldfire.
 
Coldfire will execute an exception when doing many 68K instructions. Executing such an exception takes many ~100 cycles.
 
We evaluated Coldfire V4 system during the NATAMI time and were disappointed with the legacy code speed.
 
I would assume for legacy 68k code, APOLLO is in the order of 100 times faster than a Coldfire core.

this may be interesting (written by Gunnar von Boehn)  :popcorn:
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #122 on: September 28, 2016, 01:11:24 pm »
Quote
Quote
And Gunnar, if you allready have any plans for the future regarding commercial usage of your Apollo core. Do you plan to sell only your own hardware using Apollo core or do you have plans to sell the core also to other developers and how?

Sure, licensing the APOLLO Core is certainly possible.

(written by Gunnar von Boehn)
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #123 on: September 28, 2016, 02:51:13 pm »
Quote
September 28, 2016 at 14:40:50

message: hi
is it possible to see the HDL-code used in the Vampire accelerators ?
for personal use

let me know
Carlo

Quote
Dear Carlo,

Thanks for your email.

Apollo Core is closed source and we can't provide VHDL code.

Best Regards

Renaud


I just asked the team directly  :-//
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #124 on: September 29, 2016, 08:39:28 am »
Yes, they have a new core, but here you can see that they used the TG68K:

http://www.majsta.com/modules.php?name=News&file=article&sid=77

Those are the sources, version 0.1, that I saw.

Here you can read about some happy owners :) : http://www.amiga.org/forums/showthread.php?t=70424&page=3
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Superscalar 68000, have you seen the Apollo core ? What do you think about ?
« Reply #125 on: September 29, 2016, 10:08:02 am »
Yes, they have a new core, but

but the TG68k is slower (by orders of magnitude) than the Apollo-core
and it's NOT what is used in the Vampire as a *product*
it's a personal choice that owners can make by reloading their FPGA's bitstream

it's like talking about TI calculators, sold by TI with a proprietary TI-OS,
but people can reload their own calculators with Linux (which sucks great)


 

