Author Topic: Mill CPU Architecture  (Read 35668 times)

0 Members and 1 Guest are viewing this topic.

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Mill CPU Architecture
« Reply #25 on: February 22, 2016, 10:52:33 pm »
The Harvard Architecture on the other hand is a marked improvement over the von Neumann in terms of security.

A bit of an overgeneralization? Most von Neumann CPUs that are able to run an OS with virtual memory implement a lot of the "obvious" security enhancements using MMU features (read-only pages, no-execute flags...).
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Bruce Abbott

  • Frequent Contributor
  • **
  • Posts: 627
  • Country: nz
    • Bruce Abbott's R/C Models and Electronics
Re: Mill CPU Architecture
« Reply #26 on: February 22, 2016, 11:22:37 pm »
Quote
Now you are talking about a computer system, not the CPU.
If you insist on making that distinction... On my part I don't know how to take an existing CPU and design a different architecture around it, so the point of the exercise  is lost on me.
Because the design of the system has a huge impact on security - far more than what memory map the CPU uses.

Quote
The Red Pill Attack is an attack first demonstrated in 2006 by Joanna Rutkowska where a complete system running on the bare hardware is transferred to a virtual environment without the system realizing it. It renders memory protection completely irrelevant.
Any system can be compromised if you are able to emulate it exactly - but that doesn't make memory protection a hack.

You know what are hacks? Cache RAM, pipelining, out-of-order execution, branch prediction, superscalar execution. All hacks that were added to make up for memory not being as fast as we want.

Quote
This is only one in a long list of attacks that defeat memory protection. All in all, leaving such a fundamental security task to the operating system is foolish, a hack at best, and completely inexplicable from a security standpoint.
The primary purpose of memory protection is to prevent an errant program from trashing memory that doesn't belong to it. How easily it can be defeated depends on how it is implemented. If it is controlled by an insecure operating system then that is the fault of the operating system, not memory protection.

Quote
This is like leaving prison security to the filing clerk. It would work 100% if all prisoners were honest...
Most prisons have high walls topped with barbed wire.  But prisoners will still find a way over them if not stopped - proving that the walls are a foolish hack...
 

Offline obiwanjacobiTopic starter

  • Frequent Contributor
  • **
  • Posts: 988
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: Mill CPU Architecture
« Reply #27 on: February 26, 2016, 06:34:53 am »
Here is a video on Mill security. Again, I am no expert, but it did sound clever to me  ::)

Arduino Template Library | Zalt Z80 Computer
Wrong code should not compile!
 

Offline Mechanical Menace

  • Super Contributor
  • ***
  • Posts: 1288
  • Country: gb
Re: Mill CPU Architecture
« Reply #28 on: February 26, 2016, 10:57:26 am »
The Harvard Architecture on the other hand is a marked improvement over the von Neumann in terms of security.

Implementing security is always a balancing act with keeping something usable. How do you implement self-modifying code on a pure Harvard architecture machine? Or load a program when you can't treat code as data? The answer? Modified Harvard*, which comes with all the same security problems as von Neumann. Pure Harvard is just terrible for a general purpose computer.

*Most of the time. Some MCUs and DSPs just have special load/save program memory instructions, but they aren't general purpose.
Second sexiest ugly bloke on the forum.
"Don't believe every quote you read on the internet, because I totally didn't say that."
~Albert Einstein
 

Offline Schol-R-LEA

  • Newbie
  • Posts: 2
  • Country: us
Re: Mill CPU Architecture
« Reply #29 on: May 09, 2017, 05:10:52 am »
Do you have a reference for that?
I watched all of his lectures, and that statement is based purely on my understanding of the architecture. It is possible, of course, that they have thought this through, but I feel like they really did not and interrupts will be implemented as a hack with a separate belt, which will mean very inefficient communication between the main belt and interrupt belt.

I apologize for the very, very late response, but I thought that a bit of thread necromancy was called for, as I don't know if you ever found the answer to this matter.

Note that I am myself still unsure if the Mill is ever going to be a working system, or if it will even come close to the promoted performance if it does (probably not), but frankly (as Godard says himself many times) these things are only the bare minimum of a successful product in any case. Circumstance, happenstance, and marketing are far more important no matter what, for any product.

That having been said, you are correct in that it would use a second belt for interrupts, as is explained in the second of Godard's lectures. Indeed, this is the case for any procedure call - in the Mill design, an interrupt is simply a procedure call initiated externally.

However, this is a misleading answer, and the way it is misleading is directly related to the question you raise. The question presupposes that the belt is a single fixed sequence of registers treated as a FIFO queue - which is the way it appears in the programming model, but is not, in fact, the way it is implemented in any of the planned designs.

Note that I said 'designs'. Godard makes this point repeatedly in the videos: planned Mill CPUs would be a family, not in the sense that the x86 is - a single binary execution model and a single basic hardware implementation (which might change over time, but would be the same for a given generation) - but a family in the sense that the System/360 was: a single programming model and a (mostly) common assembly language for compilers to target, but with different concrete instruction sets (they intend to use the same kind of 'assembly specializer' IBM used, in fact) and hardware implementations that could be radically different.

He gives one example approach to implementing the belt, one which he says they do mean to use in some models but which would not be universal. The layout he described is a large anonymous register file and a pointer to the current head of the belt; the belt would be operated on as a ring buffer. As elements are added to the belt, the belt head (which is entirely inaccessible to the program, being part of the CPU's internal state) would advance, and the results would be added to the top of the belt while the bottom of the belt is cleared (to prevent insecure peeking, though that really shouldn't be possible anyway unless there is a flaw in the hardware - there is no way to access the parts of the belt register file outside the section currently in use by the belt).
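The ring-buffer scheme described above can be sketched in software. This is purely illustrative: the file size, the belt length, and every name here are my assumptions for the sake of the sketch, not anything from a Mill specification.

```python
# Hypothetical sketch of a belt implemented over an anonymous register file.
# FILE_SIZE and BELT_LEN are invented numbers, not real Mill parameters.
FILE_SIZE = 32   # anonymous register file (assumed size)
BELT_LEN = 8     # architecturally visible belt length (assumed)

class Belt:
    def __init__(self):
        self.regs = [None] * FILE_SIZE  # anonymous register file
        self.head = 0                   # CPU-internal state, never program-visible

    def drop(self, value):
        """An operation result 'drops' onto the front of the belt."""
        self.head = (self.head + 1) % FILE_SIZE
        self.regs[self.head] = value
        # Clear the slot falling off the back, so stale data cannot be peeked.
        self.regs[(self.head - BELT_LEN) % FILE_SIZE] = None

    def read(self, pos):
        """Belt position 0 is the newest drop; older values sit further back."""
        assert 0 <= pos < BELT_LEN
        return self.regs[(self.head - pos) % FILE_SIZE]

b = Belt()
b.drop(10); b.drop(20); b.drop(30)
print(b.read(0), b.read(2))  # 30 10 - newest first, relative addressing
```

The point of the sketch is that programs only ever address positions relative to the head; the head pointer and the unused portion of the register file stay hidden, matching the "no insecure peeking" property described above.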

There is a separate, sequestered stack for procedure return addresses. Procedure arguments are to be placed onto the belt; any values that still need to be kept by the function but would go off of the belt would be explicitly spilled to a scratchpad area (which is the case for data that would fall off the belt before it is 'dead' in general, not something specific to procedures). If a nested procedure (assuming that you are using them, which in turn means it depends on your using a language which supports nested procedures with lexical scope) needs to use a value in the caller's scope, it is up to the compiler to keep track of where it is on the belt (or in the scratchpad).

When a procedure call occurs, the belt head is advanced one 'belt length' past the current procedure's belt. When the procedure exits - well, I am not entirely sure if I've got this right, but my understanding is that the compiler is meant to arrange the return values so that they are at the end of the belt, and the previous procedure's values are then copied to the remainder of the belt in hardware (and the residual values in the previous belt cleared) as part of the return op. The important thing here is that the return values are spliced to the values at the top of the caller's belt.
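The call/return behavior described in the last two paragraphs can be modeled as a toy. The frame handling, names, and splice order here are my reading of the description above, not confirmed Mill behavior.

```python
# Illustrative-only model of belt call/return splicing: each call gets a fresh
# belt window holding its arguments; on return, the callee's chosen results are
# spliced onto the front of the caller's restored belt.
class BeltMachine:
    def __init__(self, belt_len=8):
        self.belt_len = belt_len
        self.frames = [[]]          # one belt per frame, newest value first

    def drop(self, value):
        belt = self.frames[-1]
        belt.insert(0, value)       # new result lands at belt position 0
        del belt[self.belt_len:]    # older values fall off the end

    def call(self, args):
        # Callee starts with a fresh belt holding only its arguments.
        self.frames.append(list(args))

    def ret(self, result_positions):
        callee = self.frames.pop()  # caller's belt is restored automatically
        results = [callee[p] for p in result_positions]
        for r in reversed(results):
            self.drop(r)            # splice results onto the caller's belt

m = BeltMachine()
m.drop(7)                     # caller computes a value
m.call([m.frames[-1][0]])     # pass belt position 0 as the argument
m.drop(m.frames[-1][0] * 2)   # callee: double its argument
m.ret([0])                    # return the callee's newest value
print(m.frames[-1])           # [14, 7] - result spliced ahead of caller's 7
```

Note that in this toy the caller's belt survives the call untouched, which is the property that makes an externally initiated "interrupt as procedure call" cheap in the model described above.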

A later video talks about how the sequencing of execution is used in this to chain successive calls, and thus provide hardware tail call optimization, but that's another issue entirely. Presumably, in this design, if the call depth exceeds the belt register file and wraps, the overwritten part  would be invisibly spilled to the scratchpad or something similar.

Interrupts are essentially just procedure calls in this model. When the interrupt occurs, the interrupt routine is looked up in a vector table as usual; however, I get the impression from the talks and the Mill Computing wiki that the vector table is yet another sequestered register file like the return address stack and the parts of the belt register file not currently in use, and that setting a vector requires a specific instruction. Since a hard interrupt wouldn't have a return value (I get the impression that there are no soft interrupts, and even if there are, they would again behave like a regular procedure), the merge step can be skipped, but this would be true of a void procedure anyway.

Now, I don't know enough about how this would actually perform, so I can't speak to its plausibility as a working model, but that should at least answer the question according to the available information.
« Last Edit: May 09, 2017, 05:26:11 am by Schol-R-LEA »
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #30 on: May 09, 2017, 05:51:38 am »
The primary problem with all those inaccessible states is that virtualization and threading are pretty much not going to work. Even if you make all those things accessible through some special registers, it would take a lot of time to save context. This is exactly the same problem as with processors with a large explicit register file (like SPARC).
Alex
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #31 on: May 09, 2017, 07:47:30 am »
The primary problem with all those inaccessible states is that virtualization and threading are pretty much not going to work. Even if you make all those things accessible through some special registers, it would take a lot of time to save context. This is exactly the same problem as with processors with a large explicit register file (like SPARC).

I've asked Godard that, and he has privately responded to me indicating why there isn't a vast amount of state to be saved during context switches.

One thing to consider with conventional out-of-order processors is that while they don't need to save some in-flight computational state, they do need to re-compute it when they restart.

Overall my suspicion is that the patents will be more influential and financially important than the implementations, but I hope I am wrong.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #32 on: May 09, 2017, 07:54:55 am »
there isn't a vast amount of state to be saved during context switches.
Don't you actually need to save the whole belt contents?
No matter how ingenious your system is, you need to save the whole state; there is no working around that. I find that hand waving hard to believe without some evidence.

One thing to consider with conventional out-of-order processors is that while they don't need to save some in-flight computational state, they do need to re-compute it when they restart.
It does not matter; the exact same thing will happen here. Things on the belt are state that has been committed. There is no recomputing that.
Alex
 

Offline obiwanjacobiTopic starter

  • Frequent Contributor
  • **
  • Posts: 988
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: Mill CPU Architecture
« Reply #33 on: May 09, 2017, 11:27:09 am »
You can only throw away state that is still in the pipeline - just like with a branch. (at least that is how I understand it for classic pipelined CPUs)
Arduino Template Library | Zalt Z80 Computer
Wrong code should not compile!
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #34 on: May 09, 2017, 11:47:37 am »
there isn't a vast amount of state to be saved during context switches.
Don't you actually need to save the whole belt contents?
No matter how ingenious your system is, you need to save the whole state; there is no working around that. I find that hand waving hard to believe without some evidence.

Clearly not; for example you don't have to save the state in any of the many caches in a modern high-speed processor!

Quote
One thing to consider with conventional out-of-order processors is that while they don't need to save some in-flight computational state, they do need to re-compute it when they restart.
It does not matter; the exact same thing will happen here. Things on the belt are state that has been committed. There is no recomputing that.

In this context the belt is roughly equivalent to the architected registers in a conventional processor.  They are only a small part of the state in a processor; much of the state is "hidden" from the ABI, changes between processor implementations, and is usually undocumented. The claim is that that state is much smaller, so context switches can be faster.

N.B. I have only been watching the Mill from a distance. All I know is that it is radical, interesting, and has apparently avoided the blind alleys found in conventional architectures.

If you want more information, I suggest you look at comp.arch, where Ivan Godard frequently discusses and explains the Mill. He also frequently says "that patent is not yet filed"!
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline Schol-R-LEA

  • Newbie
  • Posts: 2
  • Country: us
Re: Mill CPU Architecture
« Reply #35 on: May 09, 2017, 01:29:32 pm »
This is both a strength and weakness of the whole 'family' narrative, both regarding the design itself and regarding the promotion of the project.

Keeping the ABI and API relatively abstract eases a lot of the pressures both for development and for future iteration, because it means that they aren't tied to a specific implementation and can redesign without adding as many more translation layers and other workarounds (something that has some bearing on, say, the x86). It is the hardware design equivalent of loose coupling. It also means that a dead end can be backed out of quickly and relatively painlessly. However, it also means that the designers won't ever have an endpoint for iterating the design - it makes it so that they can always say, "well, this didn't work out, but the next implementation will work, for sure!" This isn't a setup that instills confidence in the investors or outside observers.

It makes it hard to pin down the answers, especially when they can hide behind the argument that the patents aren't all processed (meaning they can avoid revealing hard answers by claiming IP rights concerns). This means that no one can gainsay their claims yet, but it also weakens their claims because it makes it look like handwaving. Whether this apparent evasiveness is actually justified hardly matters when they can always say,"yeah, but that's just one possible implementation".

While I have high hopes for the architecture, they are more of the "wouldn't it be great if they were right?" sort of thing than a "I just know this is going to be great!" variety. New development is inherently risky, both in terms of technical viability and market success, and even a workable and effective product that is measurably superior still has a long road to go before it can be described as a successful product. The one thing I will give Godard is that he does seem to be aware of this, and speaks of it many times in interviews and lectures, which indicates that he at least knows how to sound realistic about the prospects which the design (or any other new one) has in the market. Whether that is itself just a tactical move is yet to be seen.

And yes, I agree with tggzzz that, if the individual technologies going into the Mill prove viable (even if the whole does not), then there is a good chance that the patents will have a greater impact than the Mill itself.

Note that I said 'patents', not technologies; a concern with anything like this is the possibility that, if the owners turn out to be less ethical than they portray themselves, or the company fails and ends up selling their IP, the patents could end up in the hands of large corporations (or worse, a patent-troll law firm) and used as a bludgeon against anything remotely similar. This is nothing new, though, and is just a part of innovation. IIUC, Watt is seen as the father of the steam engine less because of his improvements over the Newcomen design, and more because Newcomen refused to license his design, leading Watt (or rather, his engineers) to develop a newer model that wasn't covered by the patents, and because he was better at finding applications, markets, and customers for it (Newcomen apparently also refused to sell engines for anything he wasn't directly involved in, meaning his engines were only getting used in mines for pumping).
« Last Edit: May 09, 2017, 02:10:35 pm by Schol-R-LEA »
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #36 on: May 09, 2017, 04:11:09 pm »
In this context the belt is roughly equivalent to the architected registers in a conventional processor.
Presumably the belt needs to be large to be useful. If it is roughly the same size as in modern processors (~16 registers), then I really don't see how this whole thing is different. You can make a weird register allocator for a normal compiler that will basically do what the belt does.

"that patent is not yet filed"!
He's been saying this for years on various pretty primitive stuff. I gave them the benefit of the doubt, but at this point, I call BS. They may be milking investors, or something, but the useful output from them is nil. There is nothing to look at here until they are actually ready to release something.
« Last Edit: May 09, 2017, 04:16:10 pm by ataradov »
Alex
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #37 on: May 09, 2017, 04:22:16 pm »
In this context the belt is roughly equivalent to the architected registers in a conventional processor.
Presumably the belt needs to be large to be useful. If it is roughly the same size as in modern processors (~16 registers), then I really don't see how this whole thing is different. You can make a weird register allocator for a normal compiler that will basically do what the belt does.

IIRC the number is implementation dependent, and is simply a parameter input into the tools that create the implementation and the toolchain.

If you understood the Mill, you would realise that the belt in combination with many other facets of the architecture enables the "DSP-like" issue rate. If you have registers, you lose important properties that enable the speedup; hence you can't "make a weird register allocator" to the same effect.


Quote
"that patent is not yet filed"!
He's been saying this for years on various pretty primitive stuff. I gave them the benefit of the doubt, but at this point, I call BS. They may be milking investors, or something, but the useful output from them is nil. There is nothing to look at here until they are actually ready to release something.

I appreciate that you don't understand the Mill; neither you nor I have sufficient information either to call BS or to say it will work. However, for every topic they have chosen to disclose publicly, they have a very good story.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #38 on: May 09, 2017, 04:28:59 pm »
IIRC the number is implementation dependent, and is simply a parameter input into the tools that create the implementation and the toolchain.
Yes, but it is important for the discussion to have an approximate number, or a number they expect to be barely useful vs a number providing substantial benefit. But, "family", I know, I know...

If you understood the Mill, you would realise that the belt in combination with many other facets of the architecture enables the "DSP-like" issue rate.
That sounds like a religious argument. Issue rate is largely defined by the memory architecture (if we are talking about desktop-class processors) and by your ability to deliver instructions in time. How your core is organized is also important, but not that much. Even the fastest code will do nothing if memory is slow.

And with modern register-based CPUs there is a reasonable balance between the two. A significant improvement in one or the other will not do much to improve overall performance.

If you have registers, you lose important properties that enable the speedup; hence you can't "make a weird register allocator" to the same effect.
What properties?
Alex
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #39 on: May 09, 2017, 04:55:17 pm »
IIRC the number is implementation dependent, and is simply a parameter input into the tools that create the implementation and the toolchain.
Yes, but it is important for the discussion to have an approximate number, or a number they expect to be barely useful vs a number providing substantial benefit. But, "family", I know, I know...

If you understood the Mill, you would realise that the belt in combination with many other facets of the architecture enables the "DSP-like" issue rate.
That sounds like a religious argument. Issue rate is largely defined by the memory architecture (if we are talking about desktop-class processors) and by your ability to deliver instructions in time. How your core is organized is also important, but not that much. Even the fastest code will do nothing if memory is slow.

And with modern register-based CPUs there is a reasonable balance between the two. A significant improvement in one or the other will not do much to improve overall performance.

If you have registers, you lose important properties that enable the speedup; hence you can't "make a weird register allocator" to the same effect.
What properties?

All the answers to those questions are provided in the comp.arch archives, youtube videos, and the millcomputing website.

I have no intention of wasting my life to produce a poor and probably inaccurate summary of that information.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline free_electron

  • Super Contributor
  • ***
  • Posts: 8517
  • Country: us
    • SiliconValleyGarage
Re: Mill CPU Architecture
« Reply #40 on: May 09, 2017, 05:29:17 pm »
The Red Pill Attack is an attack first demonstrated in 2006 by Joanna Rutkowska where a complete system running on the bare hardware is transferred to a virtual environment without the system realizing it. It renders memory protection completely irrelevant.
We are talking about different things. If x86 was Harvard, it still would be possible to get the image of the entire system and put it into an emulator. How does it solve anything?

You need a real hardware authentication method. That's what UEFI and Secure Boot are.
Funny. x86 is a modified Harvard with a uniform memory space. Intel started by making pure Harvard machines - the 8048, 8051 et al. were pure Harvard machines. ARM Cortex is also Harvard!
Professional Electron Wrangler.
Any comments, or points of view expressed, are my own and not endorsed , induced or compensated by my employer(s).
 
The following users thanked this post: JPortici

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #41 on: May 10, 2017, 07:29:07 am »
All the answers to those questions are provided in the comp.arch archives, youtube videos, and the millcomputing website.

FYI, for Godard's informative postings in comp.arch, see
https://groups.google.com/forum/#!searchin/comp.arch/Ivan$20Godard|sort:date
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Mill CPU Architecture
« Reply #42 on: May 10, 2017, 09:19:54 am »
I remember Itanium - another CPU 'family' that offered lots of promise, from some of the world's greatest minds. It was deeply pipelined, supported by big names (Intel, HP...), it took years to get to market, and when it did it had a few rough corners when it came to interrupts, traps and dealing with memory latency.

If I remember correctly, the vendors also claimed that it was superior - the smarts were in the software toolchain, and it just needed better compilers to unleash the beast within.

No matter how much money was thrown at the project (and a lot of money was thrown at it), it just couldn't outperform a CPU that can schedule instructions dynamically at runtime as data becomes available.

Given the slow time to a viable product, the Mill must have some big technical issues slowing it down....
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #43 on: May 10, 2017, 10:00:38 am »
I remember Itanium - another CPU 'family' that offered lots of promise, from some of the world's greatest minds. It was deeply pipelined, supported by big names (Intel, HP...), it took years to get to market, and when it did it had a few rough corners when it came to interrupts, traps and dealing with memory latency.

If I remember correctly, the vendors also claimed that it was superior - the smarts were in the software toolchain, and it just needed better compilers to unleash the beast within.

No matter how much money was thrown at the project (and a lot of money was thrown at it), it just couldn't outperform a CPU that can schedule instructions dynamically at runtime as data becomes available.

Given the slow time to a viable product, the Mill must have some big technical issues slowing it down....

Not quite. It was that there was insufficient parallelism that could be determined at compile time - a problem that had been investigated since the 1960s with little success. The Itanic programme was started, IIRC, in ~1990, and the first product was, again IIRC, in about 2000.

The Mill architects are well aware of why Itanic failed, and appear to have avoided those traps.

The time-to-market is not a reliable indicator of future success; there are too many other factors involved.  In particular, the engineering resources HP applied to the Itanic were vast; I suspect much larger than the Mill's resources.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline richardman

  • Frequent Contributor
  • **
  • Posts: 427
  • Country: us
Re: Mill CPU Architecture
« Reply #44 on: May 10, 2017, 09:14:35 pm »
I was on the teams that worked on the original Itanium, and I was/am on the Mill development team, although I have never been super-active.

Arguably, Itanium lost because it went to Intel. It was a project too costly even for the then-behemoth HP to carry, especially because of the fab issues. Unfortunately, the bet was wrong, and Intel made mincemeat out of it. Note that the first Itanium's performance pretty much sucked, and only recovered somewhat when the HP-designed follow-ons came out. Unfortunately, HP's contributions stopped, and so it goes...

Whether the original HP-PA WW (Wide Word) would have had the performance they promised, we will never know now.

As for the Mill, I am sorry to say that few of the people who contributed to this thread understand what it is about. The information is out there, so to speak. Will it be successful eventually as a product? We shall see.
// richard http://imagecraft.com/
JumpStart C++ for Cortex (compiler/IDE/debugger): the fastest easiest way to get productive on Cortex-M.
Smart.IO: phone App for embedded systems with no app or wireless coding
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #45 on: May 10, 2017, 09:22:06 pm »
As for the Mill, I am sorry to say that few of the people who contributed to this thread understand what it is about.
That comes from the lack of actual concise information. I watched all of his lectures; they are entertaining, but they are so spread out in time that it is hard to keep track of everything. The "family" thing does not help either. And no, I'm not going to read through years' worth of newsgroup archives to get bits and pieces of information.

If they really want to clarify things, they should write a short but complete reference manual of sorts. Then all the relevant information will be in one place. And once more than two people actually understand the benefits, the spread of misinformation will stop.

Right now it looks like another Itanic.
Alex
 
The following users thanked this post: hamster_nz

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Mill CPU Architecture
« Reply #46 on: May 11, 2017, 12:01:31 am »
As for the Mill, I am sorry to say that few of the people who contributed to this thread understand what it is about. The information is out there, so to speak.
Even obvious questions seem unanswered...

"So if the CPU us a family, each with different capabilities. How can a compiler statically schedule instructions/operations when the exposed micro-architecture changes underneath it between different family members?"

I'm guessing the answer would be a magic wand wave like "Oh, we plan to add another layer of abstraction, which dynamically recompiles the executable to take advantage of the CPU it is running on - it's an easy-to-solve software problem" - pleasing to the software people in the audience (as hardware is a software problem from their perspective), but it doesn't answer the question.

Instruction-level parallelism also seems to be a magic wand. If playing around with FPGAs has taught me one thing, it is that some computations parallelize easily (either at a macro or micro level), and others are impossible. Take, for example, the (old, obsolete, broken) RC4 algorithm - https://en.wikipedia.org/wiki/RC4:

Code: [Select]
  i := 0
  j := 0
  while GeneratingOutput:
      i := (i + 1) mod 256
      j := (j + S[i]) mod 256
      swap values of S[i] and S[j]
      K := S[(S[i] + S[j]) mod 256]
      output K
  endwhile

Even in dedicated hardware with multi-port memory, you can't generate more than one K value per cycle. I gave it a try and quickly failed, and then found that people have written papers about it - e.g. https://link.springer.com/chapter/10.1007/978-3-642-17401-8_24.
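To make the serial dependency concrete, here's the pseudocode above as a runnable sketch (Python chosen just for brevity; the key-scheduling step is included so it actually runs). Each iteration's j depends on the previous iteration's j and on the freshly swapped S, so the iterations form a strict dependency chain that no scheduler - static like the Mill's or dynamic like an out-of-order core's - can overlap:

```python
def rc4_keystream(key: bytes):
    """RC4: key scheduling (KSA), then the keystream generator (PRGA)."""
    # KSA: initialize and shuffle the 256-byte state array S.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]

    # PRGA: the loop from the pseudocode above. Note the chain:
    # this iteration's j needs last iteration's j and the swapped S.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

# First keystream bytes for the key b"Key" (the commonly published
# test vector: encrypting "Plaintext" gives BBF316E8D940AF0AD3).
ks = rc4_keystream(b"Key")
print(bytes(next(ks) for _ in range(9)).hex())  # eb9f7781b734ca72a7
```

There's no independent work between the j update, the swap, and the output lookup, which is exactly why the hardware papers above struggle even to reach one K per clock.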

No matter how great your CPU or your software tools are, you can't extract instruction-level parallelism that doesn't exist. Traditional CPUs address this by going multi-core, and by hyperthreading with shared execution units. But I can't see how you can do this on The Mill, which seems to demand that you statically schedule everything at compile time.

It might be really good at doing some things, giving it a niche use-case like high-end DSPs, but it won't be great at most things.

And I have yet to see it demonstrated, not even running slowly in a high-end FPGA.

Very much unlike RISC-V, where many cores can fit on a low-end FPGA, and I even have one in hard silicon on my bench.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #47 on: May 11, 2017, 12:07:44 am »
"So if the CPU is a family, each member with different capabilities, how can a compiler statically schedule instructions/operations when the exposed micro-architecture changes underneath it between different family members?"
I read "family" as in the ARM Cortex family of architectures. So the compiler can expect specific hardware at compile time. Actual capabilities will change between family members, of course, but nobody promised direct code transfer between them.
Alex
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 19494
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Mill CPU Architecture
« Reply #48 on: May 11, 2017, 06:30:26 am »
As for Mills, I am sorry to say that few of the people who contributed to this thread understand what it is about.
That comes from the lack of actual concise information. I watched all of his lectures; they are entertaining, but they are so spread out in time that it is hard to keep track of everything. The "family" thing does not help either. And no, I'm not going to read through years' worth of newsgroup archives to get bits and pieces of information.

In which case it is also unrealistic to expect us to spend our time doing your research.

Quote
If they really want to clarify things, they should write a short but complete reference manual of sorts. Then all the relevant information will be in one place. And once more than two people actually understand the benefits, the spreading of misinformation will stop.

What makes you think it is in their interest to give you all such information? The IPR considerations alone mean they would be foolish to do that!

Anybody using that to infer that there is no such information would be a fool.

Quote
Right now it looks like another Itanic.

And yet up above you say you don't have sufficient information to make such a judgement. That makes any such opinion not worth the paper it isn't written on.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11248
  • Country: us
    • Personal site
Re: Mill CPU Architecture
« Reply #49 on: May 11, 2017, 06:34:27 am »
That makes any such opinion not worth the paper it isn't written on.
I personally don't care if Mill succeeds or fails, I have zero vested interest in it.

But public perception of a product is very important. If a significant number of people get something into their heads, it is very hard to convince them otherwise.
Alex
 

