Author Topic: The Imperium programming language - IPL (Read 67830 times)

Mechatrommer · « **Reply #250 on:** November 28, 2022, 08:25:21 pm »

Quote from: rstofer on November 28, 2022, 08:14:49 pm

One thing that I find problematic with Python is that variables are not declared, they just happen. Given a variable name, you have no idea what is being represented. Could be a string, a float, an int, probably a vector, maybe a matrix and even if you thought you knew the shape, even that could change during execution. How far back do you have to look to find where it was last modified?

Same thing with MATLAB for that matter.

One thing I like about Modern Fortran is 'implicit none'. Every variable has to be declared and no more of the leading character determining the type unless otherwise declared.

My old Basic language has this feature; I ended up enforcing "Option Explicit" every time so I know what I'm coding. In my dictionary, "Option Implicit" and the things you mentioned about returning multiple values are just a good recipe for bloat. BTW, I think you are not in love with Python, you are in love with its library.

cheers.

Sherlock Holmes · « **Reply #251 on:** November 28, 2022, 09:32:07 pm »

Quote from: rstofer on November 28, 2022, 08:14:49 pm

It seems to me that the multiplicity of numeric types will ultimately lead to a trainwreck.

One thing that I find problematic with Python is that variables are not declared, they just happen. Given a variable name, you have no idea what is being represented. Could be a string, a float, an int, probably a vector, maybe a matrix and even if you thought you knew the shape, even that could change during execution. How far back do you have to look to find where it was last modified?

Yes, Python is dynamically typed: types are determined and verified at runtime. That has benefits but also non-trivial costs, as you've found. Interpreted languages often have dynamic typing for ease of use.
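To illustrate the trade-off, here is a minimal Python sketch. The names `describe` and `scale` are invented for this example. It shows one name being rebound to values of three different types, a type error that only surfaces when the offending line runs, and type hints that document intent without being enforced by the interpreter.

```python
from typing import List


def describe(value):
    """Return the runtime type name of whatever the name is bound to."""
    return type(value).__name__


def scale(values: List[float], factor: float) -> List[float]:
    # The annotations document intent; the interpreter does not enforce them.
    return [v * factor for v in values]


x = 3                 # x is bound to an int
kinds = [describe(x)]
x = "three"           # the same name now refers to a str
kinds.append(describe(x))
x = [1.0, 2.0]        # ... and now to a list
kinds.append(describe(x))

# The mismatch is reported only when the bad operation actually executes.
try:
    "three" + 3
    error = None
except TypeError as exc:
    error = type(exc).__name__

# Accepted at runtime despite the List[float] hint.
unchecked = scale("ab", 2)
```

A static checker such as mypy would flag the `scale("ab", 2)` call before the program runs, which recovers some of what declarations give you in Fortran or C.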

Quote from: rstofer on November 28, 2022, 08:14:49 pm

Same thing with MATLAB for that matter.

APL too, powerful but has its downsides.

Quote from: rstofer on November 28, 2022, 08:14:49 pm

One thing I like about Modern Fortran is 'implicit none'. Every variable has to be declared and no more of the leading character determining the type unless otherwise declared.

I also like the 'intent' attribute:

http://www.personal.psu.edu/jhm/f90/statements/intent.html

Yes, that "intent" is potentially very useful indeed, with sensible use it can reduce subtle errors.

Quote from: rstofer on November 28, 2022, 08:14:49 pm

I'm not sure what to think about functions returning multiple values and the ability to ignore pieces of the return values. Returning two ndarrays and only keeping one seems bizarre. But so does using indentation as a syntactic element;

Some of these ideas emanate from functional language theory and mathematics, "tuples" for example comes from that world. In such languages they feature very naturally too. C# now has tuples, they are pretty useful because if one wants to return several things then prior to tuples you had to define a struct or class just to wrap these elements. I find them useful in that situation, the need to return multiple - often disparate - things in ways that were not initially anticipated.
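As a rough illustration of the same idea in Python (which the quoted post was about), here is a small sketch. The names `min_max`, `Stats` and `stats` are made up for the example. It returns two values as a plain tuple, ignores one of them on unpacking, and then does the same with a named tuple, which stands in for the struct or class one would otherwise have to write by hand.

```python
from typing import NamedTuple, Tuple


def min_max(values) -> Tuple[float, float]:
    """Return two results at once as a plain tuple."""
    return min(values), max(values)


class Stats(NamedTuple):
    # A named tuple gives the fields names without a hand-written class.
    lowest: float
    highest: float


def stats(values) -> Stats:
    return Stats(min(values), max(values))


lo, hi = min_max([3, 1, 4, 1, 5])      # unpack both results
_, only_hi = min_max([3, 1, 4, 1, 5])  # "_" is a conventional throwaway name
s = stats([3, 1, 4, 1, 5])             # fields are accessible by name
```

The named form costs one extra declaration but makes call sites such as `s.highest` self-describing, which addresses part of the "what does this variable hold" complaint.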

Quote from: rstofer on November 28, 2022, 08:14:49 pm

I do like the idea of slicing arrays and being able to concatenate rows or columns.

Yes, I agree, there are some things that can be done regarding "slices" that incur very little runtime cost. Some of the strengths of Fortran were carried over into PL/I by the language designers, that too has some nifty ways of dealing with arrays, stuff not seen explicitly in C or C++.

Quote from: rstofer on November 28, 2022, 08:14:49 pm

Although I grumble about the lack of declarations, I do like Python's ndarray.

Is white space going to be significant? At least in Fortran IV, it wasn't. The statement:
DO10I=1,4
could be the beginning of a loop or just another real variable being set to a value all the way up to the comma. I never tested that concept but it was a side effect of the fact that Fortran ignored white space. Apparently in fixed format Modern Fortran, white space is still ignored but in free form it is significant

https://community.intel.com/t5/Intel-Fortran-Compiler/spaces-not-ignored-in-free-format/td-p/1112317

Embedded spaces (or under scores) in long numeric strings can be useful.
SpeedOfLight = 186_000 miles per second

I've not looked much at all at Python myself. It seems it lets us create arrays whose rank isn't known until runtime, and I can see the convenience of that for some domains and in an interpreted language. Of course arrays are (or can be) just contiguous blocks of memory with a function that can convert n subscripts into an offset into the array, so in that sense they are illusory, just a "way" of perceiving how data is organized.
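The "function that converts n subscripts into an offset" can be sketched in a few lines of Python. This is only an illustration: `row_major_offset` is a hypothetical helper, it assumes zero-based subscripts and row-major (C-style) layout, whereas Fortran stores arrays in column-major order.

```python
def row_major_offset(shape, subscripts):
    """Convert n zero-based subscripts into an offset into a flat block.

    Assumes row-major (C-style) layout; Fortran uses column-major order.
    """
    if len(shape) != len(subscripts):
        raise ValueError("rank mismatch")
    offset = 0
    for extent, index in zip(shape, subscripts):
        if not 0 <= index < extent:
            raise IndexError("subscript out of range")
        # Horner-style accumulation: each dimension scales what came before.
        offset = offset * extent + index
    return offset


flat = list(range(24))   # one contiguous block of 24 elements
shape = (2, 3, 4)        # viewed as a rank-3 array, rank chosen at runtime
element = flat[row_major_offset(shape, (1, 2, 3))]
```

The same `flat` block can be viewed as a 4 x 6 array just by passing a different `shape`, which is the sense in which the array is only a way of perceiving the data.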

Spaces? Yes, I have heard of Fortran's tolerance of that. I guess some of it goes back to the nature of the industry at the time: keyboards were scarce, and most people then (including me, even in 1982) wrote code onto coding sheets, which had to be transcribed by a team of "punch" operators and so on.

So it's likely Fortran tried to be flexible because of these limited input methods. The grammar I have in mind borrows from PL/I in several respects. PL/I was designed with a thorough analysis of Fortran, Cobol and Algol, taking some of the best ideas in those and developing a uniform grammar that captured these varied concepts.

Spaces can't be totally ignored; it all depends on what it takes to recognize a language token. Some tokens are simple enough never to be ambiguous, but others are not: identifiers, for example, can be arbitrary and of any length.

Like Fortran, PL/I and the grammar I've been exploring are free of reserved words: an identifier can be anything, even a keyword, and the grammar rules make it straightforward to resolve.

Here's an interesting historical article that sheds some little known light on how Fortran and Cobol and Algol served as the basis for PL/I.

Note:

Quote

Still it was the first language that contained decent string handling capabilities, pointers, three types of allocation of storage -- static, automatic (stack-based) and controlled (heap-based), exception handling, and rudimentary multitasking. While the idea of preprocessor (a front end macro generator) was never cleanly implemented (as PL/1 did not stored "macro source" line numbers and without them matching preprocessor and "real statements" was difficult) it was also innovative and later was inherited and expanded by C.

All-in-all PL/1 was and probably still is one of the most innovative programming languages in existence and some of its feature are still not matched by "new kids" in the programming language block.

Pointers, ->, while/until, break, the /* */ comments, exceptions, multitasking in the language, a powerful preprocessor, semicolon as statement terminator and more, all originated in PL/I and some were carried over into C.

tggzzz · « **Reply #252 on:** November 28, 2022, 09:36:16 pm »

Quote from: rstofer on November 28, 2022, 08:14:49 pm

It seems to me that the multiplicity of numeric types will ultimately lead to a trainwreck.

One thing that I find problematic with Python is that variables are not declared, they just happen. Given a variable name, you have no idea what is being represented. Could be a string, a float, an int, probably a vector, maybe a matrix and even if you thought you knew the shape, even that could change during execution. How far back do you have to look to find where it was last modified?

That can work nicely. If you need to keep a tight rein on the type, then encapsulate that type in a class and control all arithmetic operations.
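A minimal sketch of that approach in Python follows. `Millivolts` is a hypothetical class invented for the example: it accepts only an `int` at construction and only permits addition with another `Millivolts`, so a stray float cannot creep in silently.

```python
class Millivolts:
    """Hypothetical wrapper that only permits arithmetic with its own kind."""

    def __init__(self, value):
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("Millivolts requires an int")
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Millivolts):
            return NotImplemented   # Python then raises TypeError for us
        return Millivolts(self.value + other.value)

    def __eq__(self, other):
        return isinstance(other, Millivolts) and self.value == other.value

    def __repr__(self):
        return f"Millivolts({self.value})"


total = Millivolts(1200) + Millivolts(300)

try:
    Millivolts(1200) + 0.3      # mixing in a bare float is rejected
    mixed = "allowed"
except TypeError:
    mixed = "rejected"
```

The check still happens at runtime, so this narrows where a wrong type can appear without turning Python into a statically typed language.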

Quote

Is white space going to be significant? At least in Fortran IV, it wasn't. The statement:
DO10I=1,4
could be the beginning of a loop or just another real variable being set to a value all the way up to the comma. I never tested that concept but it was a side effect of the fact that Fortran ignored white space. Apparently in fixed format Modern Fortran, white space is still ignored but in free form it is significant

IIRC that led to spectacularly unfortunate consequences.

Early C spelled the compound assignment as a =+ b, with the same meaning as today's a += b, so a=+b was ambiguous: it could be read as a =+ b or as a = +b.

rstofer · « **Reply #253 on:** November 28, 2022, 09:51:57 pm »

Quote from: Mechatrommer on November 28, 2022, 08:25:21 pm

BTW, I think you are not in love with Python, you are in love with its library. Cheers.

That is correct, I enjoy the application specific libraries but I'm not a huge fan of the language. It'll probably grow on me. I'm stuck in the Fortran/C world with no intention of upgrading.

For no particularly good reason, I find myself interested in Machine Learning. It's not like I will ever have another job or otherwise derive monetary value from the study but I admire the math and how the library functions do things like backpropagation and partial derivatives.

rstofer · « **Reply #254 on:** November 28, 2022, 10:33:36 pm »

I would have loved Algol but we didn't have it on the machines I was using circa '70. Fortran or RPG was about all we could get. No worries, a couple of assembly language libraries and we had string functionality in Fortran. IBM provided the Business Subroutines package which we improved locally.

Next high level language up was Pascal and I was in love. It just looks pretty on the listing. Most important to me was the concept of nested procedures/functions. When I found out that C didn't have that feature I was ready to throw rocks at it. I also like Wirth's syntax diagrams and the way he structured the language such that a recursive descent compiler was all that was needed. I'm big on recursive descent and nested procedures/functions. "Algorithms + Data Structures = Programs" by Niklaus Wirth is my favorite book on programming. The PL0 compiler in the back of the book shows just how simple it is to create a recursive descent compiler. Alas, it is nearly impossible to pass arbitrary arrays with Pascal and reshaping an array would be right out the window. I don't know where it stands with Oberon.

I pretty much like all of the languages that are derived from Algol. I used PL/I in grad school to write an 8080 assembler (this was in '75, things hadn't evolved very far and free assemblers weren't all over the place). I regret not having another semester to really learn the language. My advisor thought the code looked a lot like Fortran. Deadlines...

After it was no longer relevant, I got a copy of PL/M - it is written in Fortran!

Digital Research (CP/M) created a PL/I compiler for the z80/8080 world. It works well but I haven't used it for anything serious. I did use Microsoft's Fortran compiler. It also works well.

Sherlock Holmes · « **Reply #255 on:** November 28, 2022, 10:56:14 pm »

Quote from: rstofer on November 28, 2022, 10:33:36 pm

I would have loved Algol but we didn't have it on the machines I was using circa '70. Fortran or RPG was about all we could get. No worries, a couple of assembly language libraries and we had string functionality in Fortran. IBM provided the Business Subroutines package which we improved locally.

Next high level language up was Pascal and I was in love. It just looks pretty on the listing. Most important to me was the concept of nested procedures/functions. When I found out that C didn't have that feature I was ready to throw rocks at it.

Oh I understand that. Pascal and PL/I use nested procedures without a second thought; they are totally natural ways to express things. An inner procedure also has access to all of the parent procedure's automatic variables and parameters, all very elegant and sensible. C# recently added these to the language.

Quote from: rstofer on November 28, 2022, 10:33:36 pm

I also like Wirth's syntax diagrams and the way he structured the language such that a recursive descent compiler was all that was needed. I'm big on recursive descent and nested procedures/functions. "Algorithms + Data Structures = Programs" by Niklaus Wirth is my favorite book on programming. The PL0 compiler in the back of the book shows just how simple it is to create a recursive descent compiler. Alas, it is nearly impossible to pass arbitrary arrays with Pascal and reshaping an array would be right out the window. I don't know where it stands with Oberon.

Early parsers were recursive descent, then came the automated parsers built from PDAs (pushdown automata). They are fascinating machines but very frustrating, because it is hard to separate concrete parsing from abstract parsing and to report errors intelligibly. Messages like "The attribute "static" must not appear more than once in a declaration" are so much friendlier than "Unrecognized state encountered while parsing <attribute-list> when parsing <local-declarations> when parsing <procedure-block> when parsing..."

So most newer languages are parsed with hand crafted recursive descent parsers, here's the one I developed for PL/I on Windows some years back - all written in C.

Quote from: rstofer on November 28, 2022, 10:33:36 pm

I pretty much like all of the languages that are derived from Algol. I used PL/I in grad school to write an 8080 assembler (this was in '75, things hadn't evolved very far and free assemblers weren't all over the place). I regret not having another semester to really learn the language. My advisor thought the code looked a lot like Fortran. Deadlines...

After it was no longer relevant, I got a copy of PL/M - it is written in Fortran!

Digital Research (CP/M) created a PL/I compiler for the z80/8080 world. It works well but I haven't used it for anything serious. I did use Microsoft's Fortran compiler. It also works well.

Strange old world sometimes!

I love the story about Steve Jobs telling Seymour Cray that Apple had just purchased a CRAY to help to design the next Apple, whereupon Cray says "Oh really? we just ordered an Apple to help us design the next CRAY".

brucehoult · « **Reply #256 on:** November 28, 2022, 11:35:36 pm »

Quote from: rstofer on November 28, 2022, 10:33:36 pm

Next high level language up was Pascal and I was in love. It just looks pretty on the listing. Most important to me was the concept of nested procedures/functions. When I found out that C didn't have that feature I was ready to throw rocks at it.

I thought so at the time too, but in retrospect it's a half-arsed feature that is bad for the same reason global variables are bad.

Sure, it's slightly handy for creating comparison functions to pass to sort(), but there are few other good use-cases, and it puts the overhead of maintaining a static link chain or "display" into every single function call and return, for a feature that is seldom used.

In Pascal any variables used by the nested function have to be declared right up at the top of the function, with the nested function up there too, and then the actual use is buried who knows how far down in the body of the function.

In C, you put such comparison functions just before the outer function header, instead of just after, and maybe define a small struct to hold variables (copies or pointers to them) shared between the caller and the "nested" function. The caller needs to allocate the struct, copy a couple of things into it, then pass it as an explicit void* parameter to the sort function. The comparison function needs to cast the void* to the correct type.

C seems like more work, but in fact it is work that Pascal has to do anyway, it's just hidden. In fact, every Pascal I've used did it another way -- the previously mentioned static chains or display -- but they SHOULD have done it the same way you manually do it in C, so that the overhead exists only when you actually use the feature, not in every single function call.

The Pascal feature is just pure sugar, not anything fundamental.

What IS useful and much better than C is when you can write the comparison function right there as an anonymous function in the sort function argument, and it can access variables local to the block it is in, not only top level variables of the enclosing function. Even if, as in PL/1, you have to give the nested function a name and write it as a separate statement on the line before the call to the sort() function, that's still fine.

The other annoying part about how you do it in C is having to cast from a void* inside the "nested" function. This is avoided in C++, Java etc by making sort() take a pointer to an object of some class ("comparable"?) and making your compare function a member of a subclass derived from it.

I seem to recall Oberon didn't have classes but had "extensible types", including records.

Full closures, that you can return from the enclosing function, or store into global data structures, are of course much more useful than mere nested functions. That is something you can do in plain C simply by heap-allocating that little struct you copy the shared local variables into.

Mechatrommer · « **Reply #257 on:** November 29, 2022, 12:01:28 am »

First I learnt Fortran on DOS in a formal course; on the first day I quickly fell in love with "computer programming language". Then I introduced myself to Delphi/Pascal... C/C++ and Basic. I left Pascal and Fortran behind, and I keep C and Basic until today. Both Basic and Pascal don't support pointer/tree data structures natively, but Basic (VB) is everything I need to build GUI windows and connect easily to the C-built Win32 API. C is about "speed" and "very broad" application/hackability/castability. If any of the modern/managed languages can beat that, tell me, I'm willing to take a look. For me they are only worth understanding so I can port their code back into C...

Sherlock Holmes · « **Reply #258 on:** November 29, 2022, 12:02:23 am »

Quote from: brucehoult on November 28, 2022, 11:35:36 pm

Quote from: rstofer on November 28, 2022, 10:33:36 pm
Next high level language up was Pascal and I was in love. It just looks pretty on the listing. Most important to me was the concept of nested procedures/functions. When I found out that C didn't have that feature I was ready to throw rocks at it.

I thought so at the time too, but in retrospect it's a half-arsed feature that is bad for the same reason global variables are bad.

Sure, it's slightly handy for creating comparison functions to pass to sort(), but there are few other good use-cases, and it puts the overhead of maintaining a static link chain or "display" into every single function call and return, for a feature that is seldom used.

In Pascal any variables used by the nested function have to be declared right up at the top of the function, with the nested function up there too, and then the actual use is buried who knows how far down in the body of the function.

In C, you put such comparison functions just before the outer function header, instead of just after, and maybe define a small struct to hold variables (copies or pointers to them) shared between the caller and the "nested" function. The caller needs to allocate the struct, copy a couple of things into it, then pass it as an explicit void* parameter to the sort function. The comparison function needs to cast the void* to the correct type.

C seems like more work, but in fact it is work that Pascal has to do anyway, it's just hidden. In fact, every Pascal I've used did it another way -- the previously mentioned static chains or display -- but they SHOULD have done it the same way you manually do it in C, so that the overhead exists only when you actually use the feature, not in every single function call.

The Pascal feature is just pure sugar, not anything fundamental.

What IS useful and much better than C is when you can write the comparison function right there as an anonymous function in the sort function argument, and it can access variables local to the block it is in, not only top level variables of the enclosing function. Even if, as in PL/1, you have to give the nested function a name and write it as a separate statement on the line before the call to the sort() function, that's still fine.

The other annoying part about how you do it in C is having to cast from a void* inside the "nested" function. This is avoided in C++, Java etc by making sort() take a pointer to an object of some class ("comparable"?) and making your compare function a member of a subclass derived from it.

I seem to recall Oberon didn't have classes but had "extensible types", including records.

Full closures, that you can return from the enclosing function, or store into global data structures, are of course much more useful than mere nested functions. That is something you can do in plain C simply by heap-allocating that little struct you copy the shared local variables into.

Well yes, there are use cases, but whether they're good or few depends on the problem at hand. The Microsoft C# team supported the introduction of this for several reasons; you can read about the differences between local functions and lambdas/anonymous functions here. Sure, it's C#, but it's largely relevant.

In particular note this important detail - iterators - defining lambdas that are also iterators (are able to yield results) is not supported in C# for a whole variety of reasons.

IMHO these nested procedures are very easy to implement, and ideal for certain scenarios as you said, but I don't see why you equate them to "global" variables, as being as bad as them. What do you mean? If the function were not enclosed yet needed access to the caller's state, how would that be done? Pass in a load of extra arguments? Seems like a lot of extra baggage, surely?

See also this discussion: https://stackoverflow.com/a/40949214

Finally:

Quote

In addition to svick's great answer there is one more advantage to local functions:
They can be defined anywhere in the function, even after the return statement.

Pascal did limp a little with its cumbersome insistence on "one pass" (a meaningless term these days anyway) and the consequent impact on parsing and analysis (same in C) so in a new language it would be easy to do as C# does and let you lexically position the function definition anywhere within the containing block's body.

SiliconWizard · « **Reply #259 on:** November 29, 2022, 12:11:18 am »

I don't particularly care for nested functions - I remember other threads talking about that as well.
Apart from what you said, that just leads to more visual clutter which is bad. You quickly end up not seeing what exactly is the function being defined and what is not. Functions must be easier to read, not harder.

The underlying problem it's trying to solve is a problem of scope and namespaces. Which can/could be solved using other, more readable approaches.

And helper functions can actually often be reused, so defining them strictly as nested prevents you from any reuse. This is bad. I won't count how often I have been able to reuse helper functions. If you nest them (or make them anonymous, see below), not only can you not reuse them - so you'll be tempted/forced to duplicate them if the same functions are needed elsewhere - but you also cannot unit-test them separately, which can be just as bad. Very much a recipe for bad coding practice.

For the same reason (and also that of readability), I don't care for anonymous functions, except in very, very limited cases, which IMHO do not make them worth the trouble.

Now those that like Pascal - already mentioned that, but feel free to try freepascal and Lazarus. Great stuff if you like that. Definitely underrated.

Nominal Animal · « **Reply #260 on:** November 29, 2022, 01:40:54 am »

Quote from: brucehoult on November 28, 2022, 11:35:36 pm

In C, you put such comparison functions just before the outer function header, instead of just after, and maybe define a small struct to hold variables (copies or pointers to them) shared between the caller and the "nested" function. The caller needs to allocate the struct, copy a couple of things into it, then pass it as an explicit void* parameter to the sort function. The comparison function needs to cast the void* to the correct type.

A good example of this is the GNU extension to the qsort() function, qsort_r(). The qsort_r() takes an additional void pointer, that is passed to the comparison function as the third parameter. In particular, if your array entries contain an array of potential keys, you can provide the key index via that extra argument (via (void *)(uintptr_t)index and (uintptr_t)pointer casts). Alternatively, you can use a single comparison function per key type, and pass the byte offset to the key within each structure via that extra argument (using offsetof()).

I for one use closures (structures describing the state of a function or operation) a lot in C, for example, where one might use nested functions or coroutines in other languages. A typical example is a tokenizing input buffer, a generator of sorts, where one call returns the next field in the current record, and another call skips to the start of the next record, used for example for CSV file parsing. Queues and FIFOs are another use case, when one side may be generating or consuming more data than may fit in RAM, so the two do need to be interleaved to keep the temporary memory requirements within reason.

A nested function is simply one that receives a closure describing the state of its parent – including any variables it needs access to. This works even when the 'nested' function is recursive. Furthermore, such a 'nested' function can be 'nested' under any parent function that uses the same type of closure to describe its state. So no, I don't really need nested functions either, as long as using such closures/state structures is easy and straightforward.
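The tokenizing-buffer example can be sketched as follows. This is written in Python rather than C purely for brevity, and everything in it is an assumption made for illustration: `Cursor`, `next_field` and `next_record` are hypothetical names, and the "CSV" handled here has no quoting and drops a trailing empty field. The point is only that the state lives in an explicit structure that is passed to ordinary, non-nested functions.

```python
from dataclasses import dataclass


@dataclass
class Cursor:
    """Explicit state, playing the role of the C struct described above."""
    text: str
    pos: int = 0


def next_field(cur):
    """Return the next field of the current record, or None at its end."""
    if cur.pos >= len(cur.text) or cur.text[cur.pos] == "\n":
        return None
    start = cur.pos
    while cur.pos < len(cur.text) and cur.text[cur.pos] not in ",\n":
        cur.pos += 1
    field = cur.text[start:cur.pos]
    if cur.pos < len(cur.text) and cur.text[cur.pos] == ",":
        cur.pos += 1            # consume the separator, stay in this record
    return field


def next_record(cur):
    """Skip to the start of the next record; return False at end of input."""
    while cur.pos < len(cur.text) and cur.text[cur.pos] != "\n":
        cur.pos += 1
    if cur.pos < len(cur.text):
        cur.pos += 1            # step over the newline
    return cur.pos < len(cur.text)


cur = Cursor("a,b,c\n1,2\n")
first = next_field(cur)         # first field of record one
next_record(cur)                # abandon the rest of record one
rest = []
field = next_field(cur)
while field is not None:        # drain record two
    rest.append(field)
    field = next_field(cur)
```

Because neither function is tied to a particular caller, any "parent" that owns a `Cursor` can use them, including recursive ones, which is the reuse property claimed for state structures over true nested functions.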

sizeof, typeof, compatible_types, and typenameof operators should be provided even by statically typed languages.
C provides only the first one; GCC, Clang, Intel CC and others provide the two next ones as extensions to C; and the fourth one can only be approximated by preprocessor stringification. This leads to less than optimal interfaces when values of runtime-dependent types need to be passed. Typically, the calls need to explicitly specify the type as a literal (token or string in C) and the size of the value, but there is no good way to check they match the expression being passed. (If the expression type is a structure with a C99 flexible array member, even sizeof won't work.)
In particular, stringifying any typeof (expression) will only lead to a string containing that text, not the type name.

These kinds of use cases are why I believe so strongly that looking at existing code (in different competing languages, suitable for the target niche) to understand what kind of approaches they use for solving these kinds of problems, is extremely important. Because there is a lot of bad code out there, looking for consensus is not a good idea: you need to find the 'best' ways of solving each kind of a problem, and learn from that. (To me, 'best' here means robust with errors easy to catch, without imposing a strict usage pattern; and not 'me like most'.)

rstofer · « **Reply #261 on:** November 29, 2022, 02:43:14 am »

If you look at the Pascal syntax diagram you will start to see where recursive descent was a part of the syntax and that nested procedures are the way to go:

Syntax Diagrams
https://link.springer.com/content/pdf/bbm:978-1-4757-1764-8/1.pdf

P4 Pascal Compiler Source:
https://homepages.cwi.nl/~steven/pascal/pcom.p

Sets are another underappreciated concept. They are terrific as used in this compiler. We have a set of symbols that can start the next statement and a set of symbols that can end the statement (or continue parsing). At all times, the compiler knows exactly what to look for so error detection and notification are pretty easy. The heart of the compiler is at 'procedure statement' and the obvious use of nested procedures is in evaluating expressions at 'procedure expression'.

Here are the sets the compiler actually uses:

Code:

constbegsys,simptypebegsys,typebegsys,blockbegsys,selectsys,facbegsys,
statbegsys,typedels: setofsys;

I spent a lot of time with the source to UCSD Pascal back around 1980.

The source is available for an interpreter:
https://homepages.cwi.nl/~steven/pascal/pint.p

At one time I was working on an FPGA project to implement the interpreter. I got hung up when I had trouble with the idea of system calls for things like floating point but the rest of it worked.

rstofer · « **Reply #262 on:** November 29, 2022, 03:11:15 am »

Quote from: SiliconWizard on November 29, 2022, 12:11:18 am

Now those that like Pascal - already mentioned that, but feel free to try freepascal and Lazarus. Great stuff if you like that. Definitely underrated.

I have fpc (freepascal) installed on all of my Linux boxes.

I don't do much application programming so I don't use it nearly as much as I would like but I do install it on every machine.

brucehoult · « **Reply #263 on:** November 29, 2022, 03:32:28 am »

Quote from: rstofer on November 29, 2022, 02:43:14 am

If you look at the Pascal syntax diagram you will start to see where recursive descent was a part of the syntax and that nested procedures are the way to go:

I don't see the link.

Mutually-recursive procedures, yes. But not nested.

DiTBho · « **Reply #264 on:** November 29, 2022, 10:40:55 am »

Quote from: SiliconWizard on November 29, 2022, 12:11:18 am

The underlying problem it's trying to solve is a problem of scope and namespaces. Which can/could be solved using other, more readable approaches.

That is a big problem to solve!

DiTBho · « **Reply #265 on:** November 29, 2022, 10:46:12 am »

Quote from: Nominal Animal on November 29, 2022, 01:40:54 am

sizeof, typeof, compatible_types, and typenameof operators

- sizeof()
- typeof()

What about the other two operators? Where are they useful in C?

Sherlock Holmes · « **Reply #266 on:** November 29, 2022, 03:13:36 pm »

Quote from: brucehoult on November 29, 2022, 03:32:28 am

Quote from: rstofer on November 29, 2022, 02:43:14 am
If you look at the Pascal syntax diagram you will start to see where recursive descent was a part of the syntax and that nested procedures are the way to go:

I don't see the link.

Mutually-recursive procedures, yes. But not nested.

I'm interested in exploring ARM assembly programming. I was reading about it yesterday and the contrast with the crippled x86 ISA is huge. Reading about ARM was reminiscent (for me) of the 68000, which was a very nice device.

It's different of course, the distinction between LOAD/STORE approach and other MCU architectures is important and not something I've ever thought about in detail. I was also quite taken by the fact that the endianness can be altered by an instruction, never seen that before, granted it's restricted but that's not any surprise.

So I have a question. I use Windows as my desktop (on an AMD CPU) and routinely use Visual Studio with VisualGDB to play with MCUs. If I want to write and run and debug assembler and I have STM32 devices, what would you suggest? VisualGDB is a good quality tool (certainly for amateur use like mine) but seems to implicitly assume C and doesn't seem to offer me a choice of Assembler when creating a project.

I might be able to just include an assembler source file in a simple project; I read that it will likely understand that, but I wanted to ask someone more expert than me. Perhaps there are other tools? Other ways? I use VisualGDB because I routinely work with Visual Studio Enterprise for my professional work and am very familiar with it, and it's a superb system as well, so VisualGDB had a huge appeal.

Thanks in advance.

rstofer · « **Reply #267 on:** November 29, 2022, 03:45:09 pm »

Quote from: brucehoult on November 29, 2022, 03:32:28 am

Quote from: rstofer on November 29, 2022, 02:43:14 am
If you look at the Pascal syntax diagram you will start to see where recursive descent was a part of the syntax and that nested procedures are the way to go:

I don't see the link.

Mutually-recursive procedures, yes. But not nested.

Try this:

Code: [Select]

procedure expression;
  var lattr: attr; lop: operator; typind: char; lsize: addrrange;

  procedure simpleexpression(fsys: setofsys);
    var lattr: attr; lop: operator; signed: boolean;

    procedure term(fsys: setofsys);
      var lattr: attr; lop: operator;

      procedure factor(fsys: setofsys);
        var lcp: ctp; lvp: csp; varpart: boolean;
            cstpart: setty; lsp: stp;
      begin
        if not (sy in facbegsys) then
          begin error(58); skip(fsys + facbegsys);
            gattr.typtr := nil

Page 138 and 139 of the syntax diagrams show the relationship between expression, simpleexpression, term and factor. Note that factor can recursively call expression and start the entire tree over again.

The compiler code flows directly from the syntax diagrams.

Compared to the transition matrix we were taught in grad school, Wirth's approach looks elegant.
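
To make the expression / term / factor relationship concrete, here is a minimal recursive descent sketch in C. It is not taken from the Pascal compiler quoted above: the grammar is cut down to integers, `+`, `-`, `*` and parentheses, the simpleexpression level is folded into expression, and the `Parser` struct and all function names are assumptions for illustration. The functions are peers with one forward declaration, and factor calls back into expression for a parenthesised sub-expression.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser state: a cursor into a NUL-terminated input string. */
typedef struct {
    const char *p;
} Parser;

/* Forward declaration: factor calls back into expression. */
static long parse_expression(Parser *ps);

static void skip_spaces(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* factor = number | "(" expression ")" */
static long parse_factor(Parser *ps) {
    skip_spaces(ps);
    if (*ps->p == '(') {
        ps->p++;                          /* consume "(" */
        long inner = parse_expression(ps); /* start the whole tree over again */
        skip_spaces(ps);
        if (*ps->p == ')') ps->p++;       /* consume ")"; no error recovery in this sketch */
        return inner;
    }
    long value = 0;
    while (isdigit((unsigned char)*ps->p)) {
        value = value * 10 + (*ps->p - '0');
        ps->p++;
    }
    return value;
}

/* term = factor { "*" factor } */
static long parse_term(Parser *ps) {
    long value = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps->p != '*') return value;
        ps->p++;
        value *= parse_factor(ps);
    }
}

/* expression = term { ("+" | "-") term } */
static long parse_expression(Parser *ps) {
    long value = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->p;
        if (op != '+' && op != '-') return value;
        ps->p++;
        long rhs = parse_term(ps);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Convenience wrapper: evaluate a whole string. */
static long evaluate(const char *text) {
    assert(text != NULL);
    Parser ps = { text };
    return parse_expression(&ps);
}
```

With this sketch, `evaluate("2+3*4")` gives 14 and `evaluate("(2+3)*4")` gives 20, because each grammar rule maps to one function and operator precedence falls out of the call structure.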

Sherlock Holmes · « **Reply #268 on:** November 29, 2022, 03:56:54 pm »

Quote from: Nominal Animal on November 29, 2022, 01:40:54 am

Quote from: brucehoult on November 28, 2022, 11:35:36 pm
In C, you put such comparison functions just before the outer function header, instead of just after, and maybe define a small struct to hold variables (copies or pointers to them) shared between the caller and the "nested" function. The caller needs to allocate the struct, copy a couple of things into it, then pass it as an explicit void* parameter to the sort function. The comparison function needs to cast the void* to the correct type.
A good example of this is the GNU extension to the qsort() function, qsort_r(). The qsort_r() takes an additional void pointer, that is passed to the comparison function as the third parameter. In particular, if your array entries contain an array of potential keys, you can provide the key index via that extra argument (via (void *)(uintptr_t)index and (uintptr_t)pointer casts). Alternatively, you can use a single comparison function per key type, and pass the byte offset to the key within each structure via that extra argument (using offsetof()).

I for one use closures –– structures describing the state of a function or operation ––, a lot: in C for example, where one might use nested functions or coroutines in other languages. A typical example is a tokenizing input buffer, a generator of sorts, where one call returns the next field in the current record, and another call skips to the start of the next record, used for example for CSV file parsing. Queues and FIFOs are another use case, when one side may be generating or consuming more data than may fit in RAM, so the two do need to be interleaved to keep the temporary memory requirements within reason.

A nested function is simply one that receives a closure describing the state of its parent – including any variables it needs access to. This works even when the 'nested' function is recursive. Furthermore, such a 'nested' function can be 'nested' under any parent function that uses the same type of closure to describe its state. So no, I don't really need nested functions either, as long as using such closures/state structures is easy and straightforward.

sizeof, typeof, compatible_types, and typenameof operators should be provided even by statically typed languages.
C provides only the first one; GCC, Clang, Intel CC and others provide the two next ones as extensions to C; and the fourth one can only be approximated by preprocessor stringification. This leads to less than optimal interfaces when values of runtime-dependent types need to be passed. Typically, the calls need to explicitly specify the type as a literal (token or string in C) and the size of the value, but there is no good way to check they match the expression being passed. (If the expression type is a structure with a C99 flexible array member, even sizeof won't work.)
In particular, stringifying any typeof (expression) will only lead to a string containing that text, not the type name.

These kinds of use cases are why I believe so strongly that looking at existing code (in different competing languages, suitable for the target niche) to understand what kind of approaches they use for solving these kinds of problems, is extremely important. Because there is a lot of bad code out there, looking for consensus is not a good idea: you need to find the 'best' ways of solving each kind of a problem, and learn from that. (To me, 'best' here means robust with errors easy to catch, without imposing a strict usage pattern; and not 'me like most'.)

These are the kinds of ideas I'm interested in, exactly the kinds of things that could be discussed and considered in the context of a new programming language. C doesn't have any concept of closures or anonymous functions (unless there are some vendors who added these?).

As for nested procedures, first and foremost these are easy to incorporate into a new language, low cost (not free but hardly expensive). Code leveraging them is clear, easy to reason about. The state data the nested procedure can access is clear just by looking at the code. The inaccessibility of the nested function to other code is clear too. We can already nest scopes in C too, not a nested function but it is a nested scope.

GCC has extensions that deliver nested functions; someone did invest the effort to define and implement this.

Nested functions are not the same as closures either, they are different, perhaps one could achieve some desired outcome using either closures or nested functions but they are distinct concepts, not interchangeable.
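
For comparison, the state-struct approach described in the quoted posts might look roughly like this in portable C. This is only a sketch under stated assumptions: `Record`, `SortContext`, `compare_by_key` and `sort_records` are made-up names, and a small hand-written insertion sort stands in for qsort_r() because the argument order of qsort_r() differs between C libraries. The point is that the comparison function is a peer, and everything it needs from its "parent" arrives through an explicit context pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record with several candidate sort keys. */
typedef struct {
    int keys[3];
} Record;

/* The state the comparison needs from its caller. */
typedef struct {
    size_t key_index;   /* which key to compare on */
    int    descending;  /* non-zero reverses the order */
} SortContext;

/* Comparison in the style of qsort_r: the context arrives as void *. */
typedef int (*CompareFn)(const Record *a, const Record *b, void *context);

static int compare_by_key(const Record *a, const Record *b, void *context) {
    const SortContext *ctx = context;   /* cast back to the real type */
    int ka = a->keys[ctx->key_index];
    int kb = b->keys[ctx->key_index];
    int result = (ka > kb) - (ka < kb);
    return ctx->descending ? -result : result;
}

/* Plain insertion sort that forwards the caller's context to every comparison. */
static void sort_records(Record *items, size_t count, CompareFn compare, void *context) {
    assert(items != NULL && compare != NULL);
    for (size_t i = 1; i < count; i++) {
        Record current = items[i];
        size_t j = i;
        while (j > 0 && compare(&items[j - 1], &current, context) > 0) {
            items[j] = items[j - 1];
            j--;
        }
        items[j] = current;
    }
}
```

A caller fills in a `SortContext` (say `{ 1, 1 }` for "second key, descending") and passes its address to `sort_records`. A nested procedure would get the same variables implicitly from the enclosing scope; here the sharing is explicit, which is the trade-off being debated.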

Stuff like "typeof" and "typenameof" and so on, these too are things a new language could support. As a list of desirable features begins to grow, then so too does an encompassing paradigm, the language will begin to take shape by systematizing the way the features are defined.
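
To illustrate the gap mentioned in the quote, here is a small C11 sketch. `STRINGIFY`, `TYPE_NAME_OF` and `same_text` are hypothetical names; the only language features relied on are preprocessor stringification and `_Generic`. Stringification reproduces the source text of its argument, while `_Generic` can approximate a typenameof, but only for types that are listed by hand.

```c
#include <assert.h>
#include <string.h>

/* Stringification yields the source text of the argument, not a type name. */
#define STRINGIFY(x) #x

/* Approximation of "typenameof" with C11 _Generic; only the listed types are known. */
#define TYPE_NAME_OF(expr) _Generic((expr), \
    char:          "char",                  \
    int:           "int",                   \
    unsigned int:  "unsigned int",          \
    long:          "long",                  \
    float:         "float",                 \
    double:        "double",                \
    char *:        "char *",                \
    const char *:  "const char *",          \
    default:       "(unlisted type)")

/* Helper for checks: non-zero when two C strings are equal. */
static int same_text(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

Here `STRINGIFY(1 + 2.0)` is the string "1 + 2.0" and `STRINGIFY(typeof(1 + 2.0))` is just "typeof(1 + 2.0)", whereas `TYPE_NAME_OF(1 + 2.0)` selects "double". Any type missing from the list falls through to the default branch, which is why this is an approximation and not a real operator.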

Note too the plethora of C "extensions" by umpteen vendors, often incompatible, their presence tells us something important...

Sherlock Holmes · « **Reply #269 on:** November 29, 2022, 04:12:59 pm »

Quote from: rstofer on November 29, 2022, 03:45:09 pm

Quote from: brucehoult on November 29, 2022, 03:32:28 am
Quote from: rstofer on November 29, 2022, 02:43:14 am
If you look at the Pascal syntax diagram you will start to see where recursive descent was a part of the syntax and that nested procedures are the way to go:

I don't see the link.

Mutually-recursive procedures, yes. But not nested.

Try this:
Code: [Select]
procedure expression;
  var lattr: attr; lop: operator; typind: char; lsize: addrrange;

  procedure simpleexpression(fsys: setofsys);
    var lattr: attr; lop: operator; signed: boolean;

    procedure term(fsys: setofsys);
      var lattr: attr; lop: operator;

      procedure factor(fsys: setofsys);
        var lcp: ctp; lvp: csp; varpart: boolean;
            cstpart: setty; lsp: stp;
      begin
        if not (sy in facbegsys) then
          begin error(58); skip(fsys + facbegsys);
            gattr.typtr := nil

Page 138 and 139 of the syntax diagrams show the relationship between expression, simpleexpression, term and factor. Note that factor can recursively call expression and start the entire tree over again.

The compiler code flows directly from the syntax diagrams.

Compared to the transition matrix we were taught in grad school, Wirth's approach looks elegant.

Wirth defined a simple language named PL/0 too that he used to convey this idea. I'll tell you another truly superb book on compiler design, one I used a lot in the past it is Understanding and Writing Compilers by Richard Bornat. It is oriented to recursive descent and is very readable, the author has a deep understanding of machine architectures and code generation practicalities, gives easy to follow examples akin to the stuff Wirth wrote, I doubt I could have gotten as far as I did with my PL/I implementation without this book.

I have numerous books, and back when I started PL/I there was no internet and books were the bedrock of technical work like this. The "Dragon Book" has its place of course, and a few others, but Bornat's book is right up there with the best of them, a much overlooked book IMHO.

Sherlock Holmes · « **Reply #270 on:** November 29, 2022, 04:48:47 pm »

Now regarding coroutines, the following has emerged over the past week as I explore these. The original definition of coroutine (and the one embodied in the pseudo code below) is the one found in the 1963 paper: multiple execution contexts that can suspend and resume through mutual interaction.

In the emerging grammar, "coroutine" makes sense as an optional attribute one can place on a procedure definition, consider:

Code: [Select]

proc main
{

	// left_offset = 0;
	// left_frame = XXXX
	// right_offset = 0;
	// right_frame = YYYY
	// curr_frame = left_frame;
	// goto left_offset

	call left (X); // must create a collective stack frame that all coroutines will (in essence) share.

}

proc left(L) coroutine
{

	arg L;

	// dcl locals

	// do something

	yield to right(Z); // actually a goto right_offset into right and set left_offset to 1 and curr_frame to right_frame

	// do something else

	yield to right(Z); // actually a goto right_offset into right and set left_offset to 2 and curr_frame to right_frame

}

proc right(R) coroutine
{

	arg R;

	// dcl locals

	// do something

	yield to left(W); // actually a goto left_offset into left and set right_offset to 1 and curr_frame to left_frame

	// do something else

	yield to left(W); // actually a goto left_offset into left and set right_offset to 2 and curr_frame to left_frame

}

The comments are simple, informal representations of how one might implement this: the register mechanics that arise from this behavior. The compiler, by static analysis (i.e. by the "coroutine" attribute), can see that a single aggregate stack frame must be created when main initially invokes left via a conventional call. That results in a single stack frame that has working space (subframes) for both left and right. That stack frame exists until the call in main returns; this is conventional call/return.

The proc left executes normally until it reaches yield to. At that point a jump/goto starts execution at the address implied by right_offset (which lives in the aggregate stack frame) after first adjusting left_offset which represents where the execution of left will resume when right eventually does a yield to left.

Execution thus cycles back and forth, each procedure can access its args and locals because it has its own "slot" within the aggregate stack frame. Inside the procedures things look conventional, the code behaves conventionally.
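
The mechanics described above can be imitated in plain C, as a rough sketch only. All names here (`AggregateFrame`, `resume_left`, `resume_right`, `run_pair`, the trace buffer) are assumptions, and a small dispatch loop with a `switch` on the saved offset stands in for the direct gotos a compiler would emit. One aggregate frame holds a subframe per coroutine, each yield saves the yielder's resume offset, stores the passed value in the target's argument slot and hands control to the other side.

```c
#include <assert.h>
#include <stddef.h>

enum { TRACE_MAX = 16 };

typedef enum { RUN_LEFT, RUN_RIGHT, RUN_DONE } Current;

/* One aggregate frame with a subframe (resume offset plus argument) per coroutine. */
typedef struct {
    Current curr;                           /* plays the role of curr_frame */
    struct { int offset; int arg; } left;   /* left_offset and left's slot */
    struct { int offset; int arg; } right;  /* right_offset and right's slot */
    char   trace[TRACE_MAX];                /* order in which the steps ran */
    size_t trace_len;
} AggregateFrame;

static void record(AggregateFrame *f, char step) {
    assert(f->trace_len + 1 < TRACE_MAX);
    f->trace[f->trace_len++] = step;
    f->trace[f->trace_len] = '\0';
}

/* Run left from its saved offset up to its next "yield to right(Z)". */
static void resume_left(AggregateFrame *f) {
    switch (f->left.offset) {
    case 0: record(f, 'a'); f->right.arg = f->left.arg + 1;
            f->left.offset = 1; f->curr = RUN_RIGHT; return;
    case 1: record(f, 'b'); f->right.arg = f->left.arg + 1;
            f->left.offset = 2; f->curr = RUN_RIGHT; return;
    default: f->curr = RUN_DONE; return;    /* reached the end of left */
    }
}

/* Run right from its saved offset up to its next "yield to left(W)". */
static void resume_right(AggregateFrame *f) {
    switch (f->right.offset) {
    case 0: record(f, 'x'); f->left.arg = f->right.arg * 2;
            f->right.offset = 1; f->curr = RUN_LEFT; return;
    case 1: record(f, 'y'); f->left.arg = f->right.arg * 2;
            f->right.offset = 2; f->curr = RUN_LEFT; return;
    default: f->curr = RUN_DONE; return;    /* reached the end of right */
    }
}

/* Stand-in for "call left (X)" in main: set up the frame, run until one side ends. */
static void run_pair(AggregateFrame *f, int x) {
    assert(f != NULL);
    f->curr = RUN_LEFT;
    f->left.offset = 0;  f->left.arg = x;
    f->right.offset = 0; f->right.arg = 0;
    f->trace_len = 0;    f->trace[0] = '\0';
    while (f->curr != RUN_DONE) {
        if (f->curr == RUN_LEFT) resume_left(f); else resume_right(f);
    }
}
```

Starting with `run_pair(&frame, 1)` the steps run in the order a, x, b, y, after which left is resumed a third time, finds nothing left to do and ends the pair. Nothing is pushed or popped between the two sides; both subframes live in the one `AggregateFrame` for the whole run.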

The yield to can exist in the following variants, supporting coroutines and cofunctions:

Code: [Select]


yield (item);

yield to some_procedure;

yield to some_procedure(args);

yield (item) to some_procedure;

yield (item) to some_procedure(args);

The first is how a basic iterator could be written. Corresponding to these we have:

Code: [Select]


x = yield to some_procedure;

x = yield to some_procedure(args);

x = yield (y) to some_procedure(args);

where in its most general form:

Code: [Select]


a = yield (x) to somewhere(y);

Where a is the "returned" value that appeared in some other yield (value) clause in some other coroutine. The (y) argument also becomes the current value of the parameter defined in the target coroutine's parameter list.

and so on.

So arguments can be passed into a coroutine and values can be returned from them, in a variety of ways. There is no setup/teardown of stack frames as the pair of coroutines executes, there is no pushing and popping of arguments either. The subframes exist and remain at all times, only when control eventually reaches the end of the final procedure (or a return is executed) does the aggregate stack frame vanish in the same way any stack frame vanishes when a called function/procedure terminates.
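
The plain `yield (item);` form, the basic iterator case, can be approximated in C with the well-known trick of switching on a saved resume point. This is a swapped-in technique for illustration, not the proposed IPL implementation, and `RangeGenerator`, `range_init` and `range_next` are hypothetical names. The generator's locals live in a persistent struct (its "subframe"), so no frame is set up or torn down between items.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Persistent subframe for one generator: resume point plus its locals. */
typedef struct {
    int resume_point;  /* 0 = not started, 1 = suspended inside the loop, 2 = finished */
    int next;          /* local that survives across yields */
    int limit;
} RangeGenerator;

static void range_init(RangeGenerator *g, int start, int limit) {
    assert(g != NULL);
    g->resume_point = 0;
    g->next = start;
    g->limit = limit;
}

/* Each call resumes after the previous "yield (item)"; returns false once exhausted. */
static bool range_next(RangeGenerator *g, int *item) {
    switch (g->resume_point) {
    case 0:
        while (g->next < g->limit) {
            *item = g->next;        /* yield (item); */
            g->resume_point = 1;
            return true;
    case 1:                         /* execution resumes here on the next call */
            g->next++;
        }
        g->resume_point = 2;
        /* fall through */
    default:
        return false;
    }
}
```

After `range_init(&g, 1, 4)`, successive calls to `range_next` hand back 1, 2 and 3 and then report exhaustion. The `case 1:` label inside the loop body is legal C and is what lets the function continue mid-loop instead of starting over.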

So this is an outline of the standard, historic coroutine pattern. It does what is described in the historic literature, and of course we can have more than two procedures/functions, and we can even pass "function pointers" (to use C parlance) into them so that the code they "yield to" is defined at runtime, not compile time as in my examples.

I've never really been able to find a clear detailed description of the classic historic coroutine, so some of this is a result of my efforts to make sense of the idea.

Thoughts?

rstofer · « **Reply #271 on:** November 29, 2022, 05:01:07 pm »

Quote from: Sherlock Holmes on November 29, 2022, 04:12:59 pm

Wirth defined a simple language named PL/0 too that he used to convey this idea.

I have coded up that PL/0 compiler several times, once in Fortran. Dozens of pages of Fortran or a very few pages of Pascal. I'll concede that the Fortran was crap code but, really, sets aren't part of the language and they are terribly important to the compiler.

Fortunately, the CDC 6400 on which the compiler was developed had a 60 bit word and that was enough bits to accommodate the elements. It gets ugly when you have to compare across 8 or 16 bit or even 32 bit entities. Then there is the lack of formal recursion in Fortran (I believe I was using a IV version of Fortran).
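
As a rough illustration of why one wide word is enough, here is how Pascal-style symbol sets such as fsys and facbegsys could be packed into a single 64-bit integer in C. The `Symbol` enumeration and the helper names are made up for this sketch; only the idea of one bit per set element comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical symbol kinds; all of them must fit in one 64-bit word. */
typedef enum {
    SYM_IDENT, SYM_NUMBER, SYM_LPAREN, SYM_RPAREN,
    SYM_PLUS, SYM_SEMICOLON, SYM_COUNT
} Symbol;

typedef uint64_t SymbolSet;

/* Singleton set: one bit per symbol. */
static SymbolSet set_of(Symbol s) {
    assert(SYM_COUNT <= 64 && (unsigned)s < (unsigned)SYM_COUNT);
    return (SymbolSet)1 << s;
}

/* Pascal "a + b" on sets. */
static SymbolSet set_union(SymbolSet a, SymbolSet b) {
    return a | b;
}

/* Pascal "s in set". */
static int set_contains(SymbolSet set, Symbol s) {
    return (set & set_of(s)) != 0;
}
```

With these helpers, `skip(fsys + facbegsys)` becomes a call taking `set_union(fsys, facbegsys)`, and `sy in facbegsys` becomes `set_contains(facbegsys, sy)`. Once the number of symbols exceeds the word size, the set has to be split across several words, which is where it gets ugly.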

Quote

I'll tell you another truly superb book on compiler design, one I used a lot in the past it is Understanding and Writing Compilers by Richard Bornat. It is oriented to recursive descent and is very readable, the author has a deep understanding of machine architectures and code generation practicalities, gives easy to follow examples akin to the stuff Wirth wrote, I doubt I could have gotten as far as I did with my PL/I implementation without this book.

I ordered a copy from alibris.com - $15 including shipping. I get a lot of used books from Alibris. I suspect that's the way my library will go as well.

eutectique · « **Reply #272 on:** November 29, 2022, 05:05:17 pm »

Quote from: Nominal Animal on November 24, 2022, 10:49:26 pm

The only thing that annoys me is that the linker does not yet know how to sort those entries at link time.

There is SORT_BY_NAME (or simply SORT) keyword for input section description, does it not work?

Sherlock Holmes · « **Reply #273 on:** November 29, 2022, 05:24:35 pm »

Quote from: rstofer on November 29, 2022, 05:01:07 pm

Quote from: Sherlock Holmes on November 29, 2022, 04:12:59 pm

Wirth defined a simple language named PL/0 too that he used to convey this idea.
I have coded up that PL/0 compiler several times, once in Fortran. Dozens of pages of Fortran or a very few pages of Pascal. I'll concede that the Fortran was crap code but, really, sets aren't part of the language and they are terribly important to the compiler.

Fortunately, the CDC 6400 on which the compiler was developed had a 60 bit word and that was enough bits to accommodate the elements. It gets ugly when you have to compare across 8 or 16 bit or even 32 bit entities. Then there is the lack of formal recursion in Fortran (I believe I was using a IV version of Fortran).

Quote
I'll tell you another truly superb book on compiler design, one I used a lot in the past it is Understanding and Writing Compilers by Richard Bornat. It is oriented to recursive descent and is very readable, the author has a deep understanding of machine architectures and code generation practicalities, gives easy to follow examples akin to the stuff Wirth wrote, I doubt I could have gotten as far as I did with my PL/I implementation without this book.
I ordered a copy from alibris.com - $15 including shipping. I get a lot of used books from Alibris. I suspect that's the way my library will go as well.

Oh, well let me know what you think, I used the book in the early 90s then lost it, but bought a used copy from Amazon last year!

brucehoult · « **Reply #274 on:** November 29, 2022, 06:35:13 pm »

Quote from: rstofer on November 29, 2022, 03:45:09 pm

Quote from: brucehoult on November 29, 2022, 03:32:28 am
Quote from: rstofer on November 29, 2022, 02:43:14 am
If you look at the Pascal syntax diagram you will start to see where recursive descent was a part of the syntax and that nested procedures are the way to go:

I don't see the link.

Mutually-recursive procedures, yes. But not nested.

Try this:
Code: [Select]
procedure expression;
  var lattr: attr; lop: operator; typind: char; lsize: addrrange;

  procedure simpleexpression(fsys: setofsys);
    var lattr: attr; lop: operator; signed: boolean;

    procedure term(fsys: setofsys);
      var lattr: attr; lop: operator;

      procedure factor(fsys: setofsys);
        var lcp: ctp; lvp: csp; varpart: boolean;
            cstpart: setty; lsp: stp;
      begin
        if not (sy in facbegsys) then
          begin error(58); skip(fsys + facbegsys);
            gattr.typtr := nil

Page 138 and 139 of the syntax diagrams show the relationship between expression, simpleexpression, term and factor. Note that factor can recursively call expression and start the entire tree over again.

The compiler code flows directly from the syntax diagrams.

Compared to the transition matrix we were taught in grad school, Wirth's approach looks elegant.

There are three lines of actual executable code there, referring to the following names:

Code: [Select]

sy
facbegsys
error
skip
fsys
gattr

Of those, fsys is a formal argument of the current function (factor). All the others are undefined by the code shown here. "Free variables" if you will. Globals.

NONE of them are defined by any of the enclosing functions: expression, simpleexpression, term.

I can't see any necessity here for them to be nested functions instead of peer functions. It appears no use is being made of the nested scopes.

