Apart from my student days, when it was textbook stuff, I have rarely implemented those as a lot of small (and thus very fragmented) allocations linked with pointers. Even with good allocators, that approach eventually tends to fragment memory a lot and has bad locality properties, which is pretty bad for caches.
True, that's why I said it was just the simplest implementation.
Memory allocation strategies themselves are a very complex topic, especially when you move from a single contiguous heap (C sbrk() style) to obtaining and returning arbitrary pages from/to the OS (POSIX anonymous mmap() style). For pooled allocators, the discontinuous/page approach is not only easier, but when entire pools are discarded, the memory used can immediately be returned to the kernel, too; and fragmentation is much easier to avoid.
The main benefit of pooled allocators is that they make it easier to write code that does not leak memory, because the developer does not need to track each individual allocation, and can instead treat them as sets. Using a new set for each complex task means the programmer only needs to remember to discard the set, not each and every allocation individually.
I usually allocate largish chunks of memory and handle objects within them by indexing. In many cases this is quite effective, and the resulting performance is often much better. This is a form of pooling.
Yes, and that is exactly how and why NumPy implements its own array type in Python, for example. I myself also use this in C a lot; so often that I no longer even think about it. (I worry more about the names of the variables/members; I like to use used for the number of elements currently in use, size for the number of elements allocated for the area, and often data as the pointer to the area. It soothes my OCD, I guess, but some people have found such names unintuitive.)
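In C, that naming convention boils down to a little growable-array pattern; a sketch (illustrative only, names as described above):

```c
#include <stdlib.h>

/* The naming convention described above:
   used = elements currently in use,
   size = elements allocated for the area,
   data = pointer to the area. */
struct int_array {
    size_t  used;
    size_t  size;
    int    *data;
};

/* Append one value, growing the area when needed. Returns 0 on success. */
static int int_array_push(struct int_array *a, int value)
{
    if (a->used >= a->size) {
        /* Double the area (starting at 8 elements) to amortize reallocs. */
        size_t  newsize = (a->size > 0) ? 2 * a->size : 8;
        int    *newdata = realloc(a->data, newsize * sizeof a->data[0]);
        if (!newdata)
            return -1;
        a->data = newdata;
        a->size = newsize;
    }
    a->data[a->used++] = value;
    return 0;
}
```

Objects then live contiguously in data and are referred to by index, which keeps locality good and survives reallocations (pointers into the area would not).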
When I wrote above that "the locality of the allocations is not important at all", I tried to express that I am not talking about that form of pooling – which is common, but done on the "user" or "application side", with the application developer managing the pool – but about dynamic memory management where each allocation always belongs to a pool. For example, in C, you'd have malloc(pool, size), calloc(pool, size, count), free_all(pool), new_pool(parent_pool), and so on. (Logically, the NULL pool, the default pool, would usually have the same lifetime as the process.)
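Just to show the semantics of such an interface, here is a deliberately naive sketch (my names, prefixed to avoid clashing with the standard library; a real implementation would carve allocations out of larger blocks rather than keep a per-allocation linked list, for the fragmentation reasons discussed above):

```c
#include <stdlib.h>
#include <string.h>

/* Each allocation carries a small header linking it to its pool,
   so free_all() can release the whole set at once. */
typedef struct pool_node {
    struct pool_node *next;
    /* user data follows this header */
} pool_node;

typedef struct pool {
    pool_node *head;  /* list of allocations belonging to this pool */
} pool;

static void *pool_malloc(pool *p, size_t size)
{
    pool_node *n = malloc(sizeof *n + size);
    if (!n)
        return NULL;
    n->next = p->head;
    p->head = n;
    return n + 1;  /* memory just past the header */
}

static void *pool_calloc(pool *p, size_t size, size_t count)
{
    /* Note: this sketch omits the size*count overflow check
       that a real calloc-style function must do. */
    void *mem = pool_malloc(p, size * count);
    if (mem)
        memset(mem, 0, size * count);
    return mem;
}

static void free_all(pool *p)
{
    /* Discard the set; the caller never tracks individual allocations. */
    pool_node *n = p->head;
    while (n) {
        pool_node *next = n->next;
        free(n);
        n = next;
    }
    p->head = NULL;
}
```

The point is the calling convention: every allocation names its pool, and cleanup is one free_all() per task instead of one free() per object.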
I did not mean that data locality isn't important in general. It is, for efficiency and other reasons. I just wanted to try and highlight the usefulness of the other aspects of pooled memory management.
(SiliconWizard, I do believe you know perfectly well all I wrote in this post already; I only wrote this in the hopes that it helps others understand this too.)