Author Topic: text / strings in C  (Read 2714 times)


Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1775
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #100 on: April 15, 2020, 11:37:23 am »
Surprisingly, there is quite a lot of bare-metal C code, that is portable between wildly different architectures.  And I'm not talking only about the Linux kernel, either -- although it is probably the best-known example.

Any embedded device that has a product lifetime longer than say 5 years should consider the portability aspect, really; it may make a significant difference to the BOM cost, later on.  The portability aspect is the only thing that can save on the software development cost.  If you look at many embedded devices, like routers, you'll notice they can completely switch hardware architectures within the same product, between versions/revisions.  You don't do that, unless (most of) the same code can work on both.

Others who work on actual commercial embedded products could chime in, as I don't, but even an interface shim layer on top of existing HALs, built using the custom types I mentioned, can make the actual product much easier to port between vendors (and their HALs).

In particular, for libraries and HALs, one must remember that it does not matter whether you have one or more implementations, as long as the user-developer facing side is the same across all of them; then, the user-developer does not even need to port their code between the architectures, as it should Just Work -- much like in the Arduino system.  (Except that because the Arduino folks did not consider the integer types, there is a lot of stuff that makes life hard for library writers, and makes compiled Arduino code less than optimal, particularly when comparing 8-bit AVR and 32-bit ARMs.)

A lot of what I have written here, Simon, is something to consider only, and perhaps let simmer; something you might recall when encountering a related problem.  In particular, do feel comfortable using just size_t and int and unsigned int, because that's how most existing C code does it.

Besides, especially during the learning phase, it is most important to get stuff working, even if it is not that elegant/clean/optimal, as writing the code is just a small part of software engineering, and you need to have somewhat working code to get experience on the rest, especially testing, maintenance (and porting, yes), documentation, and so on.

Also, I've found out that if something turns out to be useful in the long term or in more than one environment, you do end up rewriting it, incorporating all the features (and dropping the unneeded ones) and details one has learned from experience.  So, it is not "ugly" or "bad" to write code that you know is far from optimal!  (Security, on the other hand, must be designed in to the software, and cannot be bolted on top afterwards.)

Indeed, one of the common programmer faults is premature optimization.  Algorithmic and system-level optimizations always yield much better results than code-level optimizations, and my own experience says that one shouldn't bother with code-level optimizations at all before the first rewrite; the actual use and testing of the "crude"/"naïve" version always teaches me so much about the actual human-scale problem/task at hand, that code optimization before that is usually just wasted time.  There are exceptions, of course, but there is something in code-level optimization that tends to attract programmer minds, and being aware of it, and of the fact that it doesn't matter much at all in real life, is kinda important.
« Last Edit: April 15, 2020, 11:39:05 am by Nominal Animal »

Offline Simon

  • Global Moderator
  • *****
  • Posts: 15110
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics
Re: text / strings in C
« Reply #101 on: April 15, 2020, 01:16:24 pm »
Barebones code cannot be portable unless the new device has the exact same peripheral functionality and registers. Application code can be portable. I assume this is the aim of any HAL: to abstract out the hardware so that the same top level application code works. So currently I am working with the SERCOM of the SAMC in SPI mode. It is unlikely that outside of the SAMC family the registers will be the same. I have already found the SAMD registers to be different for the timer/counter even though it's a similar device. But yes, I aim to present to my application code functions to call to interact with the hardware that can be rewritten for other architectures, so that a move of target would mean that, having rewritten my low level drivers, I can just port it over. My SPI libraries are in two files, one that does hardware interaction and one that creates useful things that the main code can call.

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 5452
  • Country: fr
Re: text / strings in C
« Reply #102 on: April 15, 2020, 06:59:53 pm »
Surprisingly, there is quite a lot of bare-metal C code, that is portable between wildly different architectures.

Yes. Most of the code I've ever written was completely portable. Even low-level stuff. Obviously hardware-related code would need to be modified for a different target, but even that you can write as portably as can be, so the porting effort is minimal. It saves a gigantic amount of time in the long run. I'm sad to see that many people never realize it, and kind of keep "rewriting" the same stuff over and over again (possibly with the same initial bugs to iron out), as though they were paid by the number of lines of code. Sadly, I think this is very close to being the case for many employed developers.


Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 1775
  • Country: fi
    • My home page and email address
Re: text / strings in C
« Reply #103 on: April 16, 2020, 04:26:51 am »
Barebones code cannot be portable unless the new device has the exact same peripheral functionality and registers.
No, that is not true.

There is a difference between "being portable" and "compiling as-is" on different architectures.  The code can have modular parts, where the alternative implementations of a part all provide the exact same interface and differ only in their internal implementation.  This is usually called abstraction -- and if you do it in library form for a set of hardware, you get a hardware abstraction layer, or a driver -- but at the core, it is just a modular approach: choosing how a detail is implemented (often, but not always, based on the hardware it needs to access) within a single software project.
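To make that concrete, here is a minimal sketch of one interface with a swappable backend, shown as a single listing for brevity.  All the names (spi_port_init, spi_port_exchange, spi_transfer, and the file names in the comments) are made up for illustration and are not from any real HAL; the "backend" is a host-side loopback stand-in that simply returns the previously sent byte, so the portable part can be exercised without hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- spi_port.h (hypothetical): the interface every backend provides ---- */
void    spi_port_init(void);
uint8_t spi_port_exchange(uint8_t out);   /* send one byte, return the byte received */

/* ---- spi_port_loopback.c (hypothetical): stand-in backend for host testing.
 * A real backend would touch the target's SPI registers here instead;
 * this one just hands back the byte sent on the previous call. */
static uint8_t loopback_prev;

void spi_port_init(void)
{
    loopback_prev = 0;
}

uint8_t spi_port_exchange(uint8_t out)
{
    uint8_t in = loopback_prev;
    loopback_prev = out;
    return in;
}

/* ---- spi_util.c (hypothetical): portable code, written only against
 * the two functions declared in spi_port.h. */
void spi_transfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
    for (size_t i = 0; i < len; i++)
        rx[i] = spi_port_exchange(tx[i]);
}
```

Porting to a new target would then mean writing another file that defines the same two spi_port_ functions; spi_transfer() and everything built on it stays untouched.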

(Note that I am writing the following in case someone finds it interesting.)

There are four common ways this can be done:
  • As a library.  (A set of these libraries, or a single full-set one, is typically called a hardware abstraction layer.)  The library provides a public Application Programming Interface that is the same across all hardware; it is just internally implemented differently for different hardware.
  • As dynamically loaded drivers.  This is common in full OS kernels, where the driver functionality is exposed to userspace using kernel core provided services.  In particular, drivers rarely communicate with each other, and if they access the same resources, an arbitration layer or multiplexer is needed.  In the Linux world, you have both in-kernel drivers, and out-of-tree drivers.  The out-of-tree drivers have to track the kernel, because the Linux kernel does not provide any kind of stable API for drivers; only the kernel-userspace binary interfaces are stable.
  • Conditional linking. The project has optional source files, with only a subset compiled and linked in to the final binary or binaries. This usually involves a configuration utility, where one can choose which variants are compiled in.  The configuration utility not only sets the preprocessor flags needed for the compilation, but also provides lists of enabled source files.  The Linux kernel, for example, developed kbuild and kconfig for this, with several frontends for the configuration, from command-line to terminal to graphical user interfaces.
  • Conditional compilation.  Preprocessor macros are used to select which source code files are included, or to choose between sections of source code, so that the same code works on multiple different hardware.  An excellent example of this is the <immintrin.h> header on some x86 and all x86-64 architectures, which covers the SSE and AVX vector extensions: it conditionally includes sub-header files based on compiler-provided preprocessor macros, and those sub-headers expose compiler built-ins and extensions to C code using a standard interface.
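As a rough illustration of the last approach, the sketch below picks between an SSE and a plain C implementation of the same helper at compile time.  It assumes a compiler that defines the __SSE__ macro when SSE is enabled (GCC and Clang do); the function names sum4 and sum_array are invented for this example.  Only sum4() differs between targets, and the caller is identical everywhere.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <immintrin.h>

/* x86 with SSE enabled: add the four lanes of one vector register. */
static float sum4(const float *v)
{
    __m128 x = _mm_loadu_ps(v);                  /* x = { v0, v1, v2, v3 }     */
    x = _mm_add_ps(x, _mm_movehl_ps(x, x));      /* lanes 0,1 = v0+v2, v1+v3   */
    x = _mm_add_ss(x, _mm_shuffle_ps(x, x, 1));  /* lane 0 = (v0+v2)+(v1+v3)   */
    return _mm_cvtss_f32(x);
}

#else

/* Any other target: plain C, same result and same summation order. */
static float sum4(const float *v)
{
    return (v[0] + v[2]) + (v[1] + v[3]);
}

#endif

/* Portable caller: no preprocessor conditionals needed here. */
float sum_array(const float *v, size_t n)
{
    float total = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4)   /* full groups of four */
        total += sum4(v + i);
    for (; i < n; i++)           /* leftover elements */
        total += v[i];

    return total;
}
```

Keeping the conditional part confined to one small helper is a design choice, not a requirement; it just limits how much code has to be checked separately per target.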

In all four cases, the way the C standard defines the integer types (char, short, int, long, and their unsigned variants) makes it hard to write efficient portable code that works on different hardware (different register and word sizes, and memory access methods).  Custom types, on top of exact-width intN_t/uintN_t, size_t, and int_fastN_t/uint_fastN_t, much better match the programmer intent, while still allowing the compiler to generate optimal code.
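A small sketch of what such custom types could look like follows.  The type and function names (adc_sample_t, adc_sum_t, adc_count_t, adc_average) are hypothetical, chosen only for this example: the storage type is exact-width because it describes a data format, while the accumulator and counter use the "fast" types so that each target can pick its natural register size.

```c
#include <stdint.h>

/* Hypothetical project-specific types, for illustration only. */
typedef uint8_t       adc_sample_t;  /* exact width: this is the storage format */
typedef uint_fast16_t adc_sum_t;     /* at least 16 bits, whatever is fastest   */
typedef uint_fast8_t  adc_count_t;   /* at least 8 bits, whatever is fastest    */

/* Average of up to 255 samples.  The worst case sum is 255 * 255 = 65025,
 * which fits in 16 bits, so adc_sum_t cannot overflow for any valid count. */
adc_sample_t adc_average(const adc_sample_t *samples, adc_count_t count)
{
    adc_sum_t sum = 0;

    if (count == 0)
        return 0;

    for (adc_count_t i = 0; i < count; i++)
        sum += samples[i];

    return (adc_sample_t)(sum / count);
}
```

On an 8-bit AVR the fast types can stay as narrow as 8 and 16 bits, while a 32-bit ARM is free to use 32-bit registers throughout; the source stays the same either way.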

One could of course argue that because it is not the exact same preprocessed C source for different architectures, it is not exactly the same C code, but I disagree, because the code is part of the same project, is in the same intertwined and interconnected source code set, written and maintained by the same people.
« Last Edit: April 16, 2020, 04:31:32 am by Nominal Animal »
