Well, in case it is useful, let me describe the code!
If not, no worries; I'm perfectly happy to just leave this here in case someone else may stumble on this later on, and find this useful.
The core idea is that we use a single read() call to obtain whatever is already available in the pipe. See man 7 pipe, especially the "I/O on pipes" and "Pipe capacity" sections.
To avoid dealing with partial samples, we do additional read()s until the buffer holds a multiple of 4 chars (i.e. whole int32_t samples), or is completely full (BLOCK_SIZE chars).
If read() returns 0, it indicates end-of-input, and the program can exit. For simplicity, if this occurs after a supplementary read because the previous ones did not return an integral number of samples, we just throw the entire tail buffer away. (This really is intended to be used as a real-time continuous data filter. To fix this, we could just set a flag ending the outer loop iterations, instead of immediately returning.)
read() should return -1 on error, but because of certain old bugs related to 32-bit vs. 64-bit return value handling, I like to treat all other negative values as EIO errors as well.
(Technically, read() will return -1 with errno == EINTR when a signal is delivered to a handler installed without the SA_RESTART flag; and, if the descriptor is nonblocking, -1 with errno == EAGAIN or errno == EWOULDBLOCK if no data is available. But since this program installs no userspace signal handlers, and we can assume standard input and output are blocking, we can safely consider all of these as actual errors the program cannot deal with.)
On the write side, we assume that most reads we do are of the full buffer size, and that the converted data will therefore usually go out in a single write() call. However, the code does not assume that will indeed happen every single time. Instead, it uses a loop to write the entire converted output buffer. See the man 2 write man page for the description, the rules, and further information.
Since there is no cleanup or such to do when complete, and we exit the program directly from within the loop, the outermost loop is a forever loop. I prefer while (1) { ... }; some others prefer for (;;) { ... }. Either one works fine here.
The logic thus described, let's open up the important parts. For clarity, I'll separate each chunk with a horizontal line.
/* Read some (complete) int32_t's. */
do {
    ssize_t n = read(STDIN_FILENO, (char *)(in_buf) + have, (sizeof in_buf) - have);
    if (n > 0)
        have += n;
    else
    if (n == 0)
        return EXIT_SUCCESS;
    else
        return EXIT_FAILURE;
} while (!have || (have % sizeof in_buf[0]));
Since read() returns the number of chars, we need to consider our input buffer as a buffer of chars. Within this loop, have is the number of chars we already have in in_buf. Thus, (char *)(in_buf) + have is a pointer to the first unused char in in_buf, and (sizeof in_buf) - have is the number of unused chars in it. We do a read(), trying to fill the rest of the buffer; and since have starts at 0, we actually try to fill the entire buffer. Of course, read() will block until there is some data in the pipe, and then return that data (up to the limit we specified). n is the actual number of chars we read.
The if clauses check whether we did get any data, and if not, exit the program. (Again, while we may throw some already-read data away, we only do so if that data did not end at a valid sample boundary. To me, this is sufficient to indicate the tail part of the data is suspect anyway, and not worth processing.) The loop condition can be read in English as "as long as have is zero, or have is not a multiple of the size of the in_buf array elements".
After the above loop is done, we divide have by the size of the in_buf array elements, so that it becomes the number of samples we have in in_buf. Because our reads started at the beginning of the buffer, we know the data is properly aligned. Essentially, it is at this point that we change from treating the input as a stream of chars to interpreting it as the representation of the in_buf array elements. (I like having such clear logical transition points.)
The next part, the conversion from in_buf to out_buf,
/* Convert to float. */
{
    float *const end = out_buf + have;
    int32_t     *src = in_buf;
    float       *dst = out_buf;
    while (dst < end)
        *(dst++) = *(src++);
}
is just the pointer version of the simple loop
for (size_t i = 0; i < have; i++) {
    out_buf[i] = in_buf[i];
}
Why did I write it in the pointer form, when the simple loop is so much more readable?
I'm a creature of habit, and GCC used to generate better code when using pointers, compared to array indexing, on some architectures. x86 and x86-64 have powerful indexing built into their instruction sets, so the simple loop tended to be preferable on x86 and x86-64 even with old versions of GCC.
Or, you could equally say that since I was in the pointer-logic-thought-mode when writing the code, I just didn't stop and think before I wrote the loop, and just let my fingers type the solution when I was already thinking something else.
Even examining the possible code generated for the two loops at Compiler Explorer shows I really should have written the array indexing form instead. What can I say in my defense? I never claimed the code was the best I could think of; I only indicated this would be something I would write in a couple of minutes to perform the task I needed it to perform, with the logic I described above.
There is always room to learn and improve.
The final part, writing out the (full) converted out_buf, uses the same logic as the read loop, except that this time we loop until the buffer is empty.
end points to just past the final char to be written. Note how (char *)(out_buf + have) is the char pointer to the have'th element; it is deliberately NOT (char *)(out_buf) + have, which would be a char pointer to the have'th char.
Both ptr (pointing to the first char that needs to be written) and end (pointing to the char following the last char to be written, i.e. the first char after the data to be written) are pointers to char, because those are the units in which write() operates. The expression (size_t)(end - ptr) is the number of chars between the two pointers, if and only if ptr <= end (or equivalently end >= ptr).
The loop condition can be read in English as "while we have chars between ptr and end to be written":
while (ptr < end) {
    ssize_t n = write(STDOUT_FILENO, ptr, (size_t)(end - ptr));
    if (n > 0)
        ptr += n;
    else
        return EXIT_FAILURE;
}
All error cases are aggregated into the else clause. In theory, write() should return a positive value, or -1 with errno set (see man 2 write for a full description). I personally consider both other negative values and 0 as equivalent to an EIO error. (For descriptions of errno codes, see man 3 errno.)
Again, the key thing is that the low-level write() call does not necessarily write the entire buffer: it can return a short count, for example because the output is a pipe and the pipe is nearly full (because the reader is too slow in reading). It will block until at least one char is written, or an error occurs.
(write() can also return -1 with errno == EINTR if interrupted by a signal delivered to a userspace handler installed without the SA_RESTART flag; and if the descriptor were nonblocking, -1 with errno == EAGAIN or errno == EWOULDBLOCK if nothing can be written immediately; and so on.)
If standard output is a pipe, and the reader closes its end, this process will be killed by the SIGPIPE signal the first time we attempt to write to the pipe after the read end is closed. We could catch or ignore that signal, in which case the write would instead fail with errno == EINTR (if caught) or errno == EPIPE (if ignored). However, since being killed by SIGPIPE is fine with us (humans using this tool for its stated purpose), we don't need to worry about this either.
Even if you personally (whoever might be reading this post) do not find this useful, I think there is a chance it might be useful to someone finding this thread later via a web search, due to the keywords (stdin stdout pipe conversion).
In particular, those comparing the freestanding implementation to the standard library implementation might find it useful, because it illustrates how the logic stays the same even when the standard library is not used, and how GCC/clang/ICC extended assembly can be used to interface to something completely different (in this case, to a Linux kernel providing us with the read, write, and exit/exit_group syscalls).
In a very real way, the standard C library is an abstraction we can replace if we want to, as long as we can devise sane/effective/acceptable function interfaces for the things we need. Because the three syscalls this program needs are so simple, I just used the Linux equivalents for the API.
In particular, on all Linux architectures, sizeof (long) == sizeof (void *), so long has the same properties as intptr_t, and unsigned long the same properties as uintptr_t. This is why the function wrappers around the syscalls use long and not some other type. Other, non-Linux-compatible systems use other conventions, so this is the kind of thing one has to think about when creating interfaces and APIs in freestanding C.