-
My temporary stages are ready for mips2-el-o32 and mips2-el-n32, but when Catalyst tried to move on to the "compiled" compilers it immediately stopped working and reported "no C compiler", which is always a sign of something catastrophic.
To find out what the frog was happening, I manually mounted the stage (on a Qemu/MIPS32el VM) and found we still have this problem (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87316), already reported by others years ago.
Funny:
- g++ always works
- gcc randomly crashes with "bus error, internal compiler error"
- no matter which flags you use (-O0, -O1, -O2, ... -Os), it randomly works and randomly crashes, say 50%/50% :o :o :o
Also, when it works, on the same test source, it
- is 5x slower than gcc v4.1.2!!!
- eats 3x more RAM than gcc v4.1.2!!!
Sign of a regression?
I am going to move Catalyst to gcc 4.7.* (because I know it works well)
-
You should add a post to the ticket you mentioned, that will at least make it pop back up for developers. There's nothing we can do here.
I've never had such an issue using GCC 12 (x86_64, RISC-V and ARM), but no experience with the MIPS32 target.
If you can come up with a minimal example eliciting this behavior... problem is it "randomly crashes".
Also, you mentioned this happening on Qemu - does it happen only on Qemu or on real hardware as well?
-
Also on real hardware.
There is nothing to comment
gcc hallo.c
Sometimes you get nothing but "internal compiler error", sometimes you get a.out
-
Does anyone use mips2le with gcc v12 on, say, a rootfs built by OpenWrt or OE?
-
Also on real hardware.
There is nothing to comment
gcc hallo.c
Sometimes you get nothing but "internal compiler error", sometimes you get a.out
That's not very minimal. A HelloWorld tends to #include <stdio.h>, which is a huge amount of stuff to process.
Does it still fail if you do a really simple compile, such as...
int puts(const char *s);
int main(){
puts("HelloWorld\n");
return 0;
}
... which is functionally the same, but a much, much simpler job for cc1 to process.
Also try using C-reduce on the preprocessed source code:
https://embed.cs.utah.edu/creduce/
If cc1 is crashing randomly 50% of the time then running it 100 times in a row (or until it crashes) should be enough to find the problem. You can easily put a loop in the script you give to C-reduce.
-
OK, I'm going to go through that stage1 again this afternoon and try as you suggested.
I will also specifically test cc1 on pure C code without any #define
-
CC=gcc
CFLAGS=-O -Wall

all: obj/hallo1 obj/hallo2

obj/hallo1: obj/hallo1.o
	@echo -n "linking hallo1 ... "
	@mkdir -p obj
	@$(CC) -o obj/hallo1 obj/hallo1.o
	@echo "done"

obj/hallo2: obj/hallo2.o
	@echo -n "linking hallo2 ... "
	@mkdir -p obj
	@$(CC) -o obj/hallo2 obj/hallo2.o
	@echo "done"

obj/hallo1.o: hallo1.c hallo1.h
	@echo -n "compiling hallo1 ... "
	@mkdir -p obj
	@$(CC) $(CFLAGS) -o obj/hallo1.o -c hallo1.c
	@echo "done"

obj/hallo2.o: hallo2.c hallo2.h
	@echo -n "compiling hallo2 ... "
	@mkdir -p obj
	@$(CC) $(CFLAGS) -o obj/hallo2.o -c hallo2.c
	@echo "done"

clean:
	@echo -n "cleaning ... "
	@rm -f obj/*
	@echo "done"

int puts
(
    char *s
);
int main()
{
    puts("hAllo\n");
    return 0;
}

#include <stdio.h>
int main()
{
    printf("hAllo\n");
    return 0;
}

for cycle in {0..99}
do
    echo "cycle$cycle"
    mem_total="`cat /proc/meminfo_total`"
    mem_avail="`cat /proc/meminfo_avail`"
    mem_ready="`cat /proc/meminfo_ready`"
    make clean
    echo "mem $mem_ready of $mem_avail of $mem_total"
    make obj/hallo2
done

cycle0
cleaning ... done
mem 8436 of 44388 of 59152
compiling hallo1 ... done
linking hallo1 ... done
...
cycle99
cleaning ... done
mem 8188 of 44140 of 59152
compiling hallo1 ... done
linking hallo1 ... done
cycle0
cleaning ... done
mem 8228 of 44168 of 59152
compiling hallo2 ... during RTL pass: cse1
hallo2.c: In function ‘main’:
hallo2.c:8:1: internal compiler error: Bus error
8 | }
| ^
0x15fbeab internal_error(char const*, ...)
???:0
0x74c71b df_insn_rescan(rtx_insn*)
???:0
0x74e373 df_process_deferred_rescans()
???:0
0x736d87 df_finish_pass(bool)
???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
make: *** [Makefile:27: obj/hallo2.o] Error 1
..
cycle1
cleaning ... done
mem 8168 of 44024 of 59152
compiling hallo2 ... done
linking hallo2 ... done
..
cycle19
cleaning ... done
mem 8484 of 44340 of 59152
compiling hallo2 ... during RTL pass: reload
hallo2.c: In function ‘main’:
hallo2.c:8:1: internal compiler error: Bus error
8 | }
| ^
0x15fbeab internal_error(char const*, ...)
???:0
0x74ddaf df_update_entry_block_defs()
???:0
0x74e07b df_update_entry_exit_and_calls()
???:0
0x74e547 df_process_deferred_rescans()
???:0
0x736d87 df_finish_pass(bool)
???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
make: *** [Makefile:27: obj/hallo2.o] Error 1
do(make obj/hallo1): 100 of 100 all successful
do(make obj/hallo2): 73 of 100 successful
-
gcc version 12.2.1 2023-03-04, supported LTO compression algorithms: zlib
-
Cool. So do "gcc -E" on hallo2, then modify your test script to do exit 0 when a compile fails and exit 1 when all 100 succeed, and hit it with creduce.
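In case a shell loop is awkward on the target, here is a minimal sketch of the same repeat-until-it-fails logic in C. The function names are made up for illustration, and it treats any non-zero exit status or death by signal as "the compile failed"; a stricter test would also check stderr for the "internal compiler error" text.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[] up to max_runs times. Return the 0-based index of the first
 * run that fails (non-zero exit or death by signal), -1 if every run
 * succeeded, or -2 if fork()/waitpid() itself failed. */
int first_failing_run(char *const argv[], int max_runs)
{
    int run;

    assert(argv != NULL && argv[0] != NULL);
    for (run = 0; run < max_runs; run++) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -2;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);             /* exec failed: counts as a failure */
        }
        if (waitpid(pid, &status, 0) != pid)
            return -2;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return run;
    }
    return -1;
}

/* Exit status for the test script as described above: 0 if some run
 * failed, 1 if all runs succeeded (or nothing could be run). */
int interestingness_status(int first_failure)
{
    return first_failure >= 0 ? 0 : 1;
}
```

A main() would build the compile command (for example gcc -O -c on the preprocessed file, writing to /dev/null), call first_failing_run(cmd, 100), and return interestingness_status() of the result.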
-
gcc randomly crashes with "bus error, internal compiler error"
Bus error occurs when a virtual memory mapping cannot be backed (by memory, or by inode contents for a MAP_NORESERVE file-backed mapping).
In other words, that is the overcommitted system equivalent of "Out of memory error".
You can verify by temporarily adding e.g. a swap file of a couple of gigabytes, and disabling overcommit.
Also, before compiling anything, running
sudo dash -c 'sync ; echo 3 >/proc/sys/vm/drop_caches ; sync'
will ensure maximum available RAM and swap for following operations (but also causes everything to be re-loaded from storage, so slows things down).
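To make the "mapping that cannot be backed" case concrete, here is a small sketch that provokes a SIGBUS on purpose. It uses the file-backed variant (a shared mapping whose file is truncated away), which is only an easy way to reproduce the signal and not a claim about what cc1 actually does; the function name and the /tmp path are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process: map one page of a temporary file, shrink the file to
 * zero bytes, then touch the page. Returns the signal that killed the child,
 * 0 if the child exited normally, or -1 on an error in the parent. */
int signal_from_unbacked_page(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        char path[] = "/tmp/sigbus-demo-XXXXXX";
        long page = sysconf(_SC_PAGESIZE);
        int fd = mkstemp(path);
        volatile char *p;

        assert(page > 0);
        if (fd < 0)
            _exit(2);
        unlink(path);                /* the file lives on via fd only */
        if (ftruncate(fd, page) != 0)
            _exit(2);
        p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            _exit(2);
        if (ftruncate(fd, 0) != 0)   /* the page now has no backing */
            _exit(2);
        p[0] = 1;                    /* expected to raise SIGBUS */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the return value should equal SIGBUS, with no kernel OOM message involved.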
-
Running out of memory should not be the problem: the stats show more than 2/3 of RAM as available, i.e. used for caches etc. that can be dropped automatically. It's only an estimate, but still...
Randomness is troubling. It could be faulty RAM.
-
Randomness is troubling. It could be faulty RAM.
On a Qemu/MIPS virtual machine? Same behavior.
On the real hardware? The same kernel (6.0.2) with a 2008-rootfs, gcc-v4.1.2 is able to compile *everything* without a single error.
-
"Out of memory error".
Shouldn't the kernel print a warning on the console? I see no warning
(console=/dev/ttyS0, I am monitoring with Minicom)
swapon /dev/sda2
Adding 131068k swap on /dev/s. Priority:-xtents:1 acro131068k

function check_result()
{
    local test="$?"
    local ans
    if [ "$test" == "0" ]
    then
        ans="success"
        echo "$ans"
    else
        ans="failure"
        echo "$ans"
        exit 0
    fi
}

for cycle in {0..99}
do
    echo "cycle$cycle"
    mem_total="`cat /proc/meminfo_total`"
    mem_avail="`cat /proc/meminfo_avail`"
    mem_ready="`cat /proc/meminfo_ready`"
    make clean
    echo "mem $mem_ready of $mem_avail of $mem_total"
    sync
    sync
    sync
    echo 3 > /proc/sys/vm/drop_caches
    sync
    sync
    sync
    make obj/hallo2
    check_result
done

CC=gcc
CFLAGS=-std=c89 -O -Wall -freport-bug
MAKEOPTS=-j1

all: obj/hallo1 obj/hallo2

obj/hallo1: obj/hallo1.o
	@echo -n "linking hallo1 ... "
	@mkdir -p obj
	@$(CC) -o obj/hallo1 obj/hallo1.o
	@echo "done"

obj/hallo2: obj/hallo2.o
	@echo -n "linking hallo2 ... "
	@mkdir -p obj
	@$(CC) -o obj/hallo2 obj/hallo2.o
	@echo "done"

obj/hallo1.o: hallo1.c hallo1.h
	@echo -n "compiling hallo1 ... "
	@mkdir -p obj
	@$(CC) $(CFLAGS) -o obj/hallo1.o -c hallo1.c
	@echo "done"

obj/hallo2.o: hallo2.c hallo2.h
	@echo -n "compiling hallo2 ... "
	@mkdir -p obj
	@$(CC) $(CFLAGS) -o obj/hallo2.o -c hallo2.c
	@echo "done"

# Stop after the preprocessing stage
# do not run the compiler proper.
# The output is in the form of preprocessed source code
# which is sent to the standard output.
obj/hallo1.txt: hallo1.c hallo1.h
	@echo -n "expanding hallo1 ... "
	@mkdir -p obj
	@$(CC) $(CFLAGS) -o obj/hallo1.txt -E hallo1.c
	@echo "done"

obj/hallo2.txt: hallo2.c hallo2.h
	@echo -n "expanding hallo2 ... "
	@mkdir -p obj
	@$(CC) $(CFLAGS) -o obj/hallo2.txt -E hallo2.c
	@echo "done"

clean:
	@echo -n "cleaning ... "
	@rm -f obj/*
	@echo "done"

cycle22
cleaning ... done
mem 17784 of 45964 of 59152
compiling hallo2 ... during RTL pass: reload
hallo2.c: In function ‘main’:
hallo2.c:11:1: internal compiler error: Bus error
11 | }
| ^
0x15fbeab internal_error(char const*, ...)
???:0
0x74ddaf df_update_entry_block_defs()
???:0
0x74e07b df_update_entry_exit_and_calls()
???:0
0x74e547 df_process_deferred_rescans()
???:0
0x736d87 df_finish_pass(bool)
???:0
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
make: *** [Makefile:22: obj/hallo1.o] Error 1
success
:-//
the swap on sda2 is 128Mbyte
now even hallo1 randomly fails ...
-
"Out of memory error".
Shouldn't the kernel print a warning on the console?
No, it's not a kernel OOM (out of memory) situation, just a failure to immediately provide backing for a specific page in a specific process. That's why I put "out of memory error" in quotes.
The kernel doesn't actually kill the process, it sends (the offending thread) a SIGBUS (https://man7.org/linux/man-pages/man2/sigaction.2.html) signal, with siginfo_t and context populated and si_addr member containing the address that faulted. The default disposition is to terminate the process and dump core, but if a userspace signal handler can actually rectify the situation, all it needs to do is return, and the kernel will re-execute the offending machine instruction. It is also possible to emulate the offending instruction, and modify the context to continue at the next instruction, but you need an async-signal safe (https://man7.org/linux/man-pages/man7/signal-safety.7.html) instruction decoder to do that.
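The "rectify and return" path can be shown with a contrived but self-contained sketch: a page loses its backing because the file behind it is truncated, the SA_SIGINFO handler records si_addr and grows the file again, and the faulting store is then retried. All names here are invented for the example, and the repair (ftruncate(), which is on the async-signal-safe list) only makes sense for this artificial setup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t sigbus_count;   /* faults seen by the handler */
static void *volatile sigbus_addr;           /* last faulting address */
static int backing_fd = -1;                  /* file behind the mapping */
static long page_size;

/* Record si_addr, then restore the missing backing by growing the file.
 * Returning lets the kernel re-execute the faulting instruction. */
static void on_sigbus(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    sigbus_addr = info->si_addr;
    sigbus_count++;
    if (sigbus_count > 1 || ftruncate(backing_fd, page_size) != 0)
        _exit(3);                            /* could not repair: give up */
}

void *last_sigbus_address(void)
{
    return sigbus_addr;
}

/* Returns the number of SIGBUS deliveries (1 is expected), -1 on a setup
 * error. Error paths leak the descriptor; this is only a demonstration. */
int recover_from_sigbus(void)
{
    char path[] = "/tmp/sigbus-fix-XXXXXX";
    struct sigaction sa, old;
    volatile char *p;

    page_size = sysconf(_SC_PAGESIZE);
    backing_fd = mkstemp(path);
    if (backing_fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(backing_fd, page_size) != 0)
        return -1;
    p = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
             MAP_SHARED, backing_fd, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    sigbus_count = 0;
    if (ftruncate(backing_fd, 0) != 0)       /* remove the backing */
        return -1;
    p[0] = 42;          /* faults once; the handler repairs; store retried */
    assert(p[0] == 42);

    sigaction(SIGBUS, &old, NULL);
    munmap((void *)p, (size_t)page_size);
    close(backing_fd);
    return (int)sigbus_count;
}
```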
On MIPS, this happens in arch/mips/mm/fault.c:__do_page_fault() (https://elixir.bootlin.com/linux/latest/source/arch/mips/mm/fault.c#L279), in the do_sigbus: branch (and not in the out_of_memory: branch). In most kernels there is a #if 0 printk() just before the force_sig_fault(SIGBUS,...) that you can uncomment to get more information about SIGBUS events; I recommend you do that if possible.
Even running gcc under gdb might provide useful info. Personally, I am aware that newer GCC versions are quite horrible memory hogs, because typical current computers just tend to have insane amounts of RAM, compared to say fifteen years ago; so I'm not at all surprised by this.
-
(15 hours later ...)
# gcc-config -l 1
[1] mipsel-unknown-linux-gnu-4.1.2
[2] mipsel-unknown-linux-gnu-12 *
Since it's a modern rootfs, to check glibc etc., I cross-compiled gcc-v4.1.2 and selected it as the default C and C++ compiler. It seems to work correctly, and it's faster.
I wonder ... why doesn't the kernel just use the swap? :o :o :o
-
I wonder ... why doesn't the kernel just use the swap? :o :o :o
It is a transitory problem, so by the time gcc is killed, the kernel memory statistics no longer reflect the exact situation when the problem happened.
I do believe it does use swap, all of it. The reason it is not deterministic is that each command invocation creates a set of processes (four or five or something like that), and their order and time-wise overlap varies.
The other possibility is that there are still some kernel-userspace-API/ABI issues left that cause GCC to occasionally keel over, but I don't think so at all. Newer versions of GCC are horrible memory hogs.
To be sure, I would run
strace -o log -ff gcc $CFLAGS -o obj/hallo2.o -c hallo2.c
and examine the trace, log, and in particular compare it to a trace obtained on x86-64; and then repeat that until it triggers the SIGBUS, to find out exactly where it happens. This will generate a set of log.PID files. Running
strace -o log -e trace=%memory -ff gcc $CFLAGS -o obj/hallo2.o -c hallo2.c
limits the logs to memory allocations, which will tell you how much memory gcc and its subprocesses actually allocate. (Before running either command, do remember to run rm -f log.* to remove old log files. Since the process IDs will change, new files will be created each time.)
It would be very simple to implement a tiny library (loaded using LD_PRELOAD) interposing sigaction() and chaining SIGBUS so that a custom handler would be called before the set handler (GCC does set a SIGBUS handler). Whenever delivered, the handler would dump the address and /proc/self/maps and/or /proc/self/smaps to standard error or a dedicated file, describing the exact state of the process memory mappings at the point of error, remove itself from the chain, then call the set handler. The only tricky bit is that it has to be async-signal safe, too, so <unistd.h> I/O and a small (4k) static/global array for I/O.
Basically something akin to catchsegv, except for SIGBUS. Let me know if you'd like one.
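As a rough idea of what the reporting part of such a handler could look like, here is a sketch that sticks to open/read/write/close and a static 4k buffer. It does not include the sigaction() interposition or the handler chaining, the function names are invented, and a real handler should also save and restore errno.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

static char dump_buf[4096];       /* static buffer: no malloc in a handler */

/* Setup code, run before any signal arrives: open the dedicated log file.
 * (assert() is fine here, but must not be used inside the handler.) */
int open_sigbus_log(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
}

/* Write all of buf to fd, retrying short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Format addr as fixed-width "0x..." hex into out, which must hold
 * 2 + 2 * sizeof(uintptr_t) bytes. No NUL is added. Returns the length. */
size_t format_address(uintptr_t addr, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t n = 0;
    int shift;

    out[n++] = '0';
    out[n++] = 'x';
    for (shift = (int)(sizeof addr * 8) - 4; shift >= 0; shift -= 4)
        out[n++] = digits[(addr >> shift) & 0xf];
    return n;
}

/* Copy /proc/self/maps to out_fd. Returns bytes copied, or -1 on error. */
long dump_self_maps(int out_fd)
{
    long total = 0;
    ssize_t n;
    int fd = open("/proc/self/maps", O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, dump_buf, sizeof dump_buf)) > 0) {
        if (write_all(out_fd, dump_buf, (size_t)n) != 0) {
            close(fd);
            return -1;
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```

The handler itself would format si_addr with format_address(), write it to the log descriptor, then call dump_self_maps() on the same descriptor before handing over to the handler GCC installed.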
-
Not enough memory available is not surprising.
GCC 12 requires a gigantic amount of RAM while compiling and that would be one major difference compared to older versions.
It can take several GB of memory to compile a moderately long C++ source file.
For a small C program, I wouldn't expect something that bad, but certainly if you have less than say 512MB of RAM in total, I wouldn't consider using GCC 12 on it.
Your best bet instead of trying to work around it is to cross-compile, IMHO.
-
Your best bet instead of trying to work around it is to cross-compile, IMHO.
Catalyst doesn't cross compile, it compiles natively.
My builder can cross-compile; in fact the stage1 cross-compiled successfully and failed as soon as it was moved to Qemu/MIPS, where it is supposed to be recompiled natively. All tests also failed on the final real hardware, which has only 64 Mbytes of physical RAM.
So, gcc-v12 is a bloody *big* problem :-//
-
I'm sure you can find a way to cross-compile successfully. I don't see why not.
And unfortunately yes forget about recent GCC versions and 64MB of RAM.
I wouldn't count on LLVM either.
But otherwise, can you not just use Qemu to compile, and allocate a shitload of RAM to the virtual machine?
-
I'm afraid it will also be a damn problem for my armv5tel-softfloat-linux-gnueabi GNU/Linux PDA, which has only 64 Mbytes of RAM. It is still untested: last night, after 96 hours, everything was successfully built by my builder, but I need to have a gcc on board the PDA, and it can't be gcc-v12 because I am really afraid it has the same problem.
damn :o :o :o
-
But otherwise, can you not just use Qemu to compile, and allocate a shitload of RAM to the virtual machine?
Yes, it can be done, but I also need an on-board C compiler.
edit:
scheduled gcc-v4.1.2, cooking new hybrid stages
p.s.
checking for armv5tel-softfloat-linux-gnueabi-gcc... armv5tel-softfloat-linux-gnueabi-gcc
checking for C compiler default output... configure: error: C compiler cannot create executables
confirmed: there is the same problem even on the armv5tel-softfloat-linux-gnueabi GNU/Linux PDA :palm:
I'm afraid you are right, I have to forget gcc-v12
-
so, the hybrid modern rootfs with old gcc-v4.1.2. seems to somehow mostly work but ...
/usr/lib/gcc/mipsel-unknown-linux-gnu/4.1.2/../../../../mipsel-unknown-linux-gnu/bin/ld: dwarf_sig8_hash.c:(.text+0xadc): undefined reference to `__sync_fetch_and_sub_4'
/usr/lib/gcc/mipsel-unknown-linux-gnu/4.1.2/../../../../mipsel-unknown-linux-gnu/bin/ld: libdw_pic.a(libdw_alloc.os): in function `__libdw_alloc_tail':
libdw_alloc.c:(.text+0x40c): undefined reference to `__sync_fetch_and_add_4'
collect2: ld returned 1 exit status
forget any elfutils support in strace: no dice :o :o :o
-
libdw_alloc.c:(.text+0x40c): undefined reference to `__sync_fetch_and_add_4'
These should be provided by libgcc (see libgcc/sync.c (https://github.com/gcc-mirror/gcc/blob/master/libgcc/sync.c) in the GCC source tree).
(GCC supports two different atomic built-ins. One (4.1+) is Intel-style __sync (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html) builtins, the other (4.8+) is __atomic (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) built-ins. On architectures where these do not directly map to machine instructions, like on MIPS, function calls (using various locking techniques) are used instead. MIPS is an LL/SC architecture, so all these functions can easily be implemented using a trivial loop around LL/SC.)
If your desired GCC version does not implement these for MIPS for any reason, they're very easy to backport; even easier to implement as a separate library (say, libgccsync) you can link against to fix such problems. For example:
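        .set    noreorder        # delay slots are filled by hand below
        .set    noat             # $at is used explicitly as a scratch register
        .text
        .globl  __sync_fetch_and_add_4
__sync_fetch_and_add_4:
        sync
1:      ll      $v0, 0($a0)      # $v0 = *ptr (load linked)
        addu    $at, $v0, $a1    # $at = old value + val
        sc      $at, 0($a0)      # try to store; $at = 1 on success, 0 on failure
        beqz    $at, 1b          # reservation lost: retry
        nop                      # branch delay slot
        sync
        jr      $ra              # old value is returned in $v0
        nop                      # jump delay slot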
However, GCC 4.1.2 on MIPS should (IIRC!) have all __sync_ builtins, up to 32 bits, definitely including __sync_fetch_and_add_4(). A trivial C program to check (mipsel-linux-gnu-gcc -O2 test.c -o test) is
int main(void) {
volatile int a = 1;
int b = __sync_fetch_and_add(&a, 2);
return a * b;
}
You do need 4.8 with libatomic (-latomic) for 64-bit atomics (__atomic_..._8()), though.
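To illustrate how the rest of the family can be layered on one atomic primitive, here is a sketch in C that builds fetch-and-add and fetch-and-sub from a compare-and-swap retry loop. The names cas_4 and demo_fetch_and_*_4 are hypothetical, and cas_4 simply uses the compiler builtin as a stand-in: on a toolchain that lacks it, that one function is what would have to be written with LL/SC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the single primitive that must really be atomic.
 * Returns the value *ptr held before the operation. */
static int32_t cas_4(volatile int32_t *ptr, int32_t expected, int32_t desired)
{
    return __sync_val_compare_and_swap(ptr, expected, desired);
}

/* Retry until nobody changed *ptr between the read and the swap.
 * Arithmetic is done unsigned so wrap-around is well defined. */
int32_t demo_fetch_and_add_4(volatile int32_t *ptr, int32_t val)
{
    int32_t old;

    assert(ptr != NULL);
    do {
        old = *ptr;
    } while (cas_4(ptr, old, (int32_t)((uint32_t)old + (uint32_t)val)) != old);
    return old;
}

int32_t demo_fetch_and_sub_4(volatile int32_t *ptr, int32_t val)
{
    int32_t old;

    assert(ptr != NULL);
    do {
        old = *ptr;
    } while (cas_4(ptr, old, (int32_t)((uint32_t)old - (uint32_t)val)) != old);
    return old;
}
```

With a = 1, demo_fetch_and_add_4(&a, 2) returns 1 and leaves a at 3, matching what the small test program above expects from the builtin.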
-
# mipsel-unknown-linux-gnu-gcc-4.1.2 test_gcc_mips_sync_fetch_and_add_4.c -o test_gcc_mips_sync_fetch_and_add_4.c
/usr/lib/gcc/mipsel-unknown-linux-gnu/4.1.2/../../../../mipsel-unknown-linux-gnu/bin/ld: /tmp/ccEfV3mL.o: in function `main':
test_gcc_mips_sync_fetch_and_add_4.c:(.text+0x34): undefined reference to `__sync_fetch_and_add_4'
collect2: ld returned 1 exit status
(your c testing code)
Also confirmed by Catalyst. That makes three confirmations: there are no official patches, "__sync_fetch_and_add_4" is not supported and needs to be added.
Since I need dev-util/strace, I relaxed its use_flags and it compiled successfully without dev-libs/elfutils
2023-03-22--08-44-20---2023-03-22--10-43-17 - [ dev-libs/elfutils ] - failure - root@dev2.40.0/4.1.2
/usr/lib/gcc/mipsel-unknown-linux-gnu/4.1.2/../../../../mipsel-unknown-linux-gnu/bin/ld: dwarf_sig8_hash.c:(.text+0xadc): undefined reference to `__sync_fetch_and_sub_4'
/usr/lib/gcc/mipsel-unknown-linux-gnu/4.1.2/../../../../mipsel-unknown-linux-gnu/bin/ld: libdw_pic.a(libdw_alloc.os): in function `__libdw_alloc_tail':
libdw_alloc.c:(.text+0x40c): undefined reference to `__sync_fetch_and_add_4'
collect2: ld returned 1 exit status
2023-03-22--19-45-37---2023-03-22--21-30-49 - [ dev-util/strace ] - success - root@dev2.40.0/4.1.2
***use_flags, dev-util/strace: (-aio -elfutils -perl -selinux -static -unwind)
better than nothing, at least now we have
- sys-devel/gcc v4.1.2: working, able to compile (works on { Qemu/MIPS, real hardware })
- dev-util/strace v6.1: minimal setup
- sys-libs/glibc v2.36-r7
- dev-lang/python v3.11.2_p1
changed the gcc system profile
-COMMON_FLAGS="-O2 -march=mips2 -mabi=32 -mplt -pipe"
+COMMON_FLAGS="-O2 -march=mips2 -mabi=32 -pipe"
-mplt is NOT supported by gcc v4.1.2
Now I can go on, and try to investigate gcc v12.
We need to support gcc v4.1.2, I have the feeling there are other packages like dev-libs/elfutils :-// :-// :-//
-
/usr/lib/gcc/mipsel-unknown-linux-gnu/12/libgcc_s.so.1
/usr/lib/gcc/mipsel-unknown-linux-gnu/12/libgcc.a
sync_fetch_and_add is there :o :o :o
-
#ldd /usr/bin/wget
linux-vdso.so.1 (0x7ffa7000)
libpcre2-8.so.0 => /lib/libpcre2-8.so.0 (0x77c70000)
libssl.so.3 => /usr/lib/libssl.so.3 (0x77bc0000)
libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x77840000)
libz.so.1 => /lib/libz.so.1 (0x77800000)
libc.so.6 => /lib/libc.so.6 (0x77620000)
libatomic.so.1 => /usr/lib/gcc/mipsel-unknown-linux-gnu/12/libatomic.so.1 (0x775f0000)
/lib/ld.so.1 (0x77dc2000)
and some applications like net-misc/wget v1.21.3-r1 have dependencies with gcc-v12
libatomic.so.1 => /usr/lib/gcc/mipsel-unknown-linux-gnu/12/libatomic.so.1
-
If you use a mipsel-linux-gnu-gcc to cross-compile the following to an object file (or library), using a much later GCC version, you get all needed functions:
char sync_fetch_and_add_1 (char *ptr, char val) { return __sync_fetch_and_add (ptr, val); }
char sync_fetch_and_sub_1 (char *ptr, char val) { return __sync_fetch_and_sub (ptr, val); }
char sync_fetch_and_and_1 (char *ptr, char val) { return __sync_fetch_and_and (ptr, val); }
char sync_fetch_and_xor_1 (char *ptr, char val) { return __sync_fetch_and_xor (ptr, val); }
char sync_fetch_and_or_1 (char *ptr, char val) { return __sync_fetch_and_or (ptr, val); }
char sync_fetch_and_nand_1 (char *ptr, char val) { return __sync_fetch_and_nand (ptr, val); }
char sync_add_and_fetch_1 (char *ptr, char val) { return __sync_add_and_fetch (ptr, val); }
char sync_sub_and_fetch_1 (char *ptr, char val) { return __sync_sub_and_fetch (ptr, val); }
char sync_and_and_fetch_1 (char *ptr, char val) { return __sync_and_and_fetch (ptr, val); }
char sync_xor_and_fetch_1 (char *ptr, char val) { return __sync_xor_and_fetch (ptr, val); }
char sync_or_and_fetch_1 (char *ptr, char val) { return __sync_or_and_fetch (ptr, val); }
char sync_nand_and_fetch_1 (char *ptr, char val) { return __sync_nand_and_fetch (ptr, val); }
short sync_fetch_and_add_2 (short *ptr, short val) { return __sync_fetch_and_add (ptr, val); }
short sync_fetch_and_sub_2 (short *ptr, short val) { return __sync_fetch_and_sub (ptr, val); }
short sync_fetch_and_and_2 (short *ptr, short val) { return __sync_fetch_and_and (ptr, val); }
short sync_fetch_and_xor_2 (short *ptr, short val) { return __sync_fetch_and_xor (ptr, val); }
short sync_fetch_and_or_2 (short *ptr, short val) { return __sync_fetch_and_or (ptr, val); }
short sync_fetch_and_nand_2 (short *ptr, short val) { return __sync_fetch_and_nand (ptr, val); }
short sync_add_and_fetch_2 (short *ptr, short val) { return __sync_add_and_fetch (ptr, val); }
short sync_sub_and_fetch_2 (short *ptr, short val) { return __sync_sub_and_fetch (ptr, val); }
short sync_and_and_fetch_2 (short *ptr, short val) { return __sync_and_and_fetch (ptr, val); }
short sync_xor_and_fetch_2 (short *ptr, short val) { return __sync_xor_and_fetch (ptr, val); }
short sync_or_and_fetch_2 (short *ptr, short val) { return __sync_or_and_fetch (ptr, val); }
short sync_nand_and_fetch_2 (short *ptr, short val) { return __sync_nand_and_fetch (ptr, val); }
int sync_fetch_and_add_4 (int *ptr, int val) { return __sync_fetch_and_add (ptr, val); }
int sync_fetch_and_sub_4 (int *ptr, int val) { return __sync_fetch_and_sub (ptr, val); }
int sync_fetch_and_and_4 (int *ptr, int val) { return __sync_fetch_and_and (ptr, val); }
int sync_fetch_and_xor_4 (int *ptr, int val) { return __sync_fetch_and_xor (ptr, val); }
int sync_fetch_and_or_4 (int *ptr, int val) { return __sync_fetch_and_or (ptr, val); }
int sync_fetch_and_nand_4 (int *ptr, int val) { return __sync_fetch_and_nand (ptr, val); }
int sync_add_and_fetch_4 (int *ptr, int val) { return __sync_add_and_fetch (ptr, val); }
int sync_sub_and_fetch_4 (int *ptr, int val) { return __sync_sub_and_fetch (ptr, val); }
int sync_and_and_fetch_4 (int *ptr, int val) { return __sync_and_and_fetch (ptr, val); }
int sync_xor_and_fetch_4 (int *ptr, int val) { return __sync_xor_and_fetch (ptr, val); }
int sync_or_and_fetch_4 (int *ptr, int val) { return __sync_or_and_fetch (ptr, val); }
int sync_nand_and_fetch_4 (int *ptr, int val) { return __sync_nand_and_fetch (ptr, val); }
int sync_bool_compare_and_swap_1 (char *ptr, char oldval, char newval) { return __sync_bool_compare_and_swap (ptr, oldval, newval); }
int sync_bool_compare_and_swap_2 (short *ptr, short oldval, short newval) { return __sync_bool_compare_and_swap (ptr, oldval, newval); }
int sync_bool_compare_and_swap_4 (int *ptr, int oldval, int newval) { return __sync_bool_compare_and_swap (ptr, oldval, newval); }
char sync_val_compare_and_swap_1 (char *ptr, char oldval, char newval) { return __sync_val_compare_and_swap (ptr, oldval, newval); }
short sync_val_compare_and_swap_2 (short *ptr, short oldval, short newval) { return __sync_val_compare_and_swap (ptr, oldval, newval); }
int sync_val_compare_and_swap_4 (int *ptr, int oldval, int newval) { return __sync_val_compare_and_swap (ptr, oldval, newval); }
char sync_lock_test_and_set_1 (char *ptr, char value) { return __sync_lock_test_and_set (ptr, value); }
short sync_lock_test_and_set_2 (short *ptr, short value) { return __sync_lock_test_and_set (ptr, value); }
int sync_lock_test_and_set_4 (int *ptr, int value) { return __sync_lock_test_and_set (ptr, value); }
void sync_lock_release_1 (char *ptr) { __sync_lock_release (ptr); }
void sync_lock_release_2 (short *ptr) { __sync_lock_release (ptr); }
void sync_lock_release_4 (int *ptr) { __sync_lock_release (ptr); }
void sync_synchronize (void) { __sync_synchronize(); }
that you can then rename with __ prefixes.
Using mipsel-linux-gnu-gcc-9.4.0 -Wall -Os -S, you get the attached libgccsync.s, which you can include in any project or compile to an object file using simply gcc -c libgccsync.s, and it will provide the missing functions. This works, because later GCC versions (on MIPS) do support the built-ins, emitting the necessary code inline.
Do note that OpenWRT has used GCC-7.4.0 (https://openwrt.org/packages/pkgdata_owrt19_7/gcc), GCC-8.4.0 (https://openwrt.org/packages/pkgdata_owrt21_2/gcc), and most recently GCC-11.2.0 (https://openwrt.org/packages/pkgdata/gcc) as the (native) compiler on routers, including for mipsel_mips32 for Mikrotik RB532A (https://openwrt.org/docs/techref/instructionset/mipsel_mips32) and for mipsel_24kc for Mikrotik RBM33G (https://openwrt.org/docs/techref/instructionset/mipsel_24kc). I am not at all convinced that GCC-4.1.2 is any kind of a sweet spot for MIPS architectures.
-
Do note that OpenWRT has used GCC-7.4.0, GCC-8.4.0 [..]
I am not at all convinced that GCC-4.1.2 is any kind of a sweet spot for MIPS architectures.
I am using v4.1.2 simply for these reasons:
- it has been tested hard (1) (originally written "hardly tested") in the 2007, 2008, 2009 stages
- and I also have an "hardened" profile
- it's a lot faster than gcc v12 (when gcc v12 doesn't crash, I can compare time-exec values)
- it consumes less ram
- it consumes only 200Mbyte of space (gcc v12 consumes much more space)
- I can re-use 2007, 2008, 2009 stages
- and it's what I practically have ready on hands
I can put gcc v7.4.0 and v8.4.0 on both Catalyst and my-builder; this requires more work, but I will at some point. I just was not prepared for the catastrophic internal compiler error of gcc v12.
edit:
(1) typos
-
and some applications like net-misc/wget v1.21.3-r1 have dependencies with gcc-v12
libatomic.so.1 => /usr/lib/gcc/mipsel-unknown-linux-gnu/12/libatomic.so.1
Yep, they use the __atomic_ builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) introduced in GCC-4.8 instead of the Intel __sync_ (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html) ones. If it does not depend on other libraries, you should be able to use the latest version even with older compiler versions. (Even on 32-bit MIPS, libatomic should support 64-bit atomic ops, but it uses internal locking primitives for these I think.)
hardly tested
hardly tested != tested hard
hardly tested == tested only little
;)
- I can re-use 2007, 2008, 2009 stages
- and it's what I practically have ready on hands
Okay, that makes ample sense to me too.
I can put gcc v7.4.0 and v8.4.0 on both Catalyst and my-builder
I would only bother with GCC 8.4.0 or 11.3.0 at this point. Also check the patches OpenWRT applies to GCC (https://github.com/openwrt/packages/tree/master/devel/gcc/patches).
Again, it is rather simple to implement utilities to check memory usage, say as an interposing library (that calls e.g. getrusage() (https://man7.org/linux/man-pages/man2/getrusage.2.html) for maxrss and checks /proc/self/maps for mappings after each mmap() call) you use via LD_PRELOAD; the issue with GCC is its peak memory use, so normal process statistics utilities like top are unlikely to capture that accurately.
All I'd need is to verify there are no surprises in the /proc/self/maps format on MIPS, i.e. a copy of cat /proc/self/maps output on MIPS, really.
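Until such an interposing library exists, a cruder approach gives a first number: run the compiler as a child and read ru_maxrss from getrusage(RUSAGE_CHILDREN) afterwards. This is a different and simpler technique than the LD_PRELOAD one described above, and the function name is made up. On Linux ru_maxrss is in kilobytes and reflects the largest waited-for descendant, which for gcc should normally be cc1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[] to completion. Stores its exit status in *exit_status
 * (-1 if it was killed by a signal) and returns the peak resident set
 * size of the largest child so far, or -1 on error. Because the figure
 * covers every child waited for so far, use one fresh wrapper process
 * per measurement. */
long peak_rss_of_command(char *const argv[], int *exit_status)
{
    struct rusage usage;
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (exit_status != NULL)
        *exit_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    if (getrusage(RUSAGE_CHILDREN, &usage) != 0)
        return -1;
    return usage.ru_maxrss;
}
```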
-
- I can re-use 2007, 2008, 2009 stages
- and it's what I practically have ready on hands
Okay, that makes ample sense to me too.
Technically what I am doing is part of the Sonoko project; this has been going on for 5 years and the bloody Murphy's law has been pretty harsh: *everything* that could go wrong has gone wrong :o :o :o
So, an old and obsolete Stage4 that actually works is better than a modern one that, well, is modern but has problems from the C compiler up.
would only bother with GCC 8.4.0
gcc-v8.5.0 looks promising, next week I will try to cross-compile it on a dedicated mac-mini.
it is rather simple to implement utilities to check memory usage
Sure, looks suuuuuuuper ;D
# cat /proc/self/maps
55560000-55569000 r-xp 00000000 08:03 9531391 /bin/cat
5557f000-55580000 r--p 0000f000 08:03 9531391 /bin/cat
55580000-55581000 rw-p 00010000 08:03 9531391 /bin/cat
55581000-555a2000 rw-p 00000000 00:00 0 [heap]
77b6c000-77b8e000 rw-p 00000000 00:00 0
77b8e000-77c00000 r--p 00000000 08:03 12894875 /usr/lib/locale/locale-archive
77c00000-77db5000 r-xp 00000000 08:03 815666 /lib/libc.so.6
77db5000-77dcd000 ---p 001b5000 08:03 815666 /lib/libc.so.6
77dcd000-77dd0000 r--p 001bd000 08:03 815666 /lib/libc.so.6
77dd0000-77dd3000 rw-p 001c0000 08:03 815666 /lib/libc.so.6
77dd3000-77dd8000 rw-p 00000000 00:00 0
77ddb000-77e07000 r-xp 00000000 08:03 1261748 /lib/ld.so.1
77e18000-77e1a000 rw-p 00000000 00:00 0
77e1a000-77e1b000 r--p 0002f000 08:03 1261748 /lib/ld.so.1
77e1b000-77e1c000 rw-p 00030000 08:03 1261748 /lib/ld.so.1
7fa73000-7fa94000 rwxp 00000000 00:00 0 [stack]
7fefd000-7fefe000 r-xp 00000000 00:00 0
7ff46000-7ff47000 r--p 00000000 00:00 0 [vvar]
7ff47000-7ff48000 r-xp 00000000 00:00 0 [vdso]
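For reference, the listing above follows the usual start-end, permissions, offset, device, inode, path layout, so a parser along these lines should cope with it. This is only a sketch with invented names; it reads the leading fields and ignores device, inode and path, and lines longer than the buffer are not handled specially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct map_entry {
    unsigned long start, end, offset;
    char perms[5];                  /* e.g. "rw-p" plus terminating NUL */
};

/* Parse the leading fields of one /proc/<pid>/maps line.
 * Returns 1 on success, 0 if the line does not look like a mapping. */
int parse_maps_line(const char *line, struct map_entry *e)
{
    assert(line != NULL && e != NULL);
    return sscanf(line, "%lx-%lx %4s %lx",
                  &e->start, &e->end, e->perms, &e->offset) == 4;
}

/* Sum the sizes of all writable private mappings in a maps stream. */
unsigned long writable_private_bytes(FILE *maps)
{
    char line[512];
    struct map_entry e;
    unsigned long total = 0;

    assert(maps != NULL);
    while (fgets(line, sizeof line, maps) != NULL) {
        if (parse_maps_line(line, &e) &&
            e.perms[1] == 'w' && e.perms[3] == 'p')
            total += e.end - e.start;
    }
    return total;
}
```

Fed the first line of the listing, parse_maps_line() yields start 0x55560000, end 0x55569000 and permissions r-xp, i.e. a 0x9000-byte mapping.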
-
2023-03-26--20-52-13---2023-03-26--21-45-47 - [ net-dialup/minicom ] - failure - root@dev2.40.0/4.1.2
cc1: error: unrecognized command line option "-Wno-format-truncation"
minicom-2.8-r1 forked into overlay net-dialup/minicom, minicom-2.8-r11
patch "minicom-2.8-gcc-4.1.2-fix.patch" created and applied
2023-03-26--22-24-44---2023-03-26--23-05-11 - [ =net-dialup/minicom-2.8-r11 ] - success - root@dev2.40.0/4.1.2
thus, my-builder uses modern gcc-v12 for cross-compiling (source=i686, target=mips2el), while Catalyst (on Qemu/mips2el) uses gcc-v4.1.2
Gcc-v4.1.2 is doing a great job of proving that 90% of system-2023 and 70% of world-2023 (both minimal & server-profile) can be compiled with v4.1.2 (C only) just by fixing ebuilds from things like
- error: 'for' loop initial declaration used outside C99 mode
- error: cc1, unrecognized command line option "-Wno-format-truncation"
- ...
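The first kind of fix is usually mechanical. A made-up example (not taken from any of the patched packages): gcc v4.1.2 defaults to gnu89, so the loop counter has to be declared at the top of the block, or the ebuild has to pass -std=gnu99.

```c
#include <assert.h>
#include <stddef.h>

/* Rejected by gcc v4.1.2 in its default mode:
 *     for (size_t i = 0; i < n; i++) ...
 * C89-compatible form: declare the counter before the first statement. */
long sum_ints(const int *values, size_t n)
{
    size_t i;
    long total = 0;

    assert(values != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```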
and ...
.... and
>[1] mipsel-unknown-linux-gnu-4.1.2 (tested hard, stable)
(2) mipsel-unknown-linux-gnu-4.8.3 (new)
(3) mipsel-unknown-linux-gnu-12 (broken, unstable)
we have a new native compiler: say hAllo to gcc-v4.8.3 ;D ;D ;D
(yesterday afternoon gcc-v4.1.2 compiled v4.8.3 when I was out for a jog but reports say "with all tests passed", and now v4.8.3 is compiling itself)
-
excellent news!
2023-03-28--07-52-23---2023-03-28--10-46-59 - [ dev-libs/elfutils ] - success - root@dev2.40.0/4.8.3
QA_tests[ dev-libs/elfutils ] - success
gcc-v4.8.3 actually works! and can be used to compile gcc-v8.*!
meanwhile, dev-util/strace will be recompiled with elfutils support
now we have tools to investigate the weird gcc-v12 behavior! ;D