Author Topic: Do you know Tiny C Compiler? (Read 2483 times)

Picuino · « **on:** November 03, 2021, 05:27:41 pm »

It is a very tiny C compiler (about 100 kBytes) that can run C source like scripts (without generating an executable, although you can build one if you want).
It is nearly fully C99 compliant.
It compiles 9 times faster than GCC, but the generated executables run slower.

It is Open Source (distributed under the GNU Lesser General Public License) and comes from an old project for the IOCCC Contest.

I usually carry a copy on my pendrives, just in case.

https://bellard.org/tcc/

ataradov · « **Reply #1 on:** November 03, 2021, 08:36:08 pm »

Its compatibility with the standards is lacking quite a bit. It is probably useful in some cases, but I'm not sure it is that useful for day to day stuff.

SiliconWizard · « **Reply #2 on:** November 03, 2021, 08:40:53 pm »

Quote from: ataradov on November 03, 2021, 08:36:08 pm

Its compatibility with the standards is lacking quite a bit. It is probably useful in some cases, but I'm not sure it is that useful for day to day stuff.

Can you be more detailed?
It supports C99 almost fully, and a significant chunk of C11. It's able to compile everything I've tried it with. It's actively maintained too.
Where it's lacking is obviously optimizations, it does almost none. And this is why it's so fast to compile and so lightweight too. So it produces rather inefficient code, so that would be one reason not to use it if performance/code size matter. But as far as standard-compliance, it's become pretty good as far as I've seen. Do not hesitate to show specific issues though.

That said, I'm not sure where the OP got the "100 KB" size.
For Windows x64, x86_64-tcc.exe is currently about 320 KB. If you add to that the required include files and libraries to get a working compiler, that amounts to about 2 MB. Pretty small, but nowhere near 100 KB.

ataradov · « **Reply #3 on:** November 03, 2021, 08:59:23 pm »

Ok, it is far more compliant than I thought. I had some problems with it some time ago, but I just tested it with some tricky code, and it did compile it fine. I'll see if I can build a bigger project using it to see if it will not like something.

And yes, fresh build on Linux is 3.3 MB:

Quote

20K   ./share/man/man1
24K   ./share/man
108K   ./share/doc
136K   ./share
1.1M   ./bin
8.0K   ./include
28K   ./lib/tcc/include
56K   ./lib/tcc
2.2M   ./lib
3.3M   .

SiliconWizard · « **Reply #4 on:** November 03, 2021, 09:12:36 pm »

Well, if you had tried it a few years ago... it has evolved quite a bit over the past few years.

As an example I gave a while ago, I compiled the C source code (amalgamation in a single file) of SQLite with it. GCC takes a "long" time (the amalgamation is a pretty long source file). TCC is nearly instant, and gives a correct output. Impressive. Sure it does almost no optimization, so using it may be restricted to some niche applications, but it's very small and fast, and seems easy to retarget (much easier than GCC or LLVM, certainly!), so that could be a start for a compiler for a custom target, or something, if you don't feel like diving in GCC or LLVM.

TCC also comes with a library with a simple API, so you can easily embed a C compiler in any application.

As for distribution size, you can cut it down to ~2 MB if you remove the docs, man and compilers you don't need - I think by default it will build TCC for all (or at least a number of) supported targets, an executable for each - to be checked. But anyway, yeah, that's pretty small.

ataradov · « **Reply #5 on:** November 03, 2021, 09:28:05 pm »

Docs are pretty small (they actually did not build completely, since I don't have makeinfo installed). But stripping the binaries reduced the size quite a bit:

Quote

260K   ./bin
8.0K   ./include
28K   ./lib/tcc/include
56K   ./lib/tcc
296K   ./lib
568K   .

TheCalligrapher · « **Reply #6 on:** November 03, 2021, 09:34:39 pm »

While their support of more recent C features seems to be progressing, the diagnostics are still behind

Code: [Select]

int main(void)
{
  int n = 10;
  goto skip;
  typedef char A[n];
skip:;
}

The above is a constraint violation, i.e. it is an error. Yet, TinyC compiler remains silent.

SiliconWizard · « **Reply #7 on:** November 04, 2021, 05:23:37 pm »

Well, yep. Of course, TCC is a simple C compiler and has very limited diagnostics overall.
Interestingly, the above is easily spotted as an error in both GCC and Clang, but Cppcheck doesn't see a problem.

One may think that since in the above code, A is not used - it's just a typedef and no variable of type A is declared and used after that - the compiler may just ignore it. I'd be curious to understand what kind of problem it could cause, actually, as it is, even though it's incorrect from a language standpoint.

TheCalligrapher · « **Reply #8 on:** November 05, 2021, 02:50:24 am »

Quote from: SiliconWizard on November 04, 2021, 05:23:37 pm

One may think that since in the above code, A is not used - it's just a typedef and no variable of type A is declared and used after that - the compiler may just ignore it. I'd be curious to understand what kind of problem it could cause, actually, as it is, even though it's incorrect from a language standpoint.

I have already provided a link to my SO post that elaborates on the matter in an adjacent topic:
https://stackoverflow.com/questions/22530363/whats-the-point-of-vla-anyway/54163435#54163435

In short, a VLA typedef is an extremely unusual kind of typedef - it is a typedef that generates executable code. Executable code that shall not be "skipped". At some level of abstraction it is quite similar to C++'s

Code: [Select]

#include <string>

int main()
{
  goto skip; 
  std::string s;
skip:; // ERROR: can't skip initialization!
}

The above VLA typedef in C is a typedef with an implicit initialization. You can't skip it for the very same reasons.

And as I stated in that SO post, all features of VLA are actually attached to its type, not to the VLA object itself. Once you've declared a VLA type, you have already opened Pandora's box of VLA's intricacies. You don't really have to "use" that type.

brucehoult · « **Reply #9 on:** November 05, 2021, 03:08:56 am »

Quote from: Picuino on November 03, 2021, 05:27:41 pm

It is a very tiny C compiler (about 100 kBytes)

It's very sad that 100k is considered "very tiny" now.

I used Pascal on computers with less total RAM than that -- and on others where each program got 64k for code + stack + heap.

ataradov · « **Reply #10 on:** November 05, 2021, 03:22:15 am »

Quote from: brucehoult on November 05, 2021, 03:08:56 am

It's very sad that 100k is considered "very tiny" now.

Was that Pascal anywhere close to feature set of C99? Was it portable to many platforms?

Also, worrying about storage space and RAM is the last thing I do on PCs. It is simply not worth my time in any meaningful way.

Picuino · « **Reply #11 on:** November 05, 2021, 04:34:55 pm »

I think the smallest compiler out there is a Forth compiler.
I remember reading that it could fit in 2kbytes of memory.

Edit: https://www.reddit.com/r/Forth/comments/ay627/ask_rforth_whats_the_smallest_implementation_of/

SiliconWizard · « **Reply #12 on:** November 05, 2021, 06:01:59 pm »

Quote from: ataradov on November 05, 2021, 03:22:15 am

Quote from: brucehoult on November 05, 2021, 03:08:56 am
It's very sad that 100k is considered "very tiny" now.
Was that Pascal anywhere close to feature set of C99? Was it portable to many platforms?

Of course, the comparison is silly. Of course a much simpler compiler for a much simpler language on 8-bit targets (or 16-bit if you had the chance of using an 8088/86 version) would be much smaller. So what.
A couple hundred KB these days for a C99 compiler on 64-bit is pretty impressive and certainly not even that far apart from the few tens of KB required for compilers such as Turbo Pascal a few decades ago...

SiliconWizard · « **Reply #13 on:** November 05, 2021, 06:18:02 pm »

Quote from: TheCalligrapher on November 05, 2021, 02:50:24 am

In short, a VLA typedef is an extremely unusual kind of typedef - it is a typedef that generates executable code. Executable code that shall not be "skipped". At some level of abstraction it is quite similar to C++'s
(...)
And as I stated in that SO post, all features of VLA are actually attached to its type, not to the VLA object itself. Once you've declared a VLA type, you have already opened Pandora's box of VLA's intricacies. You don't really have to "use" that type.

I tried to find all those claims in the C99 standard.
All I could really find is that the *size of the array* has to be evaluated when the typedef is encountered. It is this evaluation that could lead to actual code being emitted for it, as far as I understand.
Now until you actually declare a variable of that type, no space should be allocated for it.

Point is - in the code you posted, what a compiler could do is evaluate the size of the array being typedef'ed, but nothing else? And if the type is not actually used, I don't really see how a compiler would not be free to just discard this evaluation as part of optimization. But even if it forces evaluation of the size - even if you don't use the type - this piece of code should be pretty harmless?

Now don't hesitate to point us to the exact parts of the standard that would back up all your claims, with the associated consequences. I admit the standard may require some deeper analysis to make full sense of it. There are probably very obvious things that you master here and that many of us are missing. But I'm sorry, your linked post didn't really help me fully get it in relation to the C99 text. At least for me.

TheCalligrapher · « **Reply #14 on:** November 05, 2021, 07:24:15 pm »

Quote from: SiliconWizard on November 05, 2021, 06:18:02 pm

I tried to find all those claims in the C99 standard.

You won't find them there. These are implementation details. They are not explicitly covered by the standard. The standard describes the required behavior. How that behavior is achieved in specific implementations is a different story.

What C standard requires, for one example, is the following assertion to hold

Code: [Select]

int n = 10;
typedef char A[n++ / 2];
n += 42;
assert(sizeof(A) == 5);

and that immediately implies that in general case the compiler will have to store the array size somewhere in a hidden variable. The compiler will have to do it at the point where control passes over the typedef. The standard does not have to say it explicitly, but there's simply no other way around it in general case.

That's why I call that a run-time typedef. It generates code and it occupies storage.

Quote from: SiliconWizard on November 05, 2021, 06:18:02 pm

All I could really find is that the *size of the array* has to be evaluated when the typedef is encountered. It is this evaluation that could lead to actual code being emitted for it, as far as I understand.

No, the key point here is not the evaluation per se. It is the fact that the evaluated value has to be "frozen" at that very moment. The original expression cannot be reevaluated every time that size is requested by the subsequent code (e.g. by a `sizeof`), since the components of that expression might be lost already. For which reason the implementations are forced to evaluate only once and store the result in a hidden variable.

You can easily see that hidden variable in Godbolt:

Code: [Select]

int main()
{
  int n1 = 10, n2 = 20;
  typedef char A[n1 + n2];
}

Code: [Select]

main:
        push    rbp
        mov     rbp, rsp
        push    rbx
        mov     DWORD PTR [rbp-20], 10
        mov     DWORD PTR [rbp-24], 20
        mov     edx, DWORD PTR [rbp-20]
        mov     eax, DWORD PTR [rbp-24]
        add     eax, edx
        movsx   rdx, eax
        sub     rdx, 1
        mov     QWORD PTR [rbp-32], rdx
        cdqe
        mov     rcx, rax
        mov     ebx, 0
        mov     eax, 0
        mov     rbx, QWORD PTR [rbp-8]
        leave
        ret

https://godbolt.org/z/G7e7GW7Y7

The `mov QWORD PTR [rbp-32], rdx` is exactly that - initialization of a hidden variable to the current value of `n1 + n2` (this implementation actually stores `n1 + n2 - 1`, if you notice).

Initialization of that hidden variable is the reason why the typedef cannot be skipped. If you somehow manage to skip it, `sizeof(A)` will access a garbage value from that hidden variable, which is what happens in TinyC actually (TinyC might pre-initialize it to zero, but that's beside the point).

Quote from: SiliconWizard on November 05, 2021, 06:18:02 pm

Now until you actually declare a variable of that type, no space should be allocated for it.

... except for that hidden variable. It can be optimized away in many cases or replaced by a CPU register, but not always, of course.

Quote from: SiliconWizard on November 05, 2021, 06:18:02 pm

But even if it forces evaluation of the size - even if you don't use the type - this piece of code should be pretty harmless?

If you don't "use" that type in any way, then yes, it is probably harmless from practical point of view... But who cares about such contrived cases in practice? And the standard prohibits skipping.

Normally, if you declared it, you will most certainly use it. And if you somehow manage to skip the typedef, all hell will break loose once you attempt to use the type. E.g. `sizeof(A)` will evaluate to a garbage value. This is not harmless at all.

If you know GCC well enough, you know how to force the skip: a GCC extension - run-time goto - will help you. Try this

Code: [Select]

#include <stdio.h>
#include <stdlib.h>
 
int main()
{
  int n = 10;

  if (rand() >= 0) /* <- just to suppress optimizations */
    goto *&&skip;

  typedef char A[n];

skip:;  
  printf("%zu\n", sizeof(A));
}

https://godbolt.org/z/bs8PMafhG

The reason it prints garbage is because it retrieves the size from an uninitialized location. (It retrieves it from a register in this example, but it is beside the point).

Quote from: SiliconWizard on November 05, 2021, 06:18:02 pm

Now don't hesitate to point us to the exact parts of the standard that would back up all your claims, with the associated consequences.

Well, again, see the above. You have already found, I believe, the parts of the standard that prohibit the jump. The rest is out of the standard's scope.

Of course, this is a rather abstract and pedantic way to view the matter. When the standard features are developed and discussed, possible implementation approaches are always taken into account. Normally, the standard will never require something that it does not know how to implement. In many (or most) cases the standard usually implicitly targets one specific approach to the implementation. My description above is the implementation the standard had in mind. Think of me as the proverbial "horse's mouth" when it comes to matters like that.

SiliconWizard · « **Reply #15 on:** November 05, 2021, 07:40:41 pm »

OK, I see now. So yes, it's mainly a matter of implementation. And while leaving things open is good in general in the standard, I think here it may have been a bit too open.

And seeing, for instance, how GCC handles VLAs in some cases is actually intriguing. I think they may have smoked something. Now I admit implementing this feature is not an easy task in general and as a lot is left to the implementation, every compiler is free to do things in many different ways...

Back to TCC, I haven't looked at the generated code for VLAs yet. I'll try to do this to get an idea of how they tackled it.

TheCalligrapher · « **Reply #16 on:** November 05, 2021, 07:59:23 pm »

Note, BTW, that according to the popular belief, GCC also supports VLA in C++ code (as an extension). However, once you start playing with it, you'll discover that in C++ this is a very different feature. C++ VLA in GCC are quite different from standard C99 VLA.

This assertion will hold in C

Code: [Select]

int n = 10;
typedef char A[n];
n = 20;
assert(sizeof(A) == 10);

since C standard requires it to hold. But it will fail in C++ code under GCC.

Interesting to note, Clang also brings this extension over to C++ code, but in Clang the assertion holds. Apparently Clang makes an effort to better mimic C99 specification of VLA in its C++ extension.

SiliconWizard · « **Reply #17 on:** November 05, 2021, 09:05:28 pm »

So, just wanted to have a look. I took your code and just removed the goto:

Code: [Select]

int main(void)
{
  int n = 10;
  typedef char A[n];
}

TCC produces this:

Code: [Select]

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 81 ec 30 00 00 00    sub    $0x30,%rsp
   b:   b8 0a 00 00 00          mov    $0xa,%eax
  10:   89 45 fc                mov    %eax,-0x4(%rbp)
  13:   8b 45 fc                mov    -0x4(%rbp),%eax
  16:   89 45 f8                mov    %eax,-0x8(%rbp)
  19:   b8 00 00 00 00          mov    $0x0,%eax
  1e:   c9                      leave
  1f:   c3                      ret

We can see that it produces rather inefficient code, including a "mov $0x0,%eax" instead of "xor %eax,%eax" to return 0, and also that it does compute the array size and even stores it on the stack, even though it's never going to be used. But TCC does only very limited optimizations, so that's not surprising.

With GCC, as I expected, things are a bit different.
With "-O1" and above:

Code: [Select]

0000000000000000 <main>:
   0:   48 83 ec 28             sub    $0x28,%rsp
   4:   e8 00 00 00 00          call   9 <main+0x9>
   9:   b8 00 00 00 00          mov    $0x0,%eax
   e:   48 83 c4 28             add    $0x28,%rsp
  12:   c3                      ret

Not sure I get the "call" instruction here... but anyway, it effectively optimizes out anything related to the typedef, as I expected.
Interestingly, while I understand the error from a language standpoint with the "goto" (even though the standard may not be all THAT clear about this point), I find it rather odd and inconsistent that GCC has no problem ignoring the typedef when optimizing, while still finding the code erroneous. Sure we're at different compiling stages here... I get that. But that's just kind of funny.

Now with "-O0", of course, things are different (and get prepared for something really uh... dunno how to call it):

Code: [Select]

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   57                      push   %rdi
   2:   56                      push   %rsi
   3:   48 83 ec 30             sub    $0x30,%rsp
   7:   48 8d 6c 24 30          lea    0x30(%rsp),%rbp
   c:   e8 00 00 00 00          call   11 <main+0x11>
  11:   c7 45 fc 0a 00 00 00    movl   $0xa,-0x4(%rbp)
  18:   8b 45 fc                mov    -0x4(%rbp),%eax
  1b:   48 63 d0                movslq %eax,%rdx
  1e:   48 83 ea 01             sub    $0x1,%rdx
  22:   48 89 55 f0             mov    %rdx,-0x10(%rbp)
  26:   48 98                   cltq
  28:   48 89 c6                mov    %rax,%rsi
  2b:   bf 00 00 00 00          mov    $0x0,%edi
  30:   b8 00 00 00 00          mov    $0x0,%eax
  35:   48 83 c4 30             add    $0x30,%rsp
  39:   5e                      pop    %rsi
  3a:   5f                      pop    %rdi
  3b:   5d                      pop    %rbp
  3c:   c3                      ret

It does compute the array size here, and stores it on the stack, as TCC does. But the overall code is pretty nasty, even compared to TCC.

brucehoult · « **Reply #18 on:** November 06, 2021, 02:10:50 am »

Quote from: SiliconWizard on November 05, 2021, 09:05:28 pm

With GCC, as I expected, things are a bit different.
With "-O1" and above:
Code: [Select]

0000000000000000 <main>:
   0:   48 83 ec 28             sub    $0x28,%rsp
   4:   e8 00 00 00 00          call   9 <main+0x9>
   9:   b8 00 00 00 00          mov    $0x0,%eax
   e:   48 83 c4 28             add    $0x28,%rsp
  12:   c3                      ret

Not sure I get the "call" instruction here...

It's absolutely impossible.

x86 has no way to move the PC directly to a register so something like "call .+0; pop %rax" can be used to transfer the PC into RAX -- at the expense of screwing up your CPU's return address prediction. So on a modern CPU you should instead do (and gcc usually does) "call getPC" and at getPC you have "mov (%rsp),%rax;ret".

The code as it stands will crash:

1) subtract 0x28 from SP
2) push address of the mov onto the stack
3) add 0x28 to SP. It now points to the original SP-8, i.e. 8 bytes below the return address from main (a call pushes an 8-byte return address in 64-bit mode)
4) ret back to some random place that was stored in memory just below SP ::BOOM::

It's the same problem in the unoptimised version.

HOWEVER -- we are clearly looking at code that hasn't been linked. By the time you run it, the zeros in "e8 00 00 00 00" will have been replaced by something else. I don't know what, but whatever value is in %RAX afterwards isn't used, so it's a procedure / void function.

SiliconWizard · « **Reply #19 on:** November 06, 2021, 03:20:55 am »

Well yes, it was just an objdump on the object file, so before linking indeed. I'll take a look at what it actually calls in linked code. I'm curious.
But the point was that the typedef would effectively yield absolutely no code here with optimizations enabled, as I was pretty sure would happen. The call was just an oddity and I forgot this was just unlinked code. Sorry about the brain fart.

