Topic: how to include a binary with C

Offline DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3911
  • Country: gb
Re: how to include a binary with C
« Reply #25 on: January 27, 2023, 10:01:37 am »
Quote
There must be a linker that can handle object files? If so, just figure out the object file format and generate object files.

Yup, of course, but there is no tool to generate an object file, except the C89/C99 compiler and the assembler.

So, I would have to (port "objcopy" to RISCOS and) play with something like this:
Code: [Select]
objcopy \
        --prefix-symbol=_ \
        --input-target binary \
        --output-target elf \
         msg.txt msg.o
(ok, on RISCOS, it's not exactly "elf" for a native application, but ... ignore this for now)

That damned thing took 14 hours to compile inside a RISCOS-UNIX-sandbox (so, you can imagine how slow that Frankenstein stuff is), and then it even dared to complain :o :o :o

Code: [Select]
invalid bfd target

invalid, w_h_a_t_!?  :o :o :o :o

Ah, that mad thing wants this:
Code: [Select]
objcopy \
        --prefix-symbol=my \
        --input-target binary \
        --output-target elf32-little \
         msg.txt msg.o
(the fix: --output-target elf32-little instead of elf)

This way it's happy, and outputs what is needed.

Fine! Good! Excellent!

Code: [Select]
\()/
 __

let's start a whiskey flavored coffee celebration party :D :D :D

Just ... ummm ...

Code: [Select]
# objdump -t msg.o

msg.o:     file format elf32-little

SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
00000000 g       .data  00000000 my_binary_msg_txt_start
0000000c g       .data  00000000 my_binary_msg_txt_end
0000000c g       *ABS*  00000000 my_binary_msg_txt_size

I am now on a MIPS32R2/LE, so "elf32-little" is ok here, but on my HPPA/BE it should be "elf32-big" instead.

Code: [Select]
# objdump -t msg.o

msg.o:     file format elf32-big

SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
00000000 g       .data  00000000 my_binary_msg_txt_start
0000000c g       .data  00000000 my_binary_msg_txt_end
0000000c g       *ABS*  00000000 my_binary_msg_txt_size

And here I see the annoying part ... not a real problem: if you export things as byte[..] there is no endianness problem at all. But ... ummm ... that inconsistency (especially if automatically handled(1)) hits my eyes like a punch, and if I forget about it I'm sure that sooner or later a serious problem will lurk in there once I abuse it, and then I will waste ____a lot of time___ debugging :-//
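
A minimal sketch of how the elf32-little / elf32-big choice could be derived automatically for a native (non-cross) build. The helper name generic_elf32_target is hypothetical; the two target names are the ones used above. It probes the byte order of the machine it runs on, so in a cross-dev setup it would describe the build host rather than the target, which is the kind of trap footnote (1) is about.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the generic objcopy output target that
 * matches the byte order of the machine this code runs on. */
const char *generic_elf32_target(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    /* Look at the lowest-addressed byte of a known 32-bit value. */
    memcpy(&first, &probe, 1);
    return (first == 1) ? "elf32-little" : "elf32-big";
}
```

A configure step could print this string and paste it after --output-target, but only when host and target are the same machine.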

* * *

So, moral of the story, I prefer "binary to C", or "binary to assembly", and to handle the result as a "module" filled with nothing but public data described in its "interface".

          convert(binary) => module(interface, body(%.c | %.s))

This looks coherent, easy for Git, and portable, with fewer details to care about.



(1) With cross-dev, that stupid thing caused a lot of problems with GCC cross-emergent, not to mention a lot of problems with Portage cross-emergent. A stupid oversight among all the GNU madness you need to look at (automake, autoconf, etc.), and thousands of hours wasted.
« Last Edit: January 27, 2023, 10:15:00 am by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho (Topic starter)

Re: how to include a binary with C
« Reply #26 on: January 27, 2023, 01:34:24 pm »
Code: [Select]
rm -f obj/msg.*

objcopy \
        --prefix-symbol=my \
        --localize-hidden \
        --rename-section .data=.rdata,CONTENTS,ALLOC,LOAD,READONLY,DATA \
        --input-target binary \
        --binary-architecture powerpc \
        --output-target elf32-powerpc \
        msg.txt \
        obj/msg.o
(PowerPC/BE)

Code: [Select]
rm -f obj/msg.*

objcopy \
        --prefix-symbol=my \
        --localize-hidden \
        --rename-section .data=.rdata,CONTENTS,ALLOC,LOAD,READONLY,DATA \
        --input-target binary \
        --binary-architecture i386 \
        --output-target elf32-i386 \
        msg.txt \
        obj/msg.o
(x86/LE)

Code: [Select]
obj/msg.o:     file format elf32-powerpc

SYMBOL TABLE:
00000000 l    d  .rdata 00000000 .rdata
00000000 g       .rdata 00000000 my_binary_msg_txt_start
00000020 g       .rdata 00000000 my_binary_msg_txt_end
00000020 g       *ABS*  00000000 my_binary_msg_txt_size

Code: [Select]
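/* public/private and p_uint8_t are assumed to come from the author's types.h
   (see lib_suppa.h later in the thread); public is presumably extern. */
public uint8_t my_binary_msg_txt_start[];
public uint8_t my_binary_msg_txt_end[];

private void app()
{
    p_uint8_t p_data;
    uint32_t  data_size;

    /* the two symbols mark the first byte and one past the last byte */
    p_data    = my_binary_msg_txt_start;
    data_size = (uint32_t) (my_binary_msg_txt_end - my_binary_msg_txt_start);

    /* %p expects a void *; it may add its own "0x" prefix (as it does here) */
    printf("start=0x%p\n", (void *) my_binary_msg_txt_start);
    printf("  end=0x%p\n", (void *) my_binary_msg_txt_end);
    /* uint32_t is not necessarily unsigned long, so cast for %lx / %lu */
    printf("#size=0x%lx\n", (unsigned long) data_size);
    printf("#size=%lu\n", (unsigned long) data_size);

    data_show(p_data, data_size, 8);
}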

Code: [Select]
start=0x0x80052043
  end=0x0x80052063
#size=0x20
#size=32
|h A l l o   w |
|o r l d !   s |
|u p p a   c a |
|n   f l y   a |
|g a i n . . . |

so, it's arch/endian dependent  :o :o :o
« Last Edit: January 27, 2023, 01:42:43 pm by DiTBho »
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 6761
  • Country: pl
Re: how to include a binary with C
« Reply #27 on: January 27, 2023, 01:45:41 pm »
What's endian dependent?

Nothing got mangled here.

The binary is embedded in its original byte order.
I suppose your data_show prints it the same way.
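
To illustrate the point: as long as the embedded data is read through an unsigned char pointer, the result is the same on every host. A sketch, with a hypothetical helper read_be32, of pulling a big-endian 32-bit field out of such a blob without caring about the CPU's byte order:

```c
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit big-endian value from a byte buffer.
 * Only byte accesses and shifts are used, so the result does not depend on
 * the byte order of the CPU running the code. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}
```

Casting the blob to uint32_t * and dereferencing it is what would bring endianness (and alignment) back into the picture.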
 

Offline DiTBho (Topic starter)

Re: how to include a binary with C
« Reply #28 on: January 27, 2023, 01:57:08 pm »
Quote from: magic on January 27, 2023, 01:45:41 pm
What's endian dependent?

These lines

Code: [Select]
        --binary-architecture .... \
        --output-target .... \

are architecture/endian dependent.
I mean, when you move the project between architectures, you have to "adapt" the configure file or something
(e.g. PowerPC64/BE (PowerMac-G5) is not the same as PowerPC64/LE (Cell,POWER9/10))

But see this:

Code: [Select]
objcopy \
        --prefix-symbol=my \
        --localize-hidden \
        --rename-section .data=.rdata,CONTENTS,ALLOC,LOAD,READONLY,DATA \
        --input-target binary \
        --binary-architecture i386 \
        --output-target elf32-little \
        msg.txt \
        obj/msg.o

Code: [Select]
- linking to my
/usr/lib/gcc/i686-pc-linux-gnu/10.2.0/../../../../i686-pc-linux-gnu/bin/ld: unknown architecture of input file `obj/msg.o' is incompatible with i386 output
collect2: error: ld returned 1 exit status
make: *** [Makefile:16: all] Error 1

objdump is happy with that,
gcc/ld are not.

(edit: with Hitachi SuperH and MIPS things get even more messed up:
the target flags have to match :o :o :o )
« Last Edit: January 27, 2023, 02:05:31 pm by DiTBho »
 

Offline magic

Re: how to include a binary with C
« Reply #29 on: January 27, 2023, 02:06:44 pm »
Yeah, object files are system dependent and it looks like you need to specify it explicitly for objcopy.

In the cool thread we used ld instead, which automatically generates its native object file. So the command line is the same for all platforms (as long as you use GNU ld).
 

Offline PlainName

  • Super Contributor
  • ***
  • Posts: 6821
  • Country: va
Re: how to include a binary with C
« Reply #30 on: January 30, 2023, 08:08:48 am »
For anyone still wanting another choice, I'm currently using a slightly modified eight-year-old bin2c which has the advantage over the previously mentioned bin2array, and others, of processing a bunch of files all in one go and shoving them in a single .c/.h pair. My use at the moment is to embed a folder holding a bunch of HTML files into a RO virtual drive:
Code: [Select]
# Make HTML files into embedded binary source to place in RO memory.
file (GLOB_RECURSE bin_files RELATIVE ${CMAKE_SOURCE_DIR}/main/wifi/html ${CMAKE_SOURCE_DIR}/main/wifi/html/*)
add_custom_command(
OUTPUT ../main/wifi/html_files.h
WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}/main/wifi/html
COMMAND bin2c.exe -t -d html_files.h -o html_files.c ${bin_files}
COMMAND move html_files.* ..
DEPENDS ${CMAKE_SOURCE_DIR}/main/wifi/configurator.c)
add_custom_target(embed_html DEPENDS ${CMAKE_SOURCE_DIR}/main/wifi/html_files.h)
add_dependencies(${PROJECT_NAME}.elf embed_html)

Most of that is just getting bloody cmake/ninja to recognise a change and run the thing again :( An advantage is I don't need to dick around editing file names in the build system or code.

The .h contents (for one of the files) ends up looking like this:
Code: [Select]
/* Contents of file framestyles.css */
extern const long int framestyles_css_size;
extern const unsigned char framestyles_css[1800];

and the relevant part in the .c:
Code: [Select]
/* Contents of file framestyles.css */
const long int framestyles_css_size = 1800;
const unsigned char framestyles_css[1800] = {
    0x2A, 0x20, 0x7B, 0x0D, 0x0A, 0x20, 0x20, 0x62 ...
};
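
For reference, a minimal sketch of a generator that produces output of that shape. This is not the bin2c source; the function names emit_header and emit_source are hypothetical, and it assumes the data is already in memory and is not empty.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical generator: write the declarations for one embedded file. */
int emit_header(FILE *out, const char *name, size_t len)
{
    return fprintf(out,
                   "extern const long int %s_size;\n"
                   "extern const unsigned char %s[%zu];\n",
                   name, name, len) < 0 ? -1 : 0;
}

/* Hypothetical generator: write the matching definitions, 8 bytes per row. */
int emit_source(FILE *out, const char *name,
                const unsigned char *data, size_t len)
{
    size_t i;

    if (len == 0)
        return -1;              /* zero-length arrays are not valid C */

    fprintf(out, "const long int %s_size = %zu;\n", name, len);
    fprintf(out, "const unsigned char %s[%zu] = {\n", name, len);
    for (i = 0; i < len; i++) {
        if (i % 8 == 0)
            fputs("    ", out);
        fprintf(out, "0x%02X", (unsigned)data[i]);
        if (i + 1 < len)
            fputs(i % 8 == 7 ? ",\n" : ", ", out);
    }
    fputs("\n};\n", out);
    return ferror(out) ? -1 : 0;
}
```

Because the array size is written into both the declaration and the definition, the compiler can check that the .h and the .c still agree.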
« Last Edit: January 30, 2023, 08:10:24 am by PlainName »
 

Offline DiTBho (Topic starter)

Re: how to include a binary with C
« Reply #31 on: January 30, 2023, 10:33:21 am »
Quote from: PlainName on January 30, 2023, 08:08:48 am
the advantage over the previously mentioned bin2array, and others, of processing a bunch of files all in one go and shoving them in a single .c/.h pair. My use at the moment is to embed a folder holding a bunch of HTML files into a RO virtual drive

yeah, that's the right way to do things  ;D

Yesterday, I wrote my own tool:

Code: [Select]
# make
- compiling app "mybin2cs" ... done
- finalizing to mybin2cs ... done

# echo "hAllo world" > msg0.txt
# mybin2cs msg0.txt obj/lib_suppa assembly.data suppa
# as obj/lib_suppa.s -o obj/lib_suppa.o
# objdump -t            obj/lib_suppa.o

obj/lib_suppa.o:     file format elf32-hppa
SYMBOL TABLE:
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 g       .data  00000000 suppa
00000010 g       .data  00000000 suppa_size

# cat obj/lib_suppa.s
        .data
        .globl suppa
        .globl suppa_size

        /* - - - body - - - */
        .align 1
suppa:
        .byte  0x68, 0x41, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f
        .byte  0x72, 0x6c, 0x64, 0x0a, 0x00, 0x00, 0x00, 0x00
        .align 4
suppa_size:
        .long 12

# cat obj/lib_suppa.h
#ifndef _lib_suppa_
#define _lib_suppa_

#include "types.h"

public uint8_t  suppa[];
public uint32_t suppa_size;

#endif

It always outputs a pair { .c, .h } | { .s, .h }.
If you specify "assembly", you can force the section you want { .data, .rodata, ... }.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6242
  • Country: fi
    • My home page and email address
Re: how to include a binary with C
« Reply #32 on: January 30, 2023, 10:36:42 am »
I've used objcopy or ld, and a simple Bash script to generate both the ELF object file and the header file:
    #ifndef   NAME_H
    #define   NAME_H
    extern type name[count];
    #endif /* NAME_H */
with count derived from the file size (file size divided by size of type); and NAME, name, type, the size of type, and the binary file name specified by command line parameters.

Then, sizeof name yields the size of the binary object in bytes, and (sizeof name / sizeof name[0]) == count yields the number of elements in the array, as usual.  The C/C++ compiler also detects trivial array overrun errors then, because it too knows the exact size of the array.
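
A small sketch of what that buys you. The array name icon, its element type and its count are made up for illustration, and the definition here is only a stand-in for the generated object file so that the example links on its own:

```c
#include <stddef.h>
#include <stdint.h>

/* What the generated header would declare (name and count are made up). */
extern const uint16_t icon[6];

/* Stand-in for the generated object file, so the sketch links by itself. */
const uint16_t icon[6] = { 1, 2, 3, 4, 5, 6 };

/* Size of the binary object in bytes. */
size_t icon_bytes(void)
{
    return sizeof icon;
}

/* Number of elements in the array. */
size_t icon_count(void)
{
    return sizeof icon / sizeof icon[0];
}
```

With a declaration like extern const uint16_t icon[]; instead, the type is incomplete and sizeof icon does not compile, which is why the count in the header matters.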

If you have several of them, say icons or images, I'd modify the script to read the list of binaries and their properties from a plain text file, generating one object file for all of them, with the associated header (and C source file) describing each of their properties.

Sure, one could easily write a program to do this, because ELF files are easier to manipulate than one would think, except for a few historical warts like 64-bit Alpha and S390, which use nonstandard fields, but because of Unix philosophy, I prefer the scripted easily adapted approach myself.  (If you have lots of files, use the stat -c '%Y' filename utility to obtain the last modification time of the file in seconds since Unix epoch, and only regenerate the file if any of the sources is newer than the already existing object file.)
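
If the regeneration check ever has to live in a C helper instead of the script, the same modification-time comparison could be sketched like this. The function name needs_regen is hypothetical, and it assumes a POSIX system with stat():

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `target` is missing or older than
 * `source`, 0 if it is up to date, -1 if `source` cannot be examined.
 * Uses POSIX stat(); st_mtime is the value `stat -c '%Y'` prints. */
int needs_regen(const char *source, const char *target)
{
    struct stat src, dst;

    if (stat(source, &src) != 0)
        return -1;
    if (stat(target, &dst) != 0)
        return 1;               /* no object file yet */
    return src.st_mtime > dst.st_mtime;
}
```

With several sources per object file, the call would be repeated for each source and the object regenerated if any call returns 1.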



There is one pattern I wish everyone who writes sh-compatible scripts (including Bash scripts) would follow: use an autodeleted temporary directory for your temporary files,
Code: [Select]
#!/bin/sh
export LANG=C LC_ALL=C
Work="$(/usr/bin/mktemp -d)" || exit 1
trap "rm -rf '$Work'" EXIT

# Use "$Work/filename" for temporary files, preferably with descriptive file names.
You want the script to work in a well-defined locale; the default (C) is the obvious choice.  This way, even if file names contain non-ASCII characters (using any character set), your script won't be confused, and just uses the file names as-is.

The temporary directory $Work is always removed when the script exits, even if the script fails due to an error.  The way the trap is set, the path to the temporary directory is expanded when the trap is set, so even if you manage to mangle the Work shell variable somehow (say, set it to /), the trap is not affected at all; the original temporary directory and all its contents will be removed.

(Both mktemp and the stat commands I described earlier are part of GNU coreutils, and always installed in Linux.  They or their equivalents are obviously available for other operating systems as well.)

Another pattern is to use nul-separated file names (find -print0, xargs -0, Bash read -d "", and so on), so that your scripts are not confused or do the wrong thing if a file name happens to contain a space or a newline.  But in build environments, that's usually not an issue.
 

Offline DC1MC

  • Super Contributor
  • ***
  • Posts: 1882
  • Country: de
Re: how to include a binary with C
« Reply #33 on: January 30, 2023, 12:00:53 pm »
@Nominal Animal -> this is the  8) way !!!
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14447
  • Country: fr
Re: how to include a binary with C
« Reply #34 on: January 30, 2023, 07:38:38 pm »
Now obviously if you don't have access to binutils and/or you don't want to write your own tool, sure, generating a C file is simple and one of the most portable approaches possible, while allowing you to declare variables with the exact type you want (and alignment if required). The only downside really is the size of the file - if you use hex as initializers you'll need in the order of 5 times the size of the binary file. But these days, storage is cheap and you can always delete the intermediate C file automatically after compiling it.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4028
  • Country: nz
Re: how to include a binary with C
« Reply #35 on: January 30, 2023, 10:12:14 pm »
Quote from: SiliconWizard on January 30, 2023, 07:38:38 pm
The only downside really is the size of the file - if you use hex as initializers you'll need in the order of 5 times the size of the binary file. But these days, storage is cheap and you can always delete the intermediate C file automatically after compiling it.

I made a file with a million lines of random bytes in the format "0xnn,\n" [1] and then compressed it with default gzip, the same as git does. It came out 1.53x the size of a binary file of the same bytes.

Other options:

Code: [Select]
Size Time Method
6.00 0.00 raw
3.02 0.04 lz4
1.82 0.52 lz4 -9
1.53 0.42 gzip
1.48 0.07 zstd
1.43 5.09 gzip -9
1.02 0.39 bzip2

[1] perl -e 'for (1..1024**2){printf "0x%2x,\n",int(rand(256))}' >randhex_1M
 

Online SiliconWizard

Re: how to include a binary with C
« Reply #36 on: January 31, 2023, 12:31:43 am »
Oh, if you compress, sure!

Otherwise, since it's usually just for generating a temporary C file that'll get compiled to an object file, you can write your conversion tool from binary to C to output on stdout, and pipe it to the C compiler. No intermediate file needed!
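
A rough sketch of such a filter, assuming non-empty input. The function name bin_to_c and the emitted symbol names are hypothetical. It never holds more than one chunk in memory, so it can sit in a pipeline:

```c
#include <stdio.h>

/* Hypothetical filter: read raw bytes from `in` and write a C definition of
 * `name` and `name_size` to `out`, 12 bytes per row.
 * Returns the number of bytes converted, or -1 on an I/O error. */
long bin_to_c(FILE *in, FILE *out, const char *name)
{
    unsigned char chunk[4096];
    size_t got, i;
    long total = 0;

    fprintf(out, "const unsigned char %s[] = {\n", name);
    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        for (i = 0; i < got; i++, total++) {
            /* separator goes before each byte, so there is no trailing comma */
            fputs(total == 0 ? "    "
                             : (total % 12 == 0 ? ",\n    " : ", "), out);
            fprintf(out, "0x%02x", (unsigned)chunk[i]);
        }
    }
    fprintf(out, "\n};\nconst unsigned long %s_size = %ldUL;\n", name, total);
    return (ferror(in) || ferror(out)) ? -1 : total;
}
```

Wrapped in a main() that calls bin_to_c(stdin, stdout, "blob"), it could be used as something like ./bin2c_filter < msg.txt | gcc -x c -c - -o msg.o (the tool name is made up; reading the source from stdin via "-x c -" is a GCC feature).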
 

