Author Topic: Is a C struct flatten in memory (Python wrap for a C lib)?  (Read 1091 times)


Online RoGeorge

  • Super Contributor
  • ***
  • Posts: 4007
  • Country: ro
Is a C struct flatten in memory (Python wrap for a C lib)?
« on: January 12, 2022, 12:48:53 pm »
Yesterday I tried to add a Python 3 wrapper for a function from a C library, and failed.  :-\

It all went like this:

 ;D



This is the C function I've tried to call from Python, and the struct for *info bamboozles me:
Code: [Select]
typedef struct
{
    void (*broadcast)(const char *address, const char *interface);
    void (*device)(const char *address, const char *id);
    void (*service)(const char *address, const char *id, const char *service, int port);
} lxi_info_t;

...

int lxi_discover(lxi_info_t *info, int timeout, lxi_discover_t type);

The code is from https://github.com/lxi-tools/liblxi/blob/master/src/lxi.h , a C library used to control SCPI instruments over LAN (LXI).  The lxi_discover is meant to discover SCPI instruments present in the LAN.

For the other functions present in the same lib, the author already added a Python wrap:
https://github.com/lxi-tools/python-liblxi/blob/master/lxi.py

My attempt to add the discover function is about learning.  My instruments are set with a fixed IP so no need to discover them with a function call, though maybe others might have a different setup and find the lxi_discover useful, IDK.

I'm not sure what I was doing wrong in Python, but no wrapping syntax worked.  I think I don't understand the lxi_info_t struct.

- Can somebody explain to me in words please "void (*broadcast)(const char *address, const char *interface);"?
- How is the lxi_info_t struct datatype represented in memory?  Is it (for first line only) "pointer to function broadcast", "pointer to IP string", "pointer to interface name string"?

- How do I wrap lxi_discover, more precisely how to deal with the *info argument in Python?
« Last Edit: January 12, 2022, 01:04:53 pm by RoGeorge »
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 5610
  • Country: fi
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #1 on: January 12, 2022, 12:59:33 pm »
They are three function pointers.

At the memory level, a function pointer is like any other pointer: just a memory address.

The arguments to the functions (const char *address, const char *interface) are not stored in the function pointer itself. They are declared so that the compiler knows what the functions look like, and can generate correct code when you call the function through that function pointer.
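
To make this concrete from the Python side, here is a minimal sketch using ctypes from the standard library. Only the prototype comes from lxi_info_t above; the names BroadcastFn and on_broadcast, and the address and interface values, are made up for illustration.

```python
import ctypes

# ctypes counterpart of:
#   void (*broadcast)(const char *address, const char *interface);
# The first argument is the return type (None means void), the rest are the parameter types.
BroadcastFn = ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_char_p)

seen = []

def on_broadcast(address, interface):
    # c_char_p parameters arrive in the callback as Python bytes objects.
    seen.append((address, interface))

# Wrapping the Python function yields a real C function pointer.
broadcast = BroadcastFn(on_broadcast)

# The pointer itself is a single address; the parameter list is type information only.
pointer_size = ctypes.sizeof(broadcast)

# The argument values only exist at call time, when something calls through the pointer.
broadcast(b"192.168.1.255", b"eth0")

print(pointer_size == ctypes.sizeof(ctypes.c_void_p), seen)
```

The stored pointer has the size of any other pointer; the two strings show up only when the call is made.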

Depending on the machine, the address is most likely either 32 or 64 bits, so this would be 3 * 4 bytes or 3 * 8 bytes, in that order. I don't remember the padding rules exactly, but AFAIK pointers are self-aligning and need no padding, so the struct is just these three memory addresses back-to-back without gaps. Hence, it should Just Work, even if you don't give the struct a "packed" attribute.
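
The layout claim can be checked from Python with ctypes. This is only a sketch: LxiInfo is a hypothetical Python mirror of lxi_info_t, not something shipped by liblxi or python-liblxi, and the type names are invented here.

```python
import ctypes

# Function pointer types matching the three members of lxi_info_t.
TwoStrings = ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_char_p)
ThreeStringsAndInt = ctypes.CFUNCTYPE(
    None, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int
)

class LxiInfo(ctypes.Structure):
    # Hypothetical mirror of lxi_info_t; field order must match the C declaration.
    _fields_ = [
        ("broadcast", TwoStrings),
        ("device", TwoStrings),
        ("service", ThreeStringsAndInt),
    ]

pointer_size = ctypes.sizeof(ctypes.c_void_p)
struct_size = ctypes.sizeof(LxiInfo)
# Each field descriptor knows its byte offset inside the structure.
offsets = {name: getattr(LxiInfo, name).offset for name, _ in LxiInfo._fields_}

print("pointer size:", pointer_size)
print("struct size: ", struct_size)
print("offsets:     ", offsets)
```

On a typical 64-bit system this should report a pointer size of 8, a struct size of 24, and offsets 0, 8 and 16, i.e. three addresses with no gaps.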

Regarding Python, I don't know. The only thing that comes to my mind is: do the C compiler and Python agree about the ABI, i.e., the calling conventions? But as I have no idea about Python, maybe someone else can give a better guess.
« Last Edit: January 12, 2022, 01:04:40 pm by Siwastaja »
 

Online RoGeorge

  • Super Contributor
  • ***
  • Posts: 4007
  • Country: ro
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #2 on: January 12, 2022, 02:06:49 pm »
So, talking about C only and assuming a pointer length is 8 bytes, the line:
Code: [Select]
lxi_discover(info1, 5, 0)
means info1 is 8 bytes containing the memory address of a data structure, and the memory pointed to by info1 will look like this:
Code: [Select]
- 8 bytes containing the start address of a function named "broadcast"
- 8 bytes containing the start address of a function named "device"
- 8 bytes containing the start address of a function named "service"
The other arguments are easy: timeout 5 is an int (typically 4 bytes on current desktop platforms), and 0 is an enum value, which is normally passed as an int as well.

If I got the *info right, then how does one find out (in C) what instruments were discovered by the "lxi_discover(info1, 5, 0)"?

This is a usage example of calling lxi_discover( ... ), from the file https://github.com/lxi-tools/lxi-tools/blob/master/src/discover.c
Code: [Select]
static int device_count = 0;
static int service_count = 0;

static void broadcast(const char *address, const char *interface)
{
    UNUSED(address);
    printf("Broadcasting on interface %s\n", interface);
}

static void device(const char *address, const char *id)
{
    printf("  Found \"%s\" on address %s\n", id, address);
    device_count++;
}

static void service(const char *address, const char *id, const char *service, int port)
{
    printf("  Found \"%s\" on address %s\n    %s service on port %u\n", id, address, service, port);
    service_count++;
}



int discover(bool mdns, int timeout)
{
    lxi_info_t info;

    // Set up info callbacks
    info.broadcast = &broadcast;
    info.device = &device;
    info.service = &service;

    printf("Searching for LXI devices - please wait...\n\n");

    // Search for LXI devices / services
    if (mdns)
    {
        lxi_discover(&info, timeout, DISCOVER_MDNS);
        if (service_count == 0)
            printf("No services found\n");
        else
            printf("\nFound %d service%c\n", service_count, service_count > 1 ? 's' : ' ');
    }

... 
    return 0;
}

I don't understand the interaction with those 3 callback functions.

- Does this mean "address" and "interface" are global variables used to return data (here the IP of the discovered instruments) back to the main program?
- How does lxi_discover (together with the 3 callback functions) return the discovered data about the existing instruments?
« Last Edit: January 12, 2022, 02:15:27 pm by RoGeorge »
 

Online gf

  • Frequent Contributor
  • **
  • Posts: 701
  • Country: de
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #3 on: January 12, 2022, 03:41:03 pm »
Quote from: Siwastaja on January 12, 2022, 12:59:33 pm
Regarding Python, I don't know. The only thing that comes to my mind is: do the C compiler and Python agree about the ABI, i.e., the calling conventions? But as I have no idea about Python, maybe someone else can give a better guess.

Two possibilities are:
1) Use the C API (implement python bindings to the library in C, i.e. implement an adapter module)
2) Use the CFFI (enables direct calling of C functions from Python)
 

Offline gmb42

  • Frequent Contributor
  • **
  • Posts: 270
  • Country: gb
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #4 on: January 12, 2022, 04:08:38 pm »
The lxi_discover function is passed the lxi_info_t structure containing the user specified notification callbacks and when necessary the code will call the callbacks to notify of something of interest.

It's up to the user of lxi_discover what those callbacks do; in the example you've posted they simply print something to the console and increment the count variables.  You could, for example, accumulate devices in a list of some sort.
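
As a rough illustration of the "accumulate in a list" idea in Python: the sketch below uses fake_discover, a made-up stand-in that plays the role of lxi_discover so the snippet runs without liblxi or any instruments. The addresses and ID strings are invented.

```python
import ctypes

# Same shape as the "device" member of lxi_info_t.
DeviceFn = ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_char_p)

found = []  # filled in by the callback, read by the caller afterwards

def on_device(address, device_id):
    # Runs once per reported device; just remember what was reported.
    found.append((address.decode("ascii"), device_id.decode("ascii")))

device_cb = DeviceFn(on_device)

def fake_discover(device):
    """Hypothetical stand-in for lxi_discover(): reports two made-up instruments."""
    device(b"192.168.1.10", b"EXAMPLE SCOPE")
    device(b"192.168.1.11", b"EXAMPLE DMM")
    return 0

status = fake_discover(device_cb)
print(status, found)
```

The results never come back through the return value; they are whatever the callback chose to store while the discovery call was running.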
 

Online Cerebus

  • Super Contributor
  • ***
  • Posts: 10239
  • Country: gb
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #5 on: January 12, 2022, 04:35:35 pm »
Quote from: RoGeorge on January 12, 2022, 02:06:49 pm
I don't understand the interaction with those 3 callback functions.

- Does this mean "address" and "interface" are global variables used to return data (here the IP of the discovered instruments) back to the main program?
- How does lxi_discover (together with the 3 callback functions) return the discovered data about the existing instruments?

As lxi_discover goes through the discovery process it calls the 'callback' functions to return information as and when it feels like it (i.e. you have no control over when the callbacks are made, you just know that they are a side-effect of calling lxi_discover). You are expected to write those callback functions and pass pointers to those callback functions (as a struct) at the start of the discovery process to lxi_discover. Those particular instances of "address" and "interface" are function parameters in the callback functions that you will have provided and are both of the type "a pointer to a null-terminated C character string".
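
For what "a pointer to a null-terminated C character string" means once it reaches Python, here is a small ctypes sketch. The buffer contents are invented; the point is only that a C string ends at the first NUL byte, whatever else the memory holds.

```python
import ctypes

# A C buffer holding "10.0.0.5", a NUL terminator, then unrelated bytes.
raw = ctypes.create_string_buffer(b"10.0.0.5\x00leftover")

# Reading it as a C string (char *) stops at the first NUL byte.
as_c_string = ctypes.cast(raw, ctypes.c_char_p).value

# Turning the bytes into text is a separate, explicit step on the Python side.
address = as_c_string.decode("ascii")

print(len(raw.raw), as_c_string, address)
```

Inside a ctypes callback the same conversion has already been done for c_char_p parameters: they arrive as bytes, and it is up to the callback to decode them.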

Take a look here (https://reptate.readthedocs.io/developers/python_c_interface.html) for a short tutorial in C<->Python interfacing. The latter part has an example of how to write a simple callback in python that can be passed to C. Found, by the way, as the first hit on searching for "python interfacing to c callback functions" - sometimes you have to know the name of what you're trying to find ("callback functions") before you can start searching.
 

Online RoGeorge

  • Super Contributor
  • ***
  • Posts: 4007
  • Country: ro
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #6 on: January 16, 2022, 08:30:56 am »
If a callback can point anywhere, including in another program - here a function from Python - wouldn't that be a security issue?

I mean, I can write a malicious function in Python (let's say a reset, a jump to zero), and because of the callback exposed in that library, the call will appear like it is coming from a trusted library.

How does the kernel distinguish where a call came from, and who is allowed to make a reset and who is not?

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 3708
  • Country: fi
    • My home page and email address
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #7 on: January 16, 2022, 09:54:05 am »
Quote from: RoGeorge on January 16, 2022, 08:30:56 am
If a callback can point anywhere, including in another program - here a function from Python - wouldn't that be a security issue?
A callback can only point at addresses in the same process.  The process here is the Python interpreter.  Its privileges are those of the user who executed the interpreter.  Because the executable is the Python interpreter, the privileges on the Python source or object files do not matter, except for readability: they are just data the Python interpreter reads to decide what it does internally.

The Python interpreter uses dlopen() (on Linux/Android/Mac) to load shared libraries into the process memory.  Python provides ctypes, a built-in facility for doing exactly that, plus converting from the OS-and-machine specific binary ABI (application binary interface) to Python and vice versa.  The documentation includes examples of how to deal with callback functions, too.
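
A minimal sketch of that, assuming a POSIX system (Linux, FreeBSD, macOS): it loads the C library into the Python process and lets the standard qsort() call back into a Python function. The names CompareFn and compare_ints are made up here; qsort itself is the standard C function.

```python
import ctypes
from ctypes.util import find_library

# Load the C runtime into this (Python) process. If find_library() finds
# nothing, CDLL(None) falls back to the symbols already loaded (POSIX only).
libc = ctypes.CDLL(find_library("c") or None)

# int (*compar)(const void *, const void *), specialised to int elements.
CompareFn = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)

def compare_ints(a, b):
    # a and b point into the array; return <0, 0 or >0 like any qsort comparator.
    return (a[0] > b[0]) - (a[0] < b[0])

libc.qsort.restype = None
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CompareFn]

values = (ctypes.c_int * 5)(33, 5, 99, 1, 7)
compare_cb = CompareFn(compare_ints)  # keep a reference while C may call it
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), compare_cb)

sorted_values = list(values)
print(sorted_values)
```

The C code in libc and the Python comparator live in the same process, with the same privileges; the callback is just an address inside that process.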
« Last Edit: January 16, 2022, 10:03:14 am by Nominal Animal »
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 3708
  • Country: fi
    • My home page and email address
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #8 on: January 16, 2022, 11:05:44 am »
If you are running on Linux, this example may help.

First, a trivial dynamic library, mycall.c:
Code: [Select]
struct callbacks {
    int (*first)(char *);
    int (*second)(int);
};

int mycall(struct callbacks *cbs)
{
    /* We require a non-NULL pointer to a structure, or we return -1. */
    if (!cbs) {
        return -1;
    }

    /* If the first callback is non-NULL, we'll call it. */
    if (cbs->first) {
        int  retval = cbs->first("Hello, world!");
        /* If it returns nonzero, we return that immediately. */
        if (retval)
            return retval;
    }

    /* If the second callback is non-NULL, we'll call it. */
    if (cbs->second) {
        int  retval = cbs->second(42);
        /* If it returns nonzero, we return that immediately. */
        if (retval)
            return retval;
    }

    /* All done. */
    return 0;
}
Compile this using e.g.
    gcc -Wall -Wextra -O2 -fPIC mycall.c -shared -Wl,-soname,libmycall.so -o libmycall.so
into libmycall.so dynamic library.

Here is a Python 3 (3.5 or later) example, example.py, using the above:
Code: [Select]
# -*- encoding: UTF-8 -*-

from ctypes import CDLL, CFUNCTYPE, Structure, byref, c_char_p, c_int

# Example first callback function.
# ctypes hands a c_char_p argument to the callback as a Python bytes object.
def python_first(arg: bytes) -> int:
    print(b'python_first() called with "%b".' % arg)
    return 0

# Example second callback function.
# A c_int argument arrives as a plain Python int.
def python_second(arg: int) -> int:
    print(b'python_second() called with %d.' % arg)
    return 0

# This is the callback structure we need:
# struct {
#    int (*first)(char *);
#    int (*second)(int);
# };
class struct_cbs(Structure):
    _fields_ = [ ("first", CFUNCTYPE(c_int, c_char_p)),
                 ("second", CFUNCTYPE(c_int, c_int)) ]

# Let's create an instance of that structure, with the example callbacks.
cbs = struct_cbs( (CFUNCTYPE(c_int, c_char_p))(python_first),
                  (CFUNCTYPE(c_int, c_int))(python_second) )

# Load the dynamic library
libmycall = CDLL("./libmycall.so")

# Obtain the symbol (function in this case) from the library
mycall = libmycall.mycall
# Its return value is an int (default, so this is not necessary),
mycall.restype = c_int

# Do the C call
retval = mycall(byref(cbs))
print(b'mycall() returned %d.' % retval)
Run this example using e.g.
    python3 example.py
and the output will be
    b'python_first() called with "Hello, world!".'
    b'python_second() called with 42.'
    b'mycall() returned 0.'

If the structure contained basic ctypes types (c_int, c_char_p, etc.), they would not need the (CFUNCTYPE(rettype, argtypes...))(pythonfunc) construction when creating the structure instance; only the function type does.  Basic types can just be supplied as-is.
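
A small sketch of that difference, using a hypothetical structure (Settings, NotifyFn and double_it are invented for this illustration and do not come from any library):

```python
import ctypes

NotifyFn = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

class Settings(ctypes.Structure):
    # Hypothetical C equivalent:
    #   struct settings { int timeout; const char *label; int (*notify)(int); };
    _fields_ = [
        ("timeout", ctypes.c_int),
        ("label", ctypes.c_char_p),
        ("notify", NotifyFn),
    ]

def double_it(value):
    return 2 * value

# Keep a reference to the wrapped callback for as long as C code might call it.
notify_cb = NotifyFn(double_it)

# The plain int and bytes are accepted as-is; only the function needs wrapping.
settings = Settings(5, b"scan", notify_cb)

result = settings.notify(21)
print(settings.timeout, settings.label, result)
```

The int and the bytes string go straight into the constructor, while the Python function has to be wrapped in its function pointer type first.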

Again, everything you need for this is described in the Python 3 ctypes documentation.

When you get your mind around it (and the fact that the process is the python interpreter, which loads "copies" of the dynamic libraries needed into its memory –– instances of the same library in different processes cannot access each other), it really is very simple to do all the interfacing from the Python side, at least in Linux/Android/Macs.
If you use Windows, sorry; I do not, and I'd loathe trying to help without being able to verify my examples/suggestions first.
 

Online RoGeorge

  • Super Contributor
  • ***
  • Posts: 4007
  • Country: ro
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #9 on: January 16, 2022, 09:05:47 pm »
Wow, the complete example came in extremely useful, thank you!   :-+

It finally makes sense.  I'll go read the doc tutorials, too.  So far I've only compiled and run your example, then put the .so and .py sources side by side in the editor, read the comments, and I think this time I've got how it works without any doubt.  The example clarified it all.  Thanks a lot!




The OS, you didn't guess!  ;D

Left Windows behind some years ago and never looked back, trying almost all Linux distros, from Ubuntu to Arch or Gentoo.  Settled on Kubuntu LTS, for comfort.  Then I wanted to have disk snapshots, preferably ZFS, and after yet another long detour of trying distros, including OpenSUSE with BTRFS and also FreeBSD with native ZFS, I've settled on a mixture of experimental Ubuntu on ZFS root + KDE Plasma from Kubuntu (Kubuntu doesn't support ZFS root like Ubuntu).

That combination worked pretty well.  It even has a new tool, https://github.com/ubuntu/zsys, integrated in the OS (developed by Ubuntu); among other features, it makes automated snapshots and lets one either roll back from the Grub menus, or just boot into former snapshots without rollback.

Worked for a while just fine, then I clicked a link from the presumably Assange's dead man switch files on Wikileaks, and my nicely cobbled ZFS Kubuntu showed me the login screen out of nowhere, and never managed to recover it with all the ZFS and the automated snapshots.

Got quite scared, even made an SPI Flash reader from a Raspberry Pi to check the BIOS chips from the motherboard.  No idea what happened, but since I didn't repair the old OS yet, I kept using a micro SD card on which I've installed FreeBSD.

FreeBSD is not as user-friendly as Ubuntu, but the more I use it the more I like it.  It is closer to the old Unix style, no Ubuntu style bloatware, no systemd, and so far all the must-have programs from Linux are available in one way or another in FreeBSD, too.

I've been using FreeBSD daily for a month now, and so far only VirtualBox had a missing part (no VirtualBox PUEL drivers for USB 2 or 3, so only USB 1 for VB machines).  There is a mode to run Linux binaries in FreeBSD, and that might be a workaround, but I don't know yet how that works.

At this point I don't even know if I want to go back to Ubuntu.  I like FreeBSD a lot, and their doc pages are better.

TL;DR - by serendipity I'm running FreeBSD, and your example was no problem for FreeBSD, the compile line was the same.  GCC was not found at first try, so I typed "pkg install gcc", then it worked.  :D

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 3708
  • Country: fi
    • My home page and email address
Re: Is a C struct flatten in memory (Python wrap for a C lib)?
« Reply #10 on: January 17, 2022, 07:32:10 am »
Quote from: RoGeorge on January 16, 2022, 09:05:47 pm
TL;DR - by serendipity I'm running FreeBSD, and your example was no problem for FreeBSD, the compile line was the same.  GCC was not found at first try, so I typed "pkg install gcc", then it worked.  :D
:-+

You can also replace gcc with clang, or even cc (the default C compiler), and it should work just fine on all systems using ELF dynamic libraries and a gcc/clang-like interface for specifying the library name: Linux, Android, FreeBSD, OpenBSD, and most other BSD variants.  MacOS is a partial exception: it uses Mach-O rather than ELF, and as far as I know its linker does not accept -soname, so drop that option there (-install_name is the rough equivalent).  (The -Wl,-soname,libname.so records the library name into the generated binary; as if one passed -soname libname.so to the ld linker.  It is not crucial, only recommended.)

In my Makefiles that generate dynamic libraries, I use the recipe
    libname.so: sources-or-object-files-used
        $(CC) $(CFLAGS) -fPIC -shared $^ -Wl,-soname,$@ $(LDFLAGS) -o $@
and it almost always suffices.  (If I recall correctly, symbol visibility may in some cases require some tricks; -rdynamic adds all symbols to the dynamic symbol table, and can be useful.)  Sometimes the target is libname.so.$(VERSION), though, although I like to set the name and symbolic links (from major and default versions) at install time instead.



The key point here is that in most cases, the original dynamic libraries with C bindings do not need any changes or any additions to be usable from Python, and the built-in ctypes interface is actually pretty easy to use to do all the interfacing from Python.
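
Applied to the original question, a ctypes wrapper for lxi_discover might look roughly like the sketch below. It has not been run against liblxi, and several things in it are assumptions to check against lxi.h: that find_library("lxi") locates the library, that lxi_discover_t can be passed as a plain int (0, as in the call quoted earlier in the thread), what unit the timeout is in, and whether the library needs to be initialised first. The names LxiInfo, make_info and discover are made up here.

```python
import ctypes
from ctypes.util import find_library

# Callback prototypes, mirroring the lxi_info_t members quoted earlier in the thread.
BroadcastFn = ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_char_p)
DeviceFn = ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_char_p)
ServiceFn = ctypes.CFUNCTYPE(
    None, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int
)

class LxiInfo(ctypes.Structure):
    # Hypothetical mirror of lxi_info_t; field order must match the C struct.
    _fields_ = [("broadcast", BroadcastFn), ("device", DeviceFn), ("service", ServiceFn)]

def make_info(devices, services):
    """Build an LxiInfo whose callbacks append to the given Python lists."""
    def on_broadcast(address, interface):
        pass  # nothing to record for broadcasts in this sketch

    def on_device(address, device_id):
        devices.append((address.decode("ascii", "replace"),
                        device_id.decode("ascii", "replace")))

    def on_service(address, device_id, service, port):
        services.append((address.decode("ascii", "replace"),
                         device_id.decode("ascii", "replace"),
                         service.decode("ascii", "replace"), port))

    # As in the example above, the structure keeps the wrapped callbacks alive.
    return LxiInfo(BroadcastFn(on_broadcast), DeviceFn(on_device), ServiceFn(on_service))

def discover(timeout, discover_type=0, library_name=None):
    """Untested sketch: call lxi_discover() and return (status, devices, services)."""
    path = library_name or find_library("lxi")  # assumption: library is installed as "lxi"
    if path is None:
        raise OSError("liblxi not found")
    lib = ctypes.CDLL(path)
    # Assumption: lxi_discover_t (an enum) is passed like an int.
    lib.lxi_discover.argtypes = [ctypes.POINTER(LxiInfo), ctypes.c_int, ctypes.c_int]
    lib.lxi_discover.restype = ctypes.c_int
    devices, services = [], []
    info = make_info(devices, services)
    status = lib.lxi_discover(ctypes.byref(info), timeout, discover_type)
    return status, devices, services
```

Nothing here touches the library until discover() is called, so the structure and callbacks can be exercised on a machine without liblxi installed.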

You should, of course, write a Python module (with minimal interface code, as shown in my example), to Pythonize the interface, preferably hiding any OS-specific quirks.  That way, one can later wrap the module (in if sys.platform == "platform": for each OS –– "aix", "linux", "freebsd", "win32", "cygwin", or "darwin" –– and then do the OS-specific stuff) if OS-specific quirks are needed.
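
One way such a module could hide a platform quirk is sketched below. The helper name library_filename and the file naming rules are illustrative assumptions, not taken from python-liblxi. Note that sys.platform can carry a version suffix on some systems (FreeBSD reports values such as "freebsd13"), so prefix matching is safer than equality there.

```python
import sys

def library_filename(stem, platform=None):
    """Guess a shared library file name for the given platform (illustrative only)."""
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return f"{stem}.dll"
    if platform == "darwin":
        return f"lib{stem}.dylib"
    # Linux, FreeBSD ("freebsd13", ...), and other ELF systems.
    return f"lib{stem}.so"

print(library_filename("mycall"))
```

The rest of the module can then call something like ctypes.CDLL(library_filename("mycall")) without caring which OS it runs on.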

The ctypes approach does mean that the Python code needs to describe the C interfaces it wants to use, because (on ELF-based systems) the dynamic libraries do not describe them, only the symbol name.  GObject introspection files (gir packages) do contain those, and the Python gi module can use and import them.  This is why you only need e.g.
    import gi
    gi.require_version("Gtk", "3.0")
    from gi.repository import Gtk
to use Gtk+3.0 in Python, without having to deal with ctypes directly.  (In Debian derivatives, the library itself is in package libgtk-3-0, and the introspection files are in gir1.2-gtk-3.0, usually installed at /usr/lib/hwarch/girepository-1.0/library.typelib.  My system has 130 of them installed.)

Since lxi-tools already uses Gtk, I do believe it would make sense to just add GObject introspection to liblxi.  I do see you've already started on the Python bindings two weeks ago, but maintaining it separately from the C sources is more work than just letting the introspection tools do it automatically.  This does add one more build dependency, on gobject-introspection, but it is only a build-time dependency, not a run-time dependency.  How this is done in real life is described at https://gi.readthedocs.io/en/latest/.  Essentially, when the dynamic liblxi library is compiled, g-ir-scanner is run to extract the type information from the C sources and header files, which is then compiled into the typelib file, Liblxi-1.13.typelib, using g-ir-compiler.  The typelib file is packaged separately (gir1.2-liblxi-1.13), providing the GObject introspection.  When that is installed, to use liblxi, one would just do
    import gi
    gi.require_version("Liblxi", "1.13")
    from gi.repository import Liblxi
The obvious benefit of this approach is automation.  The downside is that if the bindings needed Python logic (currently conversion to/from ASCII), those would still have to be implemented as a Python module... but changes and additions to the C library interface would be immediately reflected in the GI typelib files, and accessible from Python.

Now, I do not have any Test Equipment that could use LXI, so I cannot really tell if a separate Python interface module or GObject introspection files make more sense.
My own approach would be to test both first on my own machine (which I cannot do, lacking any LXI equipment!), and if the GI works without issues as-is, then contact the project maintainers/contributors and suggest the change in the form of a suggested patch and an example.

I do not normally do GI on libraries that have no connection to GObject (because, uh, no reason), but GTK+ is a GObject-based library, so it does kinda sorta make sense here.
GI is most useful when the library interface is still evolving, as it keeps the foreign language interface –– not just Python, mind you! –– in sync with the C implementation without explicit developer effort, only testing needed to make sure nothing breaks accidentally.  A simple way to think about it is to consider the GI Python module a way to automate what the ctypes module does, but using information gathered from the actual C sources.
 

