LD_PRELOAD Trick

Muhammet Soytürk
3 min read · Nov 6, 2020

I have been working on a library that interposes certain functions of a program, without changing the program's source code, in order to collect statistics about memory transfers. I want to share a handy trick I found for achieving this interposition, commonly known as the “LD_PRELOAD trick”.

The main advantage of this “trick” is that you don’t need to change the source code of the given program. You just plug your shared library into any dynamically linked program with LD_PRELOAD and change the behavior of any function that the dynamic linker resolves at run time. Pretty cool, huh?

Refresher

Before I explain how this trick works, here is a reminder of what happens when you build and run a program.

Simplified Compilation and Execution Pipeline [1]
1. Compilation of your program

When you compile your code, the compiler produces object files. Each of them is mostly machine code, but it also contains symbols. These symbols are basically the names of global objects, functions, etc. The symbols that we care about in our case are function symbols.

2. Linkage

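The linker produces a complete executable by resolving the symbols against their definitions in the respective object files and libraries. For functions that live in shared libraries, the final binding is deferred until the program is loaded. You can see which symbols an object file contains with the nm command line utility.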

3. Loader instructs the OS how to start running the executable

The loader places the program, together with the shared libraries it needs, into the memory of your computer and prepares it for execution.

LD_PRELOAD Trick

This trick is related to the second step (linkage), since linking is what binds function symbols to their definitions. More precisely, it relies on the Unix dynamic linker/loader (ld.so), which resolves symbols from shared libraries when the program starts. What you do is ask ld.so to load your shared library before any other library, so that your definition of a function is found first.

For example, let’s say we have a small program char_to_int.c:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* atoi expects a string, so the digit is written as "9" */
    const char *a = "9";
    printf("%d\n", atoi(a));
    return 0;
}

We define a string holding the digit 9, turn it into an integer with the atoi function from stdlib.h, and print it. To compile char_to_int.c:

gcc -o char_to_int char_to_int.c

Let’s say we want to change the behavior of the atoi function: we want it to convert the given string to an integer and then multiply the result by 2. We can write our own shared library to achieve that. Let’s put its source in “interpose.c”.

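#define _GNU_SOURCE   /* needed for RTLD_NEXT */
#include <dlfcn.h>

int atoi(const char *c) {
    /* find the next definition of atoi after this library (the original one) */
    int (*fn)(const char *) = (int (*)(const char *))dlsym(RTLD_NEXT, "atoi");
    /* call the original and double its result */
    int result = fn(c);
    return result * 2;
}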

In our library, we implement the atoi function differently from the original. We first find the original function with dlsym. Then we call the original function and store its return value in the result variable. Finally, we return result multiplied by two. To compile our shared library:
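The version above looks up the original on every call and does not check whether dlsym succeeded. A slightly more defensive variant could look like the sketch below. The name real_atoi, the error message, and the choice to return 0 on failure are my own assumptions, and the lazy initialization is not thread-safe.

```c
#define _GNU_SOURCE   /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>

/* Pointer to the next definition of atoi in the lookup order
   (normally the one in the C library). Resolved on first use. */
static int (*real_atoi)(const char *) = NULL;

int atoi(const char *c)
{
    if (real_atoi == NULL) {
        real_atoi = (int (*)(const char *))dlsym(RTLD_NEXT, "atoi");
        if (real_atoi == NULL) {
            /* Returning 0 here is an arbitrary choice for this sketch. */
            const char *err = dlerror();
            fprintf(stderr, "interpose: cannot find atoi: %s\n",
                    err ? err : "symbol not found");
            return 0;
        }
    }
    /* Same behavior as before: call the original and double the result. */
    return real_atoi(c) * 2;
}
```

Caching the pointer saves a symbol lookup on every call, which starts to matter once the interposed function is called frequently.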

gcc -fPIC -shared -o interpose.so interpose.c -ldl

So, we have the compiled program “char_to_int” and our shared library “interpose.so”. The next step is to load our shared library before any other library (in our case, “interpose.so” is loaded before the C library, which provides the original atoi).

LD_PRELOAD=./interpose.so ./char_to_int

When you execute this command, the dynamic linker binds atoi to your definition instead of the one in the C library. If everything works as expected, atoi returns 18 instead of 9, so the program prints 18.

References:

[1] https://www.geeksforgeeks.org/how-does-a-c-program-executes/

Muhammet Soytürk

Computer science PhD student at Koç University. Interested in high performance computing, computer architecture and philosophy.