Know What Your Linker Knows
I just had a very frustrating experience getting my new project to link to Bullet Physics.
When something causes me this much frustration, I usually learn something from it. The solution tends to be obvious once I find it, but it takes a lot of trial and error to get there.
The more of these tedious lessons I learn, the more wisdom I gain to avoid them in the future (hopefully).
I hope that you, reading this, will learn from my mistake and avoid the frustration.
The Problem
My task was to solve the following linker error:
Link spargus_vehicle_prototype
src/Main.o: In function `btDantzigSolver::solveMLCP(btMatrixX<float> const&, btVectorX<float> const&, btVectorX<float>&, btVectorX<float> const&, btVectorX<float> const&, btAlignedObjectArray<int> const&, int, bool)':
Main.cpp:(.text._ZN15btDantzigSolver9solveMLCPERK9btMatrixXIfERK9btVectorXIfERS5_S7_S7_RK20btAlignedObjectArrayIiEib[_ZN15btDantzigSolver9solveMLCPERK9btMatrixXIfERK9btVectorXIfERS5_S7_S7_RK20btAlignedObjectArrayIiEib]+0x5ff):
undefined reference to `btSolveDantzigLCP(int, float*, float*, float*, float*, int, float*, float*, int*, btDantzigScratchMemory&)'
It seemed simple enough at first: just find the Bullet library I was missing.
I tried several techniques for tracking down every Bullet shared object that was built, to make sure I was linking everything that had been compiled, but I was still getting the error. I knew I was successfully linking most of Bullet because the other structures I was using worked fine.
At this point, it's very important that you know you are actually running the link command you think you are running. I confirmed this via Jam's very helpful verbose output when there are errors. This wasn't the problem.
I eventually resorted to copying the exact link call (via make VERBOSE=1) which was being used to link the working Bullet example project. I knew that example used the code my project was tripping up on, so I thought it would solve the problem. Still no dice!
The Solution
Most everything in the standard compilation pipeline generates files. On Linux, those are .o, .so and .a files.
These files seem opaque, but there are tools which can tell you what their contents are. This makes sense because if they were completely opaque, there wouldn't be any way for the linker to associate one piece of text (a function call) with another (a definition). Those .o files are all the linker knows about your program.
objdump allows you to see what functions are defined in a .o file (it can do much more than that as well).
I found the file where the missing function was defined, then found that file's .o. I ran objdump to see which symbols were defined in that .o:
objdump -t -C btDantzigLCP.o
-t displays the symbol table; -C decodes C++ mangled function names.
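To see what -C is doing, here is a minimal sketch of the same decoding done from C++. It assumes GCC or Clang, whose runtime provides abi::__cxa_demangle in <cxxabi.h>. The helper name demangle is made up for illustration and is not part of Bullet or objdump.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (assumes a GCC or Clang toolchain)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: turn a mangled symbol name into readable C++.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // __cxa_demangle hands back a malloc'd buffer
    return result;
}
```

Calling demangle("_Z1fPf") gives f(float*), while demangle("_Z1fPd") gives f(double*). The parameter types are part of the symbol name, so to the linker these are two unrelated functions.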
There's my missing function:
0000000000004b10 g F .text 0000000000001431 btSolveDantzigLCP(int, double*, double*, double*, double*, int, double*, double*, int*, btDantzigScratchMemory&)
Let's compare that to the linker error:
undefined reference to `btSolveDantzigLCP(int, float*, float*, float*, float*, int, float*, float*, int*, btDantzigScratchMemory&)'
The linker cannot find the function my code is asking for because its parameters do not match: my code wants float* arguments, but the library defines the function with double*.
Let's look at the definition of that function:
bool btSolveDantzigLCP(int n, btScalar *A, btScalar *x, btScalar *b,
btScalar *outer_w, int nub, btScalar *lo, btScalar *hi,
int *findex, btDantzigScratchMemory &scratchMem)
That btScalar type must be typedef'd to something different from what my program uses. Let's look at its definition:
//The btScalar type abstracts floating point numbers, to easily switch between double and single floating point precision.
#if defined(BT_USE_DOUBLE_PRECISION)
typedef double btScalar;
//this number could be bigger in double precision
#define BT_LARGE_FLOAT 1e30
#else
typedef float btScalar;
//keep BT_LARGE_FLOAT*BT_LARGE_FLOAT < FLT_MAX
#define BT_LARGE_FLOAT 1e18f
#endif
I took a look at the compilation defines for the working example project, and sure enough, -DBT_USE_DOUBLE_PRECISION is defined explicitly. I added it to my compile commands, cleaned (there's a whole other lesson about cleaning…), and sure enough, it linked!
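The mismatch is easy to reproduce in miniature. The sketch below is not Bullet code: Scalar, scalarName and solveSignature are hypothetical names standing in for btScalar and the solver declaration. It shows that a typedef never reaches the linker. Only the underlying type does, so the same declaration produces a different signature depending on whether the macro was defined when the file was compiled.

```cpp
#include <string>

// Same pattern as the btScalar definition above, under a hypothetical name.
#if defined(BT_USE_DOUBLE_PRECISION)
typedef double Scalar;
#else
typedef float Scalar;
#endif

// Overload resolution sees through the typedef to the real type,
// just as name mangling does.
inline std::string scalarName(float) { return "float"; }
inline std::string scalarName(double) { return "double"; }

// Hypothetical stand-in for a declaration such as `bool solve(int n, Scalar* A)`.
// Builds the parameter list this translation unit would ask the linker for.
inline std::string solveSignature() {
    return "solve(int, " + scalarName(Scalar()) + "*)";
}
```

Compiled as-is, solveSignature() returns solve(int, float*). Compiled with -DBT_USE_DOUBLE_PRECISION, it returns solve(int, double*). A library built one way and a program built the other are asking for two different functions.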
I should've known to try it sooner, but it usually takes some beating into my head to learn a lesson.
Another example
I used a similar solution previously. That time, I was at work debugging a problem where the game's behavior seemed impossible. The game was complaining about a missing string, but according to the SVN version it was built from, that string should have been there.
I ended up opening the executable itself in Emacs and searching for the missing string in the string table. Sure enough, the executable did not have the string, so it couldn't have my modification. This was perplexing because the game's SVN version said it should have that commit.
I discovered that the game's builder (an automated machine I do not have access to) had local modifications to the file. No wonder the string was missing! This was very surprising to my boss and me, especially because it must've had those modifications for a week or two.
My boss reverted the modifications and all was right in the world again. We still don't know who made those modifications and then left them there.
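The string check from this story does not need Emacs. As a minimal sketch, the function below reads a file as raw bytes and reports whether a given string occurs anywhere in it. The name fileContains is hypothetical, and loading the whole file into memory assumes the binary is small enough for that.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: true if `needle` appears anywhere in the file's raw bytes.
// Returns false if the file cannot be opened.
bool fileContains(const std::string& path, const std::string& needle) {
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return false;
    }
    // Read the entire file; fine for a sketch, wasteful for very large binaries.
    std::string bytes((std::istreambuf_iterator<char>(file)),
                      std::istreambuf_iterator<char>());
    return bytes.find(needle) != std::string::npos;
}
```

This only finds strings stored verbatim as single-byte text. It would miss data that is compressed or stored as wide characters.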
Lesson learned
Don't use guess-and-check style methods to stumble your way from one linker error to another (or in my case, bash my head against the same one).
Know what data your linker is actually working with!
An Aside: Bullet Impressions
Other than this -DBT_USE_DOUBLE_PRECISION issue, getting Bullet set up was quite easy. I definitely appreciate libraries which compile and build with a single command. I didn't even have to run any dependency installation scripts.
Additionally, the Example Browser worked out of the box and gave me some motivation to continue with the integration.