malloc not calling into your library, I think the solution in that case is to simply modify the library that's doing that. I don't know if you have access to the source code, but that's really the only option that you've got, and it's what I've had to do in quite a few cases.
pthread_setspecific in some cases) every time there was an allocation. This didn't affect the vast majority of benchmarks, as evidenced by extensive benchmarking, but it did affect one (1) obnoxious Fortran application which allocates at a totally-scientific rate of 2 bajillion allocations per second.
pthread_setspecific if you care about performance and need thread-local storage.
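To make the thread-local storage point concrete, here is a minimal sketch that keeps a per-thread allocation counter two ways: through POSIX thread-specific data, and through a compiler-level thread-local variable. The counter and the names `count_alloc_tsd` and `count_alloc_tls` are hypothetical, made up for illustration; this is not the code from the allocator discussed above, and it assumes a C11 compiler (`_Thread_local`; older GCC/Clang spell it `__thread`).

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Variant A: POSIX thread-specific data.
 * Every update goes through two library calls. */
static pthread_key_t count_key;
static pthread_once_t count_key_once = PTHREAD_ONCE_INIT;

static void make_count_key(void) {
    int rc = pthread_key_create(&count_key, NULL);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
}

/* Hypothetical allocation hook: bump this thread's counter via TSD.
 * The counter is stored directly in the void* slot, which starts as NULL. */
static uintptr_t count_alloc_tsd(void) {
    pthread_once(&count_key_once, make_count_key);
    uintptr_t n = (uintptr_t)pthread_getspecific(count_key) + 1;
    pthread_setspecific(count_key, (void *)n);
    return n;
}

/* Variant B: compiler-level thread-local storage.
 * The variable is zero-initialized separately in each thread. */
static _Thread_local uintptr_t tls_count;

/* Same hypothetical hook, but the update is an ordinary increment. */
static uintptr_t count_alloc_tls(void) {
    return ++tls_count;
}
```

Both functions return the calling thread's running count, so they can be swapped for each other in a hot path. Whether variant B is actually faster depends on the platform and on how the compiler implements thread-local access, so it is worth measuring on the workload you care about.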
jemalloc finished in 7 minutes, and I killed my run after 2.5 hours.