Diffstat (limited to 'drivers/nedmalloc/Readme.txt')
-rw-r--r-- drivers/nedmalloc/Readme.txt | 136
1 files changed, 0 insertions, 136 deletions
diff --git a/drivers/nedmalloc/Readme.txt b/drivers/nedmalloc/Readme.txt
deleted file mode 100644
index 85f9b6a0f1..0000000000
--- a/drivers/nedmalloc/Readme.txt
+++ /dev/null
@@ -1,136 +0,0 @@
-nedalloc v1.05 15th June 2008:
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-
-by Niall Douglas (http://www.nedprod.com/programs/portable/nedmalloc/)
-
-Enclosed is nedalloc, an alternative malloc implementation for multiple
-threads without lock contention based on dlmalloc v2.8.4. It is more
-or less a newer implementation of ptmalloc2, the standard allocator in
-Linux (which is based on dlmalloc v2.7.0) but also contains a per-thread
-cache for maximum CPU scalability.
-
-It is licensed under the Boost Software License which basically means
-you can do anything you like with it. This does not apply to the malloc.c.h
-file which remains copyright to others.
-
-It has been tested on win32 (x86), win64 (x64), Linux (x64), FreeBSD (x64)
-and Apple MacOS X (x86). It works very well on all of these and is very
-significantly faster than the system allocator on all of these platforms.
-
-By literally dropping in this allocator as a replacement for your system
-allocator, you can see real world improvements of up to three times in normal
-code!
-
-To use:
--=-=-=-
-Drop in nedmalloc.h, nedmalloc.c and malloc.c.h into your project.
-Configure using the instructions in nedmalloc.h. Run and enjoy.
-
-To test, compile test.c. It will run a comparison between your system
-allocator and nedalloc and tell you how much faster nedalloc is. It also
-serves as an example of usage.
-
-Notes:
--=-=-=
-If you want the very latest version of this allocator, get it from the
-TnFOX SVN repository at svn://svn.berlios.de/viewcvs/tnfox/trunk/src/nedmalloc
-
-Because of how nedalloc allocates an mspace per thread, it can cause
-severe bloating of memory usage under certain allocation patterns.
-You can substantially reduce this wastage by setting MAXTHREADSINPOOL
-or the threads parameter to nedcreatepool() to a fraction of the number of
-threads which would normally be in a pool at once. This will reduce
-bloating at the cost of an increase in lock contention. If allocated size
-is less than THREADCACHEMAX, locking is avoided 90-99% of the time and
-if most of your allocations are below this value, you can safely set
-MAXTHREADSINPOOL to one.
-
-You will suffer memory leakage unless you call neddisablethreadcache()
-per pool for every thread which exits. This is because nedalloc cannot
-portably know when a thread exits and thus when its thread cache can
-be returned for use by other code. Don't forget pool zero, the system pool.
-
-For C++ type allocation patterns (where the same sizes of memory are
-regularly allocated and deallocated as objects are created and destroyed),
-the threadcache always benefits performance. If however your allocation
-patterns are different, searching the threadcache may significantly slow
-down your code - as a rule of thumb, if cache utilisation is below 80%
-(see the source for neddisablethreadcache() for how to enable debug
-printing in release mode) then you should disable the thread cache for
-that thread. You can compile out the threadcache code by setting
-THREADCACHEMAX to zero.
-
-Speed comparisons:
--=-=-=-=-=-=-=-=-=
-See Benchmarks.xls for details.
-
-The enclosed test.c can do two things: it can be a torture test or a speed
-test. The speed test is designed to be a representative synthetic
-memory allocator test. It works by randomly mixing allocations with frees
-with half of the allocation sizes being a two power multiple less than
-512 bytes (to mimic C++ stack instantiated objects) and the other half
-being a simple random value less than 16Kb.
-
-The real world code results are from Tn's TestIO benchmark. This is a
-heavily multithreaded and memory intensive benchmark with a lot of branching
-and other stuff modern processors don't like so much. As you'll note, the
-test doesn't show the benefits of the threadcache mostly due to the saturation
-of the memory bus being the limiting factor.
-
-ChangeLog:
--=-=-=-=-=
-v1.05 15th June 2008:
- * { 1042 } Added error check for TLSSET() and TLSFREE() macros. Thanks to
-Markus Elfring for reporting this.
- * { 1043 } Fixed a segfault when freeing memory allocated using
-nedindependent_comalloc(). Thanks to Pavel Vozenilek for reporting this.
-
-v1.04 14th July 2007:
- * Fixed a bug with the new optimised implementation that failed to lock
-on a realloc under certain conditions.
- * Fixed lack of thread synchronisation in InitPool() causing pool corruption
- * Fixed a memory leak of thread cache contents on disabling. Thanks to Earl
-Chew for reporting this.
- * Added a sanity check for freed blocks being valid.
- * Reworked test.c into being a torture test.
- * Fixed GCC assembler optimisation misspecification
-
-v1.04alpha_svn915 7th October 2006:
- * Fixed failure to unlock thread cache list if allocating a new list failed.
-Thanks to Dmitry Chichkov for reporting this. Further thanks to Aleksey Sanin.
- * Fixed realloc(0, <size>) segfaulting. Thanks to Dmitry Chichkov for
-reporting this.
- * Made config defines #ifndef so they can be overridden by the build system.
-Thanks to Aleksey Sanin for suggesting this.
- * Fixed deadlock in nedprealloc() due to unnecessary locking of preferred
-thread mspace when mspace_realloc() always uses the original block's mspace
-anyway. Thanks to Aleksey Sanin for reporting this.
- * Made some speed improvements by hacking mspace_malloc() to no longer lock
-its mspace, thus allowing the recursive mutex implementation to be removed
-with an associated speed increase. Thanks to Aleksey Sanin for suggesting this.
- * Fixed a bug where allocating mspaces overran its max limit. Thanks to
-Aleksey Sanin for reporting this.
-
-v1.03 10th July 2006:
- * Fixed memory corruption bug in threadcache code which only appeared with >4
-threads and in heavy use of the threadcache.
-
-v1.02 15th May 2006:
- * Integrated dlmalloc v2.8.4, fixing the win32 memory release problem and
-improving performance still further. Speed is now up to twice the speed of v1.01
-(average is 67% faster).
- * Fixed win32 critical section implementation. Thanks to Pavel Kuznetsov
-for reporting this.
- * Wasn't locking mspace if all mspaces were locked. Thanks to Pavel Kuznetsov
-for reporting this.
- * Added Apple Mac OS X support.
-
-v1.01 24th February 2006:
- * Fixed multiprocessor scaling problems by removing sources of cache sloshing
- * Earl Chew <earl_chew <at> agilent <dot> com> sent patches for the following:
- 1. size2binidx() wasn't working for default code path (non x86)
- 2. Fixed failure to release mspace lock under certain circumstances which
- caused a deadlock
-
-v1.00 1st January 2006:
- * First release