1. Topic du Jour:
- Object (Entity) Systems
- Memory Management
- Resource Management in General
2. C++ issues vs. generic issues
- lots to do with playing around with C++ details (yuck!)
- especially in memory management (custom allocators, ...)
the yucky stuff is bending the syntax of the language so that the things you want to do look clean / stay hidden
- understanding the core issues in resource management is important no matter what the language used is
- even if memory is managed for you, you still need to think about it
The real question:
-> How do we use insider knowledge of our problem to make resource management perform better?
but this takes a little bit of setup
3. The Generic Problem
Phases of the uses of a resource:
- (1) Allocate resource
- (2) Initialize resource
- (3) Use resource
- (4) Finalize resource
- (5) Re-claim / re-use resources
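The five phases above can be pulled apart explicitly in C++. This is only a sketch: `Texture` and `lifecycle_demo` are made-up names, and real code would rarely spell the steps out this way.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new, ::operator new, ::operator delete
#include <string>
#include <utility>

// Hypothetical resource type, used only for illustration.
struct Texture {
    std::string name;
    explicit Texture(std::string n) : name(std::move(n)) {}
};

// Walks one object through the five phases and returns what "use" observed.
std::size_t lifecycle_demo() {
    void* raw = ::operator new(sizeof(Texture));  // allocate: raw bytes only
    Texture* t = new (raw) Texture("grass");      // initialize: run the constructor
    std::size_t len = t->name.size();             // use
    t->~Texture();                                // finalize: run the destructor
    ::operator delete(raw);                       // re-claim: storage returns to the heap
    return len;
}
```

A plain `new Texture("grass")` fuses the first two phases and `delete` fuses the last two, which is the kind of mixing the commentary below talks about.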
Commentary:
- #5 is not really part of "this usage" - kind of "the next life"
- in an infinite resource situation, might not care about #5
- do #2 and #4 get mixed in with #3, or with #1/#5?
- different paradigms might mix some of these steps
- resource acquisition is initialization
- finalization at explicit destruction
- finalization besides reclamation
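A minimal sketch of "resource acquisition is initialization", assuming a stand-in resource that is just a counter. `ScopedHandle` and `g_live_handles` are hypothetical names, not from any library.

```cpp
#include <cassert>

// Hypothetical stand-in for a scarce resource: we only count how many are live.
int g_live_handles = 0;

class ScopedHandle {
public:
    ScopedHandle()  { ++g_live_handles; }  // acquire and initialize in one step
    ~ScopedHandle() { --g_live_handles; }  // finalize and release in one step
    ScopedHandle(const ScopedHandle&) = delete;             // one owner per resource
    ScopedHandle& operator=(const ScopedHandle&) = delete;
};

// Returns how many handles were live while the local one was in scope.
int live_inside_scope() {
    ScopedHandle h;
    return g_live_handles;  // read before h is destroyed
}   // h goes out of scope here; the destructor releases the resource
```

The destructor runs on every exit path (return, exception), so finalization cannot be forgotten for stack-scoped objects.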
3.1 Resource Lifecycles
- Sometimes easy (stack scope objects)
- Sometimes really hard
- Sometimes want explicit control
- Sometimes have knowledge that can make the problem easier
4. Basic Approaches
- Fully Automatic (Garbage Collection) - Lisp, Java, Python
note: technically, CPython does most of its automatic memory management with reference counting, plus a cycle collector to catch reference cycles
- Semi-Automatic (Reference Counting (*) ) - implemented in systems
- Syntactic-Sugar Coated Manual (explicit new/delete) - C++
- Explicit Management (manual allocate/de-allocate) - Fortran
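A small sketch of the semi-automatic (reference counting) approach, using the standard `std::shared_ptr` and `std::weak_ptr` as one readily available implementation. `Mesh` and `refcount_demo` are made-up names for illustration.

```cpp
#include <cassert>
#include <memory>

struct Mesh { int vertex_count = 0; };  // hypothetical payload

// Returns the highest owner count observed.
long refcount_demo() {
    auto a = std::make_shared<Mesh>();   // one owner
    std::weak_ptr<Mesh> watcher = a;     // observes, does not own
    long peak = 0;
    {
        std::shared_ptr<Mesh> b = a;     // second owner: count goes to 2
        peak = a.use_count();
    }                                    // b goes away: count drops back to 1
    assert(a.use_count() == 1);
    a.reset();                           // last owner gone: Mesh is destroyed now
    assert(watcher.expired());
    return peak;
}
```

Counts alone never reach zero for objects that own each other in a cycle, which is one reason this approach is only "semi" automatic.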
4.1 The big losses of manual allocation:
- why waste our time on something the computer can do for us
- bugs! bugs! and more bugs!
- complexity! (build complex conventions and protocols to avoid leakage, manage responsibility) - yields BUGS
- complexity! (to avoid those bugs)
- performance (?) - yes, manual allocation may not perform well
- memory leaks
4.2 Why not automatic memory management?
A lot of smart computer scientists have figured this stuff out. Why do we still bother with it?
- difficult to integrate automatic memory management with languages that give open interface to memory
- difficult to retrofit into legacy languages and systems
- when those legacy languages and systems were built, tradeoffs were different
- automatic systems make it harder to use the knowledge you have about your problem
- explicit management (using specific knowledge of problem, making explicit policy choices) is best when resources get really scarce
-> And in the old days, all resources were really scarce
But, before we can really talk about this...
We need to understand:
- (1) Why is memory management hard (what does it mean to do it well)?
- (2) What kinds of knowledge might we have about our system that we can use to address #1 "better"?
- (3) What kinds of things might we do to actually do #2 (this is the language-specific part)?
5. Why is memory management hard?
- Shuffling / relocation
- fragmentation
- efficient shutdown
- efficient allocation/start up
- efficient access (data layout)
- unpredictability
- dealing with fixed sized pools (caches/finite texture memory)
- danglers
- leaks
- serialization
- predictable layout can be used to simplify many things
6. How does malloc work?
- header blocks (guard blocks) - needed because free() is not told the size of the block
- free lists
- search for free blocks
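The three ingredients above (a header in front of each block, a free list, a search over it) can be sketched as a toy first-fit allocator. This is an illustration under simplifying assumptions, not how any real malloc is written: `ToyHeap` is a made-up name, the heap is a fixed 4096-byte buffer, and neighbouring free blocks are never merged.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class ToyHeap {
    // Header stored just before every block, so deallocate() can find the size.
    struct alignas(std::max_align_t) Header {
        std::size_t size;  // payload bytes that follow this header
        Header* next;      // next free block (meaningful only while free)
    };
    alignas(std::max_align_t) unsigned char buffer_[4096];
    Header* free_list_;

    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
public:
    // Initially the whole buffer is one big free block.
    ToyHeap() : free_list_(new (buffer_) Header{sizeof(buffer_) - sizeof(Header), nullptr}) {}
    ToyHeap(const ToyHeap&) = delete;
    ToyHeap& operator=(const ToyHeap&) = delete;

    void* allocate(std::size_t n) {
        n = round_up(n ? n : 1);
        Header** link = &free_list_;
        // First fit: walk the free list until a block is big enough.
        for (Header* h = free_list_; h; link = &h->next, h = h->next) {
            if (h->size < n) continue;
            if (h->size >= n + sizeof(Header) + alignof(std::max_align_t)) {
                // Split: the tail of this block becomes a new, smaller free block.
                unsigned char* tail = reinterpret_cast<unsigned char*>(h + 1) + n;
                *link = new (tail) Header{h->size - n - sizeof(Header), h->next};
                h->size = n;
            } else {
                *link = h->next;  // hand out the whole block
            }
            return h + 1;         // payload starts right after the header
        }
        return nullptr;           // nothing on the free list fits
    }

    void deallocate(void* p) {
        if (!p) return;
        Header* h = static_cast<Header*>(p) - 1;  // step back to the header
        h->next = free_list_;                     // push onto the free list
        free_list_ = h;
    }
};

bool toy_heap_demo() {
    ToyHeap heap;
    void* a = heap.allocate(100);
    void* b = heap.allocate(200);
    if (!a || !b || a == b) return false;
    heap.deallocate(a);
    void* c = heap.allocate(64);  // first fit reuses the block a just vacated
    return c == a;
}
```

Because freed blocks are never merged here, repeated allocate/deallocate cycles chop the buffer into small pieces, which is the fragmentation problem listed in section 5.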
6.1 Why is malloc inefficient?
- distributes the work into little packets - can't amortize
- no global perspective
- doesn't really know much about the data
7. What can we learn from Garbage Collectors?
- free list
- object list / root set
- reachability analysis
- generations
- need to know what is a pointer (and what isn't)
- object mobility (since we know all the pointers)
- mark/sweep vs. copying
- the big performance win of an entire disconnected patch...
- tagged pointers
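A toy mark/sweep collector showing the object list, root set, and reachability analysis from the list above. It is a sketch under a strong assumption: every pointer an object holds lives in its `refs` vector, so the collector knows exactly what is a pointer. `Obj`, `ToyGC`, and `gc_demo` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Obj {
    std::vector<Obj*> refs;  // outgoing pointers, all visible to the collector
    bool marked = false;
};

class ToyGC {
    std::vector<std::unique_ptr<Obj>> objects_;  // object list: everything allocated
public:
    std::vector<Obj*> roots;                     // root set

    Obj* make() {
        objects_.push_back(std::make_unique<Obj>());
        return objects_.back().get();
    }

    // Returns the number of objects reclaimed.
    std::size_t collect() {
        // Mark: reachability analysis from the roots, using an explicit stack.
        std::vector<Obj*> work = roots;
        while (!work.empty()) {
            Obj* o = work.back();
            work.pop_back();
            if (o->marked) continue;
            o->marked = true;
            for (Obj* r : o->refs) work.push_back(r);
        }
        // Sweep: keep marked objects (clearing the mark), drop the rest.
        const std::size_t before = objects_.size();
        std::vector<std::unique_ptr<Obj>> survivors;
        for (auto& p : objects_) {
            if (p->marked) {
                p->marked = false;
                survivors.push_back(std::move(p));
            }
        }
        objects_ = std::move(survivors);  // unmarked objects are destroyed here
        return before - objects_.size();
    }
};

std::size_t gc_demo() {
    ToyGC gc;
    Obj* a = gc.make();
    Obj* b = gc.make();
    Obj* c = gc.make();
    Obj* d = gc.make();
    a->refs.push_back(b);   // a -> b, reachable from the root
    c->refs.push_back(d);   // c <-> d form a cycle nobody else points to
    d->refs.push_back(c);
    gc.roots.push_back(a);
    return gc.collect();    // the disconnected c/d cycle is reclaimed
}
```

Unlike reference counting, the unreachable cycle is collected, because only reachability from the roots matters.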
8. What might we "know" to help us out?
- understanding object lifetimes
- understanding object connections
- understanding allocation patterns
- understanding the common types / important cases
- understanding when we can afford to spend the time
- it's OK to garbage collect now, nothing important is happening
9. Tips and Tricks
9.1 Weak references (indirection)
- allows for relocatability
- allows for late binding / instantiation
- load/unload/cache
- useful (and repeatable) numberings (serializability, debugging, ...)
- downsides:
- lock/unlock - scope issues
- indirection costs
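One way to sketch the indirection idea is a handle table: clients hold a small index instead of a raw pointer, and only the table knows where the object currently lives. `Handle`, `HandleTable`, and `handle_demo` are hypothetical names, and this version never reuses slots.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Handle { std::size_t index; };  // a stable, repeatable number

template <typename T>
class HandleTable {
    std::vector<T*> slots_;  // handle index -> current location (or nullptr)
public:
    Handle bind(T* p) {
        slots_.push_back(p);
        return Handle{slots_.size() - 1};
    }
    // Every access pays for one extra lookup (the indirection cost).
    // The returned pointer is only good until the next relocate/unload.
    T* get(Handle h) const { return slots_[h.index]; }

    // Relocation or reload: only the table entry changes, handles stay valid.
    void relocate(Handle h, T* new_location) { slots_[h.index] = new_location; }

    // Unload: later lookups see nullptr instead of a dangling pointer.
    void unload(Handle h) { slots_[h.index] = nullptr; }
};

int handle_demo() {
    HandleTable<int> table;
    int old_home = 7;
    Handle h = table.bind(&old_home);

    int new_home = old_home;        // "move" the object somewhere else
    table.relocate(h, &new_home);
    new_home = 8;
    int seen = *table.get(h);       // the handle follows the object

    table.unload(h);
    assert(table.get(h) == nullptr);
    return seen;
}
```

The lock/unlock downside shows up in `get`: a raw pointer obtained from it must not be kept across a relocation.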
9.2 Late Deletion
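The notes give no detail here, so the following is one possible reading of late deletion, stated as an assumption: requests to delete are queued and carried out later, at a point where nothing is using the objects. `World`, `Entity`, `destroy_later`, and `flush` are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Entity { int id = 0; };

class World {
    std::vector<std::unique_ptr<Entity>> live_;
    std::vector<const Entity*> doomed_;  // deletions requested but not yet done
public:
    Entity* spawn(int id) {
        live_.push_back(std::make_unique<Entity>());
        live_.back()->id = id;
        return live_.back().get();
    }
    // Only records the request; the object stays valid for now.
    void destroy_later(const Entity* e) { doomed_.push_back(e); }

    std::size_t live_count() const { return live_.size(); }

    // Called at a safe point (for example, end of frame). Returns how many died.
    std::size_t flush() {
        const std::size_t before = live_.size();
        auto is_doomed = [this](const std::unique_ptr<Entity>& e) {
            return std::find(doomed_.begin(), doomed_.end(), e.get()) != doomed_.end();
        };
        live_.erase(std::remove_if(live_.begin(), live_.end(), is_doomed), live_.end());
        doomed_.clear();
        return before - live_.size();
    }
};

std::size_t late_deletion_demo() {
    World w;
    Entity* a = w.spawn(1);
    w.spawn(2);
    w.destroy_later(a);
    assert(w.live_count() == 2);  // a is still alive and usable until flush()
    assert(a->id == 1);
    w.flush();
    return w.live_count();
}
```

Batching the deletions also lets the work be done at a moment when we can afford to spend the time.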
9.3 Some easy tricks
- pools of fixed sized objects
- don't need to remember how big they are (no guard blocks)
- regular layout (as an array) - good for debugging
- allocate / mass delete only
- caching / buffering / weak references
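A sketch of a pool of fixed sized objects, assuming a compile-time capacity. There are no per-block headers (every slot has the same size), the slots sit in one array, and `reset` gives the mass-delete behaviour. `FixedPool`, `Particle`, and `pool_demo` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

template <typename T, std::size_t N>
class FixedPool {
    static_assert(N > 0, "pool needs at least one slot");
    // A slot is either a link in the free list or storage for one T.
    union Slot {
        Slot* next_free;
        alignas(T) unsigned char storage[sizeof(T)];
    };
    Slot slots_[N];            // regular layout: one flat array
    Slot* free_head_ = nullptr;
public:
    FixedPool() { reset(); }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Mass delete: every slot becomes free again. Destructors are NOT run,
    // so this sketch assumes T is trivially destructible when reset is used.
    void reset() {
        free_head_ = nullptr;
        for (std::size_t i = N; i-- > 0;) {
            slots_[i].next_free = free_head_;
            free_head_ = &slots_[i];
        }
    }

    template <typename... Args>
    T* create(Args&&... args) {
        if (!free_head_) return nullptr;  // pool exhausted
        Slot* s = free_head_;
        free_head_ = s->next_free;        // pop: no search needed
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    void destroy(T* p) {
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);  // storage sits at offset 0 of the slot
        s->next_free = free_head_;             // push back onto the free list
        free_head_ = s;
    }
};

struct Particle { float x, y; };

bool pool_demo() {
    FixedPool<Particle, 2> pool;
    Particle* a = pool.create(Particle{1.0f, 2.0f});
    Particle* b = pool.create(Particle{3.0f, 4.0f});
    Particle* none = pool.create(Particle{0.0f, 0.0f});  // both slots are taken
    pool.destroy(a);
    Particle* c = pool.create(Particle{5.0f, 6.0f});     // reuses a's slot
    return b != nullptr && none == nullptr && c == a && c->x == 5.0f;
}
```

Allocation and release are both constant time here, in contrast to the free-list search a general-purpose allocator has to do.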