内存分配
MNN中有三种内存分配器:DefaultAllocator
,BufferAllocator
RecurseAllocator
.Recurse分配器,实际上只是BufferAlloator的简单封装,因此这里我们只讨论前两种.
DefaultAllocator
class DefaultAllocator : public BufferAllocator::Allocator {
public:
DefaultAllocator() {
// Do nothing
}
virtual ~ DefaultAllocator() {
// Do nothing
}
virtual std::pair<void*, size_t> onAlloc(size_t size, size_t align) {
return std::make_pair(MNNMemoryAllocAlign(size, MNN_MEMORY_ALIGN_DEFAULT), 0);
}
virtual void onRelease(std::pair<void*, size_t> ptr) {
MNN_ASSERT(ptr.second == 0);
MNNMemoryFreeAlign(ptr.first);
}
};
DefaultAllocator
提供了底层的malloc
实现,并提供了基本的内存对齐功能.onAlloc生成一个Pair\<内存指针,0>
MNNMemoryAllocAlign
的实现如下:
static inline void **alignPointer(void **ptr, size_t alignment) {
return (void **)((intptr_t)((unsigned char *)ptr + alignment - 1) & -alignment);
}
extern "C" void *MNNMemoryAllocAlign(size_t size, size_t alignment) {
MNN_ASSERT(size > 0);
#ifdef MNN_DEBUG_MEMORY
return malloc(size);
#else
void **origin = (void **)malloc(size + sizeof(void *) + alignment);
MNN_ASSERT(origin != NULL);
if (!origin) {
return NULL;
}
void **aligned = alignPointer(origin + 1, alignment);
aligned[-1] = origin;
return aligned;
#endif
}
首先,(void **)malloc(size + sizeof(void *) + alignment)
用于分配原始内存,sizeof(void*)是用于存储origin
地址,即 aligned[-1] = origin
, 多申请alignment是保证对齐之后仍然最少有size内存,强转成(void* )是为了传给alignPointer时直接+1跳过开始的部分保留给以后.alignPointer的实现是为了将地址对齐(这块上古魔术转换我也不太理解,看上去是NCNN的时代就在用了)
BufferAllocator
BufferAllocator的用途在于池化内存,每块内存优先从池中分配,释放内存时回归内存池.核心逻辑如下:
std::pair<void*, size_t> BufferAllocator::alloc(size_t size, bool separate, size_t align) {
#ifdef DUMP_USAGE
auto memoryUsed = size / 1024.0f / 1024.0f;
MNN_PRINT("Alloc: %f\n", memoryUsed);
#endif
if (0 == align) {
align = mAlign;
}
std::pair<void*, size_t> pointer;
// reuse if possible
if (!separate) {
if (nullptr != mCurrentFreeList) {
pointer = getFromFreeList(mCurrentFreeList, size, false, align);
}
if (nullptr != pointer.first) {
return pointer;
}
pointer = getFromFreeList(&mFreeList, size, true, align);
if (nullptr != pointer.first) {
return pointer;
}
}
// alloc otherwise
pointer = mAllocator->onAlloc(size, align);
if (nullptr == pointer.first) {
return pointer;
}
mTotalSize += size;
// save node
SharedPtr<Node> node(new Node);
node->size = size;
node->pointer = pointer;
mUsedList[pointer] = node;
node->outside = mAllocator.get();
MNN_ASSERT(pointer.second % align == 0);
#ifdef DUMP_USAGE
MNN_PRINT("mTotalSize: %f\n", mTotalSize / 1024.0f / 1024.0f);
#endif
return pointer;
}
当申请分配内存时,Allocator首先从自己的线程中的mCurrentFreeList空闲列表中获取(这是出于多线程数据竞争考虑),若取不到,再从全局的mFreeList中获取. 取出池中的内存的逻辑为给定一个内存大小数目,返回内存池列表(<Size, Node>)中Size大于指定内存的节点.另外还应思考一个问题,池中的内存的Align可能与请求的不同,因此应当对这种情况做出考量并修正内存地址.
从池中取出内存后,Allocator还会检查余下的空间是否足够再次分配,如果不足够或指定不允许再次分配内存,则直接返回.如果允许Split,则将父节点切分成两个子节点,将符合要求的一份子节点返回,剩下一份空闲内存块保留.需要考虑节点有父节点的情况,因此在切分子节点时增加父节点引用计数.当子节点被回收时,减小引用计数.当引用计数归0时,即表明所有子节点都已被回收,被切分的父节点会回归整块.
std::pair<void*, size_t> BufferAllocator::getFromFreeList(FREELIST* list, size_t size, bool permiteSplit, size_t align) {
size_t realSize = size;
bool needExtraSize = mAlign % align != 0;
if (needExtraSize) {
realSize = size + align - 1;
}
// get node larger than size
auto x = list->lower_bound(realSize); //
if (x == list->end()) {
return std::make_pair(nullptr, 0);
}
auto pointer = x->second->pointer;
// Align offset
if (needExtraSize) {
size_t originOffset = pointer.second;
pointer.second = UP_DIV(originOffset, align) * align; //计算Align后的内存地址
realSize = size + pointer.second - originOffset; // 计算Align后的Size
}
if (permiteSplit && nullptr != x->second->parent.get()) {
x->second->parent->useCount += 1;
}
// uses up all aligned space
auto sizeAlign = UP_DIV(realSize, mAlign) * mAlign;
//下一个mAlign后的内存大小
if (sizeAlign >= x->first || (!permiteSplit)) {
mUsedList.insert(std::make_pair(pointer, x->second));
list->erase(x);
MNN_ASSERT(pointer.second % align == 0);
return pointer;
}
// split otherwise
SharedPtr<Node> first(new Node);
first->parent = x->second;
first->size = sizeAlign;
first->pointer = x->second->pointer;
mUsedList.insert(std::make_pair(pointer, first));
x->second->useCount += 1;
//将x->second作为父节点,生成新的内存块,内存起始为父节点内存尾地址的下一个mAlign.
SharedPtr<Node> second(new Node);
second->parent = x->second;
second->size = x->second->size - sizeAlign;
second->pointer.first = x->second->pointer.first;
second->pointer.second = x->second->pointer.second + sizeAlign;
list->erase(x);
//将内存块插入池中
list->insert(std::make_pair(second->size, second));
MNN_ASSERT(pointer.second % align == 0);
return pointer;
}
MNN将分配的小块内存封装为Node,Node携带了内存大小,指向内存的指针,父对象和分配器等信息.其中parent是指向此块内存从哪个大块内存中切割出来(若有),useCount则表示此内存所有子内存的分配情况.
class Node : public RefCount {
public:
~Node();
std::pair<void*, size_t> pointer;
SharedPtr<Node> parent = nullptr;
size_t size;
size_t useCount = 0;
Allocator* outside = nullptr;
};
当某块内存被Free时,会将node从UsedList中转移到FREELIST并调用Allocator的returnMemory逻辑,先检测父节点的useCount是否为0,如果为0,则依次从子元素开始回收子节点,将list中所有父元素的子节点都删除,并将父节点放回freelist,同时再向上检测在上一级内存的RefCount情况.
void BufferAllocator::returnMemory(FREELIST* listP, SharedPtr<Node> node, bool permitMerge) {
auto& list = *listP;
list.insert(std::make_pair(node->size, node));
// update parent use count
if (nullptr != node->parent.get() && permitMerge) {
auto parent = node->parent;
parent->useCount -= 1;
// merge if all subnodes were freed
auto needMerge = parent->useCount == 0;
while (needMerge) {
// collect all subnodes
for (auto iter = list.begin(); iter != list.end();) {
if (iter->second->parent.get() == parent.get()) {
iter = list.erase(iter);
continue;
}
iter++;
}
// do merge downside up
list.insert(std::make_pair(parent->size, parent));
needMerge = false;
if (parent->parent.get() != nullptr) {
parent = parent->parent;
parent->useCount -= 1;
needMerge = parent->useCount == 0;
}
}
}
}