MNN相关代码阅读笔记

内存分配

MNN中有三种内存分配器:DefaultAllocator,BufferAllocator RecurseAllocator.Recurse分配器,实际上只是BufferAlloator的简单封装,因此这里我们只讨论前两种.

DefaultAllocator

class DefaultAllocator : public BufferAllocator::Allocator {
public:
    DefaultAllocator() {
        // Do nothing
    }
    virtual ~ DefaultAllocator() {
        // Do nothing
    }
    virtual std::pair<void*, size_t> onAlloc(size_t size, size_t align) {
        return std::make_pair(MNNMemoryAllocAlign(size, MNN_MEMORY_ALIGN_DEFAULT), 0);
    }
    virtual void onRelease(std::pair<void*, size_t> ptr) {
        MNN_ASSERT(ptr.second == 0);
        MNNMemoryFreeAlign(ptr.first);
    }
};

DefaultAllocator提供了底层的malloc 实现,并提供了基本的内存对齐功能.onAlloc生成一个Pair\<内存指针,0>

MNNMemoryAllocAlign的实现如下:

static inline void **alignPointer(void **ptr, size_t alignment) {
    return (void **)((intptr_t)((unsigned char *)ptr + alignment - 1) & -alignment);
}
extern "C" void *MNNMemoryAllocAlign(size_t size, size_t alignment) {
    MNN_ASSERT(size > 0);

#ifdef MNN_DEBUG_MEMORY
    return malloc(size);
#else
    void **origin = (void **)malloc(size + sizeof(void *) + alignment);
    MNN_ASSERT(origin != NULL);
    if (!origin) {
        return NULL;
    }

    void **aligned = alignPointer(origin + 1, alignment);
    aligned[-1]    = origin;
    return aligned;
#endif
}

首先,(void **)malloc(size + sizeof(void *) + alignment) 用于分配原始内存,sizeof(void*)是用于存储origin地址,即 aligned[-1] = origin, 多申请alignment是保证对齐之后仍然最少有size内存,强转成(void* )是为了传给alignPointer时直接+1跳过开始的部分保留给以后.alignPointer的实现是为了将地址对齐(这块上古魔术转换我也不太理解,看上去是NCNN的时代就在用了)

BufferAllocator

BufferAllocator的用途在于池化内存,每块内存优先从池中分配,释放内存时回归内存池.核心逻辑如下:

std::pair<void*, size_t> BufferAllocator::alloc(size_t size, bool separate, size_t align) {
#ifdef DUMP_USAGE
    auto memoryUsed = size / 1024.0f / 1024.0f;
    MNN_PRINT("Alloc: %f\n", memoryUsed);
#endif
    if (0 == align) {
        align = mAlign;
    }
    std::pair<void*, size_t> pointer;
    // reuse if possible
    if (!separate) {
        if (nullptr != mCurrentFreeList) {
            pointer = getFromFreeList(mCurrentFreeList, size, false, align);
        }
        if (nullptr != pointer.first) {
            return pointer;
        }
        pointer = getFromFreeList(&mFreeList, size, true, align);
        if (nullptr != pointer.first) {
            return pointer;
        }
    }

    // alloc otherwise
    pointer = mAllocator->onAlloc(size, align);
    if (nullptr == pointer.first) {
        return pointer;
    }
    mTotalSize += size;

    // save node
    SharedPtr<Node> node(new Node);
    node->size         = size;
    node->pointer      = pointer;
    mUsedList[pointer] = node;
    node->outside      = mAllocator.get();
    MNN_ASSERT(pointer.second % align == 0);

#ifdef DUMP_USAGE
    MNN_PRINT("mTotalSize: %f\n", mTotalSize / 1024.0f / 1024.0f);
#endif
    return pointer;
}

当申请分配内存时,Allocator首先从自己的线程中的mCurrentFreeList空闲列表中获取(这是出于多线程数据竞争考虑),若取不到,再从全局的mFreeList中获取. 取出池中的内存的逻辑为给定一个内存大小数目,返回内存池列表(<Size, Node>)中Size大于指定内存的节点.另外还应思考一个问题,池中的内存的Align可能与请求的不同,因此应当对这种情况做出考量并修正内存地址.

从池中取出内存后,Allocator还会检查余下的空间是否足够再次分配,如果不足够或指定不允许再次分配内存,则直接返回.如果允许Split,则将父节点切分成两个子节点,将符合要求的一份子节点返回,剩下一份空闲内存块保留.需要考虑节点有父节点的情况,因此在切分子节点时增加父节点引用计数.当子节点被回收时,减小引用计数.当引用计数归0时,即表明所有子节点都已被回收,被切分的父节点会回归整块.

std::pair<void*, size_t> BufferAllocator::getFromFreeList(FREELIST* list, size_t size, bool permiteSplit, size_t align) {
    size_t realSize = size;
    bool needExtraSize = mAlign % align != 0;
    if (needExtraSize) {
        realSize = size + align - 1;
    }
    // get node larger than size
    auto x = list->lower_bound(realSize); //
    if (x == list->end()) {
        return std::make_pair(nullptr, 0);
    }

    auto pointer = x->second->pointer;
    // Align offset
    if (needExtraSize) {
        size_t originOffset = pointer.second;
        pointer.second = UP_DIV(originOffset, align) * align; //计算Align后的内存地址
        realSize = size + pointer.second - originOffset; // 计算Align后的Size
    }
    if (permiteSplit && nullptr != x->second->parent.get()) {
        x->second->parent->useCount += 1; 
    }

    // uses up all aligned space
    auto sizeAlign = UP_DIV(realSize, mAlign) * mAlign;
  //下一个mAlign后的内存大小
    if (sizeAlign >= x->first || (!permiteSplit)) {
        mUsedList.insert(std::make_pair(pointer, x->second));
        list->erase(x);
        MNN_ASSERT(pointer.second % align == 0);
        return pointer;
    }

    // split otherwise
    SharedPtr<Node> first(new Node);
    first->parent  = x->second;
    first->size    = sizeAlign;
    first->pointer = x->second->pointer;
    mUsedList.insert(std::make_pair(pointer, first));
    x->second->useCount += 1;
  //将x->second作为父节点,生成新的内存块,内存起始为父节点内存尾地址的下一个mAlign.
    SharedPtr<Node> second(new Node);
    second->parent  = x->second;
    second->size    = x->second->size - sizeAlign;
    second->pointer.first = x->second->pointer.first;
    second->pointer.second = x->second->pointer.second + sizeAlign;
    list->erase(x);
  //将内存块插入池中
    list->insert(std::make_pair(second->size, second));
    MNN_ASSERT(pointer.second % align == 0);
    return pointer;
}

MNN将分配的小块内存封装为Node,Node携带了内存大小,指向内存的指针,父对象和分配器等信息.其中parent是指向此块内存从哪个大块内存中切割出来(若有),useCount则表示此内存所有子内存的分配情况.

    class Node : public RefCount {
    public:
        ~Node();
        std::pair<void*, size_t> pointer;
        SharedPtr<Node> parent = nullptr;
        size_t size;
        size_t useCount = 0;
        Allocator* outside = nullptr;
    };

当某块内存被Free时,会将node从UsedList中转移到FREELIST并调用Allocator的returnMemory逻辑,先检测父节点的useCount是否为0,如果为0,则依次从子元素开始回收子节点,将list中所有父元素的子节点都删除,并将父节点放回freelist,同时再向上检测在上一级内存的RefCount情况.

void BufferAllocator::returnMemory(FREELIST* listP, SharedPtr<Node> node, bool permitMerge) {
    auto& list = *listP;
    list.insert(std::make_pair(node->size, node));
    // update parent use count
    if (nullptr != node->parent.get() && permitMerge) {
        auto parent = node->parent;
        parent->useCount -= 1;

        // merge if all subnodes were freed
        auto needMerge = parent->useCount == 0;
        while (needMerge) {
            // collect all subnodes
            for (auto iter = list.begin(); iter != list.end();) {
                if (iter->second->parent.get() == parent.get()) {
                    iter = list.erase(iter);
                    continue;
                }
                iter++;
            }

            // do merge downside up
            list.insert(std::make_pair(parent->size, parent));
            needMerge = false;
            if (parent->parent.get() != nullptr) {
                parent = parent->parent;
                parent->useCount -= 1;
                needMerge = parent->useCount == 0;
            }
        }
    }
}
Edit with markdown