基于跳表实现的轻量级KV存储引擎项目总结_C

参考：https://github.com/youngyangyang04/Skiplist-CPP

项目介绍

KV存储引擎
众所周知，非关系型数据库 redis，以及levedb，rockdb其核心存储引擎的数据结构就是跳表。

本项目就是基于跳表实现的轻量级键值型存储引擎，使用C++实现。插入数据、删除数据、查询数据、数据展示、数据落盘、文件加载数据，以及数据库大小显示。

在随机写读情况下，该项目每秒可处理啊请求数（QPS）: 24.39w，每秒可处理读请求数（QPS）: 18.41w

项目存储文件

main.cpp 包含skiplist.h使用跳表进行数据 *** 作
skiplist.h 跳表核心实现
README.md 中文介绍
README-en.md 英文介绍
bin 生成可执行文件目录
makefile 编译脚本
store 数据落盘的文件存放在这个文件夹
stress_test_start.sh 压力测试脚本
LICENSE 使用协议

提供接口

insertElement（插入数据）
deleteElement（删除数据）
searchElement（查询数据）
displayList（展示已存数据）
dumpFile（数据落盘）
loadFile（加载数据）
size（返回数据规模）

存储引擎数据表现

插入 *** 作

跳表树高：18

采用随机插入数据测试：

插入数据规模（万条）	耗时（秒）
10	0.316763
50	1.86778
100	4.10648

每秒可处理写请求数（QPS）: 24.39w

取数据 *** 作

取数据规模（万条）	耗时（秒）
10	0.47148
50	2.56373
100	5.43204

每秒可处理读请求数（QPS）: 18.41w

项目运行方式

make            // complie demo main.cpp
./bin/main      // run

如果想自己写程序使用这个kv存储引擎，只需要在你的CPP文件中include skiplist.h 就可以了。

可以运行如下脚本测试kv存储引擎的性能（当然你可以根据自己的需求进行修改）

sh stress_test_start.sh

待优化

delete的时候没有释放内存
压力测试并不是全自动的
跳表的key用int型，如果使用其他类型需要自定义比较函数，当然把这块抽象出来更好
如果再加上一致性协议，例如raft就构成了分布式存储，再启动一个http server就可以对外提供分布式存储服务了

部分代码解析

项目涉及知识

函数模板、类模板
跳表结构的增、删、查以及 mutex的使用
压力测试、线程（pthread 和 chrono）

代码阅读
从main函数开始阅读，然后跳到头文件，最后压力测试
SkipList skipList(6); 定义了一个6层高度的跳表

头文件全局变量

node节点类

跳表类

该项目的跳表对Redis的跳表结构进行了一定的简化
Redis跳表
跳表数据结构：
代码实现：

本项目的跳表

节点
跳表

插入节点

template<typename K, typename V>
int SkipList<K, V>::insert_element(const K key, const V value) {
    
    mtx.lock();
    Node<K, V> *current = this->_header;

    // update 是个节点数组，用于后续插入位置的前向链接 和 后向链接
    Node<K, V> *update[_max_level+1];
    memset(update, 0, sizeof(Node<K, V>*)*(_max_level+1));

    // 寻找插入位置，并保存插入位置前面的节点
    // for进行高度遍历，while 行方向遍历
    for(int i = _skip_list_level; i >= 0; i--) {
        while(current->forward[i] != NULL && current->forward[i]->get_key() < key) {
            current = current->forward[i]; 
        }
        update[i] = current;
    }

    //  定位到 == key 或者 > key 的 节点
    current = current->forward[0];

    // 如果current节点存在且和key相等，提示，并解锁
    if (current != NULL && current->get_key() == key) {
        std::cout << "key: " << key << ", exists" << std::endl;
        mtx.unlock();
        return 1;
    }

    //否则，需要在update[0]和current node节点之间插入 [节点]
    if (current == NULL || current->get_key() != key ) {
        
        // 随机生成节点的高度
        int random_level = get_random_level();

        // 如果随机高度比当前的跳表的高度大，update数组在多余高出来的部分保存头节点指针
        if (random_level > _skip_list_level) {
            for (int i = _skip_list_level+1; i < random_level+1; i++) {
                update[i] = _header;
            }
            _skip_list_level = random_level;
        }

        // 用随机生成的高度 创建 insert node
        Node<K, V>* inserted_node = create_node(key, value, random_level);
        
        // node节点 forword 指向 所在位置后面的节点指针
        // node节点 所在位置前面的节点 指向node
        for (int i = 0; i <= random_level; i++) {
            inserted_node->forward[i] = update[i]->forward[i];
            update[i]->forward[i] = inserted_node;
        }
        std::cout << "Successfully inserted key:" << key << ", value:" << value << std::endl;
        _element_count ++;
    }
    mtx.unlock();
    return 0;
}

删除节点

template<typename K, typename V> 
void SkipList<K, V>::delete_element(K key) {

    mtx.lock();
    Node<K, V> *current = this->_header; 
    // update数组 用于保存 删除节点前面的节点 ，用于删除节点后的指向链接，步骤和插入节点类似
    Node<K, V> *update[_max_level+1];
    memset(update, 0, sizeof(Node<K, V>*)*(_max_level+1));

    // 寻找要删除的节点，并将该节点前面的节点保存到update数组中
    for (int i = _skip_list_level; i >= 0; i--) {
        while (current->forward[i] !=NULL && current->forward[i]->get_key() < key) {
            current = current->forward[i];
        }
        update[i] = current;
    }

    //  ==key  或者 > key 的元素节点
    current = current->forward[0];

    // 如果相等进行下面的 *** 作，不相等直接解锁 return
    if (current != NULL && current->get_key() == key) {
       
        // 从低级别开始删除每层的节点，
        for (int i = 0; i <= _skip_list_level; i++) {

            // 如果在第i曾，其下一个节点不是所要删除的节点直接break
            if (update[i]->forward[i] != current) 
                break;
            // 如果是要删除的节点 重新设置指针的指向，将其指向下一个节点的位置
            update[i]->forward[i] = current->forward[i];
        }

        // 减少没有元素的层，更新跳表的level
        while (_skip_list_level > 0 && _header->forward[_skip_list_level] == 0) {
            _skip_list_level --; 
        }

        std::cout << "Successfully deleted key "<< key << std::endl;
        _element_count --;
    }
    mtx.unlock();
    return;
}

搜索节点

// Search for element in skip list 
/*
                           +------------+
                           |  select 60 |
                           +------------+
level 4     +-->1+                                                      100
                 |
                 |
level 3         1+-------->10+------------------>50+           70       100
                                                   |
                                                   |
level 2         1          10         30         50|           70       100
                                                   |
                                                   |
level 1         1    4     10         30         50|           70       100
                                                   |
                                                   |
level 0         1    4   9 10         30   40    50+-->60      70       100
*/
template<typename K, typename V> 
bool SkipList<K, V>::search_element(K key) {

    std::cout << "search_element-----------------" << std::endl;
    Node<K, V> *current = _header;

    // for循环高度遍历，while循环水平遍历每一层。
    for (int i = _skip_list_level; i >= 0; i--) {
        while (current->forward[i] && current->forward[i]->get_key() < key) {
            current = current->forward[i];
        }
    }

    // 找到 > 或者 =  key的 节点
    current = current->forward[0];

    // 如果存在且相等，成功找到
    if (current and current->get_key() == key) {
        std::cout << "Found key: " << key << ", value: " << current->get_value() << std::endl;
        return true;
    }
    // 否则 没有找到 进行提示
    std::cout << "Not Found Key:" << key << std::endl;
    return false;
}

进一步优化

原始代码使用key进行排序，不具备通用型，增加score变量，通过score权重进行排序;
增加反向back指针
删除节点的时候没有释放内存，增加智能指针

github网址：

性能测试

虚拟机测试：

项目官方测试：

知识补充：
4. c语言中__attribute__的意义取消内存对齐
5. C++多线程之使用Mutex和Critical_Section
mutex和临界区的区别

欢迎分享，转载请注明来源：内存溢出

原文地址: http://outofmemory.cn/langs/1330989.html

基于跳表实现的轻量级KV存储引擎项目总结

发表评论

评论列表（0条）

基于跳表实现的轻量级KV存储引擎 项目总结

发表评论

评论列表（0条）

基于跳表实现的轻量级KV存储引擎项目总结