Reading the Graphene Source ~ Database Series ~ The Object Index Database

cifer (60) in #bitshares • 8 years ago (edited)

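Hello everyone, and welcome back to part 5 of the database series: the object index database. The name may sound odd, but it is a literal description of what the object_database module does. object_database manages every object in bitshares, and those objects are stored directly inside the indexes, hence "object index database".

As we will see later, the database series covers block data as well as object data. The two are stored in different subdirectories, and the storage and index structure for block data is much simpler than for objects. So let's start with the object index database: object_database.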

object_database

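The object_database type, defined in <db>/object_database.hpp, is what directly operates on all object index data. There is no deep code in object_database that needs dissecting; what is worth covering is how it drives the indexes. The previous chapters laid the groundwork with the object model, the index model, and object reflection and serialization, so now we can finally step back and see how they fit together.

object_database maintains a two-dimensional array, _index. During initialization, add_index<> is called to create every index structure and add it to _index, which uses space_id as its first dimension and type_id as its second: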

// Listing 5.1

template<typename IndexType>
IndexType* add_index()
{
   typedef typename IndexType::object_type ObjectType;
   if( _index[ObjectType::space_id].size() <= ObjectType::type_id  )
       _index[ObjectType::space_id].resize( 255 );
   assert(!_index[ObjectType::space_id][ObjectType::type_id]);
   unique_ptr<index> indexptr( new IndexType(*this) );
   _index[ObjectType::space_id][ObjectType::type_id] = std::move(indexptr);
   return static_cast<IndexType*>(_index[ObjectType::space_id][ObjectType::type_id].get());
}

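Note that add_index<> only constructs (with new) the index structure for each type in each space. No actual objects are attached to the index trees yet; at this point every index structure is effectively an empty red-black tree (a multi_index_container<>).

For example, witness_index is added to _index as follows. For convenience, Listing 5.2 also includes the definition of witness_index.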

// Listing 5.2

add_index< primary_index<witness_index> >();

// libraries/chain/include/graphene/chain/witness_object.hpp
using witness_multi_index_type = multi_index_container<
   witness_object,
   indexed_by<
      ordered_unique< tag<by_id>,
         member<object, object_id_type, &object::id>
      >,
      ordered_unique< tag<by_account>,
         member<witness_object, account_id_type, &witness_object::witness_account>
      >,
      ordered_unique< tag<by_vote_id>,
         member<witness_object, vote_id_type, &witness_object::vote_id>
      >
   >
>;
using witness_index = generic_index<witness_object, witness_multi_index_type>;

From its definition it is clear that witness_index stores witness_object. From earlier chapters we know that the space_id and type_id of witness_object are protocol_ids and witness_object_type, whose enum values are 1 and 6. So the newly created witness_index (more precisely, primary_index<witness_index>) is placed at _index[1][6].
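To make the slot arithmetic concrete, here is a minimal sketch of the same registration pattern using only the standard library. Everything prefixed with toy_ is a hypothetical stand-in invented for this illustration, not a Graphene type; the real index classes carry far more state, and the real add_index<> reads the ids from IndexType::object_type rather than from the index itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, heavily simplified stand-in for the abstract index base class.
struct toy_index {
    virtual ~toy_index() = default;
    virtual std::size_t size() const = 0;
};

// Hypothetical stand-in for primary_index<witness_index>; it holds no objects.
struct toy_witness_index : toy_index {
    static constexpr std::uint8_t space_id = 1; // protocol_ids
    static constexpr std::uint8_t type_id  = 6; // witness_object_type
    std::size_t size() const override { return 0; }
};

class toy_object_database {
public:
    // The outer dimension is the space, sized to 255 up front.
    toy_object_database() : _index(255) {}

    // Same shape as Listing 5.1: grow the inner vector if needed,
    // check the slot is free, then store the new (still empty) index.
    template <typename IndexType>
    IndexType* add_index() {
        auto& space = _index[IndexType::space_id];
        if (space.size() <= IndexType::type_id)
            space.resize(255);
        assert(!space[IndexType::type_id]);
        space[IndexType::type_id] = std::make_unique<IndexType>();
        return static_cast<IndexType*>(space[IndexType::type_id].get());
    }

    // Returns the index registered at [space][type], or nullptr if none.
    const toy_index* slot(std::uint8_t space, std::uint8_t type) const {
        if (space >= _index.size() || type >= _index[space].size())
            return nullptr;
        return _index[space][type].get();
    }

private:
    std::vector<std::vector<std::unique_ptr<toy_index>>> _index;
};
```

Calling db.add_index<toy_witness_index>() on a fresh toy_object_database fills slot(1, 6) with an index whose size() is still 0, while every other slot stays empty.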

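Serialization and deserialization

Once the index structures for all object types are initialized, the next step is to fill them with objects. That is the job of object_database::open().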

// Listing 5.3

void object_database::open(const fc::path& data_dir)
{ try {
   _data_dir = data_dir;
   if( fc::exists( _data_dir / "object_database" / "lock" ) )
   {
       wlog("Ignoring locked object_database");
       return;
   }
   ilog("Opening object database from ${d} ...", ("d", data_dir));
   for( uint32_t space = 0; space < _index.size(); ++space )
      for( uint32_t type = 0; type  < _index[space].size(); ++type )
         if( _index[space][type] )
            _index[space][type]->open( _data_dir / "object_database" / fc::to_string(space)/fc::to_string(type) );
   ilog( "Done opening object database." );

} FC_CAPTURE_AND_RETHROW( (data_dir) ) }

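object_database::open() reads the object index data from disk and populates _index; this is the deserialization step that loads on-disk data into memory. It calls open() on each index object in _index, which is the process described in the object serialization article.

Conversely, object_database::flush() serializes the in-memory data to disk. It first creates an object_database.tmp/ directory and writes all index data there. It then renames any existing object_database/ directory to object_database.old/, renames object_database.tmp/ to object_database/, and finally deletes object_database.old/. Writing the index data into object_database.tmp/ calls save() on each index structure; see the object serialization article for details.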

// libraries/db/object_database.cpp

// Listing 5.3 (continued)

void object_database::flush()
{
   ilog("Save object_database in ${d}", ("d", _data_dir));
   fc::create_directories( _data_dir / "object_database.tmp" / "lock" );
   for( uint32_t space = 0; space < _index.size(); ++space )
   {
      fc::create_directories( _data_dir / "object_database.tmp" / fc::to_string(space) );
      const auto types = _index[space].size();
      for( uint32_t type = 0; type  <  types; ++type )
         if( _index[space][type] )
            _index[space][type]->save( _data_dir / "object_database.tmp" / fc::to_string(space)/fc::to_string(type) );
   }
   fc::remove_all( _data_dir / "object_database.tmp" / "lock" );
   if( fc::exists( _data_dir / "object_database" ) )
      fc::rename( _data_dir / "object_database", _data_dir / "object_database.old" );
   fc::rename( _data_dir / "object_database.tmp", _data_dir / "object_database" );
   fc::remove_all( _data_dir / "object_database.old" );
}
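The write-to-a-temporary-directory-then-rename sequence used by flush() can be tried in isolation. The sketch below reproduces it with std::filesystem instead of fc. toy_flush and its parameters are hypothetical names made up for this example, and a plain string stands in for the bytes that a real index's save() would write.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: writes `payload` to
// <data_dir>/object_database/<space>/<type>, going through
// object_database.tmp/ and object_database.old/ in the same order
// as flush() does.
void toy_flush(const fs::path& data_dir, unsigned space, unsigned type,
               const std::string& payload) {
    const fs::path live = data_dir / "object_database";
    const fs::path tmp  = data_dir / "object_database.tmp";
    const fs::path old  = data_dir / "object_database.old";

    // 1. Build the new snapshot in the temporary directory.
    fs::create_directories(tmp / "lock");
    fs::create_directories(tmp / std::to_string(space));
    {
        std::ofstream out(tmp / std::to_string(space) / std::to_string(type),
                          std::ios::binary);
        out << payload;
    } // the stream is closed here, before any rename happens
    fs::remove_all(tmp / "lock");

    // 2. Move the previous snapshot aside, promote the new one,
    //    then drop the previous snapshot.
    if (fs::exists(live))
        fs::rename(live, old);
    fs::rename(tmp, live);
    fs::remove_all(old);
}
```

Calling toy_flush twice on the same directory leaves only the second payload under object_database/, with neither the .tmp nor the .old directory left behind.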

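Where data is written and loaded: data_dir

Having covered serialization and deserialization, the next question is where the data lives on disk. The object_database::_data_dir member points to the blockchain data path. It is initialized when the node starts and defaults to witness_node_data_dir/blockchain/. Index data lives in the object_database subdirectory of _data_dir and, as you probably noticed in open() and flush() above, is further partitioned by space_id and type_id, so objects of different spaces and types are serialized into different files under different subdirectories, as shown below: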

➜  bitshares-core git:(92eb45cb) ✗ ls witness_node_data_dir/blockchain/object_database/
0   103 109 114 12  125 130 136 141 147 152 158 163 169 174 18  185 190 196 200 206 211 217 222 228 233 239 244 25  26  31  37  42  48  53  59  64  7   75  80  86  91  97
1   104 11  115 120 126 131 137 142 148 153 159 164 17  175 180 186 191 197 201 207 212 218 223 229 234 24  245 250 27  32  38  43  49  54  6   65  70  76  81  87  92  98
(directories 0 to 254 in total; the rest are omitted)

➜  bitshares-core git:(92eb45cb) ✗ ls witness_node_data_dir/blockchain/object_database/1
10 12 13 14 15 2  3  4  5  6  7  8

➜  bitshares-core git:(92eb45cb) ✗ ls witness_node_data_dir/blockchain/object_database/2
0  1  10 11 12 13 14 15 16 17 3  4  5  6  7  8

➜  bitshares-core git:(92eb45cb) ✗ ls witness_node_data_dir/blockchain/object_database/0

➜  bitshares-core git:(92eb45cb) ✗ ls witness_node_data_dir/blockchain/object_database/3

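Note that object_database/ contains 255 first-level subdirectories, 0 to 254, each representing a space_id. Only space_id 1 and 2 actually hold objects; the other directories are created because the object_database::_index member is sized to 255.

1 and 2 are the protocol space id and the implementation space id, respectively. They are defined in <chain>/protocol/types.hpp: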

// Listing 5.4

enum reserved_spaces
{
   relative_protocol_ids = 0,
   protocol_ids          = 1,
   implementation_ids    = 2
};
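The mapping from a (space_id, type_id) pair to a file on disk is simple enough to write down as a one-line helper. The sketch below uses std::filesystem; toy_index_path is a hypothetical name for this illustration, and the enum is repeated from Listing 5.4 (with an explicit underlying type added) only so that the block compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Repeated from Listing 5.4 so this block is self-contained.
enum reserved_spaces : std::uint8_t {
    relative_protocol_ids = 0,
    protocol_ids          = 1,
    implementation_ids    = 2
};

// Hypothetical helper: builds <data_dir>/object_database/<space>/<type>,
// the same path that open() and flush() assemble inline.
fs::path toy_index_path(const fs::path& data_dir,
                        std::uint8_t space, std::uint8_t type) {
    return data_dir / "object_database"
                    / std::to_string(space) / std::to_string(type);
}
```

With the default data directory, toy_index_path("witness_node_data_dir/blockchain", protocol_ids, 6) names the file object_database/1/6, which is where the witness index from Listing 5.2 would be saved.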

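Other methods

object_database also provides a few convenience methods for getting indexes and objects. To keep this short, only their signatures are shown here. find_object and get_object look up an object by its id; the only difference between them is that one returns a pointer and the other a reference.

get_index and get_mutable_index return the index object for a given space and type, which lets us iterate over every object in that index. The difference is clear from the names: one lets you modify the index structure and the other does not. The signatures confirm this: one is const and the other is not.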

// Listing 5.5

const object* object_database::find_object( object_id_type id )const
const object& object_database::get_object( object_id_type id )const
const index& object_database::get_index(uint8_t space_id, uint8_t type_id)const
index& object_database::get_mutable_index(uint8_t space_id, uint8_t type_id)
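The pointer-versus-reference split between find_object and get_object is a common C++ idiom, and a toy version shows why both are useful. The sketch below is not the Graphene implementation: toy_object and toy_store are hypothetical, and throwing std::out_of_range on a missing id is an assumption made for this example, since the source does not say how get_object reacts in that case.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>

// Hypothetical object with a numeric id and some payload.
struct toy_object {
    std::uint64_t id;
    int payload;
};

class toy_store {
public:
    void insert(toy_object obj) { _objects[obj.id] = obj; }

    // Pointer flavour: a missing id is reported as nullptr,
    // so the caller must check before dereferencing.
    const toy_object* find_object(std::uint64_t id) const {
        auto it = _objects.find(id);
        return it == _objects.end() ? nullptr : &it->second;
    }

    // Reference flavour: for callers that expect the object to exist.
    // Assumption for this sketch only: a missing id throws.
    const toy_object& get_object(std::uint64_t id) const {
        if (const toy_object* p = find_object(id))
            return *p;
        throw std::out_of_range("no such object");
    }

private:
    std::map<std::uint64_t, toy_object> _objects;
};
```

A caller that is unsure whether an id exists uses find_object and tests the result against nullptr, while a caller that already knows the object exists uses get_object and skips the null check.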

Afterword

That covers most of the trickiest parts of the database series; nothing in the rest of the series is hard to follow.

Thanks for reading~

#graphene #cn-programming #cn

