NVMe over Fabrics: Comparing the Kernel and SPDK Implementations


Contents

  1. SPDK initiator main flow

1.1 spdk_nvme_connect: returns a spdk_nvme_ctrlr

1.1.1 spdk_nvme_connect_async

1.1.2 nvme_init_controllers: state is NVME_CTRLR_STATE_READY once initialization completes

1.2 create_ctrlr

1.3 nvme_ctrlr_populate_namespaces

2. Key concepts (poller, channel, group)

3. Branch flows

3.1 Kernel keep-alive

3.1.1 Host keep-alive mechanism

3.1.2 Kernel target keep-alive mechanism

3.2 SPDK keep-alive

3.2.1 SPDK initiator (ini)

3.2.2 SPDK target (tgt)

3.3 SPDK async event management

3.3.1 spdk_ini

3.3.2 spdk_tgt


  1. SPDK initiator main flow
bdev_modules_init
    bdev_nvme_library_init
        spdk_nvme_connect
        create_ctrlr
        nvme_ctrlr_populate_namespaces
        spdk_nvme_probe
        spdk_bdev_nvme_set_hotplug
1.1 spdk_nvme_connect: returns a spdk_nvme_ctrlr
spdk_nvme_connect
    spdk_nvme_connect_async   returns a probe_ctx (containing probe_cb, attach_cb, remove_cb, trid)
    nvme_init_controllers
1.1.1 spdk_nvme_connect_async 
spdk_nvme_connect_async
    nvme_driver_init     # not yet clear how this is used???
    spdk_nvme_probe_ctx_init   initializes probe_ctx; probe_cb, attach_cb, remove_cb are assigned into probe_ctx
    spdk_nvme_probe_internal
        nvme_transport_ctrlr_scan :nvme_fabric_ctrlr_scan (tcp_ops)
            nvme_ctrlr_probe
                nvme_transport_ctrlr_construct :  nvme_tcp_ctrlr_construct
                    nvme_ctrlr_construct     sets the spdk_nvme_ctrlr state to NVME_CTRLR_STATE_INIT
                    nvme_tcp_ctrlr_create_qpair
                    nvme_ctrlr_get_cap
                    nvme_ctrlr_get_vs
                ctrlr->adminq state is set to NVME_QPAIR_ENABLED
            
nvme_tcp_ctrlr_construct    
    nvme_ctrlr_construct     sets the spdk_nvme_ctrlr state to NVME_CTRLR_STATE_INIT
    nvme_tcp_ctrlr_create_qpair
        nvme_qpair_init
        nvme_tcp_alloc_reqs
        nvme_transport_ctrlr_connect_qpair   sets qp_state to NVME_QPAIR_CONNECTING
            nvme_tcp_ctrlr_connect_qpair
                spdk_sock_connect    initiates the TCP connection
                and sets the TCP-layer qpair recv_state to NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY
                nvme_tcp_qpair_icreq_send   initiates the NVMe-oF connection
                    nvme_tcp_qpair_write_pdu   sends the icreq to the peer
                    nvme_tcp_qpair_process_completions
                        spdk_sock_flush        only now is the socket buffer flushed and the request actually sent
                        nvme_tcp_read_pdu
                            tqpair->recv_state becomes NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_CH
                            nvme_tcp_read_data
                            nvme_tcp_pdu_ch_handle
                                state becomes NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH
                            nvme_tcp_pdu_psh_handle
                                nvme_tcp_icresp_handle
                                    tqpair state becomes NVME_TCP_QPAIR_STATE_RUNNING
                                    qpair recv_state returns to NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY
    
nvme_fabric_qpair_connect
    spdk_nvme_ctrlr_cmd_io_raw  (nvme_completion_poll_cb sets status->done = true)
        nvme_qpair_submit_request
            *****
        spdk_nvme_wait_for_completion
1.1.2 nvme_init_controllers: state is NVME_CTRLR_STATE_READY once initialization completes
ctrlr management is tied closely to the whole flow: a series of operations step the controller state machine.
nvme_init_controllers
    spdk_nvme_probe_poll_async    
        nvme_ctrlr_poll_internal
            nvme_ctrlr_process_init
            
1.2 create_ctrlr
create_ctrlr
    spdk_nvme_ctrlr_get_num_ns
    spdk_io_device_register    : nvme_bdev_ctrlr
    spdk_poller_register       :bdev_nvme_poll_adminq

1.3  nvme_ctrlr_populate_namespaces
nvme_ctrlr_populate_namespaces
    nvme_ctrlr_populate_standard_namespace     : a member of g_populate_namespace_fn
        spdk_bdev_register        : nvmelib_fn_table        nvme_if
            bdev_start
              **examine
                ***
                  spdk_bdev_get_io_channel
                    spdk_get_io_channel
                      bdev_channel_create
                        bdev_nvme_get_io_channel
                          spdk_get_io_channel
                            bdev_nvme_create_cb
                              spdk_nvme_ctrlr_alloc_io_qpair
                              also registers a timed poller for this io_channel     : bdev_nvme_poll
                                timed-poller callbacks are scheduled by the thread itself on a timer; untimed pollers are inserted into the thread's active_pollers list
        nvme_bdev_attach_bdev_to_ns
        nvme_ctrlr_populate_namespace_done 
nvme_ctrlr_populate_standard_namespace
  spdk_bdev_register
    bdev_start
      vbdev_gpt_examine
        ****
        spdk_bdev_read
          spdk_bdev_read_blocks
            bdev_read_blocks_with_md
              bdev_io_submit
                _bdev_io_submit
                  bdev_io_do_submit
                    bdev_nvme_submit_request
                      _bdev_nvme_submit_request
                        spdk_bdev_io_get_buf
                          bdev_nvme_get_buf_cb
                            bdev_nvme_readv
                              spdk_nvme_ns_cmd_readv_with_md
                                nvme_qpair_submit_request     drops into the transport layer for the corresponding handling
                          

2. Key concepts (poller, channel, group)

poller:

bdev_nvme_poll_adminq

bdev_nvme_poll

bdev_nvme_hotplug

3. Branch flows
3.1 Kernel keep-alive
3.1.1 Host keep-alive mechanism
nvme_start_ctrl
  nvme_start_keep_alive
   schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);  :nvme_keep_alive_work
    nvme_keep_alive   sends a keep-alive command and checks whether a reply comes back
	on timeout: nvme_reset_ctrl
  async_event_work    : nvme_async_event_work
3.1.2 Kernel target keep-alive mechanism
nvmet_alloc_ctrl
 nvmet_start_keep_alive_timer
  nvmet_keep_alive_timer   :INIT_DELAYED_WORK(&ctrl->ka_work, nvmet_keep_alive_timer)
  schedule_delayed_work

A keep-alive must arrive within the window; each received keep-alive re-arms the timer:
 mod_delayed_work(system_wq, &ctrl->ka_work, ctrl->kato * HZ);
3.2 SPDK keep-alive
3.2.1 SPDK initiator (ini)
keep_alive only sends a keep-alive command; no further processing is done.
create_ctrlr
 spdk_nvme_ctrlr_register_timeout_callback   : timeout_cb

nvme_tcp_qpair_process_completions
 nvme_tcp_qpair_check_timeout
  nvme_request_check_timeout
   timeout_cb
    bdev_nvme_reset
3.2.2 SPDK target (tgt)
Every keep-alive command received from the host updates the controller's last_keep_alive_tick; a poller checks it for timeout.
_spdk_nvmf_subsystem_add_ctrlr
 _spdk_nvmf_ctrlr_add_admin_qpair
  spdk_nvmf_ctrlr_start_keep_alive_timer
   keep_alive_poller    :spdk_nvmf_ctrlr_keep_alive_poll
    spdk_nvmf_ctrlr_disconnect_qpairs_on_pg
	 spdk_nvmf_qpair_disconnect
	spdk_nvmf_ctrlr_disconnect_qpairs_done
3.3 SPDK async event management
3.3.1 spdk_ini
nvme_ctrlr_process_init
 nvme_ctrlr_configure_aer
  nvme_ctrlr_cmd_set_async_event_config --- set_features configures the async events
   nvme_ctrlr_configure_aer_done
	nvme_ctrlr_construct_and_submit_aer  --- submits the async event request
	  nvme_ctrlr_async_event_cb       --- is this cb_fn invoked only after the tgt responds?
        active_proc->aer_cb_fn(active_proc->aer_cb_arg, cpl) : i.e. aer_cb
        
nvme_tcp_qpair_process_completions
 nvme_tcp_read_pdu
  nvme_tcp_pdu_psh_handle
   nvme_tcp_capsule_resp_hdr_handle
    nvme_tcp_req_complete
     nvme_complete_request
       cb_fn(cb_arg, cpl);     invokes nvme_ctrlr_async_event_cb

aer_cb was already registered when the ctrlr was created:
create_ctrlr
  spdk_nvme_ctrlr_register_aer_callback(ctrlr, aer_cb, nvme_bdev_ctrlr);
	aer_cb
	  nvme_ctrlr_populate_namespaces
3.3.2 spdk_tgt
spdk_nvmf_tcp_poll_group_poll
 spdk_nvmf_tcp_req_process
  spdk_nvmf_request_exec
   spdk_nvmf_ctrlr_async_event_request
    spdk_nvmf_ctrlr_process_admin_cmd
     spdk_nvmf_ctrlr_set_features
      spdk_nvmf_ctrlr_set_features_async_event_configuration

On the second poll,
     spdk_nvmf_ctrlr_async_event_request  (the host's async event request is not processed at all; only a status code is filled in)
	 ctrlr->aer_req = req    the req is stored in ctrlr->aer_req; when an async event occurs, the tgt responds back to the host through this parked request

############### SPDK TGT
spdk_nvmf_subsystem_remove_ns
spdk_nvmf_ns_resize  -- _spdk_nvmf_ns_resize
spdk_nvmf_subsystem_add_ns   -- spdk_nvmf_subsystem_pause  --spdk_rpc_nvmf_subsystem_add_ns

All three trigger spdk_nvmf_subsystem_ns_changed.

So the tgt can in fact detect namespace changes; it just does not appear to do anything with them.
spdk_rpc_bdev_lvol_resize
   vbdev_lvol_resize
	_vbdev_lvol_resize_cb 
	  spdk_bdev_notify_blockcnt_change
		_resize_notify             or  _remove_notify


************
spdk_bdev_notify_blockcnt_change    :   _resize_notify

_resize_notify						or  _remove_notify
 spdk_nvmf_ns_event					SPDK_BDEV_EVENT_RESIZE || SPDK_BDEV_EVENT_REMOVE
  spdk_nvmf_ns_resize
   _spdk_nvmf_ns_resize
 	spdk_nvmf_subsystem_ns_changed         only updates the ns_list in the ctrlr
  	spdk_nvmf_subsystem_resume 
	  spdk_nvmf_subsystem_state_change		  
	   subsystem_state_change_on_pg
	    spdk_nvmf_poll_group_resume_subsystem
		 poll_group_update_subsystem
   	      spdk_nvmf_ctrlr_async_event_ns_notice
		   spdk_nvmf_request_complete        :   fills in the resp and returns it to the host






####### The following flows are for reference only
nvmf_tgt_parse_conf_start
 spdk_nvmf_parse_conf
  spdk_nvmf_parse_nvmf_tgt
   spdk_nvmf_tgt_create

nvmf_target_advance_state
 nvmf_create_nvmf_tgt
  spdk_nvmf_tgt_create
  
spdk_rpc_nvmf_create_target
 spdk_nvmf_tgt_create

spdk_nvmf_tgt_create_poll_group
 spdk_nvmf_poll_group_add_subsystem
  poll_group_update_subsystem
   spdk_nvmf_ctrlr_async_event_ns_notice
    fills the event into aer_req and returns it to the host


#### This was probably a misunderstanding; it is a different flow: the reservation request flow
spdk_nvmf_tcp_req_process
 spdk_nvmf_request_exec
  spdk_nvmf_ctrlr_process_io_cmd

spdk_nvmf_ns_reservation_request
 spdk_nvmf_subsystem_update_ns
  subsystem_update_ns_on_pg
   spdk_nvmf_poll_group_update_subsystem
    poll_group_update_subsystem
	 spdk_nvmf_ctrlr_async_event_ns_notice
	  spdk_nvmf_request_complete
	   spdk_nvmf_tcp_req_complete
	    spdk_nvmf_tcp_req_process
		 request_transfer_out
		  spdk_nvmf_tcp_send_capsule_resp_pdu

Original source: http://outofmemory.cn/zaji/925535.html
