##Job execution flow

Let's start with how the official documentation describes the workflow on the master side:
The Salt master works by always publishing commands to all connected minions and the minions decide if the command is meant for them by checking themselves against the command target.

The typical lifecycle of a salt job from the perspective of the master might be as follows:

1. A command is issued on the CLI. For example, 'salt my_minion test.ping'.
2. The 'salt' command uses LocalClient to generate a request to the salt master by connecting to the ReqServer on TCP:4506 and issuing the job.
3. The salt-master ReqServer sees the request and passes it to an available MWorker over workers.ipc.
4. A worker picks up the request and handles it. First, it checks to ensure that the requested user has permissions to issue the command. Then, it sends the publish command to all connected minions. For the curious, this happens in ClearFuncs.publish().
5. The worker announces on the master event bus that it is about to publish a job to connected minions. This happens by placing the event on the master event bus (master_event_pull.ipc) where the EventPublisher picks it up and distributes it to all connected event listeners on master_event_pub.ipc.
6. The message to the minions is encrypted and sent to the Publisher via IPC on publish_pull.ipc.
7. Connected minions have a TCP session established with the Publisher on TCP port 4505 where they await commands. When the Publisher receives the job over publish_pull, it sends the jobs across the wire to the minions for processing.
8. After the minions receive the request, they decrypt it and perform any requested work, if they determine that they are targeted to do so.
9. When the minion is ready to respond, it publishes the result of its job back to the master by sending the encrypted result back to the master on TCP 4506 where it is again picked up by the ReqServer and forwarded to an available MWorker for processing. (Again, this happens by passing this message across workers.ipc to an available worker.)
10. When the MWorker receives the job it decrypts it and fires an event onto the master event bus (master_event_pull.ipc). (Again for the curious, this happens in AESFuncs._return().)
11. The EventPublisher sees this event and re-publishes it on the bus to all connected listeners of the master event bus (on master_event_pub.ipc). This is where the LocalClient has been waiting, listening to the event bus for minion replies. It gathers the job and stores the result.
12. When all targeted minions have replied or the timeout has been exceeded, the salt client displays the results of the job to the user on the CLI.
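The lifecycle above can be condensed into a small in-memory model. This is only a sketch to make the "publish to everyone, let each minion decide" idea concrete: `ToyMinion` and `ToyMaster` are hypothetical names, there is no encryption or networking, and a plain list stands in for the master event bus.

```python
import fnmatch


class ToyMinion:
    """Hypothetical stand-in for a minion; not part of Salt."""

    def __init__(self, minion_id):
        self.minion_id = minion_id

    def handle_publish(self, job):
        # Every minion sees every job and checks the target itself (step 8).
        if not fnmatch.fnmatchcase(self.minion_id, job["tgt"]):
            return None
        if job["fun"] == "test.ping":
            result = True
        else:
            result = "unsupported in this sketch"
        return {"id": self.minion_id, "jid": job["jid"], "return": result}


class ToyMaster:
    """Hypothetical stand-in for the master side of the flow."""

    def __init__(self, minions):
        self.minions = minions
        self.events = []  # stands in for the master event bus

    def publish(self, tgt, fun, jid):
        job = {"tgt": tgt, "fun": fun, "jid": jid}
        # Step 5: announce the job before publishing it.
        self.events.append(("salt/job/{0}/new".format(jid), job))
        returns = {}
        # Steps 6-7: the job goes to all connected minions.
        for minion in self.minions:
            ret = minion.handle_publish(job)
            if ret is None:
                continue
            # Steps 9-11: each return becomes an event that the client collects.
            tag = "salt/job/{0}/ret/{1}".format(jid, ret["id"])
            self.events.append((tag, ret))
            returns[ret["id"]] = ret["return"]
        return returns


if __name__ == "__main__":
    master = ToyMaster([ToyMinion("web1"), ToyMinion("web2"), ToyMinion("db1")])
    print(master.publish("web*", "test.ping", jid="1"))
```

Filtering on the minion side is what lets the master publish without tracking which minions match a target.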
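##Source code analysis

The sections below introduce the classes the master uses when it executes a Salt module. Read the source alongside the flow described above.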
###salt.master.Master
The code that creates the ReqServer is in `run_reqserver()`:

    def run_reqserver(self):
        reqserv = ReqServer(
            self.opts,
            self.key,
            self.master_key)
        reqserv.run()
###salt.master.ReqServer
Open `salt.master.ReqServer`:

    class ReqServer(object):
        '''
        Starts up the master request server, minions send results to this
        interface.
        '''
        def __init__(self, opts, key, mkey):
            '''
            Create a request server

            :param dict opts: The salt options dictionary
            :key dict: The user starting the server and the AES key
            :mkey dict: The user starting the server and the RSA key

            :rtype: ReqServer
            :returns: Request server
            '''
            self.opts = opts
            self.master_key = mkey
            # Prepare the AES key
            self.key = key

        def __bind(self):
            '''
            Binds the reply server
            '''
            dfn = os.path.join(self.opts['cachedir'], '.dfn')
            if os.path.isfile(dfn):
                try:
                    os.remove(dfn)
                except os.error:
                    pass
            self.process_manager = salt.utils.process.ProcessManager(name='ReqServer_ProcessManager')

            req_channels = []
            for transport, opts in iter_transport_opts(self.opts):
                chan = salt.transport.server.ReqServerChannel.factory(opts)
                chan.pre_fork(self.process_manager)
                req_channels.append(chan)
            for ind in range(int(self.opts['worker_threads'])):
                self.process_manager.add_process(MWorker,
                                                 args=(self.opts,
                                                       self.master_key,
                                                       self.key,
                                                       req_channels,
                                                       ),
                                                 )
            self.process_manager.run()

        def run(self):
            '''
            Start up the ReqServer
            '''
            try:
                self.__bind()
            except KeyboardInterrupt:
                log.warn('Stopping the Salt Master')
                raise SystemExit('\nExiting on Ctrl-c')

        def destroy(self):
            if hasattr(self, 'clients') and self.clients.closed is False:
                self.clients.setsockopt(zmq.LINGER, 1)
                self.clients.close()
            if hasattr(self, 'workers') and self.workers.closed is False:
                self.workers.setsockopt(zmq.LINGER, 1)
                self.workers.close()
            if hasattr(self, 'context') and self.context.closed is False:
                self.context.term()
            # Also stop the workers
            if hasattr(self, 'process_manager'):
                self.process_manager.kill_children()

        def __del__(self):
            self.destroy()
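The code is fairly simple. The main work happens in the `__bind()` method, which starts as many MWorker workers as the `worker_threads` option in the configuration file specifies. Despite the option's name, each worker is registered with the process manager through `add_process`, so the workers run as separate processes.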
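To illustrate the "start N workers that all serve the same request channel" pattern, here is a minimal sketch. It is not how Salt does it: threads and a `queue.Queue` are used purely to keep the example short, whereas the code above hands MWorker to a process manager. The names `worker_loop` and `run_pool` are hypothetical.

```python
import queue
import threading


def worker_loop(requests, results):
    # Each worker pulls from the shared queue until it sees the stop marker.
    while True:
        item = requests.get()
        if item is None:
            break
        # "Handling" a request here just means upper-casing it.
        results.put((threading.current_thread().name, item.upper()))


def run_pool(worker_count, items):
    requests = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=worker_loop,
                         args=(requests, results),
                         name="MWorker-{0}".format(index))
        for index in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for item in items:
        requests.put(item)
    # One stop marker per worker so that every loop terminates.
    for _ in workers:
        requests.put(None)
    for worker in workers:
        worker.join()
    handled = []
    while not results.empty():
        handled.append(results.get())
    return handled


if __name__ == "__main__":
    for worker_name, payload in run_pool(3, ["ping", "publish", "return"]):
        print(worker_name, payload)
```

Which worker handles which request is not deterministic; only the set of handled requests is.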
###salt.master.MWorker
In the `salt.master.MWorker` class, requests are likewise received through the `__bind()` method:

    def __bind(self):
        '''
        Bind to the local port
        '''
        # using ZMQIOLoop since we *might* need zmq in there
        zmq.eventloop.ioloop.install()
        self.io_loop = zmq.eventloop.ioloop.ZMQIOLoop()
        for req_channel in self.req_channels:
            req_channel.post_fork(self._handle_payload, io_loop=self.io_loop)  # TODO: cleaner? Maybe lazily?
        self.io_loop.start()
The key statement is `req_channel.post_fork(self._handle_payload, io_loop=self.io_loop)`, which hands every received request to `self._handle_payload`. Here is the `_handle_payload` method:

    @tornado.gen.coroutine
    def _handle_payload(self, payload):
        '''
        The _handle_payload method is the key method used to figure out what
        needs to be done with communication to the server

        Example cleartext payload generated for 'salt myminion test.ping':

        {'enc': 'clear',
         'load': {'arg': [],
                  'cmd': 'publish',
                  'fun': 'test.ping',
                  'jid': '',
                  'key': 'alsdkjfa.,maljf-==adflkjadflkjalkjadfadflkajdflkj',
                  'kwargs': {'show_jid': False, 'show_timeout': False},
                  'ret': '',
                  'tgt': 'myminion',
                  'tgt_type': 'glob',
                  'user': 'root'}}

        :param dict payload: The payload route to the appropriate handler
        '''
        key = payload['enc']
        load = payload['load']
        ret = {'aes': self._handle_aes,
               'clear': self._handle_clear}[key](load)
        raise tornado.gen.Return(ret)
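The dispatch near the end of the method shows the two paths. If `key` is `'aes'`, `self._handle_aes` is called; it handles the results returned by minions. If `key` is `'clear'`, `self._handle_clear` is called; it handles commands submitted to the master for publishing.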
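The dictionary-based dispatch on `payload['enc']` can be tried in isolation. The sketch below uses hypothetical stand-in handlers (`handle_clear`, `handle_aes`) that only report which path was taken. Unlike the Salt code, which would raise a `KeyError` for an unknown `enc` value, this version raises a `ValueError` with a readable message; that choice is an assumption made for the example.

```python
def handle_clear(load):
    # Stand-in for MWorker._handle_clear: just report the path taken.
    return ("clear", load["cmd"])


def handle_aes(load):
    # Stand-in for MWorker._handle_aes.
    return ("aes", load["cmd"])


HANDLERS = {"aes": handle_aes, "clear": handle_clear}


def handle_payload(payload):
    enc = payload.get("enc")
    handler = HANDLERS.get(enc)
    if handler is None:
        raise ValueError("unsupported enc value: {0!r}".format(enc))
    return handler(payload["load"])


if __name__ == "__main__":
    print(handle_payload({"enc": "clear", "load": {"cmd": "publish"}}))
    print(handle_payload({"enc": "aes", "load": {"cmd": "_return"}}))
```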
Here is the `self._handle_clear` method:

    def _handle_clear(self, load):
        '''
        Process a cleartext command

        :param dict load: Cleartext payload
        :return: The result of passing the load to a function in ClearFuncs corresponding to
                 the command specified in the load's 'cmd' key.
        '''
        log.trace('Clear payload received with command {cmd}'.format(**load))
        if load['cmd'].startswith('__'):
            return False
        return getattr(self.clear_funcs, load['cmd'])(load), {'fun': 'send_clear'}
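The important part is the last statement: it looks up the method of `self.clear_funcs` named by `load['cmd']` and calls it. When a Salt module is executed, `load['cmd']` is `publish`. `self.clear_funcs` is an instance of `salt.master.ClearFuncs`, which is described below.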
The `self._handle_aes` method is similar to `self._handle_clear`:

    def _handle_aes(self, data):
        '''
        Process a command sent via an AES key

        :param str load: Encrypted payload
        :return: The result of passing the load to a function in AESFuncs corresponding to
                 the command specified in the load's 'cmd' key.
        '''
        if 'cmd' not in data:
            log.error('Received malformed command {0}'.format(data))
            return {}
        log.trace('AES payload received with command {0}'.format(data['cmd']))
        if data['cmd'].startswith('__'):
            return False
        return self.aes_funcs.run_func(data['cmd'], data)
When a salt-minion returns the result of a command, `data['cmd']` is `_return`. Reading the source of `run_func` shows that it calls the `_return` method of `salt.master.AESFuncs`, which is described below.
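Both handlers route a request to a method named by `cmd` and refuse names that start with a double underscore. A minimal sketch of that pattern follows; `ToyFuncs` and `dispatch` are hypothetical names, and returning `{}` for an unknown command is an assumption of this example rather than Salt's documented behaviour.

```python
class ToyFuncs:
    """Hypothetical stand-in for ClearFuncs / AESFuncs."""

    def publish(self, load):
        return {"published": load["fun"]}

    def _return(self, load):
        return {"stored": load["id"]}


def dispatch(funcs, load):
    cmd = load.get("cmd")
    if cmd is None:
        # Malformed request: no command at all.
        return {}
    if cmd.startswith("__"):
        # Never expose dunder attributes such as __init__ or __class__.
        return False
    method = getattr(funcs, cmd, None)
    if not callable(method):
        # Assumption for this sketch: unknown commands yield an empty result.
        return {}
    return method(load)


if __name__ == "__main__":
    funcs = ToyFuncs()
    print(dispatch(funcs, {"cmd": "publish", "fun": "test.ping"}))
    print(dispatch(funcs, {"cmd": "_return", "id": "web1"}))
    print(dispatch(funcs, {"cmd": "__init__"}))
```

The guard matters because the method name comes straight from the request, so without it a caller could reach special attributes of the object.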
###salt.master.ClearFuncs
The `ClearFuncs.publish` method begins with authentication. Once authentication passes, it fires an event announcing that a message is about to be sent:

    payload = self._prep_pub(minions, jid, clear_load, extra)

The core of `self._prep_pub` is this line:

    self.event.fire_event(new_job_load, tagify([clear_load['jid'], 'new'], 'job'))

Finally, the message is sent to the minions:

    self._send_pub(payload)

The `self._send_pub` method is simple. It publishes the load through the publish channel of each configured transport:

    def _send_pub(self, load):
        '''
        Take a load and send it across the network to connected minions
        '''
        for transport, opts in iter_transport_opts(self.opts):
            chan = salt.transport.server.PubServerChannel.factory(opts)
            chan.publish(load)
###salt.master.AESFuncs
Here is the source of the `_return` method:

    def _return(self, load):
        '''
        Handle the return data sent from the minions.

        Takes the return, verifies it and fires it on the master event bus.
        Typically, this event is consumed by the Salt CLI waiting on the other
        end of the event bus but could be heard by any listener on the bus.

        :param dict load: The minion payload
        '''
        try:
            salt.utils.job.store_job(
                self.opts, load, event=self.event, mminion=self.mminion)
        except salt.exception.SaltCacheError:
            log.error('Could not store job information for load: {0}'.format(load))
As you can see, the main work is done in `salt.utils.job.store_job`. Its core is here:

    if event:
        # If the return data is invalid, just ignore it
        log.info('Got return from {id} for job {jid}'.format(**load))
        event.fire_event(load, tagify([load['jid'], 'ret', load['id']], 'job'))
        event.fire_ret_load(load)

This fires the return onto the event bus.
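To see how the job tags tie the publishing side and the waiting client together, the sketch below builds tags in the same shape as the `tagify` calls above and delivers them over a tiny in-process bus. `toy_tagify` and `ToyEventBus` are hypothetical simplifications: the sketch assumes tags of the form `salt/job/<jid>/new` and `salt/job/<jid>/ret/<minion id>`, and matching by tag prefix is an assumption about how a listener might filter events.

```python
def toy_tagify(suffix, prefix, base="salt"):
    # Simplified stand-in for tagify: join base, prefix and suffix parts with '/'.
    parts = [base, prefix]
    if isinstance(suffix, (list, tuple)):
        parts.extend(suffix)
    else:
        parts.append(suffix)
    return "/".join(str(part) for part in parts if part)


class ToyEventBus:
    """Hypothetical in-process event bus with prefix-based subscriptions."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, tag_prefix, callback):
        self.subscribers.append((tag_prefix, callback))

    def fire_event(self, data, tag):
        for tag_prefix, callback in self.subscribers:
            if tag.startswith(tag_prefix):
                callback(tag, data)


if __name__ == "__main__":
    bus = ToyEventBus()
    returns = {}

    def collect(tag, data):
        # Plays the role of the client gathering minion replies.
        returns[data["id"]] = data["return"]

    jid = "20240101000000000000"  # hypothetical job id
    bus.subscribe(toy_tagify([jid, "ret"], "job"), collect)
    for minion_id in ("web1", "web2"):
        load = {"id": minion_id, "jid": jid, "return": True}
        bus.fire_event(load, toy_tagify([jid, "ret", minion_id], "job"))
    print(returns)
```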
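##Summary

This post only outlines the overall flow. It does not go into how data moves between the message queues; I may cover that in a separate post when I get the chance.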