[Automated Testing] How does Selenium work under the hood?

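This post started with an interview. I had used Selenium for a long time without ever looking into how it works underneath, so when the interviewer asked, I had no answer, and I was advised to go back and study the internals. How does Selenium open a browser? How does it carry out each operation? Here is my summary.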

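Contents

  • I. What is Selenium?
      • 1. Introduction to Selenium
  • II. How does Selenium open the browser?
      • 1. The WebDriver class
      • 2. Starting the browser driver program
      • 3. Opening the browser
  • III. How does Selenium perform each operation?
      • 1. Defining the endpoints
      • 2. Executing commands
      • 3. Summary of the flow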

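I. What is Selenium?

1. Introduction to Selenium

Selenium is a powerful family of open-source web functional testing tools that originated at ThoughtWorks. It supports automated testing across multiple platforms, browsers, and programming languages. Selenium 2 wraps each browser's native API behind the WebDriver API, so a script can operate directly on the elements of a page and even on the browser itself (taking screenshots, resizing the window, starting and closing it, installing extensions, configuring certificates, and so on), much as a real user would.

Automated testing with Selenium needs three things:

  1. A test script.
  2. A browser driver. Each browser has its own WebDriver driver program, and the driver version has to match the browser version.
  3. A browser. Selenium currently supports most mainstream browsers, such as Firefox, Chrome, and IE.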

II. How does Selenium open the browser?

The simplest way to obtain a browser driver is usually code like the following (I use Firefox here):

from selenium import webdriver
driver = webdriver.Firefox()

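1. The WebDriver class

Let's explore the source of WebDriver. In an IDE, hold CTRL and click Firefox() to jump to it.
Start with the __init__ method, whose job is to start a new local Firefox session.
(The source is fairly long, so I do not reproduce all of it; open it yourself and follow along.)
Here are the parameters of __init__: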

def __init__(self, firefox_profile=None, firefox_binary=None,
                 timeout=30, capabilities=None, proxy=None,
                 executable_path="geckodriver", options=None,
                 service_log_path="geckodriver.log", firefox_options=None,
                 service_args=None, desired_capabilities=None, log_path=None,
                 keep_alive=True):

        ...  # (omitted here)

        if capabilities.get("marionette"):
            capabilities.pop("marionette")
            self.service = Service(  # instantiate a Service object
                executable_path,
                service_args=service_args,
                log_path=service_log_path)
            self.service.start()  # call the object's start() method

            capabilities.update(options.to_capabilities())

            executor = FirefoxRemoteConnection(  # create the connection to the driver
                remote_server_addr=self.service.service_url)
            RemoteWebDriver.__init__(  # initialize RemoteWebDriver
                self,
                command_executor=executor,
                desired_capabilities=capabilities,
                keep_alive=True)

        ...

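2. Starting the browser driver program

As shown above, the constructor first instantiates a Service object and then calls its start() method.
Let's look at the Service class first: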

    def __init__(self, executable_path, port=0, service_args=None,
                 log_path="geckodriver.log", env=None):
        """Creates a new instance of the GeckoDriver remote service proxy.

        GeckoDriver provides a HTTP interface speaking the W3C WebDriver
        protocol to Marionette.
        ...
        """

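The docstring of Service.__init__ tells us that initializing a Service object creates a new instance of a remote service proxy: GeckoDriver exposes an HTTP interface that speaks the W3C WebDriver protocol.

Once that instance exists, its start() method is called to launch the service: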

    def start(self):
        """
        Starts the Service.

        :Exceptions:
         - WebDriverException : Raised either when it can't start the service
           or when it can't connect to the service
        """
        try:
            cmd = [self.path]
            cmd.extend(self.command_line_args())
            self.process = subprocess.Popen(cmd, env=self.env,
                                            close_fds=platform.system() != 'Windows',
                                            stdout=self.log_file,
                                            stderr=self.log_file,
                                            stdin=PIPE)
        except TypeError:
            raise
        except OSError as err:
            if err.errno == errno.ENOENT:
                raise WebDriverException(
                    "'%s' executable needs to be in PATH. %s" % (
                        os.path.basename(self.path), self.start_error_message)
                )
            elif err.errno == errno.EACCES:
                raise WebDriverException(
                    "'%s' executable may have wrong permissions. %s" % (
                        os.path.basename(self.path), self.start_error_message)
                )
            else:
                raise
        except Exception as e:
            raise WebDriverException(
                "The executable %s needs to be available in the path. %s\n%s" %
                (os.path.basename(self.path), self.start_error_message, str(e)))
        count = 0
        while True:
            self.assert_process_still_running()
            if self.is_connectable():
                break
            count += 1
            time.sleep(1)
            if count == 30:
                raise WebDriverException("Can not connect to the Service %s" % self.path)

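As we can see, start() builds a command line (cmd) whose job is to launch the Firefox driver program, geckodriver, as a child process. It then polls once per second, for up to 30 attempts, until the service accepts connections. Once started, the driver listens on a local port and by default accepts connections only from the local machine. Note that 9515, which is often quoted at this point, is ChromeDriver's default port. Because Service defaults to port=0, the Python bindings normally pick a free port for geckodriver, and self.service.service_url reflects whichever port was chosen.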
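The launch-then-poll pattern in start() can be sketched in isolation. This is a minimal sketch, not Selenium's code: the helper names `free_port`, `is_connectable`, `start_service`, and `demo` are my own, and Python's built-in `http.server` stands in for geckodriver so that the example runs without a browser driver installed.

```python
import socket
import subprocess
import sys
import time


def free_port():
    """Ask the OS for an unused TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def is_connectable(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False


def start_service(cmd, port, attempts=30, delay=0.5):
    """Launch cmd as a child process and wait until its port answers."""
    process = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
    for _ in range(attempts):
        if process.poll() is not None:  # the child exited before it was ready
            raise RuntimeError("service process exited unexpectedly")
        if is_connectable(port):
            return process
        time.sleep(delay)
    process.kill()
    raise RuntimeError("cannot connect to the service")


def demo():
    """Start a stand-in service, check that it is reachable, then stop it."""
    port = free_port()
    # Stand-in for geckodriver (an assumption of this sketch):
    # Python's built-in HTTP server bound to the loopback interface.
    cmd = [sys.executable, "-m", "http.server", str(port),
           "--bind", "127.0.0.1"]
    process = start_service(cmd, port)
    try:
        return is_connectable(port)
    finally:
        process.terminate()
        process.wait()
```

Calling `demo()` starts the stand-in process, waits until its port answers, and shuts it down again. With a real driver, `cmd` would instead be the driver executable plus its port argument.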

3. Opening the browser

Back in the WebDriver class: once start() has launched the driver program, the constructor goes on to initialize RemoteWebDriver.
Stepping into that __init__ method, we see:

    def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
                 desired_capabilities=None, browser_profile=None, proxy=None,
                 keep_alive=False, file_detector=None, options=None):
        """
        Create a new driver that will issue commands using the wire protocol.

        :Args:
         - command_executor - Either a string representing URL of the remote server or a custom
             remote_connection.RemoteConnection object. Defaults to 'http://127.0.0.1:4444/wd/hub'.
         - desired_capabilities - A dictionary of capabilities to request when
             starting the browser session. Required parameter.
         - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object.
             Only used if Firefox is requested. Optional.
         - proxy - A selenium.webdriver.common.proxy.Proxy object. The browser session will
             be started with given proxy settings, if possible. Optional.
         - keep_alive - Whether to configure remote_connection.RemoteConnection to use
             HTTP keep-alive. Defaults to False.
         - file_detector - Pass custom file detector object during instantiation. If None,
             then default LocalFileDetector() will be used.
         - options - instance of a driver options.Options class
        """
        capabilities = {}
        if options is not None:
            capabilities = options.to_capabilities()
        if desired_capabilities is not None:
            if not isinstance(desired_capabilities, dict):
                raise WebDriverException("Desired Capabilities must be a dictionary")
            else:
                capabilities.update(desired_capabilities)
        if proxy is not None:
            warnings.warn("Please use FirefoxOptions to set proxy",
                          DeprecationWarning, stacklevel=2)
            proxy.add_to_capabilities(capabilities)
        self.command_executor = command_executor
        if type(self.command_executor) is bytes or isinstance(self.command_executor, str):
            self.command_executor = RemoteConnection(command_executor, keep_alive=keep_alive)
        self._is_remote = True
        self.session_id = None
        self.capabilities = {}
        self.error_handler = ErrorHandler()
        self.start_client()
        if browser_profile is not None:
            warnings.warn("Please use FirefoxOptions to set browser profile",
                          DeprecationWarning, stacklevel=2)
        self.start_session(capabilities, browser_profile)
        self._switch_to = SwitchTo(self)
        self._mobile = Mobile(self)
        self.file_detector = file_detector or LocalFileDetector()

In this initializer, note the following line (fourth from the end):
self.start_session(capabilities, browser_profile)
Following it into the source shows that this method creates a new session:

    def start_session(self, capabilities, browser_profile=None):
        """
        Creates a new session with the desired capabilities.
        """
        ...
        w3c_caps = _make_w3c_caps(capabilities)
        parameters = {"capabilities": w3c_caps,
                      "desiredCapabilities": capabilities}
        response = self.execute(Command.NEW_SESSION, parameters)
        if 'sessionId' not in response:
            response = response['value']
        self.session_id = response['sessionId']
        self.capabilities = response.get('value')
        ...

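How is the session created? The source contains this line:
response = self.execute(Command.NEW_SESSION, parameters)
So it calls execute() to run the command: the client sends a POST request to the driver's /session endpoint (that is, http://localhost:<port>/session, where <port> is the port the driver listens on). The driver launches the browser and returns a JSON response that contains the new session ID. From the outside, this appears as the browser window opening.

Summary: start the browser driver -> initialize RemoteWebDriver (send a POST request to the driver; a successful response means the browser is open).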
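To make the new-session exchange concrete, here is a minimal sketch of the request body and of the response handling quoted from start_session(). The function names are hypothetical, `build_new_session_payload` is a simplified stand-in for `_make_w3c_caps` (which does more work on the capabilities than shown here), and the fallback to a `capabilities` key is my assumption based on the shape of a W3C New Session response.

```python
def build_new_session_payload(capabilities):
    """Body of POST /session: W3C-style and legacy keys side by side.

    Simplified: every capability is placed under alwaysMatch unchanged.
    """
    return {
        "capabilities": {"alwaysMatch": dict(capabilities),
                         "firstMatch": [{}]},
        "desiredCapabilities": dict(capabilities),
    }


def parse_new_session_response(response):
    """Extract (session_id, capabilities) from either response shape."""
    # A W3C driver nests everything under "value"; a legacy driver puts
    # "sessionId" at the top level and the capabilities under "value".
    if "sessionId" not in response:
        response = response["value"]
    session_id = response["sessionId"]
    capabilities = response.get("value")
    if capabilities is None:  # assumed W3C shape
        capabilities = response.get("capabilities")
    return session_id, capabilities
```

For example, a W3C-style reply such as `{"value": {"sessionId": "abc", "capabilities": {...}}}` and a legacy reply such as `{"sessionId": "abc", "value": {...}}` both yield the same `(session_id, capabilities)` pair.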

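III. How does Selenium perform each operation?

Once the browser is open and we hold a driver object, we go on to perform all kinds of browser operations. How are those carried out?

1. Defining the endpoints

Back in the WebDriver class: right after start() is called, this line runs:
executor = FirefoxRemoteConnection( remote_server_addr=self.service.service_url)

Let's look at the FirefoxRemoteConnection class, which is very short: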

class FirefoxRemoteConnection(RemoteConnection):
    def __init__(self, remote_server_addr, keep_alive=True):
        RemoteConnection.__init__(self, remote_server_addr, keep_alive)

        self._commands["GET_CONTEXT"] = ('GET', '/session/$sessionId/moz/context')
        self._commands["SET_CONTEXT"] = ("POST", "/session/$sessionId/moz/context")
        self._commands["ELEMENT_GET_ANONYMOUS_CHILDREN"] = \
            ("POST", "/session/$sessionId/moz/xbl/$id/anonymous_children")
        self._commands["ELEMENT_FIND_ANONYMOUS_ELEMENTS_BY_ATTRIBUTE"] = \
            ("POST", "/session/$sessionId/moz/xbl/$id/anonymous_by_attribute")
        self._commands["INSTALL_ADDON"] = \
            ("POST", "/session/$sessionId/moz/addon/install")
        self._commands["UNINSTALL_ADDON"] = \
            ("POST", "/session/$sessionId/moz/addon/uninstall")

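Its initializer calls the initializer of RemoteConnection, so let's look at that class next.
According to its docstring, it represents a connection with the remote WebDriver server.
Here is its __init__ method: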

    def __init__(self, remote_server_addr, keep_alive=False, resolve_ip=True):
        # Attempt to resolve the hostname and get an IP address.

        ...
        self._commands = {
            Command.STATUS: ('GET', '/status'),
            Command.NEW_SESSION: ('POST', '/session'),
            Command.GET_ALL_SESSIONS: ('GET', '/sessions'),
            Command.QUIT: ('DELETE', '/session/$sessionId'),
            Command.GET_CURRENT_WINDOW_HANDLE:
                ('GET', '/session/$sessionId/window_handle'),
            Command.W3C_GET_CURRENT_WINDOW_HANDLE:
                ('GET', '/session/$sessionId/window'),
            Command.GET_WINDOW_HANDLES:
                ('GET', '/session/$sessionId/window_handles'),
            Command.W3C_GET_WINDOW_HANDLES:
                ('GET', '/session/$sessionId/window/handles'),
            Command.GET: ('POST', '/session/$sessionId/url'),
            Command.GO_FORWARD: ('POST', '/session/$sessionId/forward'),
            Command.GO_BACK: ('POST', '/session/$sessionId/back'),
            Command.REFRESH: ('POST', '/session/$sessionId/refresh'),
            Command.EXECUTE_SCRIPT: ('POST', '/session/$sessionId/execute'),
            Command.W3C_EXECUTE_SCRIPT:
                ('POST', '/session/$sessionId/execute/sync'),
            Command.W3C_EXECUTE_SCRIPT_ASYNC:
                ('POST', '/session/$sessionId/execute/async'),
            Command.GET_CURRENT_URL: ('GET', '/session/$sessionId/url'),
            Command.GET_TITLE: ('GET', '/session/$sessionId/title'),
            Command.GET_PAGE_SOURCE: ('GET', '/session/$sessionId/source'),
            Command.SCREENSHOT: ('GET', '/session/$sessionId/screenshot'),
            Command.ELEMENT_SCREENSHOT: ('GET', '/session/$sessionId/element/$id/screenshot'),
            Command.FIND_ELEMENT: ('POST', '/session/$sessionId/element'),
            Command.FIND_ELEMENTS: ('POST', '/session/$sessionId/elements'),
            Command.W3C_GET_ACTIVE_ELEMENT: ('GET', '/session/$sessionId/element/active'),
            # ... (many more entries, omitted here)
        }

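The _commands table above maps every Selenium operation to an HTTP method and an endpoint path. The endpoints themselves are implemented by the browser driver program, and every browser operation is carried out by calling one of them.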
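The lookup itself is simple enough to sketch. The following is a reduced illustration, not the real table: `COMMANDS` holds only a handful of entries copied from the excerpt above, the string keys and the `resolve` helper are my own naming, and the base URL is a placeholder.

```python
import string

# A few entries in the same (HTTP method, path template) shape as _commands.
COMMANDS = {
    "newSession": ("POST", "/session"),
    "get": ("POST", "/session/$sessionId/url"),
    "getTitle": ("GET", "/session/$sessionId/title"),
    "findElement": ("POST", "/session/$sessionId/element"),
    "elementScreenshot": ("GET", "/session/$sessionId/element/$id/screenshot"),
}


def resolve(command, params, base_url="http://127.0.0.1:4444"):
    """Turn a command name plus parameters into (HTTP method, full URL)."""
    method, template = COMMANDS[command]
    # string.Template replaces $sessionId and $id with values from params;
    # substitute() raises KeyError if a required placeholder is missing.
    path = string.Template(template).substitute(params)
    return method, base_url + path
```

So `resolve("getTitle", {"sessionId": "abc"})` gives a GET request to `/session/abc/title`, which is the same substitution step that RemoteConnection performs before sending a request.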

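2. Executing commands

We now know that a whole set of endpoints is defined for accessing and operating the browser. How are these commands actually executed?
The method that follows the command table in the same class is execute():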

    def execute(self, command, params):
        """
        Send a command to the remote server.

        Any path subtitutions required for the URL mapped to the command should be
        included in the command parameters.

        :Args:
         - command - A string specifying the command to execute.
         - params - A dictionary of named parameters to send with the command as
           its JSON payload.
        """
        command_info = self._commands[command]
        assert command_info is not None, 'Unrecognised command %s' % command
        path = string.Template(command_info[1]).substitute(params)
        if hasattr(self, 'w3c') and self.w3c and isinstance(params, dict) and 'sessionId' in params:
            del params['sessionId']
        data = utils.dump_json(params)
        url = '%s%s' % (self._url, path)
        return self._request(command_info[0], url, body=data)

    def _request(self, method, url, body=None):
        """
        Send an HTTP request to the remote server.

        :Args:
         - method - A string for the HTTP method to send the request with.
         - url - A string for the URL to send the request to.
         - body - A string for request body. Ignored unless method is POST or PUT.

        :Returns:
          A dictionary with the server's parsed JSON response.
        """
        LOGGER.debug('%s %s %s' % (method, url, body))

        parsed_url = parse.urlparse(url)
        headers = self.get_remote_connection_headers(parsed_url, self.keep_alive)
        resp = None
        if body and method != 'POST' and method != 'PUT':
            body = None

        if self.keep_alive:
            resp = self._conn.request(method, url, body=body, headers=headers)

            statuscode = resp.status
        else:
            http = urllib3.PoolManager(timeout=self._timeout)
            resp = http.request(method, url, body=body, headers=headers)

            statuscode = resp.status
            if not hasattr(resp, 'getheader'):
                if hasattr(resp.headers, 'getheader'):
                    resp.getheader = lambda x: resp.headers.getheader(x)
                elif hasattr(resp.headers, 'get'):
                    resp.getheader = lambda x: resp.headers.get(x)

        data = resp.data.decode('UTF-8')
        .....

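As we can see, execute() looks up the command, fills in the URL, and calls _request(), which uses urllib3 (a third-party HTTP library, not part of the standard library) to send the corresponding HTTP request to the driver. That is how each browser operation is carried out.

Opening the browser is itself just a request, and its response carries a session ID. The endpoint paths for later operations all contain the variable $sessionId, so it is easy to see that opening the browser and operating it are tied together by the session ID, which is what keeps every operation in the same browser instance.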
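The role of the session ID can be shown end to end without a real browser. This is a minimal sketch under stated assumptions: `FakeDriverHandler` is a toy stand-in for a browser driver that answers only two endpoints, the `execute` helper and the `demo-session` ID are hypothetical, and the standard library's `urllib.request` is swapped in for urllib3 so the example has no third-party dependency.

```python
import json
import string
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

SESSION_ID = "demo-session"  # hypothetical ID issued by the fake driver

COMMANDS = {
    "newSession": ("POST", "/session"),
    "getTitle": ("GET", "/session/$sessionId/title"),
}


class FakeDriverHandler(BaseHTTPRequestHandler):
    """Pretends to be a browser driver; no real browser is involved."""

    def _reply(self, value):
        body = json.dumps({"value": value}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the JSON payload
        if self.path == "/session":
            self._reply({"sessionId": SESSION_ID, "capabilities": {}})
        else:
            self._reply(None)

    def do_GET(self):
        if self.path == "/session/%s/title" % SESSION_ID:
            self._reply("Fake page")
        else:
            self._reply(None)

    def log_message(self, *args):  # keep the demo quiet
        pass


# Ignore any proxy settings so requests always reach the local server.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def execute(base_url, command, params):
    """Look up the command, fill in the URL, send it, parse the JSON reply."""
    method, template = COMMANDS[command]
    path = string.Template(template).substitute(params)
    body = json.dumps(params).encode("utf-8") if method == "POST" else None
    request = urllib.request.Request(
        base_url + path, data=body, method=method,
        headers={"Content-Type": "application/json"})
    with OPENER.open(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo():
    """Open a session, then reuse its ID in the URL of a later command."""
    server = HTTPServer(("127.0.0.1", 0), FakeDriverHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = "http://127.0.0.1:%d" % server.server_port
    try:
        session_id = execute(base_url, "newSession", {})["value"]["sessionId"]
        title = execute(base_url, "getTitle", {"sessionId": session_id})["value"]
        return session_id, title
    finally:
        server.shutdown()
        server.server_close()
```

`demo()` returns the session ID from the first request together with the title fetched by the second one. The second request only succeeds because the ID from the first response is substituted into its URL.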

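3. Summary of the flow

This may feel a little convoluted, so let's trace the source for a single command.
With a driver in hand, we call find_element_by_xpath (the Selenium 3 API; Selenium 4 uses find_element(By.XPATH, ...) instead):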

 self.driver.find_element_by_xpath("/html/body/section[1]/div/div/div/a").click()

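This jumps to a method of the WebDriver class in the remote package.
Following the call leads to the find_element method of the same class.
Following the method that find_element calls leads to the execute method, still in the same class.

There we find that the response comes from calling execute() on command_executor, that is, on the RemoteConnection object that defines the command table. Following that call leads into RemoteConnection.execute.
From there, as described above, the _request method sends the command to the remote server, the browser performs the operation, and the response is returned.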
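The chain traced above can be condensed into a small sketch. It is a simplified illustration, not the real classes: `TinyDriver` mimics only the delegation between the three methods, `RecordingExecutor` is a hypothetical stand-in for RemoteConnection that records commands instead of sending HTTP, and the returned element value is made up.

```python
class RecordingExecutor:
    """Stand-in for RemoteConnection: records commands, sends nothing."""

    def __init__(self):
        self.calls = []

    def execute(self, command, params):
        self.calls.append((command, params))
        return {"value": {"element-id": "e1"}}  # made-up response


class TinyDriver:
    """Mimics the call chain of the remote WebDriver class."""

    def __init__(self, command_executor, session_id):
        self.command_executor = command_executor
        self.session_id = session_id

    def find_element_by_xpath(self, xpath):
        # Step 1: the convenience method delegates to find_element.
        return self.find_element("xpath", xpath)

    def find_element(self, by, value):
        # Step 2: find_element builds the parameters and calls execute.
        response = self.execute("findElement", {"using": by, "value": value})
        return response["value"]

    def execute(self, driver_command, params=None):
        # Step 3: execute attaches the session ID and hands the command
        # to the command executor, which would send the HTTP request.
        params = dict(params or {})
        params["sessionId"] = self.session_id
        return self.command_executor.execute(driver_command, params)
```

After `TinyDriver(RecordingExecutor(), "abc").find_element_by_xpath("//a")`, the executor has recorded exactly one `findElement` command whose parameters include the locator and the session ID.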


Reference: https://www.cnblogs.com/linuxchao/p/linux-selenium-webdriver.html#autoid-0-4-2
My own understanding is limited, so if anything here is wrong, corrections are very welcome.
