
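The Essence of a Web Application

Once we understand the HTTP protocol and HTML documents, the essence of a Web application becomes clear:

  1. The browser sends an HTTP request;
  2. The server receives the request and generates an HTML document;
  3. The server sends the HTML document to the browser as the Body of the HTTP response;
  4. The browser receives the HTTP response, extracts the HTML document from the HTTP Body, and displays it.

So the simplest Web application is a set of HTML files saved ahead of time, plus an off-the-shelf HTTP server that accepts user requests, reads the HTML from a file, and returns it. That is exactly what common static servers such as Apache, Nginx, and Lighttpd do.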
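As a rough illustration of what a static server does for each request, the sketch below reads an HTML file and wraps it in a raw HTTP response. The function name `build_static_response` and the throwaway demo file are hypothetical, and real servers such as Nginx do far more (path mapping, caching, error handling).

```python
import os
import tempfile


def build_static_response(html_path):
    """Read an HTML file and wrap it in a minimal HTTP/1.1 response (sketch)."""
    with open(html_path, 'rb') as f:
        body = f.read()
    headers = (
        'HTTP/1.1 200 OK\r\n'
        'Content-Type: text/html; charset=utf-8\r\n'
        'Content-Length: {length}\r\n'
        '\r\n'
    ).format(length=len(body))
    # Status line and headers are text; the body is sent as raw bytes
    return headers.encode('ascii') + body


# Demo: write a throwaway HTML file, then build the response for it
with tempfile.TemporaryDirectory() as tmp:
    page = os.path.join(tmp, 'index.html')
    with open(page, 'w', encoding='utf-8') as f:
        f.write('<h1>Hello</h1>')
    response = build_static_response(page)
```

The returned bytes are what would be written to the client socket: a status line, headers, a blank line, then the file content.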

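The Problem

To generate HTML dynamically, we have to implement the steps above ourselves. However, accepting HTTP requests, parsing them, and sending HTTP responses is grunt work. If we wrote that low-level code ourselves, we would spend a month or so reading the HTTP specification before writing a single line of dynamic HTML.

The right approach is to let dedicated server software handle the low-level code while we use Python to focus on generating HTML documents. Since we do not want to deal with TCP connections or the raw HTTP request and response formats, we need a uniform interface that lets us concentrate on writing Web business logic in Python.

That interface is WSGI: the Web Server Gateway Interface.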

WSGI

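WSGI requires the application to be a callable object (a function, a class, or an instance that implements __call__). The callable must accept two arguments, and its return value must be an iterable of byte strings.

Here is the simplest possible application: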

python
HELLO_WORLD = b"Hello world!\n"

def application(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

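application is a function, so it is certainly callable. It takes two parameters, environ and start_response:

  • environ is a dictionary that holds everything related to the HTTP request, such as headers and request parameters.
  • start_response is a function passed in by the WSGI server; the application calls it to hand the status code and response headers to the server.

Calling start_response passes the status code and response headers to the server, while the response body is what application returns to the server. Together, the two supply a complete HTTP response.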
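To make the role of environ more concrete, here is a minimal sketch of an application that generates HTML dynamically from the query string. The names `greeting_app` and `fake_start_response` are made up for this example; `QUERY_STRING` is one of the standard CGI-style keys a WSGI server puts into environ.

```python
from html import escape
from urllib.parse import parse_qs


def greeting_app(environ, start_response):
    """Hypothetical app: greets the name given in the query string."""
    # QUERY_STRING is the part of the URL after '?', e.g. 'name=WSGI'
    query = parse_qs(environ.get('QUERY_STRING', ''))
    name = query.get('name', ['world'])[0]
    # Escape user input before embedding it in HTML
    body = '<h1>Hello, {}!</h1>'.format(escape(name)).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


# Call the app by hand, playing the role of the server
captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured['status'] = status
    captured['headers'] = headers


chunks = greeting_app({'QUERY_STRING': 'name=WSGI'}, fake_start_response)
```

Because the application is just a callable, it can be exercised like this without any server or socket, which is also a convenient way to unit-test it.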

Every Web framework that implements WSGI exposes a callable object like this.
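The callable does not have to be a function. The sketch below shows a toy "framework" whose instance is the WSGI application because it implements __call__. `TinyFramework` and its `route` decorator are hypothetical names for illustration only, not the API of any real framework, and the routing is deliberately naive (exact path match, plain-text responses).

```python
class TinyFramework:
    """Hypothetical micro-framework: maps paths to handler functions."""

    def __init__(self):
        self.routes = {}

    def route(self, path):
        # Decorator that registers a handler for an exact path
        def register(handler):
            self.routes[path] = handler
            return handler
        return register

    def __call__(self, environ, start_response):
        # This method is what makes the instance a WSGI application
        handler = self.routes.get(environ.get('PATH_INFO', '/'))
        if handler is None:
            status, text = '404 Not Found', 'Not Found'
        else:
            status, text = '200 OK', handler()
        body = text.encode('utf-8')
        start_response(status, [
            ('Content-Type', 'text/plain; charset=utf-8'),
            ('Content-Length', str(len(body))),
        ])
        return [body]


app = TinyFramework()


@app.route('/hello')
def hello():
    return 'Hello from a route'
```

A WSGI server only ever sees `app(environ, start_response)`; everything else (routing, handlers) is the framework's own business.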

The WSGI Server

For each HTTP request, the WSGI server builds the environ object, calls the application object, and finally returns the HTTP response to the browser.
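Those three steps can be sketched without any networking at all. The helper `run_wsgi_app` below is a hypothetical name, and the environ it builds contains only a small set of keys, enough for simple applications; a real server fills in many more from the actual request.

```python
import io
import sys


def run_wsgi_app(app, method='GET', path='/'):
    """Hypothetical helper: do the three server steps without any socket."""
    # 1. Build environ (only a handful of keys, enough for a simple app)
    environ = {
        'REQUEST_METHOD': method,
        'PATH_INFO': path,
        'QUERY_STRING': '',
        'SERVER_NAME': 'localhost',
        'SERVER_PORT': '8080',
        'SERVER_PROTOCOL': 'HTTP/1.1',
        'wsgi.version': (1, 0),
        'wsgi.url_scheme': 'http',
        'wsgi.input': io.BytesIO(b''),
        'wsgi.errors': sys.stderr,
        'wsgi.multithread': False,
        'wsgi.multiprocess': False,
        'wsgi.run_once': False,
    }
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = response_headers

    # 2. Call the application and collect the body chunks
    result = app(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        # PEP 3333: call close() on the result if it has one
        if hasattr(result, 'close'):
            result.close()

    # 3. Assemble the raw HTTP response
    head = 'HTTP/1.1 {}\r\n'.format(captured['status'])
    for name, value in captured['headers']:
        head += '{}: {}\r\n'.format(name, value)
    return (head + '\r\n').encode('iso-8859-1') + body


def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!\n']


raw = run_wsgi_app(demo_app, path='/hello')
```

A real server does the same work, except that environ comes from parsing the bytes read off a socket and the assembled response is written back to that socket.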

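Below is the complete code of a WSGI server. It assumes the application function from the earlier example is defined in the same file.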

python
import io
import socket
import sys
from email.utils import formatdate

class WSGIServer(object):
    address_family = socket.AF_INET
    socket_type = socket.SOCK_STREAM
    request_queue_size = 1

    def __init__(self, server_address):
        # Create a listening socket
        self.listen_socket = listen_socket = socket.socket(
            self.address_family,
            self.socket_type
        )
        # Allow to reuse the same address
        listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Bind
        listen_socket.bind(server_address)
        # Activate
        listen_socket.listen(self.request_queue_size)
        # Get server host name and port
        host, port = self.listen_socket.getsockname()[:2]
        self.server_name = socket.getfqdn(host)
        self.server_port = port
        # Return headers set by Web framework/Web application
        self.headers_set = []

    def set_app(self, application):
        self.application = application

    def serve_forever(self):
        listen_socket = self.listen_socket
        while True:
            # New client connection
            self.client_connection, client_address = listen_socket.accept()
            # Handle one request and close the client connection. Then
            # loop over to wait for another client connection
            self.handle_one_request()

    def handle_one_request(self):
        # recv() returns bytes in Python 3; keep them as-is for wsgi.input
        self.request_data = request_data = self.client_connection.recv(1024)
        if not request_data:
            # The client connected but sent nothing; drop the connection
            self.client_connection.close()
            return
        # HTTP request lines and headers decode safely as ISO-8859-1
        request_text = request_data.decode('iso-8859-1')
        # Print formatted request data a la 'curl -v'
        print(''.join(
            '< {line}\n'.format(line=line)
            for line in request_text.splitlines()
        ))
        self.parse_request(request_text)
        # Construct environment dictionary using request data
        env = self.get_environ()
        # It's time to call our application callable and get
        # back a result that will become HTTP response body
        result = self.application(env, self.start_response)
        # Construct a response and send it back to the client
        self.finish_response(result)

    def parse_request(self, text):
        # text is the decoded (str) request; the first line is the request line
        request_line = text.splitlines()[0]
        request_line = request_line.rstrip('\r\n')
        # Break down the request line into components
        (self.request_method,  # GET
         self.path,  # /hello
         self.request_version  # HTTP/1.1
         ) = request_line.split()

    def get_environ(self):
        env = {}
        # The following code snippet does not follow PEP8 conventions
        # but it's formatted the way it is for demonstration purposes
        # to emphasize the required variables and their values
        #
        # Required WSGI variables
        env['wsgi.version'] = (1, 0)
        env['wsgi.url_scheme'] = 'http'
        # wsgi.input must be a byte stream, so wrap the raw request bytes
        env['wsgi.input'] = io.BytesIO(self.request_data)
        env['wsgi.errors'] = sys.stderr
        env['wsgi.multithread'] = False
        env['wsgi.multiprocess'] = False
        env['wsgi.run_once'] = False
        # Required CGI variables
        env['REQUEST_METHOD'] = self.request_method  # GET
        env['PATH_INFO'] = self.path  # /hello
        env['SERVER_NAME'] = self.server_name  # localhost
        env['SERVER_PORT'] = str(self.server_port)  # 8080
        return env

    def start_response(self, status, response_headers, exc_info=None):
        # Add necessary server headers
        server_headers = [
            # Current time in HTTP date format, e.g. '... GMT'
            ('Date', formatdate(usegmt=True)),
            ('Server', 'WSGIServer 0.2'),
        ]
        self.headers_set = [status, response_headers + server_headers]
        # To adhere to WSGI specification the start_response must return
        # a 'write' callable. For simplicity's sake we'll ignore that detail
        # for now.
        # return self.finish_response

    def finish_response(self, result):
        try:
            status, response_headers = self.headers_set
            response = 'HTTP/1.1 {status}\r\n'.format(status=status)
            for header in response_headers:
                response += '{0}: {1}\r\n'.format(*header)
            response += '\r\n'
            # Status line and headers are text; the body chunks are bytes
            response_bytes = response.encode('iso-8859-1')
            for data in result:
                response_bytes += data
            # Print formatted response data a la 'curl -v'
            print(''.join(
                '> {line}\n'.format(line=line)
                for line in response_bytes.decode('utf-8', 'replace').splitlines()
            ))
            self.client_connection.sendall(response_bytes)
        finally:
            self.client_connection.close()

SERVER_ADDRESS = (HOST, PORT) = 'localhost', 8080

def make_server(server_address, application):
    server = WSGIServer(server_address)
    server.set_app(application)
    return server

if __name__ == '__main__':
    httpd = make_server(SERVER_ADDRESS, application)
    print('WSGIServer: Serving HTTP on port {port} ...\n'.format(port=PORT))
    httpd.serve_forever()

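Of course, if you only need a server for development, there is no need to reinvent the wheel like this: Python's standard library already ships a WSGI server in the wsgiref module.

python
from wsgiref.simple_server import make_server
srv = make_server('localhost', 8080, application)
srv.serve_forever()

Just three lines of code give you a WSGI server. Super convenient, isn't it? Finally, open http://localhost:8080 in a browser to see the result of a request: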
[Screenshot: image.png]
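The same check can be scripted instead of done in a browser. The sketch below is one possible way, under a few assumptions made only for this demo: it binds to 127.0.0.1 on port 0 (so the OS picks a free port rather than the fixed 8080), serves exactly one request in a background thread, and fetches it with urllib.

```python
import threading
import urllib.request
from wsgiref.simple_server import make_server


def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello world!\n']


# Port 0 asks the OS for any free port (an assumption made for this demo)
srv = make_server('127.0.0.1', 0, application)
port = srv.server_port

# Serve exactly one request in the background, then stop
worker = threading.Thread(target=srv.handle_request)
worker.start()

# Bypass any system proxy so the request reaches the local server
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open('http://127.0.0.1:{}/'.format(port), timeout=5) as resp:
    status = resp.status
    body = resp.read()

worker.join()
srv.server_close()
```

After it runs, `status` and `body` hold what the browser would have received from the same application.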

Contributors

jiechen
