3. HTTP Proxy
• There are some proxies for caching or load balancing
• But the “proxy” in this talk is a little different from these
4. Do you know Proxomitron?
• http://www.proxomitron.info/
• Active from 1999 until 2003
5. Local debug proxy
• Intercept and modify the HTTP request/response
[Diagram: requests and responses pass through the local proxy, which logs and modifies them]
6. Major debugging proxies
• Useful for debugging and security testing
• Burp Proxy
• https://portswigger.net/burp/proxy.html
• Fiddler
• http://www.telerik.com/fiddler
• OWASP ZAP
• https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project
• Charles
• https://www.charlesproxy.com/
• mitmproxy
• https://mitmproxy.org/
7. These are useful but …
• Not intended for automated translation
• Not intended for large-scale logging and statistics
• Extensible, but not conveniently so
• I need a proxy like tcpdump (or like tail -f)
• I need a proxy that is easy to use with crawlers
• I need a fully customizable proxy
8. proxy2
• https://github.com/inaz2/proxy2
• Single Python script
• Requires no external modules
• Supports IPv6
• Supports HTTP/1.1 persistent connections
• Supports HTTPS relay/intercept
• Easy to customize with Python!
15. Design policy
• Keep it simple, with few dependencies
• Single Python script
• Use standard modules only
• Implement it as a base class
• Provide {request,response,save}_handler()
• Users derive the class and override each handler
• Default handlers dump HTTP headers and some useful info
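The base-class design can be sketched in a few lines. This is only an illustration of the pattern, not proxy2's actual code: the class names `ProxyHandlerBase` and `AddDebugHeaderProxy`, the dict-shaped `req`/`res` objects, and the `X-Debug` header are all assumptions made for the example.

```python
class ProxyHandlerBase:
    """Hypothetical stand-in for the proxy base class (not proxy2's real code)."""

    def request_handler(self, req):
        # Default behaviour: dump the request line and headers, change nothing.
        print(req["method"], req["path"])
        for name, value in req["headers"].items():
            print("%s: %s" % (name, value))
        return req

    def response_handler(self, req, res):
        # Default behaviour: dump the status and headers, change nothing.
        print(res["status"])
        for name, value in res["headers"].items():
            print("%s: %s" % (name, value))
        return res

    def save_handler(self, req, res):
        # Default behaviour: nothing to persist.
        pass


class AddDebugHeaderProxy(ProxyHandlerBase):
    """A user-defined proxy: override only the handler you care about."""

    def request_handler(self, req):
        req["headers"]["X-Debug"] = "1"  # hypothetical header, for illustration
        return req
```

A subclass that overrides nothing keeps the dumping behaviour; one that overrides a single handler changes only that step.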
16. Connection flow and handlers
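[Diagram: client <-> proxy2 <-> server]
• client -> proxy2: Request
• request_handler(req) (modify the request)
• proxy2 -> server: Request
• server -> proxy2: Response
• response_handler(req, res) (modify the response)
• proxy2 -> client: Response
• save_handler(req, res) (task that takes a long time)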
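The order in which the three handlers run can be written down as one function. This is a sketch of the flow as the diagram reads, not proxy2's implementation: `proxy_exchange`, `TracingHandlers`, and the two callbacks are hypothetical names, and placing `save_handler` after the reply to the client is an assumption based on its "takes a long time" label.

```python
def proxy_exchange(handlers, req, forward_to_server, reply_to_client):
    """Run one request/response exchange through the three handlers."""
    req = handlers.request_handler(req)        # modify the request
    res = forward_to_server(req)               # proxy -> server -> proxy
    res = handlers.response_handler(req, res)  # modify the response
    reply_to_client(res)                       # proxy -> client
    handlers.save_handler(req, res)            # slow work, after the reply is sent
    return res


class TracingHandlers:
    """Handlers that only record the order in which they were called."""

    def __init__(self):
        self.trace = []

    def request_handler(self, req):
        self.trace.append("request_handler")
        return req

    def response_handler(self, req, res):
        self.trace.append("response_handler")
        return res

    def save_handler(self, req, res):
        self.trace.append("save_handler")
```

Running the slow step last means the client does not wait for logging or storage to finish.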
17. Making an HTTP server is easy
• Use BaseHTTPServer module
• https://hg.python.org/cpython/file/2.7/Lib/BaseHTTPServer.py
• Server with multi-threading and IPv6 support
• Request handler
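A minimal sketch of such a server follows. It uses the Python 3 module names (`http.server`, `socketserver`) rather than the Python 2 `BaseHTTPServer` named above, which is a substitution made here so the example runs on current Python. The class names, `HelloHandler`, and the `fetch_once` helper are hypothetical.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class ThreadingHTTPServer6(ThreadingMixIn, HTTPServer):
    """One thread per connection, listening on an IPv6 socket."""
    address_family = socket.AF_INET6
    daemon_threads = True


class ThreadingHTTPServer4(ThreadingMixIn, HTTPServer):
    """Same server on IPv4, for hosts where IPv6 is unavailable."""
    daemon_threads = True


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = ("you asked for %s\n" % self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def fetch_once(server_class=ThreadingHTTPServer4, host="127.0.0.1", path="/hello"):
    """Start the server on a free port, send one GET, and return the body."""
    server = server_class((host, 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", path)
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
```

Passing `ThreadingHTTPServer6` and `host="::1"` to `fetch_once` exercises the IPv6 variant on machines that have an IPv6 loopback.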
19. HTTP/1.1 Persistent Connection
• Reuse the connection to the same server
• httplib.HTTPConnection()
• Low-level HTTP client
• threading.local()
• Thread-local storage (as the server is multi-threaded)
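One way to combine the two pieces is a per-thread cache of connections keyed by origin. The sketch below uses `http.client`, the Python 3 name for `httplib`. The helper name `get_connection` and the `(scheme, netloc)` key are assumptions for illustration.

```python
import http.client
import threading

_tls = threading.local()  # each server thread sees its own attributes


def get_connection(scheme, netloc, timeout=10):
    """Return this thread's connection to (scheme, netloc), creating it once."""
    if not hasattr(_tls, "conns"):
        _tls.conns = {}
    key = (scheme, netloc)
    if key not in _tls.conns:
        if scheme == "https":
            conn_class = http.client.HTTPSConnection
        else:
            conn_class = http.client.HTTPConnection
        # The constructor does not open a socket; that happens on first request().
        _tls.conns[key] = conn_class(netloc, timeout=timeout)
    return _tls.conns[key]
```

Because the cache lives in thread-local storage, two handler threads never share one connection object, so no locking is needed around `request()` and `getresponse()`.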
20. Content-Encoding
• Response body can be compressed
• For the handlers, proxy2 decompresses the body and re-compresses it afterwards
• gzip and zlib (deflate) modules
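The decompress, handle, re-compress round trip might look like the sketch below, using the standard `gzip` and `zlib` modules. The function names are hypothetical, and the fallback to raw deflate (for servers that send deflate data without the zlib header) is an assumption about what a tolerant proxy would want.

```python
import gzip
import zlib


def decode_body(data, encoding):
    """Undo the Content-Encoding so a handler sees the plain body."""
    if encoding in (None, "", "identity"):
        return data
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(data)
    if encoding == "deflate":
        try:
            return zlib.decompress(data)
        except zlib.error:
            # Some servers send a raw deflate stream with no zlib header.
            return zlib.decompress(data, -zlib.MAX_WBITS)
    raise ValueError("unknown Content-Encoding: %r" % (encoding,))


def encode_body(data, encoding):
    """Re-apply the Content-Encoding before the body goes back on the wire."""
    if encoding in (None, "", "identity"):
        return data
    if encoding in ("gzip", "x-gzip"):
        return gzip.compress(data)
    if encoding == "deflate":
        return zlib.compress(data)
    raise ValueError("unknown Content-Encoding: %r" % (encoding,))


def apply_to_body(handler, data, encoding):
    """Decompress, let the handler rewrite the plain body, then re-compress."""
    return encode_body(handler(decode_body(data, encoding)), encoding)
```

After a rewrite the compressed length usually changes, so the Content-Length header has to be recomputed from the re-encoded body.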
21. Hop-by-hop Headers
• RFC 2616 (now obsoleted) requires a proxy to remove the following headers:
• Connection, Keep-Alive, Proxy-Authenticate, Proxy-Authorization,
TE, Trailers, Transfer-Encoding, Upgrade
• RFC 7230 no longer defines the implicit list
• "hop-by-hop" header fields are required to appear in the Connection
header field (A.2)
• http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1771.html
• Even so, proxy2 removes the above headers for compatibility
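Both rules fit in one small filter: drop the fixed RFC 2616 list, and also drop whatever the Connection header nominates, as RFC 7230 expects. This is a sketch under the assumption that headers are held as a list of `(name, value)` pairs; the function name is hypothetical and this is not necessarily how proxy2 does it.

```python
# The fixed list from RFC 2616, lower-cased for case-insensitive matching.
HOP_BY_HOP = frozenset([
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
])


def filter_hop_by_hop(headers):
    """Return a new header list without hop-by-hop fields.

    headers is a list of (name, value) pairs.
    """
    nominated = set()
    for name, value in headers:
        if name.lower() == "connection":
            # RFC 7230 style: fields listed in Connection are hop-by-hop too.
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    nominated.add(token)
    drop = HOP_BY_HOP | nominated
    return [(name, value) for name, value in headers if name.lower() not in drop]
```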
22. Handling HTTPS
• HTTPS = HTTP over SSL/TLS
• When you access “https://www.example.com/” through a proxy, the client first sends this HTTP request to the proxy:
• CONNECT www.example.com:443 HTTP/1.1
• The proxy returns the HTTP response:
• HTTP/1.1 200 Connection Established
• After that, the client starts the SSL/TLS handshake and encrypted transmission
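On the proxy side, the CONNECT exchange comes down to parsing `host:port` out of the request line and answering with a bare 200. The sketch below shows that step only; `parse_connect_line` and `CONNECT_ESTABLISHED` are hypothetical names, and the bracket handling for IPv6 literals is an assumption about the inputs worth supporting.

```python
# What the proxy writes back before the client starts its TLS handshake.
CONNECT_ESTABLISHED = b"HTTP/1.1 200 Connection Established\r\n\r\n"


def parse_connect_line(line):
    """Parse 'CONNECT host:port HTTP/1.1' and return (host, port)."""
    parts = line.strip().split(" ")
    if len(parts) != 3 or parts[0] != "CONNECT":
        raise ValueError("not a CONNECT request: %r" % (line,))
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("bad CONNECT target: %r" % (parts[1],))
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]  # IPv6 literal such as [::1]:8443
    return host, int(port)
```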
23. HTTPS relay
• Just relay handshakes and encrypted payloads
• proxy2 can’t understand the content
[Diagram: client <-> proxy2 <-> server. The client sends CONNECT and receives “Connection Established”; the handshake and encrypted transmission then pass through proxy2 unchanged.]
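Relaying without understanding the content means copying bytes in both directions until one side closes. A minimal sketch with `select` follows; the function name, buffer size, and idle timeout are assumptions, and the caller is expected to close both sockets afterwards.

```python
import select


def relay(client_sock, server_sock, bufsize=8192, idle_timeout=10.0):
    """Copy bytes both ways until either side closes or the link goes idle."""
    socks = [client_sock, server_sock]
    while True:
        readable, _, errored = select.select(socks, [], socks, idle_timeout)
        if errored or not readable:
            return  # socket error or idle timeout
        for src in readable:
            dst = server_sock if src is client_sock else client_sock
            data = src.recv(bufsize)
            if not data:
                return  # one side closed the connection
            dst.sendall(data)  # opaque bytes: TLS records are never parsed
```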
25. HTTPS intercept (Man-in-the-Middle)
• The proxy generates a certificate for the requested domain
• And works as an HTTPS server with the generated certificate
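[Diagram: client <-> proxy2 <-> server. After CONNECT and “Connection Established”, the client performs its handshake and transmission with proxy2, and proxy2 performs a separate handshake and transmission with the server.]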
26. HTTPS intercept (Man-in-the-Middle)
• ssl.wrap_socket()
• Make a socket over SSL/TLS
• using a private key and the certificate for the corresponding public key
• Wrap BaseHTTPRequestHandler.connection
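The module-level `ssl.wrap_socket()` was deprecated in Python 3.7 and removed in Python 3.12, so the sketch below uses `ssl.SSLContext.wrap_socket()` instead. The two function names are hypothetical. The handler is assumed to be a `BaseHTTPRequestHandler` that has already answered the CONNECT request with 200.

```python
import ssl


def make_intercept_context(certfile, keyfile):
    """Server-side TLS context holding the certificate generated for one host."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx


def upgrade_handler_to_tls(handler, ctx):
    """Swap the handler's plain socket for a TLS one and rebuild its file objects."""
    handler.connection = ctx.wrap_socket(handler.connection, server_side=True)
    handler.rfile = handler.connection.makefile("rb", handler.rbufsize)
    handler.wfile = handler.connection.makefile("wb")
```

Once the socket is wrapped, the handler can read the next request from `rfile` as ordinary HTTP, which is what lets the handlers see decrypted traffic.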
27. Generating SSL/TLS certificates
• In this case, proxy2 depends on OpenSSL
• As you know, poor implementations cause severe security risks
• OpenSSL creates a Certificate Authority “proxy2 CA” and generates certificates signed by the CA
• The browser can install the CA certificate from “http://proxy2.test/” through proxy2
[Diagram: “proxy2 CA” signs the generated certificates; the client trusts the CA’s signature (“I’ll trust your sign.”)]
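Driving the `openssl` command line from Python could look like the sketch below. It uses only the standard `genrsa`, `req`, and `x509` subcommands, but the file names, 2048-bit key size, 3650-day validity, and timestamp-based serial number are assumptions chosen for the example, not settings taken from proxy2.

```python
import os
import subprocess
import time


def run_openssl(*args):
    """Run one openssl subcommand and raise CalledProcessError on failure."""
    subprocess.run(("openssl",) + args, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def make_ca(workdir, common_name="proxy2 CA"):
    """Create a CA private key and a self-signed CA certificate."""
    ca_key = os.path.join(workdir, "ca.key")
    ca_crt = os.path.join(workdir, "ca.crt")
    run_openssl("genrsa", "-out", ca_key, "2048")
    run_openssl("req", "-new", "-x509", "-days", "3650", "-key", ca_key,
                "-out", ca_crt, "-subj", "/CN=" + common_name)
    return ca_key, ca_crt


def issue_cert(workdir, hostname, ca_key, ca_crt):
    """Create a key and a certificate for hostname, signed by the CA."""
    key = os.path.join(workdir, hostname + ".key")
    csr = os.path.join(workdir, hostname + ".csr")
    crt = os.path.join(workdir, hostname + ".crt")
    serial = str(int(time.time() * 1000))  # assumption: a timestamp as serial
    run_openssl("genrsa", "-out", key, "2048")
    run_openssl("req", "-new", "-key", key, "-subj", "/CN=" + hostname, "-out", csr)
    run_openssl("x509", "-req", "-in", csr, "-days", "3650", "-CA", ca_crt,
                "-CAkey", ca_key, "-set_serial", serial, "-out", crt)
    return key, crt
```

Only `ca.crt` needs to be installed in the browser; `ca.key` stays with the proxy, since anyone who holds it can mint certificates that the browser will accept.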
30. Recap
• Proxy is fun
• Python’s “batteries” are very powerful
• BaseHTTPServer, httplib, threading, gzip, zlib, select, ssl
• An HTTP proxy is easy to understand but not simple
• proxy2 made it simple