Diffstat (limited to 'docs/pathod/intro.rst')
-rw-r--r-- | docs/pathod/intro.rst | 307 |
1 files changed, 307 insertions, 0 deletions
diff --git a/docs/pathod/intro.rst b/docs/pathod/intro.rst
new file mode 100644
index 00000000..f4c8b974
--- /dev/null
+++ b/docs/pathod/intro.rst
@@ -0,0 +1,307 @@
+.. _intro:
+
+Pathology 101
+=============
+
+
+pathod
+------
+
+Pathod is a pathological HTTP daemon designed to let you craft almost any
+conceivable HTTP response, including ones that creatively violate the
+standards. HTTP responses are specified using a :ref:`small, terse language
+<language>` which pathod shares with its evil twin :ref:`pathoc`. To start
+playing with pathod, fire up the daemon:
+
+>>> pathod
+
+By default, the service listens on port 9999 of localhost, and the default
+crafting anchor point is the path **/p/**. Anything after this URL prefix is
+treated as a response specifier. So, hitting the following URL will generate an
+HTTP 200 response with 100 bytes of random data:
+
+    http://localhost:9999/p/200:b@100
+
+See the :ref:`language documentation <language>` to get (much) fancier. The
+pathod daemon also takes a range of configuration options. To view those, use
+the command-line help:
+
+>>> pathod --help
+
+Mimicking a proxy
+^^^^^^^^^^^^^^^^^
+
+Pathod automatically responds to both straight HTTP and proxy requests. For
+proxy requests, the upstream host is ignored, and the path portion of the URL
+is used to match anchors. This lets you test software that supports a proxy
+configuration by spoofing responses from upstream servers.
+
+By default, we treat all proxy CONNECT requests as HTTPS traffic, serving the
+response using either pathod's built-in certificates, or the cert/key pair
+specified by the user. If you're testing a client that makes a non-SSL CONNECT
+request, you can override this behaviour with the **-C** command-line
+option.
+
+Anchors
+^^^^^^^
+
+Anchors provide an alternative to specifying the response in the URL. Instead,
+you attach a response to a pre-configured anchor point, specified with a regex.
+When a URL matching the regex is requested, the specified response is served.
+
+>>> pathod -a "/foo=200"
+
+Here, "/foo" is the regex specifying the anchor path, and the part after the "="
+is a response specifier.
+
+
+File Access
+^^^^^^^^^^^
+
+There are two operators in the :ref:`language <language>` that load contents
+from file - the **+** operator to load an entire request specification from
+file, and the **>** value specifier. In pathod, both of these operators are
+restricted to a directory specified at startup, or disabled if no directory is
+specified:
+
+>>> pathod -d ~/staticdir
+
+
+Internal Error Responses
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pathod uses the non-standard 800 response code to indicate internal errors, to
+distinguish them from crafted responses. For example, a request to:
+
+    http://localhost:9999/p/foo
+
+... will return an 800 response because "foo" is not a valid response specifier.
+
+
+
+
+.. _pathoc:
+
+
+pathoc
+------
+
+Pathoc is a perverse HTTP client designed to let you craft almost any
+conceivable HTTP request, including ones that creatively violate the standards.
+HTTP requests are specified using a :ref:`small, terse language <language>`,
+which pathoc shares with its server-side twin pathod. To view pathoc's complete
+range of options, use the command-line help:
+
+>>> pathoc --help
+
+
+Getting Started
+^^^^^^^^^^^^^^^
+
+The basic pattern for pathoc commands is as follows:
+
+    pathoc hostname request [request ...]
+
+That is, we specify the hostname to connect to, followed by one or more
+requests. Let's start with a simple example::
+
+    > pathoc google.com get:/
+    07-06-16 12:13:43: >> 'GET':/
+    << 302 Found: 261 bytes
+
+Here, we make a GET request to the path / on port 80 of google.com. Pathoc's
+output tells us that the server responded with a 302 redirection.
+We can tell pathoc to connect using SSL, in which case the default port is changed to
+443 (you can override the default port with the **-p** command-line option)::
+
+    > pathoc -s www.google.com get:/
+    07-06-16 12:14:56: >> 'GET':/
+    << 302 Found: 262 bytes
+
+
+Multiple Requests
+^^^^^^^^^^^^^^^^^
+
+There are two ways to tell pathoc to issue multiple requests. The first is to specify
+them on the command-line, like so::
+
+    > pathoc google.com get:/ get:/
+    07-06-16 12:21:04: >> 'GET':/
+    << 302 Found: 261 bytes
+    07-06-16 12:21:04: >> 'GET':/
+    << 302 Found: 261 bytes
+
+In this case, pathoc issues the specified requests over the same TCP connection,
+so in the above example only one connection is made to google.com.
+
+The other way to issue multiple requests is to use the **-n** flag::
+
+    > pathoc -n 2 google.com get:/
+    07-06-16 12:21:04: >> 'GET':/
+    << 302 Found: 261 bytes
+    07-06-16 12:21:04: >> 'GET':/
+    << 302 Found: 261 bytes
+
+The output is identical, but two separate TCP connections are made to the
+upstream server. These two specification styles can be combined::
+
+    pathoc -n 2 google.com get:/ get:/
+
+
+Here, two distinct TCP connections are made, with two requests issued over
+each.
+
+
+
+Basic Fuzzing
+^^^^^^^^^^^^^
+
+The combination of pathoc's powerful request specification language and a few
+of its command-line options makes for quite a capable basic fuzzer. Here's an
+example::
+
+    pathoc -e -I 200 -t 2 -n 1000 localhost get:/:b@10:ir,@1
+
+The request specified here is a valid GET with a body consisting of 10 random bytes,
+but with 1 random byte inserted in a random place. This could be in the headers,
+in the initial request line, or in the body itself. There are a few things
+to note here:
+
+- Corrupting the request in this way will often make the server enter a state where
+  it's awaiting more input from the client.
+  This is where the **-t** option comes in, which sets a timeout that causes
+  pathoc to disconnect after two seconds.
+- The **-n** option tells pathoc to repeat the request 1000 times.
+- The **-I** option tells pathoc to ignore HTTP 200 response codes.
+  You can use this to fine-tune what pathoc considers to be an exceptional
+  condition, and therefore log-worthy.
+- The **-e** option tells pathoc to print an explanation of each logged
+  request, in the form of an expanded pathoc specification with all random
+  portions and automatic header additions resolved. This lets you precisely
+  replay a request that triggered an error.
+
+
+Interacting with Proxies
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pathoc has a reasonably sophisticated suite of features for interacting with
+proxies. The proxy request syntax very closely mirrors that of straight HTTP,
+which means that it is possible to make proxy-style requests using pathoc
+without any additional syntax, by simply specifying a full URL instead of a
+simple path:
+
+>>> pathoc -p 8080 localhost "get:'http://google.com'"
+
+Another common use case is to use an HTTP CONNECT request to probe remote
+servers via a proxy. This is done with the **-c** command-line option, which
+allows you to specify a remote host and port pair:
+
+>>> pathoc -c google.com:80 -p 8080 localhost get:/
+
+Note that pathoc does **not** negotiate SSL without being explicitly instructed
+to do so. If you're making a CONNECT request to an SSL-protected resource, you
+must also pass the **-s** flag:
+
+>>> pathoc -sc google.com:443 -p 8080 localhost get:/
+
+
+
+Embedded response specification
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+One interesting feature of the request specification language is that you can
+embed a response specification in it, which is then added to the request path.
+Here's an example:
+
+>>> pathoc localhost:9999 "get:/p/:s'401:ir,@1'"
+
+This crafts a request that connects to the pathod server and instructs it to
+generate a 401 response with one random byte embedded at a random point. The
+response specification is parsed and expanded by pathoc, so you see syntax
+errors immediately. This becomes really handy when combined with the
+**-e** flag to show the expanded request::
+
+    07-06-16 12:32:01: >> 'GET':/p/:s'401:i35,\x27\\x1b\x27:h\x27Content-Length\x27=\x270\x27:h\x27Content-Length\x27=\x270\x27':h'Host'='localhost'
+    << 401 Unauthorized: 0 bytes
+
+Note that the embedded response has been resolved *before* being sent to
+the server, so that "ir,@1" (embed a random byte at a random location) has
+become "i35,\'\\x1b\'" (embed the byte 0x1b at offset 35). You now have a
+pathoc request specification that is precisely reproducible, even with random
+components. This feature comes in terribly handy when testing a proxy, since
+you can now drive the server response completely from the client, and have a
+complete log of reproducible requests to analyze afterwards.
+
+
+Request Examples
+----------------
+
+.. list-table::
+    :widths: 50 50
+    :header-rows: 0
+
+    * - get:/
+      - Get path /
+
+    * - get:/:b@100
+      - 100 random bytes as the body
+
+    * - get:/:h"Etag"="&;drop table browsers;"
+      - Add a header
+
+    * - get:/:u"&;drop table browsers;"
+      - Add a User-Agent header
+
+    * - get:/:b@100:dr
+      - Drop the connection randomly
+
+    * - get:/:b@100,ascii:ir,@1
+      - 100 ASCII bytes as the body, and randomly inject a random byte
+
+    * - ws:/
+      - Initiate a websocket handshake.
+
+
+Response Examples
+-----------------
+
+.. list-table::
+    :widths: 50 50
+    :header-rows: 0
+
+
+    * - 200
+      - A basic HTTP 200 response.
+
+    * - 200:r
+      - A basic HTTP 200 response with no Content-Length header. This will hang.
+
+    * - 200:da
+      - Server-side disconnect after all content has been sent.
+
+    * - 200:b\@100
+      - 100 random bytes as the body. A Content-Length header is added, so the disconnect
+        is no longer needed.
+
+    * - 200:b\@100:h"Etag"="';drop table servers;"
+      - Add an Etag header
+
+    * - 200:b\@100:dr
+      - Drop the connection randomly
+
+    * - 200:b\@100,ascii:ir,@1
+      - 100 ASCII bytes as the body, and randomly inject a random byte
+
+    * - 200:b\@1k:c"text/json"
+      - 1k of random bytes, with a text/json content type
+
+    * - 200:b\@1k:p50,120
+      - 1k of random bytes, pause for 120 seconds after 50 bytes
+
+    * - 200:b\@1k:pr,f
+      - 1k of random bytes, but hang forever at a random location
+
+    * - 200:b\@100:h\@1k,ascii_letters='foo'
+      - 100 random bytes as the body, and a randomly generated 1k header name of
+        ASCII letters, with the value 'foo'.
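
The response specifiers in the table above become crafting URLs by appending them to the anchor prefix, as in the ``http://localhost:9999/p/200:b@100`` example. A minimal Python sketch of that composition follows. The helper name ``crafting_url`` is hypothetical and not part of pathod; the defaults (localhost, port 9999, anchor ``/p/``) are the ones documented above. It percent-encodes characters such as double quotes, on the assumption (not confirmed by this document) that pathod decodes percent-escapes in the request path.

```python
from urllib.parse import quote


def crafting_url(spec, host="localhost", port=9999, anchor="/p/"):
    """Build a pathod crafting URL for a response specifier.

    Hypothetical helper for illustration only; it is not part of pathod.
    """
    # Leave the characters the specification language relies on
    # (":", "@", ",", "=") readable and percent-encode the rest, e.g. quotes.
    # Assumption: pathod decodes percent-escapes before parsing the spec.
    encoded = quote(spec, safe=":@,=")
    return "http://{}:{}{}{}".format(host, port, anchor, encoded)


if __name__ == "__main__":
    for spec in ("200:b@100", "200:b@1k:p50,120", '200:h"Etag"="x"'):
        print(crafting_url(spec))
```

Simple specifiers such as ``200:b@100`` pass through unchanged, so the first URL printed matches the one shown in the pathod section. Specifiers containing quotes are encoded, which should be checked against a running daemon before relying on it.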