.. _intro:

Pathology 101
=============


pathod
------

Pathod is a pathological HTTP daemon designed to let you craft almost any
conceivable HTTP response, including ones that creatively violate the
standards. HTTP responses are specified using a :ref:`small, terse language
<language>` which pathod shares with its evil twin :ref:`pathoc`. To start
playing with pathod, fire up the daemon:

>>> pathod

By default, the service listens on port 9999 of localhost, and the default
crafting anchor point is the path **/p/**. Anything after this URL prefix is
treated as a response specifier. So, hitting the following URL will generate an
HTTP 200 response with 100 bytes of random data:

    http://localhost:9999/p/200:b@100
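
As a rough sketch of how such URLs are put together, the Python helper below
joins the default host, port and anchor described above with a response
specifier. The name ``crafting_url`` and the set of characters left unquoted
are assumptions made for illustration; they are not part of pathod.

```python
from urllib.parse import quote


def crafting_url(spec, host="localhost", port=9999, anchor="/p/"):
    """Build a URL that asks pathod to serve the response described by spec.

    Hypothetical helper: the defaults mirror the ones named in the text.
    """
    # Keep the specifier's common punctuation readable; percent-encode the rest.
    return "http://{}:{}{}{}".format(host, port, anchor, quote(spec, safe=":@,"))


print(crafting_url("200:b@100"))  # http://localhost:9999/p/200:b@100
```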

See the :ref:`language documentation <language>` to get (much) fancier. The
pathod daemon also takes a range of configuration options. To view those, use
the command-line help:

>>> pathod --help

Mimicking a proxy
^^^^^^^^^^^^^^^^^

Pathod automatically responds to both straight HTTP and proxy requests. For
proxy requests, the upstream host is ignored, and the path portion of the URL
is used to match anchors. This lets you test software that supports a proxy
configuration by spoofing responses from upstream servers.
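
To illustrate what "the upstream host is ignored" means in practice, here is
a small Python sketch that reduces both a plain path and a proxy-style
absolute URL to the path that would be matched against anchors. The function
name ``anchor_path`` is hypothetical, and this is not pathod's own code.

```python
from urllib.parse import urlsplit


def anchor_path(request_target):
    """Return the path portion of a request target, discarding any host."""
    # urlsplit handles both "/p/200" and "http://example.com/p/200".
    return urlsplit(request_target).path or "/"


# A direct request and a proxy request end up with the same path.
print(anchor_path("/p/200:b@100"))
print(anchor_path("http://example.com/p/200:b@100"))
```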

By default, we treat all proxy CONNECT requests as HTTPS traffic, serving the
response using either pathod's built-in certificates, or the cert/key pair
specified by the user. If you're testing a client that makes a non-SSL CONNECT
request, you can override this behaviour with the **-C** command-line option.

Anchors
^^^^^^^

Anchors provide an alternative to specifying the response in the URL. Instead,
you attach a response to a pre-configured anchor point, specified with a regex.
When a URL matching the regex is requested, the specified response is served.

>>> pathod -a "/foo=200"

Here, "/foo" is the regex specifying the anchor path, and the part after the "="
is a response specifier.
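
The idea can be sketched in a few lines of Python. This is an illustration
only, not pathod's implementation: the helper names are made up, and it
assumes that the option is split at the first "=" and that the regex is
matched from the start of the request path.

```python
import re


def parse_anchor(option):
    """Split an -a value such as "/foo=200" into (compiled regex, spec)."""
    pattern, sep, spec = option.partition("=")  # assumption: split at first "="
    if not sep:
        raise ValueError("expected REGEX=SPEC, got %r" % option)
    return re.compile(pattern), spec


def response_for(path, anchors):
    """Return the spec of the first anchor whose regex matches path, else None."""
    for regex, spec in anchors:
        if regex.match(path):  # assumption: match from the start of the path
            return spec
    return None


anchors = [parse_anchor("/foo=200")]
print(response_for("/foo", anchors))  # 200
print(response_for("/bar", anchors))  # None
```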


File Access
^^^^^^^^^^^

There are two operators in the :ref:`language <language>` that load contents
from a file: the **+** operator, which loads an entire specification from a
file, and the **>** value specifier. In pathod, both of these operators are
restricted to a directory specified at startup, or disabled if no directory is
specified:

>>> pathod -d ~/staticdir


Internal Error Responses
^^^^^^^^^^^^^^^^^^^^^^^^

Pathod uses the non-standard 800 response code to indicate internal errors, to
distinguish them from crafted responses. For example, a request to:

    http://localhost:9999/p/foo

... will return an 800 response because "foo" is not a valid response specifier.





.. _pathoc:


pathoc
------

Pathoc is a perverse HTTP client designed to let you craft almost any
conceivable HTTP request, including ones that creatively violate the standards.
HTTP requests are specified using a :ref:`small, terse language <language>`,
which pathoc shares with its server-side twin pathod. To view pathoc's complete
range of options, use the command-line help:

>>> pathoc --help


Getting Started
^^^^^^^^^^^^^^^

The basic pattern for pathoc commands is as follows::

    pathoc hostname request [request ...]

That is, we specify the hostname to connect to, followed by one or more
requests. Let's start with a simple example::

    > pathoc google.com get:/
    07-06-16 12:13:43: >> 'GET':/
    << 302 Found: 261 bytes

Here, we make a GET request to the path / on port 80 of google.com. Pathoc's
output tells us that the server responded with a 302 redirection. We can tell
pathoc to connect using SSL, in which case the default port is changed to 443
(you can override the default port with the **-p** command-line option)::

    > pathoc -s www.google.com get:/
    07-06-16 12:14:56: >> 'GET':/
    << 302 Found: 262 bytes


Multiple Requests
^^^^^^^^^^^^^^^^^

There are two ways to tell pathoc to issue multiple requests. The first is to specify
them on the command-line, like so::

    > pathoc google.com get:/ get:/
    07-06-16 12:21:04: >> 'GET':/
    << 302 Found: 261 bytes
    07-06-16 12:21:04: >> 'GET':/
    << 302 Found: 261 bytes

In this case, pathoc issues the specified requests over the same TCP
connection, so in the above example only one connection is made to google.com.

The other way to issue multiple requests is to use the **-n** flag::

    > pathoc -n 2 google.com get:/
    07-06-16 12:21:04: >> 'GET':/
    << 302 Found: 261 bytes
    07-06-16 12:21:04: >> 'GET':/
    << 302 Found: 261 bytes

The output is identical, but two separate TCP connections are made to the
upstream server. These two specification styles can be combined::

    pathoc -n 2 google.com get:/ get:/


Here, two distinct TCP connections are made, with two requests issued over
each.
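
The relationship between the two styles can be modelled with a short Python
sketch: the requests on the command line are repeated once per connection,
and **-n** sets the number of connections. The function ``connection_plan``
is a hypothetical illustration, not something pathoc exposes.

```python
def connection_plan(requests, n=1):
    """Return one list of request specs per TCP connection."""
    return [list(requests) for _ in range(n)]


# Models: pathoc -n 2 google.com get:/ get:/
plan = connection_plan(["get:/", "get:/"], n=2)
print(len(plan))                        # 2 connections
print(sum(len(conn) for conn in plan))  # 4 requests in total
```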



Basic Fuzzing
^^^^^^^^^^^^^

The combination of pathoc's request specification language and a few of its
command-line options makes for a capable basic fuzzer. Here's an
example::

    pathoc -e -I 200 -t 2 -n 1000 localhost get:/:b@10:ir,@1

The request specified here is a valid GET with a body consisting of 10 random bytes,
but with 1 random byte inserted in a random place. This could be in the headers,
in the initial request line, or in the body itself. There are a few things
to note here:

- Corrupting the request in this way will often make the server enter a state where
  it's awaiting more input from the client. This is where the
  **-t** option comes in, which sets a timeout that causes pathoc to
  disconnect after two seconds.
- The **-n** option tells pathoc to repeat the request 1000 times.
- The **-I** option tells pathoc to ignore HTTP 200 response codes.
  You can use this to fine-tune what pathoc considers to be an exceptional
  condition, and therefore log-worthy.
- The **-e** option tells pathoc to print an explanation of each logged
  request, in the form of an expanded pathoc specification with all random
  portions and automatic header additions resolved. This lets you precisely
  replay a request that triggered an error.
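
To make the effect of ``ir,@1`` concrete, the Python sketch below inserts one
random byte at a random offset of a simplified, hand-written GET request. It
is an approximation for illustration: the exact bytes pathoc sends (including
any headers it adds automatically) and the range of offsets it picks from are
assumptions here, and ``inject_random_byte`` is a hypothetical helper.

```python
import os
import random


def inject_random_byte(message, rng):
    """Insert one random byte at a random offset; return (result, offset, byte)."""
    offset = rng.randrange(len(message) + 1)  # assumption: any position, incl. the end
    byte = bytes([rng.randrange(256)])
    return message[:offset] + byte + message[offset:], offset, byte


# Simplified stand-in for "get:/:b@10": a GET with 10 random body bytes.
body = os.urandom(10)
request = b"GET / HTTP/1.1\r\nContent-Length: 10\r\n\r\n" + body

fuzzed, offset, byte = inject_random_byte(request, random.Random())
print("inserted %r at offset %d" % (byte, offset))
```

Because the offset is drawn across the whole message, the extra byte may land
in the request line, a header, or the body, which matches the behaviour
described above.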


Interacting with Proxies
^^^^^^^^^^^^^^^^^^^^^^^^

Pathoc has a reasonably sophisticated suite of features for interacting with
proxies. The proxy request syntax very closely mirrors that of straight HTTP,
which means that it is possible to make proxy-style requests using pathoc
without any additional syntax, by simply specifying a full URL instead of a
simple path:

>>> pathoc -p 8080 localhost "get:'http://google.com'"

Another common use case is to use an HTTP CONNECT request to probe remote
servers via a proxy. This is done with the **-c** command-line option, which
allows you to specify a remote host and port pair:

>>> pathoc -c google.com:80 -p 8080 localhost get:/

Note that pathoc does **not** negotiate SSL without being explicitly instructed
to do so. If you're making a CONNECT request to an SSL-protected resource, you
must also pass the **-s** flag:

>>> pathoc -sc google.com:443 -p 8080 localhost get:/
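
For reference, the three kinds of request line involved here differ only in
their target. The Python sketch below prints them side by side; the helper
``request_line`` is hypothetical, and the HTTP/1.1 version string is an
assumption rather than something taken from pathoc's output.

```python
def request_line(method, target, version="HTTP/1.1"):
    """Format the first line of an HTTP request."""
    return "{} {} {}\r\n".format(method, target, version)


# Straight HTTP: the target is a simple path.
print(repr(request_line("GET", "/")))
# Proxy-style request: the target is a full URL.
print(repr(request_line("GET", "http://google.com")))
# CONNECT: the target is a host and port pair.
print(repr(request_line("CONNECT", "google.com:80")))
```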



Embedded response specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

One interesting feature of the request specification language is that you can
embed a response specification in it, which is then added to the request path.
Here's an example:

>>> pathoc localhost:9999 "get:/p/:s'401:ir,@1'"

This crafts a request to the pathod server, which in turn crafts a 401
response with one random byte embedded at a random point. The response
specification is parsed and expanded by pathoc, so you see syntax errors
immediately. This really becomes handy when combined with the **-e** flag to
show the expanded request::

    07-06-16 12:32:01: >> 'GET':/p/:s'401:i35,\x27\\x1b\x27:h\x27Content-Length\x27=\x270\x27:h\x27Content-Length\x27=\x270\x27':h'Host'='localhost'
    << 401 Unauthorized: 0 bytes

Note that the embedded response has been resolved *before* being sent to
the server, so that ``ir,@1`` (embed a random byte at a random location) has
become ``i35,'\x1b'`` (embed the byte 0x1b at offset 35). You now have a
pathoc request specification that is precisely reproducible, even with random
components. This feature comes in terribly handy when testing a proxy, since
you can now drive the server response completely from the client, and have a
complete log of reproducible requests to analyze afterwards.
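
If you generate many such requests from a script, the quoting can get fiddly.
The Python sketch below builds the request specifier shown above and prints a
shell-quoted command line for it. ``embed_response`` is a hypothetical helper,
and it assumes the response specifier itself contains no single quotes.

```python
import shlex


def embed_response(response_spec, anchor="/p/", method="get"):
    """Wrap a response specifier in a request aimed at a pathod anchor."""
    # Assumption: response_spec contains no single quotes of its own.
    return "{}:{}:s'{}'".format(method, anchor, response_spec)


argv = ["pathoc", "-e", "localhost:9999", embed_response("401:ir,@1")]
print(argv[-1])          # get:/p/:s'401:ir,@1'
print(shlex.join(argv))  # shell-quoted form of the whole command
```

Passing ``argv`` as a list to ``subprocess.run`` avoids shell quoting
altogether; ``shlex.join`` (Python 3.8 or later) is only needed when the
command has to be pasted into a terminal.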


Request Examples
----------------

.. list-table::
    :widths: 50 50
    :header-rows: 0

    * - get:/
      - Get path /

    * - get:/:b@100
      - 100 random bytes as the body

    * - get:/:h"Etag"="&;drop table browsers;"
      - Add a header

    * - get:/:u"&;drop table browsers;"
      - Add a User-Agent header

    * - get:/:b@100:dr
      - Drop the connection randomly

    * - get:/:b@100,ascii:ir,@1
      - 100 ASCII bytes as the body, and randomly inject a random byte

    * - ws:/
      - Initiate a websocket handshake.


Response Examples
-----------------

.. list-table::
    :widths: 50 50
    :header-rows: 0


    * - 200
      - A basic HTTP 200 response.

    * - 200:r
      - A basic HTTP 200 response with no Content-Length header. This will hang.

    * - 200:da
      - Server-side disconnect after all content has been sent.

    * - 200:b\@100
      - 100 random bytes as the body. A Content-Length header is added, so the disconnect
        is no longer needed.

    * - 200:b\@100:h"Etag"="';drop table servers;"
      - Add a header

    * - 200:b\@100:dr
      - Drop the connection randomly

    * - 200:b\@100,ascii:ir,@1
      - 100 ASCII bytes as the body, and randomly inject a random byte

    * - 200:b\@1k:c"text/json"
      - 1k of random bytes, with a text/json content type

    * - 200:b\@1k:p50,120
      - 1k of random bytes, pause for 120 seconds after 50 bytes

    * - 200:b\@1k:pr,f
      - 1k of random bytes, but hang forever at a random location

    * - 200:b\@100:h\@1k,ascii_letters='foo'
      - 100 random bytes as the body, with a randomly generated 1k header name
        made up of ASCII letters and the value 'foo'.