The Command-Line RESTafarian
Wednesday, 15 Jul 2015
Almost any modern-day service or application provides an HTTP endpoint to work with. Whether it provides metrics, allows remote administration, or accepts complex requests, a system administrator will spend a lot of time in the terminal accessing and updating such APIs.
There are many tools to help us, but today we’re going to look at just four key ones: curl, jq, yajl, and httpie.
curl
curl is probably available out of the box on every non-Windows system these days, except for minimal builds. It supports almost any protocol you can name, and many you’ve likely never even heard of. For most people, curl is both the reference tool when testing an HTTP API for compatibility with the HTTP spec (which since 2014 has been split into multiple specifications), and the workhorse used in many applications, either directly or as the libcurl library built into the application.
Particularly useful options are:
-v to produce verbose output including headers and indicating transfer direction
-s to suppress the progress indicators and error messages during a transfer
-X to specify the HTTP verb used, for example -XPUT or -XPOST
-H to add arbitrary HTTP headers, for example -H 'Content-Type: application/json;charset=utf-8'
-o to write any data output to a file; verbose info such as headers still goes to stderr
-# to get a hash on stderr each time a chunk of data is received
-L to follow redirects, for example when checking a 301 redirect
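As a rough sketch of how these options combine, the helper below fetches a URL quietly, follows redirects, asks for JSON, and writes the body to a file. The function name fetch_json and its argument order are hypothetical, chosen only for illustration; -S (still show errors when silent) and -f (fail on HTTP errors) are two further curl options not covered in the list above.

```shell
# fetch_json URL OUTFILE
# Hypothetical helper (the name and argument order are illustrative only).
fetch_json() {
    # -s  hide the progress meter       -S  still print errors to stderr
    # -f  exit non-zero on HTTP errors  -L  follow redirects
    # -H  ask for JSON                  -o  write the body to OUTFILE
    curl -sS -f -L \
        -H 'Accept: application/json' \
        -o "$2" \
        "$1"
}
```

It could then be called as fetch_json http://localhost:5984/ welcome.json, assuming something is listening on that address.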
Uploading large files
When uploading large files, it’s important to consider the data format used (binary, ASCII, UTF-8, Base64-encoded), and whether the data is streamed or resident in memory. Most people use the -d | --data option, which will load the entire file into memory, strip carriage returns and newlines from it, and then send it with Content-Type: application/x-www-form-urlencoded. This is usually not what you expected to happen: the data is altered along the way, and it takes lots of memory to hold the file.
“if you start the data with the letter @, the rest should be a file name to read the data from, or - if you want curl to read the data from stdin. The contents of the file must already be URL-encoded. Multiple files can also be specified. Posting data from a file named ‘foobar’ would thus be done with --data-binary @foobar”.
The --data-binary option is normally more appropriate, as it at least sends the data exactly as it is, with no extra processing, but it will still load the entire file into memory.
I recommend using -T | --upload-file pretty much all the time. It streams data, so your memory usage is not excessive, and it does not encode the data unnecessarily. Most of the time it just Does What You Mean.
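A minimal sketch of a streaming upload follows. The helper name upload_stream, its arguments, and the default content type are assumptions made for illustration. Note that for HTTP URLs curl sends a PUT when -T is used, so an endpoint expecting POST would additionally need -X POST.

```shell
# upload_stream FILE URL [CONTENT_TYPE]
# Hypothetical helper (name and arguments are illustrative only).
upload_stream() {
    # -T reads the file from disk as it is sent, rather than loading it
    # into memory first, and sends the bytes unmodified.
    curl -sS -f -T "$1" \
        -H "Content-Type: ${3:-application/octet-stream}" \
        "$2"
}
```

For example, upload_stream backup.tar.gz http://localhost:8080/uploads/backup.tar.gz, where both the file and the URL are made up.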
jq
jq is one of those tools that you wonder how you ever did without it. It’s a pipe-capable terminal tool that can be used to reformat or select streaming JSON-based data, in the same way you might use grep or wc on some arbitrary data.
Let’s take a look at a simple example, piping the output of curl directly into jq. The . parameter represents the identity function, and as jq also pretty-prints JSON by default, it takes the single-line curl response and gives it pretty colours and indentation. This alone makes me happy.
$ curl -s skunkwerks.cloudant.com | jq .
{
  "couchdb": "Welcome",
  "version": "1.0.2",
  "cloudant_build": "2409"
}
But let’s say we are only interested in the version number of Cloudant’s API. Maybe this is part of a script or cron job, and we want to confirm that the version number is compatible with some operation we want to perform. Easy-peasy.
$ curl -s skunkwerks.cloudant.com | jq .version
"1.0.2"
Perhaps we need to transform that JSON in some fashion. In this case, we’ll destructure the JSON object, and produce a new one that happens to have different keys than the original. This is useful when, for example, you migrate JSON data out of one system that uses id as a key for a document, into CouchDB that wants _id.
$ curl -s skunkwerks.cloudant.com | \
jq '{message: .couchdb, release: .version}'
{
  "message": "Welcome",
  "release": "1.0.2"
}
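The id to _id migration mentioned above could be sketched like this. The sample document and its grade field are invented for illustration; the filter copies .id into ._id, drops the original key, and passes every other field through untouched.

```shell
# Hypothetical document exported from a system that keys on "id"
doc='{"id":"south_face","grade":"TD"}'

# Add _id from .id, then delete the old key; -c prints compact output
converted=$(printf '%s' "$doc" | jq -c '. + {_id: .id} | del(.id)')
echo "$converted"
```

Adding the new key and deleting the old one keeps the filter independent of whatever other fields the document carries, which an explicit object construction like the one above would not.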
Here’s a more complex version to whet your appetite. We’ll query a database view, _all_docs because I’m extremely lazy, and pull out a subset of the returned data.
$ curl -s skunkwerks.cloudant.com/aspiring/_all_docs | \
jq '.rows[0].id'
"south_face"
$ curl -s skunkwerks.cloudant.com/aspiring/_all_docs | \
    jq '[.rows[] | {_id: .id, revision: .value.rev}]'
[
  {
    "_id": "south_face",
    "revision": "3-6b27b124e8669007fbf1a6222974ae7f"
  },
  {
    "_id": "west_face",
    "revision": "3-b6bb3256d343aa05bb79818ecc8f75ca"
  }
]
The second operation pulls out all document ids, and the current revision from the value object, and finally wraps them as a convenient array. Neat.
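Returning to the idea of checking a version number from a script or cron job, here is one way it might look. The response is the sample from the first jq example, stored in a variable so the sketch runs without a network; the is_supported function and its "any 1.x release" rule are hypothetical. It relies on two jq options: -r for raw string output and -e to turn the filter result into an exit status.

```shell
# Sample response copied from the first jq example; a real script would
# use: response=$(curl -s skunkwerks.cloudant.com)
response='{"couchdb":"Welcome","version":"1.0.2","cloudant_build":"2409"}'

# is_supported JSON: succeed when .version starts with "1."
# (hypothetical rule, chosen only for illustration)
is_supported() {
    # -e sets the exit status from the result: 0 for true, 1 for false/null
    printf '%s' "$1" | jq -e '.version | startswith("1.")' >/dev/null
}

# -r prints the raw string, without the surrounding JSON quotes
version=$(printf '%s' "$response" | jq -r '.version')

if is_supported "$response"; then
    echo "version $version is supported"
else
    echo "version $version is not supported" >&2
fi
```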
unbuffered
The final feature of jq that I love is its support for streaming APIs. By default, when jq writes to a pipe its output is buffered, and only reaches the next part of our shell command once the buffer fills or the input ends. But sometimes we need to deal with sources that deliver data slowly, or at a different rate to how fast it can be processed.
The --unbuffered flag tells jq to flush its output after each JSON object is printed, rather than waiting for a buffer to fill.
Our example is a bit contrived, as it would work perfectly well without streaming support. In this case we receive a continual stream of updated metrics from a Riemann server that is monitoring a number of Erlang servers.
$ wsc 'ws://lolcat:5556/index?subscribe=true&query=(service =~ "vmstats%")' | \
    jq --unbuffered .
{
  "host": "icouch.wintermute.skunkwerks.at",
  "service": "vmstats memory_atoms",
  "state": "ok",
  "description": null,
  "metric": 256480,
  "tags": [
    "katja_vmstats",
    "instance: icouch",
    "couch",
    "beam"
  ],
  "time": "2015-07-15T12:58:22.000Z",
  "ttl": 60
}
{
  "host": "icouch.wintermute.skunkwerks.at",
  "service": "vmstats error_logger_message_queue",
  "state": "ok",
  "description": null,
  "metric": 0,
  "tags": [
    "katja_vmstats",
    "instance: icouch",
    "couch",
    "beam"
  ],
  "time": "2015-07-15T12:58:22.000Z",
  "ttl": 60
}
...
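To see where --unbuffered matters, the sketch below puts jq in the middle of a pipeline fed by a deliberately slow producer. The emit_metrics function and its data are invented for illustration, standing in for a source like the websocket stream above. Each value should reach the final loop as soon as jq has printed it, rather than arriving together when the producer exits.

```shell
# emit_metrics: hypothetical slow producer, one JSON object per second
emit_metrics() {
    for n in 1 2 3; do
        printf '{"service":"vmstats demo","metric":%d}\n' "$n"
        sleep 1
    done
}

# jq flushes after every object, so the loop sees each metric promptly
emit_metrics | jq --unbuffered '.metric' | while read -r m; do
    echo "got metric $m"
done
```

Dropping the flag leaves the final output the same, but with jq writing into a pipe the lines would typically show up only once its buffer is flushed at the end.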
yajl
yajl is actually the first JSON library I encountered, and it’s still one of my favourites. It’s extremely fast, and available as both a library and a command-line tool on every platform I’ve used recently.
I tend to use yajl for validating JSON and for reformatting it from pretty-printed to packed form. The latter can be done with jq as well of course, so let’s just see validation of piped data. Often somebody will ask why CouchDB is rejecting their valid JSON, and I point them to yajl so that they can check for themselves. The cause is almost always invalid UTF-8, by the way.
$ echo '{"foo": schmnoo }' | json_verify
lexical error: invalid char in json text.
{"foo": schmnoo }
(right here) ------^
JSON is invalid
$ echo '{"foo": true }' | json_verify
JSON is valid
httpie
HTTPie is another very powerful tool, written in Python and using the well-known Requests library under the hood. I use it a lot when talking to JSON web services, to build up JSON objects rather than copy and paste text into the shell from elsewhere. This way, HTTPie takes care of ensuring I’m supplying valid JSON.
$ http --verbose --style fruity \
    PUT http://localhost:5984/testy/kv \
    'content-type:application/json;charset=utf-8' \
    key=value \
    foo:=true
PUT /testy/kv HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 29
Host: localhost:5984
User-Agent: HTTPie/0.9.2
content-type: application/json;charset=utf-8

{
    "foo": true,
    "key": "value"
}

HTTP/1.1 201 Created
Cache-Control: must-revalidate
Content-Length: 65
Content-Type: application/json
Date: Wed, 15 Jul 2015 13:45:37 GMT
ETag: "1-158688661a13aa0a0a25849e7ed78da4"
Location: http://localhost:5984/testy/kv
Server: CouchDB/1.6.1 (Erlang OTP/18)

{
    "id": "kv",
    "ok": true,
    "rev": "1-158688661a13aa0a0a25849e7ed78da4"
}
While HTTPie is capable of much more, you can see a few things above:
- Using colon-separated fields like content-type:application/json to specify HTTP headers. In this case, we could have used the inbuilt --json flag instead.
- Using = for normal JSON strings and content.
- Using := to embed raw JSON; otherwise "foo": true would have been stored as a literal "true" string and not the JSON native true value.
- All HTTP headers sent and received are visible via the --verbose flag, once again using colour for terminal happiness.
Other Tools
Please let me know if you have other tools you use and love! I’m aware that there’s an equivalent of httpie for most programming languages, for example: