DIY Seltzer, Club Soda, Soda, etc

While I’ve been on paternity leave I have hugely increased the amount of club soda I drink, mostly because I wanted a refreshing beverage in our non-air-conditioned apartment. I did a little research and figured out how to make my own so I could have as much as I wanted, and because Googling for how to do it was hard, I figured I’d document it clearly here.

What are we talking about here?

First off, soda: when I say soda I don’t mean Coca-Cola, I mean club soda, though for most of this post I actually mean seltzer. Seltzer is simply water and carbon dioxide (CO₂.) Seltzer is actually a little harsh, so people add various salts (more on that later) to tone it down and make it taste more neutral. I really like club soda, but step one is to make seltzer.

Why Build a Carbonator?

There are a few reasons. One is cost. I’ll get into that more later, but at some point, building your own rig is going to be cheaper than buying. The bigger reason for me is that it is easier to make soda than to buy it. I live in Santa Monica and walk to the grocery store, so adding heavy or large bottles to my personal grocery cart is a drag. Also it’s always fun to make a thing instead of buying it.

Build it

First off, here is a bill of materials:

Bill of Materials

Item Cost
CO₂ Tank - 5lb $58
Carbonation Cap $12
Single Gauge Regulator $40
Dual Gauge Regulator $47
Gas Line Assembly $11

Total material cost: $121-$128

All Parts

You only need one regulator, but the dual one will let you know when your tank is getting empty, so that might be nice to have. On the other hand it makes the full assembly much less compact, and I hear the other gauge tends to get broken since it sticks out so much.

There is one other cost that is recurring, and that’s the CO₂. I got mine at a homebrew supply store. People say you can get it at a welding store; I would have expected that to be unsafe, but if Dave Arnold does it, it’s probably ok. At the homebrew store 5lb of CO₂ was $20. I saw all kinds of ranges for how far 5lb of CO₂ would go; the low was about 30 gallons and the high was 155 gallons, so I’m gonna use 100 gallons, or 378 liters. I suspect CO₂ might be more soluble in water than in beer, which would imply that the CO₂ won’t go as far, but I was unable to find scientific studies backing up my belief (or more accurately, they all used different units, none of the ranges overlapped, and I am not interested enough to make a graph and extrapolate.) Links to papers on the truth here are very welcome.
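As a sanity check on that 100-gallon figure, here’s a back-of-the-envelope calculation; the ~6 g/L dose is my assumption for strongly carbonated water (roughly 3 “volumes” of carbonation), not a measured number:

```shell
# Back-of-the-envelope: how many liters can 5 lb of CO2 carbonate?
# The 6 g/L dose is an assumption, not a measurement.
awk 'BEGIN {
  grams  = 5 * 453.6              # 5 lb of CO2 in grams
  dose   = 6                      # assumed grams of CO2 dissolved per liter
  liters = grams / dose
  printf "%.0f g / %d g per L = ~%.0f L (~%.0f gallons)\n",
         grams, dose, liters, liters / 3.785
}'
# -> 2268 g / 6 g per L = ~378 L (~100 gallons)
```

That lands right in the middle of the ranges I saw, so 100 gallons seems like a fair planning number.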

Assemble the Carbonator

Putting together the carbonator is pretty easy. I took a bunch of pictures when I did it, but I’d be surprised if anyone felt like they needed them.

Tools you need:

Tool Cost
Flathead Screwdriver $6
Big wrench or Pliers $16

I had these basic tools on hand, but you might not. A crescent wrench would work instead of the pliers but crescent wrenches are a big hassle so I avoid them. You’ll also need a bottle to put the soda in; I used a 1L bottle since my 2L bottle still has soda in it. As long as you can screw on the carbonator cap it should be fine.

First, put together the gas line assembly:

Gas Line Assembly 1

Use your screwdriver to tighten the clamp around the ball lock:

Gas Line Assembly 2

Next connect the gas line assembly to the regulator. Don’t forget to slide the clamp on before you connect it!

Gas Line Assembly and Regulator 1

Tighten the clamp with your screwdriver from before:

Gas Line Assembly and Regulator 2

Connect the regulator to the tank; you’ll need the wrench or pliers to tighten it sufficiently:

Regulator and Tank

Follow instructions:

Always follow instructions

Screw on the carbonator cap:

Carbonator Cap and Bottle

Connect the gas line assembly to the carbonator cap. Pull back the ball lock:

Carbonator Cap and Gas Line Assembly 1

and push it onto the cap till it’s flush:

Carbonator Cap and Gas Line Assembly 2

The complete assembly will look like this:

Fully Assembled

Make soda!

So that everything is abundantly clear, here is the assembly with labels:

Carbonation Rig with Labels

My system has no inlet gauge, but if it did, it would be where the label is. The inlet gauge measures how much CO₂ you have, and the outlet gauge measures how much pressure is in the regulator. The tank knob opens the tank to the regulator, the regulator knob allows (a regulated amount of) pressure into the regulator, and the regulator valve allows the gas assembly (and thus the bottle) to be included in the regulator system.

I zero the system first so I know everything is normal.

  1. Make sure the tank knob is turned all the way tight (which is how I store my tank.)
  2. Connect the carbonator cap to the gas line assembly, but not screwed into a bottle.
  3. Open the valve at the bottom of the regulator. Now the outlet gauge should be at zero.
  4. Remove the carbonator cap and turn off the regulator valve.

So as I said before, I started with seltzer. All you need for this is water and a bottle. There is a relationship between water temperature, pressure, and the solubility of CO₂. In layman’s terms: get the water good and cold, so that as much CO₂ as possible will go into the water. I put about a half a cup of crushed ice into the 1L bottle, filled it till it had a headroom of another half a cup of air, and then shook it till the ice melted. Prechilled water would be easier, less of a hassle, and, if your ice maker is like mine, better tasting.
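To put rough numbers on the temperature effect, here’s a sketch using textbook CO₂ solubilities; the ~3.3 and ~1.45 g/L-per-atm figures are assumptions from general chemistry tables, not measurements of this rig:

```shell
# Rough dissolved-CO2 estimate at a given gauge pressure, ice-cold
# versus room temperature. The solubility constants are assumptions:
# roughly 3.3 g/L per atm near 0 C and 1.45 g/L per atm near 25 C.
awk -v psi=40 'BEGIN {
  atm  = psi / 14.696 + 1         # gauge psi -> absolute atmospheres
  cold = 3.3  * atm               # near 0 C
  warm = 1.45 * atm               # near 25 C
  printf "at %d psi: ~%.1f g/L cold vs ~%.1f g/L warm\n", psi, cold, warm
}'
# -> at 40 psi: ~12.3 g/L cold vs ~5.4 g/L warm
```

In other words, ice-cold water can hold on the order of twice the CO₂ of room-temperature water at the same pressure, which is why the chilling step matters so much.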

  1. Once you have the water ready, connect the bottle to the gas line assembly as explained in the last step above.
  2. Unscrew the tank knob a few turns.
  3. Open the regulator valve so that it’s in the on position (parallel with the tube.)
  4. Turn the regulator knob till it’s showing 40 psi. This is confusing: you turn clockwise to increase pressure. The water should be pressurized now.
  5. Shake the bottle (I shook for about 30 seconds) to help get the CO₂ into the water.
  6. Turn the regulator valve back into the off position.
  7. Disconnect the carbonator cap from the gas line assembly.
  8. Replace the carbonator cap with a plastic cap.
  9. If you are only doing one bottle, close the tank knob.
  10. Enjoy your soda!


Cost Breakdown

I mentioned that cost could be a reason to do this. To justify it, here is the total cost breakdown and when this setup “pays for itself.”

Here’s the generic equation I came up with to solve for when it’s worth getting the carbonator:

amount of liters to break even =
      (upfront carbonator cost - other upfront cost       ) /
      (per liter other cost    - per liter carbonator cost)
Type         Upfront Cost   Recurring Cost                             Cost per Liter   Break Even
Carbonator   $121           $20 / 5lb of CO₂ = 100 gallons = 378 L     ~5¢              (baseline)
Sodastream   $74            $64.44 / 180 L                             35¢              143 L = one refill
Store brand  $0             $2 / 12-pack of 12 floz cans = 7.2 L       27¢              550 L = 1550 cans = 129 twelve-packs
Schweppes    $0             $5 / 6-pack of 10 floz bottles = 1.8 L     $2.78            45 L = 152 bottles = 25 six-packs
Q            $0             $6 / 4-pack of 8 floz bottles = 0.9 L      $6.67            18 L = 76 bottles = 19 four-packs
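To make the equation concrete, here is a small shell sketch of it (the function name and the rounded per-liter figures are mine, pulled from the table above):

```shell
#!/bin/sh
# break_even: liters needed before the carbonator beats an alternative.
# Args: carbonator upfront, other upfront, other $/L, carbonator $/L
break_even() {
  awk -v uc="$1" -v uo="$2" -v po="$3" -v pc="$4" \
    'BEGIN { printf "%.0f\n", (uc - uo) / (po - pc) }'
}

# Schweppes: $121 vs $0 upfront; ~$2.78/L vs ~5 cents/L recurring.
break_even 121 0 2.78 0.05   # -> 44, the same ballpark as the 45 L above
```

Small rounding differences in the per-liter inputs move the answer by a liter or two, which is why this prints 44 rather than exactly 45.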

Assumptions / Drawbacks: the fundamental premise above is that the soda you are making is equivalent to what you would be buying. For the store brand that’s pretty likely. With Schweppes I’m pretty confident that with a little effort you could get there. The Q stuff is likely much harder, though I haven’t spent a lot of effort trying to replicate it yet. If you are using a Sodastream you are almost surely making the exact same product as with the carbonator.

The other problem, setting aside the added salts (which are much of what differentiates the brands), is water quality. When I use the filtered water from my fridge I get pretty good results. When I use the ice from my fridge I get noticeable off-flavors, which is a pretty big deal in something as subtly flavored as plain soda.

Next steps

Club Soda

The whole idea (and cost breakdown above!) is that I can make Club Soda with this. Typical club sodas contain varying amounts of sodium bicarbonate, sodium citrate, potassium sulfate, and plain table salt (sodium chloride).

I have read that you use 2 teaspoons of salt for 1 L of club soda, but I have not tested this yet. I will have a followup post with more detail on this, but in the meantime, this might get you started.

Shrubs

You don’t need to be able to carbonate a shrub directly in order to make a shrub, but it seems like a fun idea. If you don’t know what a shrub is, it’s a tart drinking syrup made from fruit, sugar, and vinegar that mixes nicely with soda.

Gin and Tonics

Dave Arnold makes bottled Gin and Tonics with quinine sulfate, sugar, clarified lime juice, gin, and water, all carbonated in a bottle. I’d like to try that, but the clarified lime is too much right now (you need a centrifuge.) On top of that, for some reason tonic makes me nauseous nowadays, so I might try simply Gin and Club Soda.

Carbonated Chocolate Milk

One of my friends told me that he carbonated chocolate milk and really enjoyed that. It should be really easy to try. Another friend compared the idea to an egg cream, which I have not had so that’s not helpful to me, but maybe for you, dear reader.

More Typical Soda

I’d like to try to make a good root beer, since my wife and I drink it every Saturday in root beer floats, but I suspect that that is even more work than I have done just for the carbonator.

I’m also interested in some basic sodas, like a lemongrass or chamomile tea soda, or other simple beverages along those lines that are refreshing and interesting.

I hope this is as informative as anyone could want out of a post like this. I know it would have helped me. If you are interested in this, but not my more typical programming topics, you can check out the drinks tag and even subscribe to an RSS feed just for that tag if that’s your jam.

Special Thanks

My friend C4 of Awesome Brewers, Great Job! was the one who gave me the initial bill of materials for this, and deserves a lot of credit for this happening. Thanks C4!

Posted Tue, Aug 23, 2016

The Pomodoro Technique: Three Years Later

A few years ago I posted about my use of The Pomodoro Technique. I’ve been asked more than once for an update on whether I still use it, and how. Answers are here.

So the short answer is no; I am generally not using The Pomodoro Technique anymore. I do use it every now and then, but certainly not as much as I did before (probably 80% of my day or more.) At ZipRecruiter I have used it maybe five times in the year I’ve been here. The problem, in my mind, is that the entire technique is predicated on having a todo list. We do have an issue tracker at ZR and I can pull from that and often do, but for better or worse, a lot of what I do at ZR is reactionary.

I think this is because the team I am on (Core Infrastructure) by its very nature is on the hook for many emergency-type situations. And while you can pull stuff out of the issue tracker and only work on that, many times the stuff in the issue tracker is just not as important as the thing you discovered when you got back from lunch.

Another reason that The Pomodoro Technique is hard for me at ZR is that we have an open office plan. At Mitsi I had an office with a door that closed. While I didn’t close the door much of the time, it was easy for me to get into the zone and knock out some tasks. At ZR deep work typically happens early in the morning when no one else has arrived at the office yet. It’s not that people interrupt me, it’s more the general fear that they will. I suspect the way to get over this is to either move to the couches where I’m more secluded, or get better at telling people I’m working. Either way interruptions at least appear to abound.

Something I’ve realized is that, for me, The Pomodoro Technique is a way to compensate for low-to-medium-grade ADD. What that means is that if I can focus “normally,” The Pomodoro Technique is unnecessary. For whatever reason, I can manage my ADD better at ZR so far. I suspect part of that is novelty, and part of that is that I have fallback tasks (documentation, helping people, etc.) that are both worth doing and fine to be stuck on while my ADD subsides.

Finally, at Mitsi I tended to have a lot of small tasks that were well suited to me as the lead developer; I could knock something out in one pomodoro that would take another engineer a whole day. It wasn’t because I’m smart; it’s because I’d learned the system and its history inside and out. At ZR I’m nowhere near that, so not only is my estimation often wrong, but things take way longer than a single pomodoro.

Posted Thu, Aug 18, 2016

Docker pstree: From The Inside

I recently posted about my docker-pstree tool and in the post mentioned that at some point I might port the tool to be 100% “in-container.” Well I couldn’t help myself and figured out how.

Saturday morning I was tinkering with making docker-pstree run from within the container itself. Recall that the intermediate goal is to find all “root” processes in the tree. I started off trying to port docker-root-pids to POSIX sh. This is difficult because it relies on a hashmap, and POSIX sh doesn’t even have arrays. I considered actually doing math to generate a hash (as in a numeric digest) of the keys and then using that to do a seek within a null-terminated string, but that seemed crazy; since the lookup is a seek anyway, you might as well just seek using the string itself (it’s O(n) either way.)

Environment based solution

At some point I realized I could use the environment as a sort of hashmap, and came up with the following:

eval `ps -eo pid= -o ppid= | \
  awk '{
    list=list "A" $2 " ";
    print "A" $1 "=A" $2 "; export A" $1
  }
  END { print "export LIST=\"" list "\""}'`

Before I go further, let me break that down:

ps -e says to grab all processes; ps -eo pid= -o ppid= basically says we want the process id, the parent process id, and no header. The awk code tokenizes the input (so $1 is the process id and $2 is the parent process id.) The awk code is two blocks. The main block appends A, the parent process id, and a space to list each time through, building a space-separated list of parent process ids, and then prints A$pid=A$ppid; export A$pid. A space concatenates strings in awk (much like adjacent string literals do in Python and MySQL.) Finally the END block prints the built-up list as export LIST="$ppids". The output looks something like this:

A1=A0; export A1
A7=A1; export A7
A8=A1; export A8
A9=A1; export A9
A10=A1; export A10
A11=A1; export A11
A12=A7; export A12
A14=A11; export A14
A15=A10; export A15
A16=A9; export A16
A35094=A8; export A35094
A36488=A0; export A36488
A38783=A36488; export A38783
A38784=A36488; export A38784
export LIST="A0 A1 A1 A1 A1 A1 A7 A11 A10 A9 A8 A0 A36488 A36488 "

When this is evaluated with eval you can then iterate over the parent process ids like this:

for ppid in $LIST; do echo $ppid; done

Or resolve the parent process ids like this:

for ppid in $LIST; do eval "echo \$$ppid"; done
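Putting the two together, here is a sketch of the actual root-finding with the environment-as-hashmap trick (the sample variables stand in for the eval’d ps | awk output above):

```shell
# Sketch: finding "root" pids with the environment-as-hashmap trick.
# These sample exports stand in for the eval'd `ps | awk` output above.
A1=A0; export A1     # pid 1 has parent 0
A7=A1; export A7     # pid 7 has parent 1
LIST="A0 A1"

# A ppid with no corresponding A<pid> variable set is not itself a pid
# in our process set, so it marks the root of a tree.
for ppid in $LIST; do
  eval "[ -z \"\$$ppid\" ]" && echo "${ppid#A}"
done | sort -u       # prints: 0
```

This mirrors the grep !$pids{$_} logic from the original perl script, just with export in place of a hash.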

ps based solution

Then I went on a walk to the farmer’s market (where I saw Ted Danson) and realized, after looking at the output above, that it can be easier:

ps -eopid= -oppid= | \
  grep '\s0$' | \
  awk '{ print $1 }'

So this prints all processes with a parent process id of zero, and then we can do this, like in the original docker-pstree:

ps -eopid= -oppid= | \
  grep '\s0$' | \
  awk '{ print $1 }' | \
  xargs -n1 pstree

pstree based solution

But then my pattern seeking brain realized I could do this:

pstree 0

This is a useful command to run even on a full Linux machine, because the kernel threads do not run under init; on my system they descend from process id 2, which itself has a parent process id of zero.


The nice thing about this is that it should work in any setup. If you have a way to run the docker client, the following should work reliably to get you complete and accurate pstree output:

docker exec -it my-container watch -n 1 pstree 0

The bummer is that on a really basic system, watch is limited to integer intervals, and pstree can’t show arguments (and can’t do pretty unicode output.) To overcome the former, assuming you are not on Windows, you can probably do this:

watch -n 0.3 docker exec -it my-container pstree 0

The other problem can only be solved by adding a more powerful pstree to your container. If you are using a traditional base image, like Ubuntu or Debian, your pstree probably came with psmisc and is already good enough. For something like Alpine, which all my containers are based on, you just need to install psmisc yourself (apk add psmisc).

Container Tooling

For a long time I have felt that containers should be as minimal as possible and should not be “tooled up” to help debugging. This is motivated by the desire to create the smallest possible artifact with the lowest attack surface possible. I have colleagues who I respect who think it’s fine and good to put strace, psmisc, bash, etc into their containers for simpler debugging. I think that is definitely the pragmatic path forward, but I have a few ideas that I think would be superior.

rkt’s Stage 1

In rkt there is the concept of a “Stage 1”, which is basically the supervisor and log aggregator for your container. By default, this is systemd. I think it would be interesting to add more tooling to that image, so if you need to debug a container, you just run it differently and suddenly have all this extra tooling. I looked into doing this yesterday but gave up.

I think that the rkt architecture is superior to Docker’s because there is no central daemon. If the Docker daemon crashes, all the containers go down with it (by default); rkt simply doesn’t have this failure mode. There are other reasons I like rkt better but this is the core one.

On the other hand, I can add a user to the docker group and the user is thus fully able to control the docker daemon. In rkt this is impossible without setuiding the rkt binary, which people are reasonably not doing. So all that together means that I’m unlikely to switch my stuff to rkt just yet. I would consider making a super tiny script that just does rkt run commands and setuid‘ing that. We’ll see.

nsenter

What I would love would be the ability to run a command like this:

sudo nsenter --pid --mount --target $pid pstree -Ua 0

The idea being that I would be in the pid namespace, and have access to the container’s /proc, but sadly the --mount namespace is basically chroot. It probably doesn’t have to be, but that’s how it works with nsenter. Because it’s a chroot the pstree implementation is the one inside of the container. Sad.

Docker Volumes

Docker allows you to mount volumes in a container. Volumes can be data (or binaries) from the host that end up in a specific directory in the container. For super basic tools this could work, but how many tools work without a boatload of dependencies? At some point, going down this route, you’d end up back in the dependency hell that Docker is supposed to help solve.

The Magical Overlay

What I really want is a way to bolt on additional tools to a container while it’s running. Today that means using the host, but I think that container systems like Docker and rkt could provide ways to do this while still using a managed container. The rkt fly Stage 1 sorta works, since you get to use a container you made (running with full host privileges), but it just sounds so difficult and frustrating to use fly to run gdb to debug a container.

I think all that I would need would be a fresh container that has the “to-debug” container mounted in /to-debug (or something, it could be configurable) and has complete access to all of the processes in that container. I suspect that to allow this containers would need to be more nested than they are today in Docker and rkt, more like how lmctfy had child containers.

I definitely don’t think we are there yet when it comes to container debugging, and I think that there is a pattern yet to be realized that will make container debugging more effective than even current systems where you simply have root on a VM.

Posted Mon, Aug 15, 2016

Linux Containers and Docker pstree

Once in a while I find myself wanting to see the state of a container from a bird’s eye view. My favorite way to do this is with a special tool I wrote called docker-pstree. Here is how it works. (Stay tuned for angst at the end.)

Typically in a virtual machine or in a container there is one root process which all other processes descend from. On a traditional system this is init(1), but in containers it is often simply your application.

The problem comes when one uses docker exec to run a process within a container:

$ docker exec -it my-container /bin/sh -c 'ps ; echo "---"; pstree'
    1 1000       0:00 {} /bin/sh /bin/ KBIX KSMO
   43 1000       0:00 /bin/sh
   67 1000       0:00 sleep 600
   82 1000       0:00 /bin/sh -c ps ; echo "---"; pstree
   87 1000       0:00 ps

(note that the pstree does not include the current process, but ps does.)

The cause is that pstree (at least in busybox, which is what is used in this example) starts at pid 1 and walks from there; when you use docker exec, the new process is not under the “init” of the container, it’s under some other thing (the docker daemon, to be precise.)

Containers in linux are simply attributes on processes, set by modifying files under cgroupfs. The de facto location would be something like /sys/fs/cgroup/pids/$cgroup/tasks; you add the pid to that file. So it makes perfect sense that a process could be in a container but not be run by one of the other processes in the container.
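You can see this membership from the inside without any docker tooling: every process’s cgroup memberships are readable as plain text from procfs (the awk here is just formatting):

```shell
# Print the current process's cgroup membership straight from procfs.
# Each line is hierarchy-id:subsystems:path (cgroup v1) or 0::path (v2).
awk -F: '{ printf "hier=%s subsys=%s path=%s\n", $1, $2, $3 }' /proc/self/cgroup
```

For a docker container, the path component is what contains the container id, which is exactly what the docker-root-pids script below greps for.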


There’s a fairly easy fix for this. The first is a tool I wrote to find “root” processes of a docker container:

#!/usr/bin/env perl

use 5.24.0;
use warnings;

my $target = shift;

my $container = `docker inspect --format {{.Id}} $target`;
chomp $container;

my %pids;

# build hash to map pid->ppid of all procs in container
for my $line (map s/^\s+//r, grep m/\Q$container/, `ps -ww -eo pid= -o ppid= -o cgroup=`) {
   my ($pid, $ppid) = split /\s+/, $line;
   $pids{$pid} = $ppid;
}

# find ppids that aren't in the hash and dedup
my %result = map { $_ => 1 } grep !$pids{$_}, values %pids;
say $_ for keys %result;

And then I have a super simple wrapper around pstree called docker-pstree:


#!/bin/sh
docker-root-pids "$1" | xargs -n1 pstree "${2:--U}"

This uses the host pstree, instead of the container pstree, which means that the pstree implementation is more powerful. I could reimplement all of the tooling to run inside the container without a lot of effort, but I’d end up rewriting the perl script since many of my containers have no scripting language at all except for dash. Oh and some of my containers might not have pstree either.

I tend to run the above with watch -n 0.3 docker-pstree

nsenter and solving problems

There’s a more generic tool than docker exec called nsenter that comes with util-linux, which includes such venerable tools as cfdisk, more, reset, and dmesg. I have blogged about unshare before, which is sort of a micro Docker that ships with util-linux. nsenter is a micro docker exec. I find it useful if only to see how it works:

nsenter -m -u -i -n -p -t "$(docker inspect --format '{{.State.Pid}}' my-container)" /bin/sh

When a process is created by nsenter the core system calls (verified by calling strace on nsenter) are setns and clone. Here is the meat of the trace:

# 6968 is the pid of the container's root process ({{.State.Pid}} above)
open("/proc/6968/ns/ipc", O_RDONLY)     = 3
open("/proc/6968/ns/uts", O_RDONLY)     = 4
open("/proc/6968/ns/net", O_RDONLY)     = 5
open("/proc/6968/ns/pid", O_RDONLY)     = 6
open("/proc/6968/ns/mnt", O_RDONLY)     = 7

# These each correspond with one of the flags passed to nsenter
setns(3, CLONE_NEWIPC)                  = 0
close(3)                                = 0
setns(4, CLONE_NEWUTS)                  = 0
close(4)                                = 0
setns(5, CLONE_NEWNET)                  = 0
close(5)                                = 0
setns(6, CLONE_NEWPID)                  = 0
close(6)                                = 0
setns(7, CLONE_NEWNS)                   = 0
close(7)                                = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f849eb56ad0) = 38559

The frustrating thing here is that it doesn’t work with the tooling I created above! The problem is that containers in linux are complicated. I mentioned before that they are basically membership of cgroups. Well they are also membership of namespaces. nsenter merely enters the namespaces of the docker container, and doesn’t do anything with the cgroups. I wrote a little script to enter the cgroups of a container:

#!/usr/bin/env perl

use 5.22.0;
use warnings;

use IO::All;

my $pid = shift;

my @cgroups = map {
   chomp;
   my ($id, $subs, $cgroup) = split /:/;
   my @subs = map s/name=//r, split /,/, $subs;

   map "$_$cgroup", @subs;
} io->file("/proc/$pid/cgroup")->slurp;

io->file("/sys/fs/cgroup/$_/tasks")->append("$$\n") for @cgroups;

exec @ARGV;

So now, if for some reason you wanted to use nsenter instead of docker exec, you could do this:

PID="$(docker inspect --format '{{.State.Pid}}' my-container)"
sudo \
  cgenter $PID \
  nsenter -m -u -i -n -p -t $PID \
  /bin/sh

It’s not perfect, but it’s interesting!

The Inevitable Angst

To some extent I feel like this whole nsenter side-trip is evidence that the ad-hoc nature of Linux containers does end up leaving something to be desired. Without something like docker or LXC to tie the disparate pieces together, it just ends up being a hassle.

What I find even weirder is that while namespaces and cgroups, taken together, make containers, they act pretty differently. One is controlled with the magical cgroupfs filesystem and the other is controlled with system calls. There’s a handy, clear manpage for namespaces (namespaces(7)) while cgroups are documented in the not typically installed kernel documentation (/Documentation/cgroup-v1/ specifically.) I can see the cgroups for a process as a user with ps, as above, but to see the pidns (or presumably other namespaces) I have to be root. Why aren’t they more similar?
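The asymmetry shows up right in the shell: namespace membership lives behind magic symlinks in /proc, while cgroup membership is an ordinary text file:

```shell
# Namespaces: symlinks whose targets encode a namespace inode,
# e.g. something like pid:[4026531836].
readlink /proc/self/ns/pid

# Cgroups: plain text, one line per hierarchy.
head -1 /proc/self/cgroup
```

(And, as noted above, reading another user's /proc/<pid>/ns/* links requires root, whereas ps will happily show you everyone's cgroups.)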

At the very least, my tooling works, and I could make it use namespaces if I end up being willing to run it as root. The easiest and most robust option will probably involve rewriting the perl script in shell and reimplementing docker-pstree as a gnarly docker exec call. I might do that if I ever end up using this tool for more than my containers on my laptop. While the situation is frustrating, the tooling still ends up being fairly straightforward and useful.

Posted Fri, Aug 12, 2016

Open Source Infrastructure and DBIx::Class Diagnostics Improvements

Many people know that Peter Rabbitson has been wrapping up his time with DBIx::Class after his attempt to get funding for working on it didn’t work out. I have long had some scraps of notes on a post about that whole situation and how troubling it is but I could just never make it happen. The following is the gigantic commit message of the merge of a large chunk of his work. I offered to host it since I think that it should actually get read. I have left it almost completely unchanged, except to make things proper links. More thoughts after the post.

Merge the ResultSource diagnostics rework

…And this is what the products that we make do: these are the consequences. They either empower people, or they steal bits of their lives. Because experiences are all we have in life: if you think about them as grains of sand in an hour glass, once those grains are gone – they are gone. And experiences with people and experiences with things: they use up the same grains.

That’s why we have a profound responsibility to respect the experiences of the people that we build for…

Aral Balkan: Free is a Lie TNW 2014

This set of commits is unusual - the 2+kloc of changes (in lib/ alone) do not add any new runtime functionality, nor do these changes alter significantly any aspect of DBIC’s runtime operation. Instead this is a culmination of a nearly 4 months long death-march ensuring the increasingly complex and more frequent (courtesy of rising use of Moo(se)) failure modes can be reasoned about and acted upon by ordinary users, without the need to reach out to a support channel.

The changeset has been extensively tested against 247 downstream CPAN dists (as described at the end of commit 12e7015) and against several darkpan test suites. As of this merge there are no known issues except RT#114440 and a number of dists (enumerated in 12e7015) now emitting REALLY LOUD though warranted and actionable, diagnostic messages.

The diagnostic is emitted directly on STDERR - this was a deliberate choice designed to:

  1. prevent various test suites from failing due to unexpected warnings

  2. make the warnings harder to silence by a well meaning but often too eager-yet-not-sufficiently-dilligent staffer, before the warnings had a chance to reach a senior developer

What follows is a little bit of gory technical details on the commit series, as the work is both generic/interesting enough to be applied to other large scale systems, and is “clever” enough to not be easily reasoned about without a summary. Think of this as a blog post within an unusual medium ;)


Some necessary history: DBIC as a project is rather old. When it got started Moose wasn’t a thing. Neither (for perspective) was jQuery or even Tw(i)tt(e)r. The software it was modeled on (Class::DBI) has “single-level” metadata: you have one class per table, and columns/accessor were defined on that class and that was it. At the time mst made the brilliant decision to keep the original class-based API (so that the CDBI test suite can be reused almost verbatim, see ea2e61b) while at the same time moving the metadata to a “metaclass instance” of sorts. The way this worked was for each level of:

  • Individual Result Class (class itself, not instance)
  • Result Class attached to a Schema class
  • Result Class attached to a Schema instance

to have a separate copy-on-the-spot created metadata instance object of DBIx::Class::ResultSource. One can easily see this by executing:

~/dbic_checkout$ perl -Ilib -It/lib -MDBICTest -MData::Dumper -e '
  my $s = DBICTest->init_schema;
  $Data::Dumper::Maxdepth = 1;
  warn Dumper [
    ...
  ];
'

The technique (and ingenious design) worked great. The downside was that nobody ever really audited the entire stack past the original implementation. The codebase grew, and mistakes started to seep in: sometimes modifications (add_columns, etc) would happen on a derivative metadata instance, while the getters would still be invoked on the “parent” (which at this point was oblivious of its “child” existence, and vice versa). In addition there was a weird accessor split: given a result instance one could reach different metadata instances via either result_source() or result_source_instance(). To add insult to the injury the latter method is never defined anywhere, and was always dynamically brought to life at runtime via an accessor maker call on each individual class.

If that weren’t bad enough, some (but crucially not all) routines used to manipulate resultsource metadata were proxied to the main Result classes, also aiming at allowing the reuse of the existing Class::DBI test suite, and to provide a more familiar environment to Class::DBI converts. The complete map of current metadata manipulation methods and their visibility from a typical ResultClass can be seen at the end of commit message 28ef946.

The downside was that to an outsider it would seem only natural that if in order to make something metadata-related happen, one normally calls:

SomeResultClass->set_primary_key(@cols);

then it makes sense that one should be able to override it via:

sub SomeResultClass::set_primary_key {
  my $ret = shift->next::method(@_);
  { do extra stuff }
  $ret;
}

That thinking has been applied to pretty much all straight-pass-through getters in the wild, with the expectation that DBIC will respect them throughout. In reality this never happened - half of DBIC would never even look at the Result class and instead simply called the needed method on the result source instance directly. As noted in 28ef946: the overwhelmingly common practice is to hook a method in a Result class and to “hope for the best”. A rare example of “doing it right” would be DBIx::Class::ResultSource::MultipleTableInheritance, but as can be seen from its SYNOPSIS the API is rather counterintuitive (what is table_class() anyway?!) and more importantly - the earlier example seems “just right”.

Another innovation (remember: pre-Moose) was the use of the just-in-time implemented alternative C3 method resolution order (MRO) right on top of the default perl DFS MRO. While DBIC used multiple inheritance (MI) from the start, with all the corresponding problems and non-scalable “solutions”, it wasn’t until C3 MRO became available that the true potential of the resulting plugin system became clear. To this day (mid-2016) MI, as used within the DBIC ecosystem, remains the single most flexible (and thus superior given the problem domain) plugin-system on CPAN, easily surpassing rigid delegation, and having an upper hand on role-based solutions as promoted by the Moo(se) ecosystem. It must be noted that delegation and/or roles are not without uses - they are an excellent (and frankly should be a default) choice for many application-level systems. It is in mid-level to low-level libraries like DBIC where the stateless nature of a predictable yet non-coordinated call-order resolution truly begins to shine.


Things stayed undisturbed for a while, until around 2012~2013 folks started showing up with more and more complaints, which all traced back to Moo(se)-based subclassing. Originally the C3 MRO composition worked just fine, because almost invariably a ->load_components() call (which explicitly switches the callER MRO) would have happened early enough in the life of any end-user Result/ResultSource class. But when extends()/with() got more prominent this was lost. The more complex the inheritance chain, the more likely that the topmost leaf class is in fact stuck under DFS mro, with everything going sideways from there - sometimes with truly mindbending failure cases. There was no clear solution at the time, and aside from some toothless documentation warnings nothing was done to address this (in fact even the doc-patch itself is incomplete.)

The inconsistencies, and the resulting mistakes, however, were all localized, and even though the problems were often major, each instance was sufficiently different (and bizarre) that each individual deployment could neither report them properly, nor find the time to reason through the layers of history in order to arrive at a solution they fully understand. Yet the original design which solidified towards the end of 2007 was just good enough to keep being kicked down the road.

But people kept writing more and more MOP-inspired stuff. Given the general tendency of perl code to get “all over the place”, the desire to standardize on “one true way” of doing OO throughout an entire end-user project/app was only natural. And there were more and more ways in the wild to combine/abstract individual Result classes and ResultSet components. The comprehensive DBIx::Class::Helpers are just the tip of the heap of all possible permutations DBIC is exposed to. Towards mid-2015 it became utterly untenable to brush off problems with “meh, just don’t do that and all will be fine”.

On the personal front I first ran into the baroque jenga tower head-on when I tried to make sense of the ResultSource subsystem in an airport lounge pre-YAPC::EU 2011 (Riga). I honestly do not remember why I started digging in this direction but the result of that attempt (and the later effort to revive it) got immortalized in my local tree. Enough said.

Next was the dash to implement sane relationship resolution semantics in 03f6d1f, and then in 350e8d5 (which was actually needed to allow for d0cefd9 to take place… sigh). During that journey 4006691 made a subtle but, in the long run, fatal change - it upset the balance of which source instance object we looked at during some (but not all) codepaths. The really sad part is that I had the feeling that something was not right, and even made a record of it as the last paragraph of 350e8d5. But light testing did not reveal anything, and I irresponsibly shipped everything as-is a bit later. It wasn’t until Oct 2015 that someone noticed this being an actual problem. Early attempts to fix it quickly demonstrated just how deep the rabbit hole goes, and were the main reason the entirety of this work was undertaken: the accumulated debt simply did not leave any room for a half-way solution :/


The writeup below describes only the final set of commits: it does not cover driving into and backing out of at least 3 dead-ends, nor does it cover the 5 distinct rewrites and re-shuffles of the entire stack as more and more involved testing revealed more and more involved failure modes. I must stress that if you plan to undertake a similar crusade against another project’s architectural debt you are in for a rough (but not impossible!) ride. The height of the “tenacity-bar” necessary to pull off such work is not reflected in any way within the seemingly effortless walkthrough that follows. It is also worth acknowledging that the code at times is incredibly terse and hard to follow: this was a deliberate choice, as the extra diagnostic sites that are enabled during runtime had to be implemented as “close to the VM”, so to speak, as possible. In isolation none of the contortions are warranted, but because I ended up with so many of them the result does pay off. See the comments within individual commit messages for more info on the various performance impacts.

As first order of business some mechanism was needed to track the logical relationship between the 3 levels of ResultSource instances as shown earlier in this writeup. Luckily, the user-unfriendly nature of the metadata stack meant there are very few spots on CPAN (and to the best of my knowledge on DarkPAN) that do anything exotic with the subsystem. This means the simplest thing would in fact work and was implemented as 534aff6: corral all instantiations of ResultSource objects (and Schema objects while we are at it.) This code ensured that nothing in the stack will create an instance of either class-type without our knowledge. With that in place, we also provide an explicit clone method encouraging folks to use that whenever possible. The switch of all relevant callsites within DBIC itself was verified through another check within new, guarded by the same compile-time assertion constant (which in turn was provided by both the CI and the local smoke-script from 5b87fc0)

With the above in place, ensuring 99.99% of the ResultSource “derivative” instances were obtained via $rsrc->clone, it was time for 0ff3368. A simple private registry hash with object addresses as keys and this hash as values:

  {
    derivatives => {
      addr_derived_rsrc_1 => $reference_to_infohash_of_derived_rsrc_1,
      addr_derived_rsrc_2 => $reference_to_infohash_of_derived_rsrc_2,
    },
    weakref => $weak_reference_of_self,
  }

As necessary for any structure holding addresses of object references, a CLONE “renumbering” routine takes care of keeping everything in sync on iThread spawns (if you believe that iThreads are evil and one shouldn’t go through the trouble: be reminded that any call of fork() within a Win32 perl is effectively an iThread, and fork() can be, and implicitly is, called by some CPAN modules).

Now that we had a good handle on “what came from where”, the first major diagnostic milestone 73f54e2 could be covered. As can be seen in the table of methods in commit 28ef946 there are only a handful of attributes on an actual ResultSource class. A couple of new Class::Accessor::Grouped method types were added, which behave just like the ‘simple’ and ‘component_class’ ones they replace, but with a twist:

The result is the exact warning as described in commit message 73f54e2. Of course there are some extra considerations - some high-level setters (e.g. remove_columns) do call a getter underneath to do their job. These cases had to be short-circuited by using a local()-based “setter callstack” mark. But in general the changeset has been surprisingly non-invasive: once the proper hook points were identified the rest was a breeze. There was also a brief scratching of heads when the last stages of DarkPAN tests emitted errors which I myself could not explain for a while, until the reason (and trivial solution) were identified in d56e05c and here.

As a brief detour, I considered switching ResultSource to a proper Moo class, but quickly abandoned this idea as there is no provision for clean get-time triggers. Nevertheless the attempt was a useful demonstration of what it takes to switch a low-level class (which means many somewhat questionable uses by consumers in the wild) to Moo(se) with zero loss of functionality. The result is preserved for posterity as 8ae83f0e.

While working on the above and f064a2a (the solution to RT#107462), it occurred to me that the confusion of having both result_source_instance() and result_source() can be reduced further by forcing all “getter” calls to go through result_source(), which is unconditionally defined and thus always available. The result was the improved diagnostic described in the commit message of e570488, but also a useful set of assertions that were used to weed out many of the wrinkles.

The next major step was to resolve once and for all the fallout from incorrect inheritance composition. The highly dynamic nature of Perl programs (an eternal compile/execute/compile/execute… cycle) meant that just “fixing things” as DBIC sees them would not work - calling set_mro() can do little when called late enough. This led to the revert of the originally-promising “forced c3-fication” of the stack 7648acb. Instead the practical design turned out to be “let the user know and carry on”.

The first part of getting there was to devise a way to precisely and very quickly tell “what does a class look like right now?” I have been brooding over how to do this since mid-February, but it wasn’t until I noticed the excellent App::Isa::Splain by @kentfredric, that the final interface came into focus: 296248c (with several minor fixups later on). Here I want to take a moment to apologize to @kentfredric, as he was led on a several week long wild-goose chase due to a misguided comment of mine :(

Amusingly, while implementing this I hit a wall related to perl 5.8 (for the first time in 6+ years): as stated in the timings at the end of commit message 296248c and as elaborated here, the non-core MRO is just too expensive to work with. This resulted in a 1.5 week long detour to try to squeeze out every last ounce of performance, and I ran into a lot of “interesting” stuff along the way. The result was not only a semi-usable 5.8 implementation - even running on 5.10+ ended up about 2 times faster, which translated into tangible gains: the number cited as 16% in 12e7015 was originally 28%(!). The moral of this story? - gerontoperlia makes your modern foundation code better.

With a reliable way to tell what each method’s “variant stack” looks like, it was trivial to implement the valid_c3_composition part of ::SanityChecker - one simply checks a class’ MRO, and in the case of dfs compares all stacks to what they would look like if the MRO were c3.

In parallel, but unrelated to the above, the ever-increasing tightening of various DBIC internal callpaths (e505369, d99f2db, 3b02022) had to be addressed in some way. The urgency truly “hit home” when testing revealed RT#114440 - it was nothing short of a miracle that this code survived that long without being utterly broken by other components. The solution came out of crossing the work on describe_class_methods (296248c) with the concept of the fail_on_internal_call guard (77c3a5d). We already have a list of method “shadowing stacks” (to borrow @kentfredric’s terminology) - if we can annotate methods in a way that tells us when a “non-overrideable” method was in fact overridden, we will be able to report this to the user.

The somewhat fallen-out-of-favor subsystem of function attributes was chosen to carry out the “annotation” task. It must be noted that this is one of the few uses of attributes on CPAN that is architecturally consistent with how attributes were originally implemented: an attribute is meant to attach to a specific reference (in our case a code reference), not to a name. This is also why the FETCH/MODIFY_type_ATTRIBUTE API operates strictly with references. As an illustration of why tracking attributes by name is fraught with peril, consider the following:

perl -e '
  use Data::Dumper;
  use Moose;
  use MooseX::MethodAttributes;

  sub somemethod :Method_expected_to_always_returns_true { return 1 }

  around somemethod => sub { return 0 };

  warn Dumper {
    attributes => __PACKAGE__->meta->get_method("somemethod")->attributes,
    result => __PACKAGE__->somemethod,
  };
'
It should also be noted that as of this merge describe_class_methods lacks a mechanism to “see” code references captured by around-type modifiers, and by extension the “around-ed” function’s attributes will not appear in the “shadowed stack”. A future modification of Class::Method::Modifiers, allowing minimal introspection of what was done to which coderef should alleviate most of this problem.

Once all relevant methods were tagged with a DBIC_method_is_indirect_sugar attribute in 1b822bd, it was trivial to implement the schema sanity check no_indirect_method_overrides which simply ensures no user-provided method “shadows” a superclass method with the sugar attribute set.

The success of the attribute-based approach prompted a pass of annotating all the methods DBIC generates for one reason or another: 09d8fb4. Aside from enabling the last improvement, it also made it possible to replicate a part of the DBIx::Class::IntrospectableM2M functionality in core, without elevating the status of the m2m sugar methods in any way (the historic labeling of these helpers as relationships is a long-standing source of confusion). See the commit message of 09d8fb4 for a couple of use-cases.

The last piece of the puzzle, 28ef946, addressed the “override and hope for the best” duality of ResultSource proxied methods as described at the start of this writeup. What we essentially do is add an around() for every method in ResultSource, which then checks whether it was called via ResultSourceProxy (inherited from DBIx::Class::Core) or directly via the ResultSource instance: i.e. MySchema::Result::Foo->proxied vs $rsrc->proxied. IFF we are called directly, and there is an override of the same method on the currently-used $rsrc->result_class, we either follow one of the options given by an attribute annotation, or we emit a diag message so that the user can do something about it.

That was easy wasn’t it?

Final Thoughts

This work took about 50 person-days to carry out, and for obvious reasons expanded to cover a much larger period of actual wall-time. While I am by far not the most efficient developer that I have met, I am pretty sure that the process of planning, designing, implementing and testing all of this could not have been significantly accelerated. Even at the (laughable) rate of $50/h The Perl Foundation is willing to pay for unique talent this endeavor would cost at least $20,000 USD - way beyond the scope (and aim?) of a TPF grant. On the other hand it would be surprising if this work can be qualified as unnecessary. I personally estimate that the savings due to the proper diagnostics alone will “make up” for the effort within the first month of wide deployment of these improvements. Time will tell of course, as the stream of questions is only about to start come the first days of August.

In any case - this project is by far not the only one in dire need of such “humane” overhaul. Moo, Catalyst, various pieces of the toolchain, and other staples of what is known as “modern perl5” are in similar or worse shape: a situation which can not be rectified simply by “writing patches” without a concerted effort directed by a single dedicated individual.

I yet again strongly urge the “powers of Perl” to reconsider their hands-off approach to funding the consistently shrinking pool of maintainers. PLEASE consider stealing (in the spirit of tradition) the proven successful model of RubyTogether before you end up losing even more maintainers like myself.

Peter “ribasushi” Rabbitson

Outgoing maintainer of a cornerstone Perl5 ecosystem project

Here are a few takeaways from the post from my perspective.

Volunteer Burnout

People get burned out all the time. The causes are myriad: not enough rest, depression, unrewarding work, etc. I feel like OSS and volunteers in general get burnt out for more specific reasons.

First off, they aren’t getting paid, so they tend to be doing this OSS work in their rest time. ribasushi tried to resolve this, but companies who use OSS were unable, unwilling, or too out-of-touch to fund him in his efforts.

Second, I think that people who are volunteering get an unreasonable amount of responsibility put on their shoulders. This is internal: feeling guilty for not fixing a bug or whatever. (I know ribasushi suffers from this.) And ashamedly, external: when people point the finger and demand work. I have seen this happen within the Perl community and am disgusted that someone would demand a gift like this. Luckily for the individual, I didn’t save the link in my notes. (Ask me privately and I may relay the story; I recall it, and it’s public, but I don’t think it’s worth the effort to preserve the event in infamy.)

OSS Infrastructure is Not Solved

Open Source is great for scratching your own itch. I consider my body of work to be a testament to this. In general, most of my own work is small bits of effort that pay off well for me, and maybe a few other people. I think this is often why OSS software is hastily put together. The entire idea is predicated on New Jersey Style work, in which I am an unashamed believer.

The problem comes with things that are too complicated to be built this way. A webserver is simple. An ORM that supports many databases as well as DBIx::Class does is not. I think it is reasonable and unsurprising that the DBIx::Class tarball is over 4 times as big as Plack’s. The problem is that the money for pure OSS (that is, stuff that is not using a freemium model like Chef or Puppet) is fairly scant.

The Linux Kernel is funded, but honestly not really: the majority of the contributors are paid by their employers to work on it for hardware or software those employers use. A model where a single individual (or even a small group) supports a piece of software that many companies use does not seem to be common. We all know what happened to OpenSSL in 2014, and that is another critical piece of OSS infrastructure. There has been some effort to fund it, but given that it’s so much more foundational than DBIx::Class, I just cannot see that model working for DBIx::Class either.

Stack Decoration with Attributes

I think this is a really interesting technique and that it would be an interesting way to, for example, filter backtraces. It would be nice not to have Try::Tiny in the middle of all my backtraces; this technique could resolve that. The current method involves weird regular expressions to filter packages, etc. Could be cool.

I know this has been a long post. I hope some people read it and consider how some of these problems can be solved in the long run. For my part, I bid ribasushi a fond farewell. I do not have any problem with how long his work has taken, except that it was more time stolen from him living his life. If DBIx::Class does indeed get a new janitor, I wish him or her luck. These are big shoes to fill.

Posted Mon, Aug 1, 2016

Building Secure UserAgents

I have been working on making an HTTP client (also known as a user agent) that is safe for end-users to control. I investigated building it in Perl, Python, asynchronous Perl, and Go.

During my brief downtime while on paternity leave I’ve been toying with a new application. One of the things this application will do is make web requests on behalf of users. There are plenty of examples of applications that do this already: RSS readers, anything that has OpenID login support, and things that do postbacks: when someone sends an SMS to my Twilio number, it hits an endpoint of my choosing.

Sometimes applications that do these kinds of requests can be vulnerable to attack. Last year Clint Ruoho found a handful of problems with Pocket, a service Mozilla had recently bundled with Firefox.

The vulnerabilities listed there are only the beginning. Here are some things that an attacker could do:

  • Connect to private services, listening only on localhost, assumed to be secure
  • Read from AWS EC2 UserData (which Ruoho did in the example above)
  • Connect to private services running on other servers that are not normally addressable from the outside world

How do we protect against this?

I suspect that most people protect against this by analyzing the url in the request.

if ($req->url->host eq '') { ... }

For example, today, an uptime-checking service will typically know that (or plain localhost) is a “non-internet” URL. But I made a domain that resolves to, and today, if you point such a checker at it, it claims that the site is actually up. And that’s just the tip of the iceberg. We can tell that the checker is running in an AWS-like environment because the EC2 metadata address ( seems to be “up” from the server’s perspective. There are a non-trivial number of private IP addresses like this (details in the appendix.)

So at the very least we cannot merely inspect the request, we need to verify the resolution of the domain.

use Socket qw( getaddrinfo getnameinfo NI_NUMERICHOST );

my ($err, @addrs) = getaddrinfo($req->uri->host, '');
my @ips = map {
   my (undef, $ip, $service) = getnameinfo($_->{addr}, NI_NUMERICHOST);
   $ip
} @addrs;
if (grep { $_ eq '' } @ips) { ... }

Even that is insufficient though. As Ruoho found, many user agents will automatically handle redirects, so even though the implementor may have done all of the above (which I think is non-trivial; I left out a lot of error handling in the second part, and none of it correctly handles all of the various IP masks), a domain could pass validation and then redirect to an IP that should have been blocked.

There are also what are sometimes called “tarpits.” Some user agents define timeouts as “stall” timeouts: they reset whenever any progress is made. Consider the Slowloris attack, but implemented on the server side instead of the client. Similarly, a DNS server can return long chains of CNAMEs to cause the same kind of problem. This should be fixed with a global timeout (instead of the more common stall timeouts referenced before.)

Another vulnerability is unexpected URL schemes. Some clients are smart enough to access file://, ftp://, etc. Clients like this must be defanged such that they only access http:// and https://. I tend to only use less magical clients, but support for the above is only a patch away.


The redirect detail makes it clear that the post resolution verification must happen within the user agent. A solid user agent design should make this reasonably doable. The first user agent I’d heard of that tackled these problems (though likely not the first in existence) is LWPx::ParanoidAgent, made by Brad Fitzpatrick almost surely while at LiveJournal to protect against attacks originating from OpenID servers. LWP::UserAgent::Paranoid has since supplanted it with better, more modular code; but the general idea and usage is the same.


The problem with these two modules is that they are written in the classic blocking style. If you need to make 20 HTTP requests and each takes 0.5s you just spent 10s. Newer tools are asynchronous, and so could do 20 HTTP requests in parallel. When I do async in Perl I use IO::Async. In IO::Async here is how you could create a safe client:

#!/usr/bin/env perl

use 5.24.0;
use warnings;

use Net::Async::HTTP;
use IO::Async::Loop::Epoll;

use Net::Subnet;

# this list is incomplete, see the appendix
my $private = subnet_matcher qw(

my $loop = IO::Async::Loop::Epoll->new;
my $http = Net::Async::HTTP->new(
   timeout => 10,
);

$loop->add( $http );

my ( $response ) = $http->do_request(
   uri => URI->new( shift ),
   on_ready => sub {
      my $sock = $_[0]->read_handle;
      if ( $private->( $sock->peerhost ) ) {
         $sock->close;
         return Future->fail('Illegal IP');
      }
      return Future->done( $_[0] );
   },
)->get;

print $response->code;

If I end up using Perl for this project I’ll likely publish a subclass of Net::Async::HTTP, or submit a patch, allowing the on_ready handler to be set for the whole class instead of requiring it to be set per request.


Before I came up with the async Perl option above I had come to the conclusion that it would be a ton of work to get it working in IO::Async and that I should just use Go. I might still use Go, as it’s better supported for code of this nature. In Go I was able to use basically the same technique as above:

package main

import (
  "errors"
  "fmt"
  "net"
  "net/http"
  "os"
  "time"
)

func main() {
  // this list is incomplete, see the appendix
  _, net1, _ := net.ParseCIDR("")
  _, net2, _ := net.ParseCIDR("")
  _, net3, _ := net.ParseCIDR("")
  _, net4, _ := net.ParseCIDR("")
  _, net5, _ := net.ParseCIDR("")
  nets := [](*net.IPNet){net1, net2, net3, net4, net5}

  internalClient := &http.Client{
    Timeout: 10 * time.Second,
    Transport: &http.Transport{
      Dial: func(network, addr string) (net.Conn, error) {
        conn, err := net.Dial(network, addr)

        if err != nil {
          return nil, err
        }

        ipStr, _, err := net.SplitHostPort(conn.RemoteAddr().String())
        // no idea how this could happen
        if err != nil {
          return nil, err
        }

        ip := net.ParseIP(ipStr)
        for _, net := range nets {
          if net.Contains(ip) {
            err := conn.Close()
            if err != nil {
              // wtf
            }
            return nil, errors.New("Illegal IP")
          }
        }

        return conn, nil
      },
    },
  }

  res, err := internalClient.Get(os.Args[1])

  if err != nil {
    fmt.Println(err)
    os.Exit(1)
  }

  fmt.Println(res.StatusCode)
}

The above is very similar to the IO::Async version. Basically we set a global timeout on the client, and then in the code that connects to a socket, vet the socket before continuing.


Perl is not really the “big dog” of dynamic languages anymore, so I figured I’d document how to do this with a more popular language. I mentioned that I’ve been toying with Python lately already, so it seemed like the most natural choice. If you know how to do this with other languages hit me up.

I looked at urllib2, urllib3, and requests, and it seemed like this kind of feature is impossible in these popular Python libraries without significant rewriting, duplication, or patches. I would love to be wrong here, and will update this post if someone can show me how to do what needs to be done. Otherwise, if you are using Python and need to do requests on behalf of the user, best of luck: you may end up writing your own HTTP client.

Also beware that at least urllib2 is helpful enough to provide support for file://. Make sure that if you are using urllib2, even indirectly, you remove support for untrusted handlers.

As with all security concerns, this is about measuring the cost of failure. There is no bug-free code; the cost of eternal vigilance and perfection is too high. The only other option I know of would be to spin up a completely separate virtual machine, isolated as much as possible from the rest of your system, in its own DMZ maybe. This is feasible, but it is certainly a high-cost alternative to something that’s not technically difficult.

I was surprised at how easy this was in both Go and IO::Async after striking upon the post-connection verification idea. Initially I had assumed that this was a nearly impossible to solve problem, because I assumed it needed to hook into DNS resolution directly.

The other big win in this modern day and age is that timeouts are easier to implement, and tend to be more trustworthy.

I hope this helps!

Appendix: Private Ranges

Please do not assume that this list is complete. I would love for it to be up-to-date and trustworthy, but that requires knowing all of the relevant RFCs. Here are the ones I know about and where they are from; almost all of these were informed by RFC6890, Sections 2.2.2 and 2.2.3. Note also that blocking some of these may not strictly be necessary for security, but generally I doubt that the extra check is going to be expensive enough to matter.

Address Block Relevant RFC RFC1122 RFC1918 RFC6598 RFC1122 RFC3927 RFC1918 RFC6890 RFC6333 RFC5737 RFC3068 RFC1918 RFC2544 RFC5737 RFC5737 RFC1112 RFC0919

The IPv6 ranges have a lot of weird stuff in them. One block, for example, was terminated already a couple years ago. Again, I suspect that for most of them it’s safe to block them and then remove the block later if you find that you need to (like if you absurdly end up on an IPv6 only network.)

Address Block Relevant RFC
::1/128 RFC4291
::/128 RFC4291
64:ff9b::/96 RFC6052
::ffff:0:0/96 RFC4291
100::/64 RFC6666
2001::/23 RFC2928
2001::/32 RFC4380
2001:2::/48 RFC5180
2001:db8::/32 RFC3849
2001:10::/28 RFC4843
2002::/16 RFC3056
fc00::/7 RFC4193
fe80::/10 RFC4291

There are likely more. I think the definitive listings are here and here respectively, but some of the blocks in those listings don’t look private to me.

Posted Mon, Jul 25, 2016

A visit to the Workshop: Hugo/Unix/Vim integration

I write a lot of little tools and take pride in thinking of myself as a toolsmith. This is the first post of hopefully many specifically highlighting the process of the creation of a new tool.

I wanted to do some tag normalization and tag pruning on my blog, to make the tags more useful (e.g. instead of having all of dbic, dbix-class, and dbixclass, just picking one.) Here’s how I did it.

As mentioned previously this blog is generated by Hugo. Hugo is excellent at generating static content; indeed that is its raison d’être. But there are places where it does not do some of the things that a typical blogging engine would.

To normalize tags I wanted to look at tags with their counts, and then associated filenames for a given tag. If I were using WordPress I’d navigate around the web interface and click edit and this use case would be handled. Not for me though, because I want to avoid the use of my web browser if at all possible. It’s bloated, slow, and limited.

Anatomy of an Article

Before I go much further here is a super quick primer on what an article looks like in hugo:

---
aliases: ["/archives/984"]
title: "Previous Post Updated"
date: "2009-07-24T00:59:37-05:00"
tags: ["book", "catalyst", "perl", "update"]
guid: ""
---
Sorry about that guys, I didn't use **links** to make it clear which book I was
talking about. Usually I do that kind of stuff but the internet was sucky
(fixed!) so it hurt to look up links. Enjoy?

The top part is YAML. Hugo supports lots of different metadata formats but all of my posts use YAML. The part after the --- is the content, which is simply markdown.
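Since Hugo itself is written in Go, here is a minimal Go sketch (my own; real Hugo also supports TOML/JSON fences) of splitting a `---`-fenced front matter block from the markdown body:

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates a leading "---"-fenced YAML header from the
// markdown body; documents without a fence pass through unchanged.
func splitFrontMatter(doc string) (meta, body string) {
	parts := strings.SplitN(doc, "---\n", 3)
	if len(parts) == 3 && parts[0] == "" {
		return parts[1], parts[2]
	}
	return "", doc
}

func main() {
	meta, body := splitFrontMatter("---\ntitle: \"x\"\n---\nSome *markdown*\n")
	fmt.Printf("%q %q\n", meta, body)
}
```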

Unix Style Tools

My first run at this general problem was to build a few simple tools. Here’s the one that would extract the metadata:

#!/usr/bin/env perl

use 5.22.0;
use warnings;
use autodie;

for my $file (@ARGV) {
  open my $fh, '<', $file;
  my $cnt = 0;
  while (<$fh>) {
    $cnt ++ if $_ eq "---\n";
    print $_ if $cnt < 2;
  }
}

The above returns the YAML part, which can then be consumed by a tool with a YAML parser.

Then I built a tool on top of that, called tag-count:

#!/usr/bin/env perl

use 5.22.0;
use warnings;

use sort 'stable';

use experimental 'postderef';

use YAML;

my $yaml = `bin/metadata content/posts/*`;
my @all_data = Load($yaml);

my @tags = map(($_->{tags}||[])->@*, @all_data);
my %tags;

$tags{$_}++ for @tags;

for (sort { $tags{$b} <=> $tags{$a} } sort keys %tags) {
   printf "%3d $_\n", $tags{$_}
}

That works, but it’s somewhat inflexible. When I thought about how I wanted to get the filenames for a given tag I decided I’d need to modify the metadata script, or make the calling script a lot more intelligent.

Advanced Unix Tools

So the metadata extractor turned out to be too simple. At some point I had the realization that what I really wanted was a database of data about my posts that I could query with SQL. Tools built on top of that would be straightforward to build and their function would be clear.

So I whipped up what I call q:

#!/usr/bin/env perl

use 5.22.0;
use warnings;
use autodie;
use experimental 'postderef';

use DBI;
use File::Find::Rule;
use Getopt::Long;
my $sql;
my $formatter;
GetOptions (
   'sql=s' => \$sql,
   'formatter=s' => \$formatter,
) or die("Error in command line arguments\n");

use YAML::XS 'Load';

# build schema
my $dbh = DBI->connect('dbi:SQLite::memory:', undef, undef, {
      RaiseError => 1,
});

$dbh->do(<<'SQL');
   CREATE TABLE articles ( guid, title, date, filename )
SQL

$dbh->do(<<'SQL');
   CREATE TABLE article_tag ( guid, tag )
SQL

$dbh->do(<<'SQL');
   CREATE VIEW _ ( guid, title, date, filename, tag ) AS
   SELECT a.guid, title, date, filename, tag
   FROM articles a
   JOIN article_tag at ON a.guid = at.guid
SQL

# populate schema
for my $file (File::Find::Rule->file->name('*.md')->in('content')) {
  open my $fh, '<', $file;
  my $cnt = 0;
  my $yaml = "";
  while (<$fh>) {
    $cnt ++ if $_ eq "---\n";
    $yaml .= $_ if $cnt < 2;
  }
  my $data = Load($yaml);
  $data->{tags} ||= [];

  $dbh->do(<<'SQL', undef, $data->{guid}, $data->{title}, $data->{date}, $file);
      INSERT INTO articles (guid, title, date, filename) VALUES (?, ?, ?, ?)
SQL

  $dbh->do(<<'SQL', undef, $data->{guid}, $_) for $data->{tags}->@*;
      INSERT INTO article_tag (guid, tag) VALUES (?, ?)
SQL
}

# run sql
my $sth = $dbh->prepare($sql || die "pass some SQL yo\n");
$sth->execute(@ARGV);

# show output
for my $row ($sth->fetchall_arrayref({})->@*) {
   my $code = $formatter || 'join "\t", map $r{$_}, sort keys %r';
   say((sub { my %r = $_[0]->%*; eval $code })->($row))
}

With less than 80 lines of code I have a super flexible tool for querying my corpus! Here are the two tools mentioned above, as q scripts:



#!/bin/sh

exec bin/q \
   --sql 'SELECT COUNT(*) AS c, tag FROM _ GROUP BY tag ORDER BY COUNT(*), tag' \
   --formatter 'sprintf "%3d  %s", $r{c}, $r{tag}'



#!/bin/sh

exec bin/q --sql "SELECT filename FROM _ WHERE tag = ?" -- "$1"

And then this one, which I was especially pleased with because it was a use case I came up with after building q.



#!/bin/sh

exec bin/q --sql 'SELECT filename, title, date FROM articles ORDER BY date DESC' \
      --formatter 'my ($d) = split /T/, $r{date}; "$r{filename}:1:$d $r{title}"'

I’m pleasantly surprised that this is fast. All of the above take under 150ms, even though the database is not persistent across runs.
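The in-memory SQLite trick is easy to see in isolation with the sqlite3 CLI; the table contents below are invented, but the query is the same shape as tag-count:

```shell
# build an in-memory database, load it, and query it, all in one run
sqlite3 :memory: <<'SQL'
CREATE TABLE article_tag ( guid, tag );
INSERT INTO article_tag VALUES (1, 'perl'), (1, 'vim'), (2, 'perl');
SELECT COUNT(*), tag FROM article_tag GROUP BY tag ORDER BY COUNT(*) DESC, tag;
SQL
```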

Vim integration

Next I wanted to integrate q into Vim, so that when I wanted to see all posts tagged vim (or whatever) I could easily do so from within the current editor instance instead of spawning a new one.


To be clear, the simple way, where you spawn a new instance, is easily achieved like this:

$ vi $(bin/tag-files vim)

But I wanted to do that from within vim. I came up with some functions and commands to do what I wanted, but it was fairly painful. Vim is powerful, but it gets weird fast. Here’s how I made a :Tagged vim command:

function Tagged(tag)
  execute 'args `bin/tag-files ' . a:tag . '`'
endfunction
command -nargs=1 Tagged call Tagged('<args>')

:execute is a kind of eval. In vim there are a lot of different execution contexts and each one needs its own kind of eval; this is the Ex-mode eval. :args {arglist} simply sets the argument list. And the magic above is that surrounding a string with backticks causes the command to be executed and the output interpolated, just like in shell or Perl.

I also added a window local version, using :arglocal:

function TLagged(tag)
  exe 'arglocal `bin/tag-files ' . a:tag . '`'
endfunction
command -nargs=1 TLagged call TLagged('<args>')


I also used the quickfix technique I blogged about before because it comes with a nice, easy to use window (see :cwindow) and I added a caption to each file. I did it for the chronological tool since that ends up being the largest possible list of posts. Making it easier to navigate is well worth it. Here’s the backing script:


#!/bin/sh

exec bin/q --sql 'SELECT filename, title, date FROM articles ORDER BY date DESC' \
           --formatter 'my ($d) = split /T/, $r{date}; "$r{filename}:1:$d $r{title}"'

and then the vim command is simply:

command Chrono cexpr system('bin/quick-chrono')


Another command I added is called :TaggedWord. It takes the word under the cursor and loads all of the files with that tag into the argument list. If I can figure out how to bake it into CTRL-] (or something else like it) I will, as that would be more natural.

function TaggedWord()
  " add `-` as a "word" character
  set iskeyword+=45
  " save the current value of the @m register
  let l:tmp = @m
  " yank the word under the cursor into @m
  normal "myiw
  call Tagged(@m)
  " restore
  set iskeyword-=45
  let @m = l:tmp
endfunction
command TaggedWord call TaggedWord()

I also made a local version of that, but I’ll leave the definition of that one to the reader as an exercise.

Tag Completion

As a final cherry on top I added a completion function for tags. This is probably the most user-friendly way I can keep using the right tags. When I write a post, and start typing tags, existing tags will autocomplete and thus will be more likely to be selected than to be duplicated. It’s not perfect, but it’s pretty good. Here’s the code:

au FileType markdown execute 'setlocal omnifunc=CompleteTags'
function! CompleteTags(findstart, base)
  " This is almost purely cargo culted from the vim doc
  if a:findstart
    let line = getline('.')
    let start = col('.') - 1
    " tags are word characters and -
    while start > 0 && line[start - 1] =~ '\w\|-'
      let start -= 1
    endwhile
    return start
  else
    " only run the command if we are on the "tags: [...]" line
    if match(getline('.'), "tags:") == -1
      return []
    endif

    " get list of tags that have current base as a prefix
    return systemlist('bin/tags ' . a:base . '%')
  endif
endfunction

And here’s the referenced bin/tags:


#!/bin/sh

match="$1"
bin/q --sql 'SELECT tag FROM article_tag WHERE tag LIKE ? GROUP BY tag' -- "$match"

This little excursion was a lot of fun for me. I’ve always thought that Vim’s completion was black magic, but it’s really not. And the lightbulb moment about building an in-memory SQLite database was particularly rewarding. I hope I inspired readers to write some tools as well; go forth, write!

Posted Wed, Jul 20, 2016

Development with Docker

I have not seen a lot of great examples of how to use Docker as a developer. There are tons of examples of how to build images, how to use existing images, etc. Writing code that will end up running inside of a container, and even more so writing code that gets compiled, debugged, and developed in a container, is a bit trickier. This post dives into my personal usage of containers for development. I don’t know if this is normal or even good, but I can definitely vouch that it works.

First off, I am developing with an interpreted language most of the time. I still think these issues apply with compiled languages but they are easier to ignore and sweep under the rug. In this post I’ll show how I create layered images for developing a simple web service in Perl. It could be Ruby or Python of course, I just know Perl the best so I’m using it for the examples.

Here is a simple Makefile to build the images:

api:
	docker build -f ./Dockerfile.api -t pw/api .

db:
	exit 1

perl-base:
	docker build -f ./Dockerfile.perl-base -t pw/perl-base .

I can build three images, one of which (db) is not-yet-defined but planned.


Here is Dockerfile.perl-base

FROM alpine:3.4

ADD cpanfile /root/cpanfile
RUN apk add --update build-base wget perl perl-dev && \
   cpan App::cpm && \
   cd /root && \
   cpm -n --installdeps .

I use Alpine as the underlying image for my containers if possible, because it is almost as lightweight as it gets. Beware though, if you use it you may run into problems because it uses musl instead of glibc. I have only run into issues twice though, and one was a bug in the host kernel.

Next I add the cpanfile to the image. I could probably do something weird like build the Dockerfile and directly add the lines from the cpanfile to the Dockerfile, but that doesn’t seem worth the effort to me.

Finally, in a single layer (hence the && \'s), I:

  • Install Perl (which is a very recent 5.22)
  • Install cpm
  • Install the dependencies of the application

Basically what the above gives you is a cache layer where most of your dependencies are installed. This can hugely speed development while you are adding dependencies to the next layer. This methodology is also useful at deployment time, because new builds of the codebase need not rebuild the entire base image, but instead just one or more layers on top. The base image in this example is over 400 megs, and that’s with Alpine; if it were Ubuntu it would likely be over 700. The point is you don’t want to have to push that whole base layer to production for a spelling fix.


Here is Dockerfile.api

FROM pw/perl-base

ADD . /opt/api
WORKDIR /opt/api

RUN cpm -n --installdeps .

Sometimes I’ll add extra bits to the RUN directive. Like currently in the project I’m working on it’s:

RUN apk add perl-posix-strftime-compiler && cpanm --installdeps .

Because I needed Alpine’s patched POSIX::Strftime::Compiler. That will at some point be baked into the lower layer.


If your project is sufficiently large, it is also likely worth it to break api into two layers. One called, for example, staging, which is almost exactly the same as the base layer, but its FROM is your base. api then becomes just the ADD and WORKDIR directives.
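A sketch of that split, with the staging file name and contents being my guesses at what such a layer would look like:

```dockerfile
# Dockerfile.staging (hypothetical name): same dependency install as the
# base image, but starting FROM our own base
FROM pw/perl-base

ADD cpanfile /opt/api/cpanfile
WORKDIR /opt/api
RUN cpm -n --installdeps .
```

Dockerfile.api would then be just FROM pw/staging followed by the ADD and WORKDIR directives.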

Another pretty cool refinement is to use docker run to build images. If you have special build requirements this is super handy. A couple reasons why one might need this would include needing to run multiple programs at once during the build, or needing to mount code that will not be added directly to an image. Here’s how it’s done:

#!/bin/sh

# assumes $TMPNAME (a unique container name) and $FROM (the base image)
# are set earlier in the script
TMP_DIR=$(mktemp -td tmp.$1.XXXXXXXX)

# start container
docker run -d \
   --name $TMPNAME \
   --volume $TMP_DIR:/tmp \
   $FROM /sbin/init

# build
docker exec $TMPNAME build --my /code

# save to pw/api
docker commit -m "build --my /code" $TMPNAME pw/api
docker rm -f $TMPNAME
sudo rm -rf $TMP_DIR

Both of these refinements are arguably gross, but they really help speed development and solve problems, so until there are better ways, I’m happy with them.


The above is a useful workflow for building your images, but that does not answer how the containers are used during development. There are a couple pieces to the answer there. First is this little script, which I placed in maint/dev-api:


#!/bin/sh

exec docker run --rm \
                --link some-postgres:db \
                --publish 5000:5000 \
                --user "$(id -u)" \
                --volume "$(pwd):/opt/api" \
                pw/api "$@"

The --link and --publish directives are sorta ghetto. At some point I’ll make the script dispatch based on the arguments and only link or publish if needed.

If possible I always use a non-root user, hence the --user directive. It is probably silly, but you almost never need root anyway, so you might as well not give it to the container. This has the nice side effect of ensuring that any files created from the container in a volume have the right owner.

The --volume should be clear: it replaces the code you built into the image with the code that’s on your laptop, without requiring a rebuilt image.

The other part to make this all work are a few more directives in the Makefile:

update-database:
	maint/dev-api perl -Ilib bin/update-database

run-migrations:
	docker run --rm --link some-postgres:db pw/api perl -Ilib bin/update-database 1

start-db:
	docker run --name some-postgres -d postgres

rm-db:
	docker rm -f some-postgres

I haven’t gotten around to creating a database container; I’m just using the official docker one for now. I will eventually replicate it for my application in a more lightweight fashion, but this helps me get up and get going. I wouldn’t have made the rm-db directive except the docker tab completion seems to be pretty terrible, but the make tab completion is perfect.

run-migrations is a little weird. It requires a complete rebuild just to update some DDL, but I believe it will be worth it in the long term. I suspect that I’ll be able to push the api container to some host, run run-migrations, and be done, instead of needing a checkout of the code on the host.


One of the details above that I haven’t gone into is the --link directive. This sets up the container so that it has access to the other container, with some environment variables set for the exposed ports in the linked container. On the face of it, this is just a way to connect two containers. Here is how I’m connecting from a script that deploys database code:

#!/usr/bin/env perl

use 5.22.0;
use warnings;

use DBIx::RetryConnect 'Pg';
use PW::Schema;
my $s = PW::Schema->connect(
   "dbi:Pg:dbname=$ENV{DB_ENV_POSTGRES_DB};host=$ENV{DB_PORT_5432_TCP_ADDR}",
   $ENV{DB_ENV_POSTGRES_USER},
   $ENV{DB_ENV_POSTGRES_PASSWORD},
);

Notice that I simply use some environment variables that follow a fairly obvious pattern (you can discover them more easily by linking a container that runs env than by reading the docs.)

One other subtle detail is the use of DBIx::RetryConnect. With containers it is much more common to start all of your containers concurrently, versus with typical init systems or even virtual machines. This means baking retries into your applications, as it stands today, is a requirement. Either that or you add stupid sleep statements and hope nothing ever gets run on an overloaded machine.
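The retry that DBIx::RetryConnect bakes in is just a bounded loop; here's the shape of it as a plain shell sketch, with check_db standing in for a real connection attempt:

```shell
# stand-in for a real connection check; pretend the DB came up immediately
check_db() { true; }

tries=0
until check_db; do
  tries=$((tries + 1))
  # give up eventually instead of hanging forever
  [ "$tries" -ge 30 ] && { echo "gave up waiting for db" >&2; exit 1; }
  sleep 1
done
echo "connected after $tries retries"
```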


Linking is pretty cool. For those who haven’t investigated this space much, linking seems like some cool magic “thing.” Linking is actually a builtin service discovery method for allowing containers to know about each other. But linking has a major drawback: to link containers in docker you have to start the containers serially. This is because links are resolved at container creation time. Worse yet you can’t change the environment variables of a running program, so links cannot be updated. This is at the very least a hassle because it introduces a synthetic, implied ordering to the starting of containers.

You can resolve the ordering problem with docker network:

# run API container
docker run -d \
   --name $NAME \
   pw/api www

# add to network
docker network create pw
docker network connect pw $NAME

# run db container
docker run --name db -d postgres
docker network connect pw db

Order no longer matters and you have much more flexibility with how you do discovery. But now you need to make a decision about discovery, as the environment variables will no longer be magically set for you. I strongly believe that this is where anyone doing anything moderately serious will end up anyway. The serialization of startup is just too finicky to be seriously considered.

I haven’t done enough with service discovery myself to recommend any path forward, but knowing the name to search for should give you plenty of rope.

I hope the ideas and examples above help anyone who is grappling with how to use Docker. Any criticisms or other ideas are welcome.

Posted Mon, Jul 18, 2016

Set-based DBIx::Class

This was originally posted to the 2012 Perl Advent Calendar. I refer people to this article so often that I decided to repost it here in case anything happens to the server it was originally hosted on.

I’ve been using DBIx::Class for a few years, and I’ve been part of the development team for just a little bit less. Three years ago I wrote a Catalyst Advent article about the five DBIx::Class::Helpers, which have since ballooned to twenty-four. I’ll be mentioning a few helpers in this post, but the main thing I want to describe is a way of using DBIx::Class that results in efficient applications as well as reduced code duplication.

(Don’t know anything about DBIx::Class? Want a refresher before diving in more deeply? Maybe watch my presentation on it, or, if you don’t like my face, try this one.)

The thesis of this article is that when you write code to act on things at the set level, you can often leverage the database’s own optimizations and thus produce faster code at a lower level.

Set Based DBIx::Class

The most important feature of DBIx::Class is not the fact that it saves you time by allowing you to sidestep database incompatibilities. It’s not that you never have to learn the exact way to paginate correctly with SQL Server. It isn’t even that you won’t have to write DDL for some of the most popular databases. Of course DBIx::Class does do these things. Any ORM worth its salt should.


The most important feature of DBIx::Class is the ResultSet. I’m not an expert on ORMs, but I’ve yet to hear of another ORM which has an immutable query representation framework. (If it weren’t for the implicit iterator, akin to each %foo, it would be 100% immutable. It’s pretty close though!) The first thing you must understand to achieve DBIx::Class mastery is ResultSet chaining. This is basic but critical.

The basic pattern of chaining is that you can do the following and not hit the database:

$person_rs->search({
   name => 'frew',
})->search({
   job => 'software engineer',
});

What the above implies is that you can add methods to your resultsets like the following:

sub search_by_name {
   my ($self, $name) = @_;

   $self->search({ $self->current_source_alias . ".name" => $name })
}

sub is_software_engineer {
   my $self = shift;

   $self->search({
      $self->current_source_alias . ".job" => 'software engineer',
   })
}

And then the query would become merely:

$person_rs->search_by_name('frew')->is_software_engineer;

(microtip: use DBIx::Class::Helper::ResultSet::Me to make defining searches as above less painful.)

Relationship Traversal

The next thing you need to know is relationship traversal. This can happen two different ways, and to get the most code reuse out of DBIx::Class you’ll need to be able to reach for both when the time arises.

The first is the more obvious one:

$person_rs->search({
   'job.name' => 'goblin king',
}, {
   join => 'job',
});
The above finds person rows that have the job “goblin king.”

The alternative is to use related_resultset in DBIx::Class::ResultSet:

$person_rs->related_resultset('job')->search({
   'job.name' => 'goblin king',
});
The above generates the same query, but allows you to use methods that are defined on the job resultset.


Subqueries

Subqueries are less important for code reuse and more important in avoiding incredibly inefficient database patterns. Basically, they allow the database to do more on its own. Without them, you’ll end up asking the database for data, then you’ll send that data right back to the database as part of your next query. It’s not only pointless network overhead but also two queries.

Here’s an example of what not to do in DBIx::Class:

my @failed_tests = $tests->search({
   pass => 0,
});

my @not_failed_tests = $tests->search({
  id => { -not_in => [map $_->id, @failed_tests] }, # XXX: DON'T DO THIS
});

If you got enough failed tests back, this would probably just error. Just Say No to inefficient database queries:

my $failed_tests = $tests->search({
   pass => 0,
})->get_column('id')->as_query;

my @not_failed_tests = $tests->search({
  id => { -not_in => $failed_tests },
});

This is much more efficient than before, as it’s just a single query, lets the database do what it does best, and gives you exactly what you want.
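In plain SQL terms, the efficient version is a single NOT IN subquery; here is the shape of it via the sqlite3 CLI, with toy data invented for the demo:

```shell
# one round trip: the database resolves the failed-test ids itself
sqlite3 :memory: <<'SQL'
CREATE TABLE tests ( id, pass );
INSERT INTO tests VALUES (1, 0), (2, 1), (3, 1);
SELECT id FROM tests
WHERE id NOT IN (SELECT id FROM tests WHERE pass = 0);
SQL
```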


Ok so now you know how to reuse searches as much as is currently possible. You understand the basics of subqueries in DBIx::Class and how they can save you time. My guess is that you actually already knew that. “This wasn’t any kind of ninja secret, fREW! You lied to me!” I’m sorry, but now we’re getting to the real meat.

Correlated Subqueries

One of the common, albeit expensive, usage patterns I’ve seen in DBIx::Class is using N + 1 queries to get related counts. The idea is that you do something like the following:

my @data = map +{
   %{ $_->as_hash },
   friend_count => $_->friends->count, # XXX: BAD CODE, DON'T COPY PASTE
}, $person_rs->all;

Note that the $_->friends->count is a query to get the count of friends. The alternative is to use correlated subqueries. Correlated subqueries are hard to understand and even harder to explain. The gist is that, just like before, we are just using a subquery to avoid passing data to the database for no good reason. This time we are just going to do it for each row in the database. Here is how one would do the above query, except as promised, with only a single hit to the database:

my @data = map +{
   %{ $_->as_hash },
   friend_count => $_->get_column('friend_count'),
}, $person_rs->search(undef, {
   '+columns' => {
      friend_count => $friend_rs->search({
         'friend.person_id' =>
            { -ident => $person_rs->current_source_alias . ".id" },
      }, {
        alias => 'friend',
      })->count_rs->as_query,
   },
})->all;
There are only two new things above. The first is -ident. All -ident does is tell DBIx::Class “this is the name of a thing in the database, quote it appropriately.” In the past people would have written -ident using queries like this:

'friend.person_id' => \' = person.id' # don't do this, it's silly

So if you see something like that in your code base, change it to -ident as above.

The next new thing is the alias => 'friend' directive. This merely ensures that the inner rs has its own alias, so that you have something to correlate against. If that doesn’t make sense, just trust me and cargo cult for now.

This adds a virtual column, which is itself a subquery. The column is, basically, $friend_rs->search({ 'friend.person_id' => $_->id })->count, except it’s all done in the database. The above is horrible to recreate every time, so I made a helper: DBIx::Class::Helper::ResultSet::CorrelateRelationship. With the helper the above becomes:

my @data = map +{
   %{ $_->as_hash },
   friend_count => $_->get_column('friend_count'),
}, $person_rs->search(undef, {
   '+columns' => {
      friend_count => $person_rs->correlate('friend')->count_rs->as_query,
   },
})->all;

Correlated Subqueries are nice, especially given that there is a helper to make creating them easier, but it’s still not as nice as we would like it. I made another helper which is the icing on the cake. It encourages more forward-thinking DBIx::Class usage with respect to resultset methods.

Let’s assume you need friend count very often. You should make the following resultset method in that case:

sub with_friend_count {
   my $self = shift;

   $self->search(undef, {
      '+columns' => {
         friend_count => $self->correlate('friend')->count_rs->as_query,
      },
   })
}

Now you can just do the following to get a resultset with a friend count included:

$person_rs->with_friend_count
But to access said friend count from a result you’ll still have to use ->get_column('friend_count'), which is a drag since using get_column on a DBIx::Class result is nearly using a private method. That’s where my helper comes in. With DBIx::Class::Helper::Row::ProxyResultSetMethod, you can use the ->with_friend_count method from your row methods, and better yet, if you used it when you originally pulled data with the resultset, the result will use the data that it already has! The gist is that you add this to your result class:

__PACKAGE__->load_components(qw( Helper::Row::ProxyResultSetMethod ));
__PACKAGE__->proxy_resultset_method('friend_count');

and that adds a friend_count method to your row objects that will correctly proxy to the resultset, use the data already fetched if it’s there, and cache if called more than once!


I have one more, small gift for you. Sometimes you want to do something when either your row or resultset is updated. I posit that the best way to do this is to write the method in your resultset and then proxy to the resultset from the row. If you force your API to update through the result you are doing N updates (one per row), which is inefficient. My helper simply needs to be loaded:

__PACKAGE__->load_components(qw( Helper::Row::ProxyResultSetUpdate ));

and your results will use the update defined in your resultset.

Don’t Stop!

This isn’t all! DBIx::Class can be very efficient and also reduce code duplication. Whenever you have something that’s slow or bound to result objects, think about what you could do to leverage your amazing storage layer’s speed (the RDBMS) and whether you can push the code down a layer to be reused more.

Posted Sat, Jul 16, 2016

Investigation: Why Can't Perl Read From TMPDIR?

On Wednesday afternoon my esteemed colleague Mark Jason Dominus (who already blogged this very story, but from his perspective), showed me that he had run into a weird issue. Here was how it manifested:

$ export TMPDIR='/mnt/tmp'
$ env | grep TMPDIR
TMPDIR=/mnt/tmp
$ /usr/bin/perl -le 'print $ENV{TMPDIR}'

So to be clear, nothing was printed by Perl.

Another strange detail was that it happened in our development sandboxes, but not in production. I quickly reproduced it in my sandbox and verified with strace that the env var was being set: (reformatted for readability)

$ strace -v -etrace=execve perl -le'print $ENV{TMPDIR}'
execve("/usr/bin/perl", ["perl", "-leprint $ENV{TMPDIR}"], [
  "LESSCLOSE=/usr/bin/lesspipe%s %"...,
  "LESSOPEN=| /usr/bin/lesspipe %s",
  "SSH_CLIENT= 22976 22",
  "SSH_CONNECTION= 22976"...,
  "TMPDIR=/mnt/tmp",
]) = 0

It should be obvious that TMPDIR is included in the execve call above. I knew that there had been a recent security patch related to environment variables, so I ran apt-get upgrade in my sandbox and it fixed the issue! But in mjd’s sandbox he had the same exact version of Perl (verified by running sha1sum on /usr/bin/perl.) My sandbox is a local docker machine and his is an EC2 instance, so maybe something there could be causing an issue.

My next idea was to ask around in #p5p, the channel where people who hack on the core Perl code hang out. In what follows I’m crediting the people who had the first idea for a given thing to check. There was a lot of repetition, so I’ll spare you and only list the initial time something was mentioned.

Lukas Mai aka Mauke chimed in quickly saying that I should:

  • print the entire environment (perl -E'say "$_=$ENV{$_}" for keys %ENV')
  • use the perl debugger (PERLDB_OPTS='NonStop AutoTrace' perl -d -e0)
  • use ltrace

The first two of those were non-starters. Nothing interesting happened. Here is the unabbreviated ltrace of the issue in question:

$ ltrace perl -le'print $ENV{TMPDIR}'
__libc_start_main(0x400c70, 2, 0x7fff1fa24e88, 0x400f30, 0x400fc0 <unfinished ...>
Perl_sys_init3(0x7fff1fa24d7c, 0x7fff1fa24d70, 0x7fff1fa24d68, 0x400f30, 0x400fc0) = 0
__register_atfork(0x7fad644a3c10, 0x7fad644a3c50, 0x7fad644a3c50, 0, 0x7fff1fa24ca0) = 0
perl_alloc(0, 0x7fad6440efb8, 0x7fad6440ef88, 48, 0x7fff1fa24ca0) = 0x2551010
perl_construct(0x2551010, 0, 0, 0, 0)               = 0x2558f60
perl_parse(0x2551010, 0x400eb0, 2, 0x7fff1fa24e88, 0 <unfinished ...>
Perl_newXS(0x2551010, 0x40101c, 0x7fad64550f80, 0x7fff1fa24b90, 0x7fad645532c0) = 0x2571b28
<... perl_parse resumed> )                          = 0
perl_run(0x2551010, 0x2551010, 0, 0x2551010, 0
)     = 0
Perl_rsignal_state(0x2551010, 0, 0x2551288, 0x2551010, 0x7fff1fa24c50) = -1
Perl_rsignal_state(0x2551010, 1, -1, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 2, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 3, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 4, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 5, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 6, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 7, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 8, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 1
Perl_rsignal_state(0x2551010, 9, 1, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 10, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 11, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 12, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 13, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 14, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 15, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 16, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 17, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 18, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 19, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 20, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 21, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 22, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 23, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 24, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 25, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 26, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 27, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 28, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 29, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 30, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 31, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 32, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = -1
Perl_rsignal_state(0x2551010, 33, -1, 0x7fad6408a1b5, 0x7fff1fa24cb0) = -1
Perl_rsignal_state(0x2551010, 34, -1, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 35, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 36, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 37, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 38, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 39, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 40, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 41, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 42, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 43, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 44, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 45, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 46, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 47, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 48, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 49, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 50, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 51, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 52, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 53, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 54, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 55, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 56, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 57, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 58, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 59, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 60, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 61, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 62, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 63, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 64, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 6, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 17, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 29, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
Perl_rsignal_state(0x2551010, 31, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
perl_destruct(0x2551010, 0, 0, 0x7fad6408a1b5, 0x7fff1fa24cb0) = 0
perl_free(0x2551010, 0xffffffff, 0x2551010, 0x7fad6440b728, 0x7fad6478e0c0) = 2977
Perl_sys_term(0x7fad6440b720, 0, 0x7fad6440b778, 0x7fad6440b728, 0x7fad6478e0c0) = 0
exit(0 <unfinished ...>
+++ exited (status 0) +++

I have yet to see ltrace actually help me with debugging. More on that later.

Next Ricardo Jelly Bean Signes mentioned that I should try diffing the environment. As expected the only differences were TMPDIR being missing, and _ being /usr/bin/perl or /usr/bin/env respectively.

Dominic Hargreaves looked closely at the patch (which he had ported to the version of Perl in question) and verified that it shouldn’t be causing what we were seeing.

At this point I decided to attempt to bisect a build of Perl to figure out the cause of the problem. Here’s what I did:

git clone git:// -b wheezy
make -f debian/rules build

I ctrl-c’d the tests, since I knew Perl was built at that point. When I did TMPDIR=foo ./perl -E'say $ENV{TMPDIR}' it “worked” and printed foo. I tried this on a proper virtual machine, on my docker based sandbox, and on the metal of my laptop. None reproduced the problem. Bummer. I went home frustrated, without any answers.

The following morning I mentioned my progress in #p5p to see if anyone had any other ideas.

Todd Rinaldo verified that I wasn’t running perl under taint mode. I wasn’t, but that’s a great question. If you don’t know about taint mode, read the above. It could reasonably cause something like this. He also had me verify that env vars like TMPDIRA, TMPHAH, etc didn’t have the same issue (they did not.)
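Taint mode was a sensible suspect: under -T, Perl refuses to pass untrusted data (which includes %ENV) to dangerous operations, so it can absolutely produce "my environment variable mysteriously doesn't work" symptoms. A quick demonstration, assuming any stock perl:

```shell
# Under -T, using the (tainted) PATH to locate "ls" is refused outright;
# perl dies with something like
#   Insecure $ENV{PATH} while running with -T switch at -e line 1.
perl -T -e 'system "ls"' 2>&1 || true
```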

Matthew Horsfall had me compile and run the following code, to ensure that it worked like env. It did.

#include <unistd.h>
#include <stdio.h>

extern char **environ;

void main(void) {
  int i;

  for (i = 0; environ[i]; i++) {
    printf("%s\n", environ[i]);
  }
}

Matthew also verified what shell this happened under. I confirmed that it happened under both the GNU Bourne-Again Shell and the Debian Almquist Shell.

Next Andrew Main, more commonly known as Zefram, asked if I had a I did not.

Zefram next said I should try using gdb to inspect the running process. I needed some hand holding, but basically I did the following:

# install gdb
$ apt-get install gdb

# install debug headers
$ apt-get install libc6-dbg

$ gdb --args /usr/bin/perl -E 'say $ENV{TMPDIR}'
(gdb) break main
Breakpoint 1 at 0x41ca90
(gdb) run
Starting program: /usr/bin/perl perl -Esay\ \$ENV\{TMPDIR\}
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".

Breakpoint 1, 0x000000000041ca90 in main ()
(gdb) p environ[0]
$1 = 0x7fffffffe4df "XDG_SESSION_ID=c2"
(gdb) p environ[1]
$2 = 0x7fffffffe4f1 "TERM=screen-256color"
(gdb) p environ[2]
$3 = 0x7fffffffe506 "DISPLAY=:0"
[ etc etc ]

I iterated over the entire array (till I got to an empty entry) and there was no TMPDIR. Zefram then had me verify that my EUID and my UID matched. I used both id and perl -E'say "$<:$>"' to show that they did match. Zefram then asked if LD_LIBRARY_PATH had the same problem as TMPDIR, and it did!

11:00:12      Zefram | something is cleansing the environment for security reasons

Andrew Rodland commonly known as hobbs linked me to a bug detailing and explaining the issue.

The subtle reason why Dominus didn’t figure this out in the beginning is, unlike the issue above, the binary here is not actually setuid. Instead, it has what Linux calls capabilities, which are sortav root privileges broken down into discrete pieces. Sadly that means ls -l does not show them. In fact there is no flag to pass to ls to show them, so they are easily missed.
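For illustration, here is roughly how such a capability gets granted and inspected. This is a sketch (not our actual provisioning code) and assumes the libcap tools, setcap and getcap, are installed:

```shell
# grant the low-port capability to a copy of perl (setcap itself needs root)
cp /usr/bin/perl /tmp/perl-lowport
setcap cap_net_bind_service=+ep /tmp/perl-lowport

# ls -l on the copy looks completely normal; getcap is how you actually see it
getcap /tmp/perl-lowport

# when a non-root user runs this binary, AT_SECURE gets set and the dynamic
# linker scrubs TMPDIR, LD_PRELOAD, LD_LIBRARY_PATH, and friends
TMPDIR=/what /tmp/perl-lowport -E'say $ENV{TMPDIR} // "scrubbed"'
```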

In our developer sandboxes we add a capability to /usr/bin/perl to allow it to listen on low ports, so that developers can access their web application without needing to run Apache or some other proxy. We have plans to add a proxy for performance reasons in development anyway, but in the meantime I plan on adding some rules with iptables and removing the capability, to resolve this issue.

Here’s a funny side note to all of this: this capability has been set on our binary since 2013. Dominus ran into a problem with it Wednesday. Another coworker also ran into it two days later, for totally different reasons.

One more layer

One important thing I learned in this investigation is that there is this mostly invisible and unspoken layer: the dynamic linker. I vaguely knew that there was this thing that wires together binaries and their dynamic libraries, but I never really considered that there was more to it than that. The manpage of the dynamic linker has lots of details, but in this case the important section is:

   Secure-execution mode
       For security reasons, the effects of some environment variables are
       voided or modified if the dynamic linker determines that the binary
       should be run in secure-execution mode.  This determination is made
       by checking whether the AT_SECURE entry in the auxiliary vector (see
       getauxval(3)) has a nonzero value.  This entry may have a nonzero
       value for various reasons, including:

       *  The process's real and effective user IDs differ, or the real and
          effective group IDs differ.  This typically occurs as a result of
          executing a set-user-ID or set-group-ID program.

       *  A process with a non-root user ID executed a binary that conferred
          permitted or effective capabilities.

       *  A nonzero value may have been set by a Linux Security Module.

I have spent a little time while writing this post reading that manpage and playing with some of the various options. This is kinda cool:

$ LD_DEBUG=all /bin/ls

The amount of output is significant, so I’ll leave running the above as an exercise for the reader.
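If =all is too big a firehose, glibc will happily list the individual categories for you, and libs alone is much more digestible (this is glibc behavior; the exact category list varies by version):

```shell
# ask the dynamic linker what LD_DEBUG values it understands
LD_DEBUG=help /bin/true

# watch just the library search decisions, a much smaller firehose
LD_DEBUG=libs /bin/true
```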

Useful and (maybe?) not useful abstractions

The other thing that this investigation reinforced is my belief that not all abstractions and layers are important and useful. I have used strace countless times, and almost every time I use it, it tells me what I need to know (“what port is this program listening on?”, “where is this program’s config file?”, “what is this program blocking on?”). strace shows what system calls are being executed. To learn more, read some blog posts about strace or read the manpage.
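For example, the config-file question usually falls out of watching the open(2) family of syscalls; something like:

```shell
# trace which files a command opens; the log names every path it touched
strace -f -e trace=open,openat -o /tmp/trace.log cat /etc/hostname
head /tmp/trace.log
```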

Contrast that with ltrace. ltrace shows what library functions are being called. Bizarrely (to me), depending on the version of ltrace being run, it can be either just a little bit shorter than the output of strace (that’s what happened while debugging above) or hugely longer (on my laptop right now ltrace /usr/bin/perl -E'say $ENV{TMPDIR}' 2>&1 | wc -l is over six thousand, while the strace version is not even three hundred.) Maybe it depends on what debug symbols are installed? I don’t know. While it may be helpful to some to see this:

memmove(0x1e14e10, "print $ENV{TMPDIR}\n", 19)            = 0x1e14e10
__memcpy_chk(0x7ffd946385a1, 0x1e14c28, 5, 256)           = 0x7ffd946385a1
strlen("%ENV")                                            = 4
memchr("%ENV", ':', 4)                                    = 0
malloc(10)                                                = 0x1e16150

I suspect it is not important to most.

This is not to say that ltrace is worthless; it just is much more niche than strace. I would argue that strace is a tool worth using while writing code for almost any engineer. Yet in a decade of professional problem solving I have not been helped by ltrace.

I hope you enjoyed this. It was fun to experience and to learn about. Thanks go to all the people mentioned above. If you liked this but haven’t already read the post linked above, authored by MJD, go do that now.

Posted Thu, Jun 30, 2016

Reap slow and bloated plack workers

As mentioned before at ZipRecruiter we are trying to scale our system. Here are a couple ways we are trying to ensure we maintain good performance:

  1. Add timeouts to everything
  2. Have as many workers as possible


Timeouts are always important. A timeout that is too high will allow an external service to starve your users. A timeout that is too low will give up too quickly. No timeout is basically a timeout that is too high, no matter what. My previous post on this topic was about adding timeouts to MySQL. For what it’s worth, MySQL does have a default timeout, but it’s a year, so it’s what most people might call: too high.

Normally people consider timeouts for external services, but it turns out they are useful for our own servers as well. Sometimes people accidentally write code that can be slow in unusual cases: while it’s fast 99.99% of the time, the remaining 0.01% can induce outages by slowing down requests and tying up web workers.

One way to add timeouts to code is to make everything asynchronous and tie all actions to clock events, so that you query the database and if the query doesn’t come back before the clock event, you have some kind of error. This is all well and good, but it means that you suddenly need async versions of everything, and I have yet to see universal RDBMS support for async. If you need to go that route you are almost better off rewriting all of your code in Go.

The other option is to bolt on an external watchdog, very similar to the MySQL reaper I wrote about last time.

More Workers

Everywhere I have worked the limiting factor for more workers has been memory. There are a few basic things you can do to use as little memory as possible. First and foremost, with most of these systems you are using some kind of preforking server, so you load up as many libraries before the fork as possible. This will allow Linux (and nearly all other Unix implementations) to share a lot of the memory between the master and the workers. On our system, in production, most workers are sharing about half a gig of memory with the master. That goes a really long way when you have tens of workers.

The other thing you can do is avoid loading lots of stuff into memory in the first place. Due to Perl’s memory model, when lots of memory is allocated, it is never returned to the operating system, and instead reserved for later use by the process. Instead of slurping a whole huge file into memory, just incrementally process it.

Lastly, you can add a stop gap solution that fits nicely in a reaper process. In addition to killing workers that are taking too long serving a single request, you can reap workers that have allocated too much memory.


Because of the sharing mentioned above, we really want to care about private (that is, not shared) memory more than anything else. Killing a worker because the master has gotten larger is definitely counterproductive. We can leverage Linux’s /proc/[pid]/smaps for this. The good news is that if you simply parse that file for a given worker and sum up the Private_Clean and Private_Dirty fields, you’ll end up with all of the memory that only that process has allocated. The bad news is that it can take a while. Greater than ten milliseconds seems typical; that means that adding it to the request lifecycle is a non-starter. This is why baking this into your plack reaper makes sense.
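If you want to sanity-check the numbers Linux::Smaps gives you, the same sum is a one-liner against /proc (shown here against the current shell; any pid you own works):

```shell
# add up Private_Clean + Private_Dirty for one process, in kB
awk '/^Private_(Clean|Dirty):/ { kb += $2 } END { print kb " kB private" }' \
    "/proc/$$/smaps"
```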

Plack Reaper

The listing below is a sample of how to make a plack reaper to resolve the above issues. It uses USR1 for timeouts, to simply kill those workers. The worker is expected to have code to intercept USR1, log what request it was serving (preferably in the access log) and exit. USR2 is instead meant to allow the worker to finish serving its current request, if there is one, and then exit after. You can leverage psgix.harakiri for that.

We also use Parallel::Scoreboard, which is what Plack::Middleware::ServerStatus::Lite uses behind the scenes.

(Note that this is incredibly simplified from what we are actually using in production. We have logging, more robust handling of many various error conditions, etc.)


use strict;
use warnings;

use Linux::Smaps;
use Parallel::Scoreboard;
use JSON 'decode_json';

my $scoreboard_dir = '/tmp/' . shift;
my $max_private    = shift;

my $scoreboard = Parallel::Scoreboard->new(
  base_dir => $scoreboard_dir,
);

while (1) {
  my $stats = $scoreboard->read_all;

  for my $pid (keys %$stats) {
    my %status = %{decode_json($stats->{$pid})};

    # undefined time will become zero, age will be huge, should get killed
    my $age = time - $status{time};

    kill USR1 => $pid
      if $age > timeout(\%status);

    my $smaps = Linux::Smaps->new($pid);

    my $private = $smaps->private_clean + $smaps->private_dirty;
    kill USR2 => $pid
      if $private > $max_private;
  }

  sleep 1;
}

sub timeout {
  return 10 * 60 if shift->{method} eq 'POST';
  2 * 60;
}
I am very pleased that we have the above running in production and increasing our effective worker count. Maybe next time I’ll blog about our awesome logging setup, or about something that I (though not ZipRecruiter) think should be considered harmful.

Until next time!

Posted Wed, Jun 29, 2016

AWS Retirement Notification Bot

If you use AWS a lot you will be familiar with the “AWS Retirement Notification” emails. At ZipRecruiter, when we send our many emails, we spin up tens of servers in the middle of the night. There was a period for a week or two where I’d wake up to one or two notifications each morning. Thankfully those servers are totally ephemeral. By the time anyone even noticed the notification the server was completely gone. Before I go further, here’s an example of the beginning of that email (the rest is static:)

Dear Amazon EC2 Customer,

We have important news about your account (AWS Account ID: XXX). EC2 has detected degradation of the underlying hardware hosting your Amazon EC2 instance (instance-ID: i-deadbeef) in the us-east-1 region. Due to this degradation, your instance could already be unreachable. After 2016-07-06 02:00 UTC your instance, which has an EBS volume as the root device, will be stopped.

Note that the identifier there is totally not useful to a human being. Every time we got this notification someone on my team would log into the AWS console, look up the server, and email the team: “the server is gone, must have been one of the email senders” or maybe “the server is an email sender and will be gone soon anyway.”

Like many good programmers I am lazy, so I thought to myself: “I should write an email bot to automate what we are doing!”



use strict;
use warnings;

use Mail::IMAPClient;
use Email::Address;
use Email::Sender::Simple qw(sendmail);
use Data::Dumper::Concise;
use Try::Tiny;

my ($from) = Email::Address->parse('Zip Email Bot <>');
my $imap = Mail::IMAPClient->new(
  Server   => '',
  User     => $from->address,
  Password => $ENV{ZIP_EMAIL_BOT_PASS},
  Ssl      => 1,
  Uid      => 1,
) or die 'Cannot connect to as ' . $from->address . ": $@";

$imap->select( $ENV{ZIP_EMAIL_BOT_FOLDER} )
  or die "Select '$ENV{ZIP_EMAIL_BOT_FOLDER}' error: ", $imap->LastError, "\n";

for my $msgid ($imap->search('ALL')) {

  require Email::MIME;
  my $e = Email::MIME->new($imap->message_string($msgid));

  # if an error happens after this the email will be forgotten
  $imap->copy( 'processed', $msgid )
    or warn "Could not copy: $@\n";

  $imap->move( '[Gmail]/Trash', $msgid )
    or die "Could not move: $@\n";

  my @ids = extract_instance_list($e);

  next unless @ids;

  my $email = build_reply(
    $e, Dumper(instance_data(@ids))
  );

  try {
    sendmail($email);
  } catch {
    warn "sending failed: $_";
  };
}

# We ignore stuff in the inbox, stuff we care about gets filtered into another
# folder.
$imap->select( 'INBOX' )
  or die "Select 'INBOX' error: ", $imap->LastError, "\n";

my @emails = $imap->search('ALL');

if (@emails) {
  $imap->move( '[Gmail]/Trash', \@emails )
    or warn "Failed to cleanup inbox: " . $imap->LastError . "\n";
}

$imap->logout
  or die "Logout error: ", $imap->LastError, "\n";

# A lot of this was copy pasted from Email::Reply; I'd use it except it has some
# bugs and I was recommended to avoid it.  I sent patches to resolve the bugs and
# will consider using it directly if those are merged and released.
# -- fREW 22Mar2016
sub build_reply {
  my ($email, $body) = @_;

  my $response = Email::MIME->create;

  # Email::Reply stuff
  $response->header_str_set(From => "$from");
  $response->header_str_set(To => $email->header('From'));

  my ($msg_id) = Email::Address->parse($email->header('Message-ID'));
  $response->header_str_set('In-Reply-To' => "<$msg_id>");

  my @refs = Email::Address->parse($email->header('References'));
  @refs = Email::Address->parse($email->header('In-Reply-To'))
    unless @refs;

  push @refs, $msg_id if $msg_id;
  $response->header_str_set(References => join ' ', map "<$_>", @refs)
    if @refs;

  my @addrs = (
    Email::Address->parse($email->header('To')),
    Email::Address->parse($email->header('Cc')),
  );
  @addrs = grep { $_->address ne $from->address } @addrs;
  $response->header_str_set(Cc => join ', ', @addrs) if @addrs;

  my $subject = $email->header('Subject') || '';
  $subject = "Re: $subject" unless $subject =~ /\bRe:/i;
  $response->header_str_set(Subject => $subject);

  # generation of the body
  $response->body_set($body);

  return $response;
}

sub extract_instance_list {
  my $email = shift;

  my %ids;
  $email->walk_parts(sub {
    my $part = shift;
    return if $part->subparts; # multipart
    return if $part->header('Content-Disposition') &&
      $part->header('Content-Disposition') =~ m/attachment/;

    my $body = $part->body;

    while ($body =~ m/\b(i-[0-9a-f]{8,17})\b/gc) {
      $ids{$1} = undef;
    }
  });

  return keys %ids;
}

sub find_instance {
  my $instance_id = shift;

  my $res;
  # could infer region from the email but this is good enough
  for my $region (qw( us-east-1 us-west-1 eu-west-1 )) {
    $res = try {
      # theoretically we could fetch multiple ids at a time, but if we get the
      # "does not exist" exception we do not want it to apply to one of many
      # instances.
      _ec2($region)->DescribeInstances(InstanceIds => [$instance_id])
    } catch {
      # we don't care about this error
      die $_ unless m/does not exist/m;
      undef;
    };

    last if $res;
  }

  return $res;
}

sub instance_data {
  return unless @_;
  my %ids = map { $_ => 'not found (no longer exists?)' } @_;

  for my $id (keys %ids) {
    my $res = find_instance($id);

    next unless $res;

    my ($i, $uhoh) = map @{$_->Instances}, @{$res->Reservations};

    next unless $i;

    warn "multiple instances found for one instance id, wtf\n" if $uhoh;

    $ids{$id} = +{
      map { $_->Key => $_->Value }
        @{ $i->Tags }
    };
  }

  return \%ids;
}

my %ec2;
sub _ec2 {
  my $region = shift;

  require Paws;

  $ec2{$region} ||= Paws->service('EC2', region => $region );
}


There’s a lot of code there, but this is the meat of it:

my @ids = extract_instance_list($e);

next unless @ids;

my $email = build_reply(
  $e, Dumper(instance_data(@ids))
);

try {
  sendmail($email);
} catch {
  warn "sending failed: $_";
};

And then the end result is a reply-all to the original email that looks something like this:

Subject: Re: [Retirement Notification] Amazon EC2 Instance scheduled for retirement.

  "i-8c288e74" => {
    Level => "prod",
    Name => "send-22",
    Team => "Search"
  }
The code above is cool, but the end result is awesome. I don’t log into the AWS console often, and the above means I get to log in even less. This is the kind of tool I love; for the 99% case, it is quiet and simplifies all of our lives. I can see the result on my phone; I don’t have to connect to a VPN or ssh into something; it just works.


The power went out in the entire city of Santa Monica today, but I was able to work on this blog post (including seeing previews of how it would render) and access the emails that it references thanks to both my email setup and my blog setup. Hurray for software that works without the internet!

Posted Wed, Jun 22, 2016

Vim: Goto File

Vim has an awesome feature that I think is not shown off enough. It’s pretty easy to use and configure, but thankfully many languages have a sensible configuration out of the box.

Vim has this feature that opens a file when you press gf over a filename. On the face of it, it’s only sort of useful. There are a couple settings that make this feature incredibly handy.


First and foremost, you have to set your path. Typically when you open a Perl script or module in vim, the path is set to something like this:

  • $(pwd)
  • /usr/include
  • And Perl’s default @INC

It’s a good idea to add the path of your current project, for example:

:set path+=lib

So on a typical Linux system, you can type out zlib.h and press gf over it and pull up the zlib headers. The next feature is what really makes it powerful.

suffixesadd and includeexpr

The more basic of the two options is suffixesadd. It is simply a list of suffixes to attempt to add to the filename. So in the example above, if you :set suffixesadd=.h, type zlib, and press gf on the word, you’ll pull up the header files for zlib. That’s too basic for most modern programming environments though. Here’s the default includeexpr for me when I open a perl script:

substitute(substitute(substitute(v:fname,'::','/','g'),'->*','',''),'$','.pm','')
Let’s unpack that to make sure we see what’s going on. This may be subtly incorrect syntax, but that’s fine. The point is to communicate what is happening above.

to_open = v:fname

# replace all :: with /
to_open = substitute(to_open,'::','/','g')

# remove any method call (like ->foo)
to_open = substitute(to_open,'->*','','')

# append a .pm
to_open = substitute(to_open,'$','.pm','')

With the above we can find the filename to open. This is the default. You can do even better, if you put in a little effort. Here is an idea I’d like to try when I get some time: call a function as the expression, and in the function, if the fname contains ->resultset(...), return the namespaced resultset. I’d need to tweak 'isfname' to allow selecting weird characters, and maybe that would be more problematic than it’s worth, but it’s hard to know before you try. Could be really handy!

Even if you don’t go further with this idea, consider using gf more often. I personally use it (plus CTRL-O as a “back” command) to browse repos and even the Perl modules they depend on.

Posted Tue, Jun 21, 2016

Staring into the Void

Monday of this week either Gmail or OfflineIMAP had a super rare transient bug and duplicated all of the emails in my inbox, twice. I had three copies of every email! It was annoying, but I figured it would be pretty easy to fix with a simple Perl script. I was right; here’s how I did it:

#!/usr/bin/env perl

use 5.24.0;
use warnings;

use Email::MIME;
use IO::All;

my $dir = shift;

my @files = io->dir($dir)->all_files;

my %message_id;

for my $file (@files) {
   my $message_id = Email::MIME->new( $file->all )->header_str('message-id');
   unless ($message_id) {
      warn "No Message-ID for $file\n";
      next;
   }

   $message_id{$message_id} ||= [];
   push $message_id{$message_id}->@*, $file->name;
}

for my $message_id (keys %message_id) {
   my ($keep, @remove) = $message_id{$message_id}->@*;

   say "# keep $keep";
   say "rm $_" for @remove;
}

After running the script above I could eyeball the output and be fairly confident that I was not accidentally deleting everything. Then I just re-ran it and piped the output to sh. Et voilà! The inbox was back to normal, and I felt good about myself.
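This print-commands-then-pipe-to-sh pattern is worth stealing in general: the script proposes, you review, and sh disposes. A tiny simulation of the workflow, with made-up file names:

```shell
# build a "plan" like the script's output: comments for keepers, rm for dupes
mkdir -p /tmp/dedupe-demo
touch /tmp/dedupe-demo/msg-1 /tmp/dedupe-demo/msg-2
printf '%s\n' \
  '# keep /tmp/dedupe-demo/msg-1' \
  'rm /tmp/dedupe-demo/msg-2' > /tmp/dedupe-demo/plan

# comments are ignored by sh; only the rm line runs
sh < /tmp/dedupe-demo/plan
ls /tmp/dedupe-demo
```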

Then I got nervous

Sometimes when you are programming, you solve real world problems, like what day you’ll get married. Other times, you’re just digging yourself out of the pit that is everything that comes with programming. This is one of those times. I’ve mentioned my email setup before, and I am still very pleased with it. But I have to admit to myself that this problem would never have happened if I were using the web interface that Gmail exposes.

See, while I can program all day, it’s not actually what I get paid to do. I get paid to solve problems, not make more of them and then fix them with code. It’s a lot of fun to write code; when you write code you are making something and you get the nearly instant gratification of seeing it work.

I think code can solve many problems, and is worth doing for sure. In fact I do think the code above is useful and was worth writing and running. But it comes really close to what I like to call “life support” code. Life support code is not code that keeps a person living. Life support code is code that hacks around bugs or lack of features or whatever else, to keep other code running.

No software is perfect; there will always be life support code, incidental complexity, lack of idempotence, and bugs. But that doesn’t mean that I can stop struggling against this fundamental truth and just write / support bad software. I will continue to attempt to improve my code and the code around me, but I think writing stuff like the above is, to some extent, a warning sign.

Don’t just mortgage your technical debt; pay it down. Fix the problems. And keep the real goal in sight; you do not exist to pour your blood into a machine: solve real problems.

Posted Thu, Jun 16, 2016

Vim Session Workflow

Nearly a year ago I started using a new vim workflow leveraging sessions. I’m very pleased with it and would love to share it with anyone who is interested.

Session Creation

This is what really made sessions work for me. Normally in vim when you store a session, which captures almost the entire state of the editor (all open windows, buffers, etc.), you have to do it by hand, with the :mksession command. While that works, it means that you are doing that all the time. Tim Pope released a plugin called Obsession which resolves this issue.

When I use Obsession I simply run this command if I start a new project: :Obsess ~/.vvar/sessions/my-cool-thing. That will tell Obsession to automatically keep the session updated. I can then close vim, and if I need to pick up where I left off, I just load the session.

Lately, because I’m dealing with stupid kernel bugs, I have been using :mksession directly as I cannot seem to efficiently make session updating reliable.

Session Loading

I store my sessions (and really all files that vim generates to function) in a known location. The reasoning here is that I can then enumerate and select a session with a tool. I have a script that uses dmenu to display a list, but you could use one of those hip console based selectors too. Here’s my script:


#!/bin/sh
exec gvim -S "$(find ~/.vvar/sessions -maxdepth 1 -type f | dmenu)"

That simply starts gvim with the selected session. If the session was created with Obsession, it will continue to automatically update.

This allows me to easily stop working on a given project and pick up exactly where I left off. It would be perfect if my computer would stop crashing; hopefully it’s perfect for you!

Posted Thu, Jun 9, 2016

DBI Caller Info

At ZipRecruiter we have a system for appending metadata to queries generated by DBIx::Class. About a month ago I posted about bolting timeouts onto MySQL and in the referenced code I mentioned parsing said metadata. We are depending on that metadata more and more to set accurate timeouts on certain page types.

Adding Metadata to DBI Queries

Because of our increased dependence on query metadata, I decided today that I’d look into setting the metadata at the DBI layer instead of the DBIx::Class layer. This not only makes debugging certain queries easier, but more importantly allows us to give extra grace to queries coming from certain contexts.

First we define the boilerplate packages:

package ZR::DBI;

use 5.14.0;
use warnings;

use base 'DBI';

use ZR::DBI::db;
use ZR::DBI::st;

package ZR::DBI::st;

use 5.14.0;
use warnings;

use base 'DBI::st';


Next we intercept the prepare method. In this example we only grab the innermost call frame. At work we not only walk backwards based on a regex on the filename; we also have a hash that adds extra data, like what controller and action are being accessed when in a web context.

package ZR::DBI::db;

use 5.14.0;
use warnings;

use base 'DBI::db';

use JSON::XS ();

sub prepare {
  my $self = shift;
  my $stmt = shift;

  my ($class, $file, $line) = caller;
  my $sub = (caller(1))[3];

  $stmt .= " -- ZR_META: " . encode_json({
    class => $class,
    file  => $file,
    line  => $line,
    sub   => $sub,
  }) . "\n";

  $self->SUPER::prepare($stmt, @_);
}


Finally use the subclass:

my $dbh = DBI->connect($dsn, $user, $password, {
    RaiseError         => 1,
    AutoCommit         => 1,

    RootClass          => 'ZR::DBI',
});

The drawback of the above is that it could destroy (and maybe is destroying?) the caching of prepared statements. In our system that doesn’t seem to be very problematic, but I suspect it depends on RDBMS and workload. Profile your system before blindly following these instructions.
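A nice side effect is that the metadata rides along into slow-query logs and SHOW PROCESSLIST, and recovering it is trivial (the query text here is invented for illustration):

```shell
# pull the JSON back out of a logged query
echo 'SELECT * FROM jobs WHERE id = 1 -- ZR_META: {"file":"lib/ZR/App.pm","line":42}' \
  | sed 's/.*-- ZR_META: //'
# → {"file":"lib/ZR/App.pm","line":42}
```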

Wow that’s all there is to it! I expected this to be a lot of work, but it turns out Tim Bunce had my back and made this pretty easy. It’s pretty great when something as central as database access has been standardized!

Posted Wed, Jun 8, 2016

My Custom Keyboard

A few years ago I made my own keyboard, specifically an ErgoDox. I’ve been very pleased with it in general and I have finally decided to write about it.


The ErgoDox is sortav an open-source cross between the Kinesis Advantage and the Kinesis Freestyle. It’s two effectively independent halves that have a similar layout to the Advantage, especially the fact that the keys are in a matrix layout. If you don’t know what that means, think about the layout of a numpad and how the keys are directly above each other as opposed to staggered like the rest of the keyboard. That’s a matrix layout.

The other major feature of the ErgoDox is the thumb clusters. Instead of delegating various common keys like Enter and Backspace to pinky fingers, many keys are pressed by a thumb. Of course the idea is that the thumb is stronger and more flexible and thus more able to deal with consistent usage. I am not a doctor and can’t really evaluate the validity of these claims, but it’s been working for me.

The ErgoDox originally only shipped as a kit, so I ended up soldering all of the diodes, switches, etc together on a long hot day in my home office with a Weller soldering iron I borrowed from work. Of course because I had not done a lot of soldering or even electrical stuff I first soldered half of the diodes on backwards and had to reverse them. That was fun!


My favorite thing about my keyboard is that it runs my own custom firmware. It has a number of interesting features, but the coolest one is that when the operator holds down either a or ; the following keys get remapped:

  • h becomes ←
  • j becomes ↓
  • k becomes ↑
  • l becomes →
  • w becomes Ctrl + →
  • b becomes Ctrl + ←
  • y becomes Ctrl + C
  • p becomes Ctrl + V
  • d becomes Ctrl + X
  • u becomes Ctrl + Z
  • x becomes Delete

For those who can’t tell, this is basically a very minimal implementation of vi in the hardware of the keyboard. I can use this in virtually any context. The fact that keys that are not modifiers at all are able to be used in such a manner is due to the ingenuity of TMK.


When I bought the ErgoDox kit from MassDrop I had the option of either buying blank keycaps in a separate but concurrent drop, or somehow scrounging up my own keycaps somewhere else. After a tiny bit of research I decided to get the blank keycaps.


I had the idea for this part of my keyboard after having the keyboard for just a week. I’d been reading Homestuck which inspired me to use the Zodiak for the function keys (F1 through F12.)

After having the idea I emailed Signature Plastics, who make a lot of keycaps, about pricing of some really svelte keys. Note that this was three years ago, so I expect their prices are different. (And really the whole keycap business has exploded so who knows.) Here was their response:

In our DCS family, the Cherry MX compatible mount is the 4U. Will all 12 of the Row 5 keycaps have the same text or different text on them? Pricing below is based on each different keycap text. As you will see our pricing is volume sensitive, so if you had a few friends that wanted the same keys as you, you would be better off going that route.

  • 1 pc $98.46 each
  • 5 pcs $20.06 each
  • 10 pcs $10.26 each
  • 15 pcs $6.99 each
  • 25 pcs $4.38 each
  • 50 pcs $2.43 each

Please note that our prices do not include shipping costs or new legend fees should the text you want not be common text. Let me know if you need anything else!

So to be absolutely clear, if I were to get a set all by myself the price would exceed a thousand dollars, for twelve keys. I decided to start the process of setting up a group buy. I’m sad to say that I can’t find the forum where I initiated that. I thought it was GeekHack but there’s no post from me before I had the Zodiak keys.

Anyway just a couple of days after I posted on the forum I got this email from Signature Plastics:

I have some good news! It appears your set has interested a couple people in our company and we have an offer we were wondering if you would consider. Signature Plastics would like to mold these keycaps and place them on our marketplace. In turn for coming up with the idea (and hopefully helping with color selection and legend size) we will offer you a set free of charge… What do you think?

Of course I was totally down. I in fact ordered an extra set myself since I ended up making two of these keyboards eventually! Here’s a screenshot of the keycaps from their store page:


For those who don’t know, these keys are double-shot, which means each key is actually two pieces of plastic: an orange piece (the legend,) and a black piece molded around it. This means that no matter how much I type on them, the legends won’t wear off, even after twenty years of use. Awesome.


A couple of months after building the keyboard I came to the conclusion that I needed legends on all of the keys. I can touch type just fine, but when doing weird things like pressing hotkeys outside of the context of programming or writing I need the assistance of a legend. So I decided to make my own stealth keycaps.

You can see the original post on GeekHack here.

Here are the pictures from that thread:



Also, if you didn’t already, I recommend reading that short thread. The folks on GeekHack are super friendly, positive, and supportive. If only the rest of the internet could be half as awesome.


The one other little thing I’ve done to the keyboard is to add small rubber O-rings underneath each key. I have Cherry blues (which are supposed to click like an IBM Model-M) but with the O-rings the keyboard is both fairly quiet and feels more gentle on my hands. A full depress of a key, though not required with a mechanical switch, is cushioned by the rings.

My keyboard is one of the many tools that I use on a day to day basis to get my job done. It allows me to feel more efficient and take pride in the tools that I’ve built to save myself time and hopefully pain down the road. I have long had an unfinished post in my queue about how all craftspersons should build their own tools, and I think this is a fantastic example of that fine tradition.

Go. Build.

Posted Sat, Jun 4, 2016


A big trend lately has been the rise of “serverless” software. I’m not sure I’m the best person to define that term, but my use of the term generally revolves around avoiding a virtual machine (or a real machine I guess.) I have a server on Linode that I’ve been slowly removing services from in an effort to get more “serverless.”

It’s not about chasing fads. I am a professional software engineer and I mostly use Perl; I sorta resist change for the sake of it.

It’s mostly about the isolation of the components. As it stands today my server is a weird install of Debian where the kernel is 64 bit and the userspace is 32 bit. This was fine before, but now it means I can’t run Docker. I had hoped to migrate various parts of my own server to containers to be able to more easily move them to OVH when I eventually leave Linode, but I can’t now.


I could just rebuild the server, but then all of these various services that run on my server would be down for an unknown amount of time. To make this a little more concrete, here are the major services that ran on my blog at the beginning of 2016:

  1. Blog (statically served content from Apache)
  2. Lizard Brain (Weird automation thing)
  3. IRC Client (Weechat)
  4. RSS (An install of Tiny Tiny RSS; PHP on Apache)
  5. Feeds (various proxied RSS feeds that I filter myself)
  6. Git repos (This blog and other non-public repositories)
  7. SyncThing (Open source decentralized DropBox like thing)

The above are ordered in terms of importance. If SyncThing doesn’t work for some reason, I might not even notice. If my blog is down I will be very angsty.


I’ve already posted about when I moved my blog off Linode. That’s been a great success for me. I am pleased that this blog is much more stable than it was before; it’s incredibly secure, despite the fact that it’s “on someone else’s computer;” and it’s fast and cheap!


After winning a sweet skateboard from Heroku I decided to try out their software. It’s pretty great! The general idea is that you write some kind of web based app, and it will get run in a container on demand by Heroku, and after a period of inactivity, the app will be shut down.

This is a perfect way for my RSS proxy to run, and it simplified a lot of stuff. I had written code to automatically deploy when I push to GitHub. Heroku already does that. I never took care of automating the installation of deps, but Heroku (or really miyagawa) did.

While I had certificates automatically getting created by LetsEncrypt, Heroku provides the same functionality and I will never need to baby-sit it.

And finally, because my RSS proxy is so light (accessed a few times a day) it ends up being free. Awesome. Thanks Heroku.

AWS Lambda

I originally tried using Lambda for this, but it required a rewrite, and I depend on some non-trivial infrastructure here. While I would have loved to port my application to Python and have it run for super cheap on AWS Lambda, it just was not a real option without more porting than I am prepared to do right now.

RSS and Git Repos

Tiny Tiny RSS is software that I very much have a love/hate relationship with. Due to the way the community works, I was always a little nervous about using it. After reading a blog post by Filippo Valsorda about Piwik I decided to try out Sandstorm on the Oasis. Sandstorm is a lot like Heroku, but it’s more geared toward hosting open source software for individuals, with a strong emphasis on security.

You know that friend you have who is a teacher and likes to blog about soccer? Do you really want that friend installing WordPress on a server? You do not. If that friend had an Oasis account, they could use the WordPress grain and almost certainly never get hacked.

I decided to try using Oasis to host my RSS reader and so far it has been very nice. I had one other friend using my original RSS instance (it was in multiuser mode) and he seems to have had no issues with using Oasis either. This is great; I now have a frustrating-to-maintain piece of software off of my server, and I’m no longer maintaining it for two people. What a load off!

Oasis also has a grain for hosting a git repo, so I have migrated the storage of the source repo of this blog to the Oasis. That was a fairly painless process, but one thing to realize is that each grain is completely isolated, so when you set up a git repo grain it hosts just the one repo. If you had ten repos, you’d be using ten grains, which is enough that you’d end up paying much more for your git repos.

I’ll probably move my Piwik hosting to the Oasis as well.

Oh also, it’s lightweight enough that it’s free! Thanks Oasis.

Lizard Brain and IRC Client

Lizard Brain is very much a tool that is glued into the guts of a Unix system. One of its core components is atd. As of today, Sandstorm has no scheduler that would allow LB to run there. Similarly, while Heroku does have a scheduler, its granularity is terrible and it’s much more like cron (it’s periodic) than atd (a specific event in time.) Amazon does have scheduled events for Lambda, but unlike Heroku and Sandstorm, that would require a complete rewrite in Python, Java, or JavaScript. I suspect I will rewrite in Python; it’s only about 800 lines, but it would be nice if I didn’t have to.
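The atd/cron distinction above, sketched concretely (the job path is made up, and this needs a live atd to actually run):

```shell
# cron is periodic: this would run /path/to/task every hour, forever.
#   0 * * * * /path/to/task
# atd is a one-shot event at a specific time, which is the primitive
# Lizard Brain actually needs:
echo '/path/to/task' | at now + 2 hours
```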

Another option would be for me to create my own atd, but then I’d have it running in a VM somewhere and if I have a VM running somewhere I have a lot less motivation to move every little service off of my current VM.

A much harder service is IRC. I use my VM as an IRC client so that I will always have logs of conversations that happened when I was away. Over time this has gotten less and less important, but there are still a few people who will reach out to me while I’m still asleep and I’m happy to respond when I’m back. As of today I do not see a good replacement for a full VM just for IRC. I may try to write some kind of thing to put SSH + Weechat in a grain to run on Sandstorm, but it seems like a lot of work.

An alternate option, which I do sortav like, is finding some IRC client that runs in the browser and also has an API, so I can use it from my phone, but also have a terminal interface.

The good news is that my Linode will eventually “expire” and I’ll probably get a T2 Nano EC2 instance, which costs about $2-4/month and is big enough (500 MB of RAM) to host an IRC client. Even on my current Linode I’m using only 750 MB of RAM, and if you exclude MySQL (used for TTRSS; still haven’t uninstalled it) and SyncThing it’s suddenly less than 500 MB. Cool!


SyncThing is cool, but it’s not a critical enough part of my setup to require a VM. I am likely to just stop using it since I’ve gone all the way and gotten a paid account for DropBox.


A lot of the above are specifics that are almost worthless to most of you. There are real reasons to move to a serverless setup, and I think they are reasons that everyone can appreciate.


Software is consistently and constantly shown to be insecure. Engineers work hard to make good software, but it seems almost impossible for sufficiently complex software to be secure. I will admit that all of the services discussed here are also software, but because of their very structure the user is protected from a huge number of attacks.

Here’s a very simple example: on the Oasis, I have a MySQL instance inside of the TTRSS grain. On my Linode the MySQL server could potentially be misconfigured to be listening on a public interface, maybe because some PHP application installer did that. On the Oasis that’s not even possible, due to the routing of the containers.
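For the curious, the guard on a traditional box is a one-line my.cnf setting (the option name is a real MySQL setting, though this snippet is just an illustration, not my actual config):

```ini
# my.cnf sketch: binding only to loopback rules out the
# public-interface misconfiguration described above
[mysqld]
bind-address =
```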

Similarly, on Heroku, if there were some crazy kernel bug that needed to be resolved, because my application is getting spun down all the time, there are plenty of chances to reboot the underlying virtual machines without me even noticing.


Isolation is a combination of a reliability and security feature. When it comes to security it means that if my blog were to get hacked, my TTRSS instance is completely unaffected. Now I have to admit this is a tiny bit of a straw man, because if I set up each of my services as separate users they’d be fairly well isolated. I didn’t do that though because that’s a hassle.

The reliability part of isolation is a lot more considerable though. If I tweaked the Apache site config for TTRSS, ran /etc/init.d/apache restart, and had a syntax error, all of the sites being hosted on my machine would go down till I fixed the issue. While I’ve learned various ways to ensure that does not happen, “be careful” is a really stupid way to ensure reliability.
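One standard guard (a sketch; the exact paths and init commands vary by distro, and this obviously needs a live Apache) is to gate the restart on a config check:

```shell
# apachectl configtest exits non-zero on a syntax error,
# so a broken config never reaches the restart
apachectl configtest && /etc/init.d/apache restart
```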


I make enough money to pay for a $20/mo Linode, but it just seems like a waste of money that could be put to better uses. Without a ton of effort I can cut my total spend in half, and I suspect I could drop it to about 10%. As I’ve mentioned before, my blog is costing less than a dime a month and is rock-solid.


Nothing is perfect though. While I am mostly sold on the serverless phenomenon, there are some issues that I think need solving before it’s an unconditional win.

Storage (RDBMS etc)

This sorta blows my mind. With the exception of Sandstorm, which is meant for a small number of users for a given application, no one really offers a cheap database. Heroku has a free database option that I couldn’t have used with my RSS reader, and the for-pay option would cost about half what I pay for my VM, just for the database.

Similarly AWS offers RDS, but that’s really just the cost of an EC2 VM, so at the cheapest that would be a consistent $2/mo. If you were willing to completely rewrite your application you might be able to get by using DynamoDB, but in my experience using it at work it can be very frustrating to tune for.

I really think that someone needs to come in and do what Lambda did for code or DynamoDB did for KV stores, but for a traditional database. Basically, as it stands today, if you have a database that is idle you pay the same price as you would for a database that is pegging its CPU. I want a traditional database that is billed based on usage.

Billing Models

Speaking of billing a database based on usage, more things need to be billed based on usage! I am a huge fan of most of the billing models on AWS, where you end up paying for what you use. For someone self-hosting for personal consumption this almost always means that whatever you are doing will cost less than any server you could build. I would gladly pay for my Oasis usage, but a jump from free to $9 is just enough for me to change my behaviour and spend that money elsewhere.

If someone who works on Sandstorm is reading this and cares: I would gladly pay hourly per grain.

I have not yet used enough of Heroku to need the for-pay option there, but it looks like I could probably use it fairly cheaply.


Of course there will be some people who read this who think that running on anything but your own server is foolish. I wonder if those people run directly on the metal, or just assume that all of the Xen security bugs have been found. I wonder if those people regularly update their software for security patches and know to restart all of the various components that need to be restarted. I wonder if those people value their own time and money.

Hopefully before July I will only be using my server for IRC and Lizard Brain. There’s no rush to migrate since my Linode has almost 10 months before a rebill cycle. I do expect to test how well a T2 Nano works for my goals in the meantime though, so that I can easily pull the trigger when the time comes.

Posted Wed, Jun 1, 2016

Iterating over Chunks of a Diff in Vim

Every now and then at work I’ll make broad, sweeping changes in the codebase. The one I did recently was replacing all instances of print STDERR "foo\n" with warn "foo\n". There were about 160 instances in all. After discussing it more with my boss, we decided that instead of blindly replacing all those print statements with warns (which, for those who don’t know, are easier to intercept and log) we should just log to the right log level.

Enter Quickfix

Quickfix sounds like some kind of bad guy from a slasher movie to me, but it’s actually a super handy feature in Vim. Here’s what the manual says:

Vim has a special mode to speedup the edit-compile-edit cycle. This is inspired by the quickfix option of the Manx’s Aztec C compiler on the Amiga. The idea is to save the error messages from the compiler in a file and use Vim to jump to the errors one by one. You can examine each problem and fix it, without having to remember all the error messages.

More concretely, the quickfix commands end up giving the user a list of locations. I tend to use the quickfix list most commonly with Fugitive. You can run the command :Ggrep foo and the quickfix list will contain all of the lines that git found containing foo. Then, to iterate over those locations you can use :cnext, :cprev, :cwindow, and many others, to interact with the list.
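Put together, a typical session (assuming Fugitive is installed) looks something like this:

```vim
:Ggrep foo   " populate the quickfix list with every line containing foo
:cwindow     " open the list in a split if there were any matches
:cnext       " jump to the next location in the list
:cprev       " jump back to the previous one
```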

I have wanted a way to populate the quickfix list with the locations of all of the chunks that are in the current modified files for a long time, and this week I decided to finally do it.

First off, I wrote a little tool to parse diffs and output locations:

#!/usr/bin/env perl

use strict;
use warnings;

my $filename;
my $line;
my $offset = 0;
my $printed = 0;
while (<STDIN>) {
   if (m(^\+\+\+ b/(.*)$)) {
      $printed = 0;
      $filename = $1;
   } elsif (m(^@@ -\d+(?:,\d+)? \+(\d+))) {
      $line = $1;
      $offset = 0;
      $printed = 0;
   } elsif (m(^\+(.*)$)) {
      my $data = $1 || '-';
      print "$filename:" . ($offset + $line) . ":$data\n"
         unless $printed;
      $printed = 1;
      $offset++;
   } elsif (m(^ )) {
      $printed = 0;
      $offset++;
   }
}

The general usage is something like: git diff | diff-hunk-list, and the output will be something like:

app/lib/ZR/Plack/Middleware/  local $SIG{USR1} = sub {
bin/zr-plack-reaper:29:sub timeout { 120 }

The end result is a location for each new set of lines in a given diff. That means that deleted lines will not be included with this tool; supporting them would take another tool, or more options for this one.

Then, I added the following to my vimrc:

command Gdiffs cexpr system('git diff \| diff-hunk-list')

So now I can simply run :Gdiffs and iterate over all of my changes, possibly tweaking them along the way!

Super Secret Bonus Content

The Quickfix is great, but there are a couple other things that I think really round out the functionality.

First: the quickfix list is global per session, so if you run :Gdiffs and then :Ggrep to refer to some other code, you’ve blown away the original quickfix list. There’s another list called the location list, which is scoped to a window. It’s also very useful, and tends to use commands that start with l instead of c.
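For what it’s worth, a location-list twin of :Gdiffs is a one-line change (this Ldiffs variant is a suggestion, not something in my vimrc):

```vim
" Same diff-hunk trick, but fills the window-local location list
" via lexpr instead of the global quickfix list.
command Ldiffs lexpr system('git diff \| diff-hunk-list')
```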

Second: there is another Tim Pope plugin called unimpaired which adds a ton of useful mappings, including [q and ]q to go back and forth in the quickfix list, and [l and ]l to go back and forth in the location list. The plugin does way more than just those two things, but those are what I use it for the most.

Posted Wed, May 25, 2016

OSCON 2016

ZipRecruiter, where I work, generously pays for each engineer to go at least one conference a year. I have gone to YAPC every year since 2009 and would not skip it, except my wife is pregnant with our second child and will be due much too close to this year’s YAPC (or should I say instead: The Perl Conference?) for me to go.

There were a lot of conferences that I wanted to check out; PyCon, Monitorama, etc etc, but OSCON was the only one that I could seem to make work out with my schedule. I can only really compare OSCON to YAPC and to a lesser extent SCALE and the one time I went to the ExtJS conference (before it was called Sencha,) so my comparisons may be a little weird.

Something Corporate

OSCON is a super corporate conference, despite the fact that its name includes Open Source. For the most part this is fine; it means that there is a huge amount of swag (more on that later,) lots of networking to be done, and many free meals. On the other hand OSCON is crazy expensive; I would argue not worth the price. I got the lowest tier, since my wife didn’t want me to be gone for the full four days (and probably six including travel,) and it cost me a whopping twelve hundred dollars. Of course ZipRecruiter reimbursed me, but for comparison: YAPC normally costs $200 max.

On top of that there were what are called “sponsored talks.” I was unfamiliar with this concept but the basic idea is that a company can pay a lot of money and be guaranteed a slot, which is probably a keynote, to sortav shill their wares. I wouldn’t mind this if it weren’t for the fact that these talks, as far as I could tell, were universally bad. The one that stands out the most was from IBM, with this awesome line (paraphrased:)

Oh if you don’t use you’re not really an engineer. Maybe go back and read some more Knuth.


At YAPC you tend to get 1-3 shirts, some round tuits, and maybe some stickers. At OSCON I avoided shirts and ended up with six; I got a pair of socks, a metal bottle, a billion pretty awesome stickers, a coloring book, three stress toys, and a THOUSAND DOLLAR SKATEBOARD. To clarify, not everyone got the skateboard; the deal was that you had to get a Heroku account (get socks!) run a node app on your laptop (get shirt!) and then push it up to Heroku (get entered into drawing!) Most people gave up at step two because they had tablets or something, but I did it between talks because that all was super easy on my laptop. I actually was third in line after the drawing, but first and second never showed. Awesome!

The Hallway Track

For me the best part of any conference is what is lovingly called “the hallway track.” The idea is that the hallway, where socializing and networking happen, is equally important to all the other tracks (like DevOps, containers, or whatever.) I really enjoy YAPC’s hallway track, though a non-trivial reason is that I already have many friends in the Perl and (surprisingly distinct) YAPC world. On top of that YAPC tends to be in places that are very walkable, so it’s easy to go to a nice restaurant or bar with new friends.

I was pleasantly surprised by the OSCON hallway track. It was not as good as YAPC’s, but it was still pretty awesome. Here are my anecdotes:

Day 1 (Wed)

At lunch I hung out with Paul Fenwick and a few other people, which was pretty good. Chatting with Paul is always great and of course we ended up talking about ExoBrain and my silly little pseudoclone: Lizard Brain.

At dinner I decided to take a note from Fitz Elliot’s book, who once approached me after I did a talk and hung out with me a lot during the conference. I had a lot of good conversations with Fitz and I figured that maybe I could be half as cool as him and do the same thing. The last talk I went to was about machine learning and the speaker, Andy Kitchen, swerved into philosophy a few times, so I figured we’d have a good time and get along if I didn’t freak him out too much by asking if we could hang out. I was right, we (him, his partner Laura Summers, a couple other guys, and I) ended up going to a restaurant and just having a generally good time. It was pretty great.

Day 2 (Thu)

At lunch on Thursday I decided to sit at the Perl table and see who showed up. Randal Schwartz, who I often work with, was there, which was fun. A few other people were there. Todd Rinaldo springs to mind. I’ve spoken to him before, but this time we found an important common ground in trying to reduce memory footprints. I hope to collaborate with him to an extent and publish our results.

Dinner was pretty awesome. I considered doing the same thing I did on Wednesday, but I thought it’d be hugely weird to ask the girl who did the last talk I saw if she wanted to get dinner. That means something else, usually. So I went to the main area where people were sorta congregating and went to greet some of the Perl people that I recognized (Liz, Wendy, David H. Adler.) They were going to Max’s Wine Bar and ended up inviting me, and another girl whose name I sadly cannot remember. Larry Wall (who invented Perl,) his wife Gloria, and one of his sons joined us, which was pretty fun. At the end of dinner (after I shared an amazing pair of desserts with Gloria) Larry and Wendy fought over who would pay the bill, and Larry won. This is always pretty humbling and fun. The punchline was that the girl who came with us didn’t know who Larry was, because she was mostly acquainted with Ruby. When Wendy told her, there were many pictures taken. It was great.

Day 3 (Fri)

Most of Friday I tried to chill and recuperate. I basically slept, packed, went downtown to get lunch and coffee, and then waited for a cab to the airport. Then when I got to the airport I was noticed by another OSCON attendee (Julie Gunderson) because I was carrying the giant branded skateboard. She was hanging out with AJ Bowen and Jérôme Petazzoni, and they were cool with me tagging along with them to get a meal before we boarded the plane. It’s pretty cool that we were able to have a brief last hurrah after the conference was completely over.


One thing that I was pretty disappointed in was the general reaction when I mentioned that I use Perl. I have plenty of friends in Texas who think poorly of Perl, but I had assumed that was because they mostly worked on closed source software. The fact that a conference that was originally called The Perl Conference would end up encouraging such an anti-Perl attitude is very disheartening.

Don’t get me wrong, Perl is not perfect, but linguistic rivalries only alienate people. I would much rather you tell me some exciting thing you did with Ruby than say “ugh, why would someone build a startup on Perl?” I have a post in the queue about this, so I won’t say a lot more about this. If you happen to read this and are a hater, maybe don’t be a hater.

Overall the conference was a success for me. If I had to choose between a large conference like OSCON and a small conference like YAPC, I’d choose the latter. At some point I’d like to try out the crazy middle ground of something like DefCon where it’s grass roots but not corporate. Maybe in a few years!

Posted Fri, May 20, 2016