AWS Accounts as Cattle
In managing AWS, the best practice is to use AWS Organizations and separate concerns into multiple accounts.
For example, I generally organize login with the hub and spoke model (see: AWS Login Code As Infrastructure CloudFormation), where I restrict login to one central account that only allows role assumption into the other accounts. The other accounts are divided into functional areas, like dev/qa/prod, or even specific service needs. This reduces the blast radius and limits what is in each account.
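As a concrete sketch of that role-assumption setup (the profile names, account ID, and role name here are hypothetical; OrganizationAccountAccessRole is the default role AWS Organizations creates in member accounts):

```ini
# ~/.aws/config -- all names and IDs below are placeholders
[profile foo-login]
region = us-west-1

[profile foo-service-prod]
# assume a role in the spoke account, authenticated via the central login account
role_arn = arn:aws:iam::111111111111:role/OrganizationAccountAccessRole
source_profile = foo-login
region = us-west-1
```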
Accounts are Pets
It recently occurred to me that my accounts need to be treated more fully like cattle; generally we try to avoid treating infrastructure as our pet (see: https://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/). I’ve been building automation so that, if an account ever gets compromised, I have a way to rebuild everything in it.
Technical Debt Test Case
In this case, I realized I had some tech debt: my Route53 zone is in my billing master account, not in the specific account it belongs in (meaning the billing master is doing work that should have been split out into a sub-account in the organization).
To fix this, I needed to migrate everything out of the master account and into the appropriate account for the workload. So the plan is this:
- Migrate the base zone to a new network account (alias foo-network). This zone will only hold the SOA definitions and delegate the other sub-zones. (A quick delegation check follows this list.)
- Migrate the delegated sub-zones to the appropriate account (service.prod.foo.com to foo-service-prod and service.dev.foo.com to foo-service-dev, for example).
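Once that delegation is in place, something like dig can confirm the parent zone hands off the sub-zone (the zone name is this post’s placeholder):

```bash
# Should return the NS records the parent zone delegates to
dig +short NS service.prod.foo.com
```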
Repeatable resources with CloudFormation
I already had the CloudFormation template for the base zones, so that part was easy: just recreate the zone from the foo-master account in the more appropriate account of foo-network (after creating the new account via Organizations). Eventually I’ll get to where this account is locked down to just managing Route53, with everything else denied by SCP, but that’s for later.
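For reference, the zone portion of such a template can be tiny; a minimal sketch, with the zone name as a placeholder:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Base hosted zone for the foo-network account
Resources:
  BaseZone:
    Type: AWS::Route53::HostedZone
    Properties:
      Name: foo.com
Outputs:
  ZoneId:
    Description: Hosted zone ID, needed for the record migration below
    Value: !Ref BaseZone
```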
AWS process for migrating
Once the zones are created in the new account, I needed a way to get the records out of the foo-master account, which meant following the steps here: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zones-migrating.html
Making it less manual with scripting
It occurred to me that with a bit of jq magic, I should be able to do this more or less in one or two commands. First I ran aws route53 list-resource-record-sets to get the records from the original zone:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | jq -r '.'
The <zoneid> is found by describing the zone in AWS (for now I just grabbed the value from the console), and of course the profile just maps to the right account so that we pull from the right place.
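If you’d rather stay in the CLI, the zone ID can be looked up by name; note the returned Id carries a /hostedzone/ prefix that needs stripping (the zone name below is the example one):

```bash
# Returns something like /hostedzone/Z123EXAMPLE; keep only the trailing ID
aws route53 list-hosted-zones-by-name --profile foo-master \
  --dns-name service.prod.foo.com --query 'HostedZones[0].Id' --output text
```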
The output from list-resource-record-sets gives us a list of everything in that zone, so we need to strip out the SOA and NS records as the article describes. Easy enough with a couple of selects, so we update the above to something like:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | jq -r '.ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS"))'
That gets us the list without those records, but the article also talks about not including the zoneId, so I chose to just add a sed command to rewrite the zone ID before output, like this:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | sed -r 's/<zoneid>/<newzoneid>/g' | jq -r '.ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS"))'
So now I have the raw data for import; next is figuring out how to reformat it in jq, which turns out to be easier than I thought. Basically, you can put literal text around the expressions you’ve built, and as long as it’s in the right place, jq will spit out the JSON that produces. The command above creates a JSON list of the records in the zone, so we need to turn those into CREATE (or UPSERT) statements that the update uses for the import. To do that we just add some things inside the map statement, like this:
map(select(.Type != "SOA") | select(.Type != "NS") | {"Action": "UPSERT", "ResourceRecordSet": .})
So for each record in our zone, we reformat the JSON so the recordset becomes the value of ResourceRecordSet, with an Action of UPSERT, giving JSON like:
[ { "Action": "UPSERT", "ResourceRecordSet": { "Name": "app.service.foo.prod.com.", "Type": "CNAME", "TTL": 900, "ResourceRecords": [ { "Value": "xxx.us-west-1.elb.amazonaws.com" } ] } }, { "Action": "UPSERT", "ResourceRecordSet": { "Name": "glassfish.service.foo.prod.com.", "Type": "CNAME", "TTL": 900, "ResourceRecords": [ { "Value": "xxxx.us-west-1.elb.amazonaws.com" } ] } }, ]
Now we need to wrap that array with the rest of the update information, so we add a Comment and make the array the value of Changes, with this update to the command:
jq -r '{ "Comment": "Migrating service.prod.foo.com zone", "Changes": .ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS")| {"Action": "CREATE", "ResourceRecordSet": .})}'
Breaking it down
- The first curly brace says the output will be one JSON object.
- Everything from "Comment": up to .ResourceRecordSets gets output literally.
- .ResourceRecordSets is piped to the map, so each record gets processed by everything inside that map.
- The first select removes the SOA record.
- The output from that is piped to another select, which removes the NS records.
- That is piped to the formatting for the record, which adds the Action and names the item ResourceRecordSet, all wrapped in curly braces to make a new JSON object.
- A closing curly brace closes out the object opened by the first one. (The same filter, with inline jq comments, follows this list.)
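For readability, here is that filter spread across lines with jq comments (jq treats # to end-of-line as a comment, so this runs exactly like the one-liner):

```bash
jq -r '{
  "Comment": "Migrating service.prod.foo.com zone",
  "Changes": .ResourceRecordSets       # the record array from the CLI output
    | map(
        select(.Type != "SOA")         # drop the SOA record
        | select(.Type != "NS")        # drop the NS records
        | {"Action": "CREATE", "ResourceRecordSet": .}  # wrap each record as a change
      )
}'
```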
Export command
That yields the proper JSON for input to our command, so we save it to a file with this:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | sed -r 's/<zoneid>/<newzoneid>/g' | jq -r '{ "Comment": "Migrating service.prod.foo.com zone", "Changes": .ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS")| {"Action": "CREATE", "ResourceRecordSet": .})}' > /tmp/foo-master-service-prod.json
Import command
Then you can migrate to the new zone with the following (after getting the zone ID from the new account):
aws route53 change-resource-record-sets --hosted-zone-id <newzoneid> --profile foo-service-prod --change-batch file:///tmp/foo-master-service-prod.json
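change-resource-record-sets prints a ChangeInfo block with an Id; if you want to confirm the records have propagated to all the Route53 name servers, you can poll it until the status flips from PENDING to INSYNC:

```bash
# <changeid> is the ChangeInfo.Id value from the previous command
aws route53 get-change --profile foo-service-prod --id <changeid>
```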
Caveats
- Alias records won’t work as exported: they point back into the zone you exported from, and the sed tromped on the zone ID inside them. You may need to delete those, or add more jq magic to change them into CNAME records (a sketch follows this list).
- Alias records have to be created in the right order, so if you have an alias pointing at another alias, you have to create the one it points to first.
- You’ll still need to repoint the SOA to the new account for this to be migrated. Because of the way DNS caching works, you should leave the old zone around for at least a couple of hours to make sure the new SOA and NS records are working.
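For the first caveat, a hedged sketch of the extra jq: rewrite any record carrying an AliasTarget into a plain CNAME aimed at the alias target. The 300-second TTL is an arbitrary choice, and this won’t work for records at the zone apex, where CNAMEs aren’t allowed:

```bash
# Drops into the existing map(...) pipeline after the selects
map(if .AliasTarget
    then {Name, Type: "CNAME", TTL: 300,
          ResourceRecords: [{Value: .AliasTarget.DNSName}]}
    else . end)
```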
Conclusion
While this isn’t super simple, it does allow us to migrate a zone from one account to another.
The commands can also be wrapped into a CI/CD pipeline to back up your DNS and keep a clone up to date for recovery from a compromised account.
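As a sketch of that backup step (the profile and file naming here are assumptions, not part of the migration above):

```bash
#!/usr/bin/env bash
# Dump every zone in the account to its own JSON file
set -euo pipefail
PROFILE=foo-master
for id in $(aws route53 list-hosted-zones --profile "$PROFILE" \
              --query 'HostedZones[].Id' --output text); do
  id=${id##*/}   # Ids come back as /hostedzone/XXXX; keep just the zone ID
  aws route53 list-resource-record-sets --profile "$PROFILE" \
    --hosted-zone-id "$id" > "dns-backup-${id}.json"
done
```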
In my scenario, the number of records in the zone is pretty small; I think large zones run into issues that might require other strategies here, but all in all this is a good step toward treating all of my infrastructure as code.