AWS Accounts as Cattle
In managing AWS, the best practice is to use AWS Organizations and separate concerns into multiple accounts.
For example, I generally organize login with the hub and spoke model (see: AWS Login Code As Infrastructure CloudFormation), where I restrict login to one central account that only allows role assumption into the other accounts. The other accounts are divided into functional areas, like dev/qa/prod, or even specific service needs. This reduces the blast radius and limits what is in each account.
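As a concrete sketch of that role-assumption setup (the profile names, account ID, and role name here are hypothetical; OrganizationAccountAccessRole is the default role AWS Organizations creates in member accounts):

```ini
# ~/.aws/config -- all names and IDs below are placeholders
[profile foo-login]
region = us-west-1

[profile foo-service-prod]
# assume a role in the spoke account, authenticated via the central login account
role_arn = arn:aws:iam::111111111111:role/OrganizationAccountAccessRole
source_profile = foo-login
region = us-west-1
```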
Accounts are Pets
It recently occurred to me that my accounts need to be treated more fully like cattle; generally we try to avoid treating infrastructure as our pet (see: https://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/). I’ve been building automation so that, if an account ever gets compromised, I have a way to rebuild everything in it.
Technical Debt Test Case
In this case, I realized I had some tech debt: my Route53 zone is in my billing master account, not in the specific account it belongs in (meaning the billing master is doing work that should have been split out into a sub-account in the organization).
To fix this, I needed to migrate everything out of the master account and into the appropriate account for the workload. So the plan is this:
- Migrate the base zone to a new network account (alias foo-network). This zone will only hold the SOA definitions and delegate the other sub-zones. (A quick delegation check follows this list.)
- Migrate the delegated sub-zones to the appropriate account (service.prod.foo.com to foo-service-prod and service.dev.foo.com to foo-service-dev, for example).
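Once that delegation is in place, something like dig can confirm the parent zone hands off the sub-zone (the zone name is this post’s placeholder):

```bash
# Should return the NS records the parent zone delegates to
dig +short NS service.prod.foo.com
```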
Repeatable resources with CloudFormation
I already had the CloudFormation template for the base zones, so that part was easy: just recreate the zone from the foo-master account in the more appropriate account of foo-network (after creating the new account via Organizations). Eventually I’ll get to where this account is locked down to just managing Route53, with everything else denied by SCP, but that’s for later.
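For reference, the zone portion of such a template can be tiny; a minimal sketch, with the zone name as a placeholder:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Base hosted zone for the foo-network account
Resources:
  BaseZone:
    Type: AWS::Route53::HostedZone
    Properties:
      Name: foo.com
Outputs:
  ZoneId:
    Description: Hosted zone ID, needed for the record migration below
    Value: !Ref BaseZone
```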
AWS process for migrating
Once the zones are created in the new account, I needed a way to get the records out of the foo-master account, which meant following the steps here: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zones-migrating.html
Making it less manual with scripting
It occurred to me that with a bit of jq magic, I should be able to do this more or less in one or two commands. First I ran aws route53 list-resource-record-sets to get the records from the original zone:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | jq -r '.'
The <zoneid> is found by describing the zone in AWS (for now I just grabbed the value from the console), and of course the profile just maps to the right account so that we pull from the right place.
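If you’d rather stay in the CLI, the zone ID can be looked up by name; note the returned Id carries a /hostedzone/ prefix that needs stripping (the zone name below is the example one):

```bash
# Returns something like /hostedzone/Z123EXAMPLE; keep only the trailing ID
aws route53 list-hosted-zones-by-name --profile foo-master \
  --dns-name service.prod.foo.com --query 'HostedZones[0].Id' --output text
```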
The output from list-resource-record-sets gives us a list of everything in that zone, so we need to strip out the SOA and NS records as the article describes. Easy enough with a couple of selects, so we update the above to something like:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | jq -r '.ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS"))'
That gets us the list without those records, but the article also talks about not including the zoneId, so I chose to just add a sed command to rewrite the zone ID before output, like this:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | sed -r 's/<zoneid>/<newzoneid>/g' | jq -r '.ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS"))'
So now I have the raw data for import; next is figuring out how to reformat it in jq, which turns out to be easier than I thought. Basically, you can put literal text around the expressions you’ve built, and as long as it’s in the right place, jq will spit out the JSON that produces. The command above creates a JSON list of the records in the zone, so we need to turn those into CREATE (or UPSERT) statements that the update uses for the import. To do that we just add some things inside the map statement, like this:
map(select(.Type != "SOA") | select(.Type != "NS") | {"Action": "UPSERT", "ResourceRecordSet": .})
So for each record in our zone, we reformat the JSON so the recordset becomes the value of ResourceRecordSet, with an Action of UPSERT, giving JSON like:
[ { "Action": "UPSERT", "ResourceRecordSet": { "Name": "app.service.foo.prod.com.", "Type": "CNAME", "TTL": 900, "ResourceRecords": [ { "Value": "xxx.us-west-1.elb.amazonaws.com" } ] } }, { "Action": "UPSERT", "ResourceRecordSet": { "Name": "glassfish.service.foo.prod.com.", "Type": "CNAME", "TTL": 900, "ResourceRecords": [ { "Value": "xxxx.us-west-1.elb.amazonaws.com" } ] } }, ]
Now we need to wrap that array with the rest of the update information, so we add a Comment and make the array the value of Changes, with this update to the command:
jq -r '{ "Comment": "Migrating service.prod.foo.com zone", "Changes": .ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS")| {"Action": "CREATE", "ResourceRecordSet": .})}'
Breaking it down
- The first curly brace says the output will be one JSON object.
- Everything from "Comment": up to .ResourceRecordSets gets output literally.
- .ResourceRecordSets is piped to the map, so each record gets processed by everything inside that map.
- The first select removes the SOA record.
- The output from that is piped to another select, which removes the NS records.
- That is piped to the formatting for the record, which adds the Action and names the item ResourceRecordSet, all wrapped in curly braces to make a new JSON object.
- A closing curly brace closes out the object opened by the first one. (The same filter, with inline jq comments, follows this list.)
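For readability, here is that filter spread across lines with jq comments (jq treats # to end-of-line as a comment, so this runs exactly like the one-liner):

```bash
jq -r '{
  "Comment": "Migrating service.prod.foo.com zone",
  "Changes": .ResourceRecordSets       # the record array from the CLI output
    | map(
        select(.Type != "SOA")         # drop the SOA record
        | select(.Type != "NS")        # drop the NS records
        | {"Action": "CREATE", "ResourceRecordSet": .}  # wrap each record as a change
      )
}'
```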
Export command
That yields the proper JSON for input to our command, so we save it to a file with this:
aws route53 list-resource-record-sets --profile foo-master --hosted-zone-id <zoneid> | sed -r 's/<zoneid>/<newzoneid>/g' | jq -r '{ "Comment": "Migrating service.prod.foo.com zone", "Changes": .ResourceRecordSets | map(select(.Type != "SOA") | select(.Type != "NS")| {"Action": "CREATE", "ResourceRecordSet": .})}' > /tmp/foo-master-service-prod.json
Import command
Then you can migrate to the new zone with the following (after getting the zone ID from the new account):
aws route53 change-resource-record-sets --hosted-zone-id <newzoneid> --profile foo-service-prod --change-batch file:///tmp/foo-master-service-prod.json
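change-resource-record-sets prints a ChangeInfo block with an Id; if you want to confirm the records have propagated to all the Route53 name servers, you can poll it until the status flips from PENDING to INSYNC:

```bash
# <changeid> is the ChangeInfo.Id value from the previous command
aws route53 get-change --profile foo-service-prod --id <changeid>
```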
Caveats
- Alias records won’t work as exported: they point back into the zone you exported from, and the sed tromped on the zone ID inside them. You may need to delete those, or add more jq magic to change them into CNAME records (a sketch follows this list).
- Alias records have to be created in the right order, so if you have an alias pointing at another alias, you have to create the one it points to first.
- You’ll still need to repoint the SOA to the new account for this to be migrated. Because of the way DNS caching works, you should leave the old zone around for at least a couple of hours to make sure the new SOA and NS records are working.
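For the first caveat, a hedged sketch of the extra jq: rewrite any record carrying an AliasTarget into a plain CNAME aimed at the alias target. The 300-second TTL is an arbitrary choice, and this won’t work for records at the zone apex, where CNAMEs aren’t allowed:

```bash
# Drops into the existing map(...) pipeline after the selects
map(if .AliasTarget
    then {Name, Type: "CNAME", TTL: 300,
          ResourceRecords: [{Value: .AliasTarget.DNSName}]}
    else . end)
```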
Conclusion
While this isn’t super simple, it does allow us to migrate a zone from one account to another.
The commands can also be wrapped into a CI/CD pipeline to back up your DNS and keep a clone up to date for recovery from a compromised account.
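As a sketch of that backup step (the profile and file naming here are assumptions, not part of the migration above):

```bash
#!/usr/bin/env bash
# Dump every zone in the account to its own JSON file
set -euo pipefail
PROFILE=foo-master
for id in $(aws route53 list-hosted-zones --profile "$PROFILE" \
              --query 'HostedZones[].Id' --output text); do
  id=${id##*/}   # Ids come back as /hostedzone/XXXX; keep just the zone ID
  aws route53 list-resource-record-sets --profile "$PROFILE" \
    --hosted-zone-id "$id" > "dns-backup-${id}.json"
done
```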
In my scenario, the number of records in the zone is pretty small; I think large zones run into issues that might require other strategies here, but all in all this is a good step toward treating all of my infrastructure as code.