Friday, August 04, 2017

Docker


Installation

https://docs.docker.com/engine/getstarted/step_one/

Examples

Run nginx container
docker run -d -p 80:80 --name webserver nginx
You should be able to access it by going to http://localhost
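A quick way to verify from the shell (assuming curl is available):
curl -I http://localhost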
List all docker images
docker images
List running containers
docker ps
Copy file to a docker container
docker cp html.tar 213389b682d6:/tmp/
Connect to a running container
docker exec -it 213389b682d6 /bin/bash
Save the container's changes as a new image
docker commit -m "Adding ops runbook" -a "Igor" 213389b682d6 runbook
Run a container from your new image
docker run -d -p 80:80 -it runbook /bin/bash
Login to the container shell
docker exec -it 7e7d9c35a216 bash
Detach from the container shell (leaves the container running)
Ctrl + p , Ctrl + q
Restart nginx inside your container
docker exec -it 213389b682d6 /etc/init.d/nginx restart
After docker service restart - make sure ecs-agent is running
root@ecs00-us-west-2b:~# docker start ecs-agent
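To confirm it is up, docker ps can filter by name:
docker ps --filter "name=ecs-agent"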

Logging

It seems like docker puts logs in two places:
  1.  Actual container logs
    /var/lib/docker/containers
  2. docker system log
    /var/log/docker.log
The following options should take care of log rotation:
/etc/default/docker
DOCKER_OPTS="--pidfile=/var/run/docker.pid --log-level=warning --log-opt max-size=1g --storage-driver=zfs --storage-opt=zfs.fsname=zd0/containers"
log-level makes the docker.log file less verbose; log-opt max-size keeps each container log from growing past 1GB.
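Docker must be restarted for new DOCKER_OPTS to take effect. A minimal sketch, assuming an Ubuntu-style init (the service name and init system may differ on your distro):
sudo service docker restart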

Terraform notes

This section serves as a guideline for terraform usage to make life easier. It is built from hard experience using terraform in a multi-member team and across multiple environments in production.

Modules

Modules are the reusable construct in terraform. In most cases, we want to make a module out of ANY provisioning process that we expect to reuse.
In addition to reuse, modules provide a repeatable structure and versioning mechanism.
For instance, let’s look at our VPC module. It does the following:
  • Creates a VPC
  • Creates two subnets in that VPC in each AZ - one public/one private
  • Creates a private Route53 zone for that VPC
  • Creates basic security groups for that VPC
  • Creates dhcp options for nodes provisioned in that VPC
  • Provisions NAT nodes in each AZ for outbound traffic from the private subnets
This is a codified and repeatable mechanism for our base AWS environment and best practices for that environment. By using this module, you get all of the basics required to create an isolated environment in AWS for free.
Additionally, since this module lives in its own git repo, you can get VERSIONED environments. Consider the following module declaration in terraform code:
module "vpc" {
source = "git::ssh://git@opsgit.i.stormpath.com/tf-modules/vpc?ref=894abd9"
By specifically pinning the ref, you can be sure that FUTURE changes to the module will not unintentionally propagate down to existing environments. You can easily test changes by spinning up the old version, bumping the rev, calling terraform get --update=true and then terraform plan.
While terraform TRIES to do the right thing here, being strict in your versions lets you know for sure that even if you unintentionally issued --update=true, you would not accidentally tear down a stack.
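The test loop from the paragraph above, as a sketch (run from the environment's terraform directory):
terraform get --update=true
terraform plan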

Note
Always use versioned module sources in terraform. Always. This is required for determinism and repeatability.

Use remote state

Terraform provides another mechanism for a form of reusability - remote state.
Looking back at our pushbutton setup, we leverage remote state to help isolate independent infrastructure components. There is no reason creating a new instance needs to modify a VPC. There is no reason creating an RDS instance needs to modify dhcp options.
You _can_ use non-remote state files, but terraform does not currently allow those to be reused elsewhere. We use artifactory as our universal remote state mechanism because it provides built-in versioning for us.
In our pushbutton runs, EACH subdirectory pushes its state up to artifactory at the end of the run. That makes it usable for downstream modules. The following is an example of using the VPC remote state for an environment downstream:
resource "terraform_remote_state" "vpc" {
        backend = "artifactory"
        config {
                url = "https://artifactory.i.stormpath.com/artifactory"
                repo = "terraform-state"
                subpath = "customers/${var.orgname}/vpc"
        }
}
This allows us to reference outputs from that remote state:
module "frontend_subnet_c" {
        source = "git::ssh://git@opsgit.i.stormpath.com/tf-modules/frontend_node?ref=60e6ecd"
        orgname = "${var.orgname}"
        gateway_host = "${terraform_remote_state.vpc.output.nat_c_public_ip}"
        .......
}
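For that reference to resolve, the upstream VPC run has to export the value as an output. A minimal sketch - the resource name aws_instance.nat_c is an assumption for illustration:
output "nat_c_public_ip" { value = "${aws_instance.nat_c.public_ip}" }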

Note
In terraform 0.7 (unreleased), the syntax for using values from remote state is changing slightly. Instead of terraform_remote_state.vpc.output.some_output, the extra output is being dropped: terraform_remote_state.vpc.some_output
Additionally, 0.7 is adding a new construct called data sources, which are immutable, read-only sources of data that can be used in terraform runs - i.e. a json file with a list of current amis or versions of software.
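For example, the gateway_host reference above would become, under the 0.7 syntax described in this note:
gateway_host = "${terraform_remote_state.vpc.nat_c_public_ip}"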

Passing variables and outputs

One area that is painful in terraform is passing variables into and out of modules. This leads to a bit of code duplication; however, the tradeoffs for safety are worth it.
Use consistent variable input names across modules to help minimize confusion. A good example is our orgname variable. This is used in EVERY module. It provides a programmatic way to build tags and names for AWS resources, and it ensures a factor of uniqueness during provisioning. When you have multiple “stacks” in a single AWS account, it’s very easy to see _WHICH_ environment a resource goes with, as all names are prefixed with the orgname (i.e. nat-a.orgname.local).
Also, modules do not natively export all outputs from a given resource. For this reason, and for future unknowns, you should export ALL outputs from a resource inside a module. Consider the following resource and output in a module:
resource "aws_vpc" "default" {
        cidr_block = "${var.vpc_cidr}"
        enable_dns_hostnames = true
        enable_dns_support = true
        tags { Name = "${var.orgname}-secure"}
}

output "vpc_id" { value = "${aws_vpc.default.id}" }
In this module, we’re ONLY exporting the vpc id. What we should be doing is exporting everything that terraform returns as an output, in case we need that information in future reuses of the module (originally this resource did not support the additional outputs; they were added to terraform later, so this code only exports the older outputs):
output "vpc_cidr_block" { value = "${aws_vpc.default.cidr_block}" }
output "vpc_main_route_table_id" { value = "${aws_vpc.default.main_route_table_id}" }
output "vpc_default_network_acl_id" { value = "${aws_vpc.default.default_network_acl_id}" }
output "vpc_default_security_group_id" { value = "${aws_vpc.default.default_security_group_id}" }
You can find the exported attributes of a given resource at the bottom of any resource’s page on the terraform website.
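Downstream, a module’s outputs are consumed through the module namespace. A sketch, assuming the module instance is named "vpc" and the value feeds an aws_instance:
vpc_security_group_ids = ["${module.vpc.vpc_default_security_group_id}"]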

Note
Doing a terraform apply will update any remote state with new outputs from the modules used, assuming a new version of the module was pulled in. Let’s say we needed to get those new attributes for a new terraform use elsewhere. Following our best practices, we would:
  • Make changes to the git repo of the module and note the new hash
  • Update the rev in our downstream use of the module
  • call terraform get --update=true to pull in the new version of the module
  • run terraform plan to ensure that we aren’t actually CHANGING anything (or maybe your module allows that - it’s situational)
  • run terraform apply which will grab the new exported outputs from the module and shove them into the current terraform run’s state.
Now we have a remote state file with additional information that we require when using remote state.

Be careful with count

Using the count trick is really handy in terraform when you need multiple copies of something. Be aware, however, that counts “break” the dependency graph. For instance, let’s assume you have an aws_instance resource and use a count to provision 3 of them. Now let’s assume that you ALSO use that count inside a dependent resource (like a broker id in kafka) that depends on that instance resource.
When you increase the count, the planner recomputes EVERYTHING using that count variable.
If you must use a count, you should ensure that the count is not used in ANY other resources unless you can safely update those resources as well - this includes passing the same _count_ to a template that you use in the instance resource.
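A hypothetical sketch of the coupling to watch for - every name here (broker_count, kafka_ami) is made up for illustration:
variable "broker_count" { default = 3 }

resource "aws_instance" "kafka" {
        count = "${var.broker_count}"
        # var.kafka_ami and var.orgname are assumed inputs, not from the original example
        ami = "${var.kafka_ami}"
        instance_type = "m4.large"
        tags { Name = "kafka-${count.index}.${var.orgname}.local" }
}
Bumping broker_count forces the planner to recompute every resource that interpolates it, not just the newly added instance.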

Scope creep

Terraform brings a much needed true infrastructure-as-code approach to things. It is important, however, to not try and do too much in terraform. This is the reason we wrap terraform in Rundeck. Rundeck allows us to wrap terraform with steps that don’t FIT in terraform to create a final deliverable.
Terraform is a really poor fit for application deployment and volatile lifecycle components.
My general rule: if something feels hard in terraform, it probably is.

Recreating resource

taint marks a resource as tainted; it will be destroyed and re-created on the next apply
terraform taint aws_instance.nat
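If you taint the wrong resource, the mark can be removed before the next apply:
terraform untaint aws_instance.nat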

Terraforming

Terraforming is a discovery utility that builds terraform .tf and state files from a running AWS infrastructure.
https://github.com/dtan4/terraforming

Installation

gem install terraforming
gem install io-console

Usage

terraforming elb --profile fitch > elb.tf
terraforming elb --profile fitch --tfstate > terraform.tfstate
"terraform plan" should show you no changes after that.

Merge

Example of merging terraform state with existing infrastructure
Disable remote state
terraform remote config -disable
Pull routing data
terraforming rt > route.tf
Update current state file
terraforming rt --tfstate --merge=terraform.tfstate --overwrite
Upload state to remote location
terraform remote config -backend=artifactory -backend-config="repo=terraform-state" -backend-config="subpath=customers/enterpriseeu/vpn"
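As with the ELB example above, a final terraform plan should report no changes once the merged state matches what is actually running:
terraform plan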