Backstory
I love open source. I love ElasticSearch, I love the AWS cloud, and I love not getting compromised. Today, I'm putting a solution in production that I just have to tell someone about or I'm going to explode. If you love any of these things too, then read on my friend, because I've created something that can make your life easier.
See this slide presentation for more information on the initial architecture and ideas. Ultimately I dumped Kubernetes because it didn't add any value at this scale.
What was the Problem?
The problem all started at Armor. My friend Andy Reedy had a major problem: The company needed to ingest and index thousands of logs, but Splunk had just gone private and they wanted like, a million dollars or something for us to continue using their product. Andy introduced us to the ELK stack, and I remember thinking how brilliant his ideas were, but also how complex it would be to implement, maintain, and secure. This was only 2015, though, and back then there really wasn't an easier way to do what we needed to do.
Andy's architecture stuck in the back of my mind for years. I encountered the same problem everywhere - dev teams, security teams, product teams, forensics teams... everyone needed more access to the application chain, but the data volume was just overwhelming! So I asked myself:
Can I make ElasticSearch Cheap AND Easy AND Secure?
TL;DR: the answer is yes.
The Solution
At the time, Andy did not have four crucial technologies that I've implemented in our solution here at LEO:
- Docker Swarm
- AWS Spot Market
- Ansible
- ElasticSearch X-Pack as part of the open source package
By combining these technologies, I've created a secure infrastructure-as-code and configuration-as-code technology that can deploy a fully secured, fully fungible, and fully configured ElasticSearch stack within about ten minutes.
Docker Swarm
Get the code here
In Elastic 7, a system administrator can control the entire stack through environment variables. So there are basically only three things that you need to configure, and you can do them all from your Docker Compose file:
- node configuration
- xpack configuration
- certificates
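To make that concrete, here is a minimal sketch of what such a Compose service might look like. The service name, cluster name, and certificate path are assumptions for illustration, not taken from the actual stack:

```yaml
# Hypothetical Compose service sketch - names and paths are placeholders.
version: "3.7"
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
    environment:
      # node configuration
      - node.name=es01
      - cluster.name=leo-cluster
      # xpack configuration
      - xpack.security.enabled=true
      - xpack.security.transport.ssl.enabled=true
      # certificates
      - xpack.security.transport.ssl.keystore.path=certs/es01.p12
```

Everything the node needs - identity, security toggles, and certificate locations - rides in as environment variables, so no custom image build is required.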
AWS Spot Market
One of the hardest things we learn moving to the cloud is how to stop loving our instances. As I type this, the computer I'm using still has parts from 2005. I've loved almost every computer I've owned, but that love is worthless in the cloud.
No, in the cloud, we must love our code, and we must love the power it gives us to create and destroy those computers with a keystroke. In AWS, there is this super awesome feature that everyone learns about but nobody ever uses called "the spot market." Using it is impossibly easy.
Anyone using Ansible in AWS will recognize the ec2 module for creating EC2 instances. If you want to make spot instances, just add this to the top:
```yaml
- ec2:
    spot_price: "1"
    spot_wait_timeout: 60
```
Those two lines right there can save you 85% of your compute and memory costs.
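For context, here is a sketch of what a complete spot-instance task might look like with those parameters in place. The instance type, AMI ID, and key name below are placeholders I've invented for illustration:

```yaml
# Hypothetical full ec2 task - instance_type, image, and key_name
# are placeholders, not values from the real deployment.
- ec2:
    spot_price: "1"          # maximum hourly bid in USD
    spot_wait_timeout: 60    # seconds to wait for the spot request
    instance_type: m5.large
    image: ami-0123456789abcdef0
    key_name: my-key
    count: 1
    wait: yes
```

Because the bid is set well above the typical spot price, the request almost always fills immediately, and you still only pay the going spot rate.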
Ansible
Ansible is really what brings all this together. I've used a lot of automation languages; Chef, Terraform, CloudFormation, and Puppet. The problem is, to be frank, that none of them have the range and flexibility of Ansible. By mastering Ansible, I've been able to perform all of the tasks of the above technologies, all for free, and without learning anything more complex than YAML and a bit of Python now and then.
Here's the playbook I use to deploy the swarm:
```yaml
- name: Build swarm cluster
  hosts: localhost
  gather_facts: False
  tasks:
    - name: Provision hosts if needed
      include_tasks: tasks/provision_master_node_AZ1.yml
    - include_tasks: tasks/provision_data_node_AZ1.yml
    - include_tasks: tasks/provision_data_node_AZ2.yml

- name: Configure each host accordingly
  import_playbook: tasks/configure_new_hosts.yml
- import_playbook: tasks/configure_all_hosts.yml
- import_playbook: tasks/configure_master_hosts.yml
- import_playbook: tasks/configure_data_hosts.yml

- name: Deploy the Docker Swarm stack onto the nodes
  import_playbook: tasks/deploy_stack.yml
```
This playbook is idempotent, which means that I can run it as many times as I like on the same stack, and the stack will remain unchanged, or will revert to its "known good" configuration. For more on why this is so important, See my talk on destroying dwell times with Ansible here.
X-Pack
The X-Pack security suite is now included in ElasticSearch 7. X-Pack finally lets you secure the ElasticSearch product, and its absence was a major, major reason to never, ever use ElasticSearch in production. This is just one example of the horrible kind of breach that was commonplace with ES 6.
Let's take a quick walk through the X-Pack and how I got it working as part of an automated configuration.
Create Certificates on the Fly
First, I create an ElasticSearch container just for making the certificates, and then I destroy it immediately.
Here are the bash commands to create the new certificates:
```bash
#!/bin/bash
/usr/share/elasticsearch/bin/elasticsearch-certutil ca --out elastic-stack-ca.p12 --pass ""
/usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 --name es01 --dns es01 --out certs/es01.p12 --ca-pass "" --pass ""
```
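For reference, the throwaway CA project itself can be a tiny Compose file. This is a sketch of what mine might look like; the image version, volume path, and keep-alive command are assumptions, though the container name matches the one used in the tasks below:

```yaml
# Hypothetical docker-compose.yml for the throwaway CA container.
# Version, volume path, and command are illustrative assumptions.
version: "3.7"
services:
  es-ca:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
    container_name: es-ca.leo.local
    volumes:
      - /home/ec2-user/certs:/usr/share/elasticsearch/certs
    command: ["sleep", "infinity"]  # keep the container alive for docker exec
```

The certificates land on the host via the bind mount, so they survive after the container is destroyed.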
Here is the Ansible to stand up, execute, and destroy the certificate authority for your ES cluster:
```yaml
- name: Stand up the CA container
  docker_compose:
    project_src: /home/ec2-user/elastic-7/elastic-ca
    state: present
    recreate: always

- name: Create the certificates
  shell: 'docker exec es-ca.leo.local bash -l -c "cd /usr/share/elasticsearch/certs/; ./commands.sh"'
  args:
    chdir: /home/ec2-user/certs

- name: Kill the CA container
  docker_compose:
    project_src: /home/ec2-user/elastic-7/elastic-ca
    state: absent
```
I also create host certificates on the fly using my own certificate authority. If you'd like to learn more about that, reach out to me @bkrubnzi .
Create Passwords on the Fly
Every container is going to need some kind of credentials. The easiest way to manage them is to generate them on the fly at run time, save them to a secure location, and then place them in the environment variables of the containers directly. Never write them to disk if you can help it, and if you do, make certain to encrypt them.
```yaml
- set_fact:
    elastic_password: "{{ lookup('password', '/dev/null chars=ascii_letters,digits,hexdigits') }}"
```
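One way to hand that fact to a container without ever touching disk is to export it as an environment variable on the deploy task itself, and let the Compose file interpolate it. This is a sketch under my own assumed project layout - the project path and variable name are illustrative:

```yaml
# Hypothetical deploy task - project_src is a placeholder path.
# The task-level environment is visible to the docker_compose module,
# so a compose file referencing ${ELASTIC_PASSWORD} picks it up.
- name: Deploy Elasticsearch with the generated password
  docker_compose:
    project_src: /home/ec2-user/elastic-7/es01
    state: present
  environment:
    ELASTIC_PASSWORD: "{{ elastic_password }}"
```

The password exists only in Ansible's memory and the container's environment, never in a file on the host.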
Conclusion
There's a lot I've left out, as I'm sure you've noticed, but these are the big stories. As always, my code is available on Github and you can always reach me at bkrubnzi@frankencloud.net!