Firstly, this is a topic you have likely heard covered before, and I really hope you will hear again. That is a good thing. Security is a lonely profession. A lot of companies in the startup space hire very small security teams, and so those teams tend to be very isolated. By hearing more about how different teams handle their security, you can get ideas for how to do your own process better, or give feedback. Many of us aren't working at large companies with large security teams, and only by sharing can we learn.
On that note, however, this is not a formula for success. If you do what I did, your company won't necessarily be better. Different companies are in different places. Therefore, I am going to start off talking about Roadmunk and how it works, since this is my experience there. Roadmunk is a startup, but when I joined it, it was already five years old. This means that it's what I would call a mature startup. This is important, since smaller companies will have even less process to work with, fewer certifications, and more technical work to get done. Working at a smaller startup would be ideal, since security grows with a company, but few have the money to hire specialists. Similarly, larger companies do sometimes try to create dedicated security teams, despite being hundreds of people and far older. That is a completely different environment, since it will already have a security culture, and a process for creating new processes. But Roadmunk walked that happy medium, where there was space to maneuver, while also having a framework to work in.
Just like who Roadmunk is, who I am matters a lot, since it is going to shape my approach. So, just to fill that in, I am a Unix geek. I have been using Unix since high school, and it is my first love. I basically would prefer to always be working with Unix. However, despite that, I have a degree in Psychology. For me, humans are the most important part of any system, and systems are there for humans. Therefore, I wanted to know humans first. Once I left school, I worked at an ISP as a Unix and networking grunt. Then, I worked at a company doing patient record storage for healthcare and non-profits. Basically, I have a very old school feel, lots of bare metal and networks, running Unix, but also a focus on how humans think, and a bunch of time in a highly regulated environment.
Then I was hired at Roadmunk, as the first person with the word Security in their title. And I had to split that role with my operations work. So, it meant I had to be pretty careful to build out the job quickly.
Upon coming to Roadmunk, I quickly learned that it had a bunch of things going for it. They had no local computers, all their servers were in the cloud, and so everyone just showed up to work with a laptop. And they had policies and processes for encrypting those and so on. So I didn't need to handle a lot of local devices. But, of course, everyone's cloud sucks, right? Actually, Roadmunk had already segregated their networks. Only devices that directly had to talk to each other were on the same AWS virtual private cloud. So, the different kinds of services were properly split apart. And to get on those servers, they had managed keys that were easy to replace, commission, and decommission! It also had a few other nice-to-haves, and there were internal security conversations. So, I set some goals.
My one year plan was to:
- Be proactive
- Be aware
- Don't get fancy
Firstly, I wanted to make sure that we were staying in a secure state, and that our security decisions mattered. Too often we hear of companies not knowing about their own infrastructure, or not properly fixing a breach that existed. I recall a story of an incident where a researcher got a bug bounty out of Facebook for a six-month-old vulnerability that Facebook thought it had fixed. My goal was to not let that happen.
One of the biggest problems in companies the size of Roadmunk is that they are breached, an attacker sits on their network and siphons data for months, and no one notices. I have heard numbers between 90 and 200 days for average breach discovery time. I wanted to make sure that we were aware of events that occurred on the systems, and that if something did occur, we could actually do an investigation. I didn't want to be paying consultant rates to set up a log store for me after the fact.
Finally, for this year, I really didn't want to do anything overly fancy. I wanted to make sure that the team was setting realistic goals and had realistic expectations.
To that end, I have three pillars. Firstly, tech. I needed to do a bunch of actual technical work, since being proactive and aware means monitoring, logging, patching, and lots and lots of automation. Secondly, I needed to help build culture within Roadmunk. That meant teaching, helping people learn how to give better reports, and helping people prioritize and threat model better. Finally, Roadmunk needed to grow into compliance. Roadmunk had an ISO 27001 certification, which was good, but Roadmunk also needed to build a compliance strategy to help communicate internally and with its customers.
The four pillars of success, as far as I am concerned, for anyone attempting to build a security process are: Automate, Patch, Monitor, and Log. Missing any one of these means that you probably don't have security, no matter how good your pentest result looks.
Automation is probably the least obvious of these. But it is the one that makes everything else work. People like to talk up The Cloud as a concept (or they did a few years ago), but honestly, I see a lot of people make mistakes there. The cloud lets you build brand new servers, running up to date code, very quickly. I personally use Terraform and Ansible to do this, but you don't have to. Both work almost everywhere, and neither ties you down to one operating system or cloud provider. Their specific use is outside my scope, but Terraform is a tool by HashiCorp for deploying cloud servers, and Ansible is a tool by Red Hat for configuring servers that are already live. I basically could not do my job without either of them.
Because what they let me do is burn everything down. Servers should never be left to live. The easiest way to leave an out of date component running, or to forget to decommission a user, is to let the server live. You cannot forget something that has been burned down. Roadmunk's release process is to turn on new servers running new code, then turn off the old servers, then a day later destroy them. This is basically our patching process.
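To make that rotation concrete, here is a minimal Python sketch of the idea. It is not Roadmunk's actual tooling (that work is done with Terraform and Ansible), and it only models the bookkeeping rather than talking to any cloud provider. The `Server` class and the `roll_out` and `reap` functions are hypothetical names made up for illustration; the only detail taken from the description above is the order of operations and the one day wait before destroying old servers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# From the text: old servers are destroyed a day after being turned off.
GRACE_PERIOD = timedelta(days=1)


@dataclass
class Server:
    """Hypothetical stand-in for a cloud server; not a real provider API."""
    name: str
    release: str
    state: str = "running"  # "running" -> "stopped" -> "destroyed"
    stopped_at: Optional[datetime] = None


def roll_out(fleet: List[Server], release: str, count: int, now: datetime) -> List[Server]:
    """Turn on new servers for `release`, then turn off every older running one."""
    # Step 1: build the new servers first, so there is always something serving.
    new_servers = [Server(name=f"{release}-{i}", release=release) for i in range(count)]
    # Step 2: stop (but do not yet destroy) everything that was running before.
    for server in fleet:
        if server.state == "running":
            server.state = "stopped"
            server.stopped_at = now
    fleet.extend(new_servers)
    return new_servers


def reap(fleet: List[Server], now: datetime) -> List[Server]:
    """Destroy servers that have been stopped for at least GRACE_PERIOD."""
    destroyed = []
    for server in fleet:
        if (
            server.state == "stopped"
            and server.stopped_at is not None
            and now - server.stopped_at >= GRACE_PERIOD
        ):
            server.state = "destroyed"
            destroyed.append(server)
    return destroyed


if __name__ == "__main__":
    start = datetime(2020, 1, 1)
    fleet: List[Server] = []
    roll_out(fleet, "v1", count=2, now=start)
    roll_out(fleet, "v2", count=2, now=start + timedelta(days=7))
    reap(fleet, now=start + timedelta(days=8))
    for server in fleet:
        print(server.name, server.state)
```

The point of keeping the stopped servers around for a day is presumably to leave a window for rolling back; after that, nothing old survives long enough to drift out of date.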