Show full content
Programmers hate adding security to their systems.
First, it is a huge amount of work, and second, since it is so often left to the end, it is very ugly and disruptive work. A patchwork of hacks.
But it’s misdirected. Without enough security, the system they built is useless, well, worse than useless. If people use it, it could severely screw them over. Nobody would intentionally use something that helps criminals more than it helps them. Even if it is in a walled garden, you can never really be sure that someone isn’t motivated to take a peek.
It’s worth noting that I am not a security expert, and although I’ve had to deal with it a lot in my career, my practice might not be as strong as the experts would like. That being said, I’ll continue.
There are only a few things you need to worry about in security. First is actually identifying any ‘users’. You always have to know who they are and have enough confidence in that decision that you don’t make a mistake.
Then the other part of it is that you want to protect both the data and the code from anyone who isn’t supposed to see or activate it. It’s not enough to protect just the data or just the code. You need both.
In that sense, security isn’t that hard if it is your concern from day one. There are a bunch of entry points that people will use to get to the features and functionality. First, you identify them, then you check to see if they can access the given functionality. If they can, then lower, you check to see if they can access all of the data input into that functionality. If they can, then they can see the output. Simple 🙂
Here’s where the trouble starts. First, there should be no anonymous endpoints. But people love adding them, but they open the door to leaks or denial of service attacks. If you have none, though, all of that goes away. If you can’t quickly identify someone right at the top, punt them immediately, send a log to some administrator. They might have to block the incoming address or put up some firewalls to stop botnets and other nasty things. You always need to flag a punted user as a serious problem.
Second is databases. For capitalist reasons, they charge by users, so the system users are not the same as the database users. That sucks, and it has always sucked. Life would be pretty easy if a person’s identity propagated all the way down to the metal. It should.
If there was a necessity for a group or functional account shared by a bunch of people, then the group is the identity, and that identity goes all the way down.
If your database or its license makes that impossible, then you need to wrap it. You need to wrap it thoroughly enough that pretty much nobody can get to it in any way without first passing an identity check. So, not just in the backend code, but also on the machine in the scripts, with the OS, etc. Everywhere.
Wrap the database. It won’t make it convenient, since that is the opposite of secure, but you need to do it.
Now, at the top, after you have checked identity, you take a quick look at whatever functionality is called. Are they allowed to use it? In some extreme cases, that is a messy lookup table, and it needs to be managed by data admins. It’s annoying, but really, in a large organization, that really should be a distinct piece that is shared by a whole bunch of systems. You just check with it, user X wants to call foo, is that okay?
If that’s good, then as the code executes and hits the wrapped database, the second check will trigger on the data. If it’s good, then it is all done. If you always reuse both the high and lower levels, then the security will be everywhere, and you don’t have to lie awake at night worrying about it failing.
The only other part is that if a user ever sends you ‘code’, you laugh and reject it. If you want some cool dynamic execution feature, great, but there have to be two paths, not one. The code comes in from somewhere else, having been fully and completely vetted, and then the user later asks for it to execute dynamically. That keeps it really simple, and sets you up with some external means for this uber dangerous code to be properly managed, vetted, and approved. That in itself is a huge task; you can’t just ignore it and hope for the best. Dynamic code can never be ‘open’ dynamic code; it has to be closed and come from a reliable source that actually has to be more reliable than just reliable.
So, in the end, if you wrap the database, always identify everyone, manage a lookup table or two, and punt anything that could ever be possibly executed by any downstream party or library, then you’re done. All of this code is reusable; you just need to do it once, at the beginning of the project, then leverage it for success and glory.
First, it is a huge amount of work, and second, since it is so often left to the end, it is very ugly and disruptive work. A patchwork of hacks.
But it’s misdirected. Without enough security, the system they built is useless, well, worse than useless. If people use it, it could severely screw them over. Nobody would intentionally use something that helps criminals more than it helps them. Even if it is in a walled garden, you can never really be sure that someone isn’t motivated to take a peek.
It’s worth noting that I am not a security expert, and although I’ve had to deal with it a lot in my career, my practice might not be as strong as the experts would like. That being said, I’ll continue.
There are only a few things you need to worry about in security. First is actually identifying any ‘users’. You always have to know who they are and have enough confidence in that decision that you don’t make a mistake.
Then the other part of it is that you want to protect both the data and the code from anyone who isn’t supposed to see or activate it. It’s not enough to protect just the data or just the code. You need both.
In that sense, security isn’t that hard if it is your concern from day one. There are a bunch of entry points that people will use to get to the features and functionality. First, you identify them, then you check to see if they can access the given functionality. If they can, then lower, you check to see if they can access all of the data input into that functionality. If they can, then they can see the output. Simple 🙂
Here’s where the trouble starts. First, there should be no anonymous endpoints. But people love adding them, but they open the door to leaks or denial of service attacks. If you have none, though, all of that goes away. If you can’t quickly identify someone right at the top, punt them immediately, send a log to some administrator. They might have to block the incoming address or put up some firewalls to stop botnets and other nasty things. You always need to flag a punted user as a serious problem.
Second is databases. For capitalist reasons, they charge by users, so the system users are not the same as the database users. That sucks, and it has always sucked. Life would be pretty easy if a person’s identity propagated all the way down to the metal. It should.
If there was a necessity for a group or functional account shared by a bunch of people, then the group is the identity, and that identity goes all the way down.
If your database or its license makes that impossible, then you need to wrap it. You need to wrap it thoroughly enough that pretty much nobody can get to it in any way without first passing an identity check. So, not just in the backend code, but also on the machine in the scripts, with the OS, etc. Everywhere.
Wrap the database. It won’t make it convenient, since that is the opposite of secure, but you need to do it.
Now, at the top, after you have checked identity, you take a quick look at whatever functionality is called. Are they allowed to use it? In some extreme cases, that is a messy lookup table, and it needs to be managed by data admins. It’s annoying, but really, in a large organization, that really should be a distinct piece that is shared by a whole bunch of systems. You just check with it, user X wants to call foo, is that okay?
If that’s good, then as the code executes and hits the wrapped database, the second check will trigger on the data. If it’s good, then it is all done. If you always reuse both the high and lower levels, then the security will be everywhere, and you don’t have to lie awake at night worrying about it failing.
The only other part is that if a user ever sends you ‘code’, you laugh and reject it. If you want some cool dynamic execution feature, great, but there have to be two paths, not one. The code comes in from somewhere else, having been fully and completely vetted, and then the user later asks for it to execute dynamically. That keeps it really simple, and sets you up with some external means for this uber dangerous code to be properly managed, vetted, and approved. That in itself is a huge task; you can’t just ignore it and hope for the best. Dynamic code can never be ‘open’ dynamic code; it has to be closed and come from a reliable source that actually has to be more reliable than just reliable.
So, in the end, if you wrap the database, always identify everyone, manage a lookup table or two, and punt anything that could ever be possibly executed by any downstream party or library, then you’re done. All of this code is reusable; you just need to do it once, at the beginning of the project, then leverage it for success and glory.