We use the amazing WordPress Redirection plugin. We recently started a survey and the best way to contact the participants was by letter. We had to include a URL to the survey. Some people were mistyping the URL. For example:
To fix this, at 8:10am this morning, I threw up a very hasty regex redirect and went to have my breakfast, slapping myself on the back. It matches all the errors and, I thought, would catch most other typos. Here it is:
^/(directpayment|direct-payment).*
Problem is it also caught the target URL and an endless redirect ensued. The page was down for 9 hours.
After my trials a little while back trying to get to grips with not matching strings in a regex I had a good idea for how to fix it.
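The exact pattern isn't reproduced here, so what follows is only a minimal Python sketch of the "not matching" idea using a negative lookahead. The target path `/direct-payments-survey/` is a made-up stand-in (the real survey URL isn't given), and the `redirects` helper is purely illustrative.

```python
import re

# Hypothetical target URL; the real survey address is not given in the post.
TARGET = "/direct-payments-survey/"

# The hasty pattern from the post: it also matches the target itself.
hasty = re.compile(r"^/(directpayment|direct-payment).*")

# Sketch of a fix: a negative lookahead refuses the exact target
# (with or without a trailing slash) before the typo-catching part runs.
safer = re.compile(
    r"^/(?!direct-payments-survey/?$)(directpayment|direct-payment).*"
)


def redirects(pattern, path):
    """Return True if the redirect rule would fire for this path."""
    return pattern.match(path) is not None


# Made-up typos, plus the target itself as the last entry.
typos = ["/directpayment", "/direct-payment-survey", "/direct-paymentssurvey"]
for path in typos + [TARGET]:
    print(path, redirects(hasty, path), redirects(safer, path))
```

The hasty pattern fires on every path, including the target, which is what causes the loop. The lookahead version still fires on the typos but leaves the target alone.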
The silver lining here is that I now have a very reusable fix when I need to match something very close to the target URL. I’ve had this problem in the past and often just created a completely different URL. Even then this was not foolproof as WordPress keeps its own records of old URLs and redirects.
Well, who knew CloudWatch would be so much fun to tinker with?! Not me!
I have been slowly refining my CloudWatch dashboard: adding new alarms, expanding the scope of the log watch, all that good stuff. It is very satisfying. Over the last week or so I have also set up fail2ban because (according to my audit log via CloudWatch) sshd was getting hammered. As previously mentioned, this box is not well resourced, so I wanted to nip that in the bud. But does the cost of running fail2ban outweigh the benefits? Hard to say!
Anyway, I am getting quite a lot of email from fail2ban. This is good because I know it is working but I’d rather not have the email and still be able to easily check it was working… so CloudWatch!
I added the fail2ban log to the config and used the Logs Insights tool to explore. This is a typical line:
2021-04-30 17:58:06.631, "2021-04-30 18:58:06,208 fail2ban.filter [100432]: INFO [sshd] Found 205.185.119.236 - 2021-04-30 18:58:05"
We could use the date/time a few more times, right? I decided this was the time to jump into the parse command in the CloudWatch Logs Insights query language (rolls off the tongue that). I knew I was going to need another regex within about 10 seconds. But, damn, if the examples aren’t thin on the ground. I googled and found virtually nothing although this post did help a bit.
So, to regex101.com I went. I exported a few lines from the log to test and I must be getting quite a lot better because I got the basics working pretty quickly:
Excellent! It was running for about 5 minutes and it suddenly produced a blank line. Of course, [sshd] in the log refers to the jail. I have several set up so…
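The query itself isn't shown here, so this is only a Python sketch of the sort of regex that copes with any jail name rather than a hard-coded `[sshd]`. The group names `jail` and `ip`, the `parse_found` helper and the second sample line are all made up for illustration. Note that Logs Insights writes named groups as `(?<name>...)`, while Python uses `(?P<name>...)`.

```python
import re

# Message part of the sample line from the post.
line = ('2021-04-30 18:58:06,208 fail2ban.filter [100432]: '
        'INFO [sshd] Found 205.185.119.236 - 2021-04-30 18:58:05')

# Same line with a different jail name (hypothetical, for illustration only).
other = line.replace("[sshd]", "[nginx-http-auth]")

# Match any jail name inside the brackets instead of hard-coding "sshd".
# The "[100432]:" process id is skipped because it is not followed by "Found".
FOUND = re.compile(
    r"\[(?P<jail>[^\]]+)\]\s+Found\s+(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def parse_found(text):
    """Return (jail, ip) for a 'Found' line, or None if it does not match."""
    m = FOUND.search(text)
    return (m.group("jail"), m.group("ip")) if m else None


print(parse_found(line))
print(parse_found(other))
```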
Had a bit of trouble with what should have been a simple regex today.
I removed a whole bunch of Pages from our site today, and they were all sat under the same parent. I wanted to redirect the now missing pages to the parent. I came up with this:
^\/dignity-care-reports\/[a-z-]*(\/?)
It worked fine in that it redirected the sub-pages but the parent was now in a redirect loop. So, I headed over to regex101.com to investigate. I quickly found that, yes, this regex did match the parent URL.
The great thing about regex101 is that it gives an explanation of what each of your tokens is doing and I quickly saw:
* matches the previous token between zero and unlimited times
And there’s the problem! Now I just needed any sort of match after that second / to stop the redirect loop and a ‘+’ does the job:
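For anyone who wants to check the difference outside regex101, here is a small Python sketch comparing the `*` version with the `+` version described above. The sub-page name `example-report` is made up for the test.

```python
import re

# Original pattern: "*" lets the character class match nothing at all.
star = re.compile(r"^\/dignity-care-reports\/[a-z-]*(\/?)")

# Same pattern with "+": at least one character must follow the second "/".
plus = re.compile(r"^\/dignity-care-reports\/[a-z-]+(\/?)")

parent = "/dignity-care-reports/"
child = "/dignity-care-reports/example-report/"  # hypothetical sub-page

for name, pattern in (("star", star), ("plus", plus)):
    print(name,
          "parent:", bool(pattern.match(parent)),
          "child:", bool(pattern.match(child)))
```

With `*` both the parent and the sub-page match, hence the loop. With `+` only the sub-page matches.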
This site runs on CentOS 8, on an AWS t2.micro EC2 instance. It’s all free tier at the moment as this is very much a trial run/learning experience.
It’s taken several attempts over the last few months to get the instance running nice and stable. I was getting regular out-of-memory errors, with mysqld just killing itself. It would then get caught in an endless suicidal loop because the restart of the service caused another out-of-memory error.
However, I’m not sure that mysqld was really at fault. I don’t think php-fpm was very well configured. When I set the site up I followed a few tutorials that didn’t really explain how to clean up afterwards and I ended up running way more php-fpm threads/children than needed for such a tiny, low traffic site. Since I cut those back, I have not had a crash and have not had to restart the instance.
Yesterday, I was finally able to get an AWS CloudWatch dashboard set up that actually showed me what was going on. That part was surprisingly easy compared to setting up CloudWatch itself, which, even following AWS’ own tutorials, was a bit of a shot in the dark.
As I said, it all seems pretty stable now so I am adding new WP plugins etc very slowly so I can see if any one thing causes a problem.
Found a weird WordPress bug today. I exported ALL from my old WP hosted blog and imported everything here. It all went fine except all the Post counts for imported Tags in the Tags screen were zero. I found this old post with a similar problem but no solution.
I figured it just needed a nudge to force it to recount the Tags, so simply did a Bulk Edit and added a dummy tag to all my Posts. Checked the Tags and bingo! Then I just deleted the dummy tag.
I love this cartoon. It makes a great point really simply.
However, it’s a bit, well, misleading…
The cartoon suggests that Tr0ub4dor&3 is massively more susceptible to an attack than correcthorsebatterystaple because of the difference in entropy.
Assuming the math is sound (I calculated using actual password space rather than entropy) I still have a problem with how it might be interpreted.
In the cartoon Tr0ub4dor&3 loses entropy points because it’s based on a non-random word with substitutions. Ok, that’s fair enough. Lots of people do create p4ssw0rds this way so it seems reasonable to punish this with lower entropy.
In short, this password is punished because the format is predictable.
However, if we punish that password for a predictable format, it’s also fair to say that correcthorsebatterystaple is thus susceptible to a dictionary attack. Conversely, Tr0ub4dor&3 is entirely secure against such an attack.
A pure brute force attack against Tr0ub4dor&3 at 1,000 guesses a second would take 180 billion years.
The same attack against correcthorsebatterystaple would take roughly 7.5 million billion billion years (about 7.5 × 10^24, for 25 lowercase letters).
The difference is so gigantic it’s almost inconceivably massive.
However, if we consider a dictionary attack using 860,000 words against correcthorsebatterystaple at 1000 guesses a second we’re looking at 17,345 billion years.
Suddenly, correcthorsebatterystaple is a lot less strong.
In fact, if you add a single _ to the end of Tr0ub4dor&3 it now takes 17,134 billion years to brute force. That’s very comparable.
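The arithmetic behind these figures can be sketched in a few lines of Python. The assumptions are mine rather than stated in the post: 95 printable ASCII characters for the mixed password, 26 lowercase letters for the four-word one, a 365-day year, and an attacker who has to try every possibility. Under those assumptions the quoted numbers come out as written, and the lowercase-only figure lands near 7.5 × 10^24 years.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumed 365-day year
GUESSES_PER_SECOND = 1000               # rate used in the post


def years_to_exhaust(choices, length):
    """Years needed to try every combination of `length` symbols,
    each drawn from `choices` possibilities."""
    return choices ** length / GUESSES_PER_SECOND / SECONDS_PER_YEAR


brute_11 = years_to_exhaust(95, 11)      # Tr0ub4dor&3, 95 printable chars
brute_12 = years_to_exhaust(95, 12)      # Tr0ub4dor&3_ (one extra char)
lower_25 = years_to_exhaust(26, 25)      # 25 lowercase letters, brute force
words_4 = years_to_exhaust(860_000, 4)   # 4 words from an 860,000-word list

for label, value in [
    ("11 chars, brute force", brute_11),
    ("12 chars, brute force", brute_12),
    ("25 lowercase, brute force", lower_25),
    ("4 words, dictionary", words_4),
]:
    print(f"{label}: {value:.3e} years")
```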
To batter the correcthorsebatterystaple example even more we could alter our dictionary attack and remove all the words that are shorter than 4 characters from the 860,000 total. This would be reasonable as you would want your four word password to be at least 16 characters long.
However, I can’t deny that a pure brute force attack on Tr0ub4dor&3 would take less time than a full dictionary attack on correcthorsebatterystaple and the latter is much easier to remember.
So it’s still an amazing bit of work – it just helps to understand the details.
So… what?
Well, basically, two things.
1) using a format that helps you remember your passwords is a good idea – whether you combine 4 random words or use substitutions – the weakness comes when someone PREDICTS the format.
For example, using 4 random dictionary words AND making a single typical substitution of an alpha char for a numeric char (e.g. substituting a 0 for an o) secures it completely against a dictionary attack and the hacker would have to resort to a brute force attack. However, if the hacker KNOWS that you did this, they know they can modify their dictionary attack instead.
2) there is no substitute for length when it comes to passwords.
For example, take a simple 8 char lower case password. Changing one of those 8 chars to a number means it takes 10 times longer to brute force. However, adding another lowercase char would take 26 times longer.
That’s a generous example but it makes the point. Taking a short but complex password like #9Nj and typing it 4 times gives you something extremely resilient against a brute force attack.
Just make sure no-one is looking over your shoulder…