I’ve been doing web development for ages and still do not really “know” regex. I think there are many in this camp, so I wanted to show you how I handle situations when I need to do something with regex.
I simply use a regex tester tool (https://regex101.com/). This lets me write a regex and test it right there against sample text. It also gives me hints and a reference list of all the regex tokens.
Example project I worked on
In this specific project I needed to use regex to match a specific page URL so that it could be forwarded to another URL. The word I was looking for is “science” and the URL was: http://domain.com/science?var=12345. Sometimes this URL has a forward slash after the word science, so I needed to make sure to catch that possibility as well.
After going to Regex101.com, I saw in the reference that (a|b) matches either one or the other, so this would help me catch the variation without the slash as well as the one with it.
So I started to build it: (science|science/)
Right away it told me there was an error. I knew that you are supposed to escape some characters, so I escaped the forward slash with a backslash.
My regex: (science|science\/)
Now that I have this, I need to include the question mark that follows, so that the regex identifies this page specifically, because I do not want it to catch pages like “science-is-cool”. So I added the question mark, but remembered that it also needs to be escaped, since an unescaped question mark has a special meaning in regex.
My regex: (science\?|science\/\?)
When I test this against a URL for a page like “science-is-cool”, it doesn’t catch it:
But when the page is “science” followed by a slash, it does:
as well as without the slash:
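The same three checks can also be run outside Regex101. Here is a minimal Python sketch, assuming Python's built-in re module; the function name matches_science_page and the two extra test URLs are made up for illustration, and only the first URL comes from the project described above.

```python
import re

# The pattern built above. In Python the "/" does not strictly need
# escaping, but the escaped form is harmless and behaves the same.
PATTERN = re.compile(r"(science\?|science\/\?)")


def matches_science_page(url: str) -> bool:
    """Return True if the URL contains "science?" or "science/?"."""
    return PATTERN.search(url) is not None


# Hypothetical sample URLs: without slash, with slash, and a page
# that should not be caught.
urls = [
    "http://domain.com/science?var=12345",
    "http://domain.com/science/?var=12345",
    "http://domain.com/science-is-cool?var=12345",
]

for url in urls:
    print(url, matches_science_page(url))
```

If you want something shorter, science\/?\? should behave the same way: the unescaped question mark after the slash makes the slash optional, and the escaped one matches a literal question mark.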
Perfect. This is what I needed. I know this is a lengthier process than it would be if I knew regex without having to look it up, but honestly, I do not need it that often, so stopping the music to go and learn it would be a waste of time anyway. I won’t be using it enough to retain it.
Sometimes your target is more complicated than just a word, so you can search for something someone else has already solved. This is as easy as a Google search. I needed to catch when someone entered a PO Box into the shipping address field. For this I just Googled “regex po box”. The first result, from GitHub, delivered the solution for me: https://gist.github.com/gregferrell/74946670.
To make sure it works, I pasted it into Regex101 and checked that it catches any version of PO Box I can throw at it, and it does very well.
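That kind of “throw everything at it” check is easy to script too. The sketch below is only an illustration: PO_BOX_PATTERN is a simplified stand-in I am assuming for the example, not the regex from the gist, and the sample addresses are made up. Swap in whatever pattern you found and add the variations you care about.

```python
import re

# Simplified, hypothetical PO Box pattern (NOT the one from the gist).
# It accepts "PO Box", "P.O. Box", "P O Box", "POBox" and
# "Post Office Box", in any letter case.
PO_BOX_PATTERN = re.compile(
    r"\b(?:p\.?\s*o\.?|post\s+office)\s*box\b",
    re.IGNORECASE,
)


def looks_like_po_box(address: str) -> bool:
    """Return True if the address appears to contain a PO Box."""
    return PO_BOX_PATTERN.search(address) is not None


# Made-up shipping addresses paired with the result we expect.
samples = {
    "PO Box 123": True,
    "P.O. Box 123": True,
    "p o box 9": True,
    "Post Office Box 5": True,
    "123 Main Street": False,
    "45 Boxwood Lane": False,
}

for address, expected in samples.items():
    result = looks_like_po_box(address)
    status = "ok" if result == expected else "UNEXPECTED"
    print(f"{address!r}: {result} ({status})")
```

Keeping the expected result next to each sample makes it obvious when a borrowed regex misses a variation or catches something it should not.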
So there you go. If Google doesn’t solve it on the first try, try searching for something related to the filter you need and you will very likely find something that works. So far I have not had a case where I was unable to find a solution.
How do you deal with regex? Let me know in the comments.