Perfecting a Regular Expression for Comprehensive Blacklisting

Hello

I’m trying to create a regular expression that comprehensively blocks a multitude of bad words and bad-word circumvention tactics, and that also applies to the name checks in Nightbot.
As I’m not very clued up on regex, I’ve used ChatGPT to help me structure it, based on a blacklist of 700+ words I found on Reddit.

It can then be shared with others to make everyone’s lives simpler.

I have the following regex for use in Nightbot, and a few questions about it follow.
Please note that the regular expression itself contains bad words; I don’t think this can be avoided. I’ve therefore blurred it and collapsed it into a block so that it isn’t shown on this page by default. I’m not sure whether any further precaution is needed.

Regular Expression from ChatGPT

~/\b(?:arse|ass|asshole|bastard|bitch|cunt|damn|fuck(?:er|ing|face)?|goddamn|motherfucker|prick|shit(?:ass|s)?|son\sof\sa\sbitch|whore|thot|slut)\b/i,
~/\b(?:faggot|homo(?:phobe|sexual)?|lgbt|gay)\b/i,
~/\b(?:nigg(?:a|er)|chink|coon|negro|jew(?:ish)?|anti\s
semitic|muslim(?:s)?|islamophobe)\b/i,
~/\b(?:anal(?:\sleakage)?|anilingus|butt(?:rape)?|clit(?:oris)?|cock(?:s|sucker)?|cum(?:shot|dumpster|ming)?|cunnilingus|dildo|horny|masturbat(?:e|ion)|orgasm|penis|pussy|rape|semen|sex(?:ual)?|slut|vagina)\b/i,
~/\b(?:kill|die|cliff|bridge|shoot(?:ing)?|bomb(?:ing|ed)?|terror(?:ism|ist)|necrophilia|molest|cut\s
myself|fuck\slife|depression)\b/i,
~/\b(?:redtube|porn(?:ography|o)?|xxx|loli(?:con)?|cub|cp|pedo(?:phile|philia)?|child\s
predator|predatory)\b/i,
~/(?:5h1t|5hit|a55|ar5e|a_s_s|b!tch|b17ch|bi+ch|c0ck|cawk|cl1t|c*nt|d1ck|f*ck|phuk|sh!t|sh1t|tw4t|w00se)/i,
~/\b(f[a@4]ck|f[a@4]+u[ck]+k|f[a@4]*k+u)\b,
~/\b(sh[i1!]+t|s[h$5]+[i1!]+t|sh[i1!]+tt+y+|s[h$5]*i[t]+h[e@4]ad)\b,
~/\b(c[u@]+m+[s$5]*h[o0]+t|c[l1!]+it|c[u@]+nt|c[o0]+ck|d[i1!]+ck|d[o0]+uch[e3])\b,
~/\b(b[i1!]+tch|b[o0]+ner|b[a@4]+st[a@4]+rd)\b,
~/\b(wh[o0]+re|tw[a@4]+t|p[u@]+ss[y!i]+)\b,
~/\b(f[a@4]+gg[o0]+t|f[a@4]+gg[i1!]+ng)\b,
~/\b(h[o0]+mo|h[o0]+m[o0]+sexual)\b,
~/\b(4rse|[a@4]ss|[a@4]ssh[o0]+le)\b,
~/\b(k[i1!]+ll+y+[o0]+urs[e3]+lf|su[i1!]+c[i1!]+d[e3]+|s[e3]lf[-\s]*h[a@4]+rm)\b

Question 1: Must each regex start with “~/”, or must it start with “/”, or go straight into the pattern?
Question 2: Must each regex line end with a comma “,” as I currently have it, or should it be removed?
Question 3: As these regex strings are very complex to me, if anyone would like to comment on whether they have been structured correctly, that would be great.

Thanks

We consider the line a regular expression when there is a full line matching this pattern: ~/[pattern]/[flags]

It must start with ~/ and end with / or /[flags]

Since a comma is not a valid flag, you cannot use it there. Each regular expression would need to be separated by a new line.
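
The rules above can be turned into a quick pre-check before pasting lines into the blacklist. The following is a minimal Python sketch, not Nightbot's actual parser: the function name check_line is hypothetical, the placeholder words stand in for real entries, and treating flags as a run of lowercase letters is an assumption.

```python
import re

# Assumed shape of a regex line, following the description above:
# "~/" + pattern + "/" + optional flags.
# Treating flags as lowercase letters is an assumption, not documented behavior.
LINE_SHAPE = re.compile(r"^~/(?P<pattern>.+)/(?P<flags>[a-z]*)$")


def check_line(line: str) -> str:
    """Return "ok", or a short reason the line would not count as a regex."""
    if not line.startswith("~/"):
        return "does not start with ~/"
    if line.endswith(","):
        return "trailing comma after the closing slash"
    if LINE_SHAPE.match(line) is None:
        return "missing closing / or invalid flags"
    return "ok"


# Placeholder entries; one regular expression per line, no separators.
for entry in [
    r"~/\b(?:foo|bar)\b/i",   # well formed
    r"~/\b(?:foo|bar)\b/i,",  # comma is not a valid flag
    r"~/\b(foo|bar)\b",       # closing slash is missing
]:
    print(check_line(entry), "<-", entry)
```

Run against the list in the question, a check like this would flag the trailing comma on every line and the missing closing slash on the later lines.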

You can use a site like https://www.regexpal.com/ (or other similar ones on Google) to test the expressions. You should not need to include the ~ at the start (that’s a Nightbot thing).
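
If you would rather test locally, the same idea can be sketched in Python: drop the Nightbot-only ~, split off the flags, and try the pattern against sample messages. This is only an approximation, since Python's re module is not guaranteed to behave exactly like the regex engine Nightbot or the testing site uses. The helper name compile_line and the placeholder word are hypothetical.

```python
import re


def compile_line(line: str) -> re.Pattern:
    """Compile a "~/pattern/flags" or "/pattern/flags" line with Python's re."""
    body = line[1:] if line.startswith("~") else line  # the "~" is Nightbot-only
    if not body.startswith("/") or body.count("/") < 2:
        raise ValueError("expected /pattern/flags")
    # Everything after the last "/" is treated as flags (only "i" is handled here).
    pattern, _, flags = body[1:].rpartition("/")
    return re.compile(pattern, re.IGNORECASE if "i" in flags else 0)


rx = compile_line(r"~/\b(?:b[a@4]+dw[o0]+rd)\b/i")

# Character classes catch substitutions; the "i" flag ignores case.
print(bool(rx.search("that is a B@dW0rd")))  # True

# \b requires a word boundary, so a longer word containing the term is not matched.
print(bool(rx.search("badwords")))  # False

# An unescaped * is a quantifier, not a literal asterisk:
# "w*rd" means "zero or more w, then rd", so it matches inside "bird".
print(bool(re.search(r"w*rd", "bird")))  # True
```

The last check is worth running on any entry that uses * to stand in for a censored letter, because such an entry can match far more than intended.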