Block specific languages (with RegExp)

Hi,

I have the regex string:

~/[АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯЁЂЃЄЅІЇЈЉЊЋЌЎЏҐабвгдежзийклмнопрстуфхцчшщъыьэюяёђѓєѕіїјљњћќўџґ]+$/i\

in my blacklist, but it doesn’t seem to be working. Is there a simplified version that I can use, or is there a reason the code isn’t working properly? Sorry for the simple question, my regex skills are pretty mediocre lol.

Also, if anyone can provide a regex string to detect Arabic characters that would be amazing! Thank you so much :smiley:

Hey @louis!

Have a look here: How to timeout non English sentences?

Hi Emily,

Thanks for your response! I actually looked at this earlier, and I’m not interested in timing out ALL non-Latin characters. Chinese/Korean/Japanese are okay; I’m really just looking for regex patterns for Arabic and Russian. Could the error in the regex be the \ character after the i?

Okay, then for Cyrillic use: /[\u0400-\u04FF]+/
And for Arabic use: /[\u0600-\u06FF]+/
To block other ranges of Unicode characters, have a look there: Public Unicode Character Map.
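
If you want to sanity-check the two ranges outside the bot, here is a minimal sketch in Python (my choice for illustration only, not what the bot itself runs). The helper name `contains_blocked_script` and the sample messages are made up; Python's `re` module accepts the same `\uXXXX` range syntax as the patterns above.

```python
import re

# The two Unicode blocks from the post above:
# Cyrillic (U+0400 to U+04FF) and Arabic (U+0600 to U+06FF).
CYRILLIC = re.compile(r"[\u0400-\u04FF]+")
ARABIC = re.compile(r"[\u0600-\u06FF]+")


def contains_blocked_script(message: str) -> bool:
    """Hypothetical helper: True if the message contains any Cyrillic or Arabic character."""
    return bool(CYRILLIC.search(message) or ARABIC.search(message))


# Made-up sample messages; each call prints True or False.
print(contains_blocked_script("hello chat"))      # Latin only
print(contains_blocked_script("привет chat"))     # contains Cyrillic
print(contains_blocked_script("مرحبا"))           # Arabic
print(contains_blocked_script("こんにちは 안녕"))   # Japanese and Korean
```

Japanese, Korean, and Chinese characters live in other Unicode blocks, so these two patterns should leave them alone.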

And to answer your question: yes, your trailing \ is likely causing an error. You don’t need the $ either, because it anchors the match to the end of the message, so Cyrillic text followed by anything else would not be caught. I don’t know what the ~ is doing there. The i modifier makes the comparison case-insensitive, so with it you would only need to list either the uppercase or the lowercase characters instead of both sets. The modifier you might be thinking of is g, but since a single match is enough to flag a message, I don’t think it’s really necessary.
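
To make the points about $ and i concrete, here is a small Python sketch (Python is just for illustration; the bot's regex engine may differ in details, and the sample messages are made up). It compares an end-anchored pattern with an unanchored one and shows a case-insensitive match against a lowercase-only character class.

```python
import re

# With a trailing $, the Cyrillic run must sit at the very end of the message.
anchored = re.compile(r"[\u0400-\u04FF]+$")
# Without it, a Cyrillic run anywhere in the message matches.
unanchored = re.compile(r"[\u0400-\u04FF]+")
# Only lowercase а-я is listed; IGNORECASE (the "i" modifier) covers the uppercase forms.
case_insensitive = re.compile(r"[а-я]+", re.IGNORECASE)

ends_with_cyrillic = "lol привет"
starts_with_cyrillic = "привет lol"

print(bool(anchored.search(ends_with_cyrillic)))      # Cyrillic at the end
print(bool(anchored.search(starts_with_cyrillic)))    # Cyrillic followed by Latin text
print(bool(unanchored.search(starts_with_cyrillic)))  # same message, no anchor
print(bool(case_insensitive.search("ПРИВЕТ")))        # uppercase input, lowercase class
```

The second call is the case the $ lets through: the message contains Cyrillic but does not end with it.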
