Regex Tip: Use [0-9] Instead of \d for Digit Matching
When I first started using regular expressions, I thought \d and [0-9] were the same thing. They both match digits, right?
Well… not quite.
In many regex engines, especially the ones that support Unicode, \d matches more than just the digits 0–9. It also matches decimal digits from other writing systems, such as Arabic-Indic or Devanagari digits, and the fullwidth digits used in some East Asian text.
So for example, \d can match things like:
- ٢ (Arabic digit 2)
- ३ (Devanagari digit 3)
- ４ (Fullwidth digit 4)
These might look like regular numbers, but they’re actually different Unicode characters.
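To see the difference in action, here is a minimal Python sketch. It assumes Python 3's built-in re module with ordinary str patterns, where \d is Unicode-aware by default. The helper names matches_backslash_d and matches_ascii_class are made up for this illustration.

```python
import re
import unicodedata

# ASCII 5, Arabic-Indic 2, Devanagari 3, fullwidth 4 (written as escapes
# so the code points are unambiguous).
samples = ["5", "\u0662", "\u0969", "\uff14"]

def matches_backslash_d(ch: str) -> bool:
    """True if the single character matches \\d (Unicode-aware for str patterns)."""
    return re.fullmatch(r"\d", ch) is not None

def matches_ascii_class(ch: str) -> bool:
    """True if the single character matches the explicit ASCII class [0-9]."""
    return re.fullmatch(r"[0-9]", ch) is not None

if __name__ == "__main__":
    for ch in samples:
        print(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
            f"\\d={matches_backslash_d(ch)}, [0-9]={matches_ascii_class(ch)}"
        )
```

In this setup all four samples match \d, while only the ASCII 5 matches [0-9].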
If you’re building something like form validation and want to make sure the user enters plain old digits (0 to 9 on a regular keyboard), \d might let through stuff you didn’t expect.
That’s why I recommend using [0-9] when you want to be strict and only match ASCII digits. It’s a bit longer, but it’s clear and safe.
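As a rough sketch of what that looks like in validation code, here is a hypothetical five-digit ZIP code check in Python. The names ZIP_STRICT, ZIP_LOOSE, is_valid_zip, and is_valid_zip_loose are invented for this example, and the five-digit format is just an assumption to keep things short.

```python
import re

# Strict pattern: exactly five ASCII digits.
ZIP_STRICT = re.compile(r"[0-9]{5}")
# Loose pattern: \d also accepts non-ASCII decimal digits for str patterns.
ZIP_LOOSE = re.compile(r"\d{5}")

def is_valid_zip(value: str) -> bool:
    """Accept only five ASCII digits, e.g. '90210'."""
    return ZIP_STRICT.fullmatch(value) is not None

def is_valid_zip_loose(value: str) -> bool:
    """Same check written with \\d, shown only for comparison."""
    return ZIP_LOOSE.fullmatch(value) is not None

if __name__ == "__main__":
    fullwidth_zip = "\uff19\uff10\uff12\uff11\uff10"  # fullwidth 9, 0, 2, 1, 0
    for candidate in ["90210", fullwidth_zip, "9021"]:
        print(repr(candidate), is_valid_zip(candidate), is_valid_zip_loose(candidate))
```

The fullwidth input passes the \d version but is rejected by the [0-9] version, which is usually what you want if downstream systems expect ASCII.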
This might sound like a small thing, but I’ve seen real bugs caused by this — especially when different systems handle Unicode in different ways.
Summary
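- In Unicode-aware engines, \d matches any Unicode decimal digit (not just 0–9)
- [0-9] matches only ASCII digits
- If you’re validating input like phone numbers, zip codes, IDs, etc., it’s usually safer to stick with [0-9]
- Several regex engines that support Unicode (Swift, Python 3, .NET, Perl, etc.) treat \d as Unicode by default. Not all do: Java's \d is ASCII-only unless you enable UNICODE_CHARACTER_CLASS, and JavaScript's \d always means [0-9]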
I hope this helps! If you’ve ever run into this or even had it break something in production 😅 feel free to ping me on X/Twitter. I’d love to hear about it.