Replacing ObjectId with a string in JSON using RegEx

Problem: I have a data dump of a MongoDB query in a JSON file. I need to replace ObjectId("12345677abc") with "12345677abc".

Using Visual Studio Code’s find and replace (with the regular expression option turned on)

Find:

ObjectId\("([0-9a-fA-F]{24})"\)

Replace with: "$1"

Turns this

"_id" : ObjectId("5e3b1890e032d225a091d43f"),
"userId" : ObjectId("65ed1c2c-922c-4c82-b5bc-7324f69eea10"),

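To this

"_id" : "5e3b1890e032d225a091d43f",
"userId" : "65ed1c2c-922c-4c82-b5bc-7324f69eea10",

Note: the userId value is a UUID, not a 24-character hexadecimal string, so the strict pattern above skips that line. To catch it as well, loosen the capture group: ObjectId\("([^"]+)"\)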

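If you would rather script the replacement than do it in the editor, the same pattern can be applied with Python's re module. This is only a minimal sketch: the function name strip_object_ids is an illustrative assumption, not part of any MongoDB tooling. Note that Python writes the captured group as \1 where VS Code uses $1.

```python
import re

# Same pattern as the Find box: ObjectId("<24 hexadecimal characters>")
OBJECT_ID = re.compile(r'ObjectId\("([0-9a-fA-F]{24})"\)')


def strip_object_ids(text: str) -> str:
    """Replace every ObjectId("...") wrapper with just the quoted value."""
    # \1 is Python's spelling of the $1 capture group reference.
    return OBJECT_ID.sub(r'"\1"', text)


sample = '"_id" : ObjectId("5e3b1890e032d225a091d43f"),'
converted = strip_object_ids(sample)
print(converted)  # "_id" : "5e3b1890e032d225a091d43f",
```

To process a whole dump you could read the file into a string, pass it through the function, and write the result back out.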

Bonus: the same approach works for ISODate values.

Find:

ISODate\("([^"]+)"\)

Replace with: "$1"
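As a scripted version of the ISODate pattern, here is a small Python sketch. The field name createdAt and the date value are made up for illustration, and strip_iso_dates is a hypothetical helper name.

```python
import re

# [^"]+ means "one or more characters that are not a double quote",
# so the capture group grabs everything between the quotes.
ISO_DATE = re.compile(r'ISODate\("([^"]+)"\)')


def strip_iso_dates(text: str) -> str:
    """Replace every ISODate("...") wrapper with just the quoted value."""
    return ISO_DATE.sub(r'"\1"', text)


# Hypothetical sample line, not taken from a real dump.
line = '"createdAt" : ISODate("2020-02-05T19:31:28.000Z"),'
result = strip_iso_dates(line)
print(result)  # "createdAt" : "2020-02-05T19:31:28.000Z",
```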

REGEX Examples with explanations

A string exactly 24 characters long, containing only hexadecimal letters and numbers

Example use: MongoDB’s unique ids (ObjectId)

The string must be 24 characters long. ^ marks the start of the string or line and $ marks the end of the string or line; which one depends on the “multi-line mode”. The dot matches any single character and {24} repeats it exactly 24 times.

/^.{24}$/

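Hexadecimal uses only the letters “a to f” and the numbers “0 to 9”. The pattern below accepts lowercase letters only; use [a-fA-F0-9] or a case-insensitive search to accept uppercase as well.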

/^[a-f0-9]{24}$/
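To try the pattern outside an editor, here is a minimal Python sketch. The helper name looks_like_object_id is an assumption for illustration. It uses re.fullmatch, which requires the whole string to match and so plays the role of the ^ and $ anchors.

```python
import re

# Exactly 24 lowercase hexadecimal characters.
HEX_24 = re.compile(r"[a-f0-9]{24}")


def looks_like_object_id(value: str) -> bool:
    """Return True when value is exactly 24 lowercase hex characters."""
    # fullmatch only succeeds if the pattern covers the entire string.
    return HEX_24.fullmatch(value) is not None


print(looks_like_object_id("5e3b1890e032d225a091d43f"))              # True
print(looks_like_object_id("65ed1c2c-922c-4c82-b5bc-7324f69eea10"))  # False
print(looks_like_object_id("5E3B1890E032D225A091D43F"))              # False
```

The last call returns False because the character class only lists lowercase letters.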

REGEX 2

Okay, in the previous post we found the “find” tool and realised we can do so much more with it using regular expressions (regEx).

To recap, a regEx is

“a sequence of symbols and characters expressing a string or pattern to be searched for within a longer piece of text.”

\d = a digit, 0 to 9

\w = a “word” character: any letter a to z or A to Z, any digit 0 to 9, or an underscore

\s = a whitespace character (space, tab or line break)

Example:

\d\d\d (using the find tool in your text editor) will highlight groups of 3 digits in a string

\w\w\w\w\w will highlight groups of 5 characters

Notice how \w matches numbers as well as letters

\s\s will highlight double spaces
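If you want to see the same three patterns outside a text editor, here is a small Python sketch. The sample sentence is made up for illustration; re.findall returns every non-overlapping match, which is roughly what the editor highlights.

```python
import re

# Made-up sample text; note the double space after "shipped".
sample = "Order 12345 shipped  to bay 42"

three_digits = re.findall(r"\d\d\d", sample)
five_word_chars = re.findall(r"\w\w\w\w\w", sample)
double_spaces = re.findall(r"\s\s", sample)

print(three_digits)     # ['123']
print(five_word_chars)  # ['Order', '12345', 'shipp']
print(double_spaces)    # ['  ']
```

Only '123' is found by the first pattern because matches do not overlap: after 123 is taken, only two digits (45) are left in that number.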

Let’s look for words that have only 4 characters.

A 4 letter word can be described as,

“a space followed by any 4 characters, followed by a space”

\s\w\w\w\w\s

Which can be rewritten as

\s\w{4}\s

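But that will also include numbers. To ignore numbers, swap \w for a set of letters only

\s[a-z]{4}\s

Not quite there. If you are playing along, you will notice that we are highlighting 4 letter words plus the space before and after. What we need is to set boundaries.

\b[a-z]{4}\b

\b is a word boundary: the position between a word character (\w) and a non-word character such as a space, or the start or end of the text. It matches a position rather than a character, so the spaces are no longer highlighted. With that you can find all four letter words (lowercase only, unless your search is case-insensitive).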
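To compare the space-based and boundary-based approaches side by side, here is a short Python sketch. The sample sentence is made up, and it is all lowercase because [a-z] does not match capital letters.

```python
import re

# Made-up, all-lowercase sample text.
sample = "this code from 2020 will work fine today"

# Spaces on both sides: the spaces are part of each match, and because
# matches cannot overlap, neighbouring words are skipped.
with_spaces = re.findall(r"\s\w{4}\s", sample)

# Word boundaries match positions, not characters, so nothing is skipped.
any_four = re.findall(r"\b\w{4}\b", sample)

# Letters only: the number 2020 is no longer included.
letters_only = re.findall(r"\b[a-z]{4}\b", sample)

print(with_spaces)   # [' code ', ' 2020 ', ' work ']
print(any_four)      # ['this', 'code', 'from', '2020', 'will', 'work', 'fine']
print(letters_only)  # ['this', 'code', 'from', 'will', 'work', 'fine']
```

The first result shows a second problem with the space approach: the space after "code" is used up by that match, so "from" has no leading space left to match against.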