I have written here numerous times about GPT-3, the language model developed by OpenAI, which has produced stunning and sometimes scary results on problems such as text completion, question answering, writing computer code, and generating text-based adventure games. Just search for “GPT-3” in the site search box for links.
Now, “Aarya” has applied GPT-3 to one of the most arcane corners of programming: composing regular expressions to match patterns in text. If you have never encountered regular expressions, you may consider yourself to have lived a privileged life, although gnarly-fingered programmers might deem it too sheltered. Regular expressions pack a lot of power into a few characters, but they can drive you crazy to write and debug. A simple example might be to find words that begin with a vowel and end with “ology”.
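To make that simple example concrete, here is a minimal Python sketch. It is an illustration only, and it assumes a particular reading of the task: whole words, matched case-insensitively, whose first letter is a vowel and whose last five letters are “ology”. The names `VOWEL_OLOGY` and `find_vowel_ology_words` are made up for this example.

```python
import re

# Assumed interpretation: a whole word (delimited by \b word boundaries),
# starting with a vowel, followed by any letters, ending in "ology".
VOWEL_OLOGY = re.compile(r"\b[aeiou][a-z]*ology\b", re.IGNORECASE)


def find_vowel_ology_words(text):
    """Return every word in text that starts with a vowel and ends in 'ology'."""
    return VOWEL_OLOGY.findall(text)


if __name__ == "__main__":
    sample = "Ecology and biology, plus Ontology and astrology."
    print(find_vowel_ology_words(sample))
```

In the sample sentence, “biology” is skipped because the word boundary forces the match to start at the first letter of the word, which is not a vowel.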
Here’s one that validates MasterCard numbers.
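For readers who want something to experiment with, the following Python sketch shows what a MasterCard format check can look like. It is an illustration and not necessarily the pattern referred to above. It assumes 16-digit numbers whose prefix falls in 51 to 55 or 2221 to 2720, it checks the format only (no Luhn checksum), and the names `MASTERCARD` and `looks_like_mastercard` are hypothetical.

```python
import re

# Assumption: 16 digits, beginning with 51-55 or with 2221-2720.
MASTERCARD = re.compile(
    r"(?:"
    r"5[1-5][0-9]{14}"            # 51-55 followed by 14 more digits
    r"|"
    r"2(?:22[1-9]"                # 2221-2229
    r"|2[3-9][0-9]"               # 2230-2299
    r"|[3-6][0-9]{2}"             # 2300-2699
    r"|7[01][0-9]"                # 2700-2719
    r"|720)"                      # 2720
    r"[0-9]{12}"                  # the remaining 12 digits
    r")"
)


def looks_like_mastercard(number):
    """Return True if number has the assumed MasterCard shape (format only)."""
    return MASTERCARD.fullmatch(number) is not None


if __name__ == "__main__":
    for candidate in ("5555555555554444", "4111111111111111", "2721000000000000"):
        print(candidate, looks_like_mastercard(candidate))
```

Even this short check shows why such patterns are tedious to write by hand: a numeric range like 2221 to 2720 has to be spelled out digit position by digit position.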
How about one that matches an HTML tag in a document and its matching closing tag?
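The key ingredient for this kind of pattern is a backreference, which requires the closing tag to repeat whatever name the opening tag captured. The Python sketch below is a simplified illustration of that idea, not necessarily the pattern referred to above; `PAIRED_TAG` and `find_paired_tags` are hypothetical names. It assumes simple, non-nested markup, since a regular expression is not a general HTML parser.

```python
import re

# Group 1 captures the tag name; \1 later demands the same name in the
# closing tag. Group 2 lazily captures the content between the two tags.
PAIRED_TAG = re.compile(
    r"<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>",
    re.IGNORECASE | re.DOTALL,
)


def find_paired_tags(html):
    """Return (tag_name, inner_text) tuples for simple, non-nested tag pairs."""
    return PAIRED_TAG.findall(html)


if __name__ == "__main__":
    print(find_paired_tags('<p class="x">Hello</p>'))
    print(find_paired_tags("<div>text</span>"))
```

The second call finds nothing, because `</span>` does not repeat the name captured from `<div>`.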
The GPT-3 based tool is called AutoRegex, and requires you to create a free account or sign in with Google in order to use it. You can enter an English description of what you want to match, and it generates a regular expression to do that. For example:
words beginning with a vowel and ending with “ology”
valid IPv4 address
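When trying prompts like these, it helps to have a hand-written pattern for comparison. The Python sketch below is one such reference for the IPv4 prompt, written for illustration and not taken from the tool's output. It assumes dotted-decimal notation with four octets from 0 to 255 and no leading zeros; `OCTET`, `IPV4`, and `is_valid_ipv4` are hypothetical names.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

# Four octets separated by literal dots.
IPV4 = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")


def is_valid_ipv4(address):
    """Return True if address is a dotted-decimal IPv4 address (assumed rules)."""
    return IPV4.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in ("192.168.1.1", "256.1.1.1", "1.2.3", "01.2.3.4"):
        print(candidate, is_valid_ipv4(candidate))
```

Running a generated pattern and a reference like this one over the same list of good and bad inputs is a quick way to spot where they disagree.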
You can also give it a regular expression and ask it to explain, in English, what it does. Let’s try the HTML tag matcher from above.
The regular expression matches an opening HTML tag (<), followed by one or more characters ([a-z]+) for the tag name, followed by zero or more characters ([^<]+) that are not “<” characters, followed by a closing angle bracket (>). The regular expression also includes a non-capturing group (?:>) that matches either a closing HTML tag with the same name as the opening tag, or a space followed by a forward slash (/).
As with everything GPT-3 related, it is not perfect, but it is phenomenal. If you get any interesting or enlightening results, please post them here as comments.