Regular expression basics

If you haven’t heard of regular expressions (regex for short) before, it’s worth covering the basics before continuing with this tutorial. Luckily, we’ve got you covered! See an introduction to regular expressions here.

Implement regular expressions in iOS

Now that you know the basics, it’s time to use regular expressions in your applications.

Download the starter project using the Download Materials button at the top or bottom of this tutorial. Open the iRegex starter project in Xcode and run it.

You’ll build a diary app for your boss, a supervillain! Everyone knows supervillains need to keep track of all their evil plans for world domination, right? There’s a lot of planning to be done, and you, as the Minion, are part of those plans – it’s your job to build applications for their plans!

The app’s UI is almost complete, but the core functionality of the app relies on regular expressions, which it doesn’t have yet!

In this tutorial, your job is to add the regular expressions you need to this application to make it shine (and hopefully avoid being thrown into a vat of hot lava).

Here are some example screenshots showing the final product:

The final application will cover two common use cases for regular expressions:

  1. Perform text searches: highlighting and search and replace.
  2. Validate user input.

You’ll start by implementing the most straightforward use of regular expressions: text search.

Implement search and replace

Here’s a basic overview of the app’s search and replace capabilities:

  • The search view controller, SearchViewController, has a read-only UITextView that contains an excerpt from the boss’s private diary.
  • The navigation bar contains a search button that presents SearchOptionsViewController modally.
  • This lets your evil boss type search terms into the text field and tap “Search”.
  • The app then dismisses the search view and highlights all matches from the diary in the text view.
  • If your boss selects the Replace option in SearchOptionsViewController, the app performs a search and replace on all matches in the text instead of highlighting them.

Note: Your app uses the NSAttributedString property of UITextView to highlight the search results. You could also use Text Kit to implement the highlighting. Be sure to check out the Text Kit in Swift tutorial to learn more.

There’s also a reading mode button that highlights all dates, times and separators between each entry in the journal. For the sake of simplicity, you won’t cover every possible format for date and time strings that may appear in the text. You’ll implement this highlighting functionality at the end of this tutorial.

The first step in making the search function work is to convert a standard string representing the regular expression into an NSRegularExpression object.

Open SearchOptionsViewController.swift. SearchViewController presents this view controller modally and lets the user enter search (and optional replacement) terms, as well as specify whether the search should be case-sensitive or match only whole words.

Look at the SearchOptions struct at the top of the file. SearchOptions is a simple struct that encapsulates the user’s search options. The code passes an instance of SearchOptions back to SearchViewController. It would be nice to use it directly to construct the appropriate NSRegularExpression. You can do this by adding a custom initializer to NSRegularExpression in an extension.

Choose File ▸ New ▸ File… and select Swift File. Name the file RegexHelpers.swift. Open the new file and add the following code:

extension NSRegularExpression {
    // Builds a regular expression from the user's search options.
    convenience init?(options: SearchOptions) throws {
        let searchString = options.searchString
        let isCaseSensitive = options.matchCase
        let isWholeWords = options.wholeWords

        // Search case-insensitively unless the user asked to match case.
        let regexOption: NSRegularExpression.Options = isCaseSensitive ? [] : .caseInsensitive

        // Wrap the pattern in word boundaries for a whole-word search.
        let pattern = isWholeWords ? "\\b\(searchString)\\b" : searchString

        try self.init(pattern: pattern, options: regexOption)
    }
}

This code adds a convenience initializer to NSRegularExpression. It uses the properties of the SearchOptions instance passed in to configure the regular expression correctly.

Things to be aware of:

  • When the user requests a case-insensitive search, the regular expression uses the .caseInsensitive value of NSRegularExpression.Options. The default behavior of NSRegularExpression is to perform case-sensitive searches, but in this case you are using the more user-friendly default of case-insensitive searches.
  • If the user requests a whole-word search, the app wraps the regular expression pattern in \b. Recall that \b matches a word boundary, so placing \b before and after the search pattern turns it into a whole-word search (that is, the pattern \bcat\b matches only the word “cat”, not “catch”).

If the NSRegularExpression can’t be created for some reason, for example because the pattern is invalid, the initializer throws an error, and a caller that uses try? gets nil. Now that you have the NSRegularExpression object, you can use it to match text.
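To see what the case and whole-word options change in practice, here is a minimal sketch you could paste into a playground. The countMatches helper and the sample sentence are made up for illustration and are not part of the starter project.

```swift
import Foundation

// Hypothetical helper (not in the starter project): counts the matches of a pattern.
func countMatches(of pattern: String, in text: String, caseInsensitive: Bool = true) -> Int {
    let options: NSRegularExpression.Options = caseInsensitive ? [.caseInsensitive] : []
    // An invalid pattern makes the initializer throw; try? turns that into nil.
    guard let regex = try? NSRegularExpression(pattern: pattern, options: options) else {
        return 0
    }
    let range = NSRange(text.startIndex..., in: text)
    return regex.numberOfMatches(in: text, options: [], range: range)
}

let sample = "The cat tried to catch another Cat."

// Plain search: "cat", the "cat" inside "catch", and "Cat".
let plainCount = countMatches(of: "cat", in: sample)
// Whole-word search: "cat" and "Cat" only.
let wholeWordCount = countMatches(of: "\\bcat\\b", in: sample)
// Whole-word, case-sensitive search: only the lowercase "cat".
let caseSensitiveCount = countMatches(of: "\\bcat\\b", in: sample, caseInsensitive: false)

print(plainCount, wholeWordCount, caseSensitiveCount)
```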

Open SearchViewController.swift, find searchForText(_:replaceWith:inTextView:), and add the following implementation to the empty method:

// Search for and replace text
func searchForText(_ searchText: String, replaceWith replacementText: String, inTextView textView: UITextView) {
    if let beforeText = textView.text, let searchOptions = self.searchOptions {
        // Cover the entire string with the search range.
        let range = NSRange(beforeText.startIndex..., in: beforeText)

        if let regex = try? NSRegularExpression(options: searchOptions) {
            // In Swift 5, try? flattens the optional, so regex is non-optional here.
            let afterText = regex.stringByReplacingMatches(in: beforeText,
                                                           options: [],
                                                           range: range,
                                                           withTemplate: replacementText)
            textView.text = afterText
        }
    }
}

First, the method captures the current text in the UITextView and calculates the range of the entire string. Regular expressions can be applied to only part of the text, which is why you need to specify a range. In this case, you use the entire string, which causes the regular expression to be applied to all of the text.

The real magic happens in the call to stringByReplacingMatches(in:options:range:withTemplate:). This method returns a new string without changing the old string. The method then sets the new string on the UITextView so that the user can see the result.
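As a rough illustration of how the range argument limits a replacement, the sketch below runs the same regular expression over the whole string and over only its first sentence. The sample text and variable names are invented for this example.

```swift
import Foundation

// Illustrative sample text, not taken from the starter project.
let diary = "Plan A: steal the moon. Plan B: steal the sun."
let stealRegex = try! NSRegularExpression(pattern: "steal", options: [])

// Full range: every match in the string is replaced.
let fullRange = NSRange(diary.startIndex..., in: diary)
let allReplaced = stealRegex.stringByReplacingMatches(in: diary,
                                                      options: [],
                                                      range: fullRange,
                                                      withTemplate: "borrow")

// Partial range: only matches up to the first period are replaced.
let firstSentenceEnd = diary.firstIndex(of: ".")!
let partialRange = NSRange(diary.startIndex...firstSentenceEnd, in: diary)
let firstReplaced = stealRegex.stringByReplacingMatches(in: diary,
                                                        options: [],
                                                        range: partialRange,
                                                        withTemplate: "borrow")

print(allReplaced)
print(firstReplaced)
print(diary) // the original string is left untouched
```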

Still in SearchViewController, find highlightText(_:inTextView:) and add the following:

// Highlight every match of the search text
func highlightText(_ searchText: String, inTextView textView: UITextView) {
    // Take a mutable copy of the text view's attributed text
    let attributedText = textView.attributedText.mutableCopy() as! NSMutableAttributedString
    // Remove any existing highlights across the whole text
    let attributedTextRange = NSMakeRange(0, attributedText.length)
    attributedText.removeAttribute(NSAttributedString.Key.backgroundColor, range: attributedTextRange)
    // Build the regular expression from the search options and find all matches
    if let searchOptions = self.searchOptions,
       let regex = try? NSRegularExpression(options: searchOptions) {
        let range = NSRange(textView.text.startIndex..., in: textView.text)
        // In Swift 5, try? flattens the optional, so regex is non-optional here.
        let matches = regex.matches(in: textView.text, options: [], range: range)
        // Give each match a yellow background
        for match in matches {
            let matchRange = match.range
            attributedText.addAttribute(
                NSAttributedString.Key.backgroundColor,
                value: UIColor.yellow,
                range: matchRange
            )
        }
    }
    // Assign an immutable copy back to the text view
    textView.attributedText = (attributedText.copy() as! NSAttributedString)
}
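If you want to inspect what matches(in:options:range:) returns without running the app, a sketch along these lines may help. It uses an invented sentence and converts each NSRange back into a Swift string range; none of these names come from the starter project.

```swift
import Foundation

// Illustrative sample text for a whole-word, case-insensitive search for "the".
let entry = "Then the boss hid the plans in the lair."
let theRegex = try! NSRegularExpression(pattern: "\\bthe\\b", options: [.caseInsensitive])
let entryRange = NSRange(entry.startIndex..., in: entry)

// Each NSTextCheckingResult carries an NSRange measured in UTF-16 code units.
let theMatches = theRegex.matches(in: entry, options: [], range: entryRange)

// Convert each NSRange back to a Swift string range to read the matched text.
let matchedWords = theMatches.compactMap { match -> String? in
    guard let range = Range(match.range, in: entry) else { return nil }
    return String(entry[range])
}
// "Then" is skipped because of the word boundaries.
let matchLocations = theMatches.map { $0.range.location }

print(matchedWords, matchLocations)
```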

Build and run your application. Try searching for various words and word groups! You’ll see the search terms highlighted throughout the text, as shown in the image below:

Try searching for the word “the” using various options and see how it works. For example, note that when whole word search is turned on, the “the” in “then” is not highlighted.

Also, test the search and replace functionality to see if text strings are replaced as expected. Also try the Match Case and Whole Word options.

Highlighting and replacing text are both great. But how else can you use regular expressions effectively in your applications?

Data validation

Many applications have user input, such as a user entering an email address or phone number. You need to perform some level of data validation on this user input to ensure data integrity and to respond to any errors when users enter data.

Regular expressions are great for many kinds of data validation because they are excellent at pattern matching.

There are two things you need to add to your application: the validation patterns themselves and a mechanism to use those patterns to validate user input.

As an exercise, try coming up with a regular expression to validate the following text string (don’t worry about case sensitivity):

  • First name: Should consist of standard English letters and be between 1 and 10 characters in length.
  • Middle initial: Should consist of one English letter.
  • Last name: Should consist of standard English letters plus the apostrophe (for names such as O’Brian) and the hyphen (for names such as Randell-Nash), and be between 2 and 20 characters in length.
  • Supervillain name: Should consist of standard English letters, apostrophes, periods, hyphens, digits and spaces, and be between 2 and 20 characters in length. This allows for names such as Ra’s al Ghul, Two-Face and Mr. Freeze.
  • Password: At least 8 characters, including 1 uppercase character, 1 lowercase character, 1 number and 1 character that is neither a letter nor a number. This one is tricky!

Of course, you can use the included iRegex playground to try out your expressions as you develop them.

How did you do with coming up with the regular expressions you needed? If you got stuck, go back to the introduction linked at the top of this tutorial and look for the pieces that help with the scenarios above.

The spoiler below shows the regular expressions you’ll use. But before reading further, try to figure them out yourself and check your results!

  "^[a-z]{1,10}$",    // First name
  "^[a-z]$",          // Middle initial
  "^[a-z'\\-]{2,20}$",  // Last name
  "^[a-z0-9'.\\-\\s]{2,20}$",  // Supervillain name
  "^(?=\\P{Ll}*\\p{Ll})(?=\\P{Lu}*\\p{Lu})(?=\\P{N}*\\p{N})(?=[\\p{L}\\p{N}]*[^\\p{L}\\p{N}])[\\s\\S]{8,}$" // Password validator

Open AccountViewController.swift and add the following code to viewDidLoad():

override func viewDidLoad() {
    super.viewDidLoad()

    // The order of the text fields must match the order of the patterns below.
    textFields = [
        firstNameField,
        middleInitialField,
        lastNameField,
        superVillianNameField,
        passwordField
    ]

    let patterns = [
        "^[a-z]{1,10}$",              // First name
        "^[a-z]$",                    // Middle initial
        "^[a-z'\\-]{2,20}$",          // Last name
        "^[a-z0-9'.\\-\\s]{2,20}$",   // Supervillain name
        "^(?=\\P{Ll}*\\p{Ll})(?=\\P{Lu}*\\p{Lu})(?=\\P{N}*\\p{N})(?=[\\p{L}\\p{N}]*[^\\p{L}\\p{N}])[\\s\\S]{8,}$" // Password
    ]

    // Compile one regular expression per pattern.
    regexes = patterns.map { pattern -> NSRegularExpression? in
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
            return regex
        } catch {
            // Crash early in the simulator; fail quietly on a device.
            #if targetEnvironment(simulator)
            fatalError("Error initializing regular expressions. Exiting.")
            #else
            return nil
            #endif
        }
    }
}

This creates an array of text fields and an array of string patterns in the view controller. It then uses Swift’s map function to create an array of NSRegularExpression objects, one for each pattern. If creating a regular expression from a pattern fails, you hit a fatalError in the simulator so you can catch it quickly while developing the app, but the failure is ignored in production because you don’t want the app to crash for your users!

To create the regular expression that validates a first name, start by matching from the beginning of the string. You then match a run of characters from a to z, and finally match the end of the string, making sure the run is between 1 and 10 characters long.

The next two patterns, middle initial and last name, follow the same logic. For the middle initial, you don’t need to specify the length with {1}, since ^[a-z]$ matches exactly one character by default. The supervillain name pattern is similar, but starts to look a bit more complicated with the added support for special characters: apostrophes, hyphens and periods.

Note that you don’t have to worry about case sensitivity here – you’ll handle that when instantiating the regex.
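The name patterns can be tried out with a sketch like the following. The isValid helper and the sample names are assumptions made for this example; the patterns themselves are the ones listed above, compiled with .caseInsensitive as in the app.

```swift
import Foundation

// Hypothetical helper: true when the anchored pattern matches the input.
func isValid(_ input: String, pattern: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: .caseInsensitive) else {
        return false
    }
    let range = NSRange(input.startIndex..., in: input)
    return regex.firstMatch(in: input, options: [], range: range) != nil
}

let firstNamePattern = "^[a-z]{1,10}$"
let lastNamePattern = "^[a-z'\\-]{2,20}$"
let villainNamePattern = "^[a-z0-9'.\\-\\s]{2,20}$"

let shortFirstName = isValid("Victor", pattern: firstNamePattern)        // 6 letters
let longFirstName = isValid("Maximilianus", pattern: firstNamePattern)   // 12 letters, too long
let apostropheName = isValid("O'Brian", pattern: lastNamePattern)        // apostrophe allowed
let hyphenName = isValid("Randell-Nash", pattern: lastNamePattern)       // hyphen allowed
let freeze = isValid("Mr. Freeze", pattern: villainNamePattern)          // period and space allowed
let tooShortVillain = isValid("X", pattern: villainNamePattern)          // needs at least 2 characters

print(shortFirstName, longFirstName, apostropheName, hyphenName, freeze, tooShortVillain)
```

Note that the patterns expect the plain ASCII apostrophe ('), so a name typed with a typographic apostrophe would be rejected.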

Now, what about the password regular expression? It’s important to stress that this is just an exercise to show how regular expressions can be used; you really shouldn’t use it in a real-world application!

Having said that, how does it actually work? First, review some regular expression theory:

  • ( ) Parentheses define a capturing group, which groups parts of a regular expression together.
  • When a group begins with ?=, it acts as a positive lookahead: the pattern before the group matches only if it’s followed by the pattern inside the group. For example, A(?=B) matches the letter A, but only if it’s followed by the letter B. A lookahead is an assertion, similar to ^ or $, that checks for a specific pattern in the string but doesn’t consume any characters itself.
  • \p{} matches Unicode characters within a certain category, and \P{} matches Unicode characters that don’t belong to that category. For example, the category could be all letters (\p{L}), all lowercase letters (\p{Ll}) or numbers (\p{N}).
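The "checks but does not consume" behavior of a lookahead is easy to observe in a small sketch such as this one. The sample string is invented; the pattern is the A(?=B) example from the list above.

```swift
import Foundation

// Illustrative only: the sample text is made up for this sketch.
let lookahead = try! NSRegularExpression(pattern: "A(?=B)", options: [])
let letters = "AB AC AB"
let lettersRange = NSRange(letters.startIndex..., in: letters)
let lookaheadMatches = lookahead.matches(in: letters, options: [], range: lettersRange)

// Only the A's at offsets 0 and 6 are followed by B; the A in "AC" is skipped.
let lookaheadLocations = lookaheadMatches.map { $0.range.location }
// Each match is one character long: the B is checked but not consumed.
let lookaheadLengths = lookaheadMatches.map { $0.range.length }

print(lookaheadLocations, lookaheadLengths)
```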

Using this knowledge, break down the regular expression itself:

  • ^ and $ match the beginning and end of the string, as usual.
  • (?=\P{Ll}*\p{Ll}) matches (but does not consume) any number of non-lowercase Unicode characters followed by a lowercase Unicode character, in effect matching a string that contains at least one lowercase character.
  • (?=\P{Lu}*\p{Lu}) follows a similar pattern to the one above, but ensures there is at least one uppercase character.
  • (?=\P{N}*\p{N}) ensures there is at least one digit.
  • (?=[\p{L}\p{N}]*[^\p{L}\p{N}]) ensures there is at least one character that is not a letter or a digit, using ^ inside the brackets to negate the character class.
  • Finally, [\s\S]{8,} matches any character eight or more times by matching either whitespace or non-whitespace characters.

Well done!

You can get creative with regular expressions. There are other ways to solve the problem above, such as using \d instead of [0-9]. However, any solution is perfectly fine as long as it works!
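One way to convince yourself that the password pattern behaves as described is to run it against a few candidates, as in the sketch below. The isAcceptablePassword helper and the sample passwords are invented for illustration. As an assumption of this sketch, the pattern is compiled with no options so that the uppercase and lowercase categories are compared case-sensitively, which differs from the app, where .caseInsensitive is passed.

```swift
import Foundation

let passwordPattern = "^(?=\\P{Ll}*\\p{Ll})(?=\\P{Lu}*\\p{Lu})(?=\\P{N}*\\p{N})(?=[\\p{L}\\p{N}]*[^\\p{L}\\p{N}])[\\s\\S]{8,}$"
// Assumption: no options, so \p{Lu} and \p{Ll} are checked case-sensitively.
let passwordRegex = try! NSRegularExpression(pattern: passwordPattern, options: [])

// Hypothetical helper: true when the candidate satisfies every lookahead and the length rule.
func isAcceptablePassword(_ candidate: String) -> Bool {
    let range = NSRange(candidate.startIndex..., in: candidate)
    return passwordRegex.firstMatch(in: candidate, options: [], range: range) != nil
}

let passwordResults = [
    "Evil#Plan42",  // lowercase, uppercase, digit and symbol, 11 characters
    "evil#plan42",  // no uppercase letter
    "EvilPlan42",   // no character outside letters and digits
    "Ev#4"          // all four kinds, but shorter than 8 characters
].map(isAcceptablePassword)

print(passwordResults)
```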

Now that you have the patterns, you need to validate the text entered in each text field.

Still in AccountViewController.swift, find validate(string:withRegex:) and replace the dummy implementation with the following:

func validate(string: String, withRegex regex: NSRegularExpression) -> Bool {
    // Search the entire string.
    let range = NSRange(string.startIndex..., in: string)
    // No matching options are needed for a single lookup.
    let matchRange = regex.rangeOfFirstMatch(in: string, options: [], range: range)
    // A location of NSNotFound means there was no match.
    return matchRange.location != NSNotFound
}

Then, just below it, add the following implementation of validateTextField(_:):

func validateTextField(_ textField: UITextField) {
    // Look up the regular expression that belongs to this text field.
    guard let index = textFields.firstIndex(of: textField),
          let regex = regexes[index] else { return }
    if let text = textField.text?.trimmingCharacters(in: .whitespacesAndNewlines) {
        let valid = validate(string: text, withRegex: regex)

        // Color the text according to the validation result.
        textField.textColor = (valid) ? .trueColor : .falseColor
    }
}

This is very similar to what you did in SearchViewController.swift. Starting in validateTextField(_:), you get the relevant regular expression from the regexes array and trim any whitespace from the text the user entered in the text field.

Then, in validate(string:withRegex:), you create a range for the entire text and check for a match by testing the result of rangeOfFirstMatch(in:options:range:). This is probably the most efficient way to check for a match, since the call exits early when it finds the first match. However, if you need to know the total number of matches, there are other options, such as numberOfMatches(in:options:range:).
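The difference between the two calls can be sketched as follows, using an invented string and a simple digit pattern rather than anything from the project.

```swift
import Foundation

// Illustrative sample: find runs of digits.
let digitRegex = try! NSRegularExpression(pattern: "\\d+", options: [])
let inventory = "Minions: 12, lairs: 3, lava vats: 7"
let inventoryRange = NSRange(inventory.startIndex..., in: inventory)

// rangeOfFirstMatch stops at the first match ("12" at offset 9).
let firstNumber = digitRegex.rangeOfFirstMatch(in: inventory, options: [], range: inventoryRange)
let hasNumber = firstNumber.location != NSNotFound

// numberOfMatches scans the whole range and counts every match.
let numberCount = digitRegex.numberOfMatches(in: inventory, options: [], range: inventoryRange)

// With no match at all, the returned location is NSNotFound.
let plainText = "No numbers here"
let missing = digitRegex.rangeOfFirstMatch(in: plainText,
                                           options: [],
                                           range: NSRange(plainText.startIndex..., in: plainText))

print(hasNumber, firstNumber.location, firstNumber.length, numberCount, missing.location == NSNotFound)
```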

Finally, replace the dummy implementation in allTextFieldsAreValid() with:

func allTextFieldsAreValid() -> Bool {
    for (index, textField) in textFields.enumerated() {
        if let regex = regexes[index] {
            if let text = textField.text?.trimmingCharacters(in: .whitespacesAndNewlines) {
                let valid = text.isEmpty || validate(string: text, withRegex: regex)

                if !valid {
                    return false
                }
            }
        }
    }
    return true
}

Using the same validate(string:withRegex:) method as above, this method simply tests whether every non-empty text field is valid.

Run the project, tap the Account icon button in the upper-left corner and try entering some information into the sign-up form. As you complete each field, you should see its text turn green or red, depending on whether it’s valid, as shown in the screenshot below:

Try saving your account. Note that you can only do this when all the text fields validate correctly. Restart the app. This time, when the app launches, you’ll see a sign-in form before you can see the secret plans in the diary. Enter the password you just created and tap Sign In.

Note: This is a tutorial on regular expressions, not authentication! Do not use the code in this tutorial as an example of authentication best practices. To hammer the point home, the password is stored on the device in plain text. The loginAction in LoginViewController only checks the password stored on the device, not one stored securely on a server. This is not secure in any way.

Handle multiple search results

You have not used the reading mode button on the navigation bar. When the user clicks on it, the app should enter “focused” mode, highlighting any date or time strings in the text, and highlighting the end of each journal entry.

Open SearchViewController.swift in Xcode and find the following implementation for the reading mode button item:

@IBAction func toggleReadingMode(_ sender: AnyObject) {
    if !self.readingModeEnabled {
        readingModeEnabled = true
        decorateAllDatesWith(.underlining)
        decorateAllTimesWith(.underlining)
        decorateAllSplittersWith(.underlining)
    } else {
        readingModeEnabled = false
        decorateAllDatesWith(.noDecoration)
        decorateAllTimesWith(.noDecoration)
        decorateAllSplittersWith(.noDecoration)
    }
}

The above method calls three other helper methods to decorate the date, time, and journal entry separators in the text. Each method takes a decoration option that underlines the text or sets no decoration (removes the underlining). If you look at the implementation of each helper method above, you’ll see that they are all empty!

Before worrying about implementing the decoration methods, you should define and create the NSRegularExpressions themselves. A convenient way to do this is to create static variables on NSRegularExpression. Switch to RegexHelpers.swift and add the following placeholders inside the NSRegularExpression extension:

static var regularExpressionForDates: NSRegularExpression? {
    let pattern = ""
    return try? NSRegularExpression(pattern: pattern, options: .caseInsensitive)
}

static var regularExpressionForTimes: NSRegularExpression? {
    let pattern = ""
    return try? NSRegularExpression(pattern: pattern, options: .caseInsensitive)
}

static var regularExpressionForSplitter: NSRegularExpression? {
    let pattern = ""
    return try? NSRegularExpression(pattern: pattern, options: .caseInsensitive)
}

Now, your job is to complete these patterns! The following are the requirements:

Date requirements:

  • xx/xx/xx or xx.xx.xx or xx-xx-xx format. The order of the day, month and year doesn’t matter, since the code only highlights them. Example: 5/10/12.
  • A full or abbreviated month name (such as Jan or January, Feb or February, etc.), followed by a one- or two-digit number (such as x or xx). The day of the month can be an ordinal (such as 1st, 2nd, 10th, 21st, etc.), followed by a comma as a separator and then a four-digit number (such as xxxx). There can be zero or more spaces between the month name, the day and the year. Example: March 13th, 2001

Time requirements:

  • Find a simple time, such as “9am” or “11pm”: one or two digits followed by zero or more spaces, followed by lowercase “am” or “pm”.

Splitter requirements:

  • A sequence of tilde (~) characters, at least 10 characters in length.

You can use the Playground to try these out. See if you can figure out the regular expression you need!

Here are three sample patterns you can try. Replace the empty pattern of regularExpressionForDates with the following:

(\\d{1,2}[-/.]\\d{1,2}[-/.]\\d{1,2})|((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)((r)?uary|(tem|o|em)?ber|ch|il|e|y|)?)\\s*(\\d{1,2}(st|nd|rd|th)?+)?[,]\\s*\\d{4}

The pattern has two parts separated by the | (or) character. This means either the first part or the second part will match.

The first part is (\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2}). This means one or two digits followed by one of -, / or ., then one or two more digits, then another -, / or ., and finally one or two last digits.

The second part starts with ((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)((r)?uary|(tem|o|em)?ber|ch|il|e|y|)?), which matches a full or abbreviated month name.

Next is \\s*\\d{1,2}(st|nd|rd|th)?, which matches zero or more spaces, followed by one or two digits, followed by an optional ordinal suffix. For example, this matches both “1” and “1st”.

Finally, [,]\\s*\\d{4} matches a comma, followed by zero or more spaces, followed by a four-digit year.

This is a really intimidating regular expression! Still, you can see how regular expressions are concise, packing a lot of information (and power!) into a seemingly cryptic string.
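To check the date pattern against both formats, you could run a sketch like this in a playground. The matchedStrings helper and the sample sentence are assumptions for this example; the pattern is the one above, split into three string constants only for readability.

```swift
import Foundation

// Hypothetical helper: returns the text of every match of the pattern.
func matchedStrings(for pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: .caseInsensitive) else {
        return []
    }
    let range = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: range).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

// The same date pattern as above, assembled from its parts.
let numericDate = "(\\d{1,2}[-/.]\\d{1,2}[-/.]\\d{1,2})"
let monthName = "((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)((r)?uary|(tem|o|em)?ber|ch|il|e|y|)?)"
let dayAndYear = "\\s*(\\d{1,2}(st|nd|rd|th)?+)?[,]\\s*\\d{4}"
let datePattern = numericDate + "|" + monthName + dayAndYear

let note = "Met the boss on 5/10/12, then again on March 13th, 2001."
let foundDates = matchedStrings(for: datePattern, in: note)

print(foundDates)
```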

Next are the patterns for regularExpressionForTimes and regularExpressionForSplitter. Fill in the empty patterns with the following:

// Times
\\d{1,2}\\s*(pm|am)

// Splitters
~{10,}

As an exercise, see if you can interpret the regular expression pattern according to the above specifications.

Finally, open SearchViewController.swift and fill in the implementation of each decoration method in SearchViewController as follows:

func decorateAllDatesWith(_ decoration: Decoration) {
    if let regex = NSRegularExpression.regularExpressionForDates {
        let matches = matchesForRegularExpression(regex, inTextView: textView)
        switch decoration {
        case .underlining:
            highlightMatches(matches)
        case .noDecoration:
            removeHighlightedMatches(matches)
        }
    }
}

func decorateAllTimesWith(_ decoration: Decoration) {
    if let regex = NSRegularExpression.regularExpressionForTimes {
        let matches = matchesForRegularExpression(regex, inTextView: textView)
        switch decoration {
        case .underlining:
            highlightMatches(matches)
        case .noDecoration:
            removeHighlightedMatches(matches)
        }
    }
}

func decorateAllSplittersWith(_ decoration: Decoration) {
    if let regex = NSRegularExpression.regularExpressionForSplitter {
        let matches = matchesForRegularExpression(regex, inTextView: textView)
        switch decoration {
        case .underlining:
            highlightMatches(matches)
        case .noDecoration:
            removeHighlightedMatches(matches)
        }
    }

}

Each of these methods uses one of the static variables on NSRegularExpression to create the appropriate regular expression. They then find the matches and call highlightMatches(_:) to color and underline each string in the text, or removeHighlightedMatches(_:) to revert the style changes. If you’re interested in how they work, take a look at their implementations.

Build and run the application. Now, tap on the reading mode icon. You should see the link style highlighting for the date, time, and separator like this:

Click the button again to disable reading mode and return the text to its normal style.

While this example is great, can you see why the regular expression for times might not suit a more general search? As it stands, it won’t match 3:15pm in full, but it will match 28pm.
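The limitation is easy to reproduce with the simple time pattern. In the sketch below, the timeMatches helper and the sample strings are made up for illustration.

```swift
import Foundation

// The simple time pattern from this tutorial.
let simpleTimeRegex = try! NSRegularExpression(pattern: "\\d{1,2}\\s*(pm|am)", options: .caseInsensitive)

// Hypothetical helper: returns the text of every time-like match.
func timeMatches(in text: String) -> [String] {
    let range = NSRange(text.startIndex..., in: text)
    return simpleTimeRegex.matches(in: text, options: [], range: range).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

let nineAm = timeMatches(in: "Meet at 9am")          // a simple time is found as expected
let quarterPast = timeMatches(in: "Meet at 3:15pm")  // only the "15pm" part is picked up
let impossible = timeMatches(in: "Meet at 28pm")     // an impossible hour still matches

print(nineAm, quarterPast, impossible)
```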

Here’s a challenging problem! Work out how to rewrite the time regular expression so that it matches a more general time format.

Specifically, your answer should match times in the standard 12-hour format ab:cd am/pm. So it should match 11:45 am, 10:33 pm and 04:12 am, but not 2pm, 0:00am, 18:44am, 9:63pm or 7:4am. There should be at most one space before am/pm. By the way, it’s acceptable if it matches the 4:33am in 14:33am.

A possible answer is shown below, but try it yourself first. Check the end of the included Playground to see it in action.

"(1[0-2]|0?[1-9]):([0-5][0-9]\\s?(am|pm))"
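A sketch like the following can be used to try the pattern against the sample times from the challenge. The firstTime helper is hypothetical, and the pattern is compiled with .caseInsensitive as an assumption. Because the pattern is not anchored, it finds 8:44am inside 18:44am in the same way that it finds 4:33am inside 14:33am.

```swift
import Foundation

let twelveHourPattern = "(1[0-2]|0?[1-9]):([0-5][0-9]\\s?(am|pm))"
let twelveHourRegex = try! NSRegularExpression(pattern: twelveHourPattern, options: .caseInsensitive)

// Hypothetical helper: returns the first matched time in the text, if any.
func firstTime(in text: String) -> String? {
    let range = NSRange(text.startIndex..., in: text)
    guard let match = twelveHourRegex.firstMatch(in: text, options: [], range: range),
          let matchRange = Range(match.range, in: text) else {
        return nil
    }
    return String(text[matchRange])
}

// Times in the ab:cd am/pm format are matched in full.
let accepted = ["11:45 am", "10:33 pm", "04:12 am"].map { firstTime(in: $0) }
// Missing minutes, a zero hour, out-of-range minutes and single-digit minutes are rejected.
let rejected = ["2pm", "0:00am", "9:63pm", "7:4am"].map { firstTime(in: $0) }
// Out-of-range hours still contain a valid time after the leading digit.
let partial = ["14:33am", "18:44am"].map { firstTime(in: $0) }

print(accepted, rejected, partial)
```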

Where to go from here?

Congratulations! You now have some practical experience using regular expressions.

You can download a full version of this project using the “Download Materials” button at the top or bottom of this tutorial.

Regular expressions are powerful and fun to use; they’re a lot like solving math problems. The flexibility of regular expressions gives you many ways to create a pattern that fits your needs, such as filtering whitespace from input strings, stripping out HTML or XML tags before parsing, or finding particular XML or HTML tags, and much more!

More exercises

There are many practical examples of strings that you can validate with regular expressions. As a final exercise, try to untangle the following regular expression, which validates email addresses:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

At first glance, it may look like a jumbled mess of characters, but with your newfound knowledge (and the helpful links below), you’re one step closer to understanding it and becoming a regular expression master!
