String segmentation: Word Break Problem

Hacking the coding interviews at FAANG!

Vishal Bhushan
3 min readJan 16, 2021

Here’s a question based on Strings that is commonly asked during the coding interviews at FAANG and other top tech companies.

Problem Statement: We are given a dictionary of words and a large input string. We have to find out whether the input string can be completely segmented into the words of a given dictionary.

Solution

Algorithm:

We can solve this problem by segmenting the given string at each possible index, hence dividing the string into two substrings: first_word and second_word.

If the first_word exists in the given dictionary then two cases arise:

  1. if the second_word also exists in the dictionary or its length is 0 then return true —i.e. the string can be segmented
  2. if condition 1 doesn’t hold true then call the method recursively with second_word as input to check if it can be further segmented

At the end of the function return false (no condition satisfies)

Below is a visual representation of the algorithm:

String Segmentation: Initial Setup
String Segmentation: Loop Counter i=0

We move on with the iteration until we find a match in the dictionary.

String Segmentation: Loop Counter i=3
String Segmentation: secondword check
String Segmentation: Recursion Level 1: i=0
String Segmentation: Recursion Level 1: i=1
String Segmentation: Recursion Level 1: secondword check
String Segmentation: Recursion Level 2
String Segmentation: Recursion Level 1: i=2
String Segmentation: Recursion Level 1: i=2
String Segmentation: Level0: Loop Counter i=4
String Segmentation: secondword check

Code(JavaScript)

let canSegmentString = function(s, dictionary) {for (let i = 1; i < s.length + 1; i++) {let first = s.substr(0, i);if (dictionary.has(first)) {let second = s.substr(i);if (second.length === 0) {return true;}if (dictionary.has(second)) {return true;}if (canSegmentString(second, dictionary)) {return true;}}}return false;};

Runtime complexity of this solution is exponential, O(2^n), if we use only recursion.

With memoization, the runtime complexity of this solution can be improved to be polynomial, O(n²).

You can see that we may be computing the same substring multiple times, even if it doesn’t exist in the dictionary. This redundancy can be fixed by memoization, where we remember which substrings have already been solved.

To achieve memoization, we can store the second string in a new set each time. This will reduce both time and memory complexities.

Memory complexity

The memory complexity of this solution is polynomial, O(n²) because we create a substring on each recursion call. Creating a substring can be avoided if we pass indices.

--

--