# 72. Edit Distance
LeetCode problem link (opens new window)
Given two words word1 and word2, calculate the minimum number of operations required to convert word1 to word2.
You can perform the following operations on a word:
Insert a character
Delete a character
Replace a character
Example 1:
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: horse -> rorse (replace 'h' with 'r') rorse -> rose (remove 'r') rose -> ros (remove 'e')
Example 2:
Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: intention -> inention (remove 't') inention -> enention (replace 'i' with 'e') enention -> exention (replace 'n' with 'x') exention -> exection (replace 'n' with 'c') exection -> execution (insert 'u')
Constraints:
- 0 <= word1.length, word2.length <= 500
- word1 and word2 consist of lowercase English letters
# Approach
The edit distance problem has finally arrived, and this problem can seem extremely complex without an understanding of dynamic programming.
Edit distance is a classic problem solved using dynamic programming. Though this problem looks complex at first, dynamic programming elegantly computes the minimal edit distance.
Here I'll analyze this problem thoroughly using the five steps of dynamic programming:
# 1. Define the dp array (dp table) and the meaning of the indices
dp[i][j] indicates the minimum edit distance between the substring of word1 ending at index i-1 and the substring of word2 ending at index j-1.
Some might ask why this indicates the substring ending at i-1 rather than i.
The reason for defining it this way is explained in detail in 0718.Maximum Length of Repeated Subarray (opens new window).
You could use i! Opting for i-1 simplifies subsequent initialization of the dp array.
# 2. Determine the recurrence formula
When determining the recurrence formula, first consider all possible editing operations, summarized below:
if (word1[i - 1] == word2[j - 1])
No operation
if (word1[i - 1] != word2[j - 1])
Insert
Delete
Replace
2
3
4
5
6
These are the four cases.
if (word1[i - 1] == word2[j - 1]) indicates no editing is needed, so dp[i][j] should equal dp[i - 1][j - 1], i.e., dp[i][j] = dp[i - 1][j - 1];.
Some may not understand right away why dp[i][j] = dp[i - 1][j - 1].
Recall the definition of dp[i][j]. If word1[i - 1] equals word2[j - 1], no editing is needed, and the minimum edit distance between the substring of word1 ending at i-2 and the substring of word2 ending at j-2, which is dp[i - 1][j - 1], is dp[i][j].
If some explanations are unclear, referring back to the definition of dp[i][j] can clarify it.
The key to successfully applying dynamic programming here is a correct understanding of the definition of dp[i][j].
if (word1[i - 1] != word2[j - 1]), editing is necessary. How can you edit it?
Operation 1: Deleting a character from
word1makes it align with a substring ofword2up toj-2, requiring an additional operation. Thus,dp[i][j] = dp[i - 1][j] + 1;.Operation 2: Deleting a character from
word2aligns it with a substring ofword1up toi-2, also requiring an additional operation, sodp[i][j] = dp[i][j - 1] + 1;.
Some might notice these are deletion operations, what about insertion?
Inserting a character in word2 is equivalent to deleting a character in word1. For instance, word1 = "ad", word2 = "a". Removing 'd' in word1 to transform it into a and adding 'd' to word2 to transform it into ad are essentially requiring the same number of operations! The dp array is illustrated as follows:
a a d
+-----+-----+ +-----+-----+-----+
| 0 | 1 | | 0 | 1 | 2 |
+-----+-----+ ===> +-----+-----+-----+
a | 1 | 0 | a | 1 | 0 | 1 |
+-----+-----+ +-----+-----+-----+
d | 2 | 1 |
+-----+-----+
2
3
4
5
6
7
8
Operation 3: Substituting a character: replace word1[i - 1] to match word2[j - 1] with no need for insertion or deletion.
You only need one substitution operation to make word1[i - 1] and word2[j - 1] identical. Hence, dp[i][j] = dp[i - 1][j - 1] + 1;.
In summary, when if (word1[i - 1] != word2[j - 1]), take the minimum of these: dp[i][j] = min({dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1]}) + 1;.
The recursion formula in code:
if (word1[i - 1] == word2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1];
}
else {
dp[i][j] = min({dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1]}) + 1;
}
2
3
4
5
6