Problem Statement

Given two strings, calculate the minimum number of edits required to transform the first string into the second. The allowed operations are:

  • Insertion of a character
  • Deletion of a character
  • Substitution of one character for another

Key Concepts

  1. Dynamic Programming: This method breaks down the problem into smaller subproblems and builds up solutions using previously computed results.
  2. State Representation: We can use a 2D array dp where dp[i][j] represents the minimum edit distance between the first i characters of string X and the first j characters of string Y.

Dynamic Programming Approach

  • Step 1: Initialize a 2D array dp of size (m+1) x (n+1), where m is the length of string X and n is the length of string Y.
  • Step 2: Fill the first row and the first column to represent the cost of converting between an empty string and a substring of the other string.
  • Step 3: Iterate through each character of both strings to fill the dp table based on the minimum operations required.

C Program for Minimum Edit Distance

Here’s a complete C program to compute the Minimum Edit Distance:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Function to calculate the minimum edit distance
int minEditDistance(char* X, char* Y) {
    int m = strlen(X);
    int n = strlen(Y);
    
    // Create a DP table
    int** dp = (int**)malloc((m + 1) * sizeof(int*));
    for (int i = 0; i <= m; i++) {
        dp[i] = (int*)malloc((n + 1) * sizeof(int));
    }

    // Initialize the first row and column
    for (int i = 0; i <= m; i++) {
        dp[i][0] = i; // Deletion cost
    }
    for (int j = 0; j <= n; j++) {
        dp[0][j] = j; // Insertion cost
    }

    // Fill the DP table
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i - 1] == Y[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1]; // No operation needed
            } else {
                dp[i][j] = 1 + (dp[i - 1][j - 1] < dp[i][j - 1] ? 
                               (dp[i - 1][j - 1] < dp[i][j - 1] ? 
                               dp[i - 1][j - 1] : dp[i][j - 1]) : 
                               (dp[i][j - 1] < dp[i - 1][j] ? 
                               dp[i][j - 1] : dp[i - 1][j]));
            }
        }
    }

    // The minimum edit distance is in dp[m][n]
    int distance = dp[m][n];

    // Free the allocated memory
    for (int i = 0; i <= m; i++) {
        free(dp[i]);
    }
    free(dp);

    return distance;
}

// Main function
int main() {
    char X[100], Y[100];

    // Input the two strings
    printf("Enter first string: ");
    fgets(X, sizeof(X), stdin);
    X[strcspn(X, "\n")] = '\0'; // Remove newline character

    printf("Enter second string: ");
    fgets(Y, sizeof(Y), stdin);
    Y[strcspn(Y, "\n")] = '\0'; // Remove newline character

    // Calculate the minimum edit distance
    int distance = minEditDistance(X, Y);
    printf("Minimum Edit Distance (Levenshtein Distance): %d\n", distance);

    return 0;
}

Explanation of the Code

  1. Dynamic Programming Table:
    • A 2D array dp is created, where dp[i][j] holds the minimum edit distance between the first i characters of string X and the first j characters of string Y.
  2. Initialization:
    • The first row represents the cost of converting from an empty string to substrings of Y, which is equal to the length of those substrings (insertion cost).
    • The first column represents the cost of converting from substrings of X to an empty string, which is equal to the length of those substrings (deletion cost).
  3. Filling the DP Table:
    • We iterate through both strings, checking if the characters match. If they do, no additional cost is incurred. If they don’t, we compute the minimum cost of the three possible operations (insertion, deletion, substitution).
  4. Memory Management:
    • The program dynamically allocates memory for the dp array and frees it after use to prevent memory leaks.

Input and Output Example

Input

cCopy codeEnter first string:kitten
Enter second string:sitting

Output

javaCopy codeMinimum Edit Distance (Levenshtein Distance): 3

Explanation of the Output

In this case, the minimum edit distance of 3 can be achieved by:

  1. Substituting ‘k’ with ‘s’
  2. Substituting ‘e’ with ‘i’
  3. Inserting ‘g’ at the end.

Conclusion

This C program efficiently computes the Minimum Edit Distance (Levenshtein Distance) between two strings using dynamic programming. It allows user input for flexibility and outputs the minimum number of edits required to convert one string into another. You can test various pairs of strings to see how the edit distance changes, making this approach useful in various applications, including spell checking, DNA sequence analysis, and more.

Leave a Reply

Your email address will not be published. Required fields are marked *

Verified by MonsterInsights