Key Concepts of Longest Common Subsequence

  1. Subsequence: A subsequence is a sequence derived from another sequence by deleting some elements without changing the order of the remaining elements.
  2. Dynamic Programming: The LCS problem can be efficiently solved using a dynamic programming approach. We create a 2D array where each cell represents the length of the LCS of substrings of the two given strings.

Dynamic Programming Approach

  • Step 1: Create a 2D array dp where dp[i][j] will hold the length of LCS of the first i characters of string X and the first j characters of string Y.
  • Step 2: Fill this array based on the following rules:
    • If the characters match (X[i-1] == Y[j-1]), then: dp[i][j]=dp[i−1][j−1]+1dp[i][j] = dp[i-1][j-1] + 1dp[i][j]=dp[i−1][j−1]+1
    • If they don’t match, then: dp[i][j]=max⁡(dp[i−1][j],dp[i][j−1])dp[i][j] = \max(dp[i-1][j], dp[i][j-1])dp[i][j]=max(dp[i−1][j],dp[i][j−1])
  • Step 3: The value in dp[m][n] (where m and n are the lengths of the strings) will be the length of the LCS.

C Program to Find Longest Common Subsequence

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Function to find the length of Longest Common Subsequence
int longestCommonSubsequence(char* X, char* Y) {
    int m = strlen(X);
    int n = strlen(Y);
    
    // Create a 2D array to store lengths of longest common subsequence
    int** dp = (int**)malloc((m + 1) * sizeof(int*));
    for (int i = 0; i <= m; i++) {
        dp[i] = (int*)malloc((n + 1) * sizeof(int));
    }

    // Build the dp array
    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            if (i == 0 || j == 0) {
                dp[i][j] = 0; // Base case
            } else if (X[i - 1] == Y[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1; // Characters match
            } else {
                dp[i][j] = (dp[i - 1][j] > dp[i][j - 1]) ? dp[i - 1][j] : dp[i][j - 1]; // Characters don't match
            }
        }
    }

    // Length of LCS is in dp[m][n]
    int lcsLength = dp[m][n];

    // Free the allocated memory for dp
    for (int i = 0; i <= m; i++) {
        free(dp[i]);
    }
    free(dp);

    return lcsLength;
}

// Function to print the LCS itself
void printLCS(char* X, char* Y) {
    int m = strlen(X);
    int n = strlen(Y);
    
    // Create a 2D array to store lengths of longest common subsequence
    int** dp = (int**)malloc((m + 1) * sizeof(int*));
    for (int i = 0; i <= m; i++) {
        dp[i] = (int*)malloc((n + 1) * sizeof(int));
    }

    // Build the dp array
    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            if (i == 0 || j == 0) {
                dp[i][j] = 0; // Base case
            } else if (X[i - 1] == Y[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1; // Characters match
            } else {
                dp[i][j] = (dp[i - 1][j] > dp[i][j - 1]) ? dp[i - 1][j] : dp[i][j - 1]; // Characters don't match
            }
        }
    }

    // Now, reconstruct the LCS from the dp array
    int index = dp[m][n];
    char* lcs = (char*)malloc((index + 1) * sizeof(char));
    lcs[index] = '\0'; // Null-terminate the string

    int i = m, j = n;
    while (i > 0 && j > 0) {
        if (X[i - 1] == Y[j - 1]) {
            lcs[index - 1] = X[i - 1]; // Store this character in lcs
            i--;
            j--;
            index--;
        } else if (dp[i - 1][j] > dp[i][j - 1]) {
            i--; // Move up in the dp array
        } else {
            j--; // Move left in the dp array
        }
    }

    printf("Longest Common Subsequence: %s\n", lcs);

    // Free the allocated memory
    for (int i = 0; i <= m; i++) {
        free(dp[i]);
    }
    free(dp);
    free(lcs);
}

// Main function
int main() {
    char X[100], Y[100];

    // Input the two strings
    printf("Enter first string: ");
    fgets(X, sizeof(X), stdin);
    X[strcspn(X, "\n")] = '\0'; // Remove newline character

    printf("Enter second string: ");
    fgets(Y, sizeof(Y), stdin);
    Y[strcspn(Y, "\n")] = '\0'; // Remove newline character

    // Find the length of LCS
    int lcsLength = longestCommonSubsequence(X, Y);
    printf("Length of Longest Common Subsequence: %d\n", lcsLength);

    // Print the LCS itself
    printLCS(X, Y);

    return 0;
}

Explanation of the Code

  1. Dynamic Programming Table:
    • The program uses a 2D array dp where dp[i][j] stores the length of the LCS of the first i characters of string X and the first j characters of string Y.
  2. Building the DP Array:
    • The nested loops populate the dp array based on whether characters match or not. The base case is when one of the strings is empty.
  3. Reconstructing the LCS:
    • After filling the dp table, the program reconstructs the LCS by tracing back through the dp array.
  4. Memory Management:
    • The program frees the allocated memory for the dp array and the LCS string to prevent memory leaks.

Input and Output Example

Input

cCopy codeEnter first string: AGGTAB
Enter second string: GXTXAYB

Output

mathematicaCopy codeLength of Longest Common Subsequence: 4
Longest Common Subsequence: GTAB

Conclusion

This C program efficiently computes the Longest Common Subsequence of two strings using dynamic programming. It allows for user input and displays both the length and the actual LCS. You can test different pairs of strings to see how the LCS changes. This approach is suitable for various applications requiring sequence analysis.

Leave a Reply

Your email address will not be published. Required fields are marked *

Verified by MonsterInsights