C# Coding Test Course, String Search

Problem Description

You need to find out how many times a specific string P appears in the string S. String P can appear multiple times in string S, and there may be overlapping occurrences. The length of the given string S is between 1 and 100,000, and the length of string P is between 1 and 100. The comparison is case-insensitive.

Input Format

  • First line: string S (1 ≤ |S| ≤ 100,000)
  • Second line: string P (1 ≤ |P| ≤ 100)

Output Format

Print the total number of occurrences as an integer.

Example

Input

    abCabcABCabc
    abc
    

Output

    4
    

Problem Solving Process

1. Understanding the Problem

This problem involves checking how many times string P appears in the given string S. Since string comparison is case-insensitive, both strings should be converted to lowercase for comparison.

2. Approach

There are several approaches to string search problems, but we will solve it directly using a simple loop. The following steps will be taken to solve the problem:

  1. Convert string S to lowercase.
  2. Convert string P to lowercase.
  3. Use a loop to find string P in string S.
  4. Increment the count for each occurrence.

3. Algorithm Design

The complexity of the algorithm is O(n*m), where n is the length of string S and m is the length of string P. Since we are directly comparing the two strings, in the worst case we may need to search string P from every index.

4. C# Code Implementation

Below is an example of C# code.

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            string S = Console.ReadLine();
            string P = Console.ReadLine();

            // Convert to lowercase for case-insensitive comparison
            S = S.ToLower();
            P = P.ToLower();

            int count = 0;
            int position = 0;

            while ((position = S.IndexOf(P, position)) != -1)
            {
                count++;
                position++; // Move position by one to consider overlapping cases
            }

            Console.WriteLine(count);
        }
    }
    

5. Code Explanation

The code includes the following steps:

  1. Accept input strings S and P from the user.
  2. Convert strings S and P to lowercase for easier comparison.
  3. Use a while loop to find the position of P in string S.
  4. Use the S.IndexOf() method to locate P starting from the current position. If the found position is not -1, increase the count and move to the next position.
  5. Output the total number of occurrences.

Performance Considerations

The time complexity of this code is O(n*m), which varies depending on the lengths of strings S and P. If the length of S is 100,000 and P is 100, the worst case could require 10,000,000 operations. This may be somewhat inefficient.

Therefore, if you wish to further improve performance, you might consider using string search algorithms like KMP (Knuth-Morris-Pratt). The KMP algorithm has a time complexity of O(n + m) and enables more efficient searching.

5-1. Overview of the KMP Algorithm

The KMP algorithm is an efficient method for substring searching. It operates on the following principles:

  • First, it creates an array to store the partial matches of the pattern string P.
  • While scanning string S, it calculates how many characters can be skipped in the pattern string when a mismatch occurs.

5-2. KMP Algorithm C# Implementation

Below is the C# code implementing the KMP algorithm.

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            string S = Console.ReadLine();
            string P = Console.ReadLine();

            // Convert to lowercase for case-insensitive comparison
            S = S.ToLower();
            P = P.ToLower();

            int count = KMP(S, P);
            Console.WriteLine(count);
        }

        static int KMP(string S, string P)
        {
            int m = P.Length;
            int n = S.Length;
            int count = 0;

            // Initialize LPS array
            int[] lps = new int[m];
            ComputeLPSArray(P, m, lps);

            int i = 0; // Index for S
            int j = 0; // Index for P

            while (i < n)
            {
                if (P[j] == S[i])
                {
                    i++;
                    j++;
                }

                if (j == m)
                {
                    count++;
                    j = lps[j - 1];
                }
                else if (i < n && P[j] != S[i])
                {
                    if (j != 0)
                        j = lps[j - 1];
                    else
                        i++;
                }
            }
            return count;
        }

        static void ComputeLPSArray(string P, int m, int[] lps)
        {
            int len = 0;
            int i = 1;
            lps[0] = 0;

            while (i < m)
            {
                if (P[i] == P[len])
                {
                    len++;
                    lps[i] = len;
                    i++;
                }
                else
                {
                    if (len != 0)
                        len = lps[len - 1];
                    else
                    {
                        lps[i] = 0;
                        i++;
                    }
                }
            }
        }
    }
    

6. KMP Algorithm Code Explanation

The above code operates in the following manner:

  1. First, it converts strings S and P to lowercase to eliminate case sensitivity.
  2. Calls the KMP method to explore how many times string P appears in string S.
  3. Inside the KMP method, it generates the LPS array. The LPS array stores the maximum length of the prefix and suffix of pattern P.
  4. While scanning string S, it matches the pattern P. If matching is successful, it increments the count, and if matching fails, it adjusts the position based on the LPS array.

Conclusion

In this lecture, we learned how to solve the problem of finding a specific substring in a string using C#. From a simple loop approach to the extension using the KMP algorithm, we gained an understanding of the fundamental concepts of string searching and efficient approaches. I hope this process helped you understand various coding implementations and the complexities of algorithms.

References