KMP Algorithm for Pattern Searching

Rabin-Karp Algorithm for Pattern Searching

Last Updated : 03 Jun, 2025

Try it on GfG Practice

Given two strings text and pattern string, your task is to find all starting positions where the pattern appears as a substring within the text. The strings will only contain lowercase English alphabets.

While reporting the results, use 1-based indexing (i.e., the first character of the text is at position 1). You are required to identify every occurrence of the pattern, including overlapping ones, if any.

Examples:

Input: text = "birthdayboy", pattern = "birth"
Output: [1]
Explanation: The string "birth" occurs at index 1 in text.
Input: text = "geeksforgeeks", pattern = "geek"
Output: [1, 9]
Explanation: The string "geek" occurs twice in text, one starts are index 1 and the other at index 9

Rabin-Karp Algorithm

In the Naive String Matching algorithm, we check whether every substring of the text of the pattern's size is equal to the pattern or not one by one.

Like the Naive Algorithm, the Rabin-Karp algorithm also check every substring. But unlike the Naive algorithm, the Rabin Karp algorithm matches the hash value of the pattern with the hash value of the current substring of text, and if the hash values match then only it starts matching individual characters. So Rabin Karp algorithm needs to calculate hash values for the following strings.
Pattern itself
All the substrings of the text of length m which is the size of pattern.

Hash value is used to efficiently check for potential matches between a pattern and substrings of a larger text.

How is Hash Value calculated in Rabin-Karp?

The hash value is calculated using a rolling hash function, which allows you to update the hash value for a new substring by efficiently removing the contribution of the old character and adding the contribution of the new character. This makes it possible to slide the pattern over the text and calculate the hash value for each substring without recalculating the entire hash from scratch.

For a string S of length M, the rolling hash is built like this:

hash=(((S[0]⋅b+S[1])⋅b+S[2])⋅b+⋯+S[M−1])modp

Which is equivalent to:

hash=( i=0∑M−1 S[i]⋅b M−i−1 )modp

Here's how the hash value is typically calculated in Rabin-Karp:

Pick a prime number p as the modulus to reduce hash collisions. Choose a base b, which is often the size of the character set 256 (e.g., ASCII charset size).
Start with hash = 0 before processing the pattern or text.
Iterate over the pattern from left to right. For each character c, compute the hash using the formula:
\text{hash} = (\text{hash} \cdot b + c) \bmod p

Calculate the hash for the first substring of text matching the pattern's length. Use the same method as for the pattern hash.
Update the hash by removing the leftmost char and adding the next one.

\text{hash} = \left( \text{hash} - (\text{text}[i - \text{pattern\_length}] \cdot b^{\text{pattern\_length} - 1}) \bmod p \right) \cdot b + \text{text}[i]

If the current text hash equals the pattern hash, verify character by character. This is needed because different substrings can have the same hash (collision).

Illustration Calculate the Hash Value of the pattern (e.g., pat)

calculation_of_the_hash_value_of_the_pat_

We use the formula \text{hash} = (\text{hash} \cdot b + c) \bmod p to compute the hash value of the pattern string.

Demonstration of pattern(e.g., pat) matching in the given text(e.g., txt)

calculation_of_the_hash_value_of_pat_

To compute the hash value for a substring of length equal to the pattern, we apply the formula iteratively, updating the hash by incorporating each character’s contribution according to its position.

\text{hash} = \left( \text{hash} - (\text{text}[i - \text{pattern\_length}] \cdot b^{\text{pattern\_length} - 1}) \bmod p \right) \cdot b + \text{text}[i]

Step-by-step approach:

Initially calculate the hash value of the pattern.
Start iterating from the starting of the string:
- Calculate the hash value of the current substring having length m.
- If the hash value of the current substring and the pattern are same check if the substring is same as the pattern.
- If they are same, store the starting index as a valid answer. Otherwise, continue for the next substrings.
Return the starting indices as the required answer.

C++

// Rabin-Karp Algorithm for Pattern Searching in C++
// Reference: Introduction to Algorithms (CLRS)

#include <iostream>
#include <vector>
using namespace std;

// Function to search for all occurrences of 'pat' in 'txt' using Rabin-Karp
vector<int> search(const string &pat, const string &txt){
    
    // Number of characters in the input alphabet (ASCII)
    int d = 256;
    // A prime number for modulo operations to reduce collisions
    int q = 101;
    // Length of the pattern
    int M = pat.length();
    // Length of the text
    int N = txt.length();
    // Hash value for pattern
    int p = 0;
    // Hash value for current window of text
    int t = 0;
    // High-order digit multiplier
    int h = 1;
    
    vector<int> ans;
    
    // Precompute h = pow(d, M-1) % q
    for (int i = 0; i < M - 1; i++)
        h = (h * d) % q;

    // Compute initial hash values for pattern and first window of text
    for (int i = 0; i < M; i++){
        
        p = (d * p + pat[i]) % q;
        t = (d * t + txt[i]) % q;
    }

    // Slide the pattern over text one by one
    for (int i = 0; i <= N - M; i++){
        
        // If hash values match, check characters one by one
        if (p == t){
            bool match = true;
            for (int j = 0; j < M; j++){
                
                if (txt[i + j] != pat[j]){
                
                    match = false;
                    break;
                }
            }
            if (match)
                ans.push_back(i + 1);
        }

        // Calculate hash value for the next window
        if (i < N - M){
            
            t = (d * (t - txt[i] * h) + txt[i + M]) % q;

            // Ensure hash value is non-negative
            if (t < 0)
                t += q;
        }
    }
    return ans;
}

int main(){
    
    string txt = "birthboy";
    string pat = "birth";

    // Function call to search pattern in text
    vector<int> res = search(pat, txt);

    for (auto it : res){
        cout << it << " ";
    }
    cout << "\n";
    return 0;
}

Java

import java.util.ArrayList;

public class RabinKarp {

    // Function to search for all occurrences of 'pat' in 'txt' using Rabin-Karp
    public static ArrayList<Integer> search(String pat, String txt) {

        // Number of characters in the input alphabet (ASCII)
        int d = 256;
        // A prime number for modulo operations to reduce collisions
        int q = 101;
        // Length of the pattern
        int M = pat.length();
        // Length of the text
        int N = txt.length();
        // Hash value for pattern
        int p = 0;
        // Hash value for current window of text
        int t = 0;
        // High-order digit multiplier
        int h = 1;
        // ArrayList to store result indices
        ArrayList<Integer> ans = new ArrayList<>();

        // Precompute h = pow(d, M-1) % q
        for (int i = 0; i < M - 1; i++)
            h = (h * d) % q;

        // Compute initial hash values for pattern and first window of text
        for (int i = 0; i < M; i++) {
            p = (d * p + pat.charAt(i)) % q;
            t = (d * t + txt.charAt(i)) % q;
        }

        // Slide the pattern over text one by one
        for (int i = 0; i <= N - M; i++) {

            // If hash values match, check characters one by one
            if (p == t) {
                boolean match = true;
                for (int j = 0; j < M; j++) {
                    if (txt.charAt(i + j) != pat.charAt(j)) {
                        match = false;
                        break;
                    }
                }
                if (match)
                    ans.add(i + 1); // 1-based indexing
            }

            // Calculate hash value for the next window
            if (i < N - M) {
                t = (d * (t - txt.charAt(i) * h) + txt.charAt(i + M)) % q;

                // Ensure hash value is non-negative
                if (t < 0)
                    t += q;
            }
        }

        return ans;
    }

    public static void main(String[] args) {

        // text and pattern
        String txt = "birthboy";
        String pat = "birth";

        // Search pattern in text
        ArrayList<Integer> res = search(pat, txt);

        // Print result
        for (int index : res)
            System.out.print(index + " ");
        System.out.println();
    }
}

Python

# Rabin-Karp Algorithm for Pattern Searching in Python
# Reference: Introduction to Algorithms (CLRS)

def search(pat, txt):
    # Number of characters in the input alphabet (ASCII)
    d = 256  
    # A prime number for modulo operations to reduce collisions
    q = 101  
    # Length of the pattern
    M = len(pat)  
    # Length of the text
    N = len(txt)
    # Hash value for pattern
    p = 0  
    # Hash value for current window of text
    t = 0  
    # High-order digit multiplier
    h = 1  
    ans = []

    # Precompute h = pow(d, M-1) % q
    for i in range(M - 1):
        h = (h * d) % q

    # Compute initial hash values for pattern and first window of text
    for i in range(M):
        p = (d * p + ord(pat[i])) % q
        t = (d * t + ord(txt[i])) % q

    # Slide the pattern over text one by one
    for i in range(N - M + 1):
        # If hash values match, check characters one by one
        if p == t:
            match = True
            for j in range(M):
                if txt[i + j] != pat[j]:
                    match = False
                    break
            if match:
                ans.append(i + 1)

        # Calculate hash value for the next window
        if i < N - M:
            t = (d * (t - ord(txt[i]) * h) + ord(txt[i + M])) % q
            if t < 0:
                t += q

    return ans

# Driver code
if __name__ == "__main__":
    txt = "birthboy"
    pat = "birth"
    res = search(pat, txt)
    print(" ".join(map(str, res)))

C#

using System;
using System.Collections.Generic;

class GfG{

    // Function to search for all occurrences of 'pat' in 'txt' using Rabin-Karp
    public static List<int> search(string pat, string txt) {
        
        // Number of characters in the input alphabet (ASCII)
        int d = 256;
        // A prime number for modulo operations to reduce collisions
        int q = 101;
        // Length of the pattern
        int M = pat.Length;
        // Length of the text
        int N = txt.Length;
        // Hash value for pattern
        int p = 0;
        // Hash value for current window of text
        int t = 0;
        // High-order digit multiplier
        int h = 1;
        // List to store result indices
        List<int> ans = new List<int>();

        // Precompute h = pow(d, M-1) % q
        for (int i = 0; i < M - 1; i++)
            h = (h * d) % q;

        // Compute initial hash values for pattern and first window of text
        for (int i = 0; i < M; i++) {
            p = (d * p + pat[i]) % q;
            t = (d * t + txt[i]) % q;
        }

        // Slide the pattern over text one by one
        for (int i = 0; i <= N - M; i++) {
            // If hash values match, check characters one by one
            if (p == t) {
                bool match = true;
                for (int j = 0; j < M; j++) {
                    if (txt[i + j] != pat[j]) {
                        match = false;
                        break;
                    }
                }
                if (match)
                    ans.Add(i + 1);
            }

            // Calculate hash value for the next window
            if (i < N - M) {
                t = (d * (t - txt[i] * h) + txt[i + M]) % q;

                // Ensure hash value is non-negative
                if (t < 0)
                    t += q;
            }
        }

        return ans;
    }

    static void Main() {
        // Input text and pattern
        string txt = "birthboy";
        string pat = "birth";

        // Search pattern in text
        List<int> res = search(pat, txt);

        // Print result
        foreach (int index in res) {
            Console.Write(index + " ");
        }
        Console.WriteLine();
    }
}

JavaScript

// Rabin-Karp Algorithm for Pattern Searching in JavaScript
// Reference: Introduction to Algorithms (CLRS)

function search(pat, txt) {
    
    // Number of characters in the input alphabet (ASCII)
    const d = 256;  
    // A prime number for modulo operations to reduce collisions
    const q = 101; 
    // Length of the pattern
    const M = pat.length;  
    // Length of the text
    const N = txt.length;
    // Hash value for pattern
    let p = 0;  
    // Hash value for current window of text
    let t = 0;  
    // High-order digit multiplier
    let h = 1;  
    const ans = [];

    // Precompute h = pow(d, M-1) % q
    for (let i = 0; i < M - 1; i++) {
        h = (h * d) % q;
    }

    // Compute initial hash values for pattern and first window of text
    for (let i = 0; i < M; i++) {
        p = (d * p + pat.charCodeAt(i)) % q;
        t = (d * t + txt.charCodeAt(i)) % q;
    }

    // Slide the pattern over text one by one
    for (let i = 0; i <= N - M; i++) {
        // If hash values match, check characters one by one
        if (p === t) {
            let match = true;
            for (let j = 0; j < M; j++) {
                if (txt[i + j] !== pat[j]) {
                    match = false;
                    break;
                }
            }
            if (match) ans.push(i + 1);
        }

        // Calculate hash value for the next window
        if (i < N - M) {
            t = (d * (t - txt.charCodeAt(i) * h) +
                 txt.charCodeAt(i + M)) % q;
            if (t < 0) t += q;
        }
    }

    return ans;
}

// Driver code
const txt = "birthboy";
const pat = "birth";
const res = search(pat, txt);
console.log(res.join(" "));

Output

Time Complexity: O(n + m), but in the worst case, when all characters of the pattern and text are identical causing all substring hash values of the text to match the pattern’s hash, the time complexity degrades to O(nm).

Auxiliary Space: O(1)

Limitations of Rabin-Karp Algorithm

When the hash value of the pattern matches with the hash value of a window of the text but the window is not the actual pattern then it is called a spurious hit. Spurious hit increases the time complexity of the algorithm. In order to minimize spurious hit, we use good hash function. It greatly reduces the spurious hit.

Related Posts:
Searching for Patterns | Set 1 (Naive Pattern Searching)
Searching for Patterns | Set 2 (KMP Algorithm)

KMP Algorithm for Pattern Searching

K

kartik

Improve

Article Tags :

Practice Tags :

Similar Reads

What is Pattern Searching ?

Pattern searching in Data Structures and Algorithms (DSA) is a fundamental concept that involves searching for a specific pattern or sequence of elements within a given data structure. This technique is commonly used in string matching algorithms to find occurrences of a particular pattern within a

Introduction to Pattern Searching - Data Structure and Algorithm Tutorial

Pattern searching is an algorithm that involves searching for patterns such as strings, words, images, etc. We use certain algorithms to do the search process. The complexity of pattern searching varies from algorithm to algorithm. They are very useful when performing a search in a database. The Pat

Naive algorithm for Pattern Searching

Given text string with length n and a pattern with length m, the task is to prints all occurrences of pattern in text. Note: You may assume that n > m. Examples:Â Input: Â text = "THIS IS A TEST TEXT", pattern = "TEST"Output: Pattern found at index 10 Input: Â text = Â "AABAACAADAABAABA", pattern =

Rabin-Karp Algorithm for Pattern Searching

Given two strings text and pattern string, your task is to find all starting positions where the pattern appears as a substring within the text. The strings will only contain lowercase English alphabets.While reporting the results, use 1-based indexing (i.e., the first character of the text is at po

KMP Algorithm for Pattern Searching

Given two strings txt and pat, the task is to return all indices of occurrences of pat within txt. Examples:Input: txt = "abcab", pat = "ab"Output: [0, 3]Explanation: The string "ab" occurs twice in txt, first occurrence starts from index 0 and second from index 3.Input: txt= "aabaacaadaabaaba", pat

Z algorithm (Linear time pattern searching Algorithm)

This algorithm efficiently locates all instances of a specific pattern within a text in linear time. If the length of the text is "n" and the length of the pattern is "m," then the total time taken is O(m + n), with a linear auxiliary space. It is worth noting that the time and auxiliary space of th

Finite Automata algorithm for Pattern Searching

Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that n > m.Examples: Input: txt[] = "THIS IS A TEST TEXT" pat[] = "TEST" Output: Pattern found at index 10 Input: txt[] = "AABAACAADAAB

Boyer Moore Algorithm for Pattern Searching

Pattern searching is an important problem in computer science. When we do search for a string in a notepad/word file, browser, or database, pattern searching algorithms are used to show the search results. A typical problem statement would be-Â " Given a text txt[0..n-1] and a pattern pat[0..m-1] whe

Aho-Corasick Algorithm for Pattern Searching

Given an input text and an array of k words, arr[], find all occurrences of all words in the input text. Let n be the length of text and m be the total number of characters in all words, i.e. m = length(arr[0]) + length(arr[1]) + ... + length(arr[k-1]). Here k is total numbers of input words. Exampl

ÂÂkasaiâ€™s Algorithm for Construction of LCP array from Suffix Array

Background Suffix Array : A suffix array is a sorted array of all suffixes of a given string. Let the given string be "banana". 0 banana 5 a1 anana Sort the Suffixes 3 ana2 nana ----------------> 1 anana 3 ana alphabetically 0 banana 4 na 4 na 5 a 2 nanaThe suffix array for "banana" :suffix[] = {