30. Substring with Concatenation of All Words - LeetCode

千小凡

License: CC 4.0 BY-SA

Article link: https://blog.csdn.net/m0_72231747/article/details/138436722

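This article shows how to solve a string problem in Java and Python: given a string s and an array of words, find the starting index of every substring of s that is a concatenation of all the words in any order. The solution moves a fixed-length window across the string and checks which words each window contains.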

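Prerequisites:

Java: methods, collections, generics, logical operators, if/else statements, for loops, while loops

Python: functions, the Counter class, logical operators, if/else statements, for loops, while loops, lists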

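Problem:

You are given a string s and an array of strings words. All the strings in words have the same length.

A concatenated substring in s is a substring that contains all the strings in words, arranged in any order and joined together.

For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated substrings. "acdbef" is not a concatenated substring, because it is not the concatenation of any permutation of words.

Return the starting indices of all concatenated substrings in s. You may return the answer in any order.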
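
To make the definition concrete, the words = ["ab","cd","ef"] example above can be reproduced by enumerating every ordering of the words. This is only an illustrative sketch: the helper name concatenations is hypothetical, and enumerating permutations grows factorially, so it is not a practical way to solve the problem.

```python
from itertools import permutations

def concatenations(words):
    """Return the set of strings formed by joining every ordering of words."""
    return {"".join(p) for p in permutations(words)}

valid = concatenations(["ab", "cd", "ef"])
print(len(valid))          # 6 distinct concatenations
print("cdabef" in valid)   # True
print("acdbef" in valid)   # False: it splits words apart
```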

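Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation: Since words.length == 2 and words[i].length == 3, a concatenated substring must have length 6.
The substring "barfoo" starts at index 0. It is the concatenation of words in the order ["bar","foo"].
The substring "foobar" starts at index 9. It is the concatenation of words in the order ["foo","bar"].
The output order does not matter. Returning [9,0] is also fine.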

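Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation: Since words.length == 4 and words[i].length == 4, a concatenated substring must have length 16.
No substring of s has length 16 and equals the concatenation of some permutation of words.
So we return an empty array.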

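Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation: Since words.length == 3 and words[i].length == 3, a concatenated substring must have length 9.
The substring "foobarthe" starts at index 6. It is the concatenation of words in the order ["foo","bar","the"].
The substring "barthefoo" starts at index 9. It is the concatenation of words in the order ["bar","the","foo"].
The substring "thefoobar" starts at index 12. It is the concatenation of words in the order ["the","foo","bar"].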

Constraints:

1 <= s.length <= 10^4
1 <= words.length <= 5000
1 <= words[i].length <= 30
s and words[i] consist of lowercase English letters

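Approach:

First, restate what the problem asks for: find the starting index of every substring of s that is formed by concatenating all the words in words in some order.

Edge cases:
- If s is empty, words is empty, or the first word in words has length 0, return an empty list immediately, because no valid substring can be formed.
- If no word is empty but some word's length differs from that of the first word, the word lengths are inconsistent and no valid substring can be formed, so an empty list is returned as well.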
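
The edge cases above can be collected into one guard function. The following is a minimal sketch; the helper name is_valid_input is an assumption made for illustration and does not appear in the solutions in this article.

```python
from typing import List

def is_valid_input(s: str, words: List[str]) -> bool:
    """Hypothetical helper: True when the search is worth running."""
    # Empty string or empty word list: nothing can match.
    if not s or not words:
        return False
    word_length = len(words[0])
    # A zero-length first word cannot form a meaningful window.
    if word_length == 0:
        return False
    # Every word must have the same length as the first one.
    return all(len(w) == word_length for w in words)

print(is_valid_input("barfoothefoobarman", ["foo", "bar"]))  # True
print(is_valid_input("abc", ["ab", "c"]))                    # False
```

Under the stated constraints of the problem these checks never fail, so they act purely as defensive validation.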
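Initialize variables:
- word_length: the length of the first word in words (all words are assumed to have the same length).
- total_length: the combined length of all the words, i.e. len(words) * word_length.
- word_counts: a Counter holding how many times each word occurs in words.
- results: the list that collects the starting indices of matching substrings.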
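
As a quick illustration, here is what these variables hold for the inputs of Example 1, plus the word counts for Example 2, where a word is repeated. This is just a sketch of the setup step, not the full solution.

```python
from collections import Counter

s = "barfoothefoobarman"
words = ["foo", "bar"]

word_length = len(words[0])              # every word has this length
total_length = len(words) * word_length  # length of a full concatenation
word_counts = Counter(words)             # required count of each word
results = []                             # matching start indices go here

print(word_length, total_length)  # 3 6

# With a repeated word, the Counter records that it is needed twice.
repeated = Counter(["word", "good", "best", "word"])
print(repeated["word"])  # 2
```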
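Traverse the string s:
- Starting at index 0, examine the substring of length total_length that begins at each index, stopping once fewer than total_length characters remain.
Check each substring:
- For each substring, use a second counter, seen, to record how many times each word occurs in it.
- Walk through the substring in pieces of length word_length, treat each piece as a word, and check whether it is in word_counts.
  - If the word is in word_counts, increment its count in seen and check whether that count now exceeds the count in word_counts; if it does, stop checking early.
  - If the word is not in word_counts, also stop checking early.
- If the whole substring was traversed and every word's count in seen equals its count in word_counts, a valid concatenated substring has been found, so append its starting index to results.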
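
The per-substring check described above can be isolated into a small function. The name window_matches is hypothetical, and the sketch assumes the caller passes a window whose length is exactly total_length.

```python
from collections import Counter

def window_matches(window: str, word_counts: Counter, word_length: int) -> bool:
    """Return True if window is a concatenation of exactly the counted words."""
    seen = Counter()
    for j in range(0, len(window), word_length):
        word = window[j:j + word_length]
        if word not in word_counts:
            return False              # unknown word: stop early
        seen[word] += 1
        if seen[word] > word_counts[word]:
            return False              # word used too often: stop early
    return seen == word_counts

s = "barfoothefoobarman"
counts = Counter(["foo", "bar"])
starts = [i for i in range(len(s) - 6 + 1)
          if window_matches(s[i:i + 6], counts, 3)]
print(starts)  # [0, 9]
```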
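Return the result:
- Return results, which holds the starting index of every matching substring.

This approach slides a fixed-length window of total_length characters across s one position at a time and checks whether each window consists of exactly the words in words. Each window is checked from scratch, so the work is roughly (len(s) - total_length + 1) windows times up to len(words) word lookups per window.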

Java code example:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubstringSearch {

    /**
     * Finds the starting index of every substring of s that is a concatenation
     * of all the words in words, in any order.
     *
     * @param s     the string to search
     * @param words the list of words
     * @return the starting indices of all matching substrings
     */
    public List<Integer> findSubstring(String s, String[] words) {
        if (s == null || s.length() == 0 || words == null || words.length == 0) {
            return new ArrayList<>(); // s or words is empty: nothing can match
        }

        // Make sure all words have the same, non-zero length
        int wordLength = words[0].length();
        if (wordLength == 0) {
            return new ArrayList<>(); // an empty first word cannot form a window
        }
        for (String word : words) {
            if (word.length() != wordLength) {
                return new ArrayList<>(); // inconsistent word lengths
            }
        }

        int totalLength = words.length * wordLength;
        if (s.length() < totalLength) {
            return new ArrayList<>(); // s is shorter than one full concatenation
        }

        // Count how many times each word occurs in words
        Map<String, Integer> wordCounts = new HashMap<>();
        for (String word : words) {
            wordCounts.put(word, wordCounts.getOrDefault(word, 0) + 1);
        }

        List<Integer> results = new ArrayList<>(); // starting indices of matches

        // Slide a window of totalLength characters across s
        for (int i = 0; i <= s.length() - totalLength; i++) {
            // Counts of the words seen in the current window
            Map<String, Integer> seen = new HashMap<>();
            int j = 0; // offset inside the window

            // Walk through the window one word at a time
            while (j < totalLength) {
                String word = s.substring(i + j, i + j + wordLength);

                // If the word is in words, increment its count in seen
                if (wordCounts.containsKey(word)) {
                    seen.put(word, seen.getOrDefault(word, 0) + 1);

                    // The word occurs more often than in words: stop early
                    if (seen.get(word) > wordCounts.get(word)) {
                        break;
                    }
                } else {
                    // The word is not in words: stop early
                    break;
                }

                j += wordLength; // move to the next word in the window
            }

            // The whole window was traversed and every count matches wordCounts
            if (j == totalLength && seen.equals(wordCounts)) {
                results.add(i); // record the starting index
            }
        }

        return results; // starting indices of all matching substrings
    }

    public static void main(String[] args) {
        SubstringSearch solution = new SubstringSearch();
        String s = "barfoothefoobarman";
        String[] words = {"foo", "bar"};
        List<Integer> result = solution.findSubstring(s, words);
        System.out.println(result); // prints [0, 9]
    }
}

Python code example:

from collections import Counter
from typing import List

def findSubstring(s: str, words: List[str]) -> List[int]:
    # If s is empty, words is empty, or the first word is empty, nothing can match
    if not s or not words or len(words[0]) == 0:
        return []

    # Length of each word in words
    word_length = len(words[0])
    # If the word lengths are inconsistent, no valid substring can be formed
    if any(len(word) != word_length for word in words):
        return []
    # Total length of a concatenated substring
    total_length = len(words) * word_length
    # Count how many times each word occurs in words
    word_counts = Counter(words)
    # List that collects the results
    results = []

    # Try every possible starting position in s
    for i in range(len(s) - total_length + 1):
        # Take the substring of length total_length starting at i
        substring = s[i:i+total_length]
        # Counter for the words seen so far in this substring
        seen = Counter()
        j = 0

        # Walk through the substring in pieces of length word_length
        while j < total_length:
            # Treat the current piece as a word
            word = substring[j:j+word_length]
            # If the word occurs in word_counts
            if word in word_counts:
                # Increment its count in seen
                seen[word] += 1
                # The word occurs too often, so this cannot be a concatenated substring
                if seen[word] > word_counts[word]:
                    break
            else:
                # The word is not in word_counts, so this cannot be a concatenated substring
                break
            # Move to the start of the next word
            j += word_length

        # The whole substring was traversed and every count matches word_counts
        if j == total_length and seen == word_counts:
            # Record the starting index
            results.append(i)

    # Return the list of results
    return results
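
The solutions above rebuild the word counts for every starting index. A common alternative, which is not the method used in this article, keeps one running window per alignment offset (0 to word_length - 1) and moves it a whole word at a time, so each word-sized piece of s is examined a bounded number of times per offset. The sketch below is one possible way to write that variant; the function name find_substring_sliding and its variable names are hypothetical, and it assumes all words have the same length, as the problem guarantees.

```python
from collections import Counter
from typing import List

def find_substring_sliding(s: str, words: List[str]) -> List[int]:
    """Sketch of a word-aligned sliding window (assumes equal-length words)."""
    if not s or not words or not words[0]:
        return []
    word_length = len(words[0])
    num_words = len(words)
    word_counts = Counter(words)
    results = []

    for offset in range(word_length):
        left = offset      # start index of the current window
        seen = Counter()   # counts of the words inside the window
        used = 0           # number of words inside the window
        for right in range(offset, len(s) - word_length + 1, word_length):
            word = s[right:right + word_length]
            if word not in word_counts:
                # Unknown word: no window can span it, so restart after it.
                seen.clear()
                used = 0
                left = right + word_length
                continue
            seen[word] += 1
            used += 1
            # Shrink from the left until this word is no longer over-used.
            while seen[word] > word_counts[word]:
                left_word = s[left:left + word_length]
                seen[left_word] -= 1
                used -= 1
                left += word_length
            if used == num_words:
                results.append(left)
    return sorted(results)

print(find_substring_sliding("barfoothefoobarman", ["foo", "bar"]))  # [0, 9]
print(find_substring_sliding("barfoofoobarthefoobarman", ["bar", "foo", "the"]))  # [6, 9, 12]
```

Because no count in the window ever exceeds its required count, a window holding exactly num_words words must match word_counts, which is why the sketch only compares the word total instead of comparing the two counters.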