A tool designed to efficiently summarize user-provided knowledge files into a structured Table of Contents (ToC). This ToC is formatted specifically to enhance your custom GPT's ability to navigate and utilize large volumes of data effectively.
Custom GPTs have difficulty navigating numerous or large knowledge files. This, I learned, is because a GPT actively scrolls through these files using myfiles_browser, which means it searches for keywords and reads only a small portion of each file.
With a Table of Contents, you can adjust your GPT's instructions to process the entire ToC first. This lets your GPT work out which knowledge files, and which lines inside them, it should scroll to. Knowing where the important regions of your knowledge files are helps it respond to your queries more accurately.
- ChatGPT access link: Knowledge Summarizer GPT
- Automated Summarization: Processes uploaded knowledge files and generates a comprehensive Table of Contents.
- Customized Formatting: Generates ToCs in a JSON format tailored to improve GPT navigation.
- Chunk Processing: Capable of handling large files by summarizing them in smaller, manageable sections.
- Continuity Support: Remembers the last processed position in a file, allowing for seamless continuation in large documents.
- User-Friendly Interface: Easy-to-use prompts and clear instructions for uploading and processing files.
- Accuracy and Efficiency: Ensures precise summarization with key information highlighted for quick GPT access.
- Recommended GPT Instructions: A recommended template for GPT instructions that can be copied and filled in so your GPT properly looks up and searches the uploaded knowledge files outputted by this script and BuilderIO/gpt-crawler.
- Sample GPT: For a sample GPT that uses the recommended instructions and the ToC, look at our Mojo Teacher GPT GitHub. It contains the link to the GPT instructions and knowledge files used to create it.
- Ensure you have a JSON knowledge file formatted as per the Knowledge Summarizer's requirements.
- Access to ChatGPT Plus
- In the future the instructions will be added to this repository so we can all improve upon them.
- Prepare Your Knowledge File: Format your knowledge file according to the specified guidelines. Each file should be a JSON file containing a list of dictionaries with keys like "title", "url", "path", "content", and "html".
- Upload Your File: Use the GPT interface to upload your formatted knowledge file.
- Summarization: The GPT will process the file, starting from the top, and generate a ToC in the specified JSON format.
- Continuation and Completion: For large files, the GPT will handle them in chunks and will prompt you to continue processing the next chunk until completion.
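The chunked, resumable processing described above happens inside the GPT, so its exact mechanics are not documented here. As a rough local illustration only, the idea can be sketched in Python. The function name `iter_chunks`, the choice to chunk by number of entries, and the `start` parameter are all assumptions made for this example, not part of the tool.

```python
from typing import Iterator


def iter_chunks(
    entries: list[dict], chunk_size: int, start: int = 0
) -> Iterator[tuple[int, list[dict]]]:
    """Yield (next_start, chunk) pairs over a list of knowledge entries.

    next_start is the index to resume from after the chunk is processed,
    which mimics "remembering the last processed position".
    Hypothetical helper: the GPT's real chunking logic is not specified.
    """
    for begin in range(start, len(entries), chunk_size):
        chunk = entries[begin:begin + chunk_size]
        yield begin + len(chunk), chunk


# Illustrative data: five tiny entries processed two at a time.
entries = [{"title": f"Page {i}"} for i in range(5)]
for next_start, chunk in iter_chunks(entries, chunk_size=2):
    titles = [entry["title"] for entry in chunk]
    print(f"processed {titles}, resume from index {next_start}")
```

Passing the last reported `next_start` back in as `start` resumes where the previous run stopped, which is the same idea as the GPT prompting you to continue with the next chunk.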
Your knowledge file should be a JSON file with the following structure:
- A list of dictionaries, each containing keys: "title", "url", "path", "content", "html".
- The "content" and "html" keys must contain the main information for summarization.
- Ensure consistency and clarity in the content for accurate summarization.
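Before uploading, it can help to check a file against the structure listed above. The following is a minimal sketch, assuming all five keys are required on every entry and that at least one of "content" or "html" should be non-empty. The helper name `validate_knowledge` and the exact strictness of the checks are assumptions for illustration, not rules enforced by the Knowledge Summarizer.

```python
import json

# Keys named in the structure above; treating all of them as mandatory
# is an assumption of this sketch.
REQUIRED_KEYS = {"title", "url", "path", "content", "html"}


def validate_knowledge(data) -> list[str]:
    """Return a list of human-readable problems; empty means it looks fine."""
    if not isinstance(data, list):
        return ["top-level value must be a list"]
    problems = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: not a dictionary")
            continue
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
        if not (entry.get("content") or entry.get("html")):
            problems.append(f"entry {i}: both 'content' and 'html' are empty")
    return problems


# Illustrative input; in practice you would json.load() your own file.
raw = '[{"title": "Intro", "url": "https://example.com", "path": "/", "content": "Hello", "html": "<p>Hello</p>"}]'
print(validate_knowledge(json.loads(raw)))
```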
To assist in properly formatting your knowledge files, you can use the following command-line tools:
- GPT Crawler (GPT Crawler GitHub Repository): This tool crawls and processes web content into the required JSON format. It's particularly useful for converting web pages and online documents into the structure expected by the Knowledge Summarizer.
- GPT GitHub Crawler (GPT GitHub Crawler Repository): This tool is designed for extracting content from GitHub repositories. It formats code, READMEs, and other documentation into a JSON structure suitable for the Knowledge Summarizer.
These tools can automate the process of formatting and ensure that your files adhere to the required structure.
The output will be a JSON-formatted Table of Contents with the following fields for each entry (fields in brackets are filled in by the GPT):
    [
      {
        "Knowledge Filename": "Filename of the knowledge file",
        "Title": "Descriptive title",
        "Description": "Brief description of the contents",
        "Key Words": ["List", "of", "keywords"],
        "Index": "Index position in the knowledge file",
        "Lines": "Line positions in the knowledge file"
      },
      ...
    ]
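To illustrate how a ToC in this shape narrows a search to specific files and lines, here is a small Python sketch that matches a query against the "Key Words" field. The function `find_entries`, the sample filenames, and the sample "Index" and "Lines" values are all hypothetical. A custom GPT does this reasoning from its instructions, not by running this code.

```python
def find_entries(toc: list[dict], query: str) -> list[dict]:
    """Return ToC entries whose keywords overlap the words in the query.

    Matching is a case-insensitive exact word match; this is a deliberate
    simplification for illustration.
    """
    words = {word.lower() for word in query.split()}
    hits = []
    for entry in toc:
        keywords = {keyword.lower() for keyword in entry.get("Key Words", [])}
        if words & keywords:
            hits.append(entry)
    return hits


# Hypothetical ToC entries following the field names shown above.
toc = [
    {
        "Knowledge Filename": "docs.json",
        "Title": "Structs",
        "Description": "How to declare and use structs",
        "Key Words": ["structs", "fields"],
        "Index": "3",
        "Lines": "120-180",
    },
    {
        "Knowledge Filename": "docs.json",
        "Title": "Installation",
        "Description": "Setting up the toolchain",
        "Key Words": ["install", "setup"],
        "Index": "0",
        "Lines": "1-40",
    },
]

for entry in find_entries(toc, "how do structs work"):
    print(entry["Knowledge Filename"], entry["Lines"])
```

The point of the design is that only the small ToC needs to be read in full. The large knowledge file is then opened at the reported lines.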
After uploading all of your knowledge files, including your Table of Contents, copy instruction-template.md into your custom GPT's instructions and fill in the bracketed areas with the information specific to your assistant. I found this format for the GPT instructions to work best with the knowledge files outputted by this script and BuilderIO/gpt-crawler. If you find a better template, please submit a pull request so we can all benefit from your improved instructions!
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch (git checkout -b feature/AmazingFeature).
- Commit your changes (git commit -m 'Add some AmazingFeature').
- Push to the branch (git push origin feature/AmazingFeature).
- Create a new Pull Request.
Distributed under the MIT License. See LICENSE for more information.
This project uses instructions inspired by concepts from spdustin/ChatGPT-AutoExpert.