chore: refactor docx_tool to reduce function size #6273
Split the monolithic docx_tool function into focused helper functions:
- docx_error/invalid_params: Error construction helpers
- read_docx_file/read_or_create_docx/write_docx_file: File I/O helpers
- extract_paragraph_text: Text extraction from paragraphs
- add_styled_paragraphs: Styled paragraph creation
- parse_update_mode: Parameter parsing for update modes
- extract_text_from_docx/extract_structure_from_docx: Document content extraction
- do_extract_text: Extract text operation handler
- do_append: Append mode handler
- do_replace: Replace mode handler
- do_insert_structured: Structured insert mode handler
- load_image_as_png: Image loading and conversion
- do_add_image: Add image mode handler

The main docx_tool function is now a simple dispatcher (33 lines), well under the 200 line target. All existing tests pass.
Pull request overview
This PR successfully refactors the docx_tool function from 440 lines to 33 lines by extracting focused helper functions. The refactoring maintains all original functionality while significantly improving code organization and readability.
Key changes:
- Extracted 15 helper functions covering error handling, file I/O, content extraction, content creation, and operation handlers
- Simplified error creation with docx_error and invalid_params helper functions
- Consolidated duplicate paragraph text extraction logic into extract_paragraph_text
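The error-helper consolidation described above might look roughly like the sketch below. It is an illustration only: `ErrorCode` and `ErrorData` here are hypothetical stand-ins for the MCP error type, the helper signatures are assumed, and this `read_docx_file` only reads bytes rather than parsing a DOCX.

```rust
use std::fmt::Display;

// Hypothetical stand-ins for the MCP error type used by the real tool.
#[derive(Debug, PartialEq)]
enum ErrorCode {
    InternalError,
    InvalidParams,
}

#[derive(Debug, PartialEq)]
struct ErrorData {
    code: ErrorCode,
    message: String,
}

// Assumed signature: wraps any displayable error with a context prefix.
fn docx_error(context: &str, err: impl Display) -> ErrorData {
    ErrorData {
        code: ErrorCode::InternalError,
        message: format!("{context}: {err}"),
    }
}

// Assumed signature: reports a problem with the caller's parameters.
fn invalid_params(message: &str) -> ErrorData {
    ErrorData {
        code: ErrorCode::InvalidParams,
        message: message.to_string(),
    }
}

// Simplified: only reads the raw bytes, with the error mapping in one place.
fn read_docx_file(path: &str) -> Result<Vec<u8>, ErrorData> {
    std::fs::read(path).map_err(|e| docx_error("Failed to read DOCX file", e))
}

fn main() {
    let err = read_docx_file("/nonexistent/dir/file.docx").unwrap_err();
    assert_eq!(err.code, ErrorCode::InternalError);
    assert!(err.message.starts_with("Failed to read DOCX file: "));
    assert_eq!(invalid_params("missing content").code, ErrorCode::InvalidParams);
}
```

With helpers like these, each call site shrinks to a single `map_err`, which is where much of the line-count reduction in a refactor of this kind tends to come from.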
I'd also be very willing to remove this MCP server entirely! I'm not sure what use it has. But assuming we want to keep it, this should be ready for review.
        }
    }
    text
}
Is this supposed to have a trailing \n? Consider using filter_map and join instead.
I kept the original code as is, other than splitting up the big function here. I think I'll keep it, but we can do another PR to clean this implementation up too.
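For reference, the filter_map and join suggestion from this thread could be sketched as follows. This assumes the reviewed code builds the text by pushing each piece followed by a newline; the `Child` enum and both functions are hypothetical stand-ins, not the docx-rs types or the PR's actual code.

```rust
// Hypothetical stand-in for a paragraph child; not the docx-rs type.
enum Child {
    Text(String),
    Other,
}

// Loop style: pushes '\n' after every text piece, so the result ends with a newline.
fn collect_with_loop(children: &[Child]) -> String {
    let mut text = String::new();
    for child in children {
        if let Child::Text(t) = child {
            text.push_str(t);
            text.push('\n');
        }
    }
    text
}

// Iterator style: filter_map keeps only the text pieces, join places '\n' between them.
fn collect_with_join(children: &[Child]) -> String {
    children
        .iter()
        .filter_map(|child| match child {
            Child::Text(t) => Some(t.as_str()),
            Child::Other => None,
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let children = vec![
        Child::Text("first".to_string()),
        Child::Other,
        Child::Text("second".to_string()),
    ];
    assert_eq!(collect_with_loop(&children), "first\nsecond\n");
    assert_eq!(collect_with_join(&children), "first\nsecond");
}
```

The two versions differ only in the trailing newline, so switching styles would be a behavior change rather than a pure refactor. That fits the decision to leave it for a follow-up PR.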
* 'main' of github.com:block/goose:
  refactor: when changing provider/model,load existing provider/model (#6334)
  chore: refactor configure_extensions_dialog to reduce line count (#6277)
  chore: refactor handle_configure to reduce line count (#6276)
  chore: refactor interactive session to reduce line count (#6274)
  chore: refactor docx_tool to reduce function size (#6273)
  chore: refactor cli() function to reduce line count (#6272)
  make sure the models are using streaming properly (#6331)
  feat: add a max tokens env var (#6264)
  docs: slash commands topic (#6333)
  fix(ci): prevent gh-pages branch bloat (#6340)
  chore(deps): bump qs and body-parser in /documentation (#6338)
  Skip the smoke tests for dependabot PRs (#6337)
Summary
Refactors the docx_tool function in crates/goose-mcp/src/computercontroller/docx_tool.rs to address the clippy too_many_lines warning. The original function was 440 lines; it is now 33 lines.

Changes
Split the monolithic docx_tool function into focused helper functions:

Error Helpers
- docx_error: Creates INTERNAL_ERROR ErrorData
- invalid_params: Creates INVALID_PARAMS ErrorData

File I/O Helpers
- read_docx_file: Reads and parses a DOCX file
- read_or_create_docx: Reads existing or creates new DOCX
- write_docx_file: Builds and writes DOCX to disk

Content Extraction Helpers
- extract_paragraph_text: Extracts text from a paragraph
- extract_text_from_docx: Extracts all text from a document
- extract_structure_from_docx: Extracts heading structure

Content Creation Helpers
- add_styled_paragraphs: Creates styled paragraphs from content
- parse_update_mode: Parses update mode from JSON params
- load_image_as_png: Loads and converts images to PNG

Operation Handlers
- do_extract_text: Handles extract_text operation
- do_append: Handles append mode
- do_replace: Handles replace mode
- do_insert_structured: Handles structured insert mode
- do_add_image: Handles add_image mode

Testing
All 9 existing tests pass:
- test_docx_text_extraction
- test_docx_update_append
- test_docx_update_styled
- test_docx_update_replace
- test_docx_add_image
- test_docx_invalid_path
- test_docx_invalid_operation
- test_docx_update_without_content
- test_docx_update_preserve_content

Result
The main docx_tool function is now a simple 33-line dispatcher, well under the 200 line target.
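The dispatcher shape described here can be sketched as below. All signatures are simplified and hypothetical: the real handlers work on JSON params and DOCX data, the "update_doc" operation name and the "append" default mode are assumptions, and only two of the update modes are shown.

```rust
// Hypothetical subset of the update modes, for illustration only.
#[derive(Debug, PartialEq)]
enum UpdateMode {
    Append,
    Replace,
}

// Assumed behavior: a missing mode defaults to append.
fn parse_update_mode(mode: Option<&str>) -> Result<UpdateMode, String> {
    match mode.unwrap_or("append") {
        "append" => Ok(UpdateMode::Append),
        "replace" => Ok(UpdateMode::Replace),
        other => Err(format!("Invalid update mode: {other}")),
    }
}

// Placeholder handlers; each would hold the logic that used to live inline.
fn do_extract_text(path: &str) -> Result<String, String> {
    Ok(format!("extract_text on {path}"))
}

fn do_append(path: &str, content: &str) -> Result<String, String> {
    Ok(format!("append '{content}' to {path}"))
}

fn do_replace(path: &str, content: &str) -> Result<String, String> {
    Ok(format!("replace with '{content}' in {path}"))
}

// The dispatcher only validates inputs and routes to a handler.
fn docx_tool(
    path: &str,
    operation: &str,
    content: Option<&str>,
    mode: Option<&str>,
) -> Result<String, String> {
    match operation {
        "extract_text" => do_extract_text(path),
        "update_doc" => {
            let content = content.ok_or_else(|| "content is required".to_string())?;
            match parse_update_mode(mode)? {
                UpdateMode::Append => do_append(path, content),
                UpdateMode::Replace => do_replace(path, content),
            }
        }
        other => Err(format!("Invalid operation: {other}")),
    }
}

fn main() {
    assert!(docx_tool("a.docx", "extract_text", None, None).is_ok());
    assert!(docx_tool("a.docx", "update_doc", None, None).is_err());
    assert!(docx_tool("a.docx", "bogus", None, None).is_err());
}
```

Keeping validation in the dispatcher and work in the handlers lets each handler be tested and sized independently, which is what brings the top-level function under a line-count lint.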