Add BGC category #37
Add a new schema module defining BGC categories for natural product biosynthesis:
- POLYKETIDE (CHEBI:26188)
- NRP (Non-ribosomal peptide)
- RIPP (Ribosomally synthesized and post-translationally modified peptide)
- TERPENE (CHEBI:35186)
- ALKALOID (CHEBI:22315)
- SACCHARIDE (CHEBI:18154)
- OTHER

Categories based on MIBiG standard for biosynthetic gene cluster classification.
Claude finished @cmungall's task
PR Review: Add BGC Category
Comprehensive review of the new BGC categories schema module.
Review Checklist:
✅ STRENGTHS
Schema Structure & Standards:
Content Quality:
Ontology Integration:
Pull request overview
This PR adds a new schema module (bgc_categories.yaml) that defines biosynthetic gene cluster (BGC) categories for natural product biosynthesis, aligned with the MIBiG (Minimum Information about a Biosynthetic Gene cluster) standard. The schema provides a controlled vocabulary for classifying BGCs based on the type of secondary metabolite they produce, which is essential for genome mining and natural product discovery workflows.
Key changes:
- Introduces 7 BGC category enumerations (POLYKETIDE, NRP, RIPP, TERPENE, ALKALOID, SACCHARIDE, OTHER)
- Maps categories to ChEBI ontology terms where applicable to ensure semantic interoperability
- Includes detailed annotations with biosynthetic enzyme information and examples for each category
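As a rough illustration of the shape this overview describes, a LinkML enum along these lines might look as follows. The schema `id`, the schema `name`, and the enum name `BGCCategoryEnum` are assumptions made for this sketch and are not taken from the PR. The ChEBI identifiers are the ones listed in the PR description, and the per-value annotations are omitted for brevity.

```yaml
# Minimal sketch only: id, name, and the enum name are hypothetical.
id: https://example.org/bgc_categories
name: bgc_categories
prefixes:
  linkml: https://w3id.org/linkml/
  CHEBI: http://purl.obolibrary.org/obo/CHEBI_   # needed so CHEBI: CURIEs expand
imports:
  - linkml:types

enums:
  BGCCategoryEnum:                # hypothetical enum name
    description: BGC categories based on the MIBiG standard
    permissible_values:
      POLYKETIDE:
        meaning: CHEBI:26188      # ChEBI mapping from the PR description
      NRP:
        title: Non-ribosomal peptide            # no ChEBI mapping in the PR
      RIPP:
        title: Ribosomally synthesized and post-translationally modified peptide
      TERPENE:
        meaning: CHEBI:35186
      ALKALOID:
        meaning: CHEBI:22315
      SACCHARIDE:
        meaning: CHEBI:18154
      OTHER:
        description: BGCs that do not fit into standard categories
```

Declaring the `CHEBI` prefix once at the schema level is what lets each `meaning` be written as a short CURIE rather than a full IRI.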
    annotations:
      definition: BGCs that do not fit into standard categories or are of unknown type
      examples: Novel or hybrid BGCs
Copilot
AI
Dec 13, 2025
Consider adding license and see_also fields at the end of the schema file for consistency with other schema files like demographics.yaml and gb_edoh.yaml. These fields typically include:
- license: MIT
- see_also: (with relevant MIBiG standard documentation links)
This would help document the source of the standard and licensing terms, especially since the PR description mentions this is based on the MIBiG standard.
Suggested change:
          examples: Novel or hybrid BGCs
    license: MIT
    see_also:
      - https://mibig.secondarymetabolites.org/
      - https://mibig.secondarymetabolites.org/documentation.html
    NRP:
      title: Non-ribosomal peptide
      description: Non-ribosomal peptide biosynthetic gene clusters
      annotations:
        definition: Peptides synthesized by non-ribosomal peptide synthetases
        biosynthetic_enzyme: Non-ribosomal peptide synthetase (NRPS)
        examples: Penicillin, vancomycin, cyclosporin
Copilot
AI
Dec 13, 2025
The NRP (Non-ribosomal peptide) category is missing a 'meaning' field with a ChEBI ontology mapping, while most other categories (POLYKETIDE, TERPENE, ALKALOID, SACCHARIDE) include this field. For consistency and to provide proper semantic grounding, consider adding a ChEBI term if one exists for non-ribosomal peptides. This would maintain uniformity across the enumeration values and improve interoperability with other ontology-based systems.
    RIPP:
      title: Ribosomally synthesized and post-translationally modified peptide
      description: RiPP biosynthetic gene clusters
      annotations:
        definition: Ribosomally synthesized peptides with extensive post-translational modifications
        biosynthetic_enzyme: Various modifying enzymes
        examples: Nisin, thiopeptides, lanthipeptides
        related_chebi: CHEBI:71629 (nisin)
Copilot
AI
Dec 13, 2025
The RIPP category is missing a 'meaning' field with a ChEBI ontology mapping, while most other categories (POLYKETIDE, TERPENE, ALKALOID, SACCHARIDE) include this field. Although a related_chebi annotation is provided for nisin (a specific example), the category itself should have a ChEBI term if one exists for the class of ribosomally synthesized and post-translationally modified peptides. This would maintain consistency with the other categories and improve semantic interoperability.
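One possible way to act on this comment without inventing a class-level term is sketched below. It assumes that LinkML's standard mapping slots (here `related_mappings`) are acceptable on permissible values in this project, and it deliberately leaves `meaning` unset because no ChEBI class for RiPPs is given anywhere in the PR. The only identifier used is the nisin term already present in the diff.

```yaml
# Sketch under the assumptions above; not the change made in the PR.
RIPP:
  title: Ribosomally synthesized and post-translationally modified peptide
  description: RiPP biosynthetic gene clusters
  # meaning: intentionally omitted until a class-level ChEBI term is confirmed
  related_mappings:
    - CHEBI:71629    # nisin: a specific example, not the RiPP class itself
  annotations:
    definition: Ribosomally synthesized peptides with extensive post-translational modifications
    biosynthetic_enzyme: Various modifying enzymes
    examples: Nisin, thiopeptides, lanthipeptides
```

Using a mapping slot instead of a free-text `related_chebi` annotation keeps the identifier machine-readable, while making clear that it is a related term rather than the meaning of the category.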