Skip to content

Conversation

@chrisiacovella
Copy link
Member

@chrisiacovella chrisiacovella commented Jul 16, 2025

Pull Request Summary

This is a very small PR that focuses on making minor changes to the function that allows you to read an HDF5 file back into a SourceDataset instance. This function now will take a "property_map" that maps the name of a property in the HDF5 file to a specific class (e.g., "dft_energy" to the Energies class in properties). Since the name of a property can be anything, we cannot necessarily infer it.

Previously, these were simply loaded into the PropertyBaseModel class; while that class has the same variables (name, units, value, property_type, classification), as it is the parent for all the actual properties we define, it does not contain any of the specific validation functions associated with a given property. Also, if writing the dataset to an hdf5 file, lots of warnings will be logged since I have a check that requires instances of AtomicNumbers, Positions and Energies to be in every record (as the minimum info required by modelforge).

I also added a factory class that allows a user to generate an instance of a class based on its name as a string (so passing "positions" will return an instance of the Positions class). I ended up not using this, but left the code (with tests) as it may ultimately be useful.

Associated Issue(s)

Pull Request Checklist

  • Issue(s) raised/addressed and linked
  • Includes appropriate unit test(s)
  • Appropriate docstring(s) added/updated
  • Appropriate .rst doc file(s) added/updated
  • PR is ready for review

@codecov-commenter
Copy link

codecov-commenter commented Jul 16, 2025

Codecov Report

Attention: Patch coverage is 92.30769% with 1 line in your changes missing coverage. Please review.

Project coverage is 90.79%. Comparing base (d56fbfd) to head (0ac51d4).

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@chrisiacovella chrisiacovella requested a review from Copilot July 16, 2025 23:22
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR focuses on enhancing the HDF5 file reading functionality by introducing property mapping support and adding a factory class for property instantiation. The main goal is to allow users to specify how properties should be mapped to specific classes when reading from HDF5 files, rather than defaulting to the generic PropertyBaseModel class.

  • Adds property_map parameter to create_dataset_from_hdf5 function for mapping property names to specific classes
  • Implements PropertyFactory class for programmatic property instantiation based on string names
  • Updates comprehensive test coverage for both new features and property mapping scenarios

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
modelforge-curate/modelforge/curate/tests/test_curate.py Adds extensive test coverage for HDF5 reading with property mapping and PropertyFactory functionality
modelforge-curate/modelforge/curate/sourcedataset.py Enhances create_dataset_from_hdf5 function with property_map parameter support
modelforge-curate/modelforge/curate/properties.py Implements PropertyFactory class for creating property instances from string names
modelforge-curate/modelforge/curate/examples/basic_usage.ipynb Updates notebook with demonstration of property mapping usage and corrects typos
Comments suppressed due to low confidence (1)

@chrisiacovella chrisiacovella merged commit ef9c8be into choderalab:main Jul 16, 2025
17 checks passed
@chrisiacovella chrisiacovella deleted the rev-curate_class_name branch August 27, 2025 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants