Builder for RowProcessor
#263
… improve usability and allow users to avoid error-prone many-argument constructors. See oracle#251 and oracle#260.
| private static final Logger logger = Logger.getLogger(RowProcessor.class.getName()); | ||
|
|
||
| private static final String FEATURE_NAME_REGEX = "["+ColumnarFeature.JOINER+FieldProcessor.NAMESPACE+"]"; | ||
| private static final String FEATURE_NAME_REGEX = "[" + ColumnarFeature.JOINER + FieldProcessor.NAMESPACE + "]"; |
The copyright year needs updating on RowProcessor
| * | ||
| * @param <T> | ||
| */ | ||
| public static class Builder<T extends Output<T>> { |
Should this class be final?
| * @param responseProcessor The response processor to use. | ||
| * @return The RowProcessor represented by the builder's state | ||
| */ | ||
| public RowProcessor<T> build(List<FieldProcessor> fieldProcessors, ResponseProcessor<T> responseProcessor) { |
It's valid to build a RowProcessor with only regexMappingProcessors and an empty list of fieldProcessors, so that argument should be broken out into a separate builder method.
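A minimal sketch of that shape, not the merged API. It uses hypothetical stand-in types (`SketchFieldProcessor`, `SketchRowProcessor`) instead of the real Tribuo classes, a plain `String` in place of the response processor, and assumed setter names. The point is only that `build` no longer takes the field processor list, so a builder configured with regex mappings alone still works:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FieldProcessor; only the field name matters here.
final class SketchFieldProcessor {
    private final String fieldName;

    SketchFieldProcessor(String fieldName) {
        this.fieldName = fieldName;
    }

    String getFieldName() {
        return fieldName;
    }
}

// Hypothetical stand-in for RowProcessor. The response processor is a String
// placeholder purely for illustration.
final class SketchRowProcessor {
    final Map<String, SketchFieldProcessor> fieldProcessors;
    final Map<String, SketchFieldProcessor> regexMappingProcessors;
    final String responseProcessor;

    private SketchRowProcessor(Map<String, SketchFieldProcessor> fieldProcessors,
                               Map<String, SketchFieldProcessor> regexMappingProcessors,
                               String responseProcessor) {
        this.fieldProcessors = fieldProcessors;
        this.regexMappingProcessors = regexMappingProcessors;
        this.responseProcessor = responseProcessor;
    }

    static final class Builder {
        // Both collections default to empty, so either may be left unset.
        private List<SketchFieldProcessor> fieldProcessors = new ArrayList<>();
        private Map<String, SketchFieldProcessor> regexMappingProcessors = new HashMap<>();

        // Assumed name: field processors get their own builder method.
        Builder setFieldProcessors(List<SketchFieldProcessor> fieldProcessors) {
            this.fieldProcessors = new ArrayList<>(fieldProcessors);
            return this;
        }

        Builder setRegexMappingProcessors(Map<String, SketchFieldProcessor> regexMappingProcessors) {
            this.regexMappingProcessors = new HashMap<>(regexMappingProcessors);
            return this;
        }

        // Only the mandatory argument remains on build.
        SketchRowProcessor build(String responseProcessor) {
            Map<String, SketchFieldProcessor> byName = new HashMap<>();
            for (SketchFieldProcessor p : fieldProcessors) {
                byName.put(p.getFieldName(), p);
            }
            return new SketchRowProcessor(byName, new HashMap<>(regexMappingProcessors), responseProcessor);
        }
    }
}
```

With this layout, `new SketchRowProcessor.Builder().setRegexMappingProcessors(regex).build(response)` compiles without an empty placeholder list.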
| @@ -0,0 +1,322 @@ | |||
| /* | |||
| * Copyright (c) 2015-2020, Oracle and/or its affiliates. All rights reserved. | |||
Copyright year should just be the year of addition for new files.
| public RowProcessor<T> build(List<FieldProcessor> fieldProcessors, ResponseProcessor<T> responseProcessor) { | ||
| Map<String, FieldProcessor> fieldProcessorMap = new HashMap<>(); | ||
| for (FieldProcessor fieldProcessor : fieldProcessors) { | ||
| fieldProcessorMap.put(fieldProcessor.getFieldName(), fieldProcessor); |
This should probably validate that the field processors don't collide, and/or we should expose two endpoints, one for a list which validates, and one which accepts a map.
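One way the validating list path could look, as a sketch only. `NamedProcessor` and `FieldProcessorMaps` are hypothetical names introduced for illustration, and throwing `IllegalArgumentException` on a collision is an assumption, not a decision taken in this PR:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FieldProcessor.
final class NamedProcessor {
    private final String fieldName;

    NamedProcessor(String fieldName) {
        this.fieldName = fieldName;
    }

    String getFieldName() {
        return fieldName;
    }
}

// Hypothetical helper showing the validating conversion from list to map.
final class FieldProcessorMaps {
    private FieldProcessorMaps() {}

    static Map<String, NamedProcessor> fromList(List<NamedProcessor> processors) {
        Map<String, NamedProcessor> map = new HashMap<>();
        for (NamedProcessor p : processors) {
            // Map.put returns the previous value for the key, or null if there was none,
            // so a non-null result means two processors share a field name.
            NamedProcessor previous = map.put(p.getFieldName(), p);
            if (previous != null) {
                throw new IllegalArgumentException(
                        "Duplicate field processor for field '" + p.getFieldName() + "'");
            }
        }
        return map;
    }
}
```

A caller that already holds a map keyed by field name would skip this check, which is the argument for offering both entry points.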
| assertFalse(featureIterator.hasNext()); | ||
| } | ||
|
|
||
| static class MungingTokenizer implements Tokenizer { |
Can't we use this one from the RowProcessorTest file?
Craigacp
left a comment
The notebook needs updating, but the other two comments are not necessary.
| * Retrieves, if present, the fieldProcessor with the given name | ||
| */ | ||
| public Optional<FieldProcessor> getFieldProcessor(String fieldName) { | ||
| if (this.fieldProcessors.containsKey(fieldName)) { |
This could be return Optional.ofNullable(this.fieldProcessors.get(fieldName)).
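For comparison, a small self-contained sketch of the two forms. `OptionalLookup` is a hypothetical class with a `String`-valued map standing in for the field processor map, and the body of the `containsKey` branch is an assumed continuation of the line shown in the diff. The two methods agree as long as the map never stores null values:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical class; String values stand in for FieldProcessor instances.
final class OptionalLookup {
    private final Map<String, String> fieldProcessors = new HashMap<>();

    OptionalLookup() {
        fieldProcessors.put("height", "height-processor");
    }

    // Explicit check followed by a second lookup (assumed shape of the reviewed code).
    Optional<String> getVerbose(String fieldName) {
        if (fieldProcessors.containsKey(fieldName)) {
            return Optional.of(fieldProcessors.get(fieldName));
        } else {
            return Optional.empty();
        }
    }

    // Single lookup: Map.get returns null for a missing key,
    // and Optional.ofNullable turns null into Optional.empty().
    Optional<String> getConcise(String fieldName) {
        return Optional.ofNullable(fieldProcessors.get(fieldName));
    }
}
```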
tutorials/columnar-tribuo-v4.ipynb
Outdated
| " .setMetadataExtractors(metadataExtractors)\n", | ||
| " .setWeightExtractor(weightExtractor)\n", | ||
| " .setRegexMappingProcessors(regexMappingProcessors)\n", | ||
| " .build(fieldProcessors, responseProcessor);" |
This build method doesn't exist anymore.
| * Retrieves, if present, the regexFieldProcessor with the given regex | ||
| */ | ||
| public Optional<FieldProcessor> getRegexFieldProcessor(String regexName) { | ||
| if (this.regexMappingProcessors.containsKey(regexName)) { |
Optional.ofNullable.
Craigacp
left a comment
LGTM, thanks.
Added a missing close paren.
Description
Added a builder class and tests for `org.tribuo.data.columnar.RowProcessor`, and updated the columnar tutorial to use it.
Motivation
Existing constructors for `RowProcessor` have many arguments with similar types, which leads to confusion like that in #251 and #260.