Skip to content

Reference plugin variables using "module:class" strings#1122

Closed
NickCrews wants to merge 5 commits intodedupeio:mainfrom
NickCrews:cusotm-vars-str-spec
Closed

Reference plugin variables using "module:class" strings#1122
NickCrews wants to merge 5 commits intodedupeio:mainfrom
NickCrews:cusotm-vars-str-spec

Conversation

@NickCrews
Copy link
Copy Markdown
Contributor

@NickCrews NickCrews commented Dec 11, 2022

Will close #1085

This probably could do with some more polishing, eg removing type from every Variable class, and swapping if variable_type == "Interaction" to parsing the variable type, then checkingif isinstance(variable_type, dedupe.variables.InteractionType)

@coveralls
Copy link
Copy Markdown

coveralls commented Dec 11, 2022

Coverage Status

Coverage remained the same at 74.136% when pulling bad32f6 on NickCrews:cusotm-vars-str-spec into b3efd1e on dedupeio:main.

@fgregg
Copy link
Copy Markdown
Contributor

fgregg commented Dec 11, 2022

i like the simplicity of this approach, but it will be a breaking change for data models that include datetime. in this code this is only working because of the surviving name spacing bits we are trying to get away from.

it will also be a be a breaking change for data models that include the name, address, and fuzzy category types that are not bundled with dedupe but are recommended in the docs

while this pattern is nicer for devs, i don’t like that if a library’s file structure changes, that can break a downstream users code. the file structure should not have to be part of an external API (though i grant that it often is)

@NickCrews
Copy link
Copy Markdown
Contributor Author

i like the simplicity of this approach, but it will be a breaking change for data models that include datetime. in this code this is only working because of the surviving name spacing bits we are trying to get away from.

Hmmm, you're right. Two possible solutions I see:

  1. Merge the datetime package into this library. It's already listed as a dep of dedupe, and I assume that it isn't used by some 3rd package. I don't understand why datetime is given this special treatment but name, address, and fuzzycategory are not?
  2. Update the datetime package so that it exports itself in a more normal way. Then here in variables/__init__.py we can say from dedupe_variable_datetime import DateTimeType as DateTime

it will also be a breaking change for data models that include the name, address, and fuzzy category types that are not bundled with dedupe but are recommended in the docs

Hmm, that's a good point. I was just thinking of breaking changes to people using their own custom plugins. I think the best option here is temporarily have a hardcoded mapping for these edge cases (eg "FuzzyCategorical"->"dedupe_variable_fuzzycategory:FuzzyCategorical"), and if you use them then you get a deprecation warning. We can remove them later, after users get a chance to switch. I think eventually they should follow the same rules as other 3rd party plugin variables, since they are external and need to be pip installed.

while this pattern is nicer for devs, i don’t like that if a library’s file structure changes, that can break a downstream users code. the file structure should not have to be part of an external API (though i grant that it often is)

That's a good point. I think that is the responsibility of the plugin authors and is easily avoided. e.g. in a zipcodelib package, in the root __init__.py, they should from zipcodelib.foo.bar import ZipCodeVariable, so that users can just use "zipcodelib:ZipVariable". Then the plugin author is free to move ZipCodeVariable from zipcodelib/foo/bar.py to zipcodelib/baz.py, adjust the import in the root __init__.py, and everything still works.

fgregg added a commit that referenced this pull request Jun 27, 2024
set up data models like

```python
[
  dedupe.variables.String("name"),
  dedupe.variables.Exact("address")
 ]
 ```

instead of 

```
[
  {"field", "name", "type": "String"},
  {"field", "address", "type": "String"},
 ]
```

supersedes #1122, #1121. will close #1085
@fgregg
Copy link
Copy Markdown
Contributor

fgregg commented Jun 27, 2024

closed by #1193

@fgregg fgregg closed this Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

transition to plugins for dedupe variables.

3 participants