Serializable Fonduer model
Is your feature request related to a problem? Please describe.
I develop a Fonduer-based app locally on my laptop. Once it’s done, I’d like to package the whole Fonduer pipeline (parsing, extraction, featurization, and classification) and deploy it to a remote place to serve. However, a Fonduer-based app is not easy to package and is therefore not easy to deploy.
Describe the solution you’d like
Add a Fonduer model class that is
- Serializable (e.g., a class with save and load member methods like below)
    class FonduerModel:
        def save(self, path_to_save):
            ...  # write the model to path_to_save
        def load(self, path_to_load):
            ...  # restore the model from path_to_load
- Capable of executing any phase of the Fonduer pipeline
- (Hopefully) Manageable by MLflow
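To make the request concrete, here is a minimal sketch of what such a class could look like. It is not Fonduer's API: SketchModel, its config keys (keywords, min_length), and the four toy phase methods are all hypothetical stand-ins, and JSON is used only because the toy settings are plain data. A real model would also have to persist things like matchers, mention classes, and the trained classifier.

```python
import json
import tempfile
from pathlib import Path


class SketchModel:
    """Hypothetical stand-in for the proposed FonduerModel (not Fonduer's API)."""

    PHASES = ("parse", "extract", "featurize", "classify")

    def __init__(self, config):
        # Plain-data settings that must travel with the model at serving time.
        self.config = config

    def save(self, path_to_save):
        # Persist everything needed to rebuild the model in a single file.
        Path(path_to_save).write_text(json.dumps(self.config))

    @classmethod
    def load(cls, path_to_load):
        # Rebuild the model from the file written by save().
        return cls(json.loads(Path(path_to_load).read_text()))

    def run(self, phase, data):
        # Execute any single phase of the (toy) pipeline by name.
        if phase not in self.PHASES:
            raise ValueError(f"unknown phase: {phase}")
        return getattr(self, "_" + phase)(data)

    # Toy phases: assumptions for illustration only.
    def _parse(self, text):
        return text.split()

    def _extract(self, tokens):
        return [t for t in tokens if t in self.config["keywords"]]

    def _featurize(self, mentions):
        return [len(m) for m in mentions]

    def _classify(self, features):
        return [f >= self.config["min_length"] for f in features]


# Round trip: save locally, then load as a serving process would.
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "model.json")
    SketchModel({"keywords": ["volt", "amp"], "min_length": 4}).save(path)
    restored = SketchModel.load(path)

tokens = restored.run("parse", "rated 5 volt 2 amp")
mentions = restored.run("extract", tokens)
labels = restored.run("classify", restored.run("featurize", mentions))
```

The point of the sketch is that one object owns both the state to be shipped and the entry points for each phase, so a serving environment only needs the saved file and the class definition.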
Describe alternatives you’ve considered
I can create one or more Python scripts that run all the phases, package them, and deploy them. This is cumbersome because the scripts have to include many things (matchers, mention_classes, mention_spaces, candidate_classes, etc.), and it is not obvious what should be included for serving.
Additional context
I’d like to make Fonduer more deployable and servable. I’ve been testing MLflow to package a Fonduer-based app and found it difficult to do so without a serializable Fonduer model.
I think fonduer-mlflow is now in good shape and ready to be submitted as a PR against fonduer. Let me create a PR and submit it.
@SenWu @trungtv I’ve created a new repository (https://github.com/HiromuHota/fonduer-mlflow) for this custom MLflow model for Fonduer. @SenWu I’d like this custom MLflow model for Fonduer (fonduer_model.py) to be merged into the Fonduer repository in the future. So please take a look at the repository and get familiar with it. Let me know if you have any questions, suggestions, etc.