
Gsk 2559 add tabular classification pipeline #28

Draft: wants to merge 5 commits into main

cy-moi (Contributor) commented Jan 15, 2024

Common problems:

  1. No dataset (blocking)
  2. No config (not blocking)
    • Could be solved by using the dataset column names and sample predictions
  3. ML console-trained models -> unable to process
  4. PyTorch -> almost no usable models (1 classification, 9 regression, but the 9 are all the same model: bcwarner/audit-icu-gpt2-25_3M)
  5. Most models are scikit-learn with joblib/skops workflows; the Keras models are almost all from keras-io (fewer than 10 valid ones in total)
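
For problem 2, the idea of falling back on dataset column names and sample predictions could look roughly like the sketch below. It is not the pipeline's actual code: `infer_config`, its parameters, and the keys of the returned dict are hypothetical names chosen for illustration, and `predict` stands in for whatever callable wraps the loaded model.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


def infer_config(
    columns: Sequence[str],
    sample_rows: Sequence[Dict[str, Any]],
    predict: Callable[[List[Dict[str, Any]]], Sequence[Any]],
    label_column: Optional[str] = None,
) -> Dict[str, Any]:
    """Guess a minimal config when the model repo ships none (hypothetical helper)."""
    # Treat every dataset column except the label as a model feature.
    feature_names = [c for c in columns if c != label_column]
    # Run the model on a few rows, passing only the feature values.
    inputs = [{name: row[name] for name in feature_names} for row in sample_rows]
    predictions = list(predict(inputs))
    # Distinct predicted values stand in for the label set.
    # Assumption: the sample is large enough to hit every class; rare classes can be missed.
    labels = sorted(set(predictions), key=str)
    return {
        "feature_names": feature_names,
        "target": label_column,
        "classification_labels": labels,
    }
```

Since the labels come only from what the model predicts on the sample, a config inferred this way would probably need a manual check before being trusted.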

Version problems (deserialization):
Scikit-learn (skops)
Example: python cli.py --loader huggingface --model scikit-learn/Fish-Weight --dataset scikit-learn/Fish

Screenshot 2024-01-17 at 21 51 22

Scikit-learn (joblib) -> XGBoost
Example: almost all Titanic models
python cli.py --loader huggingface --model vabadeh213/autotrain-titanic-744222727 --dataset phihung/titanic

Screenshot 2024-01-17 at 21 16 18

Keras -> AdamW
Example: python cli.py --loader huggingface --model keras-io/tab_transformer --dataset scikit-learn/adult-census-income

Screenshot 2024-01-17 at 21 44 00
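
If these deserialization failures do come from version mismatches, a first debugging step could be to print what is installed in the environment that runs `cli.py`. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper name, and the package list is an assumption about which distributions are involved in the three cases above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of distributions behind the skops, joblib/XGBoost and Keras cases.
    suspects = ["scikit-learn", "skops", "joblib", "xgboost", "tensorflow", "keras"]
    for package, version in installed_versions(suspects).items():
        print(f"{package}: {version or 'not installed'}")
```

Comparing this output with the versions a model was saved with (when the model card or repo states them) would show whether pinning the dependencies is worth trying.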


linear bot commented Jan 15, 2024

Inokinoki (Member) commented

Common problems:

  1. No dataset (blocking)
  2. No config (not blocking)
    • Could be solved by using the dataset column names and sample predictions
  3. ML console-trained models -> unable to process
  4. PyTorch -> almost no usable models (1 classification, 9 regression, but the 9 are all the same model: bcwarner/audit-icu-gpt2-25_3M)

Version problems (deserialization):
Scikit-learn (skops)
Example: `python cli.py --loader huggingface --model scikit-learn/Fish-Weight --dataset scikit-learn/Fish`

Screenshot 2024-01-17 at 21 51 22

Scikit-learn (joblib) -> XGBoost
Example: almost all Titanic models
`python cli.py --loader huggingface --model vabadeh213/autotrain-titanic-744222727 --dataset phihung/titanic`

Screenshot 2024-01-17 at 21 16 18

Keras -> AdamW
Example: `python cli.py --loader huggingface --model keras-io/tab_transformer --dataset scikit-learn/adult-census-income`

Screenshot 2024-01-17 at 21 44 00

The last two are apparently dependency version issues.

I have also run into the first one, but I have no clue what causes it.
