array_constructor does not work for more than 4 arguments #320

Open
YLGH opened this issue May 7, 2022 · 1 comment
Comments

@YLGH
Contributor

YLGH commented May 7, 2022

My df consists of the columns int_0 through int_12. I'm trying to turn these into a single array of features, however

df["dense_features"] = functional.array_constructor(
    *[df[int_name] for int_name in DEFAULT_INT_NAMES]
)

fails with

Traceback (most recent call last):
  File "/home/ylgh/torchrec/examples/torcharrow/dataloader.py", line 52, in <module>
    df = criteo_preproc(df)
  File "/home/ylgh/torchrec/examples/torcharrow/dataloader.py", line 35, in criteo_preproc
    df["dense_features"] = functional.array_constructor(
  File "/home/ylgh/anaconda3/envs/torchrec/lib/python3.9/site-packages/torcharrow/_functional.py", line 54, in dispatch
    return op(*args)
  File "/home/ylgh/anaconda3/envs/torchrec/lib/python3.9/site-packages/torcharrow/velox_rt/functional.py", line 39, in dispatch
    result_col = ta.generic_udf_dispatch(op_name, *wrapped_args)
TypeError: generic_udf_dispatch(): incompatible function arguments. The following argument types are supported:
    1. (arg0: str, arg1: torcharrow._torcharrow.BaseColumn) -> torcharrow._torcharrow.BaseColumn
    2. (arg0: str, arg1: torcharrow._torcharrow.BaseColumn, arg2: torcharrow._torcharrow.BaseColumn) -> torcharrow._torcharrow.BaseColumn
    3. (arg0: str, arg1: torcharrow._torcharrow.BaseColumn, arg2: torcharrow._torcharrow.BaseColumn, arg3: torcharrow._torcharrow.BaseColumn) -> torcharrow._torcharrow.BaseColumn
    4. (arg0: str, arg1: torcharrow._torcharrow.BaseColumn, arg2: torcharrow._torcharrow.BaseColumn, arg3: torcharrow._torcharrow.BaseColumn, arg4: torcharrow._torcharrow.BaseColumn) -> torcharrow._torcharrow.BaseColumn

Invoked with: 'array_constructor', <torcharrow._torcharrow.SimpleColumnREAL object at 0x7f5844733270>, <torcharrow._torcharrow.SimpleColumnREAL object at 0x7f585895c1b0>, <torcharrow._torcharrow.SimpleColumnREAL object at 0x7f58590af7b0>, <torcharrow._torcharrow.SimpleColumnREAL object at 0x7f58590afa30>, <torcharrow._torcharrow.SimpleColumnREAL object at 0x7f58590afaf0>

Based on this, it seems that generic_udf_dispatch only supports up to 4 column arguments.
When I restrict the input to DEFAULT_INT_NAMES[:4], it works.
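
To illustrate the failure mode, here is a rough pure-Python model of a dispatcher that exposes one overload per arity, as the error message suggests. It is only a sketch: `MAX_ARITY` and `fixed_arity_dispatch` are hypothetical stand-ins, not TorchArrow APIs, and plain lists stand in for columns.

```python
from typing import List, Sequence

MAX_ARITY = 4  # assumption: mirrors the four overloads listed in the TypeError


def fixed_arity_dispatch(op_name: str, *columns: Sequence[float]) -> List[List[float]]:
    """Hypothetical stand-in for a binding with overloads for 1..MAX_ARITY columns."""
    if not 1 <= len(columns) <= MAX_ARITY:
        raise TypeError(
            f"{op_name}: incompatible function arguments "
            f"(got {len(columns)} columns, supported: 1..{MAX_ARITY})"
        )
    # Model array_constructor: the i-th value of every column becomes one row.
    return [list(row) for row in zip(*columns)]


# 13 toy columns with 2 rows each, standing in for int_0 .. int_12.
cols = [[float(i), float(i) + 10.0] for i in range(13)]

print(fixed_arity_dispatch("array_constructor", *cols[:4]))  # accepted
try:
    fixed_arity_dispatch("array_constructor", *cols)  # 13 columns: rejected
except TypeError as err:
    print("fails:", err)
```

In this model the first call is accepted and the second raises, which matches what I see with the slice of four names versus the full list.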

@wenleix
Contributor

wenleix commented May 9, 2022

Yeah, today it's supported ad hoc, with one overload per arity: #296

I think we need a variadic version of the generic UDF call. In your case, does it have 12 parameters?
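
A minimal sketch of what a variadic version could look like from the Python side, assuming the dispatcher takes the columns as a single sequence instead of one positional parameter per column. `variadic_dispatch` and the `array_constructor` wrapper below are hypothetical, written in pure Python with lists standing in for columns; this is not the actual TorchArrow or Velox implementation.

```python
from typing import List, Sequence


def variadic_dispatch(op_name: str, columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Hypothetical variadic entry point: one signature for any number of columns."""
    if op_name != "array_constructor":
        raise NotImplementedError(op_name)
    if not columns:
        raise ValueError("array_constructor needs at least one column")
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all columns must have the same length")
    # Build one array per row from the values at that row in every column.
    return [[col[i] for col in columns] for i in range(n_rows)]


def array_constructor(*columns: Sequence[float]) -> List[List[float]]:
    """Hypothetical wrapper: packs *args into one sequence before dispatching."""
    return variadic_dispatch("array_constructor", columns)


# 13 toy columns with 2 rows each, standing in for int_0 .. int_12.
toy_columns = [[float(i), float(i) + 100.0] for i in range(13)]
dense = array_constructor(*toy_columns)
print(len(dense), len(dense[0]))
```

The point of the design is that the number of columns no longer appears in the signature, so the binding would not need a new overload each time a caller passes one more column.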
