
question #135

Open
houxiaosen opened this issue Dec 14, 2023 · 2 comments

Comments

@houxiaosen

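Hello, author. Could you share the detailed steps that quality uses to calculate IV? In my own calculations I found that a column whose values are all null still gets an IV value of its own. As I understand it, missing values can form a bin of their own, but if a column is entirely missing and therefore has only one bin, shouldn't its IV be zero?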

@Secbone
Member

Secbone commented Dec 17, 2023

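@houxiaosen IV is calculated with the formula from its definition, $IV = (P_y - P_n) * ln(P_y / P_n)$. For a feature whose values are all identical, the IV is nonzero whenever the ratio of 1s to 0s in your Y is not 50:50. In that case the IV is effectively the IV of the data sample itself.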
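One reading that is consistent with this reply is sketched below. It assumes that, for a bin, $P_y$ and $P_n$ are the shares of 1s and 0s among the rows in that bin, so a single bin covering the whole sample reduces to the overall class proportions. That normalization is an assumption made for illustration, and `single_bin_iv` is a hypothetical helper, not the library's code.

```python
import math


def single_bin_iv(n_pos, n_neg):
    """IV term for one bin covering the whole sample.

    Assumption (not confirmed by the thread): P_y and P_n are the
    shares of 1s and 0s among the rows that fall in the bin.
    Both counts must be nonzero, otherwise the log is undefined.
    """
    total = n_pos + n_neg
    p_y = n_pos / total
    p_n = n_neg / total
    return (p_y - p_n) * math.log(p_y / p_n)


# A 50:50 target gives zero; an imbalanced target does not.
balanced = single_bin_iv(500, 500)
imbalanced = single_bin_iv(100, 900)
print(balanced, imbalanced)
```

Under this assumption the single-bin IV depends only on the class balance of Y, which matches the statement that it vanishes only at 50:50.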

@houxiaosen
Author

Hello, my understanding is:
Py = good / good.sum()
Pn = bad / bad.sum()
With only one bin,
Py = 1,
Pn = 1, so shouldn't the IV be zero?
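The calculation described in this comment can be sketched as follows, with each bin's good and bad counts divided by the column totals. `iv_column_normalized` is a hypothetical helper written for this illustration, and it assumes every bin has nonzero good and bad counts.

```python
import math


def iv_column_normalized(good, bad):
    """IV with Py = good / sum(good) and Pn = bad / sum(bad) per bin.

    `good` and `bad` are per-bin counts; all counts are assumed to be
    nonzero so that the logarithm is defined.
    """
    good_total = sum(good)
    bad_total = sum(bad)
    iv = 0.0
    for g, b in zip(good, bad):
        p_y = g / good_total
        p_n = b / bad_total
        iv += (p_y - p_n) * math.log(p_y / p_n)
    return iv


# One bin: Py = Pn = 1, so the single term is (1 - 1) * ln(1) = 0.
one_bin = iv_column_normalized([900], [100])
# Two bins with different good/bad mixes give a positive IV.
two_bins = iv_column_normalized([600, 300], [20, 80])
print(one_bin, two_bins)
```

With this normalization a one-bin column always yields zero regardless of the class balance, which is the discrepancy the question points at.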
