Is there a way to print the confusion matrix #72

Open
rsuwaileh opened this issue Mar 17, 2021 · 5 comments

Comments

@rsuwaileh commented Mar 17, 2021

Hey,

I want to print the FP and FN for my system. I checked the code and it seems you don't use them in the calculation and just use pred_sum and true_sum. Is there an easy way to get these numbers?

Thanks!

@rsuwaileh (Author) commented Mar 17, 2021

I just found this answer. However, this seems to be computed on the token level. Is there a way to get the confusion matrix on the entity level?

In the example in the code, you show these numbers:

    Example:
        >>> from seqeval.metrics import performance_measure
        >>> y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'O', 'B-ORG'], ['B-PER', 'I-PER', 'O']]
        >>> y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O', 'O'], ['B-PER', 'I-PER', 'O']]
        >>> performance_measure(y_true, y_pred)
        (3, 3, 1, 4)

But when I run it, I get the following numbers:

from seqeval.metrics import performance_measure
y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'O', 'B-ORG'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O', 'O'], ['B-PER', 'I-PER', 'O']]
performance_measure(y_true, y_pred)
{'TP': 3, 'FP': 2, 'FN': 1, 'TN': 4}

If it's token level, then it should be:
{'TP': 4, 'FP': 1, 'FN': 1, 'TN': 4}
If it's entity level, then it should be:
{'TP': 1, 'FP': ??, 'FN': 1, 'TN': 4}

Can you explain these numbers?
How is a partial match handled?
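(Not an official seqeval API, just a minimal sketch: performance_measure appears to compare tags token by token, so for entity-level TP/FP/FN one option is to build the spans yourself with seqeval's get_entities helper and intersect them. An entity-level TN is not really well defined, which is probably why the library does not report one.)

# Minimal sketch (not an official seqeval API): entity-level TP/FP/FN with
# exact-match semantics, built from seqeval's get_entities helper.
from seqeval.metrics.sequence_labeling import get_entities

y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'O', 'B-ORG'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O', 'O'], ['B-PER', 'I-PER', 'O']]

# get_entities flattens the nested lists and returns (type, start, end) spans.
true_entities = set(get_entities(y_true))
pred_entities = set(get_entities(y_pred))

tp = len(true_entities & pred_entities)  # spans matching in type and boundaries
fp = len(pred_entities - true_entities)  # predicted spans with no exact match
fn = len(true_entities - pred_entities)  # gold spans that were missed entirely

print({'TP': tp, 'FP': fp, 'FN': fn})
# For the sequences above this prints {'TP': 1, 'FP': 1, 'FN': 2}:
# only PER matches exactly, the partially overlapping MISC counts as one FP
# plus one FN, and the missed ORG is another FN.

Under this exact-match convention a partial overlap is penalized twice, once as a false positive and once as a false negative, which is also how the default entity-level precision and recall treat it.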

@rsuwaileh reopened this Mar 17, 2021
@mustfkeskin

I have the same question: how can we calculate a confusion matrix using the seqeval library?

@mirfan899

I have the same question. I am working on token classification and the results are confusing:

{'eval_loss': 1.503118872642517, 'eval_precision': 0.2734958710184821, 'eval_recall': 0.16045680009228286, 'eval_f1': 0.20225372591784804, 'eval_accuracy': 0.8713822804442352, 'eval_runtime': 73.1268, 'eval_samples_per_second': 59.937, 'epoch': 17.0}

The eval accuracy is high while precision, recall, and F1 are very low. It seems there might be a bug related to computing the scores at the entity level.

@zingxy commented Sep 30, 2021

@mirfan899 it's just normal: in token classification the number of O labels is much higher than the number of B labels.

@JanRodriguez commented Aug 4, 2023

To complement what @zingxy said, accuracy is just "of all tokens, how many did I guess right?", with the O class included. This makes it easy to reach high or very high accuracies, since most tokens will usually be O.

On the other hand, the F1 score reported here is the micro average over the entity classes, without taking the O class into account. Check the numbers in the classification report.
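As a quick illustration (a minimal sketch reusing the toy sequences from the earlier comments), seqeval's accuracy_score is token-level with O included, while f1_score and classification_report are entity-level:

from seqeval.metrics import accuracy_score, classification_report, f1_score

y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'O', 'B-ORG'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O', 'O'], ['B-PER', 'I-PER', 'O']]

print(accuracy_score(y_true, y_pred))  # token-level, O included: 0.7 (7 of 10 tags correct)
print(f1_score(y_true, y_pred))        # entity-level micro F1: 0.4 (1 exact match, 2 predicted, 3 gold)
print(classification_report(y_true, y_pred))  # per-entity-type precision/recall/F1, O excluded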
