Predictions often fail on meta/llama-2-70b #259

Open

jdkanu opened this issue Mar 15, 2024 · 2 comments

jdkanu commented Mar 15, 2024

Calls to meta/llama-2-70b sometimes succeed and sometimes fail; the behavior is very unreliable.

This is the code:

import replicate

output = replicate.run(
    "meta/llama-2-70b",
    input={
        "prompt": "Q: Would a pear sink in water? A: Let's think step by step. ",
        "max_new_tokens": 10000,
        "temperature": 0.01,
    },
)

Example failure: https://replicate.com/p/x3brrjtbwq4ky6zm2z2ay27amy
Example failure: https://replicate.com/p/72pdpvtby7l7wgdzrpzzldqzne
Example failure: https://replicate.com/p/ucbimbtbhyjrw5udypzf6srsm4
Example success: https://replicate.com/p/n6hg2cdbym5ksifmlm6yfahjzm
Example success: https://replicate.com/p/j2jkwn3bfl6w2wc4q53mwju2o4
Example success: https://replicate.com/p/mtb6jcrbzjh57s2wjzfycegxoa
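
Each failed prediction can also be fetched by ID through the client to read its error field; a minimal sketch, using the ID from the first failure link above (status and error are the client's standard Prediction attributes):

import replicate

# Fetch one of the failed predictions by its ID to inspect why it failed.
prediction = replicate.predictions.get("x3brrjtbwq4ky6zm2z2ay27amy")
print(prediction.status)  # e.g. "failed"
print(prediction.error)   # the failure message, if any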

mattt (Member) commented Mar 15, 2024

Hi @jdkanu. Thank you for reporting this. Looking at our telemetry, it does seem like predictions on GPUs in certain regions are failing more often due to read timeouts. We're investigating the cause and working on a remediation.
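
In the meantime, wrapping the call in a short retry loop may smooth over the intermittent failures. A minimal sketch, not an official recommendation: the helper name, attempt count, and backoff are illustrative, and it assumes a failed prediction raises replicate.exceptions.ModelError (network-level read timeouts may surface as a different exception):

import time

import replicate
from replicate.exceptions import ModelError

def run_with_retries(model, model_input, attempts=3, backoff_seconds=2.0):
    # Retry the prediction a few times, backing off between attempts.
    for attempt in range(attempts):
        try:
            return replicate.run(model, input=model_input)
        except ModelError:
            if attempt == attempts - 1:
                raise  # out of retries; propagate the failure
            time.sleep(backoff_seconds * (attempt + 1))

output = run_with_retries(
    "meta/llama-2-70b",
    {
        "prompt": "Q: Would a pear sink in water? A: Let's think step by step. ",
        "max_new_tokens": 10000,
        "temperature": 0.01,
    },
)

Retries won't address the underlying regional issue, but they can reduce the impact of transient read timeouts until the fix lands.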

jdkanu (Author) commented Mar 15, 2024

Thank you
