async example, and link documentation #286
Conversation
cbh123 commented Apr 19, 2024
- Update the first example to point to Llama 3
- Make it more explicit that you can always check a model's API page (with link)
- Provide an async example for an official (versionless) model call
> Here is the async equivalent of the above:
I think I'm against using `async` here, given that there's an actual (and different, `asyncio`) way of running predictions async in Python land! WDYT?
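For context, here is a minimal stdlib-only sketch of the `asyncio` sense of "async" being referred to here. `fake_predict` is a placeholder standing in for a real async prediction call, not part of any client library:

```python
import asyncio


async def fake_predict(model: str, delay: float) -> str:
    # Placeholder for an awaitable prediction call; sleeps instead of
    # hitting a real API.
    await asyncio.sleep(delay)
    return f"{model}: done"


async def main() -> list[str]:
    # Run two "predictions" concurrently rather than one after the other.
    return await asyncio.gather(
        fake_predict("model-a", 0.01),
        fake_predict("model-b", 0.01),
    )


results = asyncio.run(main())
print(results)  # ['model-a: done', 'model-b: done']
```

This is why reserving the word "async" for true `asyncio` support matters: readers will expect `await`-able calls, not a background/polling workflow.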
Ah yes, that is a good point. I want some example of `replicate.models.predictions.create` on this page and wasn't sure how to refer to it; async felt like the best option? Open to other ideas on how to word it though.
@cbh123 Here's how we word the difference on the playground:

> The `run()` function returns the output directly, which you can then use or pass as the input to another model. If you want to access the full prediction object (not just the output), use the `replicate.predictions.create()` method instead. This will include the prediction id, status, logs, etc.
Look a bit below for the words we already use: https://github.com/replicate/replicate-python?tab=readme-ov-file#run-a-model-in-the-background
"Run a model in the background"
It also shows, rather than tells, what happens, so you can clearly understand what it does.
This might just want to be a part of that section, thinking about it, otherwise this is rather disjointed.
But, gosh, let's fix these APIs as soon as possible. How horrible. I'm tempted to say let's just use this energy to fix the API. Come grab me in Gather if you want to hash out a design.
`replicate.predictions.create` should have an overload that takes a `model` kwarg:

```python
prediction = replicate.predictions.create(model="meta/meta-llama-3-8b-instruct", input={...})
```

The original `models` namespace indirection was a hedge, and I didn't have a chance to circle back to clean things up.
PR with that change here: #290
Please take a look and let me know what you think!