Extremely long cold boot times #242

UmarRamzan · 2024-02-05T09:40:48Z

Is there any way to store large models in some kind of network storage to avoid long cold boot times?

mattt · 2024-02-05T17:14:59Z

Hi @UmarRamzan. I hear you: large models can take a while to set up from a cold boot. We do what we can to optimize network storage and caches, but at a certain point you're bound by how fast the hardware can transfer and load on the order of 100 GB of weights into GPU VRAM.

We have some docs about cold boots here: https://replicate.com/docs/how-does-replicate-work#cold-boots

If your application is sensitive to long cold starts, you can try creating a deployment and configuring a certain number of instances to always be running.

What model are you seeing this for?

UmarRamzan · 2024-02-17T11:21:39Z

Hi, creating a deployment is not financially feasible for us. We're currently using Whisper large-v3, which is around 8 GB and takes a minute or two to load. When I originally asked the question, I had something like this in mind: https://modal.com/docs/guide/checkpointing
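
For a sense of scale, here is a rough sketch of what the figures above (about 8 GB, one to two minutes) imply as an average load rate. It uses only those two numbers. The function name is made up for illustration, and the result is an effective end-to-end rate, since the load time presumably also covers container startup and deserialization, not just the network transfer.

```python
def implied_throughput_mb_s(size_gb: float, load_seconds: float) -> float:
    """Average end-to-end rate in MB/s, using decimal units (1 GB = 1000 MB)."""
    return size_gb * 1000 / load_seconds


# Figures quoted in this thread: ~8 GB of weights, loaded in one to two minutes.
for seconds in (60, 120):
    rate = implied_throughput_mb_s(8, seconds)
    print(f"{seconds} s -> {rate:.0f} MB/s")
```

So the cold boot averages somewhere around 67 to 133 MB/s end to end, which is the number a checkpointing or faster-storage approach would need to beat.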
