
Option to use CoreML only on iOS without CPU fallback #93

Open
jobpaardekooper opened this issue Aug 7, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@jobpaardekooper
Contributor

jobpaardekooper commented Aug 7, 2023

Is there a reason why we can't use CoreML exclusively on iOS, for example? Right now we still need to bundle a regular CPU model, but it's unclear whether it is ever used.

The documentation states that the library might fall back to the CPU when you try to use CoreML. It would be nice if the docs also explained why this happens. When does it fall back to CPU mode? Or does that only happen on Android? It is not really clear to me from the current documentation.

Thanks for the great work on this library!

@UchennaOkafor

I agree with this. I noticed the CoreML .mlmodelc files are smaller than the ggml .bin files, so if we could use only the CoreML files on iOS, we could ship a smaller app.

@jhen0409
Member

jhen0409 commented Aug 8, 2023

At this time the ggml model is still used as the decoder, so the Core ML model only replaces the encoder. This is not ideal since it takes up more memory; we can look into how to improve this.

The documentation states that the library might fall back to the CPU when you try to use CoreML. It would be nice if the docs also explained why this happens.

By default we build with the WHISPER_COREML_ALLOW_FALLBACK compiler flag, so it will fall back to CPU if the Core ML model fails to load (see this code).
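For reference, the logic in whisper.cpp looks roughly like this. This is a simplified sketch of the Core ML path in whisper_init_state(); the exact code differs between versions, and the failure reasons in the comment are only examples:

```cpp
// Simplified sketch of the Core ML init path in whisper.cpp (whisper_init_state).
#ifdef WHISPER_USE_COREML
    const auto path_coreml = whisper_get_coreml_path_encoder(ctx->path_model);

    state->ctx_coreml = whisper_coreml_init(path_coreml.c_str());
    if (!state->ctx_coreml) {
        // Core ML model failed to load (e.g. missing .mlmodelc, unsupported OS)
#ifndef WHISPER_COREML_ALLOW_FALLBACK
        return nullptr;            // strict mode: context init fails
#endif
        // fallback mode: continue, the regular ggml (CPU) encoder will be used
    }
#endif
```

So the behavior is decided at native build time: with the flag defined, a failed Core ML load just means the regular ggml encoder is used; without it, context creation fails instead of silently falling back.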

For easier debugging, we may want to add a field like usedCoreML to the Context instance.

@jhen0409 added the enhancement (New feature or request) label on Aug 8, 2023
@jobpaardekooper
Contributor Author

What does PR #123 add? I am not very familiar with all this ML stuff. I thought it might have something to do with this issue, but now I think maybe not.

Is it just a different (faster) way to allocate memory on Apple devices so the context init will be faster? Or is it something different? Sorry for not understanding, but I want to learn.

@jhen0409
Member

jhen0409 commented Oct 6, 2023

What does PR #123 add? I am not very familiar with all this ML stuff. I thought it might have something to do with this issue, but now I think maybe not.

Is it just a different (faster) way to allocate memory on Apple devices so the context init will be faster? Or is it something different? Sorry for not understanding, but I want to learn.

ggml-alloc basically reduces the memory usage of the model compared to before.
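Roughly, instead of reserving a large fixed buffer for the compute graph, ggml-alloc lets you run a "measure" pass to find out exactly how much memory the graph's tensors need, then allocate just that once and reuse it. A minimal sketch against the ggml-alloc API of that era; build_graph is a placeholder for whatever constructs the model's compute graph, and the details (alignment, rebuilding the graph in a fresh context for the second pass) are approximate:

```cpp
#include "ggml.h"
#include "ggml-alloc.h"
#include <cstdlib>

// Placeholder for the function that builds the model's ggml compute graph.
extern ggml_cgraph * build_graph(ggml_context * ctx);

void run_with_ggml_alloc(ggml_context * ctx) {
    const size_t alignment = 32;

    // 1) Measure pass: no real buffer is backed yet; ggml-alloc only
    //    computes the worst-case memory the graph's tensors require.
    ggml_allocr * allocr = ggml_allocr_new_measure(alignment);
    const size_t mem_size = ggml_allocr_alloc_graph(allocr, build_graph(ctx));
    ggml_allocr_free(allocr);

    // 2) Allocate exactly that much once and reuse it for every inference.
    //    (Real code rebuilds the graph in a fresh context here.)
    void * buf = std::malloc(mem_size);
    allocr = ggml_allocr_new(buf, mem_size, alignment);
    ggml_allocr_alloc_graph(allocr, build_graph(ctx));
    // ... run the graph (ggml_graph_compute / backend compute) ...

    ggml_allocr_free(allocr);
    std::free(buf);
}
```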

ggml-metal allows GGML to access GPU resources on Apple devices. If you use it, you don't need to load the CoreML model separately and you can get similar performance, but this depends on the performance difference between the GPU and the Neural Engine on the device; the results on Mac / iPhone will be different. Currently it's not enabled yet, as I mentioned in #123 (comment).

@jobpaardekooper
Contributor Author

Thanks for explaining! Do you know any good or interesting resources to read and learn more about this stuff?

@jhen0409
Member

jhen0409 commented Oct 9, 2023

Thanks for explaining! Do you know any good or interesting resources to read and learn more about this stuff?

Do you mean ML? I'm not an expert, so I'm afraid I can't provide helpful resources.

If you want to learn about GGML, I would recommend watching ggml / llama.cpp / whisper.cpp and the community projects that use GGML. Especially llama.cpp; most things happen in that repo.
