feat(forge/llm): Add LlamafileProvider
#7091
Conversation
…der for llamafiles. Currently it just extends OpenAIProvider and only overrides methods that are necessary to get the system to work at a basic level. Update ModelProviderName schema and config/configurator so that app startup using this provider is handled correctly. Add 'mistral-7b-instruct-v0' to OpenAIModelName/OPEN_AI_CHAT_MODELS registries.
…-Instruct chat template, which supports the 'user' & 'assistant' roles but does not support the 'system' role.
…kens`, and `get_tokenizer` from classmethods so I can override them in LlamafileProvider (and so I can access instance attributes from inside them). Implement class `LlamafileTokenizer` that calls the llamafile server's `/tokenize` API endpoint.
…tes on the integration; add helper scripts for downloading/running a llamafile + example env file.
…gs for reproducibility
…ange serve.sh to use the model's full context size (this does not seem to cause OOM errors, surprisingly).
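The `/tokenize` integration mentioned in the notes above can be pictured with a minimal sketch. This is illustrative, not the PR's code: it assumes a llamafile server on localhost and the llama.cpp-style `/tokenize`/`/detokenize` endpoints that llamafile embeds; the class and method names mirror the commit note.

```python
import requests


class LlamafileTokenizer:
    """Minimal sketch: tokenize text via a local llamafile server."""

    def __init__(self, base_url: str = "http://localhost:8080"):
        self.base_url = base_url

    def encode(self, text: str) -> list[int]:
        # The server returns JSON of the form {"tokens": [...]}
        response = requests.post(
            f"{self.base_url}/tokenize", json={"content": text}, timeout=30
        )
        response.raise_for_status()
        return response.json()["tokens"]

    def decode(self, tokens: list[int]) -> str:
        # Inverse mapping via the /detokenize endpoint
        response = requests.post(
            f"{self.base_url}/detokenize", json={"tokens": tokens}, timeout=30
        )
        response.raise_for_status()
        return response.json()["content"]
```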
✅ Deploy Preview for auto-gpt-docs ready!
This pull request has conflicts with the base branch; please resolve those so we can evaluate the pull request.

@CodiumAI-Agent /review
@k8si any chance you could enable maintainer write access on this PR?

@Pwuts it doesn't look like I have the ability to do that. I added you as a maintainer to the forked project, is that sufficient or do others need write access? Alternatively, you could branch off my branch and I can just accept the changes via PR?

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
…vider`, `GroqProvider` and `LlamafileProvider` and rebase the latter three on `BaseOpenAIProvider`
Codecov Report

Attention: Patch coverage is below target.

```diff
@@            Coverage Diff             @@
##           master    #7091      +/-   ##
==========================================
+ Coverage   35.01%   42.38%    +7.36%
==========================================
  Files          18       84       +66
  Lines        1211     4889     +3678
  Branches      179      676      +497
==========================================
+ Hits          424     2072     +1648
- Misses        758     2724     +1966
- Partials       29       93       +64
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
In general, a solid step towards using local LLMs!
Some comments should be changed to docstrings, and the PR adds many TODOs/FIXMEs.
```python
# Clean up model names:
# 1. Remove file extension
# 2. Remove quantization info
# e.g. 'mistral-7b-instruct-v0.2.Q5_K_M.gguf'
#   -> 'mistral-7b-instruct-v0.2'
# e.g. '/Users/kate/models/mistral-7b-instruct-v0.2.Q5_K_M.gguf'
#   -> 'mistral-7b-instruct-v0.2'
# e.g. 'llava-v1.5-7b-q4.gguf'
#   -> 'llava-v1.5-7b'
def clean_model_name(model_file: str) -> str:
```
I like that the explanation is added here!
nit: we use """comment""" just below the member:
```diff
-# Clean up model names:
-# 1. Remove file extension
-# 2. Remove quantization info
-# e.g. 'mistral-7b-instruct-v0.2.Q5_K_M.gguf'
-#   -> 'mistral-7b-instruct-v0.2'
-# e.g. '/Users/kate/models/mistral-7b-instruct-v0.2.Q5_K_M.gguf'
-#   -> 'mistral-7b-instruct-v0.2'
-# e.g. 'llava-v1.5-7b-q4.gguf'
-#   -> 'llava-v1.5-7b'
-def clean_model_name(model_file: str) -> str:
+def clean_model_name(model_file: str) -> str:
+    """Clean up model names:
+    1. Remove file extension
+    2. Remove quantization info
+    e.g. 'mistral-7b-instruct-v0.2.Q5_K_M.gguf'
+      -> 'mistral-7b-instruct-v0.2'
+    e.g. '/Users/kate/models/mistral-7b-instruct-v0.2.Q5_K_M.gguf'
+      -> 'mistral-7b-instruct-v0.2'
+    e.g. 'llava-v1.5-7b-q4.gguf'
+      -> 'llava-v1.5-7b'"""
```
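For reference, a hypothetical implementation that satisfies the documented examples; the quantization-suffix regex is an assumption for illustration, not the PR's actual code:

```python
import re
from pathlib import Path


def clean_model_name(model_file: str) -> str:
    name = Path(model_file).name       # drop any directory components
    name = name.removesuffix(".gguf")  # 1. remove the file extension
    # 2. remove a trailing quantization tag such as '.Q5_K_M' or '-q4'
    return re.sub(r"[.\-][Qq]\d\w*$", "", name)


assert clean_model_name("mistral-7b-instruct-v0.2.Q5_K_M.gguf") == "mistral-7b-instruct-v0.2"
assert clean_model_name("llava-v1.5-7b-q4.gguf") == "llava-v1.5-7b"
```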
If this is just an example then fine, but otherwise LLAMAFILE could be provided as an argument.
```python
def _adapt_chat_messages_for_mistral_instruct(
    self, messages: list[ChatCompletionMessageParam]
) -> list[ChatCompletionMessageParam]:
```
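To make the signature concrete, here is a simplified, dict-based sketch of the idea: since the Mistral-7B-Instruct chat template only supports alternating 'user'/'assistant' turns, 'system' content is folded into the user role and consecutive user messages are merged. The PR's actual method operates on `ChatCompletionMessageParam` objects and may merge differently.

```python
def adapt_chat_messages_for_mistral_instruct(
    messages: list[dict[str, str]]
) -> list[dict[str, str]]:
    adapted: list[dict[str, str]] = []
    for message in messages:
        # Mistral-Instruct has no 'system' role; treat it as user input
        role = "user" if message["role"] == "system" else message["role"]
        if adapted and adapted[-1]["role"] == role == "user":
            # Merge consecutive user messages to preserve alternation
            adapted[-1]["content"] += "\n\n" + message["content"]
        else:
            adapted.append({"role": role, "content": message["content"]})
    return adapted
```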
For future: the message format could be abstracted/handled by ModelProvider and assigned to each model.
What do you mean?
Doesn't seem to work for me when using `./scripts/llamafile/serve.py`:

```
PS C:\Users\nicka\code\AutoGPTNew\autogpt> python3 .\scripts\llamafile\serve.py
Downloading mistral-7b-instruct-v0.2.Q5_K_M.llamafile.exe...
Downloading: [########################################] 100% - 5166.9/5166.9 MB
Traceback (most recent call last):
  File "C:\Users\nicka\code\AutoGPTNew\autogpt\scripts\llamafile\serve.py", line 56, in <module>
    download_llamafile()
  File "C:\Users\nicka\code\AutoGPTNew\autogpt\scripts\llamafile\serve.py", line 43, in download_llamafile
    subprocess.run([LLAMAFILE, "--version"], check=True)
  File "C:\Users\nicka\.pyenv\pyenv-win\versions\3.11.7\Lib\subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nicka\.pyenv\pyenv-win\versions\3.11.7\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\nicka\.pyenv\pyenv-win\versions\3.11.7\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [WinError 193] %1 is not a valid Win32 application
```

Important sidenote: that binary can't be executed directly on Windows, so Python of course fails.
I also get this after using the workaround above:

```ini
################################################################################
### AutoGPT - GENERAL SETTINGS
################################################################################

## OPENAI_API_KEY - OpenAI API Key (Example: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
# OPENAI_API_KEY=

## ANTHROPIC_API_KEY - Anthropic API Key (Example: sk-ant-api03-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
# ANTHROPIC_API_KEY=

## GROQ_API_KEY - Groq API Key (Example: gsk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
# GROQ_API_KEY=

################################################################################
### LLM MODELS
################################################################################

## SMART_LLM - Smart language model (Default: gpt-4-turbo)
SMART_LLM=mistral-7b-instruct-v0.2

## FAST_LLM - Fast language model (Default: gpt-3.5-turbo)
FAST_LLM=mistral-7b-instruct-v0.2

## EMBEDDING_MODEL - Model to use for creating embeddings
# EMBEDDING_MODEL=text-embedding-3-small
```
Background
This draft PR is a step toward enabling the use of local models in AutoGPT by adding llamafile as an LLM provider.
Implementation notes are included in `forge/forge/llm/providers/llamafile/README.md`.
Related issues:

Depends on:

- `BaseOpenAIProvider` -> deduplicate `GroqProvider` & `OpenAIProvider` #7178

Changes 🏗️

- Add minimal implementation of `LlamafileProvider`, a new `ChatModelProvider` for llamafiles. It extends `BaseOpenAIProvider` and only overrides methods that are necessary to get the system to work at a basic level.
- Add support for `mistral-7b-instruct-v0.2`. This is the only model currently supported by `LlamafileProvider`, because this is the only model I tested anything with.
- Misc changes to app configuration to enable switching between openai/llamafile providers. In particular, added config field `LLM_PROVIDER` that, when set to 'llamafile', will use `LlamafileProvider` in agents rather than `OpenAIProvider` (see the sketch below).
- Add instructions to use AutoGPT with llamafile in the docs at `autogpt/setup/index.md`
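A simplified sketch of that switch; the function name and import path are assumptions based on the PR's layout, and the real wiring goes through the app's configurator:

```python
import os

# Assumed import path; the PR adds the provider under forge/llm/providers
from forge.llm.providers import LlamafileProvider, OpenAIProvider


def get_llm_provider():
    """Pick the chat provider based on the LLM_PROVIDER config field."""
    if os.getenv("LLM_PROVIDER") == "llamafile":
        return LlamafileProvider()  # local llamafile server
    return OpenAIProvider()  # default: OpenAI API
```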
Limitations:
PR Quality Scorecard ✨

- +2 pts
- +5 pts
- +5 pts
- +5 pts
- -4 pts
- +4 pts
- +5 pts
- -5 pts
- `agbenchmark` to verify that these changes do not regress performance? +10 pts