Releases: ggerganov/llama.cpp
b1488
llama : mark LLM_ARCH_STARCODER as full offload supported (#3945) as done in https://github.com/ggerganov/llama.cpp/pull/3827
b1487
cmake : MSVC instruction detection (fixed up #809) (#3923)
* Add detection code for AVX
* Only check hardware when the option is ON
* Modify per code review suggestions
* Local builds will detect the CPU
* Fix CMake style to use lowercase like everywhere else
* cleanup
* fix merge
* linux/gcc version for testing
* MSVC combines AVX2 and FMA into /arch:AVX2, so check for both
* cleanup
* MSVC-only version
* style
* Update FindSIMD.cmake
Co-authored-by: Howard Su <howard0su@gmail.com>
Co-authored-by: Jeremy Dunn <jeremydunn123@gmail.com>
b1486
ci : use intel sde when ci cpu doesn't support avx512 (#3949)
b1485
cuda : revert CUDA pool stuff (#3944)
* Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)"; this reverts commit 629f917cd6b96ba1274c49a8aab163b1b189229d.
* Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)"; this reverts commit d6069051de7165a4e06662c89257f5d2905bb156.
b1483
metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (#3938)
b1481
ggml-cuda : move row numbers to x grid dim in mmv kernels (#3921)
b1477
cuda : add ROCM aliases for CUDA pool stuff (#3918)
b1476
cmake : fix relative path to git submodule index (#3915)
b1474
cuda : fix const ptrs warning causing ROCm build issues (#3913)
b1473
cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)
* Use CUDA memory pools for async alloc/dealloc.
* If the CUDA device doesn't support memory pools, use the old implementation.
* Removed redundant cublasSetStream
Co-authored-by: Oleksii Maryshchenko <omaryshchenko@dtis.com>
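The pattern this release describes can be sketched as follows. This is a hedged illustration, not llama.cpp's actual code: it queries `cudaDevAttrMemoryPoolsSupported` and uses the stream-ordered allocator (`cudaMallocAsync`/`cudaFreeAsync`) when available, falling back to plain `cudaMalloc`/`cudaFree` otherwise. The helper names (`pool_alloc`, `pool_free`) are hypothetical.

```cuda
// Sketch of the "memory pool when available, else legacy path" pattern.
// Helper names are hypothetical; the CUDA runtime calls are real APIs
// (stream-ordered allocation requires CUDA 11.2+ and driver support).
#include <cuda_runtime.h>
#include <cstdio>

static bool device_supports_mempool(int device) {
    int supported = 0;
    cudaDeviceGetAttribute(&supported, cudaDevAttrMemoryPoolsSupported, device);
    return supported != 0;
}

static void * pool_alloc(size_t size, cudaStream_t stream, bool use_pool) {
    void * ptr = nullptr;
    if (use_pool) {
        // Stream-ordered: the allocation is ordered with work on `stream`,
        // so no device-wide synchronization is needed.
        cudaMallocAsync(&ptr, size, stream);
    } else {
        // Legacy path for devices without memory pool support.
        cudaMalloc(&ptr, size);
    }
    return ptr;
}

static void pool_free(void * ptr, cudaStream_t stream, bool use_pool) {
    if (use_pool) {
        cudaFreeAsync(ptr, stream);
    } else {
        cudaFree(ptr);
    }
}

int main() {
    const int device = 0;
    cudaSetDevice(device);
    const bool use_pool = device_supports_mempool(device);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    void * buf = pool_alloc(1 << 20, stream, use_pool); // 1 MiB scratch buffer
    pool_free(buf, stream, use_pool);

    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
    printf("memory pools %s\n", use_pool ? "used" : "unsupported, used cudaMalloc");
    return 0;
}
```

The runtime fallback is why b1485 could cleanly revert the feature: callers only ever see an opaque allocate/free pair, regardless of which path is taken underneath.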