win 10 wsl bladebit_cuda "cudaErrorMemoryAllocation : out of memory" although card has enough memory #444

Open
brause opened this issue Dec 1, 2023 · 10 comments

Comments

@brause

brause commented Dec 1, 2023

Hi there,

I am trying to create some --disk-16 CUDA plots with bladebit_cuda under WSL, but the GPU memory allocation fails.
I have run out of ideas about this error, so I am posting it here.
The output from bladebit_cuda and nvidia-smi.exe is included below.

What I tried: I reinstalled the newest NVIDIA driver and also installed an older version (546.01). Same result.
I checked whether the card is a fake (it has no VGA port and all drivers install fine), and it seems to be a genuine NVIDIA card.

Are there any more checks I can run on the card?

Regards,
Karsten

bruch@Himbeer:/mnt/j/bladebit/build-release$ ./bladebit_cuda -f 8025cdb69d131cee2264785bd9e3ff7c5f7eceeb855951bcb2e471776e7fd59a0c4bdc87a659d8fc88bd35a0ee4179b2 -p 98c3089ecadcebec5b6e7ec9e8652f87e923a065856403542afd0902802c5733f0005dba963c09b101aa028ba28d2b89 --compress 5 --benchmark cudaplot --disk-16 -t1 /mnt/h/tmp/ /mnt/j/farm/

Bladebit Chia Plotter
Version      : 3.1.0-dev
Git Commit   : e9836f8bd963321457bc86eb5d61344bfb76dcf0
Compiled With: gcc 11.4.0

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 16
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : 8025cdb69d131cee2264785bd9e3ff7c5f7eceeb855951bcb2e471776e7fd59a0c4bdc87a659d8fc88bd35a0ee4179b2
 Pool public key       : 98c3089ecadcebec5b6e7ec9e8652f87e923a065856403542afd0902802c5733f0005dba963c09b101aa028ba28d2b89
 Compression Level     : 5
 Benchmark mode        : enabled
Warning: 16G mode is experimental and still under development.
         Please use the --check <n> parameter to validate plots when using this mode.
         Direct I/O not supported in 16G mode at the moment. Disabing it.

[Bladebit CUDA Plotter]
 Host RAM            : 19 GiB
 Plot checks         : disabled

Selected cuda device 0 : NVIDIA GeForce GTX 1070
 CUDA Compute Capability   : 6.1
 SM count                  : 15
 Max blocks per SM         : 32
 Max threads per SM        : 2048
 Async Engine Count        : 5
 L2 cache size             : 2.00 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 8.00 GB
  Free                     : 7.06 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 4828776144   bytes ( 4605.08   MiB or 4.50   GiB )
Intermediate RAM required : 4378927104   bytes ( 4176.07   MiB or 4.08   GiB )
Host RAM required         : 2147483648   bytes ( 2048.00   MiB or 2.00   GiB )
Total Host RAM required   : 6976259792   bytes ( 6653.08   MiB or 6.50   GiB )
GPU RAM required          : 6163050496   bytes ( 5877.54   MiB or 5.74   GiB )
Allocating buffers...
CUDA error: 2 (0x2 ) cudaErrorMemoryAllocation : out of memory

*** Panic!!! *** Fatal Error:
CUDA error cudaErrorMemoryAllocation : out of memory.
./bladebit_cuda(_ZN7SysHost14DumpStackTraceEv+0x53)[0x56131c8fad93]
./bladebit_cuda(_Z9PanicExitv+0xf)[0x56131ca8c27f]
./bladebit_cuda(+0x7dbaf)[0x56131c8a1baf]
./bladebit_cuda(+0x85f70)[0x56131c8a9f70]
./bladebit_cuda(main+0xa61)[0x56131c89f121]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f029c285d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f029c285e40]
./bladebit_cuda(_start+0x25)[0x56131c8a09a5]
bruch@Himbeer:/mnt/j/bladebit/build-release$ nvidia-smi.exe
Fri Dec  1 17:46:35 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.01                 Driver Version: 546.01       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1070      WDDM  | 00000000:23:00.0  On |                  N/A |
|  0%   47C    P2              28W / 151W |    468MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1584    C+G   C:\Windows\System32\dwm.exe               N/A      |
|    0   N/A  N/A     11568    C+G   ...oogle\Chrome\Application\chrome.exe    N/A      |
|    0   N/A  N/A     11720    C+G   C:\Windows\explorer.exe                   N/A      |
|    0   N/A  N/A     13096    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     14188    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     17016    C+G   ...GeForce Experience\NVIDIA Share.exe    N/A      |
|    0   N/A  N/A     18572    C+G   ...5n1h2txyewy\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     19920    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     24264    C+G   ...siveControlPanel\SystemSettings.exe    N/A      |
+---------------------------------------------------------------------------------------+
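
A quick sanity check of the figures in the log above, as a minimal sketch. It assumes the "GB" values that bladebit prints for GPU memory are really GiB (an assumption; the log does not say), and the helper name headroom_gib is purely illustrative and not part of bladebit.

```python
GIB = 1024 ** 3

# Values copied from the bladebit_cuda output above.
gpu_required_bytes = 6_163_050_496    # "GPU RAM required"
gpu_free_gib = 7.06                   # GPU "Free" (assumed to be GiB)
host_required_bytes = 6_976_259_792   # "Total Host RAM required"
host_ram_gib = 19.0                   # "Host RAM"


def headroom_gib(required_bytes: int, available_gib: float) -> float:
    """Return available memory minus required memory, in GiB."""
    return available_gib - required_bytes / GIB


gpu_headroom = headroom_gib(gpu_required_bytes, gpu_free_gib)
host_headroom = headroom_gib(host_required_bytes, host_ram_gib)

print(f"GPU required : {gpu_required_bytes / GIB:.2f} GiB")
print(f"GPU headroom : {gpu_headroom:.2f} GiB")
print(f"Host headroom: {host_headroom:.2f} GiB")
```

With these numbers the GPU request (5.74 GiB) leaves about 1.32 GiB of headroom and the host request leaves about 12.50 GiB, so the reported free memory alone does not explain the allocation failure.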
@teamwest93

I had this problem too when I started.
But it resolved itself; it just went away.

@brause
Author

brause commented Dec 1, 2023

Did you use WSL Ubuntu as well?

@teamwest93

Yes.
I tried a few guides, but I don't remember which one helped: NVIDIA for WSL or Ubuntu for NVIDIA.
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local

@teamwest93

So, did it help?

@brause
Author

brause commented Dec 3, 2023

Sadly, no. I still get:

Selected cuda device 0 : NVIDIA GeForce GTX 1070
 CUDA Compute Capability   : 6.1
 SM count                  : 15
 Max blocks per SM         : 32
 Max threads per SM        : 2048
 Async Engine Count        : 5
 L2 cache size             : 2.00 MB
 L2 persist cache max size : 0.00 MB
 Stack Size                : 1.00 KB
 Memory:
  Total                    : 8.00 GB
  Free                     : 7.06 GB

Allocating buffers (this may take a few seconds)...
Kernel RAM required       : 4828776144   bytes ( 4605.08   MiB or 4.50   GiB )
Intermediate RAM required : 4378927104   bytes ( 4176.07   MiB or 4.08   GiB )
Host RAM required         : 2147483648   bytes ( 2048.00   MiB or 2.00   GiB )
Total Host RAM required   : 6976259792   bytes ( 6653.08   MiB or 6.50   GiB )
GPU RAM required          : 6163050496   bytes ( 5877.54   MiB or 5.74   GiB )
Allocating buffers...
CUDA error: 2 (0x2 ) cudaErrorMemoryAllocation : out of memory

*** Panic!!! *** Fatal Error:
CUDA error cudaErrorMemoryAllocation : out of memory.
./bladebit_cuda(_ZN7SysHost14DumpStackTraceEv+0x53)[0x562451e7fd93]
./bladebit_cuda(_Z9PanicExitv+0xf)[0x56245201127f]
./bladebit_cuda(+0x7dbaf)[0x562451e26baf]
./bladebit_cuda(+0x85f70)[0x562451e2ef70]
./bladebit_cuda(main+0xa61)[0x562451e24121]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fc44c480d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fc44c480e40]
./bladebit_cuda(_start+0x25)[0x562451e259a5]

@brause
Author

brause commented Dec 3, 2023

What does your nvidia-smi.exe report inside WSL? I could try to replicate your setup.

@teamwest93

teamwest93 commented Dec 3, 2023

Sun Dec  3 19:12:57 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.04              Driver Version: 546.17       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2070 ...    On  | 00000000:01:00.0  On |                  N/A |
| N/A   50C    P8               7W /  80W |    231MiB /  8192MiB |     27%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
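
To compare memory use on the two cards, the Memory-Usage cells from the two nvidia-smi outputs posted in this thread can be pulled out with a small script. This is only a sketch: the row strings are pasted by hand from the thread, and the function name parse_memory_usage is made up for illustration.

```python
import re

# Matches a cell such as "468MiB /  8192MiB" in an nvidia-smi table row.
MEMORY_CELL = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def parse_memory_usage(row: str) -> tuple[int, int]:
    """Return (used_mib, total_mib) from one nvidia-smi table row."""
    match = MEMORY_CELL.search(row)
    if match is None:
        raise ValueError("no Memory-Usage cell found in row")
    return int(match.group(1)), int(match.group(2))


# Rows copied from the two nvidia-smi outputs in this thread.
rows = {
    "GTX 1070": "|  0%   47C    P2  28W / 151W |    468MiB /  8192MiB |  0%  Default |",
    "RTX 2070": "| N/A 50C P8 7W / 80W | 231MiB / 8192MiB | 27% Default |",
}

usage = {name: parse_memory_usage(row) for name, row in rows.items()}
for name, (used, total) in usage.items():
    print(f"{name}: {used} MiB used, {total - used} MiB free")
```

Both cards report 8192 MiB in total with only a few hundred MiB in use (468 MiB on the GTX 1070, 231 MiB on the RTX 2070), so by this measure the two machines look similar apart from the Windows version.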

@brause
Author

brause commented Dec 3, 2023

Are you on Windows 11? It could simply be that this does not work on Windows 10.
Edit: I tried downgrading the drivers. That did not work; same problem.

@brause brause changed the title bladebit_cuda "cudaErrorMemoryAllocation : out of memory" although card has enough memory win 10 bladebit_cuda "cudaErrorMemoryAllocation : out of memory" although card has enough memory Dec 3, 2023
@brause brause changed the title win 10 bladebit_cuda "cudaErrorMemoryAllocation : out of memory" although card has enough memory win 10 wsl bladebit_cuda "cudaErrorMemoryAllocation : out of memory" although card has enough memory Dec 3, 2023
@teamwest93

Win 11, yes

@brause
Author

brause commented Dec 4, 2023

I guess that could be it. Thank you.
