Persistent File Staleness Errors in GlusterFS Volume #4332

Open
meetsoni15 opened this issue Apr 10, 2024 · 2 comments

Description of problem:

We have configured a GlusterFS cluster with three peers, using six SSD drives for improved performance. The GlusterFS volume is mounted on Server 5 (which acts as both server and client), and MinIO is deployed on that mount to provide object storage.

Configuration Details:

Servers:
    Server 5: 172.16.16.10 (acts as both server and client)
    Server 6: 172.16.16.7
    Server 7: 172.16.16.6
Drives:
    Each server hosts 2 SSD drives, 1TB each, formatted with Ext4.

Issue Encountered:

After performing manual file cleanup within a specific folder on the mounted GlusterFS volume, file staleness errors started appearing. To address this, we created a new folder and refrained from further cleanup activities. However, the same file staleness error resurfaced within the new folder after 2 days.
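For context, these staleness errors reach applications on the mount as the ESTALE ("stale file handle") errno. As an illustration only, here is a minimal Go check that distinguishes this error from other stat failures; the path below is a hypothetical object location on the mount, not one taken from this report:

package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Hypothetical object path on the GlusterFS mount; adjust to the real layout.
	path := "/mnt/minio-vol/bucket/object"

	if _, err := os.Stat(path); err != nil {
		if errors.Is(err, syscall.ESTALE) {
			// The client still holds a handle the bricks no longer recognise.
			fmt.Println("stale file handle:", path)
		} else {
			fmt.Println("stat failed:", err)
		}
		return
	}
	fmt.Println("file is accessible:", path)
}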


Expected results:
The GlusterFS volume should serve files without staleness errors, providing stable data access and reliable operation.

Mandatory info:
- The output of the gluster volume info command:

Volume Name: minio-vol
Type: Distribute
Volume ID: a3bea87d-748e-4a15-80af-3343aa7608b3
Status: Started
Snapshot Count: 0
Number of Bricks: 6
Transport-type: tcp
Bricks:
Brick1: 172.16.16.6:/opt/disk1/minio
Brick2: 172.16.16.7:/opt/disk1/minio
Brick3: 172.16.16.6:/opt/disk2/minio
Brick4: 172.16.16.7:/opt/disk2/minio
Brick5: 172.16.16.10:/opt/disk2/minio
Brick6: 172.16.16.10:/opt/disk1/minio
Options Reconfigured:
performance.client-io-threads: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
cluster.eager-lock: off

- The output of the gluster volume status command:

Status of volume: minio-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 172.16.16.6:/opt/disk1/minio          54797     0          Y       665614
Brick 172.16.16.7:/opt/disk1/minio          58252     0          Y       2540584
Brick 172.16.16.6:/opt/disk2/minio          50657     0          Y       665623
Brick 172.16.16.7:/opt/disk2/minio          60607     0          Y       2540600
Brick 172.16.16.10:/opt/disk2/minio         60216     0          Y       2036048
Brick 172.16.16.10:/opt/disk1/minio         49344     0          Y       2036055

Task Status of Volume minio-vol
------------------------------------------------------------------------------
There are no active volume tasks

- The output of the gluster volume heal command:

Launching heal operation to perform index self heal on volume minio-vol has been unsuccessful:
Self-heal-daemon is disabled. Heal will not be triggered on volume minio-vol

- Provide logs present on the following locations of client and server nodes:
/var/log/glusterfs/

We are running the server and the client on the same machine, i.e. Server 5.


- Is there any crash? Provide the backtrace and coredump:
No

Additional info:

- The operating system / glusterfs version:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy
@aravindavk (Member) commented:

Remount the volume and check if the issue persists. How was the file cleaned up? Was it deleted from the mount or from the backend brick? Please share the steps used to clean up the files.

@meetsoni15 (Author) commented Apr 11, 2024

@aravindavk

We have remounted it multiple times; the issue still persists.

Files were cleaned up from the mounted folder, not from the brick.

Steps Followed for File Cleanup:

  1. We created a Golang utility function that finds files older than a certain date and deletes them (a rough sketch of this logic is included after this list).
  2. After deleting the files, we encountered the file staleness issue and found that we had to delete the associated folders as well.
  3. We deleted the folders of the deleted files.
  4. We still encountered the stale file handle error.
  5. We created a new folder in the same GlusterFS mounted directory, and after a few days we encountered the same issue, even though we didn't delete any files at all.
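For reference, a minimal sketch of the kind of cleanup logic described in step 1. The actual utility is not shown in this report; the mount path and 30-day retention window below are placeholders, and the removal of the now-empty folders (step 3) would be a separate pass:

package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// deleteOlderThan walks root and removes regular files whose modification
// time is before cutoff. An error on any entry aborts the walk.
func deleteOlderThan(root string, cutoff time.Time) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		if info.ModTime().Before(cutoff) {
			return os.Remove(path)
		}
		return nil
	})
}

func main() {
	// Placeholder mount path and retention window; the real values used by
	// the utility are not given in this report.
	root := "/mnt/minio-vol/data"
	cutoff := time.Now().AddDate(0, 0, -30)
	if err := deleteOlderThan(root, cutoff); err != nil {
		fmt.Fprintln(os.Stderr, "cleanup failed:", err)
	}
}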

