
Optimize full mode #64

Open
jpdoyle opened this issue Jun 30, 2018 · 11 comments


@jpdoyle

jpdoyle commented Jun 30, 2018

Is it possible to multithread "full" mode encryption? As it is, full mode seems much slower than it should be, and I think this is because it is forced to run in single-threaded mode.

I might be able to help with this (I have some experience writing safe parallel data structures using fine-grained-locking), but if there are known limitations or correctness concerns I'd like to know about those before I start experimenting.

@netheril96
Owner

Full mode encryption does more work, and that is the major reason it is slower. Past profiling showed that the bottleneck is HMAC computation and the syscalls that access the meta file. Neither is easy to improve: HMAC speed is determined by Crypto++, and syscall speed by the OS.

Once upon a time, securefs was multithreaded in all modes. But profiling showed that multithreading barely improved performance in full mode, and it was never verified to be race-free, so I removed it.
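
(For reference, a standalone micro-benchmark along these lines shows the throughput ceiling Crypto++ imposes; the 32-byte key and 4 KiB block size below are placeholders, not securefs's actual parameters.)

```cpp
// Micro-benchmark sketch: measure raw HMAC-SHA256 throughput via Crypto++.
#include <cryptopp/hmac.h>
#include <cryptopp/sha.h>
#include <chrono>
#include <cstdio>
#include <vector>

int main()
{
    std::vector<unsigned char> key(32, 0x42), block(4096, 0);
    unsigned char digest[CryptoPP::SHA256::DIGESTSIZE];
    CryptoPP::HMAC<CryptoPP::SHA256> hmac(key.data(), key.size());

    const int iters = 100000;
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i)
    {
        hmac.Update(block.data(), block.size());
        hmac.Final(digest); // Final() also restarts the MAC for the next iteration
    }
    auto t1 = std::chrono::steady_clock::now();

    double secs = std::chrono::duration<double>(t1 - t0).count();
    std::printf("HMAC-SHA256: %.1f MB/s\n",
                iters * (double)block.size() / secs / 1e6);
}
```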

@jpdoyle
Author

jpdoyle commented Jul 1, 2018

Okay, it's good to hear that it's been profiled. I wonder if there's performance to be gained from some sort of in-memory caching of the B-trees, or from deferring high-latency tasks to background worker threads for better overall throughput. I'll read through operations.cpp when I get the chance, and I'll send you a pull request if I make any progress!

Do you have any standard workloads or profiling techniques? If I start tweaking things I'd like to keep consistent with the performance measures you're already using.

@jpdoyle changed the title from "Multithread version 2" to "Optimize full mode" on Jul 1, 2018
@jpdoyle
Author

jpdoyle commented Jul 1, 2018

If I'm reading operations.cpp correctly, most filesystem calls go through `open_base_dir` (`FileGuard open_base_dir(FileSystemContext* fs, const char* path, std::string& last_component)`), which traverses the whole directory path from the root on every call to determine the correct ID. A cache for the path -> ID mapping should significantly cut the disk reads and decryption overhead, especially for deeply nested paths. The tricky part will be invalidating cache entries when they go out of sync -- but I think the only operations that can change the path -> ID mapping are moves and deletions.
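
A minimal sketch of the cache I have in mind (the names and the 32-byte id_type are my assumptions, not securefs's actual types, and the locking is the coarsest possible):

```cpp
// Sketch of a thread-safe path -> ID cache with subtree invalidation.
#include <array>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

using id_type = std::array<unsigned char, 32>; // assumed ID width

class PathIDCache
{
public:
    std::optional<id_type> lookup(const std::string& path)
    {
        std::lock_guard<std::mutex> lg(mu_);
        auto it = map_.find(path);
        if (it == map_.end())
            return std::nullopt;
        return it->second;
    }

    void insert(const std::string& path, const id_type& id)
    {
        std::lock_guard<std::mutex> lg(mu_);
        map_[path] = id;
    }

    // Called on rename/unlink: drop the path and everything beneath it,
    // since descendants' IDs were resolved through this prefix.
    void invalidate_subtree(const std::string& path)
    {
        std::lock_guard<std::mutex> lg(mu_);
        for (auto it = map_.begin(); it != map_.end();)
        {
            const std::string& p = it->first;
            bool under = p.size() >= path.size()
                && p.compare(0, path.size(), path) == 0
                && (p.size() == path.size() || p[path.size()] == '/');
            if (under)
                it = map_.erase(it);
            else
                ++it;
        }
    }

private:
    std::mutex mu_;
    std::unordered_map<std::string, id_type> map_;
};
```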

@jpdoyle
Author

jpdoyle commented Jul 1, 2018

I ran an experiment and this kind of caching seems to have a negligible effect. However, a lot of time appears to be spent "evicting" closed files from the cache. I'm looking into what takes so long when freeing a FileBase right now. There might also be some merit to adding a secondary victim cache if optimizing file closing turns out to be infeasible. The victim cache could be drained asynchronously by a background thread or two, so the main thread doesn't block on every eviction -- roughly like the sketch below.
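
(The interface here is hypothetical, not the actual FileTable API:)

```cpp
// Sketch: hand the expensive flush/close of evicted files to one
// background worker so the calling (FUSE) thread never blocks on it.
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

class EvictionQueue
{
public:
    EvictionQueue() : worker_([this] { run(); }) {}

    ~EvictionQueue()
    {
        {
            std::lock_guard<std::mutex> lg(mu_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Enqueue the flush+close of an evicted file; returns immediately.
    void defer(std::function<void()> close_task)
    {
        {
            std::lock_guard<std::mutex> lg(mu_);
            tasks_.push_back(std::move(close_task));
        }
        cv_.notify_one();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lk(mu_);
        while (!done_ || !tasks_.empty())
        {
            cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
            while (!tasks_.empty())
            {
                auto task = std::move(tasks_.front());
                tasks_.pop_front();
                lk.unlock();
                task(); // flush + close outside the lock
                lk.lock();
            }
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> tasks_;
    bool done_ = false;
    std::thread worker_;
};
```

Re-opening a file whose deferred close is still pending would have to wait for (or cancel) the queued task, which is the tricky synchronization point.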

@netheril96
Owner

Did you profile the program and find that the time is spent evicting closed files?

@jpdoyle
Author

jpdoyle commented Jul 1, 2018

I ran securefs mount with --trace, then ran ls on the directory. All the FS read operations and stats went by quickly, but there was a pause at each "Evicting" message. I now have a custom build of securefs that changes FileTable's behavior to defer deleting these objects to a thread pool; if the FileTable is asked to re-open a file, it waits for the pool to finish closing it first. I also included the path -> ID cache I implemented earlier, though I'm not sure how much that factors in. I seem to be getting over an order of magnitude of speed improvement:

```
# Using securefs 0.8.2, installed via Nix
$ time ls
...
ls --color=tty  0.01s user 0.02s system 0% cpu 2:10.97 total

# Using my build
$ time ls
...
ls --color=tty  0.01s user 0.01s system 5% cpu 0.259 total
```

I'm using a USB3 flash drive, if that changes anything about the performance expectations.

@netheril96
Owner

netheril96 commented Jul 1, 2018

When you time ls, do you mount securefs with the --trace option?

I considered a design like yours before, but did not choose it, because I was not confident that the filesystem could remain intact if securefs or the OS crashed while any kind of caching was involved. Caching while maintaining filesystem integrity would probably require journaling, which adds complexity and its own performance cost.

I have not thought this through thoroughly, though. If you think it is possible to maintain filesystem integrity across crashes, then I might implement the caching or accept your PR.
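
For what it's worth, the simplest form of journaling is write-ahead logging: append a record describing each mutation and fsync it before touching the data files, then replay or discard unfinished records on mount. A bare-bones sketch (purely illustrative; securefs has no such component):

```cpp
// Sketch of a write-ahead journal: a mutation may touch the real data
// files only after its journal record is durably on disk.
#include <fcntl.h>
#include <unistd.h>
#include <string>

class Journal
{
public:
    explicit Journal(const char* path)
        : fd_(::open(path, O_WRONLY | O_CREAT | O_APPEND, 0600))
    {
    }

    ~Journal()
    {
        if (fd_ >= 0)
            ::close(fd_);
    }

    // Append one record and force it to disk before the caller proceeds.
    bool log(const std::string& record)
    {
        std::string line = record + "\n";
        if (::write(fd_, line.data(), line.size()) != (ssize_t)line.size())
            return false;
        return ::fsync(fd_) == 0;
    }

private:
    int fd_;
};

// Usage: journal.log("rename <old-id> -> <new-id>"); then perform the
// rename; on mount, replay any records not marked complete.
```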

@netheril96
Owner

Wait a minute. I have forgotten the design rationale for the full mode. I need to think this over carefully.

@jpdoyle
Author

jpdoyle commented Jul 1, 2018

Those runs of time ls were both with securefs mount (no --trace), and I ran time ls as the first thing after mounting it, so the path cache was starting cold.

The concern I'd have with this in-memory cache is that if something other than securefs directly modifies any of the encrypted files, securefs might crash, or worse, corrupt the file structure. It might be a good idea to do some sort of inotify watch to catch this.
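
A bare-bones example of such a watch on Linux (the watched path is a placeholder; a real integration would watch every directory whose entries are cached, and other platforms need kqueue/FSEvents equivalents):

```cpp
// Sketch: detect external modification of the ciphertext directory with
// inotify, so in-memory caches can be dropped when files change underneath us.
#include <sys/inotify.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = inotify_init1(IN_CLOEXEC);
    if (fd < 0) { std::perror("inotify_init1"); return 1; }

    int wd = inotify_add_watch(fd, "/path/to/ciphertext",
                               IN_MODIFY | IN_MOVED_FROM | IN_MOVED_TO | IN_DELETE);
    if (wd < 0) { std::perror("inotify_add_watch"); return 1; }

    alignas(inotify_event) char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf)); // blocks until something changes
    if (n < 0) { std::perror("read"); return 1; }

    for (char* p = buf; p < buf + n;)
    {
        auto* ev = reinterpret_cast<inotify_event*>(p);
        // Here the cache entries for ev->name would be invalidated.
        std::printf("event mask=0x%x name=%s\n",
                    (unsigned)ev->mask, ev->len ? ev->name : "");
        p += sizeof(inotify_event) + ev->len;
    }
    close(fd);
}
```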

@jpdoyle
Author

jpdoyle commented Jul 1, 2018

(external modification of the encrypted files is plausible when the ciphertext is synchronized via a cloud service while the FS is mounted locally)

@netheril96
Owner

Send me your modified version as a pull request, even if it is just a draft. It is hard to discuss without a concrete implementation.
