libsmb2 not supporting certain characters #231

Open
geofstro opened this issue Aug 1, 2022 · 36 comments
Comments

@geofstro

geofstro commented Aug 1, 2022

I opened an issue on the AMSMB2 pages, since it does not allow extracting data from files with certain Latin characters (e.g. ñ, ì).

I notice some work was done here recently on unicode and hope this will fix the problem.

Is it enough to do the following…

git clone https://github.com/amosavian/AMSMB2
cd AMSMB2/buildtools
./build.sh

To update to the latest libsmb2 within AMSMB2? Or does the Author of AMSMB2 need to do something?

Apologies, as this is not really an issue with libsmb2 as such, I believe; just me not understanding how to ensure I'm using the latest version with AMSMB2, and assuming my unicode issue has been fixed in libsmb2 itself with the latest pull request.

@sahlberg
Owner

sahlberg commented Aug 1, 2022

I think so.
Note, I do not have access to iOS and cannot test AMSMB2, but I think that should work.

@geofstro
Author

geofstro commented Aug 2, 2022

Thanks for your response. I rebuilt libsmb2 using the above commands to produce the iOS version. When I tested, it did not cure my problem with those Latin characters from Windows shares.

So right now, I don't know:

a) Whether the rebuilt version includes the fix for this, or whether it just results in the same version of libsmb2 for iOS being built that was already in the AMSMB2 distribution.

b) Whether the Unicode changes submitted here did cure this problem. Perhaps it was just wishful thinking on my part.

I'm referring, of course, to this fix:

#207

@geofstro
Author

geofstro commented Aug 2, 2022 via email

@delins

delins commented Aug 2, 2022

I don't have time to dive very deep into this at the moment, but my first guess is that it should have worked even before #208. ñ can be expressed using a single UTF-16 unit, so the #208 fix wasn't necessary for it, unless Swift does some transcoding (as discussed in this Stack Overflow post) before handing the filename off to libsmb2. I doubt that, though.

How did you rebuild AMSMB2 with libsmb2? AMSMB2 comes bundled with its own statically built version of libsmb2 and refers to it in https://github.com/amosavian/AMSMB2/blob/master/Package.swift. My guess is that you should replace the static libraries in https://github.com/amosavian/AMSMB2/tree/master/libsmb2/lib with the ones you built yourself. If you built libsmb2 as a dynamic library (.so) that won't do you any good in this case since it won't get picked up.

I've been using libsmb2 with the #208 fix for a while now and haven't got any issues with weird unicode characters anymore. But I only use dirlisting at the moment, no file reading, copying or renaming.

Could you name a file abc😀.txt and test how AMSMB2 handles that with your use cases? The smiley uses a surrogate pair in UTF-16, so perhaps we'll learn something from the results.
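To make the difference between ñ and 😀 concrete, here is a minimal sketch of how a single Unicode code point maps to UTF-16 code units. The function name `utf16_encode` is made up for illustration; this is not libsmb2's conversion code, only the standard UTF-16 encoding rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libsmb2): encode one Unicode code point
 * as UTF-16 code units. Returns the number of units written to out
 * (1 or 2), or 0 if cp is not a valid code point to encode. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Surrogate values and anything above U+10FFFF cannot be encoded. */
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    /* Code points in the Basic Multilingual Plane fit in a single unit. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Everything else is split into a high and a low surrogate. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    return 2;
}
```

With this rule, ñ (U+00F1) becomes the single unit 0x00F1, while 😀 (U+1F600) becomes the surrogate pair 0xD83D 0xDE00, which is why only the emoji would exercise the surrogate-pair handling.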

Also, did you do anything with the locale of your application? My guess is that by default everything in iOS uses UTF-8 (which should be OK), but if you changed it that might cause issues with character encoding.

@geofstro
Author

geofstro commented Aug 10, 2022

I understand and thanks for your response anyway.

I tried an example file named abc😀.flac (FLAC because it's a music file) and got the same error as with the ñ character…

NSLocalizedFailureReason=Error code 2: Open failed with (0xc0000034) STATUS_OBJECT_NAME_NOT_FOUND.}

As I mentioned before, I have no issue with listing these files either; but I need to extract the data from them, move or download them. Swift uses UTF-8. So I don’t know if that would be an issue. I don’t think so either.

I understand it’s a licensing requirement of AMSMB2 to include libsmb2 as a static library, if I plan to distribute my app commercially, which I intend to do.

From the read me of AMSMB2…

"While source code shipped with project is MIT licensed, but it has static link to libsmb2 which is LGPL v2.1, consequently the whole project becomes LGPL v2.1.

You must link this library dynamically to your app if you intend to distribute your app on App Store."

It looks to me as though libsmb2 is not able to fully support filenames with Latin characters from Windows shares. The same problem occurred with the emoji in the filename.

I consider this a significant limitation!

@delins

delins commented Aug 10, 2022

I'm not a licensing lawyer, so don't take it as fact, but I'm pretty sure it's the opposite: as libsmb2 is LGPL, you need to link to it dynamically, i.e. ship it alongside your application as an .so file so that it's loaded when your application starts, and not link against the .a file. Perhaps this is what you meant.

I agree it would be a significant limitation.

Btw, how do you load the file paths into your program? Do you perform a dirlisting with libsmb2 and then use the returned filenames to create the file path, or do you load the paths from a file, stdin, or hardcoded strings in your application? Any of those places may mess up the encoding as well, especially if you develop on Windows.

@geofstro
Author

ie ship it alongside your application as an .so file so that it's loaded when your application starts, and not link against the .a file. Perhaps this is what you meant.

Yes, that is what I meant. Sorry for the confusion.

Do you perform a dirlisting with libsmb2, and then use the returned filenames to create the file path

This is what I do. For Windows, strangely I have to drop the first slash from the returned filename to be able to fetch the data, which works when those characters aren't present. For Mac and Linux, the first slash didn't need to be dropped.

@delins

delins commented Aug 10, 2022

I've been reading up on Swift string encodings. It appears that prior to Swift 5, it used either ASCII or UTF-16. I'm not sure which would have been default, but I guess it's ASCII. AMSMB2 creates strings from C's cstring wherever it interfaces with libsmb2, and my best guess is that it somewhere interprets the names as ASCII strings instead of UTF-8 strings (which is what libsmb2 returns).

You might try peppering explicit UTF-8 interpretations into AMSMB2's cString stuff, or perhaps just hardcode a path with the ñ character into some test code to see what that does (and make sure that the Swift string you create with it interprets the string as UTF-8).

@geofstro
Author

I think I read somewhere Swift used UTF-16 prior to Swift 5. Thanks for the ideas, I will try them out. I don't understand why this is only an issue for Windows SMB shares though. All characters in file names work just fine when the SMB share is MacOSX or Linux.

@delins

delins commented Aug 10, 2022

Ah I didn't know you didn't have this problem with MacOSX and Linux shares. That's odd indeed, as far as I know the SMB specification states that path encodings should be UTF-16 LE, and there's no way for a server to specify something else. It may have something to do with the SMB versions, but I doubt it.
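As a reference for what a UTF-16 LE path looks like byte by byte, here is a rough sketch that converts a UTF-8 string into UTF-16 LE. The function `utf8_to_utf16le` is a hypothetical illustration, not the routine libsmb2 uses, and it deliberately skips some strictness checks (for example, overlong UTF-8 encodings are not rejected).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libsmb2 code): convert a NUL-terminated UTF-8
 * string to UTF-16 LE bytes. Returns the number of bytes written, or -1 on
 * malformed input or if dst is too small. Overlong encodings are not
 * detected in this sketch. */
long utf8_to_utf16le(const char *src, unsigned char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p) {
        uint32_t cp;
        int extra; /* number of continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;
        p++;

        /* A NUL or any non-continuation byte here means truncated input. */
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return -1;
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;

        uint16_t units[2];
        size_t count;
        if (cp < 0x10000) {
            units[0] = (uint16_t)cp;
            count = 1;
        } else {
            cp -= 0x10000;
            units[0] = (uint16_t)(0xD800 | (cp >> 10));
            units[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
            count = 2;
        }

        if (n + 2 * count > dst_size)
            return -1;
        for (size_t i = 0; i < count; i++) {
            dst[n++] = (unsigned char)(units[i] & 0xFF); /* low byte first */
            dst[n++] = (unsigned char)(units[i] >> 8);
        }
    }
    return (long)n;
}
```

Under these assumptions, ñ (UTF-8 bytes c3 b1) comes out as f1 00, and 😀 (UTF-8 bytes f0 9f 98 80) comes out as 3d d8 00 de. Those are the byte patterns one would expect to find for the file name in a packet capture.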

@geofstro
Author

I should have probably made that clear in the title, as I did on the AMSMB2 issues page.

@geofstro geofstro changed the title AMSMB2 not supporting certain latin characters AMSMB2 not supporting certain latin characters from Windows shares Aug 11, 2022
@geofstro geofstro changed the title AMSMB2 not supporting certain latin characters from Windows shares libsmb2 not supporting certain characters from Windows shares Aug 11, 2022
@geofstro geofstro changed the title libsmb2 not supporting certain characters from Windows shares libsmb2 not supporting certain characters Sep 15, 2022
@geofstro
Author

geofstro commented Sep 15, 2022

I have changed the title back again, because further testing has shown that libsmb2 cannot handle these characters on Linux either.

A major problem as most NASs are Linux based.

I will do some further testing on Mac to determine if MacOS is the one exception.

@geofstro
Author

Further testing confirmed that copying, downloading, or reading data does not work for any file whose path contains non-English characters.

This is true of both Linux and Windows SMB shares.

The only exception is when the SMB share is MacOS, where all works fine.

Before, I was wrongly assuming that because it worked on MacOS, Linux would also be fine; but that's not the case.

This has created a major problem for my IOS Swift app using the AMSMB2 wrapper lib.

I am trying to understand where the fault lies.

These file paths can be listed just fine and, of course, the Latin characters show up correctly in Windows and Linux file browsers.

I don't see the problem being with AMSMB2, as it's just a wrapper for this lib.

The problem seems to be with this lib's handling of file paths on Linux and Windows, for all operations that would involve data reading.

I hope someone can help. I'm getting quite desperate trying to resolve this, as I put a lot of work in before discovering this problem.

Thanks in advance to anyone that can help.

@delins

delins commented Sep 21, 2022

I ran some tests with the current master (A) and commit 1838d99 (B). The latter is from around the time the libsmb2 archive files in AMSMB2 were committed. For both commits I compiled (a slightly modified version of) the smb2-cat-sync.c example. I ran the binary on a Linux machine; the SMB server runs on Windows.

First I tried reading from a file named testñ.txt. This ran fine with both A and B. Then I tried reading from a file named test😀.txt. This ran fine with A, but I got a generic error "smb2_open_async failed" with B. This likely has to do with the surrogate pair conversion issue that was fixed in #208, since the error is thrown even before the request is sent out to the server. Since you get an error response from the server, I assume you correctly linked to the newer version of libsmb2. So that's good.

So what I can say is that your use case runs fine on a Linux box using vanilla libsmb2 contacting an SMB server on Windows. Therefore it currently makes most sense to me that the issue lies with AMSMB2. It would be useful to know what AMSMB2 actually hands over to libsmb2. Could you add the following code at the top of smb2_cmd_create_async's body in smb2-cmd-create.c?

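/* Debug aid: print the bytes of the requested path as hex. */
const unsigned char *s = (const unsigned char *)req->name;
printf("Path: ");
/* Guard the loop in case req->name is NULL. */
while (s && *s)
    /* unsigned char keeps bytes >= 0x80 from sign-extending to ffffffXX */
    printf("%02x", (unsigned int) *s++);
printf("\n");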

If you recompile libsmb2 and run your app again I expect this would print the path that libsmb2 is asked to access. Note that the hex string will contain the whole path on your SMB server.
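To help read the resulting hex string, here is a small standalone sketch of the same kind of dump. The helper name `hex_dump` is invented for this example and is not a libsmb2 function. It also shows why the byte pointer should be `unsigned char`: with a plain signed `char`, bytes of 0x80 and above can print as `ffffffc3` instead of `c3`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libsmb2): write the bytes of a
 * NUL-terminated string into out as lowercase hex. Returns the number of
 * hex characters written, or -1 if out is too small. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    assert(s != NULL && out != NULL);

    size_t len = strlen(s);
    if (out_size < 2 * len + 1)
        return -1;

    /* Read through unsigned char so non-ASCII bytes are not sign-extended. */
    const unsigned char *p = (const unsigned char *)s;
    for (size_t i = 0; i < len; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned int)p[i]);

    out[2 * len] = '\0';
    return (int)(2 * len);
}
```

For a path such as testñ.txt with a precomposed ñ (U+00F1), the dump is `74657374c3b12e747874`. If the ñ arrives decomposed as n followed by the combining tilde U+0303, the dump is `746573746ecc832e747874` instead. Seeing the second form would suggest the name was altered before it reached libsmb2, though nothing in this thread confirms that this is what happens.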

You could also run tcpdump on the server to see what request it receives. If you'd post a pcap here I could take a look, but preferably use a test account with a test password. The NTLM hashing that SMB uses isn't great. You could also use Wireshark yourself if you have experience with it. It has a good dissector for SMB and it should clearly show the requested path.

@geofstro
Author

Thanks very much for looking into this further. I think I'd rather concentrate on Wireshark to determine the path being sent, as it looks like a very useful tool. I'd like to get familiar with it.

Any further tips on dissecting SMB to determine the path being sent will, of course, also be welcome.

@geofstro
Author

geofstro commented Oct 11, 2022 via email

@sahlberg
Owner

sahlberg commented Oct 11, 2022 via email

@sahlberg
Owner

sahlberg commented Oct 11, 2022 via email

@geofstro
Author

Quite happy to donate PS2 games, if you can only solve the problem I have; namely that I still cannot extract data/move or copy any files which contain latin characters, such as "ó". This is when the SMB share is either Linux or Windows. Mac SMB shares seem to work fine.

Listing directories with these non-ASCII characters works fine; but all functions where reading data is involved do not.

It's bizarre that listing with these UTF-8 characters in the file paths works; but reading the data of any files with these characters in their paths does not.

I don't know if the problem lies with the AMSMB2 wrapper or libsmb2.

I looked at the code for AMSMB2 though and couldn't find anything obvious to me that would cause this problem.

It severely limits how libsmb2 can be used especially for IOS/Mac development.

I'm finding this extremely frustrating and I'm willing to be quite generous with PS2 game donation or other donations if you can fix it.

If the problem is with AMSMB2, then perhaps you could also take a look at that, since there's been no activity there for a couple years.

Thanks in advance.

geofstro

@delins

delins commented Oct 11, 2022

It seems geofstro accidentally reposted an older message. @geofstro from #234 I gathered that you had debugged this further and found reasons to believe the problem lies with AMSMB2. Did you see issues in the pcap? Or did you find anything useful using the debugging statement I posted? It would be nice to have a better understanding of this and I can't test it since I don't develop Apple stuff.

@geofstro
Author

I was going by your previous post, which seems to indicate AMSMB2 is at fault…

"So what I can say is that your use case runs fine on a Linux box using vanilla libsmb2 contacting an SMB server on Windows. Therefore it currently makes most sense to me that the issue lies with AMSMB2 . It would be useful to know what AMSMB2 actually hands over to libsmb2. Could you add the following code at the top of smb2_cmd_create_async's body in smb2-cmd-create.c?

char *s = req->name;
printf("Path: ");
while(*s)
printf("%02x", (unsigned int) *s++);
printf("\n");"

I was unable to modify that function, as I am linking to libsmb2 as a dynamic lib and so don't have access to the code; but AMSMB2 is merely a wrapper that allows calling libsmb2 from Swift.

I opened #234 because, if AMSMB2 is at fault and the author is no longer active in solving issues, it no longer seems a viable wrapper for libsmb2, although your README recommends it for that purpose. Especially as it also doesn't run on the latest Xcode 14.

How can we set up a test to ensure libsmb2 is working correctly when accessing Linux and Windows shares from IOS/Mac apps?

I could try to write a new Swift wrapper for testing purposes; but I have no experience of that and would need some help.

@delins

delins commented Oct 11, 2022

You already built libsmb2 at least once, right? You can just add the print code, build libsmb2 again, and repackage your app. Or am I missing something?

@geofstro
Author

geofstro commented Oct 11, 2022

Ok, thanks. I guess I can add those lines you suggested, rebuild it and call the modified version from AMSMB2.

Then I should get errors that'll indicate where the fault lies, right?

Sorry; but I'm a bit lacking in confidence when it comes to working with C libs and integrating them with Swift.

@delins

delins commented Oct 11, 2022

Not errors, but it should print the path that you're trying to read. If I'm not mistaken that function is the first (or among the first) that is called once AMSMB2 passes control to libsmb2. If that path is incorrect then we know AMSMB2 is passing it wrong. If the path is correct, then it must be libsmb2 that messes up.

@geofstro
Author

Apologies again. Yes, you did explain that before.

@geofstro
Author

OK, so I'm looking at this again and I realised I have only re-built libsmb2 by issuing these commands from your read me…
git clone https://github.com/amosavian/AMSMB2
cd AMSMB2/buildtools
./build.sh

In this case that will not do, as I need to modify the libsmb2 file smb2-cmd-create.c to add the lines you advised, and then compile a completely fresh libsmb2 for iOS and macOS from scratch on my arm64 M1 Mac.

I don't know how to do that, and will need some further instructions from you.

Thanks in advance.

@delins

delins commented Oct 12, 2022

If you want to change the way something is built, the logical place to look and figure it out is to look at a script called build.sh ;) I know C is different from Swift and has a more difficult build process, but if you read that script I'm pretty sure you could still figure it out. What I would try is the following, although I haven't tested any of it:

  1. Get a fresh clone of https://github.com/amosavian/AMSMB2.git
  2. Read through AMSMB2/buildtools/build.sh. You don't have to understand everything, but it will give you a rough idea of the steps that are performed.
  3. Remove the libsmb2 folder in the cloned repo (build.sh does this as well but just make sure it's really gone)
  4. Copy the build.sh script so that you have two. In the first one, remove everything after line 39. In the second one, remove the specific parts before line 39 where it removes the libsmb2 folder and downloads a new libsmb2. Make sure to not remove anything other than that.
  5. Run the first part. This should give you the libsmb2 source code in the libsmb2 folder.
  6. Make the required edits.
  7. Run the second part.

@geofstro
Author

That's a lot clearer, thanks. I'll follow your instructions.

@geofstro
Author

geofstro commented Oct 14, 2022

Trying your instructions, I'm tripping up on the second part. I think I must be removing too much or too little. Could you please post exactly what build.sh should look like for the second run?
Thanks

@delins

delins commented Oct 14, 2022

I don't know exactly either. I can't replicate it because I don't have access to a MacOS system.

@geofstro
Author

geofstro commented Oct 14, 2022

Ok, so here is the original build.sh, from which I first removed everything after line 39. That ran fine, producing the libsmb2 source code. I then edited the necessary .c file. After that I had no success when I tried to run the second version of build.sh, after trying to just remove the lines "where it removes the libsmb2 folder and downloads a new libsmb2".

That part always fails. So if you could send me a modified build.sh for that second run, I should be able to complete the exercise.

Thanks…

#!/bin/sh

for i in "$@" ; do
    if [[ $i == "--with-libkrb5" ]] ; then
        WITH_KRB5="YES"
        echo "Building with Kerberos 5."
        break
    fi
done

cd ..
rm -rf "libsmb2"
mkdir "libsmb2"
mkdir "libsmb2/include"
mkdir "libsmb2/lib"
PACKAGE_DIRECTORY=`pwd`
export LIB_OUTPUT="${PACKAGE_DIRECTORY}/libsmb2/lib"
cd buildtools

brew update
for pkg in cmake automake autoconf libtool; do
    if brew list -1 | grep -q "^${pkg}$"; then
        echo "Updating ${pkg}."
        brew upgrade $pkg &> /dev/null
    else
        echo "Installing ${pkg}."
        brew install $pkg > /dev/null
    fi
done

if [ ! -d libsmb2 ]; then
    git clone https://github.com/sahlberg/libsmb2
    cd libsmb2
    echo "Bootstrapping..."
    ./bootstrap &> /dev/null
else
    cd libsmb2
fi

export USECLANG=1
export CFLAGS="-fembed-bitcode -Wno-everything -DHAVE_SOCKADDR_LEN=1 -DHAVE_SOCKADDR_STORAGE=1"
export CPPFLAGS="-I${PACKAGE_DIRECTORY}/buildtools/include"
#export CPPFLAGS="-I/usr/local/opt/openssl/include"
export LDFLAGS="-L${LIB_OUTPUT}"
#export PKG_CONFIG_PATH="/usr/local/opt/openssl/lib/pkgconfig"

echo "Making libsmb2 static libararies"
if [[ -z "${WITH_KRB5}" ]]; then
    FRPARAM="--without-libkrb5 --disable-werror"
else
    FRPARAM="--disable-werror"
fi

echo " Build iOS"
export OS=ios
export MINSDKVERSION=9.0
../autoframework libsmb2 $FRPARAM > /dev/null
echo " Build macOS"
export OS=macos
export MINSDKVERSION=10.11
../autoframework libsmb2 $FRPARAM > /dev/null
echo " Build tvOS"
export OS=tvos
export MINSDKVERSION=9.0
../autoframework libsmb2 $FRPARAM > /dev/null
cd ..

echo "Copying additional headers"
cp "libsmb2/include/libsmb2-private.h" "${PACKAGE_DIRECTORY}/libsmb2/include/"
cp "module.modulemap" "${PACKAGE_DIRECTORY}/libsmb2/include/"

rm -rf libsmb2
rm -rf include
rm -rf lib

@delins

delins commented Oct 14, 2022

I don't have time to guide you through all the steps. Even if my guess at the correct second file were right, something else may fail after that. You have to start debugging on your own. Run the commands in your second file one by one and see where it fails, then go from there.

@geofstro
Author

I also have limited time and I'll do the best I can when I can spare the time.

Just to back up for a moment: I'm doing this debugging to try to find the cause of the issue that I reported. It's by no means a foregone conclusion that the problem is with AMSMB2. It could still be with libsmb2 itself.

That's, I believe, what we're trying to determine.

@delins

delins commented Oct 14, 2022

I tested your use case with libsmb2 directly in #231 (comment) and everything worked just fine, therefore as far as I'm concerned it's a problem with AMSMB2. So people here may help if they feel like it, but ultimately you're the only one here that has any personal interest. If I ever run into this using vanilla libsmb2 I'll revisit this.

@geofstro
Author

Yes; but your test wasn't my use case: "So what I can say is that your use case runs fine on a Linux box using vanilla libsmb2 contacting an SMB server on Windows."

Whereas I'm running on iOS/macOS against Linux or Windows shares and encountering this problem in both cases.

From your test I don't see how you can conclude that the problem lies with AMSMB2. It might; but not necessarily!

@dream4java

This error may be caused by URL (iOS, macOS), not by this library.

I modified the code in AMSMB2.swift and the files can now be read normally:

private func listDirectory(context: SMB2Context, path: String, recursive: Bool) throws
    -> [[URLResourceKey: Any]]
{
    var contents = [[URLResourceKey: Any]]()
    let dir = try SMB2Directory(path.canonical, on: context)
    for ent in dir {
        let name = String(cString: ent.name)
        if [".", ".."].contains(name) { continue }
        var result = [URLResourceKey: Any]()
        result[.nameKey] = name

        // Build the path by plain string concatenation instead of going through URL.
        var resultPath = path
        if ent.st.isDirectory {
            resultPath.append("\(name)/")
        } else {
            resultPath.append("\(name)")
        }
        result[.pathKey] = resultPath

        // Original code, replaced by the string concatenation above:
        /*result[.pathKey] =
            path.fileURL().appendingPathComponent(name, isDirectory: ent.st.isDirectory).path
        */
        ent.st.populateResourceValue(&result)
        contents.append(result)
    }

    if recursive {
        let subDirectories = contents.filter(\.isDirectory)

        for subDir in subDirectories {
            try contents.append(
                contentsOf: listDirectory(
                    context: context, path: subDir.path.unwrap(), recursive: true
                )
            )
        }
    }

    return contents
}


4 participants