
Rewrite casync-http to use curl multi #208

Open

wants to merge 23 commits into base: main

Conversation

elboulangero
Contributor

@elboulangero elboulangero commented Mar 23, 2019

This commit brings parallel downloads to casync-http, by using the curl
multi interface, and more precisely the "select()" flavour. For details
see https://curl.haxx.se/libcurl/c/libcurl-multi.html.

Libcurl has two ways to achieve parallel downloads:

  • for HTTP/1, it can open parallel connections. The maximum number of
    parallel connections is user-defined, through MAX_HOST_CONNECTIONS
    and MAX_TOTAL_CONNECTIONS.
  • for HTTP/2, it can attempt to multiplex on a single connection. The
    maximum number of parallel downloads in this case is negotiated
    between the client and the server (in HTTP/2 jargon, the number of
    streams).
  • (note that libcurl used to do pipelining over HTTP/1.1, but this is no
    longer supported since 7.62.0, and casync-http doesn't use it anyway)

Accordingly, this commit also introduces two new command-line arguments
to better control the behavior of parallel downloads:

  • --max-active-chunks is the sum of 1. the number of chunks in the
    curl multi and 2. the chunks downloaded and waiting to be sent to
    the remote. It limits the number of chunks waiting in RAM, in case
    we download faster than we can send to the remote, and it also caps
    the maximum number of concurrent downloads.
  • --max-host-connections is for the case where libcurl opens parallel
    connections to the server. In all likelihood it's only relevant for
    HTTP/1.

We probably want a large number for max-active-chunks, to ensure we
don't starve the libcurl multi handle, but at the same time we probably
don't want to open too many connections in parallel, which is why
max-host-connections is a much lower number. These defaults seem
sensible according to my understanding so far; users might want to
adjust these numbers for their specific use-case.

Note that the command-line argument --rate-limit-bps doesn't make much
sense anymore, since it's set for each chunk, but now chunks are
downloaded in parallel, and we don't really know how many downloads are
actually happening in parallel. And from Daniel Stenberg:

We don't have settings that limit the transfer speed of multiple,
combined, transfers.

So we might want to completely remove this option, or rework it somehow.

Note that this commit removes the wrapper robust_curl_easy_perform()
introduced in 328f13d. Quick reminder: this wrapper was used to sleep
and retry on CURLE_COULDNT_CONNECT, and allowed us to work around what
seemed to be a misbehavior of the Python Simple HTTP Server. Now that we
do parallel downloads, we can't apply this workaround "as is": we can't
just sleep. So I removed the wrapper. The issue is still present and
reproducible, but I just assume it's a server issue, not a casync
issue.

Regarding unit tests

This commit also opens interesting questions regarding the unit tests.
For now we're using the Python Simple HTTP Server, which can only serve
requests sequentially, so it doesn't really allow testing parallelism.
It's also not representative of a real-life scenario where, I assume,
chunks are served by a production server such as Apache or nginx.
Additionally, I think it would be best to run the tests against both an
HTTP/1 server and an HTTP/2 server.

One possibility is to use nginx; it's easy enough to run. nginx can
serve HTTP/2 only if TLS is enabled though, and out of the box
casync-http will fail if it can't verify the certificate. So we might
want to add a command-line argument to trust any certificate.

Additionally, nginx requires root, which may not be very suitable for a
test suite, though there might be workarounds.

Another possibility is to run nghttpd. This option is light on
dependencies, in the sense that libcurl already relies on libnghttp2;
however, I haven't succeeded in using this server yet.

TODOs

There are a few todos left, mainly:

  • add argument or environment variable support in casync-tool, to
    match the new arguments in casync-http.
  • documentation, readme and such.
  • some details in the code, see MR in GitHub.

meson.build (outdated; resolved)
struct CaChunkDownloader {
CaRemote *remote;
CURLM *multi;
Queue *ready; /* CURL handles waiting to be used */
Contributor Author

Since we know the size of the queues already, it would be more efficient to have static arrays, and avoid a lot of malloc/free. I can rework this.

src/casync-http.c (outdated; resolved)
}

static int ca_chunk_downloader_step(CaChunkDownloader *dl) {
int r;
Contributor Author

So, this function is tricky. This is basically what happens during a loop iteration.

See the call ca_chunk_downloader_remote_step() below? If you move it to the end, you break the testsuite, and a bunch of casync list commands hang forever. I don't really know why, and I'm not sure you can reproduce it. Interestingly, if I run the testsuite without parallel tests (i.e. MESON_TESTTHREADS=1 ninja test), everything works fine.

So there are some subtleties going on here that I didn't fully understand, and I ended up ordering things in this function by trial and error. In other words, this needs thorough review: this function, the functions it calls, and basically the communication with the casync remote.

@elboulangero
Contributor Author

Ping?

Member

@poettering poettering left a comment

hmm, so if i get this right, then in this PR we'll synchronously run the curl logic, then synchronously the ca_remote logic and then the curl logic again and so on. Ideally both logics would run asynchronously in the same poll() loop, i.e. so that we get the fds to wait for out of curl and out of caremote (the latter is easy since it's only two) and pass them to the same poll() call. Can we make that happen with libcurl? I don't know curl well enough I must admit, and the reason I did my original http client the way i did (with its synchronous behaviour) was laziness. But if we fix this for multiple getters maybe we can fix that too?

Also, I am not sure why we need a queue for the requests? note that CaRemote already implements one anyway?

src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
QueueItem *next;
};

typedef struct Queue {
Member

hmm, instead of introducing a new, local queue implementation here: let's just copy list.h from systemd's tree (i.e. https://github.com/systemd/systemd/blob/master/src/basic/list.h) and use that? it's an embedded list which means we need fewer allocations

Contributor

Done in commits 48f37dd and fdd4689. See my temporary branch mr/casync-http-curl-multi.

I will force-push later.

src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
src/casync-http.c (outdated; resolved)
@elboulangero
Contributor Author

hmm, so if i get this right, then in this PR we'll synchronously run the curl logic, then synchronously the ca_remote logic and then the curl logic again and so on. Ideally both logics would run asynchronously in the same poll() loop, i.e. so that we get the fds to wait for out of curl and out of caremote (the latter is easy since it's only two) and pass them to the same poll() call. Can we make that happen with libcurl? I don't know curl well enough I must admit, and the reason I did my original http client the way i did (with its synchronous behaviour) was laziness. But if we fix this for multiple getters maybe we can fix that too?

Not sure I understand you, so let me detail a bit how things work.

There's actually only one poll for all events (curl and casync altogether). This happens in curl_multi_wait() at

c = curl_multi_wait(dl->multi, waitfds, ELEMENTSOF(waitfds), curl_timeout_ms, &n);

The advantage of curl_multi_wait is that we don't have to collect all the fds in use by curl and feed them to poll manually, because that's exactly what curl_multi_wait does already. Additionally, this function allows the user to supply additional fds to add to the poll, and that's what we do here: we give it casync's two fds, so that they're part of the poll.

If you want to peek into curl's implementation of curl_multi_wait, here are some quick links:

So it's an easy way to poll on both casync and curl fds, basically. The downside, maybe, is that when we're out of this function, we don't know which fds were triggered, all we can know is how many fds were triggered (which doesn't help much). That's why a "loop iteration" involves running both curl logic and casync logic, in case something happened (we're not sure).

That could be done differently: we could do the poll ourselves instead of using curl_multi_wait, but basically that would mean replicating what curl does in curl_multi_wait, and I don't see any benefit. But I'm no poll expert, so I might be missing something.

Also, I am not sure why we need a queue for the requests?

So we need to keep track of the active requests somehow. Active requests are either easy handles added to the curl multi, or chunks downloaded and waiting to be sent to the remote.

I implemented a queue mechanism because it seemed suitable, and having 3 queues for the 3 possible states (ready, in progress and completed) seemed to make things easy. For example, by keeping a bit of stats inside the queues (n_added, n_removed), it's then super easy to display statistics at the end. And this is valuable to see where time is spent.

For example, while testing locally (i.e. super fast localhost network), I can see that the average size of the completed queue is 62 (out of 64 active chunks), which means that most of the time we're waiting for chunks to be sent to the remote. So the casync communication between the two casync processes is the bottleneck.

OTOH, while testing with a remote server, things are obviously different, and this time it's the average size of the inprogress queue that is close to 64, which means that most of the time we're waiting for chunks to be downloaded.

Of course, this is not the only possible implementation; instead of having 3 queues, we could have a "state" enum and keep track of the state each chunk is in. No more queues, just a static array of chunks. But I'm not sure it would make the code easier to read and maintain.

Performance-wise, I don't think there's any downside to having these 3 queues. It's a small amount of data (max-active-chunks x 3). Right now we use dynamic memory only because at some point I wasn't sure if the queues would be of fixed size. It turns out that the size is static (fixed by the parameter --max-active-chunks), and if we agree on this implementation, I'll rewrite that part to make it static and remove all the malloc/free.

note that CaRemote already implements one anyway?

This part I don't really understand. Can I make use of that on the casync-http side?

@elboulangero
Contributor Author

(Note that I fixed all the details you mentioned above. Since it was trivial enough I hit the "Resolve" button myself, but truth be told, I'm not familiar with reviewing on GitHub, and maybe you prefer to hit "Resolve" yourself, so please don't hesitate to tell me how it should be done.)

Member

@keszybz keszybz left a comment

It would be great to have some tests for this. It is far from trivial ;) I agree we should switch to one of the "real" implementations, but I'm not sure which one is appropriate. By nghttpd do you mean https://www.nghttp2.org/? That seems to be dead (website from 2015) and is not packaged for Fedora, which would make things difficult for us.

I think it'd be easier to review this patchset if some parts were split out. E.g. the part to move arg_protocol enum higher could be separated, and it would just make things easier to review. I suggest some other parts to split out inline.

Like @poettering suggested, we want to pull in the list.h implementation from systemd. This will remove a large chunk of this patch too.

As for the general approach, I think it's reasonable. According to the documentation, the curl multi interface supports both doing poll internally and integrating into an external event loop. I think it's reasonable to start with the approach you picked. We have the option to switch to an external loop later on if necessary.


c = curl_easy_setopt(handle, CURLOPT_PROTOCOLS,
arg_protocol == PROTOCOL_FTP ? CURLPROTO_FTP :
arg_protocol == PROTOCOL_SFTP? CURLPROTO_SFTP:
Member

Maybe PROTOCOL_SFTP ? CURLPROTO_SFTP :? Squishing it together like this looks off.

Contributor Author

Done in #219

}
}

/* (void) curl_easy_setopt(handle, CURLOPT_SSL_VERIFYPEER, false); */
Member

It'd be nice to make this into a commandline option, as you suggest. It's useful for debugging independently of this PR.

Contributor Author

Done in #219


/* (void) curl_easy_setopt(handle, CURLOPT_SSL_VERIFYPEER, false); */

/* (void) curl_easy_setopt(handle, CURLOPT_VERBOSE, 1L); */
Member

The same here.

Contributor Author

Didn't do it, as -v is already an option to enable casync verbosity. Maybe -vv to enable curl verbose as well?

Since there's already #194 on the same topic, I would batch that altogether. I have some WIP regarding this issue, but I wanted to wait until this PR is completed and merged before finishing it and opening a PR.

Contributor

#217 propagates both options --log-level and -v from casync to casync-http. The cURL verbosity is enabled for log-level notice and higher (i.e. on debug).

do { \
if (ENABLE_LOG_TRACE) \
log_debug("[%d] " fmt, (int) getpid(), ##__VA_ARGS__); \
} while (false)
Member

log_trace should go in log.h.

Contributor Author

@gportay will follow up on that

Contributor

Done in commit de2ba61. See my temporary branch mr/casync-http-curl-multi.

I will force-push later.

queue_push(dl->inprogress, handle);

/* We know there must be something to do, since we just added something. */
c = curl_multi_perform(dl->multi, &running_handles);
Member

It seems strange to call curl_multi_perform in a loop. It should handle all handles at once, no?

Contributor Author

@gportay will follow up on that

Contributor

@keszybz: indeed curl_multi_perform handles all the handles.

This function handles transfers on all the added handles that need attention in a non-blocking fashion.

I will call the function once, outside the for loop.

for (;;) {
if (quit) {
log_info("Got exit signal, quitting");
r = 0;
Member

Just do return 0.


Contributor

Done in commit fdd4689. See my temporary branch mr/casync-http-curl-multi.

I will force-push later.

}

return c;
return r;
Member

... and remove the this line completely.


Contributor

Done in commit fdd4689. See my temporary branch mr/casync-http-curl-multi.

I will force-push later.

@@ -180,7 +1151,7 @@ static size_t write_index(const void *buffer, size_t size, size_t nmemb, void *u

r = ca_remote_put_index(rr, buffer, product);
if (r < 0) {
log_error("Failed to put index: %m");
log_error_errno(r, "Failed to put index: %m");
Member

This patch would be much easier to read if those fixes (which are independent) were split out into a separate patch.

Contributor Author

Done in #219

@@ -623,9 +1393,6 @@ static int run(int argc, char *argv[]) {
r = process_remote(rr, PROCESS_UNTIL_FINISHED);

finish:
if (curl)
curl_easy_cleanup(curl);

return r;
Member

So this cleanup block is not necessary anymore. It would be nice to drop the label, and simply use return instead of goto everywhere above. Could be done as a separate commit if you prefer.

Contributor Author

Done in #219

static curl_off_t arg_rate_limit_bps = 0;
static bool arg_verbose = false;
static curl_off_t arg_rate_limit_bps = 0;
static uint64_t arg_max_active_chunks = MAX_ACTIVE_CHUNKS;
Member

uint64_t seems a bit over the top. Maybe just make this unsigned?

Contributor Author

Done in #219

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
This is to prepare the next commit, where we will use the protocol enum
for more than just the protocol passed in arguments, and the arg_ prefix
won't make sense anymore.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
This commit brings in two helpers:
- protocol_str() to convert an enum protocol to a string, which is
  useful mainly for logs.
- protocol_status_ok() as a unique place to check if the protocol status
  that we get from libcurl means OK or KO.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
It seems to me that the condition PROCESS_UNTIL_WRITTEN is reached when
there's no more data to write, hence ca_remote_has_unwritten() returns
0.

And it also seems that this could be a copy/paste mistake, as all the
code above is similar, and the condition matches the function we call,
i.e.:
- PROCESS_UNTIL_CAN_PUT_CHUNK -> ca_remote_can_put_chunk
- PROCESS_UNTIL_CAN_PUT_INDEX -> ca_remote_can_put_index
- PROCESS_UNTIL_CAN_PUT_ARCHIVE -> ca_remote_can_put_archive
- PROCESS_UNTIL_HAVE_REQUEST -> ca_remote_has_pending_requests

But here, the function returns the opposite of what we want:
- PROCESS_UNTIL_WRITTEN -> ca_remote_has_unwritten

Note that I didn't observe any bug due to that, and the test suite
succeeds before and after this patch.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
While working on this code, I stumbled on cases where casync got stuck
because we forgot to call PROCESS_UNTIL_WRITTEN here. Well, TBH as long
as we download chunks, we're fine because we end up calling
PROCESS_UNTIL_WRITTEN afterwards.

But in any case, it seems more correct to sync after downloading these
files, and it doesn't hurt.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
The way we use the curl handle is that we create it once, and then
re-use it again and again, as that's more efficient than allocating a
new one for each request.

By looking at the code closely, it turns out that the setup of the curl
handle needs to be done only once, then afterwards we only need to
change the URL in order to re-use the handle.

So this commit brings two helper functions to reflect that:
- make_curl_easy_handle() does the init work and set all the options for
  the handle.
- configure_curl_easy_handle() does the things that are needed in order
  to re-use the handle. In effect, it only sets the URL.

Additionally, this commit introduces curl_easy_cleanupp, in order to use
automatic pointers.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
These macros aim to make setting curl options easier.

CURL_SETOPT_EASY() sets the option, and on failure it outputs a generic
error message with the name of the option that failed, and returns -EIO.

The CURL_SETOPT_EASY_CANFAIL() variant does not return on failure; it
only outputs an error message.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
This removes the need for the 'finish' label, hence a bunch of goto go
away.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
@elboulangero
Contributor Author

Hi, and sorry for taking so long; lately I've been underwater due to other projects.

I split all the work that is not really related to multi http requests, but more refactoring, into #219, which can be merged to master if it's good enough. It's mostly shuffling code around, nothing controversial I think, but I'm sure it will benefit from a round of review / rework.

Due to other ongoing tasks on my side, I won't be able to work further on this, so let me introduce @gportay, who will keep up the good work. I'm still around and will help him catch up.

Thanks for the collaboration, and I hope to see you again through more casync PRs!

elboulangero and others added 4 commits June 25, 2019 16:59
Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
The goal is to make the run() function more readable, and only outline
the major steps, while the bulk of the work is left to other functions.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
This long option does not use a short option (see the optstring in the
call to getopt_long).

Use a constant instead, as it is done in casync-tool.c.
This can be useful for testing, if ever we do HTTP2/SSL with a local,
untrusted server.

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
@elboulangero
Contributor Author

It would be great to have some tests for this. It is far from trivial ;) I agree we should switch to one of the "real" implementations, but I'm not sure which one is appropriate. By nghttpd do you mean https://www.nghttp2.org/? That seems to be dead (website from 2015) and is not packaged for Fedora, which would make things difficult for us.

I think nginx is best, as it's real in the sense that it powers the Internet for real, and it's also easy enough to set up for local tests. The only downside I see is that it requires root, and in Debian we build packages and run test suites without root. I don't know about Fedora.

I also proposed nghttp2 because it's a dependency of curl (under the form of libnghttp2), so it's already around when we build casync. I'm not familiar with the project though. It seems that it's maintained: https://github.com/nghttp2/nghttp2/graphs/code-frequency.

I think it'd be easier to review this patchset if some parts were split out. E.g. the part to move arg_protocol enum higher could be separated, and it would just make things easier to review. I suggest some other parts to split out inline.

Done in #219

Like @poettering suggested, we want to pull in the list.h implementation from systemd. This will remove a large chunk of this patch too.

@gportay will take care of that.

As for the general approach, I think it's reasonable. According to the documentation, the curl multi interface supports both doing poll internally and integrating into an external event loop. I think it's reasonable to start with the approach you picked. We have the option to switch to an external loop later on if necessary.

Yes indeed. It's very easy with curl, as you basically just call curl_multi_wait() and curl_multi_perform() one after the other. It's a bit more difficult with casync though, as you have to understand the protocol spoken by the casync remotes very well, to be sure that you make the right calls and only go to poll once you've done everything you could. Otherwise it's easy to get stuck.

gportay and others added 3 commits July 3, 2019 08:35
The file is taken from systemd[1].

Note: The inclusion of the macro header is removed.

[1]: https://raw.githubusercontent.com/systemd/systemd/0d7f7c2fde377d9bf618d16aa230757f956f8191/src/basic/list.h
This commit brings parallel downloads to casync-http, by using the cURL
multi interface, and more precisely the "select()" flavour. For details
see https://curl.haxx.se/libcurl/c/libcurl-multi.html.

The cURL library has two ways to achieve parallel downloads:

    for HTTP/1, it can open parallel connections. The maximum number of
    parallel connections is user-defined, through MAX_HOST_CONNECTIONS
    and MAX_TOTAL_CONNECTIONS.

    for HTTP/2, it can attempt to multiplex on a single connection. The
    maximum number of parallel downloads in this case is negotiated
    between the client and the server (in HTTP/2 jargon, the number of
    streams).

(Note that libcurl used to do pipelining over HTTP/1.1, but this is no
longer supported since 7.62.0, and casync-http doesn't use it anyway.)

Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
Signed-off-by: Gaël PORTAY <gael.portay@collabora.com>
Currently, casync-http handles the option --log-level but casync does
not propagate its verbosity to its protocol helpers.

This commit propagates the --log-level to the protocol helpers as it is
done for option --rate-limit-bps.

Now, the debug messages from the helper casync-http are printed to
stderr if the --log-level is specified to casync.
Enable cURL verbosity for the notice and higher log levels (i.e. debug).

When CURLOPT_VERBOSE is enabled, libcurl outputs traces to stderr, such
as headers and a lot of internal information.

Now, the libcurl verbose information is dumped to stdout when the
option --log-level notice or debug is specified to casync.
This commit introduces a new option --max-active-chunks=<MAX> that
limits the number of simultaneous chunk transfers from the remote.

The MAX number is the sum of:

    1. the number of chunks added to the cURL multi interface, and
    2. chunks downloaded and waiting to be sent to the remote.

It limits the number of chunks stored in memory that are ready to be
sent to the casync process; this limits the memory usage in the
situation where the network is faster than the pipe communication
between the helper and casync.
This commit introduces a new option --max-host-connections=<MAX> that
limits the number of connections to a single host.

See https://curl.haxx.se/libcurl/c/CURLMOPT_MAX_HOST_CONNECTIONS.html.
Currently, casync-http handles the option --trust-peer but casync does
not propagate it to its protocol helpers.

This commit propagates --trust-peer to the protocol helpers as it is
done for option --rate-limit-bps.
The macro log_trace is imported from systemd.
@gportay gportay force-pushed the mr/casync-http-curl-multi branch from 3c1f551 to d8748c9 on July 4, 2019 19:12
@gportay
Contributor

gportay commented Jul 4, 2019

I have force-pushed a new version. It is based on #219, with more atomic commits.

@gportay
Contributor

gportay commented Jul 18, 2019

@poettering, @keszybz: is there any chance you'll have time to review this PR? I assume you are very busy these days :) Thanks anyway.
