========================================================================
CVE-2020-NLEND -- New-line injection into spool header file (local)
========================================================================

When Exim receives a mail, it creates two files in the "input"
subdirectory of its spool directory: a "data" file, which contains the
body of the mail, and a "header" file, which contains the headers of the
mail and important metadata (the sender and the recipient addresses, for
example). Such a header file consists of lines of text separated by '\n'
characters.

Unfortunately, an unprivileged local attacker can send a mail to a
recipient whose address contains '\n' characters, and can therefore
inject new lines into the spool header file and change Exim's behavior:

echo | /usr/sbin/exim4 -odf -oep -oi $'\"hello\nworld\"'
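
To illustrate why an embedded '\n' matters for a line-oriented file, the
following minimal sketch uses a hypothetical writer that stores one
recipient per line without checking the address. The function names and
the one-line-per-recipient layout are assumptions made for illustration;
this is not Exim's actual spool header format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified spool writer: one recipient per line.
   This is NOT Exim's real header-file format. */
static int toy_write_recipient(char *out, size_t outsz, const char *rcpt)
{
  assert(out != NULL && rcpt != NULL);
  return snprintf(out, outsz, "%s\n", rcpt);  /* no check for '\n' in rcpt */
}

/* Count the lines that a line-oriented reader would later see. */
static int toy_count_lines(const char *spool)
{
  int lines = 0;
  for (; *spool != '\0'; spool++)
    if (*spool == '\n') lines++;
  return lines;
}
```

In this toy model, writing the single recipient "hello\nworld" produces
two lines, and the reader has no way to tell that the second line was
supplied by the sender.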

The effect of this vulnerability is similar to CVE-2020-8794 in
OpenSMTPD, but in Exim's case new-line injection alone is not enough to
execute arbitrary commands. To understand how we transformed this
vulnerability into an arbitrary command execution, we must digress
briefly.

Most of the vulnerabilities in this advisory are memory corruptions, and
despite modern protections such as ASLR, NX, and malloc hardening,
memory corruptions in Exim are easy to exploit:

1/ Exim's memory allocator (store.c, which calls malloc() and free()
internally) unintentionally provides attackers with powerful exploit
primitives. In particular, if an attacker can pass a negative size to
the allocator (through an integer overflow or direct control), then:

119 static void *next_yield[NPOOLS];
120 static int yield_length[NPOOLS] = { -1, -1, -1,  -1, -1, -1 };
...
231 void *
232 store_get_3(int size, BOOL tainted, const char *func, int linenumber)
233 {
...
248 if (size > yield_length[pool])
249   {
...
294   }
...
299 store_last_get[pool] = next_yield[pool];
...
316 next_yield[pool] = (void *)(CS next_yield[pool] + size);
317 yield_length[pool] -= size;
318 return store_last_get[pool];
319 }

1a/ At line 248, store_get() believes that the current block of memory
is large enough (because size is negative), and goes to line 299. As a
result, store_get()'s caller can overflow the current block of memory (a
"forward-overflow").

1b/ At line 317, the free size of the current block of memory
(yield_length) is mistakenly increased (because size is negative), and
at line 316, the next pointer returned by store_get() (next_yield) is
mistakenly decreased (because size is negative). As a result, the next
memory allocation can overwrite the beginning of Exim's heap: a relative
write-what-where, which naturally bypasses ASLR (a "backward-jump", or
"back-jump").

2/ The beginning of the heap contains Exim's configuration, which
includes various strings that are passed to expand_string() at run time.
Consequently, an attacker who can "back-jump" can overwrite these
strings with "${run{...}}" and execute arbitrary commands (thus
bypassing NX).
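
The allocator arithmetic described in 1a/ and 1b/ can be modeled with a
small sketch. It mirrors only the quoted lines 248, 299, 316, and 317;
the struct, the function name, and the use of offsets instead of real
pointers are assumptions made so that the example is well-defined C. The
real store_get_3() also handles alignment, new blocks, and tainting,
none of which is modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one store pool. Offsets into an imaginary heap replace
   real pointers, so the arithmetic below is well-defined C. */
struct toy_pool {
  long next_yield;    /* offset of the next free byte */
  int  yield_length;  /* free bytes left in the current block */
};

/* Returns the offset handed to the caller, or -1 where the real code
   would fetch a new block. */
static long toy_store_get(struct toy_pool *pool, int size)
{
  long last_get;
  assert(pool != NULL);
  if (size > pool->yield_length)  /* line 248: never true for size < 0 */
    return -1;
  last_get = pool->next_yield;    /* line 299 */
  pool->next_yield += size;       /* line 316: moves backward if size < 0 */
  pool->yield_length -= size;     /* line 317: grows if size < 0 */
  return last_get;
}
```

Starting from next_yield = 4096 and yield_length = 100, a request of
-1024 bytes passes the size check, leaves next_yield at 3072, and raises
yield_length to 1124; the following allocation is then served from
offset 3072, below the block that was current before the call.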

Note: Exim 4.94 (the latest version) introduces "tainted" memory (i.e.,
untrusted, possibly attacker-controlled data) and refuses to process it
in expand_string(). This mechanism protects Exim against unintentional
expansion of tainted data (CVE-2014-2957 and CVE-2019-10149), but not
against memory corruption: an attacker can simply overwrite untainted
memory with tainted data, and still execute arbitrary commands in
expand_string(). For example, we exploited CVE-2020-NLEND,
CVE-2020-CLOSE, and CVE-2020-MAUTH in Exim 4.94.

CVE-2020-NLEND allows us to inject new lines into a spool header file.
To transform this vulnerability into an arbitrary command execution (as
root, since deliver_drop_privilege is false by default), we exploit the
following code in spool_read_header():

 341 int n;
 ...
 910 while ((n = fgetc(fp)) != EOF)
 911   {
 ...
 914   int i;
 915
 916   if (!isdigit(n)) goto SPOOL_FORMAT_ERROR;
 917   if(ungetc(n, fp) == EOF  ||  fscanf(fp, "%d%c ", &n, flag) == EOF)
 918     goto SPOOL_READ_ERROR;
 ...
 927     h->text = store_get(n+1, TRUE);     /* tainted */
 ...
 935     for (i = 0; i < n; i++)
 936       {
 937       int c = fgetc(fp);
 ...
 940       h->text[i] = c;
 941       }
 942     h->text[i] = 0;

- at line 917, we start a fake header with a negative length n;

- at line 927, we back-jump to the beginning of the heap (Digression
  1b), because n is negative;

- at line 935, we avoid the forward-overflow (Digression 1a), because n
  is negative;

- then, our next fake header is allocated at the beginning of the heap
  and overwrites Exim's configuration strings (with "${run{command}}");

- last, our arbitrary command is executed when deliver_message()
  processes our fake (injected) recipient and expands the overwritten
  configuration strings (Digression 2).
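
The steps above can be replayed in a toy arena. This is a sketch under
stated assumptions, not our exploit: the arena size, the offset of the
"configuration string", the starting pool state, and the value -113 are
all hypothetical, and the negative n is simply passed in (how it is
obtained from the spool file is not modeled). Only the allocation
arithmetic and the copy loop of the quoted lines 927-942 are mirrored.

```c
#include <assert.h>
#include <string.h>

#define TOY_ARENA_SIZE 256
#define TOY_CONFIG_OFF 16    /* hypothetical location of a config string */

struct toy_heap {
  char arena[TOY_ARENA_SIZE];
  int  next_yield;           /* offset of the next free byte */
  int  yield_length;         /* free bytes left */
};

/* Same arithmetic as the quoted store_get_3() lines, on offsets. */
static int toy_arena_get(struct toy_heap *h, int size)
{
  int last_get = h->next_yield;
  assert(size <= h->yield_length);  /* the toy has no "new block" path */
  h->next_yield += size;
  h->yield_length -= size;
  return last_get;
}

/* Mirrors lines 927-942: allocate n+1 bytes, copy n bytes, terminate. */
static int toy_read_header(struct toy_heap *h, int n, const char *src)
{
  int i, text = toy_arena_get(h, n + 1);
  for (i = 0; i < n; i++)           /* never runs when n is negative */
    h->arena[text + i] = src[i];
  h->arena[text + i] = 0;
  return text;
}

/* Replays the steps; returns the "config string" as it reads afterwards. */
static const char *toy_replay(struct toy_heap *h)
{
  static const char payload[] = "${run{command}}";
  memset(h->arena, 0, sizeof h->arena);
  strcpy(h->arena + TOY_CONFIG_OFF, "original setting");
  h->next_yield = 128;
  h->yield_length = 64;
  toy_read_header(h, -113, "");     /* back-jump: 128 + (-112) = 16 */
  toy_read_header(h, (int)strlen(payload), payload);
  return h->arena + TOY_CONFIG_OFF;
}
```

In this model the first fake header copies nothing but moves next_yield
from 128 back to 16, so the second header's text lands exactly on the
hypothetical configuration string.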

We can also transform CVE-2020-NLEND into an information disclosure, by
exploiting the following code in spool_read_header():

 756 for (recipients_count = 0; recipients_count < rcount; recipients_count++)
 757   {
 ...
 765   if (Ufgets(big_buffer, big_buffer_size, fp) == NULL) goto SPOOL_READ_ERROR;
 766   nn = Ustrlen(big_buffer);
 767   if (nn < 2) goto SPOOL_FORMAT_ERROR;
 ...
 772   p = big_buffer + nn - 1;
 773   *p-- = 0;
 ...
 809   while (isdigit(*p)) p--;
 ...
 840   else if (*p == '#')
 841     {
 842     int flags;
 ...
 848     (void)sscanf(CS p+1, "%d", &flags);
 849
 850     if ((flags & 0x01) != 0)      /* one_time data exists */
 851       {
 852       int len;
 853       while (isdigit(*(--p)) || *p == ',' || *p == '-');
 854       (void)sscanf(CS p+1, "%d,%d", &len, &pno);
 855       *p = 0;
 856       if (len > 0)
 857         {
 858         p -= len;
 859         errors_to = string_copy_taint(p, TRUE);
 860         }
 861       }
 862
 863     *(--p) = 0;   /* Terminate address */

For example, if we send a mail to the recipient
'"X@localhost\njohn@localhost 8192,-1#1\n\n1024* "' (where john is our
username, and localhost is one of Exim's local_domains), then:

- at line 848, we set flags to 1;

- at line 854, we set len to 8192 (8KB) and pno to -1;

- at line 858, we decrease p (by 8KB) toward the beginning of the heap;

- at line 859, we read the errors_to string out of big_buffer's bounds;

- finally, we receive our mail, which includes the errors_to string in
  its From line.
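
The backward parse above can be sketched in a self-contained form. To
keep the out-of-bounds read defined in C, big_buffer is placed inside a
larger arena; the arena layout, the offset TOY_BUF_OFF, the length 40
(instead of 8192), and the stand-in string "root:x:0:0" are assumptions
for illustration only. The parsing statements follow the quoted lines
773-863; everything else is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define TOY_BUF_OFF 64  /* hypothetical: big_buffer starts here in the arena */

/* Parses one recipient line as the quoted code does, and copies what
   the real code would store in errors_to into leak. */
static void toy_parse_recipient(char *arena, size_t nn,
                                char *leak, size_t leaksz)
{
  char *big_buffer = arena + TOY_BUF_OFF;
  char *p = big_buffer + nn - 1;
  int flags = 0, len = 0, pno = 0;

  leak[0] = '\0';
  *p-- = 0;                                    /* line 773: drop '\n' */
  while (isdigit((unsigned char)*p)) p--;      /* line 809 */
  if (*p != '#') return;                       /* line 840 */
  (void)sscanf(p + 1, "%d", &flags);           /* line 848 */
  if ((flags & 0x01) != 0)
    {
    while (isdigit((unsigned char)*(--p)) || *p == ',' || *p == '-');
    (void)sscanf(p + 1, "%d,%d", &len, &pno);  /* line 854 */
    *p = 0;
    if (len > 0)
      {
      p -= len;                                /* line 858: no bound check */
      assert(p > arena);                       /* toy-only safety net */
      snprintf(leak, leaksz, "%s", p);         /* line 859 */
      }
    }
  *(--p) = 0;                                  /* line 863 */
}

/* Builds the toy arena and runs the parser on an injected line. */
static void toy_leak_demo(char *leak, size_t leaksz)
{
  static const char line[] = "john@localhost 40,-1#1\n";
  char arena[128];
  memset(arena, 0, sizeof arena);
  strcpy(arena + 38, "root:x:0:0");            /* stand-in for heap data */
  strcpy(arena + TOY_BUF_OFF, line);
  toy_parse_recipient(arena, strlen(line), leak, leaksz);
}
```

Here the space before "40" sits at offset 78 of the arena, so
subtracting len moves p to offset 38, outside big_buffer, and the copy
returns the stand-in data stored there rather than anything from the
recipient line.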

