Commit graph

10 commits

Author SHA1 Message Date
Laslo Hunhold
cb7a1f6390
Replace off_t with size_t
While off_t might be better suited for file-offsets and -sizes, the
IEEE Computer Society was unable to mandate limits (min, max) for it
in the POSIX specification in the last 32 years. Because it's impossible
to portably determine these numbers for signed integers, I decided
to switch to size_t for the offsets to be able to pass proper values
to strtonum(), because C99 is sane and has defined limits for size_t
(i.e. SIZE_MIN and SIZE_MAX).

On my system, long long and off_t have the same size, so it didn't
trigger any bugs, but strtonum() could pass a bigger number to
lower and upper than they can handle and make them overflow.

The rationale for switching to size_t is actually given by the fact that
functions like mmap() blur the border between memory and filesystem.
Another point is that glibc has a horrible define _FILE_OFFSET_BITS
you need to set to 64 to actually get decent values for off_t, which
was a huge headache in sbase until we found that out.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 18:59:55 +02:00
Laslo Hunhold
d105c28aad
Ensure const-correctness where possible and refactor parse_range()
I know that the effect of 'const' on compiler optimizations is smaller
than many believe, but it provides a good insight to the caller which
parameters are not modified and simplifies parallelization, in case
that is desired at a later point.

Throughout processing, the big structs mostly remained unmodified, with
the exception of parse_range(), which added a null-byte in the "Range"-
header to simplify its parsing. This commit refactors parse_range()
such that it won't modify this string anymore.

Additionally, the parser was made even stricter: Usually, strtoll()
(which is wrapped by strtonum()) allows whitespace and plus and minus
signs before the number, which is not part of the specification. The
stricter parser also better differentiates now between invalid requests
and range-lists. In that context, the switch in http_send_response()
was replaced for better readability.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 18:28:21 +02:00
Laslo Hunhold
2c50d0c654
Rename request "r" to "req"
Now that we have response-structs called "res", the naming "r" is a
bit ambiguous.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 15:43:29 +02:00
Laslo Hunhold
c51b31d7ac
Refactor response-generation
I wasn't happy with how responses were generated. HTTP-headers were
handled by hand and it was duplicated in multiple parts of the code.
Due to the duplication, some functions like timestamp() had really
ugly semantics.

The HTTP requests are parsed much better: We have an enum of fields
we care about that are automatically read into our request struct. This
commit adapts this idea to the response: We have an enum of fields
we might put into our response, and a response-struct holds the
content of these fields. A function http_send_header() automatically
sends a header based on the entries in response. In case we don't
use a field, we just leave the field in the response-struct empty.

With this commit, some logical changes came with it:

  - timestamp() now has a sane signature, TIMESTAMP_LEN is no more and
    it can now return proper errors and is also reentrant by using
    gmtime_r() instead of gmtime()
  - No more use of a static timestamp-array, making all the methods
    also reentrant
  - Better internal-error-reporting: Because the fields are filled
    before and not during sending the response-headers, we can better
    report any internal errors as status 500 instead of sending a
    partial non-500-header and then dying.

These improved data structures make it easier to read and hack the code
and implement new features, if desired.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 13:41:44 +02:00
Laslo Hunhold
5a7994bc61
Send Accept-Ranges-header for file-requests
Now that the range-support is actually working, we can send out
the Accept-Ranges-header so that clients know they can send
range-requests to the server.

This can be seen empirically when watching a video and skipping around
into unbuffered space. Previously, it would not be possible and the
time-selector would flip back to the furthest point the previous
buffering had progressed. Now it is working flawlessly.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-07-23 18:54:43 +02:00
Rainer Holzner
b7d0d6889d
Fix for sending HTTP response status 304
Stop immediately after responding with status code 304 "Not Modified".
This also solves missing log output for status 304.

If there is an error while sending a file, try to clean up and close the
file.
2020-05-07 13:40:29 +02:00
Laslo Hunhold
48e74a5982
Properly HTML-escape names in dirlistings
Based on a patch by guysv. We now make sure that the valid
path-characters ", ', <, >, & can not be used for XSS on a target, for
example with a file called

   "><img src="blabla" onerror="alert(1)"

by properly HTML-escaping these characters.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-03-25 14:07:17 +01:00
Laslo Hunhold
bbd47e1427
Specify UTF-8 for non-binary content-types
If charset is unspecified, the encoding falls back to ISO 8859-1 or
something else that is defined in HTTP/1.1.

Given there is no reason not to use UTF-8 nowadays[0] and one can convert
legacy encodings to UTF-8 easily, if the case comes up, it is a sane
default to specify it in the config.def.h.

[0]: https://utf8everywhere.org/

Signed-off-by: Laslo Hunhold <dev@frign.de>
2019-01-02 17:04:23 +01:00
Laslo Hunhold
1879e14e79 Be extra pedantic again and remove all warnings
Since now config.def.h has been reduced we don't have any more unused
variables and thus the manual fiddling with error-levels is no longer
necessary.
To get a completely clean result though we have to still cast some
variables here and there.
2018-03-05 00:30:53 +01:00
Laslo Hunhold
ccdb51b96d Refactor the single source file into multiple modules
And many other things, too many to list here. For example, it now
properly logs uds instead of erroring out.
Separating concerns in many places definitely improves the readability.
2018-02-04 21:27:33 +01:00