Commit graph

270 commits

Author SHA1 Message Date
e262166522 Re-add replace and read_file to util 2020-08-30 14:32:48 +02:00
30ca105c66 Merge branch 'master' into dirlist 2020-08-30 14:16:41 +02:00
ab42879bc6 Merge remote-tracking branch 'upstream/master' 2020-08-30 09:42:18 +02:00
eba491b4b5 Find template in root, don't search tail dir twice 2020-08-30 09:38:02 +02:00
bd51ff593d Refactoring 2020-08-30 08:43:10 +02:00
a22ac176ca Template overrides in subdirs, prefetch templates 2020-08-29 20:36:10 +02:00
Laslo Hunhold
0823ba4c3e
Add logmsg() and refactor connection handling
Also use compound literals for immediate pointers we don't use later
(same as with setsockopt() in 32223c96bd).

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-29 13:02:51 +02:00
Laslo Hunhold
a36b901d40
Add http_send_body() and data_send_error() and refactor
This turns the data-functions into the only functions "allowed"
to send body-data (called with http_send_body()). The previous (hacky)
approach of doing this in http_send_header() is not only out of place,
it's an easy source of bugs given, for instance, the sending of body
data is not expected with HEAD-requests.

Given html_escape() is now only used in data.c, we move it there from
util.c and make it a static method again.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-29 00:49:16 +02:00
Laslo Hunhold
db127723c6
Rename functions in data.h and adapt ifdef
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-28 23:46:12 +02:00
Laslo Hunhold
a94b15814c
Rename resp.{c,h} to data.{c,h}
The methods in data.h only deal with the actual response data, not
the request handling itself, which has been formalized a bit more
in http.h. To avoid confusion, we rename it to data.h.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-28 23:29:54 +02:00
Laslo Hunhold
9a95d9183c
Rename status to s in serve()
This is more consistent with the codebase.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-28 23:19:29 +02:00
Laslo Hunhold
68e4ff3021
Return proper error-status when http_send_header() fails
Explicitly show that we set the status of the response struct to the
returned error status. This makes it clear that we are beyond the point
where the "form" of the response struct matters and it's now only about
the log-output.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-28 23:16:47 +02:00
Laslo Hunhold
c0909c70e4
Improve http_prepare_response()'s error semantics
I don't like the juggling with status-values in serve. It makes
sense for http_recv_header() and http_parse_header(), because we
don't have a response-struct yet that we can "fill". We could pass
it to them, but that would make the usage a bit messy.

However, in http_prepare_response(), we are already entrusted with
a pointer to a response-struct, and just failing here (by returning
an error value) leaves the response-struct in an invalid state. Instead,
we make it a void function and reflect the status using the status field
in the passed response struct.

This way, there is no case where the response struct is in an
invalid state after calling a http_prepare_*()-method.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-28 22:52:04 +02:00
Laslo Hunhold
123f168a3b
Replace http_send_status() with http_prepare_error_response()
This approach fits better in line of first initializing the response
struct and then sending the header with http_send_header() later.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-28 22:34:46 +02:00
5b6f8e5083 Read from template files with placeholder replacement 2020-08-28 00:46:08 +02:00
329dedc33b Read dirlist from templates
Reads header, footer and entry templates from files defined in config.h.
Defaults to original output if not found.
2020-08-25 21:08:23 +02:00
670d2ed65c Merge remote-tracking branch 'upstream/master' 2020-08-23 16:31:13 +02:00
Laslo Hunhold
601b56d270
Mention default behaviour in the manual when the host is not given
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-23 13:36:56 +02:00
Laslo Hunhold
27f8bbfac4
Refactor sock_get_uds() a bit
This refines the error messages a bit and makes clearer what went
wrong.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-23 13:35:49 +02:00
Laslo Hunhold
1ccaac023c
Rename s to srv
This improves readability a bit and helps iron out confusions with
status-variables called s in other methods.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-23 11:03:18 +02:00
Laslo Hunhold
50c85ec642
Rename "target" to "URI" where appropriate
Of course URIs point at "targets", but the URIs themselves should
be called what they are, not only in the interest of clarity in terms
of nomenclature.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-22 23:38:38 +02:00
Laslo Hunhold
68be64e2c1
Remove unused field in the request-struct
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-22 23:31:42 +02:00
Laslo Hunhold
58d0f44e03
Refactor http_send_response() into http_prepare_response()
The function http_send_response() did too much. It not only took
the request fields and built them together into a response, it
delegated too little and many functions were "hacked" into it, for
instance shady directory-changes for vhosts and hand-construction
of response structs.

The preparations for a rework were already made in previous commits,
including a tighter focus on the response-struct itself. Instead of
doing everything locally in the http_send_response() function, the
new http_prepare_response() only really takes the request-struct and
builds a response-struct. The response-struct is expanded such that
it's possible to do the data-sending simply with the response-struct
itself and not any other magic parameters that just drop out of the
function.

Another matter are the http_send_status()-calls. Because the
aforementioned function is so central, this refactoring has included
many areas. Instead of calling http_send_status() in every error-case,
which makes little sense now given we first delegate everything through
a response struct, errors are just sent as a return value and caught
centrally (in serve() in main.c), which centralizes the error handling
a bit.

It might look a bit strange now and it might not be clear in which
direction this is going, but subsequent commits will hopefully give
clarity in this regard.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-22 23:20:00 +02:00
Laslo Hunhold
a5163d0813
Split up http_get_request()
The function has become too long and basically did two things: Receiving
the header and parsing it. To better reflect this, we split it up into
the two functions http_recv_header() and http_parse_header(). This way,
we also obtain a better separation of concerns and can further reduce
the scope of each parameter-list.

http_recv_header() has been written in such a way that it can be
reentered and fill up the header-buffer bit by bit using a pointer to
an offset value.

The error handling was improved by only returning the immediate error
status codes and letting the caller do the error-handling with
http_send_status().

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-22 11:05:20 +02:00
Laslo Hunhold
c1b242e405
Add connection struct
This struct contains the request and response structs, represents a state
and has some utility-buffers.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-22 09:24:57 +02:00
Laslo Hunhold
6d2fe7f29e
Move infd and header into request-struct
This compacts the connection state into one struct.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-21 19:38:29 +02:00
f1fed77826 Add patch to README 2020-08-18 21:27:53 +02:00
ac056a792d Add drone build to README 2020-08-18 19:51:06 +02:00
Laslo Hunhold
ce77dd7962
Update manpage to list capabilities and behaviour
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-18 08:46:52 +02:00
Laslo Hunhold
65600ffe7a
Reduce global state by localizing the server-struct
The server-struct variable s was global, which made it readable and
modifiable from any point in the code. Making it a local variable in
main() instead and passing it as a pointer to constant memory to each
function needing it makes much more sense and allows the compiler to
warn us if we do try to modify it, which it wouldn't have before.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-17 11:37:25 +02:00
Laslo Hunhold
3bd49b2456
Implement RFC 8615 (Well-Known URIs) and refine access errors
We generally rejected any URI that had a path component beginning
with a '.', i.e. a hidden file. RFC 8615 specifies the well-known URI,
which is used, for instance, with the "http-01" challenge type in
acme-client(1) and will probably see more usage in the future.

To support it, we move the hidden target check after the stat(), so we
don't have to worry about canonicalization of dir-URIs (i.e. missing
trailing '/'). This changes the behaviour a bit, as now quark won't
only send out a 403 whenever a hidden target is requested, but only
if it actually exists, and a 404 otherwise.

Given the earlier call to normabspath() ensures that our path begins
with a '/', we don't need the first check "realtarget[0] == '.'"
anymore, so it can be removed.

Thanks to Robert Russell <robertrussell.72001@gmail.com> for reporting
the lack of support of the RFC 8615 in quark.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-17 10:39:54 +02:00
e5ac379442 Add drone build 2020-08-16 23:57:45 +02:00
d91b363627 Add README 2020-08-16 23:05:04 +02:00
Laslo Hunhold
660699492f
Make user/group-handling-code more robust
As is there is no security issue, but _if_ we end up with a user
or group set to NULL after e.g. ARGEND, we would've hit a null-pointer-
dereference of grp in which is now line 311.

What we want to check instead is if user or group are NULL respectively
and throw an error. Consequently, we can remove the later checks in the
drop root section, as we now guarantee that grp and pwd are not NULL.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-09 23:20:06 +02:00
Laslo Hunhold
b1dca4cf97
Remove three dead stores in main()
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-09 22:43:46 +02:00
Laslo Hunhold
03ee1df4c3
Add space in list
Thanks Hiltjo!

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 23:27:05 +02:00
Laslo Hunhold
2318a89ecd
Begin comment in lowercase
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 19:14:10 +02:00
Laslo Hunhold
cb7a1f6390
Replace off_t with size_t
While off_t might be better suited for file-offsets and -sizes, the
IEEE Computer Society was unable to mandate limits (min, max) for it
in the POSIX specification in the last 32 years. Because it's impossible
to portably determine these numbers for signed integers, I decided
to switch to size_t for the offsets to be able to pass proper values
to strtonum(), because C99 is sane and has defined limits for size_t
(i.e. SIZE_MIN and SIZE_MAX).

On my system, long long and off_t have the same size, so it didn't
trigger any bugs, but strtonum() could pass a bigger number to
lower and upper than they can handle and make them overflow.

The rationale for switching to size_t is actually given by the fact that
functions like mmap() blur the border between memory and filesystem.
Another point is that glibc has a horrible define _FILE_OFFSET_BITS
you need to set to 64 to actually get decent values for off_t, which
was a huge headache in sbase until we found that out.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 18:59:55 +02:00
Laslo Hunhold
d105c28aad
Ensure const-correctness where possible and refactor parse_range()
I know that the effect of 'const' on compiler optimizations is smaller
than many believe, but it provides a good insight to the caller which
parameters are not modified and simplifies parallelization, in case
that is desired at a later point.

Throughout processing, the big structs mostly remained unmodified, with
the exception of parse_range(), which added a null-byte in the "Range"-
header to simplify its parsing. This commit refactors parse_range()
such that it won't modify this string anymore.

Additionally, the parser was made even stricter: Usually, strtoll()
(which is wrapped by strtonum()) allows whitespace and plus and minus
signs before the number, which is not part of the specification. The
stricter parser also better differentiates now between invalid requests
and range-lists. In that context, the switch in http_send_response()
was replaced for better readability.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 18:28:21 +02:00
Laslo Hunhold
90d5179ea0
Rename REQ_MOD to REQ_IF_MODIFIED_SINCE
The named constants for header fields of the response struct all
pretty much matched the actual header name, which I think improves
readability for everyone familiar with the HTTP-spec.

The request header fields named constants followed the rule, except
the "If-Modified-Since"-header, which is addressed in this commit.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 15:46:03 +02:00
Laslo Hunhold
2c50d0c654
Rename request "r" to "req"
Now that we have response-structs called "res", the naming "r" is a
bit ambiguous.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 15:43:29 +02:00
Laslo Hunhold
c51b31d7ac
Refactor response-generation
I wasn't happy with how responses were generated. HTTP-headers were
handled by hand and it was duplicated in multiple parts of the code.
Due to the duplication, some functions like timestamp() had really
ugly semantics.

The HTTP requests are parsed much better: We have an enum of fields
we care about that are automatically read into our request struct. This
commit adapts this idea to the response: We have an enum of fields
we might put into our response, and a response-struct holds the
content of these fields. A function http_send_header() automatically
sends a header based on the entries in response. In case we don't
use a field, we just leave the field in the response-struct empty.

With this commit, some logical changes came with it:

  - timestamp() now has a sane signature, TIMESTAMP_LEN is no more and
    it can now return proper errors and is also reentrant by using
    gmtime_r() instead of gmtime()
  - No more use of a static timestamp-array, making all the methods
    also reentrant
  - Better internal-error-reporting: Because the fields are filled
    before and not during sending the response-headers, we can better
    report any internal errors as status 500 instead of sending a
    partial non-500-header and then dying.

These improved data structures make it easier to read and hack the code
and implement new features, if desired.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-05 13:41:44 +02:00
Laslo Hunhold
26c593ade1
Refactor range-parsing into a separate function
The method http_send_response() is already long enough and this
separation of concerns both helps shorten it a bit, improves
readability and reduces the chance of programming errors.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-08-04 16:32:54 +02:00
Laslo Hunhold
5a7994bc61
Send Accept-Ranges-header for file-requests
Now that the range-support is actually working, we can send out
the Accept-Ranges-header so that clients know they can send
range-requests to the server.

This can be seen empirically when watching a video and skipping around
into unbuffered space. Previously, it would not be possible and the
time-selector would flip back to the furthest point the previous
buffering had progressed. Now it is working flawlessly.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-07-23 18:54:43 +02:00
Laslo Hunhold
db4e35d3d5
Refactor range-parsing
Quark previously didn't really handle suffix-range-requests
(those of the form "-num", asking for the last num bytes) properly
and also did not catch the error when the lower in the range
"lower-upper" was actually larger than or equal to the size of the
requested file.

I always planned to refactor the parsing but got the motivation by
Eric Radman <ericshane@eradman.com>, who kindly reported the latter bug
to me.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-07-23 18:16:08 +02:00
Laslo Hunhold
6b508a0e07
Explicitly initialize struct tm with zeroes
This is recommended by the manual as strptime(), in principle,
might only touch the fields it parses from the string. Given
the struct tm implementations differ from operating system to
operating system, we make sure and set everything to zero
before passing it to strptime().

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-07-23 16:54:21 +02:00
Laslo Hunhold
660b308617
Use timegm() instead of mktime() to generate UNIX-timestamp
The broken down time-representation tm generated earlier is in UTC,
and mktime() assumes that it's in local time instead, leading to
the problem that quark might not send a NOT_MODIFIED in a different
timezone.

timegm() instead correctly interprets the broken down
time-representation tm as UTC and returns the proper timestamp.
It might not be portable like mktime(), but it's complicated to
emulate it otherwise.

Thanks to Jeremy Bobbin <jer@jer.cx> for reporting the bug and
providing this fix, which is why I've added him to the LICENSE.
Thanks also to Hiltjo for his input.

Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-07-23 16:48:34 +02:00
Laslo Hunhold
a55df3915d
Update LICENSE
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-05-07 13:41:11 +02:00
Rainer Holzner
b7d0d6889d
Fix for sending HTTP response status 304
Stop immediately after responding with status code 304 "Not Modified".
This also solves missing log output for status 304.

If there is an error while sending a file, try to clean up and close the
file.
2020-05-07 13:40:29 +02:00
Laslo Hunhold
9dda0028db
Update LICENSE
Signed-off-by: Laslo Hunhold <dev@frign.de>
2020-04-21 17:48:20 +02:00