incubator/dirl - Forgejo: Beyond coding. We forge.

Author	SHA1	Message	Date
Laslo Hunhold	660699492f	Make user/group-handling-code more robust As is there is no security issue, but _if_ we end up with a user or group set to NULL after e.g. ARGEND, we would've hit a null-pointer- dereference of grp in which is now line 311. What we want to check instead is if user or group are NULL respectively and throw an error. Consequently, we can remove the later checks in the drop root section, as we now guarantee that grp and pwd are not NULL. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-09 23:20:06 +02:00
Laslo Hunhold	b1dca4cf97	Remove three dead stores in main() Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-09 22:43:46 +02:00
Laslo Hunhold	03ee1df4c3	Add space in list Thanks Hiltjo! Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 23:27:05 +02:00
Laslo Hunhold	2318a89ecd	Begin comment in lowercase Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 19:14:10 +02:00
Laslo Hunhold	cb7a1f6390	Replace off_t with size_t While off_t might be better suited for file-offsets and -sizes, the IEEE Computer Society was unable to mandate limits (min, max) for it in the POSIX specification in the last 32 years. Because it's impossible to portably determine these numbers for signed integers, I decided to switch to size_t for the offsets to be able to pass proper values to strtonum(), because C99 is sane and has defined limits for size_t (i.e. SIZE_MIN and SIZE_MAX). On my system, long long and off_t have the same size, so it didn't trigger any bugs, but strtonum() could pass a bigger number to lower and upper than they can handle and make them overflow. The rationale for switching to size_t is actually given by the fact that functions like mmap() blur the border between memory and filesystem. Another point is that glibc has a horrible define _FILE_OFFSET_BITS you need to set to 64 to actually get decent values for off_t, which was a huge headache in sbase until we found that out. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 18:59:55 +02:00
Laslo Hunhold	d105c28aad	Ensure const-correctness where possible and refactor parse_range() I know that the effect of 'const' on compiler optimizations is smaller than many believe, but it provides a good insight to the caller which parameters are not modified and simplifies parallelization, in case that is desired at a later point. Throughout processing, the big structs mostly remained unmodified, with the exception of parse_range(), which added a null-byte in the "Range"- header to simplify its parsing. This commit refactors parse_range() such that it won't modify this string anymore. Additionally, the parser was made even stricter: Usually, strtoll() (which is wrapped by strtonum()) allows whitespace and plus and minus signs before the number, which is not part of the specification. The stricter parser also better differentiates now between invalid requests and range-lists. In that context, the switch in http_send_response() was replaced for better readability. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 18:28:21 +02:00
Laslo Hunhold	90d5179ea0	Rename REQ_MOD to REQ_IF_MODIFIED_SINCE The named constants for header fields of the response struct all pretty much matched the actual header name, which I think improves readability for everyone familiar with the HTTP-spec. The request header fields named constants followed the rule, except the "If-Modified-Since"-header, which is addressed in this commit. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 15:46:03 +02:00
Laslo Hunhold	2c50d0c654	Rename request "r" to "req" Now that we have response-structs called "res", the naming "r" is a bit ambiguous. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 15:43:29 +02:00
Laslo Hunhold	c51b31d7ac	Refactor response-generation I wasn't happy with how responses were generated. HTTP-headers were handled by hand and it was duplicated in multiple parts of the code. Due to the duplication, some functions like timestamp() had really ugly semantics. The HTTP requests are parsed much better: We have an enum of fields we care about that are automatically read into our request struct. This commit adapts this idea to the response: We have an enum of fields we might put into our response, and a response-struct holds the content of these fields. A function http_send_header() automatically sends a header based on the entries in response. In case we don't use a field, we just leave the field in the response-struct empty. With this commit, some logical changes came with it: - timestamp() now has a sane signature, TIMESTAMP_LEN is no more and it can now return proper errors and is also reentrant by using gmtime_r() instead of gmtime() - No more use of a static timestamp-array, making all the methods also reentrant - Better internal-error-reporting: Because the fields are filled before and not during sending the response-headers, we can better report any internal errors as status 500 instead of sending a partial non-500-header and then dying. These improved data structures make it easier to read and hack the code and implement new features, if desired. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-05 13:41:44 +02:00
Laslo Hunhold	26c593ade1	Refactor range-parsing into a separate function The method http_send_response() is already long enough and this separation of concerns both helps shorten it a bit, improves readability and reduces the chance of programming errors. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-08-04 16:32:54 +02:00
Laslo Hunhold	5a7994bc61	Send Accept-Ranges-header for file-requests Now that the range-support is actually working, we can send out the Accept-Ranges-header so that clients know they can send range-requests to the server. This can be seen empirically when watching a video and skipping around into unbuffered space. Previously, it would not be possible and the time-selector would flip back to the furthest point the previous buffering had progressed. Now it is working flawlessly. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-07-23 18:54:43 +02:00
Laslo Hunhold	db4e35d3d5	Refactor range-parsing Quark previously didn't really handle suffix-range-requests (those of the form "-num", asking for the last num bytes) properly and also did not catch the error when the lower in the range "lower-upper" was actually larger than or equal to the size of the requested file. I always planned to refactor the parsing but got the motivation by Eric Radman <ericshane@eradman.com>, who kindly reported the latter bug to me. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-07-23 18:16:08 +02:00
Laslo Hunhold	6b508a0e07	Explicitly initialize struct tm with zeroes This is recommended by the manual as strptime(), in principle, might only touch the fields it parses from the string. Given the struct tm implementations differ from operating system to operating system, we make sure and set everything to zero before passing it to strptime(). Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-07-23 16:54:21 +02:00
Laslo Hunhold	660b308617	Use timegm() instead of mktime() to generate UNIX-timestamp The broken down time-representation tm generated earlier is in UTC, and mktime() assumes that it's in local time instead, leading to the problem that quark might not send a NOT_MODIFIED in a different timezone. timegm() instead correctly interprets the broken down time-representation tm as UTC and returns the proper timestamp. It might not be portable like mktime(), but it's complicated to emulate it otherwise. Thanks to Jeremy Bobbin <jer@jer.cx> for reporting the bug and providing this fix, which is why I've added him to the LICENSE. Thanks also to Hiltjo for his input. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-07-23 16:48:34 +02:00
Laslo Hunhold	a55df3915d	Update LICENSE Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-05-07 13:41:11 +02:00
Rainer Holzner	b7d0d6889d	Fix for sending HTTP response status 304 Stop immediately after responding with status code 304 "Not Modified". This also solves missing log output for status 304. If there is an error while sending a file, try to clean up and close the file.	2020-05-07 13:40:29 +02:00
Laslo Hunhold	9dda0028db	Update LICENSE Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-04-21 17:48:20 +02:00
Nihal Jere	8ccef4b27a	Make host parameters optional For me at least, the first valid configuration found by getaddrinfo works fine most of the time. Obviously if this isn't the configuration you want, you can specify the host explicitly.	2020-04-21 17:04:52 +02:00
Laslo Hunhold	48e74a5982	Properly HTML-escape names in dirlistings Based on a patch by guysv. We now make sure that the valid path-characters ", ', <, >, & can not be used for XSS on a target, for example with a file called "><img src="blabla" onerror="alert(1)" by properly HTML-escaping these characters. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-03-25 14:07:17 +01:00
Laslo Hunhold	5ee8c07e7e	Fix unveil(2) usage Thanks to the feedback by z0lqLA! I forgot that unveil(NULL, NULL) only locks further unveil calls when there has been at least _one_ prior call to unveil! To fix this, we reorder the calls and also make sure to call unveil() before we disallow unveils via pledge. Signed-off-by: Laslo Hunhold <dev@frign.de>	2020-03-20 20:35:34 +01:00
Laslo Hunhold	3c7049e906	Use pledge(2) and unveil(2) on OpenBSD It has been on my todo-list for a long time. I tested it on OpenBSD 6.5. Thanks Richard Ulmer for the reminder. Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-09-23 16:56:28 +02:00
Laslo Hunhold	32223c96bd	Use compound literals and explicit initialization I didn't really like the use of a "yes"-variable for setsockopt(). A better way is to use compound literals (part of C99). Another point are the structs. Instead of memsetting to zero we make use of the standard which guarantees that "unmentioned" fields are set to zero anyways. Just to note it here: The use of memset() also sets padding to zero, which is not guaranteed with the method of "unmentioned" fields. Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-05-30 23:15:47 +02:00
Laslo Hunhold	33def953e9	Improve tokenization for m- and v-flag parsing I wasn't happy with the tokenizer for the m- and v-flags, because it was handling space-separated input and there was no way to have spaces within the tokens themselves. This is a fine detail, but I didn't want to impose this restriction where it could be solved (path prefixes or folder names can very well contain spaces). Given it's a bit quirky to handle multiple arguments to a single flag in the command line, especially when parameters are optional, this alternative wasn't further considered and I instead implemented a tokenizer that allows escaping spaces with '\'. While at it, I clarified the manual regarding this point. Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-02-24 21:50:39 +01:00
Laslo Hunhold	065394cb64	Change target prefix mapping argument order Put the chost-specification at the end and make it optional. This makes more sense than having to give an arbitrary useless name in case you weren't using virtual hosts in the first place. While at it, clear up the wording in the manpage. Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-02-24 00:53:03 +01:00
Laslo Hunhold	48ddb8fefb	Sort flag-switch alphabetically Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-02-23 13:50:59 +01:00
Laslo Hunhold	f2afbc4dd7	Add a space after the number in the Xr mandoc macro Detected with the mandoc(1)-linter. Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-02-18 23:44:12 +01:00
Laslo Hunhold	e299e186ed	Don't replace '+' with ' ' when decoding URLs After the initial report by Platon Ryzhikov, I couldn't validate this behaviour with the given RFC 3986[0], which only speaks of percent encoding for reserved characters. [0]:https://tools.ietf.org/html/rfc3986 Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-01-10 22:02:23 +01:00
Laslo Hunhold	bbd47e1427	Specify UTF-8 for non-binary content-types If charset is unspecified, the encoding falls back to ISO 8859-1 or something else that is defined in HTTP/1.1. Given there is no reason not to use UTF-8 nowadays[0] and one can convert legacy encodings to UTF-8 easily, if the case comes up, it is a sane default to specify it in the config.def.h. [0]: https://utf8everywhere.org/ Signed-off-by: Laslo Hunhold <dev@frign.de>	2019-01-02 17:04:23 +01:00
Aaron Burrow	d2013a6337	Fix one byte NULL stack overflow Don't append a forward slash if the length of a folder is PATH_MAX-1. This can happen if HEADER_MAX is larger than PATH_MAX or if the `-m` option is used to increase the path length.	2018-07-16 22:48:20 +02:00
Laslo Hunhold	72b309bbe4	Correct arg.h license Credit where credit is due.	2018-07-16 11:49:51 +02:00
Laslo Hunhold	9ff3f780e1	Send a relative redirection header wherever possible This makes quark much more flexible when it is run behind a network filter or other kind of tunnel. Only send an absolute redirection when we are handling vhosts.	2018-07-02 18:43:06 +02:00
Laslo Hunhold	34189e0a1f	Use sizeof() - 1 rather than strlen() I know, most compiler probably optimize this anyway, but why not do it right in the first place?	2018-07-02 18:41:29 +02:00
Laslo Hunhold	b354ffb238	Add Dominik Schmidt to license	2018-07-02 07:15:19 +02:00
Dominik Schmidt	094c8ba814	Open a new process group before setting up signal handler When cleaning up after a caught signal, quark forwards the signal to all processes in the process group with `kill(0, ...)`. If we do not open up a new process group in the parent process, quarks parent will be sent a SIG... too, resulting it to shut down (especially considering that the parent process might run as root). As a result, if we set up the service with djb's excellent daemontools, `svc -d quark` will terminate the svscan-process and tear all other services down with it. See also <https://cr.yp.to/daemontools/faq/create.html#pgrphack>.	2018-07-02 07:14:00 +02:00
Laslo Hunhold	ba38b0969f	Give an indication of the time zone in the log We use Zulu-time (aka UTC) for the log timestamps.	2018-04-03 01:23:00 +02:00
Laslo Hunhold	3ff82c514b	Clean up request host properly We all agree that the IPv6 address format is a big clusterfuck and only an insane person would've come up with it given the double colons interfere with the way one actually appends a port to a normal IPv4 address. To counteract in this issue, the RFC specifies that one should enclose IPv6-addresses in square brackets to make the disctinction possible, i.e. host: ::1 port: 80 --> [::1]:80 The host field can contain both a port suffix and, of course by the RFC, have the address enclosed in square brackets. Given I personally see this as a "transport enclosure" I'd rather like to see it gone as soon as possible and thus implement this cleanup in the http-header-parser so the output is nice and clean and we don't have to deal with this garbage later on. Thanks to Josuah Demangeon <mail@josuah.net> for his wonderful input and his dedication to read the RFCs 3986 and 2732 in such great detail.	2018-04-03 01:03:03 +02:00
Josuah Demangeon	c3ddb2dd14	permit prefix to be empty in -v format string The previous parsing of the -v vhosts made sure there were 4 tokens. If there was no prefix specified, usage() is called. Now, it only checks for the firsts 3, with .prefix set to null if there are only 3 tokens.	2018-04-02 09:57:58 +02:00
Josuah Demangeon	69bb7710eb	fix segfault on parsing of -v and -m The length is initially 0 so it needs to be incremented before reallocarray to avoid ...alloc(0); and keep some space for the element to insert.	2018-04-02 09:53:41 +02:00
Laslo Hunhold	6770dc06e6	Add netinet/in.h to sock.c It was missing but necessary for some defines.	2018-03-05 10:24:46 +01:00
Laslo Hunhold	a20136fa18	Update the documentation to reflect the new flag-centric usage	2018-03-05 09:51:29 +01:00
Hiltjo Posthuma	444b8f5b32	http_send_response: fix undefined behaviour for copying the target string ... the format string and buffer were the same (undefined behaviour).	2018-03-05 01:21:14 +01:00
Laslo Hunhold	ed8b7e8954	Fix a logic error We want to xor s.host and udsname, so it especially errors out when none are given.	2018-03-05 01:04:51 +01:00
Hiltjo Posthuma	01ed0dac83	util: don't initialize server, it's already done in main() ... this removes a stupid compiler warning too.	2018-03-05 01:00:50 +01:00
Laslo Hunhold	c8401c591f	Add esnprintf() and refactor some code The (size_t) discards the case where the return value of snprintf is < 0. This is rather unlikely, but we'll keep it in mind anyway.	2018-03-05 00:59:37 +01:00
Laslo Hunhold	1879e14e79	Be extra pedantic again and remove all warnings Since now config.def.h has been reduced we don't have any more unused variables and thus the manual fiddling with error-levels is no longer necessary. To get a completely clean result though we have to still cast some variables here and there.	2018-03-05 00:30:53 +01:00
Quentin Rameau	3ff3e5ea6e	Add some missing headers and interface visibility macro strings.h for strncasecmp time.h for strptime	2018-03-05 00:21:54 +01:00
Laslo Hunhold	6b55e36036	Introduce flag-centric usage The config.h-interface has proven to be very effective for a lot of suckless tools, but it just does not make too much sense for a web server like quark. $ quark If you run multiple instances of it, you want to see in the command line (or top) what it does, and given the amount of options it's logical to just express them as options given in the command line. It also is a problem if you can modify quark via the config.h, contradicting the manual. Just saying "Well, then don't touch config.h" is also not good, as the vhost and map options were only exposed via this interface. What is left in config.h are mime-types and two constants relating to the incoming HTTP-header-limits. In order to introduce these changes, some structs and safe utility functions were added and imported from OpenBSD respectively.	2018-03-05 00:14:25 +01:00
Laslo Hunhold	b40b11a40e	Fix the regex in the example Thanks Hiltjo!	2018-02-27 13:00:23 +01:00
Laslo Hunhold	7b7f166dd5	Add target prefix mapping This allows e.g. to redirect when a directory has been moved.	2018-02-27 12:43:05 +01:00
Laslo Hunhold	02d6ae5a57	Add support for adding a prefix to a target when matching vhosts This makes quark's vhost-handling very powerful while still being simple. Imagine you have a website with a subdomain you really want to move back to your main domain. Say the subdomain is called "old.example.org" and you want to serve it under "example.org" but in the subdirectory "old/", i.e. you want to redirect a request "old.example.org/subdir/" to "example.org/old/subdir". For a vhost-handler that only takes 4 arguments for each vhost this is actually pretty powerful.	2018-02-27 11:36:24 +01:00

1 2 3 4 5

237 commits