Writing a CGI Script in Kotlin
...or more accurately, What Not to Do if You Want to Write CGI Scripts
Who?
Hi, I'm sschr15, a Kotlin fanatic and a member of this public circle.
Why?
Kotlin has a scripting feature. CGI scripts are scripts. A fantastic person is hosting a server that supports CGI scripts.
Therefore, I should write a CGI script in Kotlin and make it work™.
How?
The big part: how was this done?
Part 1: The Basics
The CGI setup provided by this particular server is rather friendly: it supports any file marked executable in a user's public_html/cgi-bin directory. Many others in this cyberspace use this to write CGI scripts in Scheme, Bash, and possibly other languages. As the avid Kotlin fan I am, I wanted to write a CGI script using Kotlin.
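For context, a CGI program in any language only has to write its header lines, then a blank line, then the body to standard output. Here is a minimal sketch of that contract in Kotlin. The helper name cgiResponse is made up for illustration and is not part of any library mentioned in this post.

```kotlin
// Hypothetical helper (not from any library): builds the text that a CGI
// program writes to standard output for a simple response.
fun cgiResponse(contentType: String, body: String): String =
    buildString {
        append("Content-Type: ").append(contentType).append("\n")
        append("\n") // the blank line separates the headers from the body
        append(body)
    }

// A script would typically end with something like:
// print(cgiResponse("text/plain; charset=utf-8", "Hello, world!\n"))
```

Everything before the blank line is read by the web server as headers, and everything after it is sent to the client as the body.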
The "Easy" Way
Modern versions of Kotlin directly support running any script whose name ends in .main.kts without any additional setup. This makes it really easy to launch a Kotlin script from the command line:
kotlin my_script.main.kts
To make this easier, Kotlin's parser for .main.kts scripts will ignore a shebang line at the beginning, meaning this is a valid script!
#!/usr/bin/env kotlin
println("Hello, world!")
This is a great way to write scripts, but it is a little problematic for CGI scripts for a couple of reasons:
- This requires not only Java to be installed, but also a standalone Kotlin installation. This may not be a problem for most people, but it is for this group, as the server runs NixOS without much space to spare. A full Kotlin standalone installation is rather large, especially if native compilation is included in the installation.
- Kotlin is slow to start. Builds run through a build system get around a large portion of this by using a daemon that keeps the JVM and Kotlin's compiler running in the background. Running a Kotlin script directly means each launch pays the full startup time plus compilation. When CGI scripts are expected to be speedy, this is a problem.
- The server uses NixOS, and Kotlin on NixOS is rather odd: it launches a launcher which launches a launcher for launching a loader (loading Kotlin). fcgiwrap, the CGI runner used on this server, doesn't like this and fails to receive any output from the script.
Given all these issues, how in the world is one supposed to use Kotlin, a very slow-to-compile language without a speedy interpreter replacement, to write a CGI script?
Part 2: The Requirements
With all this said, the best option is clearly to write an entire library designed specifically for CGI scripts (though it works just as well in non-CGI scenarios). Such a library should have a few things:
- It needs to be responsive. Any time a script is requested, it should immediately run with little to no startup time.
- It should be extensible. Individual scripts should be able to add arbitrary dependencies for additional capabilities.
- It should be easy to use, with a minimal API that is easy to understand.
- It needs to be (relatively) small. There's no need to have a full native system for simple scripting tasks.
Through about two weeks of work, I created a library that does a pretty decent job of achieving these goals.
Part 3: The Library
cgi-kts is a library that provides a minimal DSL for creating CGI responses as well as a framework for compiling and evaluating scripts.
The Daemon
The library's primary entrypoint is DaemonMain, which runs a daemon that will precompile scripts and evaluate them based on socket communications.
It goes through a few steps while running:
- Upon startup, it launches a coroutine that repeatedly checks in a given
directory for scripts. If a script is found but has not been compiled or
has been modified, it will compile the script. If a script that previously
was compiled is no longer found, it will remove the compilation of the
script.
- Compilation results are serialized to disk on each compilation to allow for quicker startup times. Despite the caching, the current code will recompile scripts shortly after loading.
- While running, it listens for requests on a given Unix socket. When a request is received, it goes through a series of steps:
  - The requester sends sixteen bytes to represent the script name. It is one of:
    - The full script name if it is sixteen bytes or fewer, or
    - The first nine bytes of the script name, code point U+0002, and the hash code of the script name if it is seventeen bytes or more. Any unused bytes are zero, along with a seventeenth byte that is always zero, meant for easy implementation in C-like languages.
  - The daemon finds the script in its cache. If the script is not found, it sends an error message back to the requester.
  - The script is evaluated. If requested, additional communication may occur to send environment variables from the requester to the script:
    - The script requests an environment variable.
    - The daemon sends the name of the environment variable.
    - The requester sends the length of the environment variable, or -1 if it doesn't exist.
    - If the variable exists, the requester sends the value of the variable. No null terminator is sent alongside the value.
  - The script completes evaluation with response data. The daemon sends the response data to the requester.
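The first step above (noticing new, changed, and removed scripts) boils down to comparing what is on disk with what is in the cache. The following is only a sketch of that decision logic, not the daemon's actual code: the names CompiledScript, ScanPlan, and planScan are hypothetical, and it assumes that "modified" is detected by comparing last-modified timestamps.

```kotlin
// Hypothetical cache entry: remembers the file timestamp seen at compile time.
data class CompiledScript(val name: String, val compiledAtModTime: Long)

// Outcome of one scan: scripts to (re)compile and cache entries to drop.
data class ScanPlan(val toCompile: Set<String>, val toEvict: Set<String>)

// onDisk maps script name -> last-modified time of the file.
// cache maps script name -> previously compiled entry.
fun planScan(onDisk: Map<String, Long>, cache: Map<String, CompiledScript>): ScanPlan {
    // Compile anything that is new or whose timestamp differs from the cached one.
    val toCompile = onDisk.filter { (name, modTime) ->
        val cached = cache[name]
        cached == null || cached.compiledAtModTime != modTime
    }.keys
    // Evict anything that was compiled earlier but no longer exists on disk.
    val toEvict = cache.keys - onDisk.keys
    return ScanPlan(toCompile, toEvict)
}
```

A loop would call something like planScan on each pass, compile the first set, and remove the second set from the cache.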
The Requester
Within the repository, there is a small C implementation which can communicate with the daemon. It gets compiled with a SOCKET_LOCATION macro to specify the location of the Unix socket. If also compiled with RUN_AS_MAIN, it takes one argument as the script to run. If not, it uses its own name as the script to run.
Any implementation can work with the daemon, as long as it follows the protocol described in the previous section.
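As one piece of such an implementation, here is a rough sketch of building the sixteen-byte script name field in Kotlin. Several details are assumptions on my part, since the description above does not pin them down: that the name is encoded as UTF-8, that the hash code is Kotlin's String.hashCode(), and that it is written as four big-endian bytes. The function name encodeScriptName is hypothetical, and the trailing seventeenth zero byte is left to the caller.

```kotlin
import java.nio.ByteBuffer

val nameFieldSize = 16

// Hypothetical encoder for the script name field.
fun encodeScriptName(name: String): ByteArray {
    val raw = name.toByteArray(Charsets.UTF_8) // assumption: UTF-8 bytes
    val field = ByteArray(nameFieldSize)       // starts out zero-filled
    if (raw.size <= nameFieldSize) {
        // Short names are copied as-is; the unused tail stays zero.
        raw.copyInto(field)
    } else {
        // Long names: first nine bytes, then U+0002, then the hash code.
        raw.copyInto(field, destinationOffset = 0, startIndex = 0, endIndex = 9)
        field[9] = 0x02
        // Assumption: 32-bit String.hashCode(), written big-endian.
        ByteBuffer.wrap(field, 10, 4).putInt(name.hashCode())
    }
    return field
}
```

With these assumptions, a long name uses fourteen of the sixteen bytes, and the last two stay zero.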
The "DSL"
Scripts written to interact with the library have a few utilities available:
- respond(HttpStatusCode, ResponseBuilder.() -> Unit): This function is the primary way to respond to a request. It takes an HttpStatusCode and a block to create the response.
  - html(HTML.() -> Unit): Within the block, this will create an HTML response. This sets the Content-Type and Content-Length headers automatically.
  - headers(HeaderBuilder.() -> Unit): Within the block, this will set headers for the response.
    - cookie(...): Within a headers block, this will add a Set-Cookie header to the response. It has parameters for each cookie field.
  - text(String): In case the response shouldn't be HTML, this will allow any custom response. This will not set the Content-Type header, but will still set the Content-Length header.
- respond(String): In case the standard HTTP response isn't desired, another response can be sent with this function.
- stop(): This function intentionally throws to halt the script sooner than its end. This is useful for stopping the script if an error occurs. Be sure to send any kind of response first, as not doing so will cause the daemon to crash.
- A few standard CGI environment variables can be accessed directly:
  - requestUri: The URI requested by the client.
  - requestType: The request method (GET, POST, etc.).
  - queryString: The query string of the request.
    - query: A map of the query string, with keys and values URL-decoded.
  - userAgent: The user agent of the client.
- requestEnvironmentVariable(String): This function will request an environment variable from the requester. It will return null if the variable doesn't exist.
  - Note that calling System.getenv(...) may give different results, as the daemon is not necessarily run by the same user that handles CGI requests.
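To give a feel for what a URL-decoded query map involves, here is a small standalone sketch using only the JDK. It is not the library's implementation: parseQuery is a hypothetical name, and it assumes UTF-8 decoding, an empty value for keys without "=", and that a repeated key keeps its last value.

```kotlin
import java.net.URLDecoder

// Hypothetical helper: turns "a=1&b=two%20words" into a decoded key/value map.
fun parseQuery(queryString: String): Map<String, String> =
    queryString.split("&")
        .filter { it.isNotEmpty() } // skip empty segments such as in "a=1&&b=2"
        .associate { pair ->
            val key = pair.substringBefore("=")
            // A key with no "=" gets an empty value.
            val value = pair.substringAfter("=", missingDelimiterValue = "")
            URLDecoder.decode(key, "UTF-8") to URLDecoder.decode(value, "UTF-8")
        }
```

Note that URLDecoder follows form encoding rules, so a "+" in the query string decodes to a space.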
Huh?
So, what's the point of all this? Why go through all this trouble to write a CGI script in Kotlin?
No reason.
In case you're curious, the source code for the CGI script running this page (and my other blog post pages if they ever come about) is available on the site.