httpd
An implementation of an HTTP 1.1 compliant Web server, as defined in RFC 2616.
Documents the HTTP server start options and some administrative functions, and specifies the Erlang Web server callback API.
COMMON DATA TYPES
Type definitions that are used more than once in this module:
boolean() = true | false
string() = list of ASCII characters
path() = string() - representing a file or directory path.
ip_address() = {N1,N2,N3,N4} % IPv4
| {K1,K2,K3,K4,K5,K6,K7,K8} % IPv6
hostname() = string() - representing a host, for example "foo.bar.com"
property() = atom()
ERLANG HTTP SERVER SERVICE START/STOP
A web server can be configured to start when the inets application
starts, or it can be started dynamically at runtime by calling the
Inets application API inets:start(httpd, ServiceConfig) or
inets:start(httpd, ServiceConfig, How); see inets(3). Below follows a
description of the available configuration options, also called
properties.
File properties
When the web server is started at application start time, the
properties should be fetched from a configuration file. The file can
consist of a regular Erlang property list, i.e. [{Option, Value}]
where Option = property() and Value = term(), followed by a full stop,
or, for backwards compatibility, it can be an Apache like configuration
file. If the web server is started dynamically at runtime you may still
specify a file, but you can also specify the complete property list
directly.
- {proplist_file, path()}
- If this property is defined inets will expect to find all other properties defined in this file. Note that the file must include all properties listed under mandatory properties.
- {file, path()}
- If this property is defined, inets will expect to find all other properties defined in this file, which uses Apache like syntax. Note that the file must include all properties listed under mandatory properties. In the Apache like syntax a property is written as one word where each new word begins with a capital letter, followed by white space, followed by the value, followed by a new line. For example:
{server_root, "/usr/local/www"} -> ServerRoot /usr/local/www
There are a few exceptions, which are documented for each property that behaves differently. The special cases {directory, {path(), PropertyList}} and {security_directory, {Dir, PropertyList}} are represented as:
<Directory Dir>
 <Properties handled as described above>
</Directory>
Note!
The properties proplist_file and file are mutually exclusive.
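As an illustration, a file referenced by proplist_file might contain a single Erlang property list terminated by a full stop. The port, server name and paths below are placeholders chosen for this sketch, not values prescribed by this manual:

```erlang
%% Hypothetical contents of a proplist_file.
%% All values are placeholders; the four mandatory properties are included.
[{port, 8080},
 {server_name, "www.example.org"},
 {server_root, "/var/tmp/server_root"},
 {document_root, "/var/tmp/server_root/htdocs"},
 {bind_address, any}].
```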
Mandatory properties
- {port, integer()}
- The port that the HTTP server shall listen on. If zero is specified as port, an arbitrary available port will be picked and you can use the httpd:info/2 function to find out which port was picked.
- {server_name, string()}
- The name of your server, normally a fully qualified domain name.
- {server_root, path()}
- Defines the server's home directory where log files etc. can be stored. Relative paths specified in other properties refer to this directory.
- {document_root, path()}
- Defines the top directory for the documents that are available on the HTTP server.
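A minimal sketch of starting a server dynamically with only the mandatory properties could look as follows. The module name httpd_start_sketch is hypothetical, and "/tmp" is used for both directories on the assumption that it exists on the host:

```erlang
-module(httpd_start_sketch).
-export([start/0]).

%% Starts inets (if needed) and one httpd instance with only the
%% mandatory properties. "/tmp" is a placeholder assumed to exist.
start() ->
    case inets:start() of
        ok -> ok;
        {error, {already_started, inets}} -> ok
    end,
    {ok, Pid} = inets:start(httpd, [{port, 0},
                                    {server_name, "localhost"},
                                    {server_root, "/tmp"},
                                    {document_root, "/tmp"}]),
    %% Port 0 asks for an arbitrary free port; httpd:info/2 reports
    %% which port was actually picked.
    [{port, Port}] = httpd:info(Pid, [port]),
    {Pid, Port}.
```

The returned Pid can later be passed to inets:stop(httpd, Pid) to shut the instance down.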
Communication properties
- {bind_address, ip_address() | hostname() | any}
- Defaults to any. Note that any is denoted * in the Apache like configuration file.
- {socket_type, ip_comm | ssl}
- Defaults to ip_comm.
- {ipfamily, inet | inet6 | inet6fb4}
- Defaults to inet6fb4. Note that this option is only used when the option socket_type has the value ip_comm.
Erlang Web server API modules
- {modules, [atom()]}
- Defines which modules the HTTP server will use to handle requests. Defaults to:
[mod_alias, mod_auth, mod_esi, mod_actions, mod_cgi, mod_dir, mod_get, mod_head, mod_log, mod_disk_log]
Note that some mod-modules depend on others, so the order cannot be entirely arbitrary. See the Inets Web server Modules in the Users guide for more information.
Limit properties
- {disable_chunked_transfer_encoding_send, boolean()}
- This property allows you to disable chunked transfer-encoding when sending a response to an HTTP/1.1 client. Defaults to false.
- {keep_alive, boolean()}
- Instructs the server whether or not to use persistent connections when the client claims to be HTTP/1.1 compliant, default is true.
- {keep_alive_timeout, integer()}
- The number of seconds the server will wait for a subsequent request from the client before closing the connection. Default is 150.
- {max_body_size, integer()}
- Limits the size of the message body of an HTTP request. By default there is no limit.
- {max_clients, integer()}
- Limits the number of simultaneous requests that can be supported. Defaults to 150.
- {max_header_size, integer()}
- Limits the size of the message header of an HTTP request. Defaults to 10240.
- {max_uri, integer()}
- Limits the size of the HTTP request URI. By default there is no limit.
- {max_keep_alive_requests, integer()}
- The number of requests that a client can make on one connection. When the server has responded to the number of requests defined by max_keep_alive_requests, the server closes the connection. The server will close it even if there are queued requests. Defaults to no limit.
Administrative properties
- {mime_types, [{MimeType, Extension}] | path()}
- Where MimeType = string() and Extension = string(). Files delivered to the client are MIME typed according to RFC 1590. File suffixes are mapped to MIME types before file delivery. The mapping between file suffixes and MIME types can be specified as an Apache like file as well as directly in the property list. Such a file may look like:
# MIME type    Extension
text/html      html htm
text/plain     asc txt
Defaults to [{"html","text/html"},{"htm","text/html"}]
- {mime_type, string()}
- When the server is asked to provide a document type which cannot be determined by the MIME Type Settings, the server will use this default type.
- {server_admin, string()}
- ServerAdmin defines the email-address of the server administrator, to be included in any error messages returned by the server.
- {log_format, common | combined}
- Defines if access logs should be written according to the common log format or to the extended common log format. The common format is one line that looks like this:
remotehost rfc931 authuser [date] "request" status bytes
remotehost: Remote hostname.
rfc931: The client's remote username (RFC 931).
authuser: The username with which the user authenticated himself.
[date]: Date and time of the request (RFC 1123).
"request": The request line exactly as it came from the client (RFC 1945).
status: The HTTP status code returned to the client (RFC 1945).
bytes: The content-length of the document transferred.
The combined format is one line that looks like this:
remotehost rfc931 authuser [date] "request" status bytes "referer" "user_agent"
"referer": The URL the client was on before requesting your URL. (If it could not be determined a minus sign will be placed in this field.)
"user_agent": The software the client claims to be using. (If it could not be determined a minus sign will be placed in this field.)
This affects the access logs written by mod_log and mod_disk_log.
- {error_log_format, pretty | compact}
- Defaults to pretty. If the error log is meant to be read directly by a human, pretty will be the best option. pretty has the format corresponding to:
io:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason]).
compact has the format corresponding to:
io:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason]).
This affects the error logs written by mod_log and mod_disk_log.
ssl properties
- {ssl_ca_certificate_file, path()}
- Used as the cacertfile option in ssl:listen/2; see ssl(3).
- {ssl_certificate_file, path()}
- Used as the certfile option in ssl:listen/2; see ssl(3).
- {ssl_ciphers, list()}
- Used as the ciphers option in ssl:listen/2; see ssl(3).
- {ssl_verify_client, integer()}
- Used as the verify option in ssl:listen/2; see ssl(3).
- {ssl_verify_depth, integer()}
- Used as the depth option in ssl:listen/2; see ssl(3).
- {ssl_password_callback_function, atom()}
- Used together with ssl_password_callback_module to retrieve a value to use as the password option to ssl:listen/2; see ssl(3).
- {ssl_password_callback_arguments, list()}
- Used together with ssl_password_callback_function to supply a list of arguments to the callback function. If not specified the callback function will be assumed to have arity 0.
- {ssl_password_callback_module, atom()}
- Used together with ssl_password_callback_function to retrieve a value to use as the password option to ssl:listen/2; see ssl(3).
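The password callback properties can be combined as in the following sketch. The module my_ssl_conf, the function get_password/0, the password and the certificate paths are all hypothetical; since no ssl_password_callback_arguments are given, the callback is assumed to have arity 0, as described above:

```erlang
-module(my_ssl_conf).
-export([get_password/0, ssl_properties/0]).

%% Hypothetical callback: returns the private key password as a string.
%% A real implementation would fetch it from a protected source.
get_password() ->
    "change-me".

%% Properties to merge into the httpd service configuration.
%% The file paths are placeholders.
ssl_properties() ->
    [{socket_type, ssl},
     {ssl_certificate_file, "/path/to/server-cert.pem"},
     {ssl_ca_certificate_file, "/path/to/ca-cert.pem"},
     {ssl_password_callback_module, ?MODULE},
     {ssl_password_callback_function, get_password}].
```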
URL aliasing properties - requires mod_alias
- {alias, {Alias, RealName}}
- Where Alias = string() and RealName = string(). The Alias property allows documents to be stored in the local file system instead of the document_root location. URLs with a path that begins with Alias are mapped to local files that begin with RealName, for example:
{alias, {"/image", "/ftp/pub/image"}}
and an access to http://your.server.org/image/foo.gif would refer to the file /ftp/pub/image/foo.gif.
- {directory_index, [string()]}
- DirectoryIndex specifies a list of resources to look for if a client requests a directory using a / at the end of the directory name. Each entry is the name of a file in the directory. Several files may be given, in which case the server will return the first it finds, for example:
{directory_index, ["index.html", "welcome.html"]}
and access to http://your.server.org/docs/ would return http://your.server.org/docs/index.html or http://your.server.org/docs/welcome.html if index.html does not exist.
CGI properties - requires mod_cgi
- {script_alias, {Alias, RealName}}
- Where Alias = string() and RealName = string(). Has the same behavior as the Alias property, except that it also marks the target directory as containing CGI scripts. URLs with a path beginning with Alias are mapped to scripts beginning with RealName, for example:
{script_alias, {"/cgi-bin/", "/web/cgi-bin/"}}
and an access to http://your.server.org/cgi-bin/foo would cause the server to run the script /web/cgi-bin/foo.
- {script_nocache, boolean()}
- If ScriptNoCache is set to true the HTTP server will by default add the header fields necessary to prevent proxies from caching the page. Generally this is something you want. Defaults to false.
- {script_timeout, integer()}
- The time in seconds the web server will wait between each chunk of data from the script. If the CGI script does not deliver any data before the timeout, the connection to the client will be closed. Defaults to 15.
- {action, {MimeType, CgiScript}} - requires mod_actions
- Where MimeType = string() and CgiScript = string(). Action adds an action, which will activate a cgi-script whenever a file of a certain mime-type is requested. It propagates the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables. For example:
{action, {"text/plain", "/cgi-bin/log_and_deliver_text"}}
- {script, {Method, CgiScript}} - requires mod_actions
- Where Method = string() and CgiScript = string(). Script adds an action, which will activate a cgi-script whenever a file is requested using a certain HTTP method. The method is either GET or POST as defined in RFC 1945. It propagates the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables. For example:
{script, {"PUT", "/cgi-bin/put"}}
ESI properties - requires mod_esi
- {erl_script_alias, {URLPath, [AllowedModule]}}
- Where URLPath = string() and AllowedModule = atom(). erl_script_alias marks all URLs matching URLPath as erl scheme scripts. A matching URL is mapped into a specific module and function. For example:
{erl_script_alias, {"/cgi-bin/example", [httpd_example]}}
and a request to http://your.server.org/cgi-bin/example/httpd_example:yahoo would refer to httpd_example:yahoo/2, whereas http://your.server.org/cgi-bin/example/other:yahoo would not be allowed to execute.
- {erl_script_nocache, boolean()}
- If erl_script_nocache is set to true the server will add HTTP header fields that prevent proxies from caching the page. This is generally a good idea for dynamic content, since the content often varies between requests. Defaults to false.
- {erl_script_timeout, integer()}
- erl_script_timeout sets the time in seconds the server will wait between each chunk of data to be delivered through mod_esi:deliver/2. Defaults to 15. This is only relevant for scripts that use the erl scheme.
- {eval_script_alias, {URLPath, [AllowedModule]}}
- Where URLPath = string() and AllowedModule = atom(). Same as erl_script_alias but for scripts using the eval scheme. Note that this is only supported for backwards compatibility. The eval scheme is deprecated.
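A module exposed through erl_script_alias could be sketched as below. This assumes the two-argument callback form implied by httpd_example:yahoo/2 above, in which the function returns the header fields and the body as one string; the exact contract is defined in mod_esi(3) and may differ between releases. The module name esi_sketch and the function hello/2 are hypothetical:

```erlang
-module(esi_sketch).
-export([hello/2]).

%% Env is the request environment and Input the request data.
%% Both are ignored in this sketch. The result is assumed to be the
%% header fields, an empty line, and then the body.
hello(_Env, _Input) ->
    "Content-Type: text/plain\r\n\r\n" ++ "Hello from esi_sketch\n".
```

With {erl_script_alias, {"/esi", [esi_sketch]}} in the configuration, a request for /esi/esi_sketch:hello would be mapped to this function.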
Log properties - requires mod_log
- {error_log, path()}
- Defines the filename of the error log file to be used to log server errors. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
- {security_log, path()}
- Defines the filename of the access log file to be used to log security events. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
- {transfer_log, path()}
- Defines the filename of the access log file to be used to log incoming requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
Disk Log properties - requires mod_disk_log
- {disk_log_format, internal | external}
- Defines the file-format of the log files; see disk_log(3) for more information. If the internal file-format is used, the log file will be repaired after a crash. When a log file is repaired data might get lost. When the external file-format is used httpd will not start if the log file is broken. Defaults to external.
- {error_disk_log, path()}
- Defines the filename of the (disk_log(3)) error log file to be used to log server errors. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
- {error_disk_log_size, {MaxBytes, MaxFiles}}
- Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the (disk_log(3)) error log file. The disk_log(3) error log file is of type wrap log: MaxBytes will be written to each file and MaxFiles files will be used before the first file is truncated and reused.
- {security_disk_log, path()}
- Defines the filename of the (disk_log(3)) access log file which logs incoming security events, i.e. authenticated requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
- {security_disk_log_size, {MaxBytes, MaxFiles}}
- Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3) access log file. The disk_log(3) access log file is of type wrap log: MaxBytes will be written to each file and MaxFiles files will be used before the first file is truncated and reused.
- {transfer_disk_log, path()}
- Defines the filename of the (disk_log(3)) access log file which logs incoming requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
- {transfer_disk_log_size, {MaxBytes, MaxFiles}}
- Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3) access log file. The disk_log(3) access log file is of type wrap log: MaxBytes will be written to each file and MaxFiles files will be used before the first file is truncated and reused.
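The disk log properties might be combined as in the fragment below, which is meant to be merged into a larger property list. The file names and sizes are placeholders picked for illustration; the names are relative to server_root because they do not begin with a slash:

```erlang
%% Hypothetical fragment of a service configuration; requires
%% mod_disk_log in the modules list. Sizes are {MaxBytes, MaxFiles}.
[{disk_log_format, external},
 {error_disk_log, "logs/error_disk_log"},
 {error_disk_log_size, {200000, 10}},
 {security_disk_log, "logs/security_disk_log"},
 {security_disk_log_size, {200000, 10}},
 {transfer_disk_log, "logs/access_disk_log"},
 {transfer_disk_log_size, {200000, 10}}]
```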
Authentication properties - requires mod_auth
{directory, {path(), [{property(), term()}]}}
Here follows the valid properties for directories
- {allow_from, all | [RegxpHostString]}
- Defines a set of hosts which should be granted access to a given directory. For example:
{allow_from, ["123.34.56.11", "150.100.23"]}
The host 123.34.56.11 and all machines on the 150.100.23 subnet are allowed access.
- {deny_from, all | [RegxpHostString]}
- Defines a set of hosts which should be denied access to a given directory. For example:
{deny_from, ["123.34.56.11", "150.100.23"]}
The host 123.34.56.11 and all machines on the 150.100.23 subnet are not allowed access.
- {auth_type, plain | dets | mnesia}
- Sets the type of authentication database that is used for the directory. The key difference between the different methods is that dynamic data can be saved when Mnesia and Dets are used. This property is called AuthDbType in the Apache like configuration files.
- {auth_user_file, path()}
- Sets the name of a file which contains the list of users and passwords for user authentication. The filename can be either absolute or relative to the server_root. If using the plain storage method, this file is a plain text file, where each line contains a user name followed by a colon, followed by the non-encrypted password. If user names are duplicated, the behavior is undefined. For example:
ragnar:s7Xxv7
edward:wwjau8
If using the dets storage method, the user database is maintained by dets and should not be edited by hand. Use the API functions in the mod_auth module to create / edit the user database. This directive is ignored if using the mnesia storage method. For security reasons, make sure that the auth_user_file is stored outside the document tree of the Web server. If it is placed in the directory which it protects, clients will be able to download it.
- {auth_group_file, path()}
- Sets the name of a file which contains the list of user groups for user authentication. The filename can be either absolute or relative to the server_root. If you use the plain storage method, the group file is a plain text file, where each line contains a group name followed by a colon, followed by the member user names separated by spaces. For example:
group1: bob joe ante
If using the dets storage method, the group database is maintained by dets and should not be edited by hand. Use the API for the mod_auth module to create / edit the group database. This directive is ignored if using the mnesia storage method. For security reasons, make sure that the auth_group_file is stored outside the document tree of the Web server. If it is placed in the directory which it protects, clients will be able to download it.
- {auth_name, string()}
- Sets the name of the authorization realm (auth-domain) for a directory. This string informs the client about which user name and password to use.
- {auth_access_password, string()}
- If set to other than "NoPassword" the password is required for all API calls. If the password is set to "DummyPassword" the password must be changed before any other API calls. To secure the authenticating data the password must be changed after the web server is started since it otherwise is written in clear text in the configuration file.
- {require_user, [string()]}
- Defines users which should be granted access to a given directory using a secret password.
- {require_group, [string()]}
- Defines groups whose members should be granted access to a given directory using a secret password.
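Put together, a protected directory could be declared as in the following fragment. The directory path, realm name and file names are placeholders; the user names are the ones used in the auth_user_file example above:

```erlang
%% Hypothetical directory entry for the service configuration;
%% requires mod_auth. The password and group files are kept under
%% server_root, outside the document tree.
{directory, {"/var/tmp/server_root/htdocs/secret",
             [{auth_type, plain},
              {auth_name, "Secret area"},
              {auth_user_file, "auth/passwd"},
              {auth_group_file, "auth/group"},
              {require_user, ["ragnar", "edward"]},
              {allow_from, all}]}}
```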
Htaccess authentication properties - requires mod_htaccess
- {access_files, [path()]}
- Specifies which filenames are used for access files. When a request arrives, every directory in the path to the requested asset will be searched for files with the names specified by this parameter. If such a file is found, it will be parsed and the restrictions specified in it will be applied to the request.
Security properties - requires mod_security
{security_directory, {path(), [{property(), term()}]}}
Here follows the valid properties for security directories
- {security_data_file, path()}
- Name of the security data file. The filename can be either absolute or relative to the server_root. This file is used to store persistent data for the mod_security module.
- {security_max_retries, integer()}
- Specifies the maximum number of attempts a user has to authenticate before the user is blocked out. If a user successfully authenticates while blocked, the user will receive a 403 (Forbidden) response from the server. If the user makes a failed attempt while blocked the server will return 401 (Unauthorized), for security reasons. Defaults to 3. May also be set to infinity.
- {security_block_time, integer()}
- Specifies the number of minutes a user is blocked. After this amount of time, the user automatically regains access. Defaults to 60.
- {security_fail_expire_time, integer()}
- Specifies the number of minutes a failed user authentication is remembered. If a user authenticates after this amount of time, the previous failed authentications are forgotten. Defaults to 30.
- {security_auth_timeout, integer()}
- Specifies the number of seconds a successful user authentication is remembered. After this time has passed, the authentication will no longer be reported. Defaults to 30.
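A security directory entry using these properties might look like the fragment below. The directory path and data file name are placeholders, and the numeric values are only examples:

```erlang
%% Hypothetical security_directory entry; requires mod_security.
%% Block a user for 60 minutes after 5 failed attempts.
{security_directory, {"/var/tmp/server_root/htdocs/secret",
                      [{security_data_file, "logs/security_data"},
                       {security_max_retries, 5},
                       {security_block_time, 60},
                       {security_fail_expire_time, 30},
                       {security_auth_timeout, 30}]}}
```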
Functions
info(Pid) ->
info(Pid, Properties) -> [{Option, Value}]
Properties = [property()]
Option = property()
Value = term()
Fetches information about the HTTP server. When called with only the pid, all properties are fetched; when called with a list of specific properties, only those are fetched. Available properties are the same as the server's start options.
Note!
Pid is the pid returned from inets:start/[2,3]. It can also be retrieved from inets:services/0 or inets:services_info/0; see inets(3).
info(Address, Port) ->
info(Address, Port, Properties) -> [{Option, Value}]
Address = ip_address()
Port = integer()
Properties = [property()]
Option = property()
Value = term()
Fetches information about the HTTP server. When called with only the Address and Port, all properties are fetched; when called with a list of specific properties, only those are fetched. Available properties are the same as the server's start options.
Note!
Address has to be the IP address and cannot be the hostname.
reload_config(Config, Mode) -> ok | {error, Reason}
Config = path() | [{Option, Value}]
Option = property()
Value = term()
Mode = non_disturbing | disturbing
Reloads the HTTP server configuration without restarting the server. Incoming requests will be answered with a temporary down message during the time it takes to reload.
Note!
Available properties are the same as the server's start options, although the properties bind_address and port cannot be changed.
If mode is disturbing, the server is blocked forcefully, all ongoing requests are terminated and the reload will start immediately. If mode is non_disturbing, no new connections are accepted, but the ongoing requests are allowed to complete before the reload is done.
ERLANG WEB SERVER API DATA TYPES
ModData = #mod{}
-record(mod, {
    data = [],
    socket_type = ip_comm,
    socket,
    config_db,
    method,
    absolute_uri,
    request_uri,
    http_version,
    request_line,
    parsed_header = [],
    entity_body,
    connection
}).
The fields of the mod record have the following meaning:
data
- Type [{InteractionKey,InteractionValue}], used to propagate data between modules. Depicted interaction_data() in function type declarations.
socket_type
- Type socket_type(), indicates whether it is an ip socket or an ssl socket.
socket
- The actual socket in ip_comm or ssl format depending on the socket_type.
config_db
- The config file directives stored as key-value tuples in an ETS-table. Depicted config_db() in function type declarations.
method
- Type "GET" | "POST" | "HEAD" | "TRACE", that is the HTTP method.
absolute_uri
- If the request is an HTTP/1.1 request the URI might be in the absolute URI format. In that case httpd will save the absolute URI in this field. An example of an absolute URI could be "http://ServerName:Port/cgi-bin/find.pl?person=jocke"
request_uri
- The Request-URI as defined in RFC 1945, for example "/cgi-bin/find.pl?person=jocke"
http_version
- The HTTP version of the request, that is "HTTP/0.9", "HTTP/1.0", or "HTTP/1.1".
request_line
- The Request-Line as defined in RFC 1945, for example "GET /cgi-bin/find.pl?person=jocke HTTP/1.0".
parsed_header
- Type [{HeaderKey,HeaderValue}], parsed_header contains all HTTP header fields from the HTTP request stored in a list as key-value tuples. See RFC 2616 for a listing of all header fields. For example the date field would be stored as: {"date","Wed, 15 Oct 1997 14:35:17 GMT"}. RFC 2616 defines that HTTP is a case insensitive protocol and the header fields may be in lower case or upper case. Httpd will ensure that all header field names are in lower case.
entity_body
- The Entity-Body as defined in RFC 2616, for example data sent from a CGI-script using the POST method.
connection
- true | false. If set to true the connection to the client is a persistent connection and will not be closed when the request is served.
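The record fields described above can be read with ordinary record pattern matching, as in this sketch. It assumes the record definition is available from the inets include file httpd.hrl; the module name mod_data_sketch and its functions are hypothetical helpers, and example/0 only builds a record by hand so the helpers can be tried without a running server:

```erlang
-module(mod_data_sketch).
-export([header/2, is_persistent/1, example/0]).

-include_lib("inets/include/httpd.hrl").

%% Looks up a header field by its lower-case name.
%% Returns undefined when the field is absent.
header(Name, #mod{parsed_header = Headers}) ->
    proplists:get_value(Name, Headers).

%% True when the connection field marks a persistent connection.
is_persistent(#mod{connection = Connection}) ->
    Connection =:= true.

%% Builds a record by hand and applies both helpers to it.
example() ->
    ModData = #mod{parsed_header = [{"date", "Wed, 15 Oct 1997 14:35:17 GMT"}],
                   connection = true},
    {header("date", ModData), is_persistent(ModData)}.
```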
ERLANG WEB SERVER API CALLBACK FUNCTIONS
Functions
Module:do(ModData)-> {proceed, OldData} | {proceed, NewData} | {break, NewData} | done
OldData = list()
NewData = [{response,{StatusCode,Body}}] | [{response,{response,Head,Body}}] | [{response,{already_sent,StatusCode,Size}}]
StatusCode = integer()
Body = io_list() | nobody | {Fun, Arg}
Head = [HeaderOption]
HeaderOption = {Option, Value} | {code, StatusCode}
Option = accept_ranges | allow | cache_control | content_MD5 | content_encoding | content_language | content_length | content_location | content_range | content_type | date | etag | expires | last_modified | location | pragma | retry_after | server | trailer | transfer_encoding
Value = string()
Fun = fun( Arg ) -> sent | close | Body
Arg = [term()]
When a valid request reaches httpd it calls do/1 in each module defined by the Modules configuration option. The function may generate data for other modules or a response that can be sent back to the client.
The field data in ModData is a list. This list will be the list returned from the last call to do/1.
Body is the body of the HTTP response that will be sent back to the client. An appropriate header will be appended to the message. StatusCode will be the status code of the response; see RFC 2616 for the appropriate values.
Head is a key value list of HTTP header fields. The server will construct an HTTP header from this data. See RFC 2616 for the appropriate value for each header field. If the client is an HTTP/1.0 client then the server will filter the list so that only HTTP/1.0 header fields will be sent back to the client.
If Body is returned and equal to {Fun,Arg}, the Web server will try apply/2 on Fun with Arg as argument and expect that the fun either returns a list (Body) that is an HTTP response or the atom sent if the HTTP response is sent back to the client. If close is returned from the fun something has gone wrong and the server will signal this to the client by closing the connection.
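A callback module implementing do/1 could be sketched as follows. The module name mod_hello_sketch and the /hello path are hypothetical. Prepending the response to the existing data list, rather than replacing the list, is an assumption made here so that data from earlier modules is preserved. demo/0 only calls do/1 with hand-built records so the sketch can be tried without a running server:

```erlang
-module(mod_hello_sketch).
-export([do/1, demo/0]).

-include_lib("inets/include/httpd.hrl").

%% Answers GET /hello itself and lets every other request pass through.
do(#mod{method = "GET", request_uri = "/hello", data = Data}) ->
    Body = "Hello from mod_hello_sketch\n",
    Head = [{code, 200},
            {content_type, "text/plain"},
            {content_length, integer_to_list(length(Body))}],
    %% Prepend the response to the data handed on to later modules.
    {proceed, [{response, {response, Head, Body}} | Data]};
do(#mod{data = Data}) ->
    %% Not our request: return the data unchanged (the OldData case).
    {proceed, Data}.

%% Calls do/1 directly with hand-built records.
demo() ->
    {proceed, [{response, {response, _Head, Body}}]} =
        do(#mod{method = "GET", request_uri = "/hello"}),
    {proceed, []} = do(#mod{method = "GET", request_uri = "/other"}),
    Body.
```

To take effect, such a module would have to be listed in the modules property, in a position compatible with the modules it depends on.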
Module:load(Line, AccIn)-> eof | ok | {ok, AccOut} | {ok, AccOut, {Option, Value}} | {ok, AccOut, [{Option, Value}]} | {error, Reason}
Line = string()
AccIn = [{Option, Value}]
AccOut = [{Option, Value}]
Option = property()
Value = term()
Reason = term()
Load is used to convert a line in an Apache like configuration file to an {Option, Value} tuple. Some more complex configuration options such as directory and security_directory will create an accumulator. This function only needs clauses for the options implemented by this particular callback module.
Module:store({Option, Value}, Config)-> {ok, {Option, NewValue}} | {error, Reason}
Line = string()
Option = property()
Config = [{Option, Value}]
Value = term()
Reason = term()
This function is used to check the validity of the configuration options before saving them in the internal database. This function may also have a side effect, i.e. setting up necessary extra resources implied by the configuration option. It can also resolve possible dependencies among configuration options by changing the value of the option. This function only needs clauses for the options implemented by this particular callback module.
Module:remove(ConfigDB) -> ok | {error, Reason}
ConfigDB = ets_table()
Reason = term()
When httpd is shut down it will try to execute remove/1 in each Erlang web server callback module. The programmer may use this function to clean up resources that may have been created in the store function.
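The store/2 and remove/1 callbacks might be paired as in this sketch. The option my_greeting and the module name mod_store_sketch are hypothetical; the module only validates its own option and allocates no resources, so remove/1 has nothing to release:

```erlang
-module(mod_store_sketch).
-export([store/2, remove/1]).

%% my_greeting is a hypothetical option owned by this module.
%% Accept it only when the value is a list (string).
store({my_greeting, Text}, _Config) when is_list(Text) ->
    {ok, {my_greeting, Text}};
store({my_greeting, Other}, _Config) ->
    {error, {bad_greeting, Other}}.

%% Nothing was allocated in store/2, so there is nothing to clean up.
remove(_ConfigDB) ->
    ok.
```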
ERLANG WEB SERVER API HELP FUNCTIONS
Functions
parse_query(QueryString) -> [{Key,Value}]
QueryString = string()
Key = string()
Value = string()
parse_query/1 parses incoming data to erl and eval scripts (see mod_esi(3)) as defined in the standard URL format, that is '+' becomes 'space' and decoding of hexadecimal characters (%xx).
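A typical use is to pick one value out of the parsed list, as in this sketch. The module name parse_query_sketch and the key "person" (borrowed from the request_uri example earlier) are chosen for illustration only:

```erlang
-module(parse_query_sketch).
-export([person/1]).

%% Returns the value of the "person" key in a query string,
%% or undefined when the key is missing.
person(QueryString) ->
    proplists:get_value("person", httpd:parse_query(QueryString)).
```

For the query string "person=jocke&lang=sv" this returns "jocke".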