httpd
An implementation of an HTTP 1.1 compliant web server, as defined in RFC 2616.
This module provides the HTTP server start options, some administrative functions, and specifies the Erlang web server callback API.
DATA TYPES
Type definitions that are used more than once in this module:
boolean() = true | false
string()
= list of ASCII characters
path() = string()
representing a file or a directory path
ip_address() = {N1,N2,N3,N4} % IPv4
| {K1,K2,K3,K4,K5,K6,K7,K8} % IPv6
hostname() = string()
representing a host, for example,
"foo.bar.com"
property() = atom()
ERLANG HTTP SERVER SERVICE START/STOP
A web server can be configured to start when starting the Inets
application, or dynamically in runtime by calling the Inets
application API inets:start(httpd, ServiceConfig) or
inets:start(httpd, ServiceConfig, How), see inets(3).
The configuration options, also called
properties, are as follows:
File Properties
When the web server is started at application start time, the
properties are to be fetched from a configuration file that can
consist of a regular Erlang property list, that is,
[{Option, Value}], where Option = property() and Value = term(),
followed by a full stop, or for backwards compatibility, an
Apache-like configuration file. If the web server is started
dynamically at runtime, a file can still be specified but also the
complete property list.
If this property is defined, Inets
expects to find
all other properties defined in this file. The
file must include all properties listed under mandatory
properties.
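As an illustration of the property-list file format, a minimal file might look like the following sketch. The file name /tmp/httpd.conf, the port number, the host name, and the directories are placeholders. The names server_name and document_root are assumed here to be the properties for the server name and the document top directory listed under Mandatory Properties.

```erlang
%% Hypothetical contents of /tmp/httpd.conf:
%% a single Erlang property list, ended by a full stop.
[{port, 8080},
 {server_name, "www.example.org"},
 {server_root, "/tmp"},
 {document_root, "/tmp/htdocs"}].
```

Under these assumptions the server could then be started with inets:start(httpd, [{proplist_file, "/tmp/httpd.conf"}]).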
If this property is defined, Inets
expects to find all
other properties defined in this file, which uses Apache-like
syntax. The file must include all properties listed
under mandatory properties. The Apache-like syntax is the property,
written as one word where each new word begins with a capital,
followed by a white-space, followed by the value, followed by a
new line.
Example:
{server_root, "/usr/local/www"} -> ServerRoot /usr/local/www
A few exceptions are documented
for each property that behaves differently,
and the special cases {directory, {path(), PropertyList}}
and {security_directory, {Dir, PropertyList}}
, are represented
as:
<Directory Dir> <Properties handled as described above> </Directory>
Note!
The properties proplist_file and file are mutually exclusive. Also, newer properties may not be supported as Apache-like options, because the Apache-like format is a legacy feature.
Mandatory Properties
The port that the HTTP server listens on.
If zero is specified as port, an arbitrary available port
is picked and function httpd:info/2
can be used to
determine which port was picked.
The name of your server, normally a fully qualified domain name.
Defines the home directory of the server, where log files, and so on, can be stored. Relative paths specified in other properties refer to this directory.
Defines the top directory for the documents that are available on the HTTP server.
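The following is a minimal sketch of starting a server dynamically with only the mandatory properties and then asking which port was picked. The module name httpd_start_sketch and the /tmp paths are hypothetical, and server_name and document_root are assumed to be the property names for the server name and document top directory described above.

```erlang
-module(httpd_start_sketch).
-export([start/0]).

%% Start Inets, start one httpd instance on an arbitrary free port,
%% and return the server pid together with the port that was picked.
start() ->
    {ok, _Started} = application:ensure_all_started(inets),
    ServerRoot = "/tmp",            % placeholder home directory
    DocRoot = "/tmp/htdocs",        % placeholder document directory
    %% Make sure the document directory exists before the server starts.
    ok = filelib:ensure_dir(filename:join(DocRoot, "index.html")),
    {ok, Pid} = inets:start(httpd, [{port, 0},
                                    {server_name, "localhost"},
                                    {server_root, ServerRoot},
                                    {document_root, DocRoot}]),
    %% Port 0 means "pick any free port"; look up the actual one.
    Port = proplists:get_value(port, httpd:info(Pid)),
    {Pid, Port}.
```

The returned pid can later be passed to inets:stop(httpd, Pid) to shut the server down.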
Communication Properties
Default is any
. any
is denoted *
in the Apache-like configuration file.
Used together with bind_address and port to uniquely identify
an HTTP server. This can be useful in a virtualized environment,
where there can be more than one server that has the same
bind_address and port. If this property is not explicitly set,
it is assumed that the bind_address and port uniquely identify
the HTTP server.
For SSL
configuration options, see
ssl:listen/2.
Default is ip_comm
.
This option is only used when option
socket_type
has value ip_comm
.
Default is inet6fb4
.
If given, sets a minimum number of bytes per second for connections.
If a connection does not reach this rate, its socket is closed.
The option helps reduce the risk of "slow DoS" attacks.
Erlang Web Server API Modules
Defines which modules the HTTP server uses when handling
requests. Default is [mod_alias, mod_auth, mod_esi,
mod_actions, mod_cgi, mod_dir, mod_get, mod_head, mod_log,
mod_disk_log]
.
Notice that some mod-modules are dependent on
others, so the order cannot be entirely arbitrary. See the
Inets Web Server Modules in the
User's Guide for details.
Limit properties
A callback module to customize the behaviour of the Inets HTTP server, see httpd_custom_api.
Allows you to disable chunked
transfer-encoding when sending a response to an HTTP/1.1
client. Default is false
.
Instructs the server whether to use persistent
connections when the client claims to be HTTP/1.1
compliant. Default is true
.
The number of seconds the server waits for a
subsequent request from the client before closing the
connection. Default is 150
.
Limits the size of the message body of an HTTP request. Default is no limit.
Limits the number of simultaneous requests that can be
supported. Default is 150
.
Limits the size of the message header of an HTTP request.
Default is 10240
.
Maximum content-length in an incoming request, in bytes. Requests
with content larger than this are answered with status 413.
Default is 100000000
(100 MB).
Limits the size of the HTTP request URI. Default is no limit.
The number of requests that a client can make on one
connection. When the server has responded to the number of
requests defined by max_keep_alive_requests, the server
closes the connection. The server closes it even if there are
queued requests. Default is no limit.
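As a rough illustration, several of the limits described above could be set together in the property list. The property names below are not shown in this section and are assumptions; the values simply repeat the defaults quoted in the descriptions.

```erlang
%% Assumed property names; values mirror the documented defaults.
[{keep_alive, true},                % use persistent connections
 {keep_alive_timeout, 150},         % seconds to wait for the next request
 {max_clients, 150},                % simultaneous requests
 {max_header_size, 10240},          % request header size
 {max_content_length, 100000000}]   % larger requests get status 413
```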
Administrative Properties
MimeType = string()
and Extension = string()
.
Files delivered to the client are MIME typed according to RFC
1590. File suffixes are mapped to MIME types before file delivery.
The mapping between file suffixes and MIME types can be specified
as an Apache-like file or directly in the property list. Such
a file can look like the following:
# MIME type      Extension
text/html        html htm
text/plain       asc txt
Default is [{"html","text/html"},{"htm","text/html"}].
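The same mapping as in the Apache-like file can be written directly in the property list. This sketch assumes the property is named mime_types and follows the {Extension, MimeType} order used by the default value.

```erlang
%% Assumed property name: mime_types.
%% Each tuple maps a file suffix to a MIME type.
{mime_types, [{"html", "text/html"},
              {"htm",  "text/html"},
              {"asc",  "text/plain"},
              {"txt",  "text/plain"}]}
```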
When the server is asked to provide a document type that cannot be determined by the MIME Type Settings, the server uses this default type.
Defines the email-address of the server administrator to be included in any error messages returned by the server.
Defines the look of the value of the server header.
Example: Assuming the version of Inets
is 5.8.1,
the server header string can look as follows for
the different values of server-tokens:
none
"" % A Server: header will not be generated
prod
"inets"
major
"inets/5"
minor
"inets/5.8"
minimal
"inets/5.8.1"
os
"inets/5.8.1 (unix)"
full
"inets/5.8.1 (unix/linux) OTP/R15B"
{private, "foo/bar"}
"foo/bar"
By default, the value is as before, that is, minimal
.
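The mapping in the example can be written down as a small function. This is only an illustration of the table, not how Inets builds the header; the module name server_tokens_sketch is hypothetical, and the os and full variants are left out because they depend on the platform and OTP release.

```erlang
-module(server_tokens_sketch).
-export([header/2]).

%% header(Tokens, Version) -> string()
%% Version is the Inets version as a string, for example "5.8.1".
header(none, _Version)           -> "";
header(prod, _Version)           -> "inets";
header(major, Version)           -> "inets/" ++ first_parts(1, Version);
header(minor, Version)           -> "inets/" ++ first_parts(2, Version);
header(minimal, Version)         -> "inets/" ++ Version;
header({private, Str}, _Version) -> Str.

%% Keep the first N dot-separated components of the version string.
first_parts(N, Version) ->
    Parts = string:split(Version, ".", all),
    lists:flatten(lists:join(".", lists:sublist(Parts, N))).
```

For example, server_tokens_sketch:header(minor, "5.8.1") gives "inets/5.8".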
Defines if access logs are to be written according to the common
log format or the extended common log format.
The common
format is one line looking like this:
remotehost rfc931 authuser [date] "request" status bytes
.
Here:
remotehost
rfc931
authuser
[date]
"request"
status
bytes
The combined
format is one line looking like this:
remotehost rfc931 authuser [date] "request" status bytes "referer" "user_agent"
In addition to the earlier:
"referer"
"user_agent"
This affects the access logs written by mod_log
and
mod_disk_log
.
Default is pretty
. If the error log is meant to be read
directly by a human, pretty
is the best option.
pretty
has a format corresponding to:
io:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason]).
compact
has a format corresponding to:
io:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason]).
This affects the error logs written by mod_log
and
mod_disk_log
.
URL Aliasing Properties - Requires mod_alias
Alias = string()
and RealName = string()
.
alias
allows documents to be stored in the local file
system instead of the document_root
location. URLs with a path
beginning with url-path are mapped to local files beginning with
directory-filename, for example:
{alias, {"/image", "/ftp/pub/image"}}
Access to http://your.server.org/image/foo.gif would refer to
the file /ftp/pub/image/foo.gif.
Re = string()
and Replacement = string()
.
re_write
allows documents to be stored in the local file
system instead of the document_root
location. URLs are rewritten
by re:replace/3
to produce a path in the local file-system,
for example:
{re_write, {"^/[~]([^/]+)(.*)$", "/home/\\1/public\\2"}}
Access to http://your.server.org/~bob/foo.gif would refer to
the file /home/bob/public/foo.gif.
In an Apache-like configuration file, Re
is separated
from Replacement
with one single space, and as expected
backslashes do not need to be backslash escaped, the
same example would become:
ReWrite ^/[~]([^/]+)(.*)$ /home/\1/public\2
Beware that any trailing space in Replacement
is kept and used as part of the replacement.
If you must have a space in Re
, use, for example, the character
encoding \040
, see
re(3).
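To see what a rewrite rule produces before putting it in the configuration, the rule can be tried directly with the re module. This is a small sketch; the module name re_write_sketch is hypothetical, and it calls re:replace/4 with the {return, list} option so that the result is a plain string.

```erlang
-module(re_write_sketch).
-export([rewrite/3]).

%% Apply one rewrite rule (Re, Replacement) to a URL path.
%% If Re does not match, the path is returned unchanged.
rewrite(Path, Re, Replacement) ->
    re:replace(Path, Re, Replacement, [{return, list}]).
```

With the rule from the example, re_write_sketch:rewrite("/~bob/foo.gif", "^/[~]([^/]+)(.*)$", "/home/\\1/public\\2") returns "/home/bob/public/foo.gif".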
directory_index
specifies a list of resources to look for
if a client requests a directory using a /
at the end of the
directory name. file
depicts the name of a file in the
directory. Several files can be given, in which case the server
returns the first it finds, for example:
{directory_index, ["index.html", "welcome.html"]}
Access to http://your.server.org/docs/ would return
http://your.server.org/docs/index.html or
http://your.server.org/docs/welcome.html if index.html does not
exist.
CGI Properties - Requires mod_cgi
Alias = string()
and RealName = string()
.
Have the same behavior as property alias
, except that
they also mark the target directory as containing CGI
scripts. URLs with a path beginning with url-path are mapped to
scripts beginning with directory-filename, for example:
{script_alias, {"/cgi-bin/", "/web/cgi-bin/"}}
Access to http://your.server.org/cgi-bin/foo would cause
the server to run the script /web/cgi-bin/foo.
Re = string()
and Replacement = string()
.
Have the same behavior as property re_write
, except that
they also mark the target directory as containing CGI
scripts. URLs with a path beginning with url-path are mapped to
scripts beginning with directory-filename, for example:
{script_re_write, {"^/cgi-bin/(\\d+)/", "/web/\\1/cgi-bin/"}}
Access to http://your.server.org/cgi-bin/17/foo would cause
the server to run the script /web/17/cgi-bin/foo.
If script_nocache
is set to true
, the HTTP server by
default adds the header fields necessary to prevent proxies from
caching the page. Generally this is preferred.
Default is false.
The time in seconds the web server waits between each
chunk of data from the script. If the CGI script does not deliver
any data before the timeout, the connection to the client is
closed. Default is 15
.
MimeType = string()
and CgiScript = string()
.
action
adds an action activating a CGI script
whenever a file of a certain MIME type is requested. It
propagates the URL and file path of the requested document using
the standard CGI PATH_INFO and PATH_TRANSLATED environment
variables.
Example:
{action, {"text/plain", "/cgi-bin/log_and_deliver_text"}}
Method = string()
and CgiScript = string()
.
script
adds an action activating a CGI script
whenever a file is requested using a certain HTTP method. The
method is either GET or POST.
Example:
{script, {"PUT", "/cgi-bin/put"}}
ESI Properties - Requires mod_esi
URLPath = string()
and AllowedModule = atom()
.
erl_script_alias
marks all URLs matching url-path as erl
scheme scripts. A matching URL is mapped into a specific module
and function, for example:
{erl_script_alias, {"/cgi-bin/example", [httpd_example]}}
A request to
http://your.server.org/cgi-bin/example/httpd_example:yahoo
would refer to httpd_example:yahoo/3 or, if that does not exist,
httpd_example:yahoo/2 and
http://your.server.org/cgi-bin/example/other:yahoo would
not be allowed to execute.
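A callback for the erl scheme could look like the following sketch. The module esi_sketch and the function hello/3 are hypothetical; the sketch assumes a configuration such as {erl_script_alias, {"/esi", [esi_sketch]}}, so that a request to http://your.server.org/esi/esi_sketch:hello reaches the function.

```erlang
-module(esi_sketch).
-export([hello/3]).

%% Called by mod_esi for URLs of the form .../esi_sketch:hello
%% SessionID identifies the response; Env and Input are ignored here.
hello(SessionID, _Env, _Input) ->
    %% The first chunk carries the HTTP header fields and ends
    %% with an empty line.
    ok = mod_esi:deliver(SessionID, "Content-Type: text/plain\r\n\r\n"),
    %% Later chunks carry the body.
    mod_esi:deliver(SessionID, "hello from ESI\n").
```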
If erl_script_nocache
is set to true
, the server adds
HTTP header fields preventing proxies from caching the
page. This is generally a good idea for dynamic content, as
the content often varies between each request.
Default is false
.
erl_script_timeout sets the time in seconds the server
waits between each chunk of data to be delivered through
mod_esi:deliver/2. Default is 15. This is only relevant
for scripts that use the erl scheme.
URLPath = string()
and AllowedModule = atom()
.
Same as erl_script_alias
but for scripts
using the eval scheme. This is only supported
for backwards compatibility. The eval scheme is deprecated.
Log Properties - Requires mod_log
Defines the filename of the error log file to be used to log
server errors. If the filename does not begin with a slash (/),
it is assumed to be relative to the server_root
.
Defines the filename of the access log file to be used to
log security events. If the filename does not begin with a slash
(/), it is assumed to be relative to the server_root
.
Defines the filename of the access log file to be used to
log incoming requests. If the filename does not begin with a
slash (/), it is assumed to be relative to the server_root
.
Disk Log Properties - Requires mod_disk_log
Defines the file format of the log files. See disk_log
for
details. If the internal file format is used, the
log file is repaired after a crash. When a log file is
repaired, data can disappear. When the external file format is
used, httpd
does not start if the log file is broken. Default is
external
.
Defines the filename of the (disk_log(3)
) error log file
to be used to log server errors. If the filename does not begin
with a slash (/), it is assumed to be relative to the server_root
.
MaxBytes = integer()
and MaxFiles = integer()
.
Defines the properties of the (disk_log(3)) error log
file. This file is a wrap log: at most MaxBytes is written to
each file, and MaxFiles files are used before the first file is
truncated and reused.
Defines the filename of the (disk_log(3)
) access log file
logging incoming security events, that is, authenticated
requests. If the filename does not begin with a slash (/), it
is assumed to be relative to the server_root
.
MaxBytes = integer()
and MaxFiles = integer()
.
Defines the properties of the disk_log(3) access log
file. This file is a wrap log: at most MaxBytes is written to
each file, and MaxFiles files are used before the first file is
truncated and reused.
Defines the filename of the (disk_log(3)
) access log file
logging incoming requests. If the filename does not begin
with a slash (/), it is assumed to be relative to the
server_root
.
MaxBytes = integer()
and MaxFiles = integer()
.
Defines the properties of the disk_log(3) access log
file. This file is a wrap log: at most MaxBytes is written to
each file, and MaxFiles files are used before the first file is
truncated and reused.
Authentication Properties - Requires mod_auth
{directory, {path(), [{property(), term()}]}}
The properties for directories are as follows:
Defines a set of hosts to be granted access to a
given directory, for example:
{allow_from, ["123.34.56.11", "150.100.23"]}
The host 123.34.56.11
and all machines on the 150.100.23
subnet are allowed access.
Defines a set of hosts
to be denied access to a given directory, for example:
{deny_from, ["123.34.56.11", "150.100.23"]}
The host 123.34.56.11
and all machines on the 150.100.23
subnet are not allowed access.
Sets the type of authentication database that is used for the
directory. The key difference between the different methods is
that dynamic data can be saved when Mnesia
and Dets
are used.
This property is called AuthDbType
in the Apache-like
configuration files.
Sets the name of a file containing the list of users and
passwords for user authentication. The filename can be either
absolute or relative to the server_root
. If using the
plain storage method, this file is a plain text file where
each line contains a username followed by a colon, followed
by the non-encrypted password. If usernames are duplicated,
the behavior is undefined.
Example:
ragnar:s7Xxv7
edward:wwjau8
If the Dets
storage method is used, the user database is
maintained by Dets
and must not be edited by hand. Use the
API functions in module mod_auth
to create/edit the user
database. This directive is ignored if the Mnesia
storage method is used. For security reasons, ensure that
auth_user_file
is stored outside the document tree of the web
server. If it is placed in the directory that it protects,
clients can download it.
Sets the name of a file containing the list of user
groups for user authentication. The filename can be either
absolute or relative to the server_root
. If the plain
storage method is used, the group file is a plain text file, where
each line contains a group name followed by a colon, followed
by the members' usernames separated by spaces.
Example:
group1: bob joe ante
If the Dets
storage method is used, the group database is
maintained by Dets
and must not be edited by hand. Use the
API for module mod_auth
to create/edit the group database.
This directive is ignored if the Mnesia
storage method is used.
For security reasons, ensure that the auth_group_file
is
stored outside the document tree of the web server. If it is
placed in the directory that it protects, clients
can download it.
Sets the name of the authorization realm (auth-domain) for a directory. This string informs the client about which username and password to use.
If set to other than "NoPassword", the password is required for all API calls. If the password is set to "DummyPassword", the password must be changed before any other API calls. To secure the authenticating data, the password must be changed after the web server is started. Otherwise it is written in clear text in the configuration file.
Defines users to grant access to a given directory using a secret password.
Defines users to grant access to a given directory using a secret password.
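Put together, a protected directory might be configured as in the sketch below. The directory and file paths are placeholders. The names auth_user_file, auth_group_file, allow_from, and directory appear in the text above; auth_type, auth_name, and require_user are assumed names for the database type, realm, and user list properties described in this section.

```erlang
%% auth_type, auth_name, and require_user are assumed property names.
{directory, {"/tmp/htdocs/secret",
             [{auth_type, plain},
              {auth_user_file, "/tmp/auth/passwd"},   % outside the document tree
              {auth_group_file, "/tmp/auth/group"},   % outside the document tree
              {auth_name, "Secret Area"},
              {require_user, ["ragnar", "edward"]},
              {allow_from, ["123.34.56.11", "150.100.23"]}]}}
```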
Htaccess Authentication Properties - Requires mod_htaccess
Specifies the filenames that are used for access files. When a request arrives, every directory in the path to the requested asset is searched for files with the names specified by this parameter. If such a file is found, the file is parsed and the restrictions specified in it are applied to the request.
Security Properties - Requires mod_security
{security_directory, {path(), [{property(), term()}]}}
The properties for the security directories are as follows:
Name of the security data file. The filename can either be
absolute or relative to the server_root
. This file is used to
store persistent data for module mod_security
.
Specifies the maximum number of attempts to authenticate a
user before the user is blocked out. If a user
successfully authenticates while blocked, the
user receives a 403 (Forbidden) response from the
server. If the user makes a failed attempt while blocked, the
server returns 401 (Unauthorized), for security
reasons.
Default is 3
. Can be set to infinity.
Specifies the number of minutes a user is blocked. After
this time has passed, the user automatically regains access.
Default is 60
.
Specifies the number of minutes a failed user authentication
is remembered. If a user authenticates after this
time has passed, the previous failed authentications are
forgotten.
Default is 30
.
Functions
info(Pid) ->
info(Pid, Properties) -> [{Option, Value}]
Properties = [property()]
Option = property()
Value = term()
Fetches information about the HTTP server. When called with only the pid, all properties are fetched. When called with a list of specific properties, they are fetched. The available properties are the same as the start options of the server.
Note!
Pid is the pid returned from inets:start/[2,3].
It can also be retrieved from inets:services/0 and
inets:services_info/0, see inets(3).
info(Address, Port) ->
info(Address, Port, Profile) ->
info(Address, Port, Profile, Properties) -> [{Option, Value}]
info(Address, Port, Properties) -> [{Option, Value}]
Address = ip_address()
Port = integer()
Profile = atom()
Properties = [property()]
Option = property()
Value = term()
Fetches information about the HTTP server. When called with
only Address
and Port
, all properties are
fetched. When called with a list of specific properties, they
are fetched. The available properties are the same as the
start options of the server.
Note!
The address must be the IP address and cannot be the hostname.
reload_config(Config, Mode) -> ok | {error, Reason}
Config = path() | [{Option, Value}]
Option = property()
Value = term()
Mode = non_disturbing | disturbing
Reloads the HTTP server configuration without restarting the server. Incoming requests are answered with a temporary down message during the reload time.
Note!
Available properties are the same as the
start options of the server, but the properties
bind_address
and port
cannot be changed.
If mode is disturbing, the server is blocked forcefully, all ongoing requests are terminated, and the reload starts immediately. If mode is non_disturbing, no new connections are accepted, but ongoing requests are allowed to complete before the reload is done.
ERLANG WEB SERVER API DATA TYPES
The Erlang web server API data types are as follows:
ModData = #mod{}

-record(mod, {
    data = [],
    socket_type = ip_comm,
    socket,
    config_db,
    method,
    absolute_uri,
    request_uri,
    http_version,
    request_line,
    parsed_header = [],
    entity_body,
    connection
}).
To access the record in your callback module, use:
-include_lib("inets/include/httpd.hrl").
The fields of record mod
have the following meaning:
data
Type [{InteractionKey,InteractionValue}]
is used to
propagate data between modules. Depicted
interaction_data()
in function type declarations.
socket_type
socket_type()
indicates whether it is an IP socket or an ssl
socket.
socket
The socket, in format ip_comm
or ssl
,
depending on socket_type
.
config_db
The config file directives stored as key-value tuples in
an ETS table. Depicted config_db()
in function type
declarations.
method
Type "GET" | "POST" | "HEAD" | "TRACE"
, that is, the
HTTP method.
absolute_uri
If the request is an HTTP/1.1
request, the URI can be in the absolute URI format. In that
case, httpd
saves the absolute URI in this field. An example
of an absolute URI is
"http://ServerName:Port/cgi-bin/find.pl?person=jocke"
request_uri
The Request-URI, for example,
"/cgi-bin/find.pl?person=jocke".
http_version
The HTTP
version of the
request, that is, "HTTP/0.9", "HTTP/1.0", or "HTTP/1.1".
request_line
The Request-Line, for example,
"GET /cgi-bin/find.pl?person=jocke HTTP/1.0".
parsed_header
[{HeaderKey,HeaderValue}]
.
parsed_header contains all HTTP header fields from the
HTTP request stored in a list as key-value tuples, for example,
{"date","Wed, 15 Oct 1997 14:35:17 GMT"}.
RFC 2616 defines that HTTP is a case-insensitive protocol and
the header fields can be in lower case or upper case. httpd
ensures that all header field names are in lower case.
entity_body
The Entity-Body of the request.
connection
true | false
. If set to true
, the connection to the
client is a persistent connection and is not closed when
the request is served.
ERLANG WEB SERVER API CALLBACK FUNCTIONS
Functions
Module:do(ModData)-> {proceed, OldData} | {proceed, NewData} | {break, NewData} | done
OldData = list()
NewData = [{response,{StatusCode,Body}}]
| [{response,{response,Head,Body}}]
| [{response,{already_sent,StatusCode,Size}}]
StatusCode = integer()
Body = io_list() | nobody | {Fun, Arg}
Head = [HeaderOption]
HeaderOption = {Option, Value} | {code, StatusCode}
Option = accept_ranges | allow
| cache_control | content_MD5
| content_encoding | content_language
| content_length | content_location
| content_range | content_type | date
| etag | expires | last_modified
| location | pragma | retry_after
| server | trailer | transfer_encoding
Value = string()
Fun = fun( Arg ) -> sent | close | Body
Arg = [term()]
When a valid request reaches httpd, it calls do/1 in
each module listed in the modules configuration option.
The function can generate data for other
modules or a response that can be sent back to the client.
The field data
in ModData
is a list. This list is
the list returned from the last call to
do/1
.
Body is the body of the HTTP response that is
sent back to the client. An appropriate header is
appended to the message. StatusCode is the
status code of the response.
Head is a key-value list of HTTP header fields. The
server constructs an HTTP header from this data.
If Body
is returned and equal to {Fun,Arg}
,
the web server tries apply/2
on Fun
with
Arg
as argument. The web server expects that the fun either
returns a list (Body)
that is an HTTP response, or the
atom sent
if the HTTP response is sent back to the
client. If close
is returned from the fun, something has gone
wrong and the server signals this to the client by
closing the connection.
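A callback module that answers a single URI could be sketched as follows. The module name mod_hello_sketch, the helper respond/2, and the "/hello" path are hypothetical. The sketch assumes the module is listed in the server's module list before the modules that would otherwise handle the request, and that prepending the response to the existing interaction data is acceptable.

```erlang
-module(mod_hello_sketch).
-export([do/1, respond/2]).

-include_lib("inets/include/httpd.hrl").

%% Callback invoked by httpd for every valid request.
do(#mod{request_uri = Uri, data = Data}) ->
    respond(Uri, Data).

%% Answer "/hello" unless an earlier module already produced a
%% response; in every other case pass the data on unchanged.
respond("/hello", Data) ->
    case proplists:is_defined(response, Data) of
        true ->
            {proceed, Data};
        false ->
            Body = "hello\n",
            Head = [{code, 200},
                    {content_type, "text/plain"},
                    {content_length, integer_to_list(length(Body))}],
            {proceed, [{response, {response, Head, Body}} | Data]}
    end;
respond(_Uri, Data) ->
    {proceed, Data}.
```

Keeping the decision in respond/2 makes it possible to try the logic from the shell without constructing a full #mod{} record.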
Module:load(Line, AccIn)-> eof | ok | {ok, AccOut} | {ok, AccOut, {Option, Value}} | {ok, AccOut, [{Option, Value}]} | {error, Reason}
Line = string()
AccIn = [{Option, Value}]
AccOut = [{Option, Value}]
Option = property()
Value = term()
Reason = term()
Converts a line in an Apache-like
configuration file to an {Option, Value}
tuple. Some
more complex configuration options, such as directory
and security_directory
, create an
accumulator. This function only needs clauses for the
options implemented by this particular callback module.
Module:remove(ConfigDB) -> ok | {error, Reason}
ConfigDB = ets_table()
Reason = term()
When httpd
is shut down, it tries to execute
remove/1
in each Erlang web server callback module. The
programmer can use this function to clean up resources
created in the store function.
Module:store({Option, Value}, Config)-> {ok, {Option, NewValue}} | {error, Reason}
Line = string()
Option = property()
Config = [{Option, Value}]
Value = term()
Reason = term()
Checks the validity of the configuration options before saving them in the internal database. This function can also have a side effect, that is, setup of necessary extra resources implied by the configuration option. It can also resolve possible dependencies among configuration options by changing the value of the option. This function only needs clauses for the options implemented by this particular callback module.
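As a sketch of the validation and value-rewriting roles described above, a store/2 callback for one option could look like this. The module name mod_store_sketch, the option hello_greeting, and the wrong_type error term are hypothetical and chosen only for illustration.

```erlang
-module(mod_store_sketch).
-export([store/2]).

%% hello_greeting is a hypothetical option owned by this module.
%% A string value is accepted as is.
store({hello_greeting, Value}, _Config) when is_list(Value) ->
    {ok, {hello_greeting, Value}};
%% An atom is normalised to a string before it is saved.
store({hello_greeting, Value}, _Config) when is_atom(Value) ->
    {ok, {hello_greeting, atom_to_list(Value)}};
%% Anything else is rejected.
store({hello_greeting, Value}, _Config) ->
    {error, {wrong_type, {hello_greeting, Value}}}.
```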
ERLANG WEB SERVER API HELP FUNCTIONS
Functions
parse_query(QueryString) -> [{Key,Value}]
QueryString = string()
Key = string()
Value = string()
parse_query/1 parses incoming data to erl and eval
scripts (see mod_esi(3)) as defined in the standard
URL format, that is, '+' is decoded to a space and
hexadecimal escapes (%xx) are decoded to the characters
they represent.
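The decoding described above can be tried out with a small sketch. It uses uri_string:dissect_query/1 from STDLIB as a stand-in that performs the same kind of decoding, rather than httpd:parse_query/1 itself, so that the example does not depend on the Inets version in use. The module name parse_query_sketch is hypothetical.

```erlang
-module(parse_query_sketch).
-export([decode/1]).

%% Decode a query string into a list of {Key, Value} tuples.
%% '+' is turned into a space and %xx escapes are decoded.
decode(QueryString) ->
    uri_string:dissect_query(QueryString).
```

For example, parse_query_sketch:decode("person=jocke&city=New+York") returns [{"person","jocke"},{"city","New York"}].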