gen_sctp

The gen_sctp module provides functions for communicating with sockets using the SCTP protocol.

The implementation assumes that the OS kernel supports SCTP (RFC2960) through the user-level Sockets API Extensions. During development this implementation was tested on Linux Fedora Core 5.0 (kernel 2.6.15-2054 or later is needed) and on Solaris 10 and 11. During OTP adaptation it was tested on SUSE Linux Enterprise Server 10 (x86_64) kernel 2.6.16.27-0.6-smp with lksctp-tools-1.0.6, briefly on Solaris 10, and later on SUSE Linux Enterprise Server 10 Service Pack 1 (x86_64) kernel 2.6.16.54-0.2.3-smp with lksctp-tools-1.0.7.

Record definitions for the gen_sctp module can be found using:

  -include_lib("kernel/include/inet_sctp.hrl").    

These record definitions use the "new" spelling 'adaptation', not the deprecated 'adaption', regardless of which spelling the underlying C API uses.
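As a minimal sketch of how these records are typically brought into scope, the module below includes the header and uses two of the records described later on this page. The module name sctp_records_demo and its function names are hypothetical, chosen only for illustration.

```erlang
-module(sctp_records_demo).
-export([initmsg/2, assoc_id/1]).

%% Brings #sctp_initmsg{}, #sctp_assoc_change{} and the other
%% gen_sctp records into scope.
-include_lib("kernel/include/inet_sctp.hrl").

%% Build an sctp_initmsg option; fields not mentioned stay undefined.
initmsg(OutStreams, MaxInStreams) ->
    {sctp_initmsg, #sctp_initmsg{num_ostreams  = OutStreams,
                                 max_instreams = MaxInStreams}}.

%% Extract the association ID from an association-change event.
assoc_id(#sctp_assoc_change{assoc_id = Id}) ->
    Id.
```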

DATA TYPES

assoc_id()

An opaque term, returned in for example #sctp_paddr_change{}, that identifies an association for an SCTP socket. The term is opaque except for the special value 0, which means "the whole endpoint" or "all future associations".

charlist() = [char()]
iolist() = [char() | binary()]
ip_address()

Represents an address of an SCTP socket. It is a tuple as explained in inet(3).

port_number() = 0 .. 65535
posix()

See inet(3); POSIX Error Codes.

sctp_option()

One of the SCTP Socket Options.

sctp_socket()

Socket identifier returned from open/*.

timeout() = int() | infinity

Timeout used in SCTP connect and receive calls.

FUNCTIONS

abort(sctp_socket(), Assoc) -> ok | {error, posix()}

  • Assoc = #sctp_assoc_change{}

Abnormally terminates the association given by Assoc, without flushing unsent data. The socket itself remains open. Other associations opened on this socket are still valid, and the socket can be used for new associations.

close(sctp_socket()) -> ok | {error, posix()}

Completely closes the socket and all associations on it. Unsent data is flushed as in eof/2. Whether the close/1 call blocks depends on the value of the linger socket option. If close does not linger, or the linger timeout expires, the call returns and the data is flushed in the background.

connect(Socket, Addr, Port, Opts) -> {ok,Assoc} | {error, posix()}

Same as connect(Socket, Addr, Port, Opts, infinity).

connect(Socket, Addr, Port, [Opt], Timeout) -> {ok, Assoc} | {error, posix()}

  • Socket = sctp_socket()
  • Addr = ip_address() | Host
  • Port = port_number()
  • Opt = sctp_option()
  • Timeout = timeout()
  • Host = atom() | string()
  • Assoc = #sctp_assoc_change{}

Establishes a new association for the socket Socket, with the peer (SCTP server socket) given by Addr and Port. Timeout is expressed in milliseconds. A socket can be associated with multiple peers.

WARNING: Using a value of Timeout less than the maximum time taken by the OS to establish an association (around 4.5 minutes if the default values from RFC 4960 are used) can result in inconsistent or incorrect return values. This is especially relevant for associations sharing the same Socket (i.e. source address and port) since the controlling process blocks until connect/* returns. connect_init/* provides an alternative not subject to this limitation.

The result of connect/* is an #sctp_assoc_change{} event which contains, in particular, the new Association ID:

  #sctp_assoc_change{
        state             = atom(),
        error             = atom(),
        outbound_streams  = int(),
        inbound_streams   = int(),
        assoc_id          = assoc_id()
  }        

The number of outbound and inbound streams can be set by giving an sctp_initmsg option to connect as in:

  connect(Socket, Ip, Port,
        [{sctp_initmsg,#sctp_initmsg{num_ostreams=OutStreams,
                                     max_instreams=MaxInStreams}}])        

All options Opt are set on the socket before the association is attempted. If an option record has undefined field values, the current option record is first read from the socket to fill in those values. In effect, Opt option records only define the field values to change before connecting.

The returned outbound_streams and inbound_streams are the actual stream numbers on the socket, which may be different from the requested values (OutStreams and MaxInStreams respectively) if the peer requires lower values.

The following values of state are possible:

  • comm_up: association successfully established. This indicates a successful completion of connect.

  • cant_assoc: association cannot be established (connect/* failure).

All other states do not normally occur in the output from connect/*. Rather, they may occur in #sctp_assoc_change{} events received instead of data in recv/* calls. All of them indicate losing the association due to various error conditions, and are listed here for the sake of completeness. The error field may provide more detailed diagnostics.

  • comm_lost;

  • restart;

  • shutdown_comp.
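The description of connect/* above can be sketched as follows. This is an illustrative module, not part of gen_sctp: the names sctp_connect_demo, connect/2 and classify/1 are hypothetical, and the request for 5 streams in each direction is an arbitrary choice. The classify/1 helper only interprets the documented return shapes, so it can be exercised without a socket.

```erlang
-module(sctp_connect_demo).
-export([connect/2, classify/1]).
-include_lib("kernel/include/inet_sctp.hrl").

%% Open a socket and connect, requesting 5 streams in each direction.
connect(Host, Port) ->
    {ok, Socket} = gen_sctp:open(),
    Opts = [{sctp_initmsg, #sctp_initmsg{num_ostreams  = 5,
                                         max_instreams = 5}}],
    case classify(gen_sctp:connect(Socket, Host, Port, Opts)) of
        {ok, AssocId, Streams} ->
            {ok, Socket, AssocId, Streams};
        {error, _} = Error ->
            _ = gen_sctp:close(Socket),
            Error
    end.

%% Interpret the result of connect/4,5.
%% comm_up carries the negotiated stream counts, which may be lower
%% than the requested ones.
classify({ok, #sctp_assoc_change{state            = comm_up,
                                 outbound_streams = Out,
                                 inbound_streams  = In,
                                 assoc_id         = AssocId}}) ->
    {ok, AssocId, {Out, In}};
%% Any other state (for example cant_assoc) is treated as a failure.
classify({ok, #sctp_assoc_change{state = State, error = Error}}) ->
    {error, {State, Error}};
classify({error, Reason}) ->
    {error, Reason}.
```

Returning the negotiated {Out, In} pair lets the caller pick a valid stream number for later send/4 calls.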

connect_init(Socket, Addr, Port, Opts) -> ok | {error, posix()}

Same as connect_init(Socket, Addr, Port, Opts, infinity).

connect_init(Socket, Addr, Port, [Opt], Timeout) -> ok | {error, posix()}

  • Socket = sctp_socket()
  • Addr = ip_address() | Host
  • Port = port_number()
  • Opt = sctp_option()
  • Timeout = timeout()
  • Host = atom() | string()

Initiates a new association for the socket Socket, with the peer (SCTP server socket) given by Addr and Port.

The fundamental difference between this API and connect/* is that the return value is that of the underlying OS connect(2) system call. If ok is returned, the result of the association establishment is received by the calling process as an #sctp_assoc_change{} event. The calling process must be prepared to receive this, or poll for it using recv/*, depending on the value of the active option.

The parameters are as described in connect/*, with the exception of the Timeout value.

The timer associated with Timeout only supervises IP resolution of Addr.
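A minimal sketch of the polling variant described above, assuming a passive socket (the default) so that the #sctp_assoc_change{} event is fetched with recv/2. The module name sctp_connect_init_demo, the functions connect/3 and interpret/1, and the WaitMs parameter are hypothetical names used only for this illustration.

```erlang
-module(sctp_connect_init_demo).
-export([connect/3, interpret/1]).
-include_lib("kernel/include/inet_sctp.hrl").

%% Start an association without blocking in connect, then poll for
%% the outcome for at most WaitMs milliseconds.
connect(Host, Port, WaitMs) ->
    {ok, Socket} = gen_sctp:open(),          % passive mode by default
    case gen_sctp:connect_init(Socket, Host, Port, []) of
        ok ->
            case interpret(gen_sctp:recv(Socket, WaitMs)) of
                {ok, Assoc} ->
                    {ok, Socket, Assoc};
                {error, _} = Error ->
                    _ = gen_sctp:close(Socket),
                    Error
            end;
        {error, _} = Error ->
            _ = gen_sctp:close(Socket),
            Error
    end.

%% Interpret one recv/2 result while waiting for the association.
interpret({ok, {_IP, _Port, _Anc,
                #sctp_assoc_change{state = comm_up} = Assoc}}) ->
    {ok, Assoc};
interpret({ok, {_IP, _Port, _Anc, Other}}) ->
    {error, {unexpected, Other}};
interpret({error, Reason}) ->
    {error, Reason}.
```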

controlling_process(sctp_socket(), pid()) -> ok

Assigns a new controlling process Pid to Socket. Same implementation as gen_udp:controlling_process/2.

eof(Socket, Assoc) -> ok | {error, Reason}

  • Socket = sctp_socket()
  • Assoc = #sctp_assoc_change{}

Gracefully terminates the association given by Assoc, with flushing of all unsent data. The socket itself remains open. Other associations opened on this socket are still valid, and it can be used in new associations.

listen(Socket, IsServer) -> ok | {error, Reason}

  • Socket = sctp_socket()
  • IsServer = bool()

Sets up a socket to listen on the IP address and port number it is bound to. IsServer must be 'true' or 'false'. In contrast to TCP, SCTP has no listening queue length. If IsServer is 'true' the socket accepts new associations, i.e. it becomes an SCTP server socket.

open() -> {ok, Socket} | {error, posix()}

open(Port) -> {ok, Socket} | {error, posix()}

open([Opt]) -> {ok, Socket} | {error, posix()}

open(Port, [Opt]) -> {ok, Socket} | {error, posix()}

  • Opt = {ip,IP} | {ifaddr,IP} | {port,Port} | sctp_option()
  • IP = ip_address() | any | loopback
  • Port = port_number()

Creates an SCTP socket and binds it to the local addresses specified by all {ip,IP} (or synonymously {ifaddr,IP}) options (this feature is called SCTP multi-homing). The default IP and Port are any and 0, meaning bind to all local addresses on any one free port.

A default set of socket options is used. In particular, the socket is opened in binary and passive mode, and with reasonably large kernel and driver buffers.
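The multi-homing bind described above can be sketched like this. The module sctp_open_demo and its functions are hypothetical; the sketch simply passes one {ip,IP} option per local address, followed by the port, and then turns the socket into a server socket with listen/2.

```erlang
-module(sctp_open_demo).
-export([bind_opts/2, open_server/2]).

%% Build the bind options: one {ip, IP} per local address, plus the port.
bind_opts(IPs, Port) ->
    [{ip, IP} || IP <- IPs] ++ [{port, Port}].

%% Open a socket bound to all of IPs on Port and accept associations.
open_server(IPs, Port) ->
    case gen_sctp:open(bind_opts(IPs, Port)) of
        {ok, Socket} ->
            case gen_sctp:listen(Socket, true) of
                ok ->
                    {ok, Socket};
                {error, _} = Error ->
                    _ = gen_sctp:close(Socket),
                    Error
            end;
        {error, _} = Error ->
            Error
    end.
```

With an empty address list only {port,Port} is passed, so the default IP any applies.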

recv(sctp_socket()) -> {ok, {FromIP, FromPort, AncData, Data}} | {error, Reason}

recv(sctp_socket(), timeout()) -> {ok, {FromIP, FromPort, AncData, Data}} | {error, Reason}

  • FromIP = ip_address()
  • FromPort = port_number()
  • AncData = [#sctp_sndrcvinfo{}]
  • Data = binary() | charlist() | #sctp_sndrcvinfo{} | #sctp_assoc_change{} | #sctp_paddr_change{} | #sctp_adaptation_event{}
  • Reason = posix() | #sctp_send_failed{} | #sctp_paddr_change{} | #sctp_pdapi_event{} | #sctp_remote_error{} | #sctp_shutdown_event{}

Receives the Data message from any association of the socket. If the receive times out, {error, timeout} is returned. The default timeout is infinity. FromIP and FromPort indicate the sender's address.

AncData is a list of Ancillary Data items which may be received along with the main Data. This list can be empty, or contain a single #sctp_sndrcvinfo{} record if receiving of such ancillary data is enabled (see option sctp_events). It is enabled by default, since such ancillary data provides an easy way of determining the association and stream over which the message has been received. (An alternative way would be to get the Association ID from the FromIP and FromPort using the sctp_get_peer_addr_info socket option, but this would still not produce the Stream number.)

The actual Data received may be a binary(), or list() of bytes (integers in the range 0 through 255) depending on the socket mode, or an SCTP Event. The following SCTP Events are possible:

  • #sctp_sndrcvinfo{}

  • #sctp_assoc_change{};

  •   #sctp_paddr_change{
            addr      = {ip_address(),port_number()},
            state     = atom(),
            error     = int(),
            assoc_id  = assoc_id()
      }            

    Indicates change of the status of the peer's IP address given by addr within the association assoc_id. Possible values of state (mostly self-explanatory) include:

    • addr_unreachable;

    • addr_available;

    • addr_removed;

    • addr_added;

    • addr_made_prim;

    • addr_confirmed.

    In case of an error (e.g. addr_unreachable), the error field provides additional diagnostics. In such cases, the #sctp_paddr_change{} Event is automatically converted into an error term returned by gen_sctp:recv. The error field value can be converted into a string using error_string/1.

  •   #sctp_send_failed{
            flags     = true | false,
            error     = int(),
            info      = #sctp_sndrcvinfo{},
            assoc_id  = assoc_id(),
            data      = binary()
      }            

    The sender may receive this event if a send operation fails. flags is a Boolean specifying whether the data has actually been transmitted over the wire; error provides extended diagnostics (use error_string/1); info is the original #sctp_sndrcvinfo{} record used in the failed send/*; and data is the whole original data chunk that the send attempted to transmit.

    In the current implementation of the Erlang/SCTP binding, this Event is internally converted into an error term returned by recv/*.

  •   #sctp_adaptation_event{
            adaptation_ind = int(),
            assoc_id       = assoc_id()
      }            

    Delivered when a peer sends an Adaptation Layer Indication parameter (configured through the option sctp_adaptation_layer). Note that with the current implementation of the Erlang/SCTP binding, this event is disabled by default.

  •   #sctp_pdapi_event{
            indication = sctp_partial_delivery_aborted,
            assoc_id   = assoc_id()
      }            

    A partial delivery failure. In the current implementation of the Erlang/SCTP binding, this Event is internally converted into an error term returned by recv/*.
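As an illustration of the return shapes listed above, the following sketch classifies a single recv/2 result into data, events, timeout or error. The module name sctp_recv_demo and the tags it returns (data, assoc_change, paddr_change, event) are hypothetical; only the tuple and record shapes come from this page.

```erlang
-module(sctp_recv_demo).
-export([recv_one/2, describe/1]).
-include_lib("kernel/include/inet_sctp.hrl").

%% Receive one item from the socket and classify it.
recv_one(Socket, Timeout) ->
    describe(gen_sctp:recv(Socket, Timeout)).

%% User data with ancillary data: report association and stream.
describe({ok, {_IP, _Port,
               [#sctp_sndrcvinfo{stream = Stream, assoc_id = AssocId}],
               Data}}) when is_binary(Data); is_list(Data) ->
    {data, AssocId, Stream, Data};
%% User data without ancillary data (data_io_event disabled).
describe({ok, {_IP, _Port, [], Data}})
  when is_binary(Data); is_list(Data) ->
    {data, unknown, unknown, Data};
%% SCTP events delivered instead of data.
describe({ok, {_IP, _Port, _Anc,
               #sctp_assoc_change{state = State, assoc_id = AssocId}}}) ->
    {assoc_change, AssocId, State};
describe({ok, {_IP, _Port, _Anc,
               #sctp_paddr_change{addr = Addr, state = State}}}) ->
    {paddr_change, Addr, State};
describe({ok, {_IP, _Port, _Anc, Event}}) ->
    {event, Event};
describe({error, timeout}) ->
    timeout;
describe({error, Reason}) ->
    {error, Reason}.
```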

send(Socket, SndRcvInfo, Data) -> ok | {error, Reason}

  • Socket = sctp_socket()
  • SndRcvInfo = #sctp_sndrcvinfo{}
  • Data = binary() | iolist()

Sends the Data message with all sending parameters from a #sctp_sndrcvinfo{} record. This way, the user can specify the PPID (passed to the remote end) and Context (passed to the local SCTP layer) which can be used for example for error identification. However, such a fine level of user control is rarely required. The send/4 function is sufficient for most applications.

send(Socket, Assoc, Stream, Data) -> ok | {error, Reason}

  • Socket = sctp_socket()
  • Assoc = #sctp_assoc_change{} | assoc_id()
  • Stream = integer()
  • Data = binary() | iolist()

Sends Data message over an existing association and given stream.
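The two send variants can be contrasted with a short sketch. The module sctp_send_demo and its function names are hypothetical. It is an assumption of this sketch that the fields of #sctp_sndrcvinfo{} left undefined are acceptable to send/3; consult the record documentation before relying on it.

```erlang
-module(sctp_send_demo).
-export([send_simple/3, send_with_ppid/5, sndrcvinfo/3]).
-include_lib("kernel/include/inet_sctp.hrl").

%% send/4: enough for most applications. Sends Data on stream 0.
send_simple(Socket, Assoc, Data) ->
    gen_sctp:send(Socket, Assoc, 0, Data).

%% Build the sending parameters for send/3, including a PPID that is
%% passed to the remote end. Other fields are left undefined (assumption).
sndrcvinfo(AssocId, Stream, PPID) ->
    #sctp_sndrcvinfo{assoc_id = AssocId,
                     stream   = Stream,
                     ppid     = PPID}.

%% send/3: full control over the sending parameters.
send_with_ppid(Socket, AssocId, Stream, PPID, Data) ->
    gen_sctp:send(Socket, sndrcvinfo(AssocId, Stream, PPID), Data).
```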

error_string(integer()) -> ok | string() | undefined

Translates an SCTP error number from, for example, #sctp_remote_error{} or #sctp_send_failed{} into an explanatory string, or into one of the atoms ok (no error) or undefined (unrecognized error).
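A small sketch of how the three kinds of return value might be folded into a single printable string. The module name sctp_error_demo and the function describe/1 are hypothetical; the sketch deliberately matches any non-ok atom for the unrecognized case rather than one specific atom.

```erlang
-module(sctp_error_demo).
-export([describe/1]).

%% Turn an SCTP error number into a string suitable for logging.
describe(Code) when is_integer(Code) ->
    case gen_sctp:error_string(Code) of
        ok ->
            "no error";
        Text when is_list(Text) ->
            Text;
        _Unrecognized ->
            "unrecognized SCTP error " ++ integer_to_list(Code)
    end.
```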

SCTP SOCKET OPTIONS

The set of admissible SCTP socket options is by construction orthogonal to the sets of TCP, UDP and generic INET options: only those options which are explicitly listed below are allowed for SCTP sockets. Options can be set on the socket using gen_sctp:open/1,2 or inet:setopts/2, retrieved using inet:getopts/2, and changed when calling gen_sctp:connect/4,5.

{mode, list|binary} or just list or binary.

Determines the type of data returned from gen_sctp:recv/1,2.

{active, true|false|once}
  • If false (passive mode, the default), the caller needs to do an explicit gen_sctp:recv call in order to retrieve the available data from the socket.

  • If true (full active mode), the pending data or events are sent to the owning process.

    NB: This can cause the message queue to overflow, as there is no way to throttle the sender in this case (no flow control!).

  • If once, only one message is automatically placed in the message queue, after that the mode is automatically re-set to passive. This provides flow control as well as the possibility for the receiver to listen for its incoming SCTP data interleaved with other inter-process messages.
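The once mode can be sketched as an arm-then-wait pair. The module sctp_active_demo and its functions are hypothetical. The sketch assumes that active-mode messages arrive as {sctp, Socket, FromIP, FromPort, {AncData, Data}}; that message shape is not spelled out in this section and is stated here as an assumption.

```erlang
-module(sctp_active_demo).
-export([arm/1, next/2]).

%% Ask the driver to deliver exactly one message to the owning process.
%% After that delivery the socket falls back to passive mode.
arm(Socket) ->
    inet:setopts(Socket, [{active, once}]).

%% Wait for that one message.
%% ASSUMPTION: active-mode messages have the shape
%% {sctp, Socket, FromIP, FromPort, {AncData, Data}}.
next(Socket, Timeout) ->
    receive
        {sctp, Socket, FromIP, FromPort, {AncData, Data}} ->
            {ok, {FromIP, FromPort, AncData, Data}}
    after Timeout ->
        {error, timeout}
    end.
```

Calling arm/1 again only after the previous message has been processed is what gives the receiver flow control.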

{buffer, int()}
  • Determines the size of the user-level software buffer used by the SCTP driver. Not to be confused with sndbuf and recbuf options which correspond to the kernel socket buffers. It is recommended to have val(buffer) >= max(val(sndbuf),val(recbuf)). In fact, the val(buffer) is automatically set to the above maximum when sndbuf or recbuf values are set.

  • {tos, int()}
  • Sets the Type-Of-Service field on the IP datagrams being sent to the given value, which effectively determines a prioritization policy for the outbound packets. The acceptable values are system-dependent. Symbolic names for these values are not yet provided.

  • {priority, int()}
  • A protocol-independent equivalent of tos above. Setting priority implies setting tos as well.

  • {dontroute, true|false}
  • By default false. If true, the kernel does not send packets via any gateway, only sends them to directly connected hosts.

  • {reuseaddr, true|false}
  • By default false. If true, the local binding address {IP,Port} of the socket can be re-used immediately: no waiting in the CLOSE_WAIT state is performed (may be required for high-throughput servers).

  • {linger, {true|false, int()}}
  • Determines the timeout in seconds for flushing unsent data in the gen_sctp:close/1 socket call. If the first component of the value tuple is false, the second one is ignored, which means that gen_sctp:close/1 returns immediately without waiting for data to be flushed. Otherwise, the second component is the flushing timeout in seconds.

  • {sndbuf, int()}
  • The size, in bytes, of the *kernel* send buffer for this socket. Sending errors would occur for datagrams larger than val(sndbuf). Setting this option also adjusts the size of the driver buffer (see buffer above).

  • {recbuf, int()}
  • The size, in bytes, of the *kernel* recv buffer for this socket. Sending errors would occur for datagrams larger than val(sndbuf). Setting this option also adjusts the size of the driver buffer (see buffer above).

  • {sctp_rtoinfo, #sctp_rtoinfo{}}
  •   #sctp_rtoinfo{
            assoc_id = assoc_id(),
            initial  = int(),
            max      = int(),
            min      = int()
      }        

    Determines re-transmission time-out parameters, in milliseconds, for the association(s) given by assoc_id. assoc_id = 0 (default) indicates the whole endpoint. See RFC2960 and Sockets API Extensions for SCTP for the exact semantics of the field values.

  • {sctp_associnfo, #sctp_assocparams{}}
  •   #sctp_assocparams{
            assoc_id                 = assoc_id(),
            asocmaxrxt               = int(),
            number_peer_destinations = int(),
            peer_rwnd                = int(),
            local_rwnd               = int(),
            cookie_life              = int()
      }        

    Determines association parameters for the association(s) given by assoc_id. assoc_id = 0 (default) indicates the whole endpoint. See Sockets API Extensions for SCTP for the discussion of their semantics. Rarely used.

  • {sctp_initmsg, #sctp_initmsg{}}
  •   #sctp_initmsg{
           num_ostreams   = int(),
           max_instreams  = int(),
           max_attempts   = int(),
           max_init_timeo = int()
      }        

    Determines the default parameters which this socket attempts to negotiate with its peer while establishing an association with it. Should be set after open/* but before the first connect/*. #sctp_initmsg{} can also be used as ancillary data with the first call of send/* to a new peer (when a new association is created).

    • num_ostreams: number of outbound streams;

    • max_instreams: max number of in-bound streams;

    • max_attempts: max re-transmissions while establishing an association;

    • max_init_timeo: time-out in milliseconds for establishing an association.

  • {sctp_autoclose, int()|infinity}
  • Determines the time (in seconds) after which an idle association is automatically closed.

  • {sctp_nodelay, true|false}
  • If true, disables the Nagle algorithm for merging small packets into larger ones (the algorithm improves throughput at the expense of latency); if false, the algorithm is enabled.

  • {sctp_disable_fragments, true|false}
  • If true, induces an error on an attempt to send a message which is larger than the current PMTU size (which would require fragmentation/re-assembling). Note that message fragmentation does not affect the logical atomicity of its delivery; this option is provided for performance reasons only.

  • {sctp_i_want_mapped_v4_addr, true|false}
  • Turns on|off automatic mapping of IPv4 addresses into IPv6 ones (if the socket address family is AF_INET6).

  • {sctp_maxseg, int()}
  • Determines the maximum chunk size if message fragmentation is used. If 0, the chunk size is limited by the Path MTU only.

  • {sctp_primary_addr, #sctp_prim{}}
  •   #sctp_prim{
            assoc_id = assoc_id(),
            addr     = {IP, Port}
      }
      IP = ip_address()
      Port = port_number()        

    For the association given by assoc_id, {IP,Port} must be one of the peer's addresses. This option determines that the given address is treated by the local SCTP stack as the peer's primary address.

  • {sctp_set_peer_primary_addr, #sctp_setpeerprim{}}
  •   #sctp_setpeerprim{
            assoc_id = assoc_id(),
            addr     = {IP, Port}
      }
      IP = ip_address()
      Port = port_number()        

    When set, informs the peer that it should use {IP, Port} as the primary address of the local endpoint for the association given by assoc_id.

  • {sctp_adaptation_layer, #sctp_setadaptation{}}
  •   #sctp_setadaptation{
            adaptation_ind = int()
      }        

    When set, requests that the local endpoint uses the value given by adaptation_ind as the Adaptation Indication parameter for establishing new associations. See RFC2960 and Sockets API Extensions for SCTP for more details.

  • {sctp_peer_addr_params, #sctp_paddrparams{}}
  •   #sctp_paddrparams{
            assoc_id   = assoc_id(),
            address    = {IP, Port},
            hbinterval = int(),
            pathmaxrxt = int(),
            pathmtu    = int(),
            sackdelay  = int(),
            flags      = list()
      }
      IP = ip_address()
      Port = port_number()        

    This option determines various per-address parameters for the association given by assoc_id and the peer address address (the SCTP protocol supports multi-homing, so more than 1 address can correspond to a given association).

    • hbinterval: heartbeat interval, in milliseconds;

    • pathmaxrxt: max number of retransmissions before this address is considered unreachable (and an alternative address is selected);

    • pathmtu: fixed Path MTU, if automatic discovery is disabled (see flags below);

    • sackdelay: delay in milliseconds for SACK messages (if the delay is enabled, see flags below);

    • flags: the following flags are available:

      • hb_enable: enable heartbeat;

      • hb_disable: disable heartbeat;

      • hb_demand: initiate heartbeat immediately;

      • pmtud_enable: enable automatic Path MTU discovery;

      • pmtud_disable: disable automatic Path MTU discovery;

      • sackdelay_enable: enable SACK delay;

      • sackdelay_disable: disable SACK delay.

  • {sctp_default_send_param, #sctp_sndrcvinfo{}}
  •   #sctp_sndrcvinfo{
            stream     = int(),
            ssn        = int(),
            flags      = list(),
            ppid       = int(),
            context    = int(),
            timetolive = int(),
            tsn        = int(),
            cumtsn     = int(),
            assoc_id   = assoc_id()
      }        

    #sctp_sndrcvinfo{} is used both in this socket option, and as ancillary data while sending or receiving SCTP messages. When set as an option, it provides default values for subsequent gen_sctp:send calls on the association given by assoc_id. assoc_id = 0 (default) indicates the whole endpoint. The following fields typically need to be specified by the sender:

    • sinfo_stream: stream number (0-based) within the association to send the messages through;

    • sinfo_flags: the following flags are recognised:

      • unordered: the message is to be sent unordered;

      • addr_over: the address specified in gen_sctp:send overwrites the primary peer address;

      • abort: abort the current association without flushing any unsent data;

      • eof: gracefully shut down the current association, with flushing of unsent data.

      Other fields are rarely used. See RFC2960 and Sockets API Extensions for SCTP for full information.

  • {sctp_events, #sctp_event_subscribe{}}
  •   #sctp_event_subscribe{
              data_io_event          = true | false,
              association_event      = true | false,
              address_event          = true | false,
              send_failure_event     = true | false,
              peer_error_event       = true | false,
              shutdown_event         = true | false,
              partial_delivery_event = true | false,
              adaptation_layer_event = true | false
        }        

    This option determines which SCTP Events are to be received (via recv/*) along with the data. The only exception is data_io_event, which enables or disables receiving of #sctp_sndrcvinfo{} ancillary data, not events. By default, all flags except adaptation_layer_event are enabled, although data_io_event and association_event are used by the driver itself and not exported to the user level.

  • {sctp_delayed_ack_time, #sctp_assoc_value{}}
  •   #sctp_assoc_value{
            assoc_id    = assoc_id(),
            assoc_value = int()
      }        

    Rarely used. Determines the ACK time (given by assoc_value, in milliseconds) for the given association, or for the whole endpoint if assoc_id = 0 (default).

  • {sctp_status, #sctp_status{}}
  •   #sctp_status{
            assoc_id            = assoc_id(),
            state               = atom(),
            rwnd                = int(),
            unackdata           = int(),
            penddata            = int(),
            instrms             = int(),
            outstrms            = int(),
            fragmentation_point = int(),
            primary             = #sctp_paddrinfo{}
      }        

    This option is read-only. It reports the status of the SCTP association given by assoc_id. The possible values of state are listed below; the state designations are mostly self-explanatory. sctp_state_empty is the default, which means that no other state is active:

    • sctp_state_empty

    • sctp_state_closed

    • sctp_state_cookie_wait

    • sctp_state_cookie_echoed

    • sctp_state_established

    • sctp_state_shutdown_pending

    • sctp_state_shutdown_sent

    • sctp_state_shutdown_received

    • sctp_state_shutdown_ack_sent

    The semantics of the other fields are as follows:

    • sstat_rwnd: the association peer's current receiver window size;

    • sstat_unackdata: number of unacked data chunks;

    • sstat_penddata: number of data chunks pending receipt;

    • sstat_instrms: number of inbound streams;

    • sstat_outstrms: number of outbound streams;

    • sstat_fragmentation_point: message size at which SCTP fragmentation will occur;

    • sstat_primary: information on the current primary peer address (see below for the format of #sctp_paddrinfo{}).

  • {sctp_get_peer_addr_info, #sctp_paddrinfo{}}
  •   #sctp_paddrinfo{
            assoc_id  = assoc_id(),
            address   = {IP, Port},
            state     = inactive | active,
            cwnd      = int(),
            srtt      = int(),
            rto       = int(),
            mtu       = int()
      }
      IP = ip_address()
      Port = port_number()        

    This option is read-only. It determines the parameters specific to the peer's address given by address within the association given by assoc_id. The address field must be set by the caller; all other fields are filled in on return. If assoc_id = 0 (default), the address is automatically translated into the corresponding association ID. This option is rarely used; see RFC2960 and Sockets API Extensions for SCTP for the semantics of all fields.

SCTP EXAMPLES
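The following is a minimal end-to-end sketch of an echo server and a matching client, built only from the functions documented on this page. It is not a reference implementation: the module name sctp_echo_demo, the functions server/1, client/3 and action/1, and the 5000 ms receive timeout are all hypothetical choices. Note that the first item the client receives may be an SCTP event rather than the echoed data.

```erlang
-module(sctp_echo_demo).
-export([server/1, client/3, action/1]).
-include_lib("kernel/include/inet_sctp.hrl").

%% Server: open on Port, accept associations, echo every data message.
server(Port) ->
    {ok, Socket} = gen_sctp:open(Port, [{mode, binary}]),
    ok = gen_sctp:listen(Socket, true),
    loop(Socket).

loop(Socket) ->
    case action(gen_sctp:recv(Socket)) of
        {echo, AssocId, Stream, Data} ->
            _ = gen_sctp:send(Socket, AssocId, Stream, Data),
            loop(Socket);
        ignore ->
            loop(Socket);
        {stop, Reason} ->
            _ = gen_sctp:close(Socket),
            {error, Reason}
    end.

%% Decide what to do with one recv/1 result.
action({ok, {_IP, _Port,
             [#sctp_sndrcvinfo{stream = Stream, assoc_id = AssocId}],
             Data}}) when is_binary(Data) ->
    {echo, AssocId, Stream, Data};      % reply on the same stream
action({ok, _Event}) ->
    ignore;                             % events are skipped here
action({error, Reason}) when is_atom(Reason) ->
    {stop, Reason};                     % POSIX error: give up
action({error, _EventRecord}) ->
    ignore.                             % event converted to an error term

%% Client: connect, send Msg on stream 0, return the next recv/2 result.
client(Host, Port, Msg) ->
    {ok, Socket} = gen_sctp:open(),
    Result =
        case gen_sctp:connect(Socket, Host, Port, []) of
            {ok, #sctp_assoc_change{state = comm_up} = Assoc} ->
                case gen_sctp:send(Socket, Assoc, 0, Msg) of
                    ok -> gen_sctp:recv(Socket, 5000);
                    {error, _} = SendError -> SendError
                end;
            {ok, Assoc} ->
                {error, Assoc};
            {error, _} = ConnectError ->
                ConnectError
        end,
    _ = gen_sctp:close(Socket),
    Result.
```

A typical session would start sctp_echo_demo:server(Port) in one process and call sctp_echo_demo:client(Host, Port, <<"hello">>) from another.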

SEE ALSO

inet(3), gen_tcp(3), gen_udp(3), RFC2960 (Stream Control Transmission Protocol), Sockets API Extensions for SCTP.