file

File Interface Module

The module file provides an interface to the file system.

On operating systems with thread support, it is possible to let file operations be performed in threads of their own, allowing other Erlang processes to continue executing in parallel with the file operations. See the command line flag +A in erl(1).

With regard to file name encoding, the Erlang VM can operate in two modes. The current mode can be queried using the native_name_encoding/0 function. It returns either latin1 or utf8.

In the latin1 mode, the Erlang VM does not change the encoding of file names. In the utf8 mode, file names can contain Unicode characters greater than 255 and the VM will convert file names back and forth to the native file name encoding (usually UTF-8, but UTF-16 on Windows).

The default mode depends on the operating system. Windows and MacOS X enforce consistent file name encoding and therefore the VM uses the utf8 mode.

On operating systems with transparent naming (i.e. all Unix systems except MacOS X), the default will be utf8 if the terminal supports UTF-8, otherwise latin1. The default may be overridden using the +fnl flag (to force latin1 mode) or the +fnu flag (to force utf8 mode) when starting erl.

On operating systems with transparent naming, files could be inconsistently named, i.e. some files are encoded in UTF-8 while others are encoded in (for example) iso-latin1. To be able to handle file systems with inconsistent naming when running in the utf8 mode, the concept of "raw file names" has been introduced.

A raw file name is a file name given as a binary. The Erlang VM will perform no translation of a file name given as a binary on systems with transparent naming.

When running in the utf8 mode, the file:list_dir/1 and file:read_link/1 functions will never return raw file names. Use the list_dir_all/1 and read_link_all/1 functions to return all file names including raw file names.

Also see Notes about raw file names.
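As a minimal sketch of how raw file names show up in practice, the following helper lists a directory with list_dir_all/1 and separates translated names (strings) from raw names (binaries). The module name raw_names_example and the function classify/1 are hypothetical and only serve as illustration.

```erlang
-module(raw_names_example).
-export([classify/1]).

%% Lists Dir and splits the entries into translated names (strings)
%% and raw names (binaries that the VM did not translate).
%% Hypothetical helper, not part of the file module.
classify(Dir) ->
    case file:list_dir_all(Dir) of
        {ok, Names} ->
            {Raw, Translated} =
                lists:partition(fun erlang:is_binary/1, Names),
            {ok, file:native_name_encoding(), Translated, Raw};
        {error, Reason} ->
            {error, Reason}
    end.
```

On a system where all names are consistently encoded, the list of raw names is expected to be empty.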

Types


deep_list() = [char() | atom() | deep_list()]

fd()

A file descriptor representing a file opened in raw mode.

filename() = string()

filename_all() = string() | binary()

io_device() = pid() | fd()

As returned by file:open/2; pid() is a process handling I/O-protocols.

name() = string() | atom() | deep_list()

If the VM is in Unicode filename mode, string() and char() are allowed to be > 255.

name_all() =
            string() | atom() | deep_list() | (RawFilename :: binary())

If the VM is in Unicode filename mode, string() and char() are allowed to be > 255. RawFilename is a filename not subject to Unicode translation, meaning that it can contain characters not conforming to the Unicode encoding expected from the filesystem (i.e. non-UTF-8 characters although the VM is started in Unicode filename mode).

posix() =
            eacces |
            eagain |
            ebadf |
            ebusy |
            edquot |
            eexist |
            efault |
            efbig |
            eintr |
            einval |
            eio |
            eisdir |
            eloop |
            emfile |
            emlink |
            enametoolong |
            enfile |
            enodev |
            enoent |
            enomem |
            enospc |
            enotblk |
            enotdir |
            enotsup |
            enxio |
            eperm |
            epipe |
            erofs |
            espipe |
            esrch |
            estale |
            exdev

An atom which is named from the POSIX error codes used in Unix, and in the runtime libraries of most C compilers.

date_time() = calendar:datetime()

Must denote a valid date and time.

file_info() =
            #file_info{size = undefined | integer() >= 0,
                       type =
                           undefined |
                           device |
                           directory |
                           other |
                           regular |
                           symlink,
                       access =
                           undefined | read | write | read_write | none,
                       atime =
                           undefined |
                           file:date_time() |
                           integer() >= 0,
                       mtime =
                           undefined |
                           file:date_time() |
                           integer() >= 0,
                       ctime =
                           undefined |
                           file:date_time() |
                           integer() >= 0,
                       mode = undefined | integer() >= 0,
                       links = undefined | integer() >= 0,
                       major_device = undefined | integer() >= 0,
                       minor_device = undefined | integer() >= 0,
                       inode = undefined | integer() >= 0,
                       uid = undefined | integer() >= 0,
                       gid = undefined | integer() >= 0}

location() =
            integer() |
            {bof, Offset :: integer()} |
            {cur, Offset :: integer()} |
            {eof, Offset :: integer()} |
            bof |
            cur |
            eof

mode() =
            read |
            write |
            append |
            exclusive |
            raw |
            binary |
            {delayed_write,
             Size :: integer() >= 0,
             Delay :: integer() >= 0} |
            delayed_write |
            {read_ahead, Size :: integer() >= 1} |
            read_ahead |
            compressed |
            {encoding, unicode:encoding()} |
            sync

file_info_option() =
            {time, local} | {time, universal} | {time, posix} | raw

Functions


advise(IoDevice, Offset, Length, Advise) -> ok | {error, Reason}

  • posix_file_advise() =
        normal |
        sequential |
        random |
        no_reuse |
        will_need |
        dont_need

advise/4 can be used to announce an intention to access file data in a specific pattern in the future, thus allowing the operating system to perform appropriate optimizations.

On some platforms, this function might have no effect.

allocate(File, Offset, Length) -> ok | {error, posix()}

allocate/3 can be used to preallocate space for a file.

This function only succeeds on platforms that implement this feature. When it succeeds, space is preallocated for the file but the file size might not be updated. This behaviour depends on the preallocation implementation. To guarantee that the file size is updated, truncate the file to the new size.

change_group(Filename, Gid) -> ok | {error, Reason}

Changes group of a file. See write_file_info/2.

change_mode(Filename, Mode) -> ok | {error, Reason}

Changes permissions of a file. See write_file_info/2.

change_owner(Filename, Uid) -> ok | {error, Reason}

Changes owner of a file. See write_file_info/2.

change_owner(Filename, Uid, Gid) -> ok | {error, Reason}

Changes owner and group of a file. See write_file_info/2.

change_time(Filename, Mtime) -> ok | {error, Reason}

Changes the modification and access times of a file. See write_file_info/2.

change_time(Filename, Atime, Mtime) -> ok | {error, Reason}

Changes the modification and last access times of a file. See write_file_info/2.

close(IoDevice) -> ok | {error, Reason}

Closes the file referenced by IoDevice. It mostly returns ok, except for some severe errors such as out of memory.

Note that if the option delayed_write was used when opening the file, close/1 might return an old write error and not even try to close the file. See open/2.

consult(Filename) -> {ok, Terms} | {error, Reason}

  • Filename = name_all()
  • Terms = [term()]
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

Reads Erlang terms, separated by '.', from Filename. Returns one of the following:

{ok, Terms}

The file was successfully read.

{error, atom()}

An error occurred when opening the file or reading it. See open/2 for a list of typical error codes.

{error, {Line, Mod, Term}}

An error occurred when interpreting the Erlang terms in the file. Use format_error/1 to convert the three-element tuple to an English description of the error.

Example:

f.txt:  {person, "kalle", 25}.
        {person, "pelle", 30}.
1> file:consult("f.txt").
{ok,[{person,"kalle",25},{person,"pelle",30}]}

The encoding of Filename can be set by a comment as described in epp(3).

copy(Source, Destination) -> {ok, BytesCopied} | {error, Reason}

copy(Source, Destination, ByteCount) ->
        {ok, BytesCopied} | {error, Reason}

  • Source = Destination = io_device() | Filename | {Filename, Modes}
  • Filename = name_all()
  • Modes = [mode()]
  • ByteCount = integer() >= 0 | infinity
  • BytesCopied = integer() >= 0
  • Reason = posix() | badarg | terminated

Copies ByteCount bytes from Source to Destination. Source and Destination refer to either filenames or IO devices from e.g. open/2. ByteCount defaults to infinity, denoting an infinite number of bytes.

The argument Modes is a list of possible modes, see open/2, and defaults to [].

If both Source and Destination refer to filenames, the files are opened with [read, binary] and [write, binary] prepended to their mode lists, respectively, to optimize the copy.

If Source refers to a filename, it is opened with read mode prepended to the mode list before the copy, and closed when done.

If Destination refers to a filename, it is opened with write mode prepended to the mode list before the copy, and closed when done.

Returns {ok, BytesCopied} where BytesCopied is the number of bytes that were actually copied, which may be less than ByteCount if end of file was encountered on the source. If the operation fails, {error, Reason} is returned.

Typical error reasons: As for open/2 if a file had to be opened, and as for read/2 and write/2.
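The following sketch shows one way to use copy/3 with a byte limit and to detect that the source ended early. The module copy_example and the function copy_prefix/3 are hypothetical names; the sketch assumes both arguments are filenames.

```erlang
-module(copy_example).
-export([copy_prefix/3]).

%% Copies at most ByteCount bytes from the file Source to the file Dest.
%% Returns {short, N} when end of file was reached before ByteCount bytes.
%% Hypothetical helper, not part of the file module.
copy_prefix(Source, Dest, ByteCount)
  when is_integer(ByteCount), ByteCount >= 0 ->
    case file:copy(Source, Dest, ByteCount) of
        {ok, BytesCopied} when BytesCopied < ByteCount ->
            {short, BytesCopied};
        {ok, BytesCopied} ->
            {ok, BytesCopied};
        {error, Reason} ->
            {error, Reason}
    end.
```

Because Dest is given as a filename, it is opened in write mode and any previous contents are replaced.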

del_dir(Dir) -> ok | {error, Reason}

Tries to delete the directory Dir. The directory must be empty before it can be deleted. Returns ok if successful.

Typical error reasons are:

eacces

Missing search or write permissions for the parent directories of Dir.

eexist

The directory is not empty.

enoent

The directory does not exist.

enotdir

A component of Dir is not a directory. On some platforms, enoent is returned instead.

einval

Attempt to delete the current directory. On some platforms, eacces is returned instead.

delete(Filename) -> ok | {error, Reason}

Tries to delete the file Filename. Returns ok if successful.

Typical error reasons are:

enoent

The file does not exist.

eacces

Missing permission for the file or one of its parents.

eperm

The file is a directory and the user is not super-user.

enotdir

A component of the file name is not a directory. On some platforms, enoent is returned instead.

einval

Filename had an improper type, such as tuple.

Warning!

In a future release, a bad type for the Filename argument will probably generate an exception.

eval(Filename) -> ok | {error, Reason}

  • Filename = name_all()
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

Reads and evaluates Erlang expressions, separated by '.' (or ',', a sequence of expressions is also an expression), from Filename. The actual result of the evaluation is not returned; any expression sequence in the file must be there for its side effect. Returns one of the following:

ok

The file was read and evaluated.

{error, atom()}

An error occurred when opening the file or reading it. See open/2 for a list of typical error codes.

{error, {Line, Mod, Term}}

An error occurred when interpreting the Erlang expressions in the file. Use format_error/1 to convert the three-element tuple to an English description of the error.

The encoding of Filename can be set by a comment as described in epp(3).

eval(Filename, Bindings) -> ok | {error, Reason}

The same as eval/1 but the variable bindings Bindings are used in the evaluation. See erl_eval(3) about variable bindings.

format_error(Reason) -> Chars

  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}
  • Chars = string()

Given the error reason returned by any function in this module, returns a descriptive string of the error in English.
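A minimal sketch of how format_error/1 might be used to turn any error from consult/1 into readable text. The module format_error_example and the function describe/1 are hypothetical names. The result of format_error/1 is flattened here only to make it convenient to print or compare.

```erlang
-module(format_error_example).
-export([describe/1]).

%% Reads terms from Filename; on failure returns the error as a flat
%% English string instead of an atom or a {Line, Mod, Term} tuple.
%% Hypothetical helper, not part of the file module.
describe(Filename) ->
    case file:consult(Filename) of
        {ok, Terms} ->
            {ok, Terms};
        {error, Reason} ->
            {error, lists:flatten(file:format_error(Reason))}
    end.
```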

get_cwd() -> {ok, Dir} | {error, Reason}

Returns {ok, Dir}, where Dir is the current working directory of the file server.

Note!

In rare circumstances, this function can fail on Unix. It may happen if read permission does not exist for the parent directories of the current directory.

Typical error reasons are:

eacces

Missing read permission for one of the parents of the current directory.

get_cwd(Drive) -> {ok, Dir} | {error, Reason}

Drive should be of the form "Letter:", for example "c:". Returns {ok, Dir} or {error, Reason}, where Dir is the current working directory of the drive specified.

This function returns {error, enotsup} on platforms which have no concept of current drive (Unix, for example).

Typical error reasons are:

enotsup

The operating system has no concept of drives.

eacces

The drive does not exist.

einval

The format of Drive is invalid.

list_dir(Dir) -> {ok, Filenames} | {error, Reason}

Lists all files in a directory, except files with "raw" names. Returns {ok, Filenames} if successful. Otherwise, it returns {error, Reason}. Filenames is a list of the names of all the files in the directory. The names are not sorted.

Typical error reasons are:

eacces

Missing search or write permissions for Dir or one of its parent directories.

enoent

The directory does not exist.

{no_translation, Filename}

Filename is a binary() with characters coded in ISO-latin-1 and the VM was started with the parameter +fnue.

list_dir_all(Dir) -> {ok, Filenames} | {error, Reason}

Lists all the files in a directory, including files with "raw" names. Returns {ok, Filenames} if successful. Otherwise, it returns {error, Reason}. Filenames is a list of the names of all the files in the directory. The names are not sorted.

Typical error reasons are:

eacces

Missing search or write permissions for Dir or one of its parent directories.

enoent

The directory does not exist.

make_dir(Dir) -> ok | {error, Reason}

Tries to create the directory Dir. Missing parent directories are not created. Returns ok if successful.

Typical error reasons are:

eacces

Missing search or write permissions for the parent directories of Dir.

eexist

There is already a file or directory named Dir.

enoent

A component of Dir does not exist.

enospc

There is no space left on the device.

enotdir

A component of Dir is not a directory. On some platforms, enoent is returned instead.
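Since eexist is returned both when a directory and when a regular file named Dir already exists, callers that only want the directory to be present need one extra check. The sketch below shows one possible approach; the module make_dir_example and the function make_dir_if_missing/1 are hypothetical names, and filelib:is_dir/1 is used for the check.

```erlang
-module(make_dir_example).
-export([make_dir_if_missing/1]).

%% Creates Dir unless it already exists as a directory.
%% Parent directories are not created (make_dir/1 does not do that).
%% Hypothetical helper, not part of the file module.
make_dir_if_missing(Dir) ->
    case file:make_dir(Dir) of
        ok ->
            ok;
        {error, eexist} ->
            %% eexist does not say whether Dir is a file or a directory.
            case filelib:is_dir(Dir) of
                true -> ok;
                false -> {error, eexist}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```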

make_link(Existing, New) -> ok | {error, Reason}

Makes a hard link from Existing to New, on platforms that support links (Unix and Windows). This function returns ok if the link was successfully created, or {error, Reason}. On platforms that do not support links, {error,enotsup} is returned.

Typical error reasons:

eacces

Missing read or write permissions for the parent directories of Existing or New.

eexist

New already exists.

enotsup

Hard links are not supported on this platform.

make_symlink(Existing, New) -> ok | {error, Reason}

This function creates a symbolic link New to the file or directory Existing, on platforms that support symbolic links (most Unix systems and Windows beginning with Vista). Existing need not exist. This function returns ok if the link was successfully created, or {error, Reason}. On platforms that do not support symbolic links, {error, enotsup} is returned.

Typical error reasons:

eacces

Missing read or write permissions for the parent directories of Existing or New.

eexist

New already exists.

enotsup

Symbolic links are not supported on this platform.

eperm

User does not have privileges to create symbolic links (SeCreateSymbolicLinkPrivilege on Windows).

native_name_encoding() -> latin1 | utf8

This function returns the file name encoding mode. If it is latin1, the system does no translation of file names. If it is utf8, file names will be converted back and forth to the native file name encoding (usually UTF-8, but UTF-16 on Windows).

open(File, Modes) -> {ok, IoDevice} | {error, Reason}

Opens the file File in the mode determined by Modes, which may contain one or more of the following items:

read

The file, which must exist, is opened for reading.

write

The file is opened for writing. It is created if it does not exist. If the file exists, and if write is not combined with read, the file will be truncated.

append

The file will be opened for writing, and it will be created if it does not exist. Every write operation to a file opened with append will take place at the end of the file.

exclusive

The file, when opened for writing, is created if it does not exist. If the file exists, open will return {error, eexist}.

Warning!

This option does not guarantee exclusiveness on file systems that do not support O_EXCL properly, such as NFS. Do not depend on this option unless you know that the file system supports it (in general, local file systems should be safe).

raw

The raw option allows faster access to a file, because no Erlang process is needed to handle the file. However, a file opened in this way has the following limitations:

  • The functions in the io module cannot be used, because they can only talk to an Erlang process. Instead, use the read/2, read_line/1 and write/2 functions.
  • Especially if read_line/1 is to be used on a raw file, it is recommended to combine this option with the {read_ahead, Size} option as line oriented I/O is inefficient without buffering.
  • Only the Erlang process which opened the file can use it.
  • A remote Erlang file server cannot be used; the computer on which the Erlang node is running must have access to the file system (directly or through NFS).

binary

When this option has been given, read operations on the file will return binaries rather than lists.

{delayed_write, Size, Delay}

If this option is used, the data in subsequent write/2 calls is buffered until there are at least Size bytes buffered, or until the oldest buffered data is Delay milliseconds old. Then all buffered data is written in one operating system call. The buffered data is also flushed before some other file operation than write/2 is executed.

The purpose of this option is to increase performance by reducing the number of operating system calls, so the write/2 calls should be for sizes significantly less than Size, and not interspersed by too many other file operations, for this to happen.

When this option is used, the result of write/2 calls may prematurely be reported as successful, and if a write error should actually occur the error is reported as the result of the next file operation, which is not executed.

For example, when delayed_write is used, after a number of write/2 calls, close/1 might return {error, enospc} because there was not enough space on the disc for previously written data, and close/1 should probably be called again since the file is still open.

delayed_write

The same as {delayed_write, Size, Delay} with reasonable default values for Size and Delay. (Roughly some 64 KBytes, 2 seconds)

{read_ahead, Size}

This option activates read data buffering. If read/2 calls are for significantly less than Size bytes, read operations towards the operating system are still performed for blocks of Size bytes. The extra data is buffered and returned in subsequent read/2 calls, giving a performance gain since the number of operating system calls is reduced.

The read_ahead buffer is also highly utilized by the read_line/1 function in raw mode, which is why this option is recommended (for performance reasons) when accessing raw files using that function.

If read/2 calls are for sizes not significantly less than, or even greater than Size bytes, no performance gain can be expected.

read_ahead

The same as {read_ahead, Size} with a reasonable default value for Size. (Roughly some 64 KBytes)

compressed

Makes it possible to read or write gzip compressed files. The compressed option must be combined with either read or write, but not both. Note that the file size obtained with read_file_info/1 will most probably not match the number of bytes that can be read from a compressed file.

{encoding, Encoding}

Makes the file perform automatic translation of characters to and from a specific (Unicode) encoding. Note that the data supplied to file:write or returned by file:read is still byte oriented; this option only denotes how data is actually stored in the disk file.

Depending on the encoding, different methods of reading and writing data are preferred. The default encoding of latin1 implies using this (the file) module for reading and writing data, as the interfaces provided here work with byte-oriented data, while using other (Unicode) encodings makes the io(3) module's get_chars, get_line and put_chars functions more suitable, as they can work with the full Unicode range.

If data is sent to an io_device() in a format that cannot be converted to the specified encoding, or if data is read by a function that returns data in a format that cannot cope with the character range of the data, an error occurs and the file will be closed.

The allowed values for Encoding are:

latin1

The default encoding. Bytes supplied to e.g. file:write are written as is to the file; likewise, bytes read from the file are returned to e.g. file:read as is. If the io(3) module is used for writing, the file can only cope with Unicode characters up to codepoint 255 (the ISO-latin-1 range).

unicode or utf8

Characters are translated to and from the UTF-8 encoding before being written to or read from the file. A file opened in this way might be readable using the file:read function, as long as no data stored on the file lies beyond the ISO-latin-1 range (0..255), but failure will occur if the data contains Unicode codepoints beyond that range. The file is best read with the functions in the Unicode aware io(3) module.

Bytes written to the file by any means are translated to UTF-8 encoding before actually being stored on the disk file.

utf16 or {utf16,big}

Works like unicode, but translation is done to and from big endian UTF-16 instead of UTF-8.

{utf16,little}

Works like unicode, but translation is done to and from little endian UTF-16 instead of UTF-8.

utf32 or {utf32,big}

Works like unicode, but translation is done to and from big endian UTF-32 instead of UTF-8.

{utf32,little}

Works like unicode, but translation is done to and from little endian UTF-32 instead of UTF-8.

The Encoding can be changed for a file "on the fly" by using the io:setopts/2 function. A file can therefore be analyzed in latin1 encoding for e.g. a BOM, positioned beyond the BOM, and then be set to the right encoding before further reading. See the unicode(3) module for functions identifying BOMs.

This option is not allowed on raw files.

ram

File must be iodata(). Returns an fd() which lets the file module operate on the data in-memory as if it is a file.

sync

On platforms that support it, enables the POSIX O_SYNC synchronous I/O flag or its platform-dependent equivalent (e.g., FILE_FLAG_WRITE_THROUGH on Windows) so that writes to the file block until the data has been physically written to disk. Be aware, though, that the exact semantics of this flag differ from platform to platform; for example, neither Linux nor Windows guarantees that all file metadata are also written before the call returns. For precise semantics, check the details of your platform's documentation. On platforms with no support for POSIX O_SYNC or equivalent, use of the sync flag causes open to return {error, enotsup}.
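To illustrate how the raw, delayed_write and read_ahead options described above can be combined, here is a minimal sketch. The module buffered_io_example and its functions are hypothetical names, and the buffer sizes (64 KB, 2000 ms) are arbitrary values chosen for illustration.

```erlang
-module(buffered_io_example).
-export([write_chunks/2, count_lines/1]).

%% Writes a list of small binaries through a delayed_write buffer.
%% Hypothetical helper, not part of the file module.
write_chunks(Filename, Chunks) ->
    Modes = [write, raw, binary, {delayed_write, 65536, 2000}],
    case file:open(Filename, Modes) of
        {ok, Fd} ->
            close_checked(Fd, write_each(Fd, Chunks));
        {error, Reason} ->
            {error, Reason}
    end.

write_each(_Fd, []) ->
    ok;
write_each(Fd, [Chunk | Rest]) ->
    case file:write(Fd, Chunk) of
        ok -> write_each(Fd, Rest);
        {error, Reason} -> {error, Reason}
    end.

%% With delayed_write, close/1 may report an earlier write error and
%% leave the file open, so close/1 is called once more in that case.
close_checked(Fd, Result) ->
    case file:close(Fd) of
        ok ->
            Result;
        {error, Reason} ->
            _ = file:close(Fd),
            {error, Reason}
    end.

%% Counts lines with read_line/1 on a raw file; read_ahead supplies
%% the buffering that line oriented reading needs to be efficient.
count_lines(Filename) ->
    Modes = [read, raw, binary, {read_ahead, 65536}],
    case file:open(Filename, Modes) of
        {ok, Fd} ->
            try count(Fd, 0) after file:close(Fd) end;
        {error, Reason} ->
            {error, Reason}
    end.

count(Fd, N) ->
    case file:read_line(Fd) of
        {ok, _Line} -> count(Fd, N + 1);
        eof -> {ok, N};
        {error, Reason} -> {error, Reason}
    end.
```

Both functions open the file in raw mode, so the file descriptor is only usable from the calling process and the io module cannot be used on it.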

Returns:

{ok, IoDevice}

The file has been opened in the requested mode. IoDevice is a reference to the file.

{error, Reason}

The file could not be opened.

IoDevice is really the pid of the process which handles the file. This process is linked to the process which originally opened the file. If any process to which the IoDevice is linked terminates, the file will be closed and the process itself will be terminated. An IoDevice returned from this call can be used as an argument to the IO functions (see io(3)).
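As a minimal sketch of using such an IoDevice with the io module, the following code opens a file with {encoding, utf8} and writes and reads characters outside the latin1 range. The module encoding_example and its functions are hypothetical names; the file is deliberately not opened in raw mode, since the encoding option is not allowed on raw files.

```erlang
-module(encoding_example).
-export([write_utf8/2, read_line/1]).

%% Writes Chars (a list of Unicode code points) as UTF-8.
%% Hypothetical helper, not part of the file module.
write_utf8(Filename, Chars) ->
    {ok, Dev} = file:open(Filename, [write, {encoding, utf8}]),
    try
        io:put_chars(Dev, Chars)
    after
        file:close(Dev)
    end.

%% Reads the first line back as a list of Unicode code points.
read_line(Filename) ->
    {ok, Dev} = file:open(Filename, [read, {encoding, utf8}]),
    try
        io:get_line(Dev, '')
    after
        file:close(Dev)
    end.
```

For example, writing the single code point 1072 followed by a newline stores the bytes 208, 176 and 10 on disk, while read_line/1 returns the original two code points.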

Note!

In previous versions of file, modes were given as one of the atoms read, write, or read_write instead of a list. This is still allowed for reasons of backwards compatibility, but should not be used for new code. Also note that read_write is not allowed in a mode list.

Typical error reasons:

enoent

The file does not exist.

eacces

Missing permission for reading the file or searching one of the parent directories.

eisdir

The named file is not a regular file. It may be a directory, a fifo, or a device.

enotdir

A component of the file name is not a directory. On some platforms, enoent is returned instead.

enospc

There is no space left on the device (if write access was specified).
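The exclusive mode and the error reasons above can be combined to create a file only if it does not already exist. This is a sketch under the assumption that the file system supports O_EXCL properly; the module exclusive_example and the function create_new/2 are hypothetical names.

```erlang
-module(exclusive_example).
-export([create_new/2]).

%% Creates Filename with the contents Bytes, failing if it exists.
%% Hypothetical helper, not part of the file module.
create_new(Filename, Bytes) ->
    case file:open(Filename, [write, exclusive, raw, binary]) of
        {ok, Fd} ->
            Result = file:write(Fd, Bytes),
            ok = file:close(Fd),
            Result;
        {error, eexist} ->
            %% Someone else created the file first.
            {error, already_exists};
        {error, Reason} ->
            {error, Reason}
    end.
```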

path_consult(Path, Filename) ->
                {ok, Terms, FullName} | {error, Reason}

  • Path = [Dir]
  • Dir = Filename = name_all()
  • Terms = [term()]
  • FullName = filename_all()
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

Searches the path Path (a list of directory names) until the file Filename is found. If Filename is an absolute filename, Path is ignored. Then reads Erlang terms, separated by '.', from the file. Returns one of the following:

{ok, Terms, FullName}

The file was successfully read. FullName is the full name of the file.

{error, enoent}

The file could not be found in any of the directories in Path.

{error, atom()}

An error occurred when opening the file or reading it. See open/2 for a list of typical error codes.

{error, {Line, Mod, Term}}

An error occurred when interpreting the Erlang terms in the file. Use format_error/1 to convert the three-element tuple to an English description of the error.

The encoding of Filename can be set by a comment as described in epp(3).

path_eval(Path, Filename) -> {ok, FullName} | {error, Reason}

  • Path = [Dir :: name_all()]
  • Filename = name_all()
  • FullName = filename_all()
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

Searches the path Path (a list of directory names) until the file Filename is found. If Filename is an absolute file name, Path is ignored. Then reads and evaluates Erlang expressions, separated by '.' (or ',', a sequence of expressions is also an expression), from the file. The actual result of evaluation is not returned; any expression sequence in the file must be there for its side effect. Returns one of the following:

{ok, FullName}

The file was read and evaluated. FullName is the full name of the file.

{error, enoent}

The file could not be found in any of the directories in Path.

{error, atom()}

An error occurred when opening the file or reading it. See open/2 for a list of typical error codes.

{error, {Line, Mod, Term}}

An error occurred when interpreting the Erlang expressions in the file. Use format_error/1 to convert the three-element tuple to an English description of the error.

The encoding of Filename can be set by a comment as described in epp(3).

path_open(Path, Filename, Modes) ->
             {ok, IoDevice, FullName} | {error, Reason}

Searches the path Path (a list of directory names) until the file Filename is found. If Filename is an absolute file name, Path is ignored. Then opens the file in the mode determined by Modes. Returns one of the following:

{ok, IoDevice, FullName}

The file has been opened in the requested mode. IoDevice is a reference to the file and FullName is the full name of the file.

{error, enoent}

The file could not be found in any of the directories in Path.

{error, atom()}

The file could not be opened.

path_script(Path, Filename) ->
               {ok, Value, FullName} | {error, Reason}

  • Path = [Dir :: name_all()]
  • Filename = name_all()
  • Value = term()
  • FullName = filename_all()
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

Searches the path Path (a list of directory names) until the file Filename is found. If Filename is an absolute file name, Path is ignored. Then reads and evaluates Erlang expressions, separated by '.' (or ',', a sequence of expressions is also an expression), from the file. Returns one of the following:

{ok, Value, FullName}

The file was read and evaluated. FullName is the full name of the file and Value the value of the last expression.

{error, enoent}

The file could not be found in any of the directories in Path.

{error, atom()}

An error occurred when opening the file or reading it. See open/2 for a list of typical error codes.

{error, {Line, Mod, Term}}

An error occurred when interpreting the Erlang expressions in the file. Use format_error/1 to convert the three-element tuple to an English description of the error.

The encoding of Filename can be set by a comment as described in epp(3).

path_script(Path, Filename, Bindings) ->
               {ok, Value, FullName} | {error, Reason}

The same as path_script/2 but the variable bindings Bindings are used in the evaluation. See erl_eval(3) about variable bindings.
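A minimal sketch of passing bindings to path_script/3. It assumes a script file whose expressions refer to a variable X; the module path_script_example and the function run/3 are hypothetical names. The bindings are built with erl_eval:new_bindings/0 and erl_eval:add_binding/3.

```erlang
-module(path_script_example).
-export([run/3]).

%% Evaluates the script Filename, searched for along Path, with the
%% variable X bound to the given value.
%% Hypothetical helper, not part of the file module.
run(Path, Filename, X) ->
    Bindings = erl_eval:add_binding('X', X, erl_eval:new_bindings()),
    case file:path_script(Path, Filename, Bindings) of
        {ok, Value, FullName} ->
            {ok, Value, FullName};
        {error, Reason} ->
            {error, lists:flatten(file:format_error(Reason))}
    end.
```

With a script containing the single expression `X * 2.`, calling run(["."], Script, 21) is expected to return 42 as the value.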

pid2name(Pid) -> {ok, Filename} | undefined

If Pid is an IO device, that is, a pid returned from open/2, this function returns the filename, or rather:

{ok, Filename}

If this node's file server is not a slave, the file was opened by this node's file server, (this implies that Pid must be a local pid) and the file is not closed. Filename is the filename in flat string format.

undefined

In all other cases.

Warning!

This function is intended for debugging only.

position(IoDevice, Location) ->
            {ok, NewPosition} | {error, Reason}

Sets the position of the file referenced by IoDevice to Location. Returns {ok, NewPosition} (as absolute offset) if successful, otherwise {error, Reason}. Location is one of the following:

Offset

The same as {bof, Offset}.

{bof, Offset}

Absolute offset.

{cur, Offset}

Offset from the current position.

{eof, Offset}

Offset from the end of file.

bof | cur | eof

The same as above with Offset 0.

Note that offsets are counted in bytes, not in characters. If the file is opened using some other encoding than latin1, one byte does not correspond to one character. Positioning in such a file can only be done to known character boundaries, i.e. to a position earlier retrieved by getting a current position, to the beginning/end of the file or to some other position known to be on a correct character boundary by some other means (typically beyond a byte order mark in the file, which has a known byte-size).

Typical error reasons are:

einval

Either Location was illegal, or it evaluated to a negative offset in the file. Note that if the resulting position is a negative value, the result is an error, and after the call the file position is undefined.
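The following sketch reads the last bytes of a file by positioning relative to the start of the file. It first asks for the end position so that it never requests a negative offset, which would give einval. The module position_example and the function tail_bytes/2 are hypothetical names, and the file is assumed to use the default latin1 encoding so that byte offsets are safe.

```erlang
-module(position_example).
-export([tail_bytes/2]).

%% Returns the last N bytes of Filename (or the whole file if shorter).
%% Hypothetical helper, not part of the file module.
tail_bytes(Filename, N) when is_integer(N), N >= 0 ->
    case file:open(Filename, [read, raw, binary]) of
        {ok, Fd} ->
            try read_tail(Fd, N) after file:close(Fd) end;
        {error, Reason} ->
            {error, Reason}
    end.

read_tail(Fd, N) ->
    %% Positioning at eof returns the absolute offset, i.e. the size.
    {ok, Size} = file:position(Fd, eof),
    Start = max(0, Size - N),
    {ok, Start} = file:position(Fd, {bof, Start}),
    case file:read(Fd, N) of
        {ok, Data} -> {ok, Data};
        eof -> {ok, <<>>};
        {error, Reason} -> {error, Reason}
    end.
```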

pread(IoDevice, LocNums) -> {ok, DataL} | eof | {error, Reason}

  • IoDevice = io_device()
  • LocNums =
        [{Location :: location(), Number :: integer() >= 0}]
  • DataL = [Data]
  • Data = string() | binary() | eof
  • Reason = posix() | badarg | terminated

Performs a sequence of pread/3 in one operation, which is more efficient than calling them one at a time. Returns {ok, [Data, ...]} or {error, Reason}, where each Data, the result of the corresponding pread, is either a list or a binary depending on the mode of the file, or eof if the requested position was beyond end of file.

As the position is given as a byte-offset, special caution has to be taken when working with files where encoding is set to something else than latin1, as not every byte position will be a valid character boundary on such a file.

pread(IoDevice, Location, Number) ->
         {ok, Data} | eof | {error, Reason}

Combines position/2 and read/2 in one operation, which is more efficient than calling them one at a time. If IoDevice has been opened in raw mode, some restrictions apply: Location is only allowed to be an integer; and the current position of the file is undefined after the operation.

As the position is given as a byte-offset, special caution has to be taken when working with files where encoding is set to something else than latin1, as not every byte position will be a valid character boundary on such a file.
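As an illustration of pread/3 and pread/2 on a raw file, here is a minimal sketch. The module name pread_demo and the scratch file Path are assumptions, not part of the module.

```erlang
-module(pread_demo).
-export([run/1]).

%% Writes ten bytes to Path (a hypothetical scratch file) and reads
%% parts of them back by absolute byte offset.
run(Path) ->
    ok = file:write_file(Path, <<"abcdefghij">>),
    {ok, Fd} = file:open(Path, [read, raw, binary]),
    %% Raw mode: Location must be an integer offset.
    {ok, <<"cde">>} = file:pread(Fd, 2, 3),
    %% Several reads in one call. The second read is cut short by end of
    %% file, and the third starts beyond end of file and yields eof.
    {ok, [<<"ab">>, <<"ij">>, eof]} =
        file:pread(Fd, [{0, 2}, {8, 5}, {100, 1}]),
    ok = file:close(Fd).
```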

pwrite(IoDevice, LocBytes) -> ok | {error, {N, Reason}}

Performs a sequence of pwrite/3 in one operation, which is more efficient than calling them one at a time. Returns ok or {error, {N, Reason}}, where N is the number of successful writes that were done before the failure.

When positioning in a file with other encoding than latin1, caution must be taken to set the position on a correct character boundary, see position/2 for details.

pwrite(IoDevice, Location, Bytes) -> ok | {error, Reason}

Combines position/2 and write/2 in one operation, which is more efficient than calling them one at a time. If IoDevice has been opened in raw mode, some restrictions apply: Location is only allowed to be an integer; and the current position of the file is undefined after the operation.

When positioning in a file with other encoding than latin1, caution must be taken to set the position on a correct character boundary, see position/2 for details.
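The following minimal sketch shows pwrite/3 and pwrite/2 overwriting parts of an existing file in place. The module name pwrite_demo and the scratch file Path are assumptions made for illustration.

```erlang
-module(pwrite_demo).
-export([run/1]).

%% Fills Path (a hypothetical scratch file) with ten "0" characters and
%% then overwrites some of them by absolute byte offset.
run(Path) ->
    ok = file:write_file(Path, <<"0000000000">>),
    %% read combined with write keeps the existing contents.
    {ok, Fd} = file:open(Path, [read, write, raw, binary]),
    ok = file:pwrite(Fd, 2, <<"AB">>),
    %% Several writes in one call.
    ok = file:pwrite(Fd, [{0, <<"x">>}, {8, <<"YZ">>}]),
    ok = file:close(Fd),
    file:read_file(Path).
```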

read(IoDevice, Number) -> {ok, Data} | eof | {error, Reason}

  • IoDevice = io_device() | atom()
  • Number = integer() >= 0
  • Data = string() | binary()
  • Reason =
        posix() |
        badarg |
        terminated |
        {no_translation, unicode, latin1}

Reads Number bytes/characters from the file referenced by IoDevice. The functions read/2, pread/3 and read_line/1 are the only ways to read from a file opened in raw mode (although they work for normally opened files, too).

For files where encoding is set to something else than latin1, one character might be represented by more than one byte on the file. The parameter Number always denotes the number of characters read from the file, while the position in the file might be moved much more than this number when reading a Unicode file.

Also, if encoding is set to something else than latin1, the read/2 call will fail if the data contains characters larger than 255, which is why the io(3) module is to be preferred when reading such a file.

The function returns:

{ok, Data}

If the file was opened in binary mode, the read bytes are returned in a binary, otherwise in a list. The list or binary will be shorter than the number of bytes requested if end of file was reached.

eof

Returned if Number>0 and end of file was reached before anything at all could be read.

{error, Reason}

An error occurred.

Typical error reasons:

ebadf

The file is not opened for reading.

{no_translation, unicode, latin1}

The file was opened with another encoding than latin1 and the data in the file can not be translated to the byte-oriented data that this function returns.
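The three kinds of return value can be seen on a five-byte file. This is a minimal sketch; the module name read_demo and the scratch file Path are assumptions, not part of the module.

```erlang
-module(read_demo).
-export([run/1]).

%% Writes five bytes to Path (a hypothetical scratch file) and reads
%% them back in two steps, followed by a read at end of file.
run(Path) ->
    ok = file:write_file(Path, <<"hello">>),
    {ok, Fd} = file:open(Path, [read, binary]),
    {ok, <<"hel">>} = file:read(Fd, 3),
    {ok, <<"lo">>} = file:read(Fd, 10),   % shorter than requested: end of file
    eof = file:read(Fd, 1),               % nothing at all could be read
    ok = file:close(Fd).
```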

read_file(Filename) -> {ok, Binary} | {error, Reason}

  • Filename = name_all()
  • Binary = binary()
  • Reason = posix() | badarg | terminated | system_limit

Returns {ok, Binary}, where Binary is a binary data object that contains the contents of Filename, or {error, Reason} if an error occurs.

Typical error reasons:

enoent

The file does not exist.

eacces

Missing permission for reading the file, or for searching one of the parent directories.

eisdir

The named file is a directory.

enotdir

A component of the file name is not a directory. On some platforms, enoent is returned instead.

enomem

There is not enough memory for the contents of the file.
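A caller will usually match on the result and treat a few error reasons specially. The following is a minimal sketch; the module name read_file_demo, the function size_of/1 and the atoms missing and other are hypothetical names chosen for illustration.

```erlang
-module(read_file_demo).
-export([size_of/1]).

%% Returns {ok, Bytes} for a readable file, {error, missing} if it does
%% not exist, and a descriptive string for any other error reason.
size_of(Path) ->
    case file:read_file(Path) of
        {ok, Binary} ->
            {ok, byte_size(Binary)};
        {error, enoent} ->
            {error, missing};
        {error, Reason} ->
            {error, {other, file:format_error(Reason)}}
    end.
```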

read_file_info(Filename) -> {ok, FileInfo} | {error, Reason}

read_file_info(Filename, Opts) -> {ok, FileInfo} | {error, Reason}

Retrieves information about a file. Returns {ok, FileInfo} if successful, otherwise {error, Reason}. FileInfo is a record file_info, defined in the Kernel include file file.hrl. Include the following directive in the module from which the function is called:

-include_lib("kernel/include/file.hrl").

The time type returned in atime, mtime and ctime depends on the time type set in Opts :: {time, Type}. Type local returns local time, universal returns universal time, and posix returns seconds since or before the Unix time epoch, which is 1970-01-01 00:00 UTC. Default is {time, local}.

If the raw option is set, the file server will not be called and only information about local files will be returned.

Note!

Since file times are stored in POSIX time on most operating systems, it is faster to query file information with the posix option.

The record file_info contains the following fields.

size = integer() >= 0

Size of file in bytes.

type = device | directory | other | regular | symlink

The type of the file.

access = read | write | read_write | none

The current system access to the file.

atime = date_time() | integer() >= 0

The last time the file was read.

mtime = date_time() | integer() >= 0

The last time the file was written.

ctime = date_time() | integer() >= 0

The interpretation of this time field depends on the operating system. On Unix, it is the last time the file or the inode was changed. In Windows, it is the create time.

mode = integer() >= 0

The file permissions as the sum of the following bit values:

8#00400
read permission: owner
8#00200
write permission: owner
8#00100
execute permission: owner
8#00040
read permission: group
8#00020
write permission: group
8#00010
execute permission: group
8#00004
read permission: other
8#00002
write permission: other
8#00001
execute permission: other
16#800
set user id on execution
16#400
set group id on execution

On Unix platforms, other bits than those listed above may be set.

links = integer() >= 0

Number of links to the file (this will always be 1 for file systems which have no concept of links).

major_device = integer() >= 0

Identifies the file system where the file is located. In Windows, the number indicates a drive as follows: 0 means A:, 1 means B:, and so on.

minor_device = integer() >= 0

Only valid for character devices on Unix. In all other cases, this field is zero.

inode = integer() >= 0

Gives the inode number. On non-Unix file systems, this field will be zero.

uid = integer() >= 0

Indicates the owner of the file. Will be zero for non-Unix file systems.

gid = integer() >= 0

Gives the group that the owner of the file belongs to. Will be zero for non-Unix file systems.

Typical error reasons:

eacces

Missing search permission for one of the parent directories of the file.

enoent

The file does not exist.

enotdir

A component of the file name is not a directory. On some platforms, enoent is returned instead.
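The file_info record is normally taken apart with record syntax. The following minimal sketch requests POSIX times and tests one of the mode bits listed above; the module name file_info_demo and the function describe/1 are hypothetical.

```erlang
-module(file_info_demo).
-export([describe/1]).

-include_lib("kernel/include/file.hrl").

%% Returns {Type, Size, OwnerWritable, MTime} for Path, where MTime is
%% in seconds relative to the Unix epoch because of {time, posix}.
describe(Path) ->
    case file:read_file_info(Path, [{time, posix}]) of
        {ok, #file_info{type = Type, size = Size,
                        mode = Mode, mtime = MTime}} ->
            %% 8#00200 is the "write permission: owner" bit.
            OwnerWritable = (Mode band 8#00200) =/= 0,
            {Type, Size, OwnerWritable, MTime};
        {error, Reason} ->
            {error, Reason}
    end.
```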

read_line(IoDevice) -> {ok, Data} | eof | {error, Reason}

  • IoDevice = io_device() | atom()
  • Data = string() | binary()
  • Reason =
        posix() |
        badarg |
        terminated |
        {no_translation, unicode, latin1}

Reads a line of bytes/characters from the file referenced by IoDevice. Lines are defined to be delimited by the linefeed (LF, \n) character, but any carriage return (CR, \r) followed by a newline is also treated as a single LF character (the carriage return is silently ignored). The line is returned including the LF, but excluding any CR immediately followed by a LF. This behaviour is consistent with the behaviour of io:get_line/2. If end of file is reached without any LF ending the last line, a line with no trailing LF is returned.

The function can be used on files opened in raw mode. It is however inefficient to use it on raw files if the file is not opened with the option {read_ahead, Size} specified, which is why combining raw and {read_ahead, Size} is highly recommended when opening a text file for raw line-oriented reading.

If encoding is set to something else than latin1, the read_line/1 call will fail if the data contains characters larger than 255, which is why the io(3) module is to be preferred when reading such a file.

The function returns:

{ok, Data}

One line from the file is returned, including the trailing LF, but with CRLF sequences replaced by a single LF (see above).

If the file was opened in binary mode, the read bytes are returned in a binary, otherwise in a list.

eof

Returned if end of file was reached before anything at all could be read.

{error, Reason}

An error occurred.

Typical error reasons:

ebadf

The file is not opened for reading.

{no_translation, unicode, latin1}

The file was opened with another encoding than latin1 and the data in the file can not be translated to the byte-oriented data that this function returns.
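A typical use of read_line/1 is a loop that collects lines until eof, on a file opened with raw and read_ahead as recommended above. This is a minimal sketch; the module name read_line_demo, the helper collect/2 and the buffer size 65536 are assumptions made for illustration.

```erlang
-module(read_line_demo).
-export([lines/1]).

%% Returns all lines of the file at Path as a list of binaries.
lines(Path) ->
    {ok, Fd} = file:open(Path, [read, raw, binary, {read_ahead, 65536}]),
    try
        collect(Fd, [])
    after
        file:close(Fd)
    end.

%% Reads lines until end of file, keeping them in order.
collect(Fd, Acc) ->
    case file:read_line(Fd) of
        {ok, Line} -> collect(Fd, [Line | Acc]);
        eof -> lists:reverse(Acc);
        {error, Reason} -> erlang:error(Reason)
    end.
```

Each returned line keeps its trailing LF, a CRLF pair comes back as a single LF, and a last line without a terminator is returned as is.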

read_link(Name) -> {ok, Filename} | {error, Reason}

This function returns {ok, Filename} if Name refers to a symbolic link that is not a "raw" file name, or {error, Reason} otherwise. On platforms that do not support symbolic links, the return value will be {error,enotsup}.

Typical error reasons:

einval

Name does not refer to a symbolic link or the name of the file that it refers to does not conform to the expected encoding.

enoent

The file does not exist.

enotsup

Symbolic links are not supported on this platform.
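The error reasons above make it possible to tell a symbolic link from an ordinary file. The following is a minimal sketch; the module name read_link_demo, the function target/1 and the atom not_a_link are hypothetical names.

```erlang
-module(read_link_demo).
-export([target/1]).

%% Returns {link, Target} if Name is a symbolic link, not_a_link if it
%% exists but is something else, and {error, Reason} otherwise.
target(Name) ->
    case file:read_link(Name) of
        {ok, Target} -> {link, Target};
        {error, einval} -> not_a_link;
        {error, Reason} -> {error, Reason}
    end.
```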

read_link_all(Name) -> {ok, Filename} | {error, Reason}

This function returns {ok, Filename} if Name refers to a symbolic link or {error, Reason} otherwise. On platforms that do not support symbolic links, the return value will be {error,enotsup}.

Note that Filename can be either a list or a binary.

Typical error reasons:

einval

Name does not refer to a symbolic link.

enoent

The file does not exist.

enotsup

Symbolic links are not supported on this platform.

read_link_info(Name) -> {ok, FileInfo} | {error, Reason}

read_link_info(Name, Opts) -> {ok, FileInfo} | {error, Reason}

This function works like read_file_info/1,2 except that if Name is a symbolic link, information about the link will be returned in the file_info record and the type field of the record will be set to symlink.

If the raw option is set, the file server will not be called and only information about local files will be returned.

If Name is not a symbolic link, this function returns exactly the same result as read_file_info/1. On platforms that do not support symbolic links, this function is always equivalent to read_file_info/1.

rename(Source, Destination) -> ok | {error, Reason}

Tries to rename the file Source to Destination. It can be used to move files (and directories) between directories, but it is not sufficient to specify only the destination directory; the destination file name must also be specified. For example, if bar is a normal file and foo and baz are directories, rename("foo/bar", "baz") returns an error, but rename("foo/bar", "baz/bar") succeeds. Returns ok if it is successful.

Note!

Renaming of open files is not allowed on most platforms (see eacces below).

Typical error reasons:

eacces

Missing read or write permissions for the parent directories of Source or Destination. On some platforms, this error is given if either Source or Destination is open.

eexist

Destination is not an empty directory. On some platforms, also given when Source and Destination are not of the same type.

einval

Source is a root directory, or Destination is a sub-directory of Source.

eisdir

Destination is a directory, but Source is not.

enoent

Source does not exist.

enotdir

Source is a directory, but Destination is not.

exdev

Source and Destination are on different file systems.
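Because the destination file name must always be given, moving a file into a directory means building the full destination path first. The following minimal sketch does that with filename:join/2 and filename:basename/1; the module name rename_demo and the function move_into/2 are hypothetical.

```erlang
-module(rename_demo).
-export([move_into/2]).

%% Moves Source into the directory DestDir, keeping its base name.
move_into(Source, DestDir) ->
    Destination = filename:join(DestDir, filename:basename(Source)),
    case file:rename(Source, Destination) of
        ok -> {ok, Destination};
        {error, Reason} -> {error, Reason}
    end.
```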

script(Filename) -> {ok, Value} | {error, Reason}

  • Filename = name_all()
  • Value = term()
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

Reads and evaluates Erlang expressions, separated by '.' (or ',', a sequence of expressions is also an expression), from the file. Returns one of the following:

{ok, Value}

The file was read and evaluated. Value is the value of the last expression.

{error, atom()}

An error occurred when opening the file or reading it. See open/2 for a list of typical error codes.

{error, {Line, Mod, Term}}

An error occurred when interpreting the Erlang expressions in the file. Use format_error/1 to convert the three-element tuple to an English description of the error.

The encoding of Filename can be set by a comment as described in epp(3).

script(Filename, Bindings) -> {ok, Value} | {error, Reason}

  • Filename = name_all()
  • Bindings = erl_eval:binding_struct()
  • Value = term()
  • Reason =
        posix() |
        badarg |
        terminated |
        system_limit |
        {Line :: integer(), Mod :: module(), Term :: term()}

The same as script/1 but the variable bindings Bindings are used in the evaluation. See erl_eval(3) about variable bindings.
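As an illustration of script/2, the following minimal sketch writes a two-expression script to a scratch file and evaluates it with the variable N bound beforehand. The module name script_demo, the file contents and the value 20 are assumptions made for the example.

```erlang
-module(script_demo).
-export([run/1]).

%% Writes a small script to Path (a hypothetical scratch file) and
%% evaluates it with N bound to 20.
run(Path) ->
    Script = <<"X = 2 * N.\n{result, X + 1}.\n">>,
    ok = file:write_file(Path, Script),
    Bindings = erl_eval:add_binding('N', 20, erl_eval:new_bindings()),
    %% The value of the last expression in the file is returned.
    file:script(Path, Bindings).
```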

set_cwd(Dir) -> ok | {error, Reason}

  • Dir = name() | EncodedBinary
  • EncodedBinary = binary()
  • Reason = posix() | badarg | no_translation

Sets the current working directory of the file server to Dir. Returns ok if successful.

The functions in the file module usually treat binaries as raw filenames, i.e. they are passed as is even when the encoding of the binary does not agree with file:native_name_encoding(). This function however expects binaries to be encoded according to the value returned by file:native_name_encoding().

Typical error reasons are:

enoent

The directory does not exist.

enotdir

A component of Dir is not a directory. On some platforms, enoent is returned.

eacces

Missing permission for the directory or one of its parents.

badarg

Dir had an improper type, such as tuple.

no_translation

Dir is a binary() with characters coded in ISO-latin-1 and the VM is operating with unicode file name encoding.

Warning!

In a future release, a bad type for the Dir argument will probably generate an exception.
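Since the working directory belongs to the file server and is therefore shared, code that changes it temporarily should restore it afterwards. The following is a minimal sketch of such a wrapper; the module name cwd_demo and the function with_cwd/2 are hypothetical.

```erlang
-module(cwd_demo).
-export([with_cwd/2]).

%% Runs Fun with Dir as the current working directory and restores the
%% previous directory afterwards, even if Fun raises an exception.
with_cwd(Dir, Fun) ->
    {ok, Old} = file:get_cwd(),
    case file:set_cwd(Dir) of
        ok ->
            try
                Fun()
            after
                file:set_cwd(Old)
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```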

sync(IoDevice) -> ok | {error, Reason}

Makes sure that any buffers kept by the operating system (not by the Erlang runtime system) are written to disk. On some platforms, this function might have no effect.

Typical error reasons are:

enospc

Not enough space left to write the file.

datasync(IoDevice) -> ok | {error, Reason}

Makes sure that any buffers kept by the operating system (not by the Erlang runtime system) are written to disk. In many ways it resembles fsync but it does not update some of the file's metadata such as the access time. On some platforms this function has no effect.

Applications that access databases or log files often write a tiny data fragment (e.g., one line in a log file) and then call fsync() immediately in order to ensure that the written data is physically stored on the hard disk. Unfortunately, fsync() will always initiate two write operations: one for the newly written data and another one in order to update the modification time stored in the inode. If the modification time is not a part of the transaction concept, fdatasync() can be used to avoid unnecessary inode disk write operations.

Available only in some POSIX systems. On systems that do not implement the fdatasync() syscall, this call results in a call to fsync(), or has no effect.

truncate(IoDevice) -> ok | {error, Reason}

Truncates the file referenced by IoDevice at the current position. Returns ok if successful, otherwise {error, Reason}.

sendfile(Filename, Socket) ->
            {ok, integer() >= 0} |
            {error, inet:posix() | closed | badarg | not_owner}

Sends the file Filename to Socket. Returns {ok, BytesSent} if successful, otherwise {error, Reason}.

sendfile(RawFile, Socket, Offset, Bytes, Opts) ->
            {ok, integer() >= 0} |
            {error, inet:posix() | closed | badarg | not_owner}

  • sendfile_option() =
        {chunk_size, integer() >= 0} | {use_threads, boolean()}

Sends Bytes from the file referenced by RawFile beginning at Offset to Socket. Returns {ok, BytesSent} if successful, otherwise {error, Reason}. If Bytes is set to 0 all data after the given Offset is sent.

The file used must be opened using the raw flag, and the process calling sendfile must be the controlling process of the socket. See gen_tcp:controlling_process/2.

If the OS used does not support sendfile, an Erlang fallback using file:read and gen_tcp:send is used.

The option list can contain the following options:

chunk_size
The chunk size used by the Erlang fallback to send data. If using the fallback, this should be set to a value which comfortably fits in the system's memory. Default is 20 MB.
use_threads
Instruct the emulator to use the async thread pool for the sendfile system call. This could be useful if the OS you are running on does not properly support non-blocking sendfile calls. Do note that using async threads potentially makes your system vulnerable to slow client attacks. If set to true and no async threads are available, the sendfile call will return {error,einval}. Introduced in Erlang/OTP 17.0. Default is false.
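The following minimal sketch exercises sendfile/2 over a loopback TCP connection, with sender and receiver in the same process. The module name sendfile_demo, the loopback setup and the 5000 ms receive timeout are assumptions made for the example; Path must name a non-empty file.

```erlang
-module(sendfile_demo).
-export([run/1]).

%% Sends the file at Path over a loopback connection and returns
%% {BytesSent, ReceivedData}.
run(Path) ->
    {ok, Listen} = gen_tcp:listen(0, [binary, {active, false}]),
    {ok, Port} = inet:port(Listen),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false}]),
    {ok, Server} = gen_tcp:accept(Listen),
    %% This process accepted Server, so it is its controlling process.
    {ok, Sent} = file:sendfile(Path, Server),
    {ok, Received} = gen_tcp:recv(Client, Sent, 5000),
    ok = gen_tcp:close(Server),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Listen),
    {Sent, Received}.
```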

write(IoDevice, Bytes) -> ok | {error, Reason}

Writes Bytes to the file referenced by IoDevice. This function is the only way to write to a file opened in raw mode (although it works for normally opened files, too). Returns ok if successful, and {error, Reason} otherwise.

If the file is opened with encoding set to something else than latin1, each byte written might result in several bytes actually being written to the file, as the byte range 0..255 might represent anything between one and four bytes depending on value and UTF encoding type.

Typical error reasons are:

ebadf

The file is not opened for writing.

enospc

There is no space left on the device.

write_file(Filename, Bytes) -> ok | {error, Reason}

  • Filename = name_all()
  • Bytes = iodata()
  • Reason = posix() | badarg | terminated | system_limit

Writes the contents of the iodata term Bytes to the file Filename. The file is created if it does not exist. If it exists, the previous contents are overwritten. Returns ok, or {error, Reason}.

Typical error reasons are:

enoent

A component of the file name does not exist.

enotdir

A component of the file name is not a directory. On some platforms, enoent is returned instead.

enospc

There is no space left on the device.

eacces

Missing permission for writing the file or searching one of the parent directories.

eisdir

The named file is a directory.

write_file(Filename, Bytes, Modes) -> ok | {error, Reason}

Same as write_file/2, but takes a third argument Modes, a list of possible modes, see open/2. The mode flags binary and write are implicit, so they should not be used.
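A common use of the Modes argument is appending instead of overwriting. The following is a minimal sketch; the module name append_demo and the function append_line/2 are hypothetical.

```erlang
-module(append_demo).
-export([append_line/2]).

%% Appends Line (any iodata) followed by a linefeed to the file at Path,
%% creating the file if it does not exist.
append_line(Path, Line) ->
    file:write_file(Path, [Line, $\n], [append]).
```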

write_file_info(Filename, FileInfo) -> ok | {error, Reason}

write_file_info(Filename, FileInfo, Opts) -> ok | {error, Reason}

Change file information. Returns ok if successful, otherwise {error, Reason}. FileInfo is a record file_info, defined in the Kernel include file file.hrl. Include the following directive in the module from which the function is called:

-include_lib("kernel/include/file.hrl").

The time type set in atime, mtime and ctime depends on the time type set in Opts :: {time, Type}. Type local will interpret the time set as local, universal will interpret it as universal time and posix must be seconds since or before the Unix time epoch, which is 1970-01-01 00:00 UTC. Default is {time, local}.

If the raw option is set, the file server will not be called and only information about local files will be returned.

The following fields are used from the record, if they are given.

atime = date_time() | integer() >= 0

The last time the file was read.

mtime = date_time() | integer() >= 0

The last time the file was written.

ctime = date_time() | integer() >= 0

On Unix, any value given for this field will be ignored (the "ctime" for the file will be set to the current time). On Windows, this field is the new creation time to set for the file.

mode = integer() >= 0

The file permissions as the sum of the following bit values:

8#00400
read permission: owner
8#00200
write permission: owner
8#00100
execute permission: owner
8#00040
read permission: group
8#00020
write permission: group
8#00010
execute permission: group
8#00004
read permission: other
8#00002
write permission: other
8#00001
execute permission: other
16#800
set user id on execution
16#400
set group id on execution

On Unix platforms, other bits than those listed above may be set.

uid = integer() >= 0

Indicates the owner of the file. Ignored for non-Unix file systems.

gid = integer() >= 0

Gives the group that the owner of the file belongs to. Ignored for non-Unix file systems.

Typical error reasons:

eacces

Missing search permission for one of the parent directories of the file.

enoent

The file does not exist.

enotdir

A component of the file name is not a directory. On some platforms, enoent is returned instead.
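Since only the fields that are given are used, a record with just the mode field set changes the permissions and nothing else. The following minimal sketch clears the three write permission bits; the module name mode_demo and both function names are hypothetical, and the masks are built from the bit values listed above.

```erlang
-module(mode_demo).
-export([make_read_only/1, owner_writable/1]).

-include_lib("kernel/include/file.hrl").

%% Clears the write permission bits for owner, group and other.
make_read_only(Path) ->
    {ok, #file_info{mode = Mode}} = file:read_file_info(Path),
    %% Keep only the permission bits, then drop 8#00200, 8#00020, 8#00002.
    NewMode = (Mode band 8#07777) band (bnot 8#00222),
    file:write_file_info(Path, #file_info{mode = NewMode}).

%% Tells whether the "write permission: owner" bit is set.
owner_writable(Path) ->
    {ok, #file_info{mode = Mode}} = file:read_file_info(Path),
    (Mode band 8#00200) =/= 0.
```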

POSIX Error Codes

eacces - permission denied
eagain - resource temporarily unavailable
ebadf - bad file number
ebusy - file busy
edquot - disk quota exceeded
eexist - file already exists
efault - bad address in system call argument
efbig - file too large
eintr - interrupted system call
einval - invalid argument
eio - IO error
eisdir - illegal operation on a directory
eloop - too many levels of symbolic links
emfile - too many open files
emlink - too many links
enametoolong - file name too long
enfile - file table overflow
enodev - no such device
enoent - no such file or directory
enomem - not enough memory
enospc - no space left on device
enotblk - block device required
enotdir - not a directory
enotsup - operation not supported
enxio - no such device or address
eperm - not owner
epipe - broken pipe
erofs - read-only file system
espipe - invalid seek
esrch - no such process
estale - stale remote file handle
exdev - cross-domain link

Performance

Some operating system file operations, for example a sync/1 or close/1 on a huge file, may block their calling thread for seconds. If this befalls the emulator main thread, the response time is no longer in the order of milliseconds, depending on the definition of "soft" in soft real-time system.

If the device driver thread pool is active, file operations are done through those threads instead, so the emulator can go on executing Erlang processes. Unfortunately, the time for serving a file operation increases due to the extra scheduling required from the operating system.

If the device driver thread pool is disabled or of size 0, large file reads and writes are segmented into several smaller ones, which enables the emulator to serve other processes during the file operation. This gives the same effect as when using the thread pool, but with larger overhead. Other file operations, for example sync/1 or close/1 on a huge file, are still a problem.

For increased performance, raw files are recommended. Raw files use the file system of the node's host machine. For normal files (non-raw), the file server is used to find the files, and if the node is running its file server as slave to another node's, and the other node runs on some other host machine, they may have different file systems. This is seldom a problem, but you have now been warned.

A normal file is really a process so it can be used as an IO device (see io). Therefore when data is written to a normal file, the sending of the data to the file process, copies all data that are not binaries. Opening the file in binary mode and writing binaries is therefore recommended. If the file is opened on another node, or if the file server runs as slave to another node's, also binaries are copied.

Caching data to reduce the number of file operations, or rather the number of calls to the file driver, will generally increase performance. The following function writes 4 MBytes in 23 seconds when tested:

create_file_slow(Name, N) when is_integer(N), N >= 0 ->
    {ok, FD} = file:open(Name, [raw, write, delayed_write, binary]),
    ok = create_file_slow(FD, 0, N),
    ok = file:close(FD),
    ok.

create_file_slow(_FD, M, M) ->
    ok;
create_file_slow(FD, M, N) ->
    %% One call to file:write/2 per 32-bit word.
    ok = file:write(FD, <<M:32/unsigned>>),
    create_file_slow(FD, M+1, N).

The following, functionally equivalent, function collects 1024 entries into a list of 128 32-byte binaries before each call to file:write/2 and so does the same work in 0.52 seconds, which is 44 times faster.

create_file(Name, N) when is_integer(N), N >= 0 ->
    {ok, FD} = file:open(Name, [raw, write, delayed_write, binary]),
    ok = create_file(FD, 0, N),
    ok = file:close(FD),
    ok.

%% Splits the range M..N-1 into batches of at most 1024 entries.
create_file(_FD, M, M) ->
    ok;
create_file(FD, M, N) when M + 1024 =< N ->
    create_file(FD, M, M + 1024, []),
    create_file(FD, M + 1024, N);
create_file(FD, M, N) ->
    create_file(FD, M, N, []).

%% Builds the batch back to front in R, eight entries per 32-byte binary
%% where possible, and writes the whole list with one file:write/2 call.
create_file(FD, M, M, R) ->
    ok = file:write(FD, R);
create_file(FD, M, N0, R) when M + 8 =< N0 ->
    N1  = N0-1,  N2  = N0-2,  N3  = N0-3,  N4  = N0-4,
    N5  = N0-5,  N6  = N0-6,  N7  = N0-7,  N8  = N0-8,
    create_file(FD, M, N8,
                [<<N8:32/unsigned,  N7:32/unsigned,
                   N6:32/unsigned,  N5:32/unsigned,
                   N4:32/unsigned,  N3:32/unsigned,
                   N2:32/unsigned,  N1:32/unsigned>> | R]);
create_file(FD, M, N0, R) ->
    N1 = N0-1,
    create_file(FD, M, N1, [<<N1:32/unsigned>> | R]).

Note!

Trust only your own benchmarks. If the list length in create_file/2 above is increased, it will run slightly faster, but consume more memory and cause more memory fragmentation. How much this affects your application is something that this simple benchmark can not predict.

If the size of each binary is increased to 64 bytes, it will also run slightly faster, but the code will be twice as clumsy. In the current implementation, binaries larger than 64 bytes are stored in memory common to all processes and not copied when sent between processes, while these smaller binaries are stored on the process heap and copied when sent like any other term.

So, with a binary size of 68 bytes create_file/2 runs 30 percent slower than with 64 bytes, and will cause much more memory fragmentation. Note that if the binaries were to be sent between processes (for example a non-raw file) the results would probably be completely different.

A raw file is really a port. When writing data to a port, it is efficient to write a list of binaries. There is no need to flatten a deep list before writing. On Unix hosts, scatter output, which writes a set of buffers in one operation, is used when possible. In this way file:write(FD, [Bin1, Bin2 | Bin3]) will write the contents of the binaries without copying the data at all except for perhaps deep down in the operating system kernel.
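That a deep list of binaries can be written as is, without flattening, can be checked with a minimal sketch such as the following. The module name iolist_demo and the scratch file Path are assumptions made for illustration.

```erlang
-module(iolist_demo).
-export([run/1]).

%% Writes a nested list with a binary tail to Path (a hypothetical
%% scratch file) in a single file:write/2 call and reads the result back.
run(Path) ->
    {ok, Fd} = file:open(Path, [raw, write, binary]),
    ok = file:write(Fd, [<<"ab">>, [<<"cd">>, <<"ef">>] | <<"gh">>]),
    ok = file:close(Fd),
    file:read_file(Path).
```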

For raw files, pwrite/2 and pread/2 are efficiently implemented. The file driver is called only once for the whole operation, and the list iteration is done in the file driver.

The options delayed_write and read_ahead to file:open/2 make the file driver cache data to reduce the number of operating system calls. The function create_file/2 in the example above takes 60 seconds without the delayed_write option, which is 2.6 times slower.

And, as a really bad example, create_file_slow/2 above without the raw, binary and delayed_write options, that is it calls file:open(Name, [write]), needs 1 min 20 seconds for the job, which is 3.5 times slower than the first example, and 150 times slower than the optimized create_file/2.

Warnings

If an error occurs when accessing an open file with the io module, the process which handles the file will exit. The dead file process might hang if a process tries to access it later. This will be fixed in a future release.

SEE ALSO

filename(3)