Input and Output
Concept: All I/O in Rust is organized around three traits, owned by std::io:
- Read: Defines methods for byte-oriented input. Implementers are called readers.
- BufRead: Includes Read methods, plus methods for reading lines of text and so forth. Implementers are called buffered readers.
- Write: Defines methods for both byte-oriented and UTF-8 text output. Implementers are called writers.
Shortcut: These traits, along with Seek, are used so often that there's a prelude module containing only them. Just add:

    use std::io::prelude::*;
Readers and Writers
One of the simplest, lowest-level uses of Read and Write is a function that copies data from any reader to any writer:

    use std::io::{self, Read, Write, ErrorKind};

    const DEFAULT_BUF_SIZE: usize = 8 * 1024;

    fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
        where R: Read, W: Write
    {
        let mut buf = [0; DEFAULT_BUF_SIZE];
        let mut written = 0;
        loop {
            let len = match reader.read(&mut buf) {
                Ok(0) => return Ok(written), // end of input
                Ok(len) => len,
                Err(ref e) if e.kind() == ErrorKind::Interrupted => continue, // retry
                Err(e) => return Err(e),
            };
            writer.write_all(&buf[..len])?;
            written += len as u64;
        }
    }
Shortcut: The import statement use std::io::{self}; declares io as an alias to the std::io module, which means we can write things like std::io::Result as just io::Result.
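As a quick check of the copy pattern above, here is a minimal sketch that runs the standard library's own io::copy over an in-memory reader and writer. The helper name copy_in_memory is hypothetical and exists only for illustration.

```rust
use std::io::{self, Cursor};

/// Hypothetical helper: pushes `input` through a reader/writer pair
/// and returns whatever the writer received.
fn copy_in_memory(input: &[u8]) -> io::Result<Vec<u8>> {
    let mut reader = Cursor::new(input); // Cursor<&[u8]> implements Read
    let mut writer: Vec<u8> = Vec::new(); // Vec<u8> implements Write
    let copied = io::copy(&mut reader, &mut writer)?;
    // The returned count should match what ended up in the writer.
    debug_assert_eq!(copied as usize, writer.len());
    Ok(writer)
}
```

Because both sides are generic, the same call works unchanged for files, sockets, or stdin and stdout.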
Readers
All main methods defined by Read take the reader itself by mut reference. There are also adapter methods that take the reader by value and transform it into an iterator or a different reader.
Note that there is no method for closing a reader. Readers and writers typically implement Drop, so they are closed automatically when they go out of scope.
Reader Method reader.read(buffer)
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>

Reads some bytes from the data source and stores them in the given buffer. The usize success value is the number of bytes read, which may be less than buffer.len() even if there's still more data to read.
If read returns Ok(0), there's no more input to read.
On error, read returns Err(err), where err is an io::Error value. An io::Error is printable, for the benefit of humans. For programs, use the .kind() method, which returns an error code of type io::ErrorKind.
io::ErrorKind is an enum with many variants. Most of them indicate real problems and shouldn't be ignored, but not all. io::ErrorKind::Interrupted corresponds to the Unix error code EINTR, which means the read was interrupted by a signal. In almost all scenarios the right response is simply to retry the read.
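The three outcomes of read described above can be sketched as a single match. This is only an illustration: TimedOutReader and describe_read are hypothetical names, and the reader fails on purpose so that the .kind() branch is exercised.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader that always fails, used to demonstrate `.kind()`.
struct TimedOutReader;

impl Read for TimedOutReader {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Err(io::Error::new(ErrorKind::TimedOut, "simulated timeout"))
    }
}

/// Performs one `read` call and reports what happened.
fn describe_read<R: Read>(reader: &mut R) -> String {
    let mut buf = [0u8; 16];
    match reader.read(&mut buf) {
        Ok(0) => "end of input".to_string(),
        Ok(n) => format!("read {} bytes", n),
        // Programs branch on the error kind...
        Err(e) if e.kind() == ErrorKind::TimedOut => "timed out; retry later".to_string(),
        // ...while humans get the Display text.
        Err(e) => format!("error: {}", e),
    }
}
```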
Reader Method reader.read_to_end(&mut byte_vec)
    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize>
Reads all remaining input from the reader into a vector.
There's no limit on the amount of data that read_to_end
will return, so it's usually a good idea to impose a limit using .take()
.
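A minimal sketch of capping read_to_end with .take(), assuming a hypothetical helper named read_at_most:

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads everything from `reader`, but never
/// more than `limit` bytes.
fn read_at_most<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut bytes = Vec::new();
    // `take` wraps the reader so it reports end-of-input after `limit` bytes.
    reader.take(limit).read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

Even an endless source such as io::repeat is safe to read this way, because the Take wrapper stops after limit bytes.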
Reader Method reader.read_to_string(&mut string)
    fn read_to_string(&mut self, buf: &mut String) -> Result<usize>
Reads all remaining input from the reader into a string. If the source provides data that isn't valid UTF-8, read_to_string
will return an ErrorKind::InvalidData
error.
Reader Method reader.read_exact(&mut buf)
    fn read_exact(&mut self, buf: &mut [u8]) -> Result<()>
Reads exactly enough data to fill the given buffer. If the reader runs out of data before reading buf.len()
bytes, read_exact
returns an ErrorKind::UnexpectedEof
error.
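read_exact is a natural fit for fixed-size fields. As a sketch, the hypothetical helper below reads a 4-byte big-endian integer; the field layout is an assumption made up for the example.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads exactly four bytes and interprets them
/// as a big-endian u32.
fn read_u32_be<R: Read>(reader: &mut R) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    // Fails with ErrorKind::UnexpectedEof if fewer than 4 bytes remain.
    reader.read_exact(&mut buf)?;
    Ok(u32::from_be_bytes(buf))
}
```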
Adapter reader.bytes()
    fn bytes(self) -> Bytes<Self> where Self: Sized

Converts a reader into an iterator over the bytes of the input stream. The item type is io::Result<u8>, so an error check is required for every byte. It calls reader.read() one byte at a time, so this method is very inefficient if the reader isn't buffered.
Adapter reader.chars()
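    fn chars(self) -> Chars<Self> where Self: Sized

Converts a reader into an iterator over the input stream as UTF-8 characters. Note that this adapter was never stabilized and has since been removed from the standard library, so it isn't available in current stable Rust.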
Adapter reader.chain(reader2)
    fn chain<R: Read>(self, next: R) -> Chain<Self, R> where Self: Sized
Creates a new reader that produces all of the input from reader
, followed by all of the input from reader2
.
Adapter reader.take(n)
    fn take(self, limit: u64) -> Take<Self> where Self: Sized
Creates a new reader that reads from the same source as reader
, but is limited to n
bytes of input.
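The adapters compose, since each one consumes a reader and hands back a new reader or an iterator. A small sketch, with hypothetical helper names preview and count_byte:

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads `first` followed by `second`, stopping
/// after `limit` bytes in total.
fn preview<A: Read, B: Read>(first: A, second: B, limit: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    first.chain(second).take(limit).read_to_end(&mut out)?;
    Ok(out)
}

/// Hypothetical helper: counts occurrences of `needle` using `bytes()`.
/// Every item is an io::Result<u8>, hence the `?` on each byte.
fn count_byte<R: Read>(reader: R, needle: u8) -> io::Result<usize> {
    let mut count = 0;
    for byte in reader.bytes() {
        if byte? == needle {
            count += 1;
        }
    }
    Ok(count)
}
```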
Buffered Readers
Buffered readers implement both Read
and BufRead
, which provides three main methods.
Come back and add the type signatures of the following methods.
Buffered Reader Method reader.read_line(&mut line)
Reads a line of text and appends it to line
, which is of type String
.
The method returns an io::Result<usize>
, where usize
is the number of bytes read, including the line ending, if any.
If the reader is at the end of the input, line
will be unchanged and the method will return Ok(0)
.
☆ Buffered Reader Method reader.lines()
Returns an iterator over the lines of the input.
The item type is io::Result<String>
. Newline characters are not included in the strings.
Buffered Reader Methods reader.read_until(stop_byte, &mut byte_vec)
and reader.split(stop_byte)
Byte-oriented versions of .read_line() and .lines(). They produce Vec<u8>s instead of Strings.
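A short sketch of the buffered reader methods in use. The helper names line_lengths and split_fields are hypothetical; both accept any BufRead, including a plain byte slice.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: returns the byte length of each line,
/// line endings included, using `read_line`.
fn line_lengths<R: BufRead>(mut reader: R) -> io::Result<Vec<usize>> {
    let mut lengths = Vec::new();
    let mut line = String::new();
    loop {
        line.clear(); // read_line appends, so reset between lines
        let n = reader.read_line(&mut line)?;
        if n == 0 {
            break; // Ok(0) means end of input
        }
        lengths.push(n);
    }
    Ok(lengths)
}

/// Hypothetical helper: splits the input on `stop_byte` using `split`.
/// The delimiter itself is not included in the pieces.
fn split_fields<R: BufRead>(reader: R, stop_byte: u8) -> io::Result<Vec<Vec<u8>>> {
    reader.split(stop_byte).collect()
}
```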
Reading Lines
We can use .lines() to create a function that implements the Unix grep utility. Our function receives a generic reader (i.e., anything that implements BufRead).

    use std::io;
    use std::io::prelude::*;

    fn grep<R>(target: &str, reader: R) -> io::Result<()>
        where R: BufRead
    {
        for line_result in reader.lines() {
            let line = line_result?;
            if line.contains(target) {
                println!("{}", line);
            }
        }
        Ok(())
    }
To use stdin as our source of data, we have to go through its .lock() method: io::stdin() does not implement BufRead itself, but the StdinLock returned by .lock() does.

    let stdin = io::stdin();
    grep(&target, stdin.lock())?; // ok
If we wanted to use our function with the contents of a file, we could do so like this:
    use std::fs::File;
    use std::io::BufReader;

    let f = File::open(file)?;
    grep(&target, BufReader::new(f))?; // also ok
Collecting Lines
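One common way to gather every line into a vector is to collect the .lines() iterator directly into an io::Result, which stops at the first error. The sketch below assumes a hypothetical helper named collect_lines.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: collects all lines, or returns the first error.
/// Collecting into io::Result<Vec<String>> avoids a Vec of Results.
fn collect_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}
```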
Writers
To send output to a writer, use the write!()
and writeln!()
macros.
    writeln!(io::stderr(), "error: world not helloable")?;
    writeln!(&mut byte_vec, "The greatest common divisor of {:?} is {}", numbers, d)?;
The write macros are the same as the print macros except for two differences:
- The write macros take an extra first argument, a writer.
- The write macros return a Result, so errors must be handled. When the print macros experience an issue, they simply panic.
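As a small sketch of both differences, the hypothetical function write_report below takes any writer as its first argument and propagates the Result from each macro call.

```rust
use std::io::{self, Write};

/// Hypothetical helper: formats one report line into any writer.
fn write_report<W: Write>(out: &mut W, name: &str, count: usize) -> io::Result<()> {
    write!(out, "{}: ", name)?; // no trailing newline
    writeln!(out, "{} item(s)", count)?; // adds the newline
    Ok(())
}
```

Passing a Vec<u8> as the writer makes the output easy to inspect; passing io::stdout() or a File works the same way.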
The Write
trait has these methods:
Writer Method writer.write(&buf)
Writes some of the bytes in the slice buf
to the underlying stream.
Returns an io::Result<usize>. On success, gives the number of bytes written, which may be less than buf.len() if the stream accepts only part of the data.
This is the lowest-level method and is rarely called directly; write_all is usually what you want.
Writer Method writer.write_all(&buf)
Writes all the bytes in the slice buf
.
Returns Result<(), io::Error>
.
Writer Method writer.flush()
Flushes any buffered data to the underlying stream.
Returns Result<(), io::Error>
.
Warning: When a BufWriter is dropped, all remaining buffered data is written to the underlying writer. However, if an error occurs during this write, the error is ignored. To make sure errors don't get swallowed, always call .flush() on all buffered writers before dropping them.
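A minimal sketch of the flush-before-drop habit, using a hypothetical helper named write_lines that wraps any writer in a BufWriter:

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper: writes each line through a BufWriter and
/// flushes explicitly so that a late write error is reported.
fn write_lines<W: Write>(inner: W, lines: &[&str]) -> io::Result<()> {
    let mut out = BufWriter::new(inner);
    for line in lines {
        writeln!(out, "{}", line)?;
    }
    // Without this call, an error during the implicit flush in Drop
    // would be silently discarded.
    out.flush()
}
```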
Files
We've got two main ways to open a file:
File Method File::open(filename)
Opens an existing file for reading. It's an error if the file doesn't exist.
Returns an io::Result<File>
.
File Method File::create(filename)
Creates a new file for writing. If a file exists with the given filename, it gets truncated.
Returns an io::Result<File>
.
There is an alternative that uses OpenOptions to specify the exact open behavior we want.

    use std::fs::OpenOptions;

    // Create a file if none exists, or append to an existing one
    let log = OpenOptions::new()
        .create(true)
        .append(true)
        .open("server.log")?;

    // Create a file, or fail if one with the specified name already exists
    let new_file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open("new_file.txt")?;
A File is itself a reader and a writer, so, just like any other reader or writer, you can wrap it in a BufReader or BufWriter if you need buffering.
Term The method-chaining pattern seen with
OpenOptions
is called a builder in Rust.
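As a sketch of the builder in use, the hypothetical helper append_line below opens a file in create-or-append mode and adds one line to it. The options chosen here are one reasonable combination, not the only one.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: appends `line` to the file at `path`,
/// creating the file first if it doesn't exist.
fn append_line(path: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true) // create the file if it is missing
        .append(true) // always write at the end
        .open(path)?;
    writeln!(file, "{}", line)
}
```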
Seeking
Files also implement the Seek
trait, which means you can hop around within a File
rather than reading or writing in a single pass from the beginning to the end.
Seek
is defined like this:
    pub trait Seek {
        fn seek(&mut self, pos: SeekFrom) -> io::Result<u64>;
    }

    pub enum SeekFrom {
        Start(u64),
        End(i64),
        Current(i64),
    }
Seeking within a file is slow compared with sequential access, so prefer reading or writing in a single pass when you can.
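A small sketch of the three SeekFrom variants at work, assuming a hypothetical helper named read_tail. It works on anything that implements both Read and Seek, such as a File or an in-memory io::Cursor.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: returns the last `n` bytes of `source`
/// (or everything, if the source is shorter than `n`).
fn read_tail<R: Read + Seek>(source: &mut R, n: u64) -> io::Result<Vec<u8>> {
    // Seeking to End(0) returns the total length in bytes.
    let len = source.seek(SeekFrom::End(0))?;
    let start = len.saturating_sub(n);
    source.seek(SeekFrom::Start(start))?;
    let mut tail = Vec::new();
    source.read_to_end(&mut tail)?;
    Ok(tail)
}
```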
Other Reader and Writer Types
Add notes about common types of readers and writers.
Handy Readers and Writers
The std::io module offers a few functions that return trivial readers and writers.
io::sink()
No-op writer. All the write methods return Ok
and the data is discarded.
io::empty()
No-op reader. Reading always succeeds and returns end-of-input.
io::repeat(byte)
Creates a reader that repeats the given byte endlessly.
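These trivial readers and writers are mostly useful as stand-ins. A sketch with two hypothetical helpers: drain counts a reader's bytes by copying them into io::sink(), and filled builds a buffer from io::repeat().

```rust
use std::io::{self, Read};

/// Hypothetical helper: consumes the reader and reports how many
/// bytes it produced, discarding the data itself.
fn drain<R: Read>(reader: &mut R) -> io::Result<u64> {
    io::copy(reader, &mut io::sink())
}

/// Hypothetical helper: builds a Vec of `n` copies of `byte`.
/// `take` is what keeps the endless repeat reader finite.
fn filled(byte: u8, n: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    io::repeat(byte).take(n).read_to_end(&mut out)?;
    Ok(out)
}
```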
Binary Data, Compression, and Serialization
Go here for some crate recommendations.
Files and Directories
OsStr
and Path
Rust strings are always valid Unicode. Filenames are almost always Unicode in practice, but not always.
To handle this mismatch, Rust provides std::ffi::OsStr and std::ffi::OsString.
std::ffi::OsStr
OsStr is a string type that's a superset of UTF-8. Its sole purpose is to be able to represent all filenames, CLI arguments, and environment variables on the current system, whether they're valid Unicode or not.
std::path::Path
Path
is exactly like OsStr
, but it provides a bunch of handy filename-related methods.
When to use which?
For absolute and relative paths, use Path
.
For an individual component of a path, use OsStr
.
Owning types
For each string type, there's always a corresponding owning type that owns heap-allocated data.
String type | Owning type | Conversion method
--|--|--
str | String | .to_string()
OsStr | OsString | .to_os_string()
Path | PathBuf | .to_path_buf()
All three of these string types implement a common trait, AsRef<Path>
, which makes it easy to declare a generic function that accepts "any filename type" as an argument.
    use std::path::Path;
    use std::io;

    fn open_file<P>(path_arg: P) -> io::Result<()>
        where P: AsRef<Path>
    {
        let path = path_arg.as_ref();
        // ...
    }
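To see what the AsRef<Path> bound buys, here is a sketch of a complete generic function. The name has_extension is hypothetical; the point is that one signature accepts &str, String, PathBuf, and &OsStr alike.

```rust
use std::path::Path;

/// Hypothetical helper: reports whether the path's extension equals `ext`.
fn has_extension<P: AsRef<Path>>(path: P, ext: &str) -> bool {
    // as_ref() gives a &Path no matter which filename type was passed in.
    path.as_ref()
        .extension()
        .map_or(false, |found| found == ext)
}
```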
Path
and PathBuf
Methods
Path Method Path::new(str)
    fn new<S: AsRef<OsStr> + ?Sized>(s: &S) -> &Path
Converts a &str
or &OsStr
to a &Path
. The string doesn't get copied; the new &Path
points to the same bytes as the original argument.
Path Method path.parent()
    fn parent(&self) -> Option<&Path>
Returns the path's parent directory, if any. The path doesn't get copied; the parent directory of path
is always a substring of path
.
Path Method path.file_name()
    fn file_name(&self) -> Option<&OsStr>
Returns the last component of path
, if any.
Path Methods path.is_absolute()
and path.is_relative()
    fn is_absolute(&self) -> bool
    fn is_relative(&self) -> bool
Tells you whether the path is absolute or relative.
Path Method path1.join(path2)
    fn join<P: AsRef<Path>>(&self, path: P) -> PathBuf
Joins two paths. If path2
is an absolute path, it just returns a copy of path2
.
Use Case: The path join method can be used to turn any path into an absolute path.

    let abs_path = std::env::current_dir()?.join(any_path);
Path Method path.components()
    fn components(&self) -> Components
Creates an iterator over the components of the given path, from left to right. The Item
type of the iterator is std::path::Component
, which is an enum:
    pub enum Component<'a> {
        Prefix(PrefixComponent<'a>),
        RootDir,
        CurDir,
        ParentDir,
        Normal(&'a OsStr),
    }
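A brief sketch combining several of the Path methods above. The helper names normal_parts and sibling are hypothetical, and the example paths in the usage are assumed to be Unix-style.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: returns only the Normal components of a path,
/// skipping the root, `.` and `..`.
fn normal_parts(path: &Path) -> Vec<String> {
    path.components()
        .filter_map(|component| match component {
            Component::Normal(name) => Some(name.to_string_lossy().into_owned()),
            _ => None,
        })
        .collect()
}

/// Hypothetical helper: replaces the last component of `path` with `name`,
/// or returns None if the path has no parent.
fn sibling(path: &Path, name: &str) -> Option<PathBuf> {
    path.parent().map(|parent| parent.join(name))
}
```

For instance, sibling(Path::new("/usr/local/bin"), "lib") yields /usr/local/lib, while a bare root path has no parent and yields None.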
Converting Paths to Strings
Path Method path.to_str()
    fn to_str(&self) -> Option<&str>
If path
isn't valid UTF-8, this method returns None
.
Path Method path.to_string_lossy()
    fn to_string_lossy(&self) -> Cow<str>

Basically the same as to_str, but it'll always return a string regardless of whether or not the path is valid UTF-8. In the case that it's not valid, each invalid byte sequence is replaced with the Unicode replacement character, �.
Path Method path.display()
    fn display(&self) -> Display

Doesn't return a string; instead it returns a value that implements Display, so it can be used with the print! macro and friends.
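The three conversions can be compared side by side. In this sketch, label and announce are hypothetical helpers: the first prefers the exact string and falls back to the lossy form, the second formats the path straight into a message.

```rust
use std::path::Path;

/// Hypothetical helper: exact string if the path is valid UTF-8,
/// otherwise a lossy rendering that is marked as such.
fn label(path: &Path) -> String {
    match path.to_str() {
        Some(text) => text.to_string(),
        None => format!("{} (lossy)", path.to_string_lossy()),
    }
}

/// Hypothetical helper: `display()` plugs directly into format strings.
fn announce(path: &Path) -> String {
    format!("opening {}", path.display())
}
```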
Filesystem Access Functions
Reading Directories
To list the contents of a directory, use std::fs::read_dir
, or the .read_dir()
method of a Path
:
    // `path` is assumed to be a &Path (or PathBuf) that names a directory
    for entry_result in path.read_dir()? {
        let entry = entry_result?;
        println!("{}", entry.file_name().to_string_lossy());
    }
The std::fs::read_dir function has the following type signature:

    fn read_dir<P: AsRef<Path>>(path: P) -> Result<ReadDir>
A DirEntry
is a struct with a few methods that have the following signatures:
    struct DirEntry(_);

    fn path(&self) -> PathBuf
    fn metadata(&self) -> Result<Metadata>
    fn file_type(&self) -> Result<FileType>
    fn file_name(&self) -> OsString
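Putting read_dir and DirEntry together, a hypothetical helper named list_names might collect the entry names of a directory. Sorting is added here only because the order in which read_dir yields entries is not guaranteed.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns the names of all entries in `dir`, sorted.
fn list_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry_result in fs::read_dir(dir)? {
        let entry = entry_result?; // each item is an io::Result<DirEntry>
        names.push(entry.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}
```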
Platform-Specific Features
The std::os
module contains a bunch of platform-specific features, like symlink
.
If you want code to compile on all platforms, with support for symbolic links on Unix, for instance, you must use #[cfg]
in the program as well. In such cases, it's easiest to import symlink
on Unix, while defining a symlink
stub on other systems:
    use std::io;
    use std::path::Path;

    #[cfg(unix)]
    use std::os::unix::fs::symlink;

    // Stub implementation of symlink for platforms that don't have it
    #[cfg(not(unix))]
    fn symlink<P: AsRef<Path>, Q: AsRef<Path>>(src: P, _dst: Q) -> std::io::Result<()> {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("can't copy symbolic link {}", src.as_ref().display())
        ))
    }
There's a prelude
module that can be used to enable all Unix extensions at once:
    use std::os::unix::prelude::*;
Networking
For low-level networking code, start with the std::net
module.