public class TextFile extends Object
Modifier and Type | Method and Description |
---|---|
void | delete() |
boolean | exists() |
String | fastTail(int numChars) Uses the platform default encoding. |
String | fastTail(int numChars, Charset cs) Efficiently reads the last N characters (or fewer, if the whole file is shorter than that). |
String | head(int numChars) Reads the first N characters or until we hit EOF. |
Iterable<String> | lines() Deprecated. This method does not properly propagate errors and may lead to file descriptor leaks if the collection is not fully iterated. Use linesStream() instead. |
LinesStream | linesStream() Creates a new LinesStream of the file. |
String | read() Reads the entire contents and returns it. |
String | readTrim() |
String | toString() |
void | write(String text) Overwrites the file with the given string. |
@NonNull public final File file
public TextFile(@NonNull File file)
public boolean exists()
public void delete()
public String read() throws IOException
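Throws: IOException
@Deprecated @NonNull public Iterable<String> lines()
Deprecated. This method does not properly propagate errors and may lead to file descriptor leaks if the collection is not fully iterated. Use linesStream() instead.
Throws: RuntimeException - in the case of IOException in linesStream()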
@CreatesObligation @NonNull public LinesStream linesStream() throws IOException
Creates a new LinesStream of the file.
Note: The caller is responsible for closing the returned LinesStream.
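The closing obligation described in the note is the usual try-with-resources pattern. TextFile itself needs Jenkins on the classpath, so the sketch below uses the JDK's Files.lines as a stand-in that carries the same obligation; the class name LinesSketch and the method countNonBlank are hypothetical. Assuming LinesStream is AutoCloseable, which the note about closing suggests, a call to linesStream() would presumably be wrapped the same way.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical stand-in: illustrates the close-after-use pattern with JDK classes only.
public class LinesSketch {
    /** Counts non-blank lines; the file handle is released when the try block exits. */
    public static long countNonBlank(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }
}
```

The handle is closed even if processing throws or stops early, which is the failure mode the deprecation notice on lines() warns about.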
Throws: IOException - if the file cannot be converted to a Path or if the file cannot be opened for reading
public void write(String text) throws IOException
Overwrites the file with the given string.
Throws: IOException
@NonNull public String head(int numChars) throws IOException
Reads the first N characters or until we hit EOF.
Throws: IOException
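A minimal sketch of "first N characters or until EOF", written against the JDK only. It is an illustration rather than the class's actual code: the names HeadSketch and head(Path, int, Charset) are hypothetical, and the explicit Charset parameter is an assumption, since the method documented here takes no charset.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of TextFile.
public class HeadSketch {
    /** Returns up to numChars characters from the start of the file. */
    public static String head(Path path, int numChars, Charset cs) throws IOException {
        char[] buf = new char[numChars];
        int filled = 0;
        try (Reader reader = Files.newBufferedReader(path, cs)) {
            // A single read() may return fewer chars than requested, so loop until
            // the buffer is full or the reader reports end of file (-1).
            while (filled < numChars) {
                int n = reader.read(buf, filled, numChars - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return new String(buf, 0, filled);
    }
}
```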
@NonNull public String fastTail(int numChars, Charset cs) throws IOException
Efficiently reads the last N characters (or fewer, if the whole file is shorter than that).
This method first tries to read just the tail section of the file to get the necessary chars. To handle multi-byte variable-length encodings (such as UTF-8), we read a larger chunk than strictly necessary.
Some multi-byte encodings, such as Shift-JIS (http://en.wikipedia.org/wiki/Shift_JIS), do not allow the first byte and the second byte of a single char to be unambiguously identified, so it is possible that we end up decoding incorrectly if we start reading in the middle of a multi-byte character. All the CJK multi-byte encodings that I know of are self-correcting; as they are ASCII-compatible, any ASCII characters or control characters will bring the decoding back in sync, so in the worst case we just have some garbage at the beginning that needs to be discarded. To accommodate this, we read an additional 1024 bytes.
Other encodings, such as UTF-8, are better in that the character boundary is unambiguous, so there can be at most one garbage char. To deal with UTF-16 and UTF-32, we read at a 4-byte boundary (all the constants and multipliers are multiples of 4).
Note that it is possible to construct a contrived input that fools this algorithm, and in this method we are willing to live with a small possibility of that to avoid reading the whole text. In practice, such an input is very unlikely.
All in all, this algorithm should work decently, and it works quite efficiently on a large text.
Throws: IOException
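The description above can be sketched with JDK classes alone. This is an illustration under stated assumptions, not the class's actual source: the names TailSketch and tail are hypothetical, and the window size of 4 bytes per requested char plus a 1024-byte margin is inferred from the text ("multiples of 4", "additional 1024 bytes").

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.file.Path;

// Hypothetical helper, not part of TextFile.
public class TailSketch {
    /** Returns up to the last numChars characters, decoding only the end of the file. */
    public static String tail(Path path, int numChars, Charset cs) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            long len = raf.length();
            // Assumed window: 4 bytes per requested char, plus a 1024-byte margin that
            // lets the decoder resynchronize if we start in the middle of a character.
            long window = (long) numChars * 4 + 1024;
            long pos = Math.max(0, len - window);
            raf.seek(pos);

            byte[] bytes = new byte[(int) (len - pos)];
            raf.readFully(bytes);

            // Charset.decode substitutes a replacement character for malformed input,
            // so any garbage from a bad starting offset stays at the front of the string.
            String decoded = cs.decode(ByteBuffer.wrap(bytes)).toString();
            return decoded.substring(Math.max(0, decoded.length() - numChars));
        }
    }
}
```

Because only the final substring is returned, any garbage decoded at the start of the window is discarded whenever the window holds more than numChars characters, which is the trade-off the description accepts.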
@NonNull public String fastTail(int numChars) throws IOException
Uses the platform default encoding.
Throws: IOException
public String readTrim() throws IOException
Throws: IOException
Copyright © 2004–2021. All rights reserved.