Package org.freebsd.file
Class FileEncoding
- java.lang.Object
  - org.freebsd.file.FileEncoding
public class FileEncoding extends java.lang.Object
Tries to guess the encoding of a byte sequence. Original code taken from https://github.com/file/file/blob/master/src/encoding.c
-
-
Field Summary
Fields
Modifier and Type Field Description
private java.lang.String
code
private java.lang.String
code_mime
private static char[]
ebcdic_1047_to_8859
private static char[]
ebcdic_to_ascii
private static byte
F
private static byte
I
private static byte
T
private byte[]
text_chars
private java.lang.String
type
private static byte
X
-
Constructor Summary
Constructors
Constructor Description
FileEncoding()
-
Method Summary
Modifier and Type Method Description
private byte[]
from_ebcdic(byte[] buf, int nbytes)
java.lang.String
getCode()
java.lang.String
getCodeMime()
java.lang.String
getType()
boolean
guessFileEncoding(byte[] buf)
Tries to determine whether the text is in a character encoding we can identify.
private boolean
looks_ascii(byte[] buf, int nbytes)
private boolean
looks_extended(byte[] buf, int nbytes)
private boolean
looks_latin1(byte[] buf, int nbytes)
private int
looks_ucs16(byte[] buf, int nbytes)
private boolean
looks_utf7(byte[] buf, int nbytes)
protected int
looks_utf8(byte[] buf, int nbytes)
private boolean
looks_utf8_with_BOM(byte[] buf, int nbytes)
private int
unsignedByte(byte value)
-
-
-
Field Detail
-
type
private java.lang.String type
-
code
private java.lang.String code
-
code_mime
private java.lang.String code_mime
-
F
private static final byte F
- See Also:
- Constant Field Values
-
T
private static final byte T
- See Also:
- Constant Field Values
-
I
private static final byte I
- See Also:
- Constant Field Values
-
X
private static final byte X
- See Also:
- Constant Field Values
-
text_chars
private byte[] text_chars
-
ebcdic_to_ascii
private static final char[] ebcdic_to_ascii
-
ebcdic_1047_to_8859
private static final char[] ebcdic_1047_to_8859
-
-
Method Detail
-
getCodeMime
public java.lang.String getCodeMime()
-
getType
public java.lang.String getType()
-
getCode
public java.lang.String getCode()
-
guessFileEncoding
public boolean guessFileEncoding(byte[] buf)
Tries to determine whether the text is in a character encoding we can identify. It also identifies EBCDIC by converting it to ISO-8859-1.
- Returns:
- true if it could guess an encoding.
-
looks_ascii
private boolean looks_ascii(byte[] buf, int nbytes)
-
looks_latin1
private boolean looks_latin1(byte[] buf, int nbytes)
-
looks_extended
private boolean looks_extended(byte[] buf, int nbytes)
-
looks_utf8
protected int looks_utf8(byte[] buf, int nbytes)
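This page does not document how looks_utf8 works or what its int result means. As a rough illustration of the underlying question (is this byte range well-formed UTF-8?), the sketch below uses the JDK's strict CharsetDecoder instead. This is a swapped-in technique, not the method's implementation; the class name Utf8CheckSketch and the method isValidUtf8 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of FileEncoding.
final class Utf8CheckSketch {

    // Returns true if the first nbytes of buf decode as well-formed UTF-8.
    static boolean isValidUtf8(byte[] buf, int nbytes) {
        // Clamp the length so a too-large or negative nbytes cannot overrun the array.
        int len = Math.max(0, Math.min(nbytes, buf.length));
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // throw instead of substituting U+FFFD
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(buf, 0, len));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Note that a multi-byte sequence cut off at the end of the range is reported as invalid by this sketch, which matters when only a prefix of a file is sampled.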
-
looks_utf8_with_BOM
private boolean looks_utf8_with_BOM(byte[] buf, int nbytes)
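The method body is not shown on this page. A UTF-8 byte order mark is the three-byte prefix EF BB BF, so a minimal sketch of just the prefix test could look like the following. The class name Utf8BomSketch and the method startsWithUtf8Bom are hypothetical, and the real method may also validate the bytes that follow the mark.

```java
// Hypothetical helper, not part of FileEncoding.
final class Utf8BomSketch {

    // Returns true if the first nbytes of buf begin with the UTF-8 BOM (EF BB BF).
    static boolean startsWithUtf8Bom(byte[] buf, int nbytes) {
        if (buf == null || nbytes < 3 || buf.length < 3) {
            return false;
        }
        // Java bytes are signed, so mask with 0xFF before comparing to values above 0x7F.
        return (buf[0] & 0xFF) == 0xEF
            && (buf[1] & 0xFF) == 0xBB
            && (buf[2] & 0xFF) == 0xBF;
    }
}
```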
-
looks_utf7
private boolean looks_utf7(byte[] buf, int nbytes)
-
looks_ucs16
private int looks_ucs16(byte[] buf, int nbytes)
-
from_ebcdic
private byte[] from_ebcdic(byte[] buf, int nbytes)
-
unsignedByte
private int unsignedByte(byte value)
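The implementation is not shown here. Java's byte type is signed (-128 to 127), while the C code this class derives from works on unsigned bytes, so a helper with this signature would most plausibly widen the byte to the range 0 to 255. A minimal sketch under that assumption, with the hypothetical class name UnsignedByteSketch:

```java
// Hypothetical stand-in, assuming unsignedByte maps a signed byte to 0..255.
final class UnsignedByteSketch {

    static int unsignedByte(byte value) {
        // Masking discards the sign extension applied when byte is widened to int.
        return value & 0xFF;
    }
}
```

The standard library offers the same conversion as Byte.toUnsignedInt(byte) since Java 8.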
-
-