module Sub: sig .. end
Substrings.
A substring defines a possibly empty subsequence of bytes in a base string.
The positions of a string s of length l are the slits found before each byte and after the last byte of the string. They are labelled from left to right by increasing number in the range [0;l].
positions  0   1   2   3   4    l-1    l
           +---+---+---+---+     +-----+
  indices  | 0 | 1 | 2 | 3 | ... | l-1 |
           +---+---+---+---+     +-----+
The i-th byte index is between positions i and i+1.
Formally we define a substring of s as being a subsequence of bytes defined by a start and a stop position. The former is always smaller than or equal to the latter. When both positions are equal the substring is empty. Note that for a given base string there are as many empty substrings as there are positions in the string.
Like in strings, we index the bytes of a substring using zero-based indices.
See how to use substrings to parse data.
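The position model above can be checked with a small sketch. It assumes the Astring library is installed; the helper name `empty_substrings` is hypothetical and only serves the illustration.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical helper: one empty substring per position of [s].
   A string of length l has l + 1 positions (0 to l inclusive). *)
let empty_substrings (s : string) : Sub.t list =
  List.init (String.length s + 1) (fun pos -> Sub.v ~start:pos ~stop:pos s)

let () =
  let s = "abc" in
  let empties = empty_substrings s in
  assert (List.length empties = 4);
  assert (List.for_all Sub.is_empty empties);
  (* The byte at index 1 lies between positions 1 and 2. *)
  assert (Sub.to_string (Sub.v ~start:1 ~stop:2 s) = "b")
```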
type t = Astring.String.sub
The type for substrings.
val empty : Astring.String.sub
empty is the empty substring of the empty string Astring.String.empty.
val v : ?start:int -> ?stop:int -> string -> Astring.String.sub
v ~start ~stop s is the substring of s that starts at position start (defaults to 0) and stops at position stop (defaults to String.length s).
Raises Invalid_argument if start or stop are not positions of s or if stop < start.
val start_pos : Astring.String.sub -> int
start_pos s is s's start position in the base string.
val stop_pos : Astring.String.sub -> int
stop_pos s is s's stop position in the base string.
val base_string : Astring.String.sub -> string
base_string s is s's base string.
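As a minimal sketch of the constructor and accessors above (assuming Astring is installed; the names `base` and `volt` are illustrative only):

```ocaml
module Sub = Astring.String.Sub

let base = "Revolt now!"

(* Positions 2 and 6 delimit the bytes "volt". *)
let volt = Sub.v ~start:2 ~stop:6 base

let () =
  assert (Sub.start_pos volt = 2);
  assert (Sub.stop_pos volt = 6);
  (* The substring refers to the base string itself; no bytes are copied. *)
  assert (Sub.base_string volt == base);
  assert (Sub.to_string volt = "volt");
  (* With both bounds omitted the substring spans the whole string. *)
  assert (Sub.to_string (Sub.v base) = base);
  (* stop < start is rejected. *)
  assert (match Sub.v ~start:4 ~stop:2 base with
          | _ -> false
          | exception Invalid_argument _ -> true)
```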
val length : Astring.String.sub -> int
length s is the number of bytes in s.
val get : Astring.String.sub -> int -> char
get s i is the byte of s at its zero-based index i.
Raises Invalid_argument if i is not an index of s.
val get_byte : Astring.String.sub -> int -> int
get_byte s i is Char.to_int (get s i).
val head : ?rev:bool -> Astring.String.sub -> char option
head s is Some (get s h) with h = 0 if rev = false (default) or h = length s - 1 if rev = true. None is returned if s is empty.
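The following sketch illustrates that indices passed to get and head are relative to the substring, not to the base string. It assumes Astring is installed; the name `now` is illustrative.

```ocaml
module Sub = Astring.String.Sub

(* "now!" starts at position 7 of the base string. *)
let now = Sub.v ~start:7 "Revolt now!"

let () =
  assert (Sub.length now = 4);
  (* Index 0 of the substring is byte 7 of the base string. *)
  assert (Sub.get now 0 = 'n');
  assert (Sub.get_byte now 0 = Char.code 'n');
  assert (Sub.head now = Some 'n');
  assert (Sub.head ~rev:true now = Some '!');
  assert (Sub.head Sub.empty = None)
```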
val get_head : ?rev:bool -> Astring.String.sub -> char
get_head s is like Astring.String.Sub.head but raises Invalid_argument if s is empty.
val of_string : string -> Astring.String.sub
of_string s is v s.
val to_string : Astring.String.sub -> string
to_string s is the bytes of s as a string.
val rebase : Astring.String.sub -> Astring.String.sub
rebase s is v (to_string s). This puts s on a base string made solely of its bytes.
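A short sketch of what rebasing changes and what it preserves, assuming Astring is installed. One plausible use, stated here as an assumption rather than documented intent, is to stop a small substring from keeping a large base string alive.

```ocaml
module Sub = Astring.String.Sub

let s = Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)
let r = Sub.rebase s

let () =
  (* The bytes are unchanged... *)
  assert (Sub.equal_bytes s r);
  (* ...but the base string now holds only those bytes. *)
  assert (Sub.base_string r = "volt");
  assert (Sub.start_pos r = 0 && Sub.stop_pos r = 4);
  assert (not (Sub.same_base s r))
```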
val hash : Astring.String.sub -> int
hash s is Hashtbl.hash s.
See the graphical guide.
val start : Astring.String.sub -> Astring.String.sub
start s is the empty substring at the start position of s.
val stop : Astring.String.sub -> Astring.String.sub
stop s is the empty substring at the stop position of s.
val base : Astring.String.sub -> Astring.String.sub
base s is a substring that spans the whole base string of s.
val tail : ?rev:bool -> Astring.String.sub -> Astring.String.sub
tail s is s without its first (rev is false, default) or last (rev is true) byte or s if it is empty.
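The four functions above can be sketched as follows, assuming Astring is installed. The helper `range` is hypothetical and simply exposes a substring's positions as a pair.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical helper: (start position, stop position) of a substring. *)
let range s = (Sub.start_pos s, Sub.stop_pos s)

let a = Sub.v ~start:2 ~stop:6 "Revolt now!"   (* "volt" *)

let () =
  assert (range (Sub.start a) = (2, 2));
  assert (range (Sub.stop a) = (6, 6));
  assert (range (Sub.base a) = (0, 11));
  assert (Sub.to_string (Sub.tail a) = "olt");
  assert (Sub.to_string (Sub.tail ~rev:true a) = "vol");
  (* The tail of an empty substring is still empty. *)
  assert (Sub.is_empty (Sub.tail Sub.empty))
```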
val extend : ?rev:bool ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
extend ~rev ~max ~sat s extends s by at most max consecutive sat satisfying bytes of the base string located after stop s (rev is false, default) or before start s (rev is true). If max is unspecified the extension is limited by the extents of the base string of s. sat defaults to fun _ -> true.
Raises Invalid_argument if max is negative.
val reduce : ?rev:bool ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
reduce ~rev ~max ~sat s reduces s by at most max consecutive sat satisfying bytes of s located before stop s (rev is false, default) or after start s (rev is true). If max is unspecified the reduction is limited by the extents of the substring s. sat defaults to fun _ -> true.
Raises Invalid_argument if max is negative.
val extent : Astring.String.sub -> Astring.String.sub -> Astring.String.sub
extent s s' is the smallest substring that includes all the positions of s and s'.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
val overlap : Astring.String.sub -> Astring.String.sub -> Astring.String.sub option
overlap s s' is the smallest substring that includes all the positions common to s and s' or None if there are no such positions. Note that the overlap substring may be empty.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
val append : Astring.String.sub -> Astring.String.sub -> Astring.String.sub
append s s' is like Astring.String.append. The substrings can be on different bases and the result is on a base string that holds exactly the appended bytes.
val concat : ?sep:Astring.String.sub -> Astring.String.sub list -> Astring.String.sub
concat ~sep ss is like Astring.String.concat. The substrings can all be on different bases and the result is on a base string that holds exactly the concatenated bytes.
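A minimal sketch of append and concat on substrings that live on different base strings, assuming Astring is installed. The values are illustrative.

```ocaml
module Sub = Astring.String.Sub

let rev = Sub.v ~start:0 ~stop:3 "Revolt now!"   (* "Rev" *)
let olution = Sub.v "olution"                    (* different base string *)

let appended = Sub.append rev olution
let joined = Sub.concat ~sep:(Sub.v ", ") [rev; olution]

let () =
  assert (Sub.to_string appended = "Revolution");
  (* The result sits on a fresh base holding exactly the appended bytes. *)
  assert (Sub.base_string appended = "Revolution");
  assert (Sub.to_string joined = "Rev, olution")
```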
val is_empty : Astring.String.sub -> bool
is_empty s is length s = 0.
val is_prefix : affix:Astring.String.sub -> Astring.String.sub -> bool
is_prefix is like Astring.String.is_prefix. Only bytes are compared; affix can be on a different base string.
val is_infix : affix:Astring.String.sub -> Astring.String.sub -> bool
is_infix is like Astring.String.is_infix. Only bytes are compared; affix can be on a different base string.
val is_suffix : affix:Astring.String.sub -> Astring.String.sub -> bool
is_suffix is like Astring.String.is_suffix. Only bytes are compared; affix can be on a different base string.
val for_all : (char -> bool) -> Astring.String.sub -> bool
for_all is like Astring.String.for_all on the substring.
val exists : (char -> bool) -> Astring.String.sub -> bool
exists is like Astring.String.exists on the substring.
val same_base : Astring.String.sub -> Astring.String.sub -> bool
same_base s s' is true iff the substrings s and s' have the same base string according to physical equality.
val equal_bytes : Astring.String.sub -> Astring.String.sub -> bool
equal_bytes s s' is true iff the substrings s and s' have exactly the same bytes. The substrings can be on a different base string.
val compare_bytes : Astring.String.sub -> Astring.String.sub -> int
compare_bytes s s' compares the bytes of s and s' in lexicographical order. The substrings can be on a different base string.
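The difference between comparing bytes and comparing bases can be sketched as follows, assuming Astring is installed. The values `x`, `y` and `z` are illustrative.

```ocaml
module Sub = Astring.String.Sub

let shared = "abcabc"
let x = Sub.v ~start:0 ~stop:3 shared   (* "abc", positions 0 to 3 *)
let y = Sub.v ~start:3 ~stop:6 shared   (* "abc", positions 3 to 6 *)
let z = Sub.v "abd"                     (* on its own base string *)

let () =
  (* Same base and same bytes, yet different positions. *)
  assert (Sub.same_base x y);
  assert (Sub.equal_bytes x y);
  assert (Sub.start_pos x <> Sub.start_pos y);
  (* Byte comparisons work across base strings. *)
  assert (not (Sub.same_base x z));
  assert (not (Sub.equal_bytes x z));
  assert (Sub.compare_bytes x z < 0)
```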
val equal : Astring.String.sub -> Astring.String.sub -> bool
equal s s' is true iff s and s' have the same positions.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
val compare : Astring.String.sub -> Astring.String.sub -> int
compare s s' compares the positions of s and s' in lexicographical order.
Raises Invalid_argument if s and s' are not on the same base string according to physical equality.
Extracted substrings are always on the same base string as the substring s acted upon.
val with_range : ?first:int -> ?len:int -> Astring.String.sub -> Astring.String.sub
with_range is like Astring.String.sub_with_range. The indices are the substring's zero-based ones, not those in the base string.
val with_index_range : ?first:int -> ?last:int -> Astring.String.sub -> Astring.String.sub
with_index_range is like Astring.String.sub_with_index_range. The indices are the substring's zero-based ones, not those in the base string.
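A small sketch of how substring-relative indices map back to base string positions, assuming Astring is installed. The names are illustrative.

```ocaml
module Sub = Astring.String.Sub

let now = Sub.v ~start:7 "Revolt now!"   (* "now!" *)

(* Indices 1 and 2 of [now], expressed two ways. *)
let by_len = Sub.with_range ~first:1 ~len:2 now
let by_last = Sub.with_index_range ~first:1 ~last:2 now

let () =
  assert (Sub.to_string by_len = "ow");
  (* Index 1 of the substring is position 8 of the base string. *)
  assert (Sub.start_pos by_len = 8 && Sub.stop_pos by_len = 10);
  (* Both calls select the same positions on the same base. *)
  assert (Sub.equal by_len by_last)
```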
val trim : ?drop:(char -> bool) -> Astring.String.sub -> Astring.String.sub
trim is like Astring.String.trim. If all bytes are dropped, returns an empty substring located in the middle of the argument.
val span : ?rev:bool ->
?min:int ->
?max:int ->
?sat:(char -> bool) ->
Astring.String.sub -> Astring.String.sub * Astring.String.sub
span is like Astring.String.span. For a substring s a left empty span is start s and a right empty span is stop s.
val take : ?rev:bool ->
?min:int ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
take is like Astring.String.take.
val drop : ?rev:bool ->
?min:int ->
?max:int -> ?sat:(char -> bool) -> Astring.String.sub -> Astring.String.sub
drop is like Astring.String.drop.
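The relationship between span, take and drop can be sketched as follows, assuming Astring is installed. The predicate `is_digit` is defined locally for the illustration.

```ocaml
module Sub = Astring.String.Sub

(* Local predicate used only for this illustration. *)
let is_digit c = '0' <= c && c <= '9'

let s = Sub.v "2024-06"
let digits, rest = Sub.span ~sat:is_digit s

let () =
  assert (Sub.to_string digits = "2024");
  assert (Sub.to_string rest = "-06");
  (* Both halves stay on the base string of the argument. *)
  assert (Sub.same_base digits rest);
  (* take keeps the matching span, drop keeps the remainder. *)
  assert (Sub.to_string (Sub.take ~sat:is_digit s) = "2024");
  assert (Sub.to_string (Sub.drop ~sat:is_digit s) = "-06");
  assert (Sub.to_string (Sub.take ~rev:true ~sat:is_digit s) = "06");
  (* When nothing matches, the left span is the empty substring start s. *)
  let l, r = Sub.span ~sat:is_digit (Sub.v "x1") in
  assert (Sub.is_empty l && Sub.start_pos l = 0);
  assert (Sub.to_string r = "x1")
```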
val cut : ?rev:bool ->
sep:Astring.String.sub ->
Astring.String.sub -> (Astring.String.sub * Astring.String.sub) option
cut is like Astring.String.cut. sep can be on a different base string.
val cuts : ?rev:bool ->
?empty:bool ->
sep:Astring.String.sub -> Astring.String.sub -> Astring.String.sub list
cuts is like Astring.String.cuts. sep can be on a different base string.
val fields : ?empty:bool ->
?is_sep:(char -> bool) -> Astring.String.sub -> Astring.String.sub list
fields is like Astring.String.fields.
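As a hedged illustration of parsing with cut, cuts and trim, the sketch below splits a `key = v1,v2,...` line. It assumes Astring is installed; the function `parse_binding` and the line format are hypothetical and not part of the library.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical parser for lines of the form "key = v1,v2,...".
   Intermediate results are substrings of the input; bytes are only
   copied by the final to_string calls. *)
let parse_binding (line : string) : (string * string list) option =
  match Sub.cut ~sep:(Sub.v "=") (Sub.v line) with
  | None -> None
  | Some (key, values) ->
      let key = Sub.trim key and values = Sub.trim values in
      (* ~empty:false drops the empty item between two adjacent commas. *)
      let items = Sub.cuts ~empty:false ~sep:(Sub.v ",") values in
      Some (Sub.to_string key, List.map Sub.to_string items)

let () =
  assert (parse_binding "key = a,b,,c" = Some ("key", ["a"; "b"; "c"]));
  assert (parse_binding "no separator" = None);
  (* fields splits on white space by default. *)
  let words = Sub.fields ~empty:false (Sub.v "a  b\tc") in
  assert (List.map Sub.to_string words = ["a"; "b"; "c"])
```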
val find : ?rev:bool ->
(char -> bool) -> Astring.String.sub -> Astring.String.sub option
find ~rev sat s is the substring of s (if any) that spans the first byte that satisfies sat in s after position start s (rev is false, default) or before stop s (rev is true). None is returned if there is no matching byte in s.
val find_sub : ?rev:bool ->
sub:Astring.String.sub -> Astring.String.sub -> Astring.String.sub option
find_sub ~rev ~sub s is the substring of s (if any) that spans the first match of sub in s after position start s (rev is false, default) or before stop s (rev is true). Only bytes are compared and sub can be on a different base string. None is returned if there is no match of sub in s.
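The results of find and find_sub are substrings of the searched value, so their positions can be read back directly. A minimal sketch, assuming Astring is installed and OCaml 4.08 or later for `Option.map`; the helper `range` is hypothetical.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical helper: (start position, stop position) of a substring. *)
let range s = (Sub.start_pos s, Sub.stop_pos s)

let s = Sub.v "Revolt now!"
let is_o c = c = 'o'

let () =
  (* Forward search finds the 'o' of "Revolt". *)
  assert (Option.map range (Sub.find is_o s) = Some (3, 4));
  (* Backward search finds the 'o' of "now". *)
  assert (Option.map range (Sub.find ~rev:true is_o s) = Some (8, 9));
  assert (Option.map range (Sub.find_sub ~sub:(Sub.v "now") s) = Some (7, 10));
  (* The needle may live on another base string: here "no" taken from "snow". *)
  let no = Sub.v ~start:1 ~stop:3 "snow" in
  assert (Option.map range (Sub.find_sub ~sub:no s) = Some (7, 9));
  assert (Option.map range (Sub.find_sub ~sub:(Sub.v "later") s) = None)
```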
val filter : (char -> bool) -> Astring.String.sub -> Astring.String.sub
filter sat s is like Astring.String.filter. The result is on a base string that holds only the filtered bytes.
val filter_map : (char -> char option) -> Astring.String.sub -> Astring.String.sub
filter_map f s is like Astring.String.filter_map. The result is on a base string that holds only the filtered bytes.
val map : (char -> char) -> Astring.String.sub -> Astring.String.sub
map is like Astring.String.map. The result is on a base string that holds only the mapped bytes.
val mapi : (int -> char -> char) -> Astring.String.sub -> Astring.String.sub
mapi is like Astring.String.mapi. The result is on a base string that holds only the mapped bytes. The indices are the substring's zero-based ones, not those in the base string.
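A short sketch of the traversal functions above, showing that results land on a new base string and that mapi receives substring-relative indices. It assumes Astring is installed; the values are illustrative.

```ocaml
module Sub = Astring.String.Sub

let revolt = Sub.v ~start:0 ~stop:6 "Revolt now!"   (* "Revolt" *)
let now = Sub.v ~start:7 "Revolt now!"              (* "now!" *)

let upper = Sub.map Char.uppercase_ascii revolt
let no_o = Sub.filter (fun c -> c <> 'o') revolt
let row = Sub.mapi (fun i c -> if i = 0 then 'r' else c) now

let () =
  assert (Sub.to_string upper = "REVOLT");
  (* The mapped bytes live on their own base string. *)
  assert (Sub.base_string upper = "REVOLT");
  assert (Sub.to_string no_o = "Revlt");
  (* Index 0 is the first byte of [now], not of the base string. *)
  assert (Sub.to_string row = "row!")
```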
val fold_left : ('a -> char -> 'a) -> 'a -> Astring.String.sub -> 'a
fold_left is like Astring.String.fold_left.
val fold_right : (char -> 'a -> 'a) -> Astring.String.sub -> 'a -> 'a
fold_right is like Astring.String.fold_right.
val iter : (char -> unit) -> Astring.String.sub -> unit
iter is like Astring.String.iter.
val iteri : (int -> char -> unit) -> Astring.String.sub -> unit
iteri is like Astring.String.iteri. The indices are the substring's zero-based ones, not those in the base string.
val pp : Stdlib.Format.formatter -> Astring.String.sub -> unit
pp ppf s prints s's bytes on ppf.
val dump : Stdlib.Format.formatter -> Astring.String.sub -> unit
dump ppf s prints s as a syntactically valid OCaml string on ppf using Astring.String.Ascii.escape_string.
val dump_raw : Stdlib.Format.formatter -> Astring.String.sub -> unit
dump_raw ppf s prints an unspecified raw internal representation of s on ppf.
val of_char : char -> Astring.String.sub
of_char c is a string that contains the byte c.
val to_char : Astring.String.sub -> char option
to_char s is the single byte in s or None if there is no byte or more than one in s.
val of_bool : bool -> Astring.String.sub
of_bool b is a string representation for b. Relies on Pervasives.string_of_bool.
val to_bool : Astring.String.sub -> bool option
to_bool s is a bool from s, if any. Relies on Pervasives.bool_of_string.
val of_int : int -> Astring.String.sub
of_int i is a string representation for i. Relies on Pervasives.string_of_int.
val to_int : Astring.String.sub -> int option
to_int s is an int from s, if any. Relies on Pervasives.int_of_string.
val of_nativeint : nativeint -> Astring.String.sub
of_nativeint i is a string representation for i. Relies on Nativeint.to_string.
val to_nativeint : Astring.String.sub -> nativeint option
to_nativeint s is a nativeint from s, if any. Relies on Nativeint.of_string.
val of_int32 : int32 -> Astring.String.sub
of_int32 i is a string representation for i. Relies on Int32.to_string.
val to_int32 : Astring.String.sub -> int32 option
to_int32 s is an int32 from s, if any. Relies on Int32.of_string.
val of_int64 : int64 -> Astring.String.sub
of_int64 i is a string representation for i. Relies on Int64.to_string.
val to_int64 : Astring.String.sub -> int64 option
to_int64 s is an int64 from s, if any. Relies on Int64.of_string.
val of_float : float -> Astring.String.sub
of_float f is a string representation for f. Relies on Pervasives.string_of_float.
val to_float : Astring.String.sub -> float option
to_float s is a float from s, if any. Relies on Pervasives.float_of_string.
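The conversion functions above can be applied directly to a slice of a larger string, without extracting the slice first. A minimal sketch, assuming Astring is installed; the input string is illustrative.

```ocaml
module Sub = Astring.String.Sub

(* Positions 4 and 6 delimit the bytes "42" in the input. *)
let forty_two = Sub.v ~start:4 ~stop:6 "abc=42;"

let () =
  assert (Sub.to_int forty_two = Some 42);
  (* Conversions return None instead of raising on malformed input. *)
  assert (Sub.to_int (Sub.v "4x") = None);
  assert (Sub.to_bool (Sub.v "true") = Some true);
  assert (Sub.to_string (Sub.of_int 7) = "7");
  (* to_char needs exactly one byte. *)
  assert (Sub.to_char (Sub.of_char 'z') = Some 'z');
  assert (Sub.to_char (Sub.v "ab") = None)
```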
+---+---+---+---+---+---+---+---+---+---+---+
| R | e | v | o | l | t |   | n | o | w | ! |
+---+---+---+---+---+---+---+---+---+---+---+
        |---------------|                      a
        |                                      start a
                        |                      stop a
            |-----------|                      tail a
        |-----------|                          tail ~rev:true a
        |-----------------------------------|  extend a
|-----------------------|                      extend ~rev:true a
|-------------------------------------------|  base a
|-----------|                                  b
|                                              start b
            |                                  stop b
    |-------|                                  tail b
|-------|                                      tail ~rev:true b
|-------------------------------------------|  extend b
|-----------|                                  extend ~rev:true b
|-------------------------------------------|  base b
|-----------------------|                      extent a b
        |---|                                  overlap a b
                            |                  c
                            |                  start c
                            |                  stop c
                            |                  tail c
                            |                  tail ~rev:true c
                            |---------------|  extend c
|---------------------------|                  extend ~rev:true c
|-------------------------------------------|  base c
        |-------------------|                  extent a c
None                                           overlap a c
                            |---------------|  d
                            |                  start d
                                            |  stop d
                                |-----------|  tail d
                            |-----------|      tail ~rev:true d
                            |---------------|  extend d
|-------------------------------------------|  extend ~rev:true d
|-------------------------------------------|  base d
                            |---------------|  extent d c
                            |                  overlap d c
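A few rows of the guide can be checked in code. The sketch below assumes Astring is installed and takes a, b and c to be the substrings of "Revolt now!" at positions [2;6], [0;3] and [7;7]; those positions are an assumption read off the bar lengths in the guide. The helper `range` is hypothetical.

```ocaml
module Sub = Astring.String.Sub

(* Hypothetical helper: (start position, stop position) of a substring. *)
let range s = (Sub.start_pos s, Sub.stop_pos s)

let base = "Revolt now!"
let a = Sub.v ~start:2 ~stop:6 base   (* "volt" *)
let b = Sub.v ~start:0 ~stop:3 base   (* "Rev" *)
let c = Sub.v ~start:7 ~stop:7 base   (* empty, at position 7 *)

let () =
  (* Without ~max, extension runs to the end of the base string. *)
  assert (range (Sub.extend a) = (2, 11));
  assert (range (Sub.extend ~rev:true a) = (0, 6));
  assert (range (Sub.extend ~max:2 a) = (2, 8));
  (* ~sat stops the extension at the first non-matching byte. *)
  assert (range (Sub.extend ~sat:(fun ch -> ch <> ' ') b) = (0, 6));
  (* By default reduce removes bytes before the stop position. *)
  assert (range (Sub.reduce ~max:1 a) = (2, 5));
  assert (range (Sub.extent a b) = (0, 6));
  assert (Option.map range (Sub.overlap a b) = Some (2, 3));
  (* a and c share no position. *)
  assert (Option.map range (Sub.overlap a c) = None)
```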