module Html: sig .. end
"<" is converted to "&lt;".
As the entities may be named, there is a dependency on the character
set.
val encode_from_latin1 : string -> string
val decode_to_latin1 : string -> string
val unsafe_chars_html4 : string
val encode : in_enc:Netconversion.encoding ->
?out_enc:Netconversion.encoding ->
?prefer_name:bool -> ?unsafe_chars:string -> unit -> string -> string
The input string, encoded as in_enc, is recoded to
out_enc, and the following characters are encoded as HTML
entities (&name; or &#num;):
- the ASCII characters contained in unsafe_chars
- the characters that cannot be represented in out_enc. By
  default (out_enc=`Enc_usascii), only ASCII characters can be
  represented, and thus all code points >= 128 are encoded as
  HTML entities. If you pass out_enc=`Enc_utf8, all characters
  can be represented.
For example, the string "(a<b) & (c>d)" is encoded as
"(a&lt;b) &amp; (c&gt;d)".
It is required that out_enc is an ASCII-compatible encoding.
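The encoding rules above can be sketched as follows. This is only an illustration: it assumes the module is reachable as Netencoding.Html from Ocamlnet's netstring library (the full module path is not stated in this section) and that the inputs are UTF-8.

```ocaml
(* Sketch only: assumes Ocamlnet's netstring library is linked and that
   this module is available as Netencoding.Html. *)
let input = "(a<b) & (c>d)"

(* out_enc is left at its default, `Enc_usascii; the unsafe characters
   of the input are written as entities. *)
let encoded = Netencoding.Html.encode ~in_enc:`Enc_utf8 () input

(* "\xc3\xa9" is U+00E9 in UTF-8. It cannot be represented in US-ASCII,
   so it is written as an entity by default, but it is passed through
   when out_enc is `Enc_utf8. *)
let ascii_out = Netencoding.Html.encode ~in_enc:`Enc_utf8 () "\xc3\xa9"

let utf8_out =
  Netencoding.Html.encode ~in_enc:`Enc_utf8 ~out_enc:`Enc_utf8 () "\xc3\xa9"

let () =
  print_endline encoded;
  print_endline ascii_out;
  print_endline utf8_out
```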
The option prefer_name selects whether named entities (e.g. &lt;)
or numeric entities (e.g. &#60;) are preferred.
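A small sketch of the two settings of prefer_name, again assuming the module is available as Netencoding.Html from Ocamlnet's netstring library. The option is passed explicitly in both calls because its default is not stated here.

```ocaml
(* Sketch only: assumes Netencoding.Html from Ocamlnet's netstring library. *)

(* Named entity preferred. *)
let named =
  Netencoding.Html.encode ~in_enc:`Enc_utf8 ~prefer_name:true () "<"

(* Numeric entity preferred. *)
let numeric =
  Netencoding.Html.encode ~in_enc:`Enc_utf8 ~prefer_name:false () "<"

let () = Printf.printf "%s %s\n" named numeric
```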
The efficiency of the function can be improved when the same encoding is applied to several strings. Create a specialized encoding function by passing all arguments up to the unit argument, and apply this function several times. For example:
let my_enc = encode ~in_enc:`Enc_utf8 () in
let s1' = my_enc s1 in
let s2' = my_enc s2 in ...
type entity_set = [ `Empty | `Html | `Xml ]
val decode : in_enc:Netconversion.encoding ->
out_enc:Netconversion.encoding ->
?lookup:(string -> string) ->
?subst:(int -> string) ->
?entity_base:entity_set -> unit -> string -> string
The input string is recoded from in_enc to out_enc, and HTML
entities (&name; or &#num;) are resolved. The input encoding
in_enc must be ASCII-compatible.
By default, the function knows all entities defined for HTML 4 (this
can be changed using entity_base, see below). If other
entities occur, the function lookup is called and the name of
the entity is passed as input string to the function. It is
expected that lookup returns the value of the entity, and that this
value is already encoded as out_enc.
By default, lookup raises a Failure exception.
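As an illustration of the lookup hook, the sketch below resolves one made-up entity. The entity name "myent" and its value are hypothetical, and the module path Netencoding.Html (Ocamlnet's netstring library) is an assumption.

```ocaml
(* Sketch only: assumes Netencoding.Html from Ocamlnet's netstring library. *)

(* Hypothetical resolver: knows the made-up entity "myent" and fails for
   anything else, as the default lookup does. The returned value must
   already be encoded as out_enc. *)
let lookup name =
  if name = "myent" then "X" else failwith ("unknown entity: " ^ name)

let decoded =
  Netencoding.Html.decode
    ~in_enc:`Enc_utf8 ~out_enc:`Enc_utf8 ~lookup () "a &lt; b &myent;"

let () = print_endline decoded
```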
If a character cannot be represented in the output encoding,
the function subst is called. subst must return a substitute
string for the character.
By default, subst raises a Failure exception.
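A possible use of subst is to decode into a narrow output encoding without failing. The sketch assumes Netencoding.Html from Ocamlnet's netstring library, and assumes that the integer passed to subst is the code point of the unrepresentable character.

```ocaml
(* Sketch only: assumes Netencoding.Html from Ocamlnet's netstring library. *)

(* Replace every character that US-ASCII cannot represent with "?".
   The argument is assumed to be the character's code point. *)
let subst _code_point = "?"

let decoded =
  Netencoding.Html.decode
    ~in_enc:`Enc_usascii ~out_enc:`Enc_usascii ~subst () "caf&eacute;"

let () = print_endline decoded
```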
The option entity_base determines which set of entities is
considered known, i.e. can be decoded without
help from the lookup function: `Html selects all entities defined
for HTML 4, `Xml selects only &lt;, &gt;, &amp;, &quot;,
and &apos;,
and `Empty selects the empty set (i.e. lookup is always called).
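The effect of entity_base can be sketched by pairing it with a lookup function that leaves unknown entities untouched. The helper name keep is hypothetical, and the module path Netencoding.Html (Ocamlnet's netstring library) is an assumption.

```ocaml
(* Sketch only: assumes Netencoding.Html from Ocamlnet's netstring library. *)

(* Hypothetical lookup that writes an unknown entity back unchanged. *)
let keep name = "&" ^ name ^ ";"

(* Only the XML entities are known; nbsp is handed to lookup. *)
let decode_xml =
  Netencoding.Html.decode
    ~in_enc:`Enc_utf8 ~out_enc:`Enc_utf8 ~lookup:keep ~entity_base:`Xml ()

(* No named entity is known; lookup is called even for lt. *)
let decode_none =
  Netencoding.Html.decode
    ~in_enc:`Enc_utf8 ~out_enc:`Enc_utf8 ~lookup:keep ~entity_base:`Empty ()

let xml_result = decode_xml "&lt;b&gt; &nbsp;"
let empty_result = decode_none "&lt;"

let () =
  print_endline xml_result;
  print_endline empty_result
```

Building decode_xml and decode_none once and applying them to many strings follows the same partial-application pattern recommended for encode.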