UTF8
-
class
UTF8
The UTF8 class encapsulates a utf8 encoded array of characters and allows for easy encoding and decoding.
Public Functions
-
UTF8 &
Assign
(UTF8 &&in_utf8) Moves the source UTF8 object to this object. This method is functionally equivalent to the overloaded assignment operator.
Parameters: in_utf8 – The source of the move. Returns: A reference to this object.
-
UTF8 &
Assign
(UTF8 const &in_utf8) Copies the source UTF8 object to this object. This method is functionally equivalent to the overloaded assignment operator.
Parameters: in_utf8 – The source of the copy. Returns: A reference to this object.
-
inline char
At
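(size_t in_index) const Retrieves the byte at the specified index of the utf8 encoded character array. Because a code point can span several bytes, this method may split up individual code points.
Parameters: in_index – The index of the byte to retrieve. Returns: The utf8 encoded byte at the specified index.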
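The warning about splitting code points follows from how UTF-8 works: every character outside ASCII occupies two to four bytes, and only the first of them can start a code point. The following is a minimal standalone sketch that uses std::string rather than the UTF8 class; the helper names is_continuation_byte and splits_code_point are hypothetical and not part of this API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if the byte has the bit pattern 10xxxxxx,
// which marks a UTF-8 continuation byte (it cannot start a code point).
inline bool is_continuation_byte(char byte) {
    return (static_cast<unsigned char>(byte) & 0xC0) == 0x80;
}

// Hypothetical helper: true if the byte at `index` lies inside a
// multi-byte sequence, so reading it alone yields only part of a character.
inline bool splits_code_point(const std::string &utf8, std::size_t index) {
    assert(index < utf8.size());
    return is_continuation_byte(utf8[index]);
}
```

A caller that needs whole characters could step backwards from such an index until it reaches a byte that is not a continuation byte.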
-
void
Clear
() Resets all string data.
-
inline bool
Empty
() const Indicates whether this utf8 string is empty.
Returns: true if the UTF8 string is empty, false otherwise.
-
inline char const *
GetBytes
() const Retrieves the raw, utf8 encoded character array.
Returns: The utf8 encoded character array.
-
size_t
GetHash
() const Returns a hash code for the utf8 encoded characters.
Returns: The size_t hash code.
-
inline size_t
GetLength
() const Retrieves the number of bytes in the utf8 encoded string up to but not including the null terminator. This will return 0 if the utf8 object is uninitialized.
Returns: The number of bytes.
-
inline size_t
GetWStrLength
() const Retrieves the number of wide characters in the wchar_t string up to but not including the null terminator. This will return 0 if the utf8 object is uninitialized.
Returns: The number of wide characters.
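The byte length and the wide character length differ whenever the string contains non-ASCII text. The sketch below illustrates the difference with a standalone helper, count_code_points, which is a hypothetical name and not part of this class. It assumes well-formed UTF-8 and counts code points; on platforms where wchar_t is 16 bits wide, the number of wide characters may differ from this count for characters outside the Basic Multilingual Plane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts code points in a null-terminated,
// well-formed UTF-8 string by counting every byte that is not a
// continuation byte (bit pattern 10xxxxxx).
inline std::size_t count_code_points(const char *utf8) {
    assert(utf8 != nullptr);
    std::size_t count = 0;
    for (const char *p = utf8; *p != '\0'; ++p) {
        if ((static_cast<unsigned char>(*p) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the five-character word "héllo", std::strlen reports 6 bytes because "é" takes two bytes, while count_code_points reports 5.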
-
inline bool
IsValid
() const Indicates whether this utf8 string has been initialized.
Returns: true if the UTF8 string has been initialized, false otherwise.
-
inline
operator char const*
() const Allows typecasting to const char * by retrieving the raw, utf8 encoded character array.
Returns: The utf8 encoded character array.
-
inline bool
operator!=
(char const *in_utf8) const This function is used to check a utf8-encoded character string for equivalence to this.
Parameters: in_utf8 – The utf8-encoded character string to compare to this. Returns: true if the objects are not equivalent, false otherwise.
-
inline bool
operator!=
(UTF8 const &in_utf8) const This function is used to check an object for equivalence to this.
Parameters: in_utf8 – The object to compare to this. Returns: true if the objects are not equivalent, false otherwise.
-
UTF8
operator+
(char const *in_utf8) const Creates a new UTF8 object by appending a utf8 encoded string to the end of this object.
Parameters: in_utf8 – A string, assumed to be utf8 encoded, used as the tail end of the new string. Returns: A new UTF8 object representing the concatenation of 2 strings.
-
UTF8
operator+
(UTF8 const &in_utf8) const Creates a new UTF8 object by appending a UTF8 object to the end of this object.
Parameters: in_utf8 – The tail end of the new string. Returns: A new UTF8 object representing the concatenation of 2 strings.
-
UTF8 &
operator+=
(char const *in_utf8) Appends a utf8 encoded string to the end of this object.
Parameters: in_utf8 – A string, assumed to be utf8 encoded, used as the tail end of the new string. Returns: A reference to this object.
-
UTF8 &
operator+=
(UTF8 const &in_utf8) Appends a UTF8 object to the end of this object.
Parameters: in_utf8 – The tail end of the new string. Returns: A reference to this object.
-
inline UTF8 &
operator=
(UTF8 &&in_utf8) The move assignment operator takes control of the underlying data from the source utf8 string.
Parameters: in_utf8 – The source of the move. Returns: A reference to this object.
-
inline UTF8 &
operator=
(UTF8 const &in_utf8) Copies the source UTF8 object to this object.
Parameters: in_utf8 – The source of the copy. Returns: A reference to this object.
-
bool
operator==
(char const *in_utf8) const This function is used to check a utf8-encoded character string for equivalence to this.
Parameters: in_utf8 – The utf8-encoded character string to compare to this. Returns: true if the objects are equivalent, false otherwise.
-
bool
operator==
(UTF8 const &in_utf8) const This function is used to check an object for equivalence to this.
Parameters: in_utf8 – The object to compare to this. Returns: true if the objects are equivalent, false otherwise.
-
inline void
Reset
() Resets this object to its initial, uninitialized state.
-
size_t
ToWStr
(wchar_t *out_wide_string) const Decodes a utf8 encoded string into a wide character buffer.
Parameters: out_wide_string – The wide character buffer that receives the decoded string. Returns: The number of wide characters (code points) in the wide string.
-
size_t
ToWStr
(WCharArray &out_wide_string) const Decodes a utf8 encoded string into a wide character buffer.
Parameters: out_wide_string – The wide character array that receives the decoded string. Returns: The number of wide characters (code points) in the wide string.
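To make the decoding step concrete, here is a minimal standalone sketch of turning UTF-8 bytes into code points. It is not the implementation behind ToWStr; the function name decode_utf8 is hypothetical, the output type is std::u32string rather than a wchar_t buffer, and the input is assumed to be well-formed UTF-8 (malformed input is not detected).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decodes well-formed UTF-8 into UTF-32 code points.
inline std::u32string decode_utf8(const std::string &utf8) {
    std::u32string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra = 0;  // number of continuation bytes that follow
        char32_t cp = 0;        // code point being assembled
        if (lead < 0x80) {
            extra = 0; cp = lead;            // 0xxxxxxx: ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F;     // 110xxxxx: 2-byte sequence
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F;     // 1110xxxx: 3-byte sequence
        } else {
            extra = 3; cp = lead & 0x07;     // 11110xxx: 4-byte sequence
        }
        assert(i + extra < utf8.size());     // sequence must not be truncated
        for (std::size_t j = 1; j <= extra; ++j) {
            // Each continuation byte contributes its low six bits.
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + j]) & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

The size of the returned string corresponds to the number of code points, which is the quantity the return value above describes.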
-
UTF8
(char const *in_string, char const *in_locale = 0) This constructor can be used to encode a string from any known locale to utf8. Be careful not to re-encode a string that’s already utf8 encoded.
Parameters: - in_string – The string to be encoded.
- in_locale – A string identifying the source locale of in_string. If none is specified, the default locale on the local machine will be used. If in_string is already utf8 encoded, specify the locale as “utf8” to prevent re-encoding.
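The caution about re-encoding can be illustrated without the UTF8 class. The sketch below assumes a source encoding of ISO-8859-1 (Latin-1) purely for illustration; the function latin1_to_utf8 is a hypothetical helper, not part of this API. Applying it to bytes that are already UTF-8 treats each byte as a separate character and produces the familiar garbled result.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes a string to UTF-8, treating every input
// byte as one ISO-8859-1 (Latin-1) character.
inline std::string latin1_to_utf8(const std::string &latin1) {
    std::string out;
    for (char ch : latin1) {
        const unsigned char byte = static_cast<unsigned char>(ch);
        if (byte < 0x80) {
            out.push_back(ch);  // ASCII is unchanged in UTF-8
        } else {
            // U+0080..U+00FF become a two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            out.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return out;
}
```

Encoding the Latin-1 byte 0xE9 ("é") once gives the correct UTF-8 bytes C3 A9. Encoding that result a second time gives C3 83 C2 A9, which displays as "Ã©". Specifying the locale as “utf8” for input that is already utf8 encoded avoids this kind of double encoding.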
-
UTF8
(UTF8 &&in_that) The move constructor takes control of the underlying data from the source utf8 string.
Parameters: in_that – The source of the move.
-
UTF8
(UTF8 const &in_that) The copy constructor copies the source utf8 string.
Parameters: in_that – The source to be copied.
-
UTF8
(wchar_t const *in_string) This constructor can be used to encode a wide character string to utf8.
Parameters: in_string – The string to be encoded.
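As a rough illustration of what encoding a wide character to utf8 involves, the sketch below converts a single code point to its UTF-8 byte sequence. It is not this constructor's implementation; encode_utf8 is a hypothetical name, and the input is a char32_t code point rather than a wchar_t. On platforms where wchar_t holds UTF-16, surrogate pairs would first have to be combined into a code point, which this sketch does not do.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
inline std::string encode_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF);  // highest valid Unicode code point
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));                          // 1 byte
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));            // 2 bytes
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));           // 3 bytes
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));           // 4 bytes
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

Encoding a whole string amounts to appending the result for each code point in order.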
Friends
-
inline friend bool
operator!=
(char const *in_left, UTF8 const &in_right) This function is used to check a utf8-encoded character string for equivalence to a UTF8 object.
Parameters: - in_left – A utf8-encoded character string.
- in_right – A UTF8 object.
Returns: true if the objects are not equivalent, false otherwise.
-
inline friend bool
operator!=
(wchar_t const *in_left, UTF8 const &in_right) This function is used to check a wide character string for equivalence to a UTF8 object.
Parameters: - in_left – A wide character string.
- in_right – A UTF8 object.
Returns: true if the objects are not equivalent, false otherwise.
-
inline friend UTF8
operator+
(char const *in_left, UTF8 const &in_right) Creates a new UTF8 object by appending a UTF8 object to the end of a utf8-encoded character string.
Parameters: - in_left – A string, assumed to be utf8 encoded, used as the head end of the new string.
- in_right – A UTF8 object used as the tail end of the new string.
Returns: A new UTF8 object representing the concatenation of 2 strings.
-
inline friend UTF8
operator+
(wchar_t const *in_left, UTF8 const &in_right) Creates a new UTF8 object by appending a UTF8 object to the end of a wide character string.
Parameters: - in_left – A wide character string used as the head end of the new string.
- in_right – A UTF8 object used as the tail end of the new string.
Returns: A new UTF8 object representing the concatenation of 2 strings.
-
inline friend bool
operator==
(char const *in_left, UTF8 const &in_right) This function is used to check a utf8-encoded character string for equivalence to a UTF8 object.
Parameters: - in_left – A utf8-encoded character string.
- in_right – A UTF8 object.
Returns: true if the objects are equivalent, false otherwise.
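These operators are declared as friends rather than members because the left operand is a raw string, not a UTF8 object, and a member operator can only be selected when the object is on the left. The sketch below shows the pattern on a hypothetical stand-in class named Text that wraps std::string; it is not the UTF8 class, and its comparison is a plain byte comparison chosen for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for illustration only; not the documented UTF8 class.
class Text {
public:
    explicit Text(const char *bytes) : bytes_(bytes) {}

    // Member form: handles `text == "literal"`.
    bool operator==(const char *rhs) const {
        return std::strcmp(bytes_.c_str(), rhs) == 0;
    }

    // Friend forms: handle `"literal" == text`, `"literal" != text`
    // and `"literal" + text`, where the raw string is on the left.
    friend bool operator==(const char *lhs, const Text &rhs) {
        return std::strcmp(lhs, rhs.bytes_.c_str()) == 0;
    }
    friend bool operator!=(const char *lhs, const Text &rhs) {
        return std::strcmp(lhs, rhs.bytes_.c_str()) != 0;
    }
    friend Text operator+(const char *lhs, const Text &rhs) {
        return Text((std::string(lhs) + rhs.bytes_).c_str());
    }

private:
    std::string bytes_;
};
```

With these overloads, comparisons and concatenations read the same regardless of which side the raw string is on.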
-
UTF8 &