String functions are used to manipulate string data.
concat(col1, col2, ...)
Concatenates arrays or strings. This function accepts any number of arguments and returns a string or an array.
endswith(col1, col2)
Returns a boolean indicating whether the first string argument ends with the second string argument.
format_time(col, format)
Format a datetime to string. The 'col' will be cast to datetime type if it is bigint, float or string type before formatting. Please check format patterns for how to compose the format.
A pattern is used to create a format string. Patterns are based on a simple sequence of letters and symbols which is common in many languages like Java, etc. The supported symbols in Kuiper are
Symbol | Meaning | Example |
---|---|---|
G | era | G(AD) |
Y | year | YYYY(2004), YY(04) |
M | month | M(1), MM(01), MMM(Jan), MMMM(January) |
d | day of month | d(2), dd(02) |
E | day of week | EEE(Mon), EEEE(Monday) |
H | hour in 24 hours format | HH(15) |
h | hour in 12 hours format | h(2), hh(03) |
a | AM or PM | a(PM) |
m | minute | m(4), mm(04) |
s | second | s(5), ss(05) |
S | fraction of second | S(.0), SS(.00), SSS(.000) |
z | time zone name | z(MST) |
Z | 4 digits time zone offset | Z(-0700) |
X | time zone offset | X(-07), XX(-0700), XXX(-07:00) |
Examples:
indexof(col1, col2)
Returns the first index (0-based) of the second argument as a substring in the first argument.
length(col)
Returns the number of characters in the provided string.
lower(col)
Returns the lowercase version of the given string.
lpad(col, 2)
Returns the string argument, padded on the left side with the number of spaces specified by the second argument.
ltrim(col)
Removes all leading whitespace (tabs and spaces) from the provided string.
numbytes(col)
Returns the number of bytes in the UTF-8 encoding of the provided string.
regexp_matches(col, regex)
Returns true if the string (first argument) contains a match for the regular expression.
regexp_replace(col, regex, replacement)
Replaces all substrings of the specified string value that matches regexp with replacement.
regexp_substring(col, regex)
Returns the first substring of the specified string value that matches regexp.
rpad(col, 2)
Returns the string argument, padded on the right side with the number of spaces specified by the second argument.
rtrim(col)
Removes all trailing whitespace (tabs and spaces) from the provided string.
substring(col, start, length)
Returns the substring of the specified string value starting at the specified index position (0-based, inclusive) for up to the specified length of characters.
startswith(col, str)
Returns a boolean indicating whether the first string argument starts with the second string argument.
split_value(col, str_splitter, index)
Split the value of the 1st parameter with the 2nd parameter, and return the value of split array that indexed with the 3rd parameter.
For example, split_value("/test/device001/message","/",0) AS a
, the returned value of function is empty;
split_value("/test/device001/message","/",3) AS a
, the returned value of function is message
.
trim(col)
Removes all leading and trailing whitespace (tabs and spaces) from the provided string.
upper(col)
Returns the uppercase version of the given string.