LostTech.TensorFlow : API Documentation

Type tf.strings

Namespace tensorflow

Public static methods

object bytes_split(RaggedTensor input, string name)

Split string elements of `input` into bytes.

Note that this op splits strings into bytes, not Unicode characters. To split strings into Unicode characters, use tf.strings.unicode_split.
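The difference between splitting into bytes and splitting into Unicode characters only shows up for non-ASCII text. The following is a minimal pure-Python sketch of that distinction, not the TensorFlow implementation; the helper names `bytes_split_sketch` and `unicode_split_sketch` are hypothetical and it assumes the strings are UTF-8 encoded.

```python
from typing import List


def bytes_split_sketch(text: str) -> List[bytes]:
    """Split a string into its individual UTF-8 bytes (hypothetical helper)."""
    data = text.encode("utf-8")
    # Slicing with [i:i + 1] keeps each element as a length-1 bytes object.
    return [data[i:i + 1] for i in range(len(data))]


def unicode_split_sketch(text: str) -> List[bytes]:
    """Split a string into Unicode characters, each re-encoded as UTF-8."""
    return [ch.encode("utf-8") for ch in text]


# "é" (U+00E9) occupies two bytes in UTF-8, so the two splits differ.
print(bytes_split_sketch("hé"))    # [b'h', b'\xc3', b'\xa9']
print(unicode_split_sketch("hé"))  # [b'h', b'\xc3\xa9']
```

For pure ASCII input such as 'hello', both helpers return the same five single-byte elements.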

See also: tf.io.decode_raw, tf.strings.split, tf.strings.unicode_split.
Parameters
RaggedTensor input
A string `Tensor` or `RaggedTensor`: the strings to split. Must have a statically known rank (`N`).
string name
A name for the operation (optional).
Returns
object
A `RaggedTensor` of rank `N+1`: the bytes that make up the source strings.
Example
>>> tf.strings.bytes_split('hello')
['h', 'e', 'l', 'l', 'o']
>>> tf.strings.bytes_split(['hello', '123'])
<RaggedTensor [['h', 'e', 'l', 'l', 'o'], ['1', '2', '3']]>

object bytes_split(IGraphNodeBase input, string name)

Split string elements of `input` into bytes.

Note that this op splits strings into bytes, not Unicode characters. To split strings into Unicode characters, use tf.strings.unicode_split.

See also: tf.io.decode_raw, tf.strings.split, tf.strings.unicode_split.
Parameters
IGraphNodeBase input
A string `Tensor` or `RaggedTensor`: the strings to split. Must have a statically known rank (`N`).
string name
A name for the operation (optional).
Returns
object
A `RaggedTensor` of rank `N+1`: the bytes that make up the source strings.
Example
>>> tf.strings.bytes_split('hello')
['h', 'e', 'l', 'l', 'o']
>>> tf.strings.bytes_split(['hello', '123'])
<RaggedTensor [['h', 'e', 'l', 'l', 'o'], ['1', '2', '3']]>

object bytes_split_dyn(object input, object name)

Split string elements of `input` into bytes.

Note that this op splits strings into bytes, not Unicode characters. To split strings into Unicode characters, use tf.strings.unicode_split.

See also: tf.io.decode_raw, tf.strings.split, tf.strings.unicode_split.
Parameters
object input
A string `Tensor` or `RaggedTensor`: the strings to split. Must have a statically known rank (`N`).
object name
A name for the operation (optional).
Returns
object
A `RaggedTensor` of rank `N+1`: the bytes that make up the source strings.
Example
>>> tf.strings.bytes_split('hello')
['h', 'e', 'l', 'l', 'o']
>>> tf.strings.bytes_split(['hello', '123'])
<RaggedTensor [['h', 'e', 'l', 'l', 'o'], ['1', '2', '3']]>

Tensor format(string template, IGraphNodeBase inputs, string placeholder, PythonFunctionContainer summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.
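To make the abbreviation rule concrete, here is a rough pure-Python sketch of how the first and last `summarize` elements might be kept for a one-dimensional input. It is an illustration under assumptions, not the op itself: `summarize_1d` and `format_sketch` are hypothetical names, only flat sequences are handled (no recursion into nested dimensions), and the sketch assumes that inputs with at most `2 * summarize` elements are printed in full.

```python
from typing import List, Sequence


def summarize_1d(values: Sequence, summarize: int = 3) -> str:
    """Render a flat sequence, keeping only the first and last `summarize` items."""
    items: List[str] = [str(v) for v in values]
    # A negative `summarize` means "show every element".
    if summarize >= 0 and len(items) > 2 * summarize:
        # len(items) - summarize avoids the items[-0:] pitfall when summarize == 0.
        items = items[:summarize] + ["..."] + items[len(items) - summarize:]
    return "[" + " ".join(items) + "]"


def format_sketch(template: str, inputs: Sequence[Sequence],
                  placeholder: str = "{}", summarize: int = 3) -> str:
    """Insert one summarized sequence at each placeholder, in order."""
    parts = template.split(placeholder)
    if len(parts) - 1 != len(inputs):
        raise ValueError("number of placeholders must match number of inputs")
    out = parts[0]
    for values, tail in zip(inputs, parts[1:]):
        out += summarize_1d(values, summarize) + tail
    return out


print(format_sketch("tensor: {}, suffix", [range(10)]))
# tensor: [0 1 2 ... 7 8 9], suffix
print(format_sketch("a: {}, b: {}", [range(10), range(4)], summarize=-1))
# a: [0 1 2 3 4 5 6 7 8 9], b: [0 1 2 3]
```

Unlike the real op, this sketch always expects `inputs` to be a list, even for a single sequence.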

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
string template
A string template to format tensor values into.
IGraphNodeBase inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
PythonFunctionContainer summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(string template, ValueTuple inputs, string placeholder, PythonFunctionContainer summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
string template
A string template to format tensor values into.
ValueTuple inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
PythonFunctionContainer summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(IEnumerable<object> template, ValueTuple inputs, string placeholder, PythonFunctionContainer summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
IEnumerable<object> template
A string template to format tensor values into.
ValueTuple inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
PythonFunctionContainer summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(IEnumerable<object> template, IGraphNodeBase inputs, string placeholder, PythonFunctionContainer summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
IEnumerable<object> template
A string template to format tensor values into.
IGraphNodeBase inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
PythonFunctionContainer summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(IEnumerable<object> template, ValueTuple inputs, string placeholder, ImplicitContainer<T> summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
IEnumerable<object> template
A string template to format tensor values into.
ValueTuple inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(IEnumerable<object> template, IEnumerable<object> inputs, string placeholder, PythonFunctionContainer summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
IEnumerable<object> template
A string template to format tensor values into.
IEnumerable<object> inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
PythonFunctionContainer summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(IEnumerable<object> template, IEnumerable<object> inputs, string placeholder, ImplicitContainer<T> summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
IEnumerable<object> template
A string template to format tensor values into.
IEnumerable<object> inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(IEnumerable<object> template, IGraphNodeBase inputs, string placeholder, ImplicitContainer<T> summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
IEnumerable<object> template
A string template to format tensor values into.
IGraphNodeBase inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(string template, IEnumerable<object> inputs, string placeholder, ImplicitContainer<T> summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
string template
A string template to format tensor values into.
IEnumerable<object> inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(string template, IEnumerable<object> inputs, string placeholder, PythonFunctionContainer summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
string template
A string template to format tensor values into.
IEnumerable<object> inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
PythonFunctionContainer summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(string template, ValueTuple inputs, string placeholder, ImplicitContainer<T> summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
string template
A string template to format tensor values into.
ValueTuple inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor format(string template, IGraphNodeBase inputs, string placeholder, ImplicitContainer<T> summarize, string name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
string template
A string template to format tensor values into.
IGraphNodeBase inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
string placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
string name
A name for the operation (optional).
Returns
Tensor
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

object format_dyn(object template, object inputs, ImplicitContainer<T> placeholder, ImplicitContainer<T> summarize, object name)

Formats a string template using a list of tensors.

Formats a string template using a list of tensors, abbreviating tensors by only printing the first and last `summarize` elements of each dimension (recursively). If formatting only one tensor into a template, the tensor does not have to be wrapped in a list.

Examples: a single-tensor template is formatted in the example below; a multi-tensor template takes one `Tensor` per placeholder, passed together as a list.
Parameters
object template
A string template to format tensor values into.
object inputs
A list of `Tensor` objects, or a single Tensor. The list of tensors to format into the template string. If a solitary tensor is passed in, the input tensor will automatically be wrapped as a list.
ImplicitContainer<T> placeholder
An optional `string`. Defaults to `{}`. At each placeholder occurring in the template, a subsequent tensor will be inserted.
ImplicitContainer<T> summarize
An optional `int`. Defaults to `3`. When formatting the tensors, show the first and last `summarize` entries of each tensor dimension (recursively). If set to -1, all elements of the tensor will be shown.
object name
A name for the operation (optional).
Returns
object
A scalar `Tensor` of type `string`.
Example
sess = tf.compat.v1.Session()
with sess.as_default():
    tensor = tf.range(10)
    formatted = tf.strings.format("tensor: {}, suffix", tensor)
    out = sess.run(formatted)
    expected = "tensor: [0 1 2 ... 7 8 9], suffix"

    assert(out.decode() == expected)

Tensor length(IEnumerable<Byte[]> input, string name, string unit)

String lengths of `input`.

Computes the length of each string given in the input tensor.
Parameters
IEnumerable<Byte[]> input
A `Tensor` of type `string`. The string for which to compute the length.
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is counted to compute string length. One of: `"BYTE"` (for the number of bytes in each string) or `"UTF8_CHAR"` (for the number of UTF-8 encoded Unicode code points in each string). Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `int32`.
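The two `unit` values give different counts as soon as a string contains multi-byte characters. Below is a small pure-Python sketch of that distinction, offered as an illustration rather than the op's implementation; `string_length_sketch` is a hypothetical helper, and it assumes each input is a `bytes` object holding valid UTF-8.

```python
def string_length_sketch(data: bytes, unit: str = "BYTE") -> int:
    """Count bytes or UTF-8 encoded code points in a byte string."""
    if unit == "BYTE":
        return len(data)
    if unit == "UTF8_CHAR":
        # Decoding raises UnicodeDecodeError on invalid UTF-8, whereas the
        # documented op leaves the result undefined in that case.
        return len(data.decode("utf-8"))
    raise ValueError('unit must be "BYTE" or "UTF8_CHAR"')


sample = "héllo".encode("utf-8")  # "é" takes two bytes in UTF-8
print(string_length_sketch(sample))               # 6
print(string_length_sketch(sample, "UTF8_CHAR"))  # 5
```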

object length_dyn(object input, object name, ImplicitContainer<T> unit)

String lengths of `input`.

Computes the length of each string given in the input tensor.
Parameters
object input
A `Tensor` of type `string`. The string for which to compute the length.
object name
A name for the operation (optional).
ImplicitContainer<T> unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is counted to compute string length. One of: `"BYTE"` (for the number of bytes in each string) or `"UTF8_CHAR"` (for the number of UTF-8 encoded Unicode code points in each string). Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
object
A `Tensor` of type `int32`.

Tensor lower(IGraphNodeBase input, string encoding, string name)

Converts all uppercase characters in `input` to their lowercase equivalents.
Parameters
IGraphNodeBase input
A `Tensor` of type `string`.
string encoding
An optional `string`. Defaults to `""`.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `string`.

object lower_dyn(object input, ImplicitContainer<T> encoding, object name)

Converts all uppercase characters in `input` to their lowercase equivalents.
Parameters
object input
A `Tensor` of type `string`.
ImplicitContainer<T> encoding
An optional `string`. Defaults to `""`.
object name
A name for the operation (optional).
Returns
object
A `Tensor` of type `string`.

object ngrams(IEnumerable<object> data, IEnumerable<int> ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

Creates a tensor of n-grams based on `data`. The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IEnumerable<object> data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width-1`. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, then ensure that at least one ngram is generated for each input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then generate a single ngram containing the entire sequence. If false, then no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
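The interaction of `pad_values`, `padding_width` and `preserve_short_sequences` can be sketched for a single sequence in plain Python. This is an illustrative approximation under assumptions, not the TensorFlow kernel: `ngrams_sketch` is a hypothetical helper, it accepts only one integer `ngram_width` and one flat list of strings, and its short-sequence fallback simply joins whatever tokens (including padding) are available.

```python
from typing import List, Optional, Sequence, Tuple, Union


def ngrams_sketch(tokens: Sequence[str],
                  ngram_width: int,
                  separator: str = " ",
                  pad_values: Union[None, str, Tuple[str, str]] = None,
                  padding_width: Optional[int] = None,
                  preserve_short_sequences: bool = False) -> List[str]:
    """Build n-grams of one width from one sequence of strings."""
    if ngram_width < 1:
        raise ValueError("ngram_width must be greater than 0")

    if pad_values is None:
        left, right, pad_count = "", "", 0
    else:
        # A single string pads both sides; a tuple gives (left, right).
        left, right = ((pad_values, pad_values)
                       if isinstance(pad_values, str) else pad_values)
        pad_count = ngram_width - 1 if padding_width is None else padding_width
        if ngram_width == 1:
            pad_count = 0  # 1-grams are never padded

    padded = [left] * pad_count + list(tokens) + [right] * pad_count
    # range() is empty when the padded sequence is shorter than ngram_width.
    grams = [separator.join(padded[i:i + ngram_width])
             for i in range(len(padded) - ngram_width + 1)]

    if not grams and preserve_short_sequences and tokens:
        grams = [separator.join(padded)]
    return grams


print(ngrams_sketch(["a", "b", "c"], 2))
# ['a b', 'b c']
print(ngrams_sketch(["a", "b", "c"], 2, pad_values=("LP", "RP")))
# ['LP a', 'a b', 'b c', 'c RP']
print(ngrams_sketch(["a"], 3, preserve_short_sequences=True))
# ['a']
```

With three tokens, width 2 and one pad value per side, the sketch yields 3 - 2 + 1 + 2*1 = 4 n-grams, matching the `NUM_NGRAMS` formula in the Returns description.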

object ngrams(IEnumerable<object> data, int ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

Creates a tensor of n-grams based on `data`. The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IEnumerable<object> data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width-1`. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, then ensure that at least one ngram is generated for each input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then generate a single ngram containing the entire sequence. If false, then no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(IEnumerable<object> data, int ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

Creates a tensor of n-grams based on `data`. The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IEnumerable<object> data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
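To illustrate the windowing described above, the following pure-Python sketch joins each window of `ngram_width` adjacent strings with `separator`. It does not call TensorFlow: the helper name `simple_ngrams` is hypothetical, the default separator of a single space is an assumption of the sketch, and only a flat list of strings without padding is covered.

```python
def simple_ngrams(tokens, ngram_width, separator=" "):
    """Join every window of `ngram_width` adjacent tokens with `separator`.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    # A sequence of length S has S - ngram_width + 1 windows (none if that is <= 0).
    num_windows = max(0, len(tokens) - ngram_width + 1)
    return [separator.join(tokens[i:i + ngram_width]) for i in range(num_windows)]


bigrams = simple_ngrams(["a", "b", "c", "d"], ngram_width=2)
print(bigrams)  # ['a b', 'b c', 'c d']
```

A sequence shorter than `ngram_width` yields an empty list in this sketch, which mirrors the "no ngrams will be generated" case in the description.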

object ngrams(RaggedTensor data, IEnumerable<int> ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
RaggedTensor data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
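When `ngram_width` is a collection, the parameter description says that ngrams of all specified widths are returned in list order. Below is a minimal pure-Python sketch of that ordering for a single flat sequence without padding. `multi_width_ngrams` is a hypothetical helper, not part of the library, and the exact layout of the real op's output is assumed here rather than verified.

```python
def multi_width_ngrams(tokens, ngram_widths, separator=" "):
    """Collect ngrams for each width in turn, in the order the widths are listed.

    Hypothetical helper for illustration only.
    """
    result = []
    for width in ngram_widths:
        # All windows of the current width come before any window of the next width.
        for start in range(max(0, len(tokens) - width + 1)):
            result.append(separator.join(tokens[start:start + width]))
    return result


grams = multi_width_ngrams(["a", "b", "c"], ngram_widths=[2, 3])
print(grams)  # ['a b', 'b c', 'a b c']
```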

object ngrams(RaggedTensor data, IEnumerable<int> ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
RaggedTensor data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
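The padding rules for `pad_values` and `padding_width` can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the library's implementation: `padded_ngrams` is a hypothetical helper, it handles one flat sequence and a single `ngram_width`, and it does not model what the real op does when `padding_width` exceeds `ngram_width-1`.

```python
def padded_ngrams(tokens, ngram_width, pad_values, padding_width=None, separator=" "):
    """Pad both ends of `tokens`, then join windows of `ngram_width` tokens.

    Hypothetical helper for illustration only.
    """
    # A single string pads both sides; a tuple supplies (left, right) pad values.
    if isinstance(pad_values, str):
        left_pad, right_pad = pad_values, pad_values
    else:
        left_pad, right_pad = pad_values
    # Default from the description: ngram_width - 1 pad values on each side.
    if padding_width is None:
        padding_width = ngram_width - 1
    # 1-grams are never padded, regardless of padding_width.
    if ngram_width == 1:
        padding_width = 0
    padded = [left_pad] * padding_width + list(tokens) + [right_pad] * padding_width
    num_windows = max(0, len(padded) - ngram_width + 1)
    return [separator.join(padded[i:i + ngram_width]) for i in range(num_windows)]


print(padded_ngrams(["a", "b"], 2, pad_values=("<s>", "</s>")))
# ['<s> a', 'a b', 'b </s>']
```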

object ngrams(RaggedTensor data, int ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
RaggedTensor data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
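The effect of `preserve_short_sequences` on a sequence shorter than `ngram_width` can be sketched in pure Python. The helper `ngrams_preserving_short` is hypothetical and covers only the unpadded case for one flat sequence. How the real op combines padding with a preserved short sequence is not modelled here.

```python
def ngrams_preserving_short(tokens, ngram_width, preserve_short_sequences=False,
                            separator=" "):
    """Unpadded ngrams, optionally keeping one ngram for a too-short sequence.

    Hypothetical helper for illustration only.
    """
    num_windows = len(tokens) - ngram_width + 1
    if num_windows > 0:
        return [separator.join(tokens[i:i + ngram_width]) for i in range(num_windows)]
    # Too short for any window: emit the whole sequence as a single ngram,
    # but only when requested and only for a non-empty sequence.
    if preserve_short_sequences and tokens:
        return [separator.join(tokens)]
    return []


print(ngrams_preserving_short(["a", "b"], 3))                                 # []
print(ngrams_preserving_short(["a", "b"], 3, preserve_short_sequences=True))  # ['a b']
```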

object ngrams(RaggedTensor data, int ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
RaggedTensor data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
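The `NUM_NGRAMS` formula above can be checked with a few lines of arithmetic. In this sketch `num_ngrams` is a hypothetical helper. Clamping the result at zero is an assumption made to match the statement that no ngrams are generated for sequences that are too short, and `padding_width=0` stands for the unpadded case.

```python
def num_ngrams(sequence_length, ngram_width, padding_width=0):
    """NUM_NGRAMS = S - ngram_width + 1 + 2*padding_width, clamped at zero.

    Hypothetical helper for illustration only.
    """
    return max(0, sequence_length - ngram_width + 1 + 2 * padding_width)


print(num_ngrams(4, 2))                   # 3: no padding, 4 - 2 + 1
print(num_ngrams(4, 3, padding_width=2))  # 6: 4 - 3 + 1 + 2*2
print(num_ngrams(1, 3))                   # 0: sequence too short, nothing generated
```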

object ngrams(IEnumerable<object> data, IEnumerable<int> ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IEnumerable<object> data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(int data, int ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
int data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(int data, IEnumerable<int> ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
int data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(int data, IEnumerable<int> ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
int data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(int data, int ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
int data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(IGraphNodeBase data, IEnumerable<int> ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IGraphNodeBase data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(IGraphNodeBase data, IEnumerable<int> ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IGraphNodeBase data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence plus its padding (`2*padding_width` values) is shorter than `min(ngram_width)`, a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(IGraphNodeBase data, int ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IGraphNodeBase data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
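
The windowing and padding rules above can be illustrated with a small pure-Python sketch. It is not the library implementation: the helper name `ngrams` and its handling of a single flat sequence are assumptions made for illustration, and it only follows the prose in this section (default `padding_width` of `ngram_width - 1`, no padding for 1-grams).

```python
def ngrams(tokens, ngram_width, separator=" ", pad_values=None, padding_width=None):
    """Return the n-grams of one sequence of strings (illustrative sketch only)."""
    # Normalise pad_values to a (left, right) pair, or None for no padding.
    if pad_values is None:
        pads = None
    elif isinstance(pad_values, str):
        pads = (pad_values, pad_values)
    else:
        pads = tuple(pad_values)

    # 1-grams are never padded; otherwise the width defaults to ngram_width - 1.
    if pads is None or ngram_width == 1:
        width = 0
    elif padding_width is None:
        width = ngram_width - 1
    else:
        width = padding_width

    padded = list(tokens)
    if width:
        padded = [pads[0]] * width + padded + [pads[1]] * width

    # Slide a window of ngram_width over the (possibly padded) sequence.
    return [
        separator.join(padded[i:i + ngram_width])
        for i in range(len(padded) - ngram_width + 1)
    ]


tokens = ["a", "b", "c"]
unpadded = ngrams(tokens, 2)
padded = ngrams(tokens, 2, pad_values=("LP", "RP"))
```

With `S = 3`, `ngram_width = 2` and the default `padding_width = 1`, the padded call yields `3 - 2 + 1 + 2*1 = 4` ngrams, which agrees with the `NUM_NGRAMS` formula in the Returns description.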

object ngrams(IGraphNodeBase data, int ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
IGraphNodeBase data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(object data, IEnumerable<int> ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
object data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
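
When `ngram_width` is a list, the parameter text says the ngrams of every requested width are returned in list order. A minimal pure-Python sketch of that ordering, assuming no padding and a single flat sequence (the helper name `ngrams_multi` is hypothetical and not part of this API):

```python
def ngrams_multi(tokens, ngram_widths, separator=" "):
    """Concatenate the n-grams for each requested width, in list order (sketch)."""
    out = []
    for width in ngram_widths:
        # A width larger than the sequence gives an empty range, so no ngrams.
        for i in range(len(tokens) - width + 1):
            out.append(separator.join(tokens[i:i + width]))
    return out


bigrams_then_trigrams = ngrams_multi(["a", "b", "c"], [2, 3])
trigrams_then_bigrams = ngrams_multi(["a", "b", "c"], [3, 2])
```

Swapping the order of the widths swaps the order of the groups in the result, not the order within each group.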

object ngrams(object data, IEnumerable<int> ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
object data
A Tensor or RaggedTensor containing the source data for the ngrams.
IEnumerable<int> ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.
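
The effect of `preserve_short_sequences` can be sketched in plain Python for the unpadded case. This is only an illustration of the behavior described above, under the assumption that a short, non-empty sequence is emitted as one ngram joined with `separator`; the helper name `ngrams_preserving` is hypothetical.

```python
def ngrams_preserving(tokens, ngram_width, separator=" ",
                      preserve_short_sequences=False):
    """Unpadded n-grams of one sequence, optionally keeping short sequences."""
    windows = [
        separator.join(tokens[i:i + ngram_width])
        for i in range(len(tokens) - ngram_width + 1)
    ]
    # A sequence shorter than ngram_width produces no windows. When asked to
    # preserve it, emit the whole (non-empty) sequence as a single ngram.
    if not windows and preserve_short_sequences and tokens:
        windows = [separator.join(tokens)]
    return windows


dropped = ngrams_preserving(["a", "b"], 3)
kept = ngrams_preserving(["a", "b"], 3, preserve_short_sequences=True)
```

An empty sequence still yields no ngrams in either mode, matching the "non-empty sequence" wording in the description.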

object ngrams(object data, int ngram_width, Byte[] separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
object data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
Byte[] separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams(object data, int ngram_width, string separator, object pad_values, Nullable<int> padding_width, bool preserve_short_sequences, string name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
object data
A Tensor or RaggedTensor containing the source data for the ngrams.
int ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
string separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
Nullable<int> padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
bool preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
string name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

object ngrams_dyn(object data, object ngram_width, ImplicitContainer<T> separator, object pad_values, object padding_width, ImplicitContainer<T> preserve_short_sequences, object name)

Create a tensor of n-grams based on `data`.

The n-grams are created by joining windows of `ngram_width` adjacent strings from the inner axis of `data` using `separator`.

The input data can be padded on both the start and end of the sequence, if desired, using the `pad_values` argument. If set, `pad_values` should contain either a tuple of strings or a single string; the 0th element of the tuple will be used to pad the left side of the sequence and the 1st element of the tuple will be used to pad the right side of the sequence. The `padding_width` arg controls how many padding values are added to each side; it defaults to `ngram_width-1`.

If this op is configured to not have padding, or if it is configured to add padding with `padding_width` set to less than ngram_width-1, it is possible that a sequence, or a sequence plus padding, is smaller than the ngram width. In that case, no ngrams will be generated for that sequence. This can be prevented by setting `preserve_short_sequences`, which will cause the op to always generate at least one ngram per non-empty sequence.
Parameters
object data
A Tensor or RaggedTensor containing the source data for the ngrams.
object ngram_width
The width(s) of the ngrams to create. If this is a list or tuple, the op will return ngrams of all specified arities in list order. Values must be non-Tensor integers greater than 0.
ImplicitContainer<T> separator
The separator string used between ngram elements. Must be a string constant, not a Tensor.
object pad_values
A tuple of (left_pad_value, right_pad_value), a single string, or None. If None, no padding will be added; if a single string, then that string will be used for both left and right padding. Values must be Python strings.
object padding_width
If set, `padding_width` pad values will be added to both sides of each sequence. Defaults to `ngram_width`-1. Must be greater than 0. (Note that 1-grams are never padded, regardless of this value.)
ImplicitContainer<T> preserve_short_sequences
If true, ensures that at least one ngram is generated for each non-empty input sequence. In particular, if an input sequence is shorter than `min(ngram_width) + 2*padding_width`, then a single ngram containing the entire sequence is generated. If false, no ngrams are generated for these short input sequences.
object name
The op name.
Returns
object
A RaggedTensor of ngrams. If `data.shape=[D1...DN, S]`, then `output.shape=[D1...DN, NUM_NGRAMS]`, where `NUM_NGRAMS=S-ngram_width+1+2*padding_width`.

Tensor regex_full_match(IGraphNodeBase input, string pattern, string name)

Check if the input matches the regex pattern.

The input is a string tensor of any shape. The pattern is a scalar string tensor which is applied to every element of the input tensor. The boolean values (True or False) of the output tensor indicate if the input matches the regex pattern provided.

The pattern follows the RE2 syntax (https://github.com/google/re2/wiki/Syntax).
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. A string tensor of the text to be processed.
string pattern
A `Tensor` of type `string`. A scalar string tensor containing the regular expression to match the input.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `bool`.
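
The difference between a full match and a partial match can be shown with Python's standard `re` module. This is only an analogy: Python's `re` is a different engine from RE2, the helper name `regex_full_match` below is a stand-in operating on plain lists, and the pattern is chosen to use only syntax that both engines share.

```python
import re


def regex_full_match(strings, pattern):
    """Element-wise full-match test over a list of strings (illustrative sketch)."""
    compiled = re.compile(pattern)
    # fullmatch succeeds only if the pattern covers the entire string.
    return [compiled.fullmatch(s) is not None for s in strings]


flags = regex_full_match(["abc123", "abc", "abc123!"], r"[a-z]+[0-9]+")
```

Here only the first element matches: `"abc"` lacks the digits, and `"abc123!"` has a trailing character that the pattern does not cover.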

Tensor regex_full_match(IGraphNodeBase input, ValueTuple<Byte[], string> pattern, string name)

Check if the input matches the regex pattern.

The input is a string tensor of any shape. The pattern is a scalar string tensor which is applied to every element of the input tensor. The boolean values (True or False) of the output tensor indicate if the input matches the regex pattern provided.

The pattern follows the RE2 syntax (https://github.com/google/re2/wiki/Syntax).
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. A string tensor of the text to be processed.
ValueTuple<Byte[], string> pattern
A `Tensor` of type `string`. A scalar string tensor containing the regular expression to match the input.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `bool`.

object split(IEnumerable<object> input, object sep, int maxsplit, string result_type, IEnumerable<object> source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
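
Because the description refers to Python's `str.split`, the two separator modes can be checked in plain Python. The helper `split_elements` is a hypothetical stand-in that works on a list of strings rather than a tensor.

```python
def split_elements(strings, sep=None):
    """Split each string the way the text above describes (illustrative sketch)."""
    # The text treats an empty `sep` like None; Python's str.split would raise
    # ValueError on "", so map it to None here.
    if sep == "":
        sep = None
    return [s.split(sep) for s in strings]


explicit = split_elements(["1<>2<><>3"], sep="<>")
whitespace = split_elements(["  hello   world ", "a b c"])
```

With an explicit separator the empty token between the two adjacent `"<>"` delimiters is kept, while whitespace splitting collapses runs of spaces and drops leading and trailing ones.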
Parameters
IEnumerable<object> input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
IEnumerable<object> source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(IGraphNodeBase input, object sep, int maxsplit, string result_type, IGraphNodeBase source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
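
Since the splitting rules follow Python's `str.split`, the effect of a positive `maxsplit` can be sketched in plain Python. The helper `split_with_limit` is hypothetical, and treating any non-positive `maxsplit` as "no limit" is an assumption; the parameter text only defines the `maxsplit > 0` case.

```python
def split_with_limit(strings, sep=None, maxsplit=-1):
    """Split each string, stopping after `maxsplit` splits when it is positive."""
    # Assumption: non-positive values mean "no limit" (Python's -1).
    limit = maxsplit if maxsplit > 0 else -1
    return [s.split(sep, limit) for s in strings]


limited = split_with_limit(["a b c d", "x y"], maxsplit=1)
unlimited = split_with_limit(["a b c d", "x y"])
```

With `maxsplit=1` each element is cut at most once, so the remainder of the string stays in the last token.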
Parameters
IGraphNodeBase input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
IGraphNodeBase source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(IGraphNodeBase input, object sep, int maxsplit, string result_type, int source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
Parameters
IGraphNodeBase input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
int source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
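
The two result types carry the same tokens in different layouts. The sketch below converts a ragged list of token rows into the three components a sparse representation is built from (`indices`, `values`, `dense_shape`). It uses plain Python lists only, and the helper name `to_sparse_components` is hypothetical.

```python
def to_sparse_components(rows):
    """Flatten ragged rows into (indices, values, dense_shape) lists (sketch)."""
    # One [row, column] index per token, in row-major order.
    indices = [[r, c] for r, row in enumerate(rows) for c in range(len(row))]
    values = [token for row in rows for token in row]
    # The dense shape is the row count by the longest row length.
    dense_shape = [len(rows), max((len(row) for row in rows), default=0)]
    return indices, values, dense_shape


ragged = [["hello", "world"], ["a", "b", "c"]]
indices, values, dense_shape = to_sparse_components(ragged)
```

For the two input strings `'hello world'` and `'a b c'`, this reproduces the indices, values and `[2, 3]` dense shape shown in the sparse example for this method.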
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(IEnumerable<object> input, object sep, int maxsplit, string result_type, int source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
Parameters
IEnumerable<object> input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
int source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(IGraphNodeBase input, object sep, int maxsplit, string result_type, IEnumerable<object> source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
Parameters
IGraphNodeBase input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
IEnumerable<object> source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(IEnumerable<object> input, object sep, int maxsplit, string result_type, IGraphNodeBase source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
Parameters
IEnumerable<object> input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
IGraphNodeBase source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(int input, object sep, int maxsplit, string result_type, IGraphNodeBase source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, `input` of `"1<>2<><>3"` and `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace is regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

This behavior matches Python's `str.split`.
Parameters
int input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, limits the number of splits performed on each element.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
IGraphNodeBase source
alias for "input" argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                values=['hello', 'world', 'a', 'b', 'c'],
                dense_shape=[2, 3])

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")
<tf.RaggedTensor [['hello', 'world'], ['a', 'b', 'c']]>

object split(int input, object sep, int maxsplit, string result_type, int source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, an `input` of `"1<>2<><>3"` with a `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace characters are regarded as a single separator, and the result contains no empty strings at the start or end if the string has leading or trailing whitespace.

Note that this behavior matches Python's `str.split`.
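To make the `SparseTensor` form of the result more concrete, the sketch below builds the three components (indices, values, dense shape) for a batch of strings in plain Python. The helper `split_to_sparse` is hypothetical and exists only for this illustration; it relies on `str.split` rather than on TensorFlow.

```python
# Hypothetical helper: builds the (indices, values, dense_shape) triple that a
# SparseTensor-style result would hold for a batch of strings.
def split_to_sparse(strings, sep=None):
    indices, values = [], []
    max_tokens = 0
    for row, text in enumerate(strings):
        tokens = text.split(sep)
        max_tokens = max(max_tokens, len(tokens))
        for col, token in enumerate(tokens):
            indices.append([row, col])  # position of the token in the 2-D grid
            values.append(token)
    # One row per input string, as many columns as the longest row needs.
    dense_shape = [len(strings), max_tokens]
    return indices, values, dense_shape


indices, values, dense_shape = split_to_sparse(["hello world", "a b c"])
print(indices)      # [[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]]
print(values)       # ['hello', 'world', 'a', 'b', 'c']
print(dense_shape)  # [2, 3]
```

The added dimension is why a rank `N` input yields a rank `N+1` result.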
Parameters
int input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, the maximum number of splits performed on each string.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
int source
Alias for the `input` argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
            tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                            values=['hello', 'world', 'a', 'b', 'c'],
                            dense_shape=[2, 3]) 

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")

object split(int input, object sep, int maxsplit, string result_type, IEnumerable<object> source, string name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, an `input` of `"1<>2<><>3"` with a `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace characters are regarded as a single separator, and the result contains no empty strings at the start or end if the string has leading or trailing whitespace.

Note that this behavior matches Python's `str.split`.
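The two result types carry the same tokens in different layouts. As a rough picture of the difference, the sketch below regroups sparse components into one list per input string, which is the row-per-string view a ragged result presents. `sparse_to_rows` is a hypothetical helper written for this illustration, and it assumes the indices are listed in row-major order.

```python
# Hypothetical helper: regroup sparse components into one token list per row.
# Assumes indices are in row-major order, as in the example output above.
def sparse_to_rows(indices, values, dense_shape):
    rows = [[] for _ in range(dense_shape[0])]
    for (row, _col), token in zip(indices, values):
        rows[row].append(token)
    return rows


rows = sparse_to_rows(
    indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
    values=["hello", "world", "a", "b", "c"],
    dense_shape=[2, 3],
)
print(rows)  # [['hello', 'world'], ['a', 'b', 'c']]
```

The nested lists printed here are plain Python, not the printed form of a `RaggedTensor`.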
Parameters
int input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
int maxsplit
An `int`. If `maxsplit > 0`, the maximum number of splits performed on each string.
string result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
IEnumerable<object> source
Alias for the `input` argument.
string name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
            tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                            values=['hello', 'world', 'a', 'b', 'c'],
                            dense_shape=[2, 3]) 

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")

object split_dyn(object input, object sep, ImplicitContainer<T> maxsplit, ImplicitContainer<T> result_type, object source, object name)

Split elements of `input` based on `sep`.

Let N be the size of `input` (typically N will be the batch size). Split each element of `input` based on `sep` and return a `SparseTensor` or `RaggedTensor` containing the split tokens. Empty tokens are ignored.

If `sep` is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings. For example, an `input` of `"1<>2<><>3"` with a `sep` of `"<>"` returns `["1", "2", "", "3"]`. If `sep` is None or an empty string, consecutive whitespace characters are regarded as a single separator, and the result contains no empty strings at the start or end if the string has leading or trailing whitespace.

Note that this behavior matches Python's `str.split`.
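The effect of a split limit can be pictured the same way. The sketch below assumes that `maxsplit` follows the `maxsplit` argument of Python's `str.split`, which is an inference from the note above rather than something this page states outright; the sample string is made up.

```python
# Plain-Python illustration, assuming maxsplit behaves like str.split's maxsplit.
text = "a b c d"

# No limit: every delimiter produces a split.
unlimited = text.split(" ")

# At most two splits: the remainder stays in the last token.
limited = text.split(" ", 2)

print(unlimited)  # ['a', 'b', 'c', 'd']
print(limited)    # ['a', 'b', 'c d']
```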
Parameters
object input
A string `Tensor` of rank `N`, the strings to split. If `rank(input)` is not known statically, then it is assumed to be `1`.
object sep
`0-D` string `Tensor`, the delimiter character.
ImplicitContainer<T> maxsplit
An `int`. If `maxsplit > 0`, the maximum number of splits performed on each string.
ImplicitContainer<T> result_type
The tensor type for the result: one of `"RaggedTensor"` or `"SparseTensor"`.
object source
Alias for the `input` argument.
object name
A name for the operation (optional).
Returns
object
A `SparseTensor` or `RaggedTensor` of rank `N+1`, the strings split according to the delimiter.
Show Example
>>> tf.strings.split(['hello world', 'a b c'])
            tf.SparseTensor(indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]],
                            values=['hello', 'world', 'a', 'b', 'c'],
                            dense_shape=[2, 3]) 

>>> tf.strings.split(['hello world', 'a b c'], result_type="RaggedTensor")

Tensor substr(IEnumerable<object> input, ndarray pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
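The rules above for a single string (truncation at the end, negative positions, out-of-range positions) can be sketched in plain Python. The function `substr_bytes` is hypothetical and only mirrors the description for the default `"BYTE"` unit; it raises `ValueError` as a stand-in for `InvalidArgumentError`, and treating a position exactly at the end of the string as valid is an assumption of the sketch.

```python
# Hypothetical pure-Python model of the byte-unit rules described above.
def substr_bytes(data: bytes, pos: int, length: int) -> bytes:
    # A negative pos counts backwards from the end of the string.
    start = pos + len(data) if pos < 0 else pos
    # Stand-in for InvalidArgumentError; the exact boundary is an assumption.
    if start < 0 or start > len(data):
        raise ValueError(f"pos {pos} is out of range for {data!r}")
    # Slicing stops at the end of the string, so an over-long length is
    # truncated to as many bytes as are available.
    return data[start:start + length]


print(substr_bytes(b"Hello", 1, 3))    # b'ell'
print(substr_bytes(b"Hello", 3, 10))   # b'lo'
print(substr_bytes(b"Hello", -3, 2))   # b'll'
```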

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(Byte[] input, ndarray pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
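The two broadcasting examples above can be reproduced with ordinary Python slicing, which may help in seeing which axis each `pos`/`len` pair applies to. The helpers `substr_columns` and `substr_scalar_input` are hypothetical names for this sketch, and they handle only non-negative positions.

```python
# Hypothetical helpers reproducing the two broadcasting examples with slices.

def substr_columns(rows, positions, lengths):
    # pos/len of shape [3] broadcast across rows: column j uses positions[j].
    return [[s[p:p + n] for s, p, n in zip(row, positions, lengths)]
            for row in rows]


def substr_scalar_input(data, positions, lengths):
    # A single string broadcast against every (pos, len) pair.
    return [data[p:p + n] for p, n in zip(positions, lengths)]


rows = [[b'ten', b'eleven', b'twelve'],
        [b'thirteen', b'fourteen', b'fifteen'],
        [b'sixteen', b'seventeen', b'eighteen'],
        [b'nineteen', b'twenty', b'twentyone']]

by_column = substr_columns(rows, [1, 2, 3], [1, 2, 3])
by_pair = substr_scalar_input(b'thirteen', [1, 5, 7], [3, 2, 1])

print(by_column[0])  # [b'e', b'ev', b'lve']
print(by_pair)       # [b'hir', b'ee', b'n']
```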
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, double pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']
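The difference between the two `unit` values only shows up for strings containing multi-byte characters. The sketch below models it in plain Python: `substr_by_bytes` slices the raw bytes, while `substr_by_chars` decodes to code points first. Both function names and the sample string are hypothetical, and the model covers only non-negative positions on valid UTF-8.

```python
# Hypothetical model of unit="BYTE" versus unit="UTF8_CHAR" on valid UTF-8.

def substr_by_bytes(data: bytes, pos: int, length: int) -> bytes:
    # Positions and lengths count raw bytes.
    return data[pos:pos + length]


def substr_by_chars(data: bytes, pos: int, length: int) -> bytes:
    # Positions and lengths count Unicode code points.
    chars = data.decode("utf-8")
    return chars[pos:pos + length].encode("utf-8")


text = "h\u00e9llo".encode("utf-8")  # 'e' with acute accent takes two bytes

by_bytes = substr_by_bytes(text, 1, 2)  # the two bytes of the accented letter
by_chars = substr_by_chars(text, 1, 2)  # the accented letter plus 'l'

print(len(text))                 # 6
print(by_bytes.decode("utf-8"))
print(by_chars.decode("utf-8"))
```

With byte units, a position or length that lands inside a multi-byte character would cut it apart, which is one reason to choose `"UTF8_CHAR"` for non-ASCII text.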

Tensor substr(IGraphNodeBase input, int pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, int pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, int pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, ndarray pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, ndarray pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
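When `pos` and `len` have the same shape as `input`, each string gets its own position and length. The sketch below models that case, together with the shape check described above, in plain Python. `shape_of` and `substr_elementwise` are hypothetical helpers, the sample values are made up, and the model assumes non-empty rectangular nested lists with non-negative positions.

```python
# Hypothetical model of element-wise pos/len with the same shape as input.

def shape_of(nested):
    # Shape of a non-empty, rectangular nested list.
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


def substr_elementwise(strings, pos, length):
    # Mirrors the rule that pos and len must have the same shape.
    if shape_of(pos) != shape_of(length):
        raise ValueError("pos and len must have the same shape")
    return [[s[p:p + n] for s, p, n in zip(s_row, p_row, n_row)]
            for s_row, p_row, n_row in zip(strings, pos, length)]


strings = [[b'ten', b'eleven'], [b'twelve', b'thirteen']]
result = substr_elementwise(strings,
                            pos=[[1, 2], [3, 1]],
                            length=[[2, 3], [3, 4]])
print(result)  # [[b'en', b'eve'], [b'lve', b'hirt']]
```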

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, ndarray pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, double pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, double pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

The example using scalar `pos` and `len` appears under Show Example below.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
            position = 1
            length = 3 

output = [b'ell', b'orl']

Tensor substr(IGraphNodeBase input, double pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
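
The rules above for a single string can be modelled in a few lines of plain Python. This is only a sketch of the described behaviour for `unit="BYTE"`, not the TensorFlow op: `substr_bytes` is a hypothetical helper, `IndexError` stands in for `InvalidArgumentError`, `len` is assumed to be non-negative, and treating a `pos` equal to the string length as valid is an assumption.

```python
def substr_bytes(s: bytes, pos: int, length: int) -> bytes:
    """Model the BYTE-unit substring rules for a single string."""
    size = len(s)
    # A negative pos counts backwards from the end of the string.
    start = pos + size if pos < 0 else pos
    # An out-of-range pos is an error. IndexError stands in for the
    # InvalidArgumentError that the op raises; accepting start == size
    # (an empty result) is an assumption of this sketch.
    if start < 0 or start > size:
        raise IndexError(f"pos {pos} is out of range for a string of length {size}")
    # Slicing clips at the end of the string, so an over-long length
    # returns as many bytes as are available.
    return s[start:start + length]


print(substr_bytes(b"Hello", 1, 3))    # b'ell'
print(substr_bytes(b"Hello", -3, 2))   # b'll'
print(substr_bytes(b"Hello", 3, 10))   # b'lo'
```

The third call shows the clipping rule: the requested length runs past the end of the string, so only the remaining two bytes are returned.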

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(Byte[] input, ndarray pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
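
When `pos` and `len` have the same shape as `input`, each string is paired with its own position and length. The sketch below models that elementwise pairing in plain Python for 2-D nested lists. It illustrates the rule rather than calling the TensorFlow op; `substr_same_shape` is a hypothetical helper name, and non-negative positions are assumed.

```python
def substr_same_shape(strings, positions, lengths):
    """Elementwise substring over three equally shaped 2-D nested lists."""
    return [
        [s[p:p + n] for s, p, n in zip(row_s, row_p, row_n)]
        for row_s, row_p, row_n in zip(strings, positions, lengths)
    ]


inputs = [[b'ten', b'eleven', b'twelve'],
          [b'thirteen', b'fourteen', b'fifteen']]
positions = [[1, 2, 3],
             [1, 2, 3]]
lengths = [[2, 3, 4],
           [4, 3, 2]]

result = substr_same_shape(inputs, positions, lengths)
print(result)
# [[b'en', b'eve', b'lve'], [b'hirt', b'urt', b'te']]
```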
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, int pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
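
The two broadcasting cases can be reproduced with ordinary Python slicing, which may help when checking expected outputs by hand. This is a sketch under stated assumptions: `substr_broadcast` and `substr_broadcast_input` are hypothetical helpers, positions are non-negative, and nothing here calls TensorFlow.

```python
def substr_broadcast(strings, positions, lengths):
    """Apply one row of positions and lengths to every row of a 2-D list."""
    return [
        [s[p:p + n] for s, p, n in zip(row, positions, lengths)]
        for row in strings
    ]


def substr_broadcast_input(string, positions, lengths):
    """Apply each (position, length) pair to the same byte string."""
    return [string[p:p + n] for p, n in zip(positions, lengths)]


grid = [[b'ten', b'eleven', b'twelve'],
        [b'thirteen', b'fourteen', b'fifteen'],
        [b'sixteen', b'seventeen', b'eighteen'],
        [b'nineteen', b'twenty', b'twentyone']]

rows = substr_broadcast(grid, [1, 2, 3], [1, 2, 3])
single = substr_broadcast_input(b'thirteen', [1, 5, 7], [3, 2, 1])
print(rows[0])   # [b'e', b'ev', b'lve']
print(single)    # [b'hir', b'ee', b'n']
```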
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, int pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']
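
The difference between the two `unit` values only shows up with multi-byte characters. The sketch below models it in plain Python; `substr_unit` is a hypothetical helper, non-negative positions are assumed, and it is not the TensorFlow op. One deliberate difference: decoding here raises on invalid UTF-8, whereas the op's results are described as undefined in that case.

```python
def substr_unit(s: bytes, pos: int, length: int, unit: str = "BYTE") -> bytes:
    """Take a substring counted in bytes or in UTF-8 encoded code points."""
    if unit == "BYTE":
        # Positions and lengths count raw bytes.
        return s[pos:pos + length]
    if unit == "UTF8_CHAR":
        # Positions and lengths count Unicode code points.
        chars = s.decode("utf-8")
        return chars[pos:pos + length].encode("utf-8")
    raise ValueError(f"unsupported unit: {unit!r}")


word = "h\u00e9llo".encode("utf-8")   # the e-acute takes two bytes in UTF-8

by_bytes = substr_unit(word, 1, 2, unit="BYTE")
by_chars = substr_unit(word, 1, 2, unit="UTF8_CHAR")
print(by_bytes)   # b'\xc3\xa9'
print(by_chars)   # b'\xc3\xa9l'
```

With `"BYTE"`, a length of 2 covers only the two bytes of the accented character; with `"UTF8_CHAR"`, the same length covers that character and the `l` after it.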

Tensor substr(IEnumerable<object> input, ndarray pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, ndarray pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, double pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, double pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, double pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(Byte[] input, int pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(Byte[] input, int pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(Byte[] input, int pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(Byte[] input, ndarray pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Example using scalar `pos` and `len`:
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(IEnumerable<object> input, int pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
IEnumerable<object> input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']
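The rules for negative `pos`, over-long `len`, and out-of-range `pos` can be sketched for a single string in plain Python. This is an emulation of the documented behaviour in `"BYTE"` units, not the TensorFlow kernel; the helper name `substr_bytes` is hypothetical, `IndexError` stands in for `InvalidArgumentError`, and accepting `pos == len(s)` as in range is an assumption of this sketch.

```python
def substr_bytes(s: bytes, pos: int, length: int) -> bytes:
    """Emulate the documented BYTE-unit rules for one string (illustrative only)."""
    n = len(s)
    # A negative pos counts backwards from the end of the string.
    start = pos + n if pos < 0 else pos
    # Assumption: pos == len(s) is treated as in range and yields b''.
    if not 0 <= start <= n:
        # Stands in for the InvalidArgumentError described above.
        raise IndexError(f"pos {pos} is out of range for a string of length {n}")
    # Python slicing clamps the end, so an over-long length takes what is left.
    return s[start:start + length]


print([substr_bytes(s, 1, 3) for s in (b'Hello', b'World')])  # [b'ell', b'orl']
print(substr_bytes(b'Hello', -3, 2))   # b'll'
print(substr_bytes(b'Hello', 3, 10))   # b'lo'
```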

Tensor substr(Byte[] input, double pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']
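Broadcasting `pos` and `len` onto a two-dimensional `input` pairs the entry at index `j` of each vector with column `j` of every row. The following plain-Python sketch reproduces the documented output with nested comprehensions; it only mimics the result and is not how TensorFlow implements broadcasting. The variable name `inputs` is chosen here to avoid shadowing Python's built-in `input`.

```python
inputs = [[b'ten', b'eleven', b'twelve'],
          [b'thirteen', b'fourteen', b'fifteen'],
          [b'sixteen', b'seventeen', b'eighteen'],
          [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

# zip() lines up column j of each row with position[j] and length[j].
output = [[s[p:p + n] for s, p, n in zip(row, position, length)]
          for row in inputs]

for row in output:
    print(row)
# [b'e', b'ev', b'lve']
# [b'h', b'ur', b'tee']
# [b'i', b've', b'hte']
# [b'i', b'en', b'nty']
```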

Tensor substr(PythonClassContainer input, double pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']
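When `input` is a single string and `pos` and `len` are vectors, the string is reused for every position and length pair. A minimal plain-Python sketch of that case, using the values from the example above (the name `source` is illustrative and this is not TensorFlow code):

```python
source = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

# The one string is sliced once per (position, length) pair.
output = [source[p:p + n] for p, n in zip(position, length)]

print(output)  # [b'hir', b'ee', b'n']
```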

Tensor substr(Byte[] input, double pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, int pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, int pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, int pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
int pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(Byte[] input, double pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
Byte[] input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, ndarray pos, int len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
int len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, ndarray pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, ndarray pos, double len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
ndarray pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
double len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Tensor substr(PythonClassContainer input, double pos, ndarray len, string name, string unit)

Return substrings from `Tensor` of strings.

For each string in the input `Tensor`, creates a substring starting at index `pos` with a total length of `len`.

If `len` defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

A negative `pos` indicates distance within the string backwards from the end.

If `pos` specifies an index which is out of range for any of the input strings, then an `InvalidArgumentError` is thrown.

`pos` and `len` must have the same shape, otherwise a `ValueError` is thrown on Op creation.

*NOTE*: `Substr` supports broadcasting up to two dimensions. More about broadcasting [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

---

Examples

Using scalar `pos` and `len`: see the example under Show Example below.

`pos` and `len` may also have the same shape as `input`, so that each string gets its own position and length.

Broadcasting `pos` and `len` onto `input`:

```
input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length = [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
```

Broadcasting `input` onto `pos` and `len`:

```
input = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

output = [b'hir', b'ee', b'n']
```
Parameters
PythonClassContainer input
A `Tensor` of type `string`. Tensor of strings
double pos
A `Tensor`. Must be one of the following types: `int32`, `int64`. Scalar defining the position of first character in each substring
ndarray len
A `Tensor`. Must have the same type as `pos`. Scalar defining the number of characters to include in each substring
string name
A name for the operation (optional).
string unit
An optional `string` from: `"BYTE", "UTF8_CHAR"`. Defaults to `"BYTE"`. The unit that is used to create the substring. One of: `"BYTE"` (for defining position and length by bytes) or `"UTF8_CHAR"` (for the UTF-8 encoded Unicode code points). The default is `"BYTE"`. Results are undefined if `unit=UTF8_CHAR` and the `input` strings do not contain structurally valid UTF-8.
Returns
Tensor
A `Tensor` of type `string`.
Show Example
input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

object unicode_decode(IGraphNodeBase input, string input_encoding, string errors, int replacement_char, bool replace_control_characters, string name)

Decodes each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
IGraphNodeBase input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
bool replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
string name
A name for the operation (optional).
Returns
object
An `N+1`-dimensional `int32` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:

```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_decode(input, 'UTF-8').to_list()
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
```
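For readers without a TensorFlow session at hand, the same code points can be obtained with Python's built-in codecs. This is only an illustration of what "a sequence of Unicode code points" means for each string; it returns nested Python lists rather than a `RaggedTensor`, and the variable names are illustrative.

```python
# Plain-Python illustration of decoding strings into code points.
strings = [s.encode('utf-8') for s in ('G\xf6\xf6dnight', '\U0001f60a')]

# One inner list per input string, one integer per decoded character.
codepoints = [[ord(ch) for ch in s.decode('utf-8')] for s in strings]

print(codepoints)
# [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
```

The inner lists have different lengths, which is why the op returns a ragged result for non-scalar input.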

object unicode_decode(IEnumerable<Byte[]> input, string input_encoding, string errors, int replacement_char, bool replace_control_characters, string name)

Decodes each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
IEnumerable<Byte[]> input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
bool replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
string name
A name for the operation (optional).
Returns
object
An `N+1`-dimensional `int32` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:

```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_decode(input, 'UTF-8').to_list()
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
```
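The three `errors` modes have close analogues in Python's own codec error handlers, which makes them easy to try out. The sketch below uses `bytes.decode` rather than TensorFlow, so details such as how many replacement characters an invalid run produces may differ from the op; Python's `'replace'` handler always substitutes U+FFFD (65533) and has no configurable `replacement_char`.

```python
data = b'a\xffb'  # 0xFF can never appear in well-formed UTF-8

# 'replace': the invalid byte becomes a replacement code point.
replaced = [ord(ch) for ch in data.decode('utf-8', errors='replace')]

# 'ignore': the invalid byte is skipped.
ignored = [ord(ch) for ch in data.decode('utf-8', errors='ignore')]

# 'strict': decoding fails with an exception.
try:
    data.decode('utf-8', errors='strict')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(replaced)       # [97, 65533, 98]
print(ignored)        # [97, 98]
print(strict_failed)  # True
```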

object unicode_decode(ndarray input, string input_encoding, string errors, int replacement_char, bool replace_control_characters, string name)

Decodes each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
ndarray input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
bool replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
string name
A name for the operation (optional).
Returns
object
An `N+1`-dimensional `int32` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:

```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_decode(input, 'UTF-8').to_list()
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
```
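The effect of `replace_control_characters` can be pictured as a post-processing step over the decoded code points. The helper below is a hypothetical plain-Python sketch, not part of the library, and its default of `0xFFFD` for `replacement_char` is an assumption made for the illustration.

```python
def replace_c0_controls(codepoints, replacement_char=0xFFFD):
    """Swap C0 control code points (U+0000 to U+001F) for replacement_char."""
    return [replacement_char if cp <= 0x1F else cp for cp in codepoints]


decoded = [ord(ch) for ch in b'a\tb\n'.decode('utf-8')]

print(decoded)                       # [97, 9, 98, 10]
print(replace_c0_controls(decoded))  # [97, 65533, 98, 65533]
```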

object unicode_decode_dyn(object input, object input_encoding, ImplicitContainer<T> errors, ImplicitContainer<T> replacement_char, ImplicitContainer<T> replace_control_characters, object name)

Decodes each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
object input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
object input_encoding
String name for the unicode encoding that should be used to decode each string.
ImplicitContainer<T> errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
ImplicitContainer<T> replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
ImplicitContainer<T> replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
object name
A name for the operation (optional).
Returns
object
An `N+1` dimensional `int32` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_decode(input, 'UTF-8').tolist()
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
```
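The interaction of `errors`, `replacement_char`, and `replace_control_characters` described above can be sketched in plain Python. This is only an illustration of the documented semantics using Python's built-in codecs, not TensorFlow's implementation; the helper name `decode_code_points` is hypothetical, and the number of replacement characters emitted for a malformed byte sequence may differ from what TensorFlow produces.

```python
def decode_code_points(data, errors="replace", replacement_char=0xFFFD,
                       replace_control_characters=False):
    """Decode UTF-8 bytes into a list of code points (illustrative only)."""
    # Python's codec error handlers use the same three policy names.
    text = data.decode("utf-8", errors=errors)
    code_points = []
    for ch in text:
        cp = ord(ch)
        if errors == "replace" and cp == 0xFFFD:
            # Python always substitutes U+FFFD; map it to the requested
            # replacement. (A genuine U+FFFD in the input is remapped too,
            # which is a simplification of this sketch.)
            cp = replacement_char
        elif replace_control_characters and cp <= 0x1F:
            # C0 control characters U+0000 - U+001F.
            cp = replacement_char
        code_points.append(cp)
    return code_points


valid = "G\u00f6\u00f6dnight".encode("utf-8")
broken = b"a\xffb"  # 0xFF can never appear in well-formed UTF-8

print(decode_code_points(valid))
print(decode_code_points(broken, errors="replace", replacement_char=63))
print(decode_code_points(broken, errors="ignore"))
```

With `errors="strict"` the same malformed input raises `UnicodeDecodeError` in this sketch, mirroring the "raise an exception" policy.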

object unicode_decode_with_offsets(IEnumerable<Byte[]> input, string input_encoding, string errors, int replacement_char, bool replace_control_characters, string name)

Decodes each string into a sequence of code points with start offsets.

This op is similar to `tf.strings.unicode_decode(...)`, but it also returns the start offset for each character in its respective string. This information can be used to align the characters with the original byte sequence.

Returns a tuple `(codepoints, start_offsets)` where:

* `codepoints[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
* `start_offsets[i1...iN, j]` is the start byte offset for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
IEnumerable<Byte[]> input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
bool replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
string name
A name for the operation (optional).
Returns
object
A tuple of `N+1` dimensional tensors `(codepoints, start_offsets)`.

* `codepoints` is an `int32` tensor with shape `[D1...DN, (num_chars)]`.
* `start_offsets` is an `int64` tensor with shape `[D1...DN, (num_chars)]`.

The returned tensors are tf.Tensors if `input` is a scalar, or tf.RaggedTensors otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> result = tf.strings.unicode_decode_with_offsets(input, 'UTF-8')
>>> result[0].tolist()  # codepoints
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> result[1].tolist()  # offsets
[[0, 1, 3, 5, 6, 7, 8, 9, 10], [0]]
```

object unicode_decode_with_offsets(IGraphNodeBase input, string input_encoding, string errors, int replacement_char, bool replace_control_characters, string name)

Decodes each string into a sequence of code points with start offsets.

This op is similar to `tf.strings.unicode_decode(...)`, but it also returns the start offset for each character in its respective string. This information can be used to align the characters with the original byte sequence.

Returns a tuple `(codepoints, start_offsets)` where:

* `codepoints[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
* `start_offsets[i1...iN, j]` is the start byte offset for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
IGraphNodeBase input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
bool replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
string name
A name for the operation (optional).
Returns
object
A tuple of `N+1` dimensional tensors `(codepoints, start_offsets)`.

* `codepoints` is an `int32` tensor with shape `[D1...DN, (num_chars)]`.
* `start_offsets` is an `int64` tensor with shape `[D1...DN, (num_chars)]`.

The returned tensors are tf.Tensors if `input` is a scalar, or tf.RaggedTensors otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> result = tf.strings.unicode_decode_with_offsets(input, 'UTF-8')
>>> result[0].tolist()  # codepoints
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> result[1].tolist()  # offsets
[[0, 1, 3, 5, 6, 7, 8, 9, 10], [0]]
```

object unicode_decode_with_offsets_dyn(object input, object input_encoding, ImplicitContainer<T> errors, ImplicitContainer<T> replacement_char, ImplicitContainer<T> replace_control_characters, object name)

Decodes each string into a sequence of code points with start offsets.

This op is similar to `tf.strings.unicode_decode(...)`, but it also returns the start offset for each character in its respective string. This information can be used to align the characters with the original byte sequence.

Returns a tuple `(codepoints, start_offsets)` where:

* `codepoints[i1...iN, j]` is the Unicode codepoint for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
* `start_offsets[i1...iN, j]` is the start byte offset for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
object input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
object input_encoding
String name for the unicode encoding that should be used to decode each string.
ImplicitContainer<T> errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
ImplicitContainer<T> replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`; and in place of C0 control characters in `input` when `replace_control_characters=True`.
ImplicitContainer<T> replace_control_characters
Whether to replace the C0 control characters `(U+0000 - U+001F)` with the `replacement_char`.
object name
A name for the operation (optional).
Returns
object
A tuple of `N+1` dimensional tensors `(codepoints, start_offsets)`.

* `codepoints` is an `int32` tensor with shape `[D1...DN, (num_chars)]`.
* `start_offsets` is an `int64` tensor with shape `[D1...DN, (num_chars)]`.

The returned tensors are tf.Tensors if `input` is a scalar, or tf.RaggedTensors otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> result = tf.strings.unicode_decode_with_offsets(input, 'UTF-8')
>>> result[0].tolist()  # codepoints
[[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> result[1].tolist()  # offsets
[[0, 1, 3, 5, 6, 7, 8, 9, 10], [0]]
```
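To make the relationship between code points and start byte offsets concrete, the following plain-Python sketch recomputes both for a single well-formed UTF-8 string. It is an illustration of the documented output only, not TensorFlow's implementation; the helper name `decode_with_offsets` is hypothetical and error handling is deliberately omitted.

```python
def decode_with_offsets(data):
    """Return (code_points, start_offsets) for well-formed UTF-8 bytes."""
    code_points, start_offsets = [], []
    offset = 0
    for ch in data.decode("utf-8"):
        code_points.append(ord(ch))
        start_offsets.append(offset)
        # Advance by the number of bytes this character occupies in UTF-8.
        offset += len(ch.encode("utf-8"))
    return code_points, start_offsets


code_points, start_offsets = decode_with_offsets("G\u00f6\u00f6dnight".encode("utf-8"))
print(code_points)    # [71, 246, 246, 100, 110, 105, 103, 104, 116]
print(start_offsets)  # [0, 1, 3, 5, 6, 7, 8, 9, 10]
```

Each `\u00f6` occupies two bytes, which is why the offsets jump from 1 to 3 to 5 while the code point list stays one entry per character.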

object unicode_encode(RaggedTensor input, string output_encoding, string errors, int replacement_char, string name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
RaggedTensor input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
string output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
string errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
int replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
string name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```

object unicode_encode(ndarray input, string output_encoding, string errors, int replacement_char, string name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
ndarray input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
string output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
string errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
int replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
string name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```

object unicode_encode(IEnumerable<object> input, string output_encoding, string errors, int replacement_char, string name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
IEnumerable<object> input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
string output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
string errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
int replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
string name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```

object unicode_encode(int input, string output_encoding, string errors, int replacement_char, string name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
int input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
string output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
string errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
int replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
string name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```

object unicode_encode(object input, string output_encoding, string errors, int replacement_char, string name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
object input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
string output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
string errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
int replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
string name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```

object unicode_encode(IGraphNodeBase input, string output_encoding, string errors, int replacement_char, string name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
IGraphNodeBase input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
string output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
string errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
int replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
string name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```

object unicode_encode_dyn(object input, object output_encoding, ImplicitContainer<T> errors, ImplicitContainer<T> replacement_char, object name)

Encodes each sequence of Unicode code points in `input` into a string.

`result[i1...iN]` is the string formed by concatenating the Unicode codepoints `input[i1...iN, :]`, encoded using `output_encoding`.
Parameters
object input
An `N+1` dimensional potentially ragged integer tensor with shape `[D1...DN, num_chars]`.
object output_encoding
Unicode encoding that should be used to encode each codepoint sequence. Can be `"UTF-8"`, `"UTF-16-BE"`, or `"UTF-32-BE"`.
ImplicitContainer<T> errors
Specifies the response when an invalid codepoint is encountered (optional). One of:
* `'replace'`: Replace invalid codepoint with the `replacement_char`. (default)
* `'ignore'`: Skip invalid codepoints.
* `'strict'`: Raise an exception for any invalid codepoint.
ImplicitContainer<T> replacement_char
The replacement character codepoint to be used in place of any invalid input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the default unicode replacement character, U+FFFD (decimal 65533).
object name
A name for the operation (optional).
Returns
object
An `N` dimensional `string` tensor with shape `[D1...DN]`.

#### Example:
```python
>>> input = [[71, 246, 246, 100, 110, 105, 103, 104, 116], [128522]]
>>> tf.strings.unicode_encode(input, 'UTF-8')
['G\xc3\xb6\xc3\xb6dnight', '\xf0\x9f\x98\x8a']
```
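The three `errors` policies for encoding can be illustrated with a small plain-Python sketch that encodes one sequence of code points to UTF-8. It is not TensorFlow's implementation: the helper name `encode_code_points` is hypothetical, "invalid" is assumed here to mean a value outside U+0000..U+10FFFF or inside the surrogate range, and any unrecognized policy is treated like `'strict'`.

```python
def encode_code_points(code_points, errors="replace", replacement_char=0xFFFD):
    """Encode a sequence of code points as UTF-8 bytes (illustrative only)."""

    def is_valid(cp):
        # Assumption: valid means a Unicode scalar value (no surrogates).
        return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)

    chars = []
    for cp in code_points:
        if is_valid(cp):
            chars.append(chr(cp))
        elif errors == "replace":
            chars.append(chr(replacement_char))
        elif errors == "ignore":
            continue
        else:
            # 'strict' (and anything unrecognized) rejects the input.
            raise ValueError(f"invalid code point: {cp}")
    return "".join(chars).encode("utf-8")


print(encode_code_points([71, 246, 246, 100, 110, 105, 103, 104, 116]))
print(encode_code_points([72, 0x110000, 105], errors="replace"))
print(encode_code_points([72, 0x110000, 105], errors="ignore"))
```

With the default replacement, the out-of-range value is emitted as the three bytes `EF BF BD`, the UTF-8 form of U+FFFD.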

Tensor unicode_script(IGraphNodeBase input, string name)

Determine the script codes of a given tensor of Unicode integer code points.

This operation converts Unicode code points to script codes corresponding to each code point. Script codes correspond to International Components for Unicode (ICU) UScriptCode values. See http://icu-project.org/apiref/icu4c/uscript_8h.html. Returns -1 (USCRIPT_INVALID_CODE) for invalid codepoints. Output shape will match input shape.
Parameters
IGraphNodeBase input
A `Tensor` of type `int32` containing Unicode code points.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `int32`.

object unicode_script_dyn(object input, object name)

Determine the script codes of a given tensor of Unicode integer code points.

This operation converts Unicode code points to script codes corresponding to each code point. Script codes correspond to International Components for Unicode (ICU) UScriptCode values. See http://icu-project.org/apiref/icu4c/uscript_8h.html. Returns -1 (USCRIPT_INVALID_CODE) for invalid codepoints. Output shape will match input shape.
Parameters
object input
A `Tensor` of type `int32` containing Unicode code points.
object name
A name for the operation (optional).
Returns
object
A `Tensor` of type `int32`.

object unicode_split(IGraphNodeBase input, string input_encoding, string errors, int replacement_char, string name)

Splits each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
Parameters
IGraphNodeBase input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
string name
A name for the operation (optional).
Returns
object
An `N+1` dimensional `string` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_split(input, 'UTF-8').tolist()
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
```

object unicode_split(IEnumerable<Byte[]> input, string input_encoding, string errors, int replacement_char, string name)

Splits each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
Parameters
IEnumerable<Byte[]> input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
string name
A name for the operation (optional).
Returns
object
An `N+1` dimensional `string` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_split(input, 'UTF-8').tolist()
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
```

object unicode_split(ndarray input, string input_encoding, string errors, int replacement_char, string name)

Splits each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
Parameters
ndarray input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
string name
A name for the operation (optional).
Returns
object
An `N+1` dimensional `string` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_split(input, 'UTF-8').tolist()
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
```

object unicode_split_dyn(object input, object input_encoding, ImplicitContainer<T> errors, ImplicitContainer<T> replacement_char, object name)

Splits each string in `input` into a sequence of Unicode code points.

`result[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
Parameters
object input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
object input_encoding
String name for the unicode encoding that should be used to decode each string.
ImplicitContainer<T> errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
ImplicitContainer<T> replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
object name
A name for the operation (optional).
Returns
object
An `N+1` dimensional `string` tensor with shape `[D1...DN, (num_chars)]`. The returned tensor is a tf.Tensor if `input` is a scalar, or a tf.RaggedTensor otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> tf.strings.unicode_split(input, 'UTF-8').tolist()
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
```
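The difference between splitting by Unicode character and splitting by byte (as `bytes_split` does) can be seen in a short plain-Python sketch. This only illustrates the documented behaviour for well-formed UTF-8 and is not how TensorFlow implements either op; `split_chars` and `split_bytes` are hypothetical helper names.

```python
def split_chars(data):
    """Split UTF-8 bytes into one byte substring per Unicode character."""
    return [ch.encode("utf-8") for ch in data.decode("utf-8")]


def split_bytes(data):
    """Split bytes into single-byte substrings, ignoring character boundaries."""
    return [bytes([b]) for b in data]


data = "G\u00f6\u00f6d".encode("utf-8")
print(split_chars(data))  # 4 substrings, the two-byte characters stay whole
print(split_bytes(data))  # 6 substrings, one per byte
```

Concatenating the pieces from either helper reproduces the original byte string; only the grouping differs.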

ValueTuple<object, object> unicode_split_with_offsets(IEnumerable<Byte[]> input, string input_encoding, string errors, int replacement_char, string name)

Splits each string into a sequence of code points with start offsets.

This op is similar to `tf.strings.unicode_split(...)`, but it also returns the start offset for each character in its respective string. This information can be used to align the characters with the original byte sequence.

Returns a tuple `(chars, start_offsets)` where:

* `chars[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
* `start_offsets[i1...iN, j]` is the start byte offset for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
IEnumerable<Byte[]> input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
string name
A name for the operation (optional).
Returns
ValueTuple<object, object>
A tuple of `N+1` dimensional tensors `(chars, start_offsets)`.

* `chars` is a `string` tensor with shape `[D1...DN, (num_chars)]`.
* `start_offsets` is an `int64` tensor with shape `[D1...DN, (num_chars)]`.

The returned tensors are tf.Tensors if `input` is a scalar, or tf.RaggedTensors otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> result = tf.strings.unicode_split_with_offsets(input, 'UTF-8')
>>> result[0].tolist()  # character substrings
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
>>> result[1].tolist()  # offsets
[[0, 1, 3, 5, 6, 7, 8, 9, 10], [0]]
```

ValueTuple<object, object> unicode_split_with_offsets(IGraphNodeBase input, string input_encoding, string errors, int replacement_char, string name)

Splits each string into a sequence of code points with start offsets.

This op is similar to `tf.strings.unicode_split(...)`, but it also returns the start offset for each character in its respective string. This information can be used to align the characters with the original byte sequence.

Returns a tuple `(chars, start_offsets)` where:

* `chars[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
* `start_offsets[i1...iN, j]` is the start byte offset for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
IGraphNodeBase input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
string input_encoding
String name for the unicode encoding that should be used to decode each string.
string errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
int replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
string name
A name for the operation (optional).
Returns
ValueTuple<object, object>
A tuple of `N+1` dimensional tensors `(chars, start_offsets)`.

* `chars` is a `string` tensor with shape `[D1...DN, (num_chars)]`.
* `start_offsets` is an `int64` tensor with shape `[D1...DN, (num_chars)]`.

The returned tensors are tf.Tensors if `input` is a scalar, or tf.RaggedTensors otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> result = tf.strings.unicode_split_with_offsets(input, 'UTF-8')
>>> result[0].tolist()  # character substrings
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
>>> result[1].tolist()  # offsets
[[0, 1, 3, 5, 6, 7, 8, 9, 10], [0]]
```

object unicode_split_with_offsets_dyn(object input, object input_encoding, ImplicitContainer<T> errors, ImplicitContainer<T> replacement_char, object name)

Splits each string into a sequence of code points with start offsets.

This op is similar to `tf.strings.unicode_split(...)`, but it also returns the start offset for each character in its respective string. This information can be used to align the characters with the original byte sequence.

Returns a tuple `(chars, start_offsets)` where:

* `chars[i1...iN, j]` is the substring of `input[i1...iN]` that encodes its `j`th character, when decoded using `input_encoding`.
* `start_offsets[i1...iN, j]` is the start byte offset for the `j`th character in `input[i1...iN]`, when decoded using `input_encoding`.
Parameters
object input
An `N` dimensional potentially ragged `string` tensor with shape `[D1...DN]`. `N` must be statically known.
object input_encoding
String name for the unicode encoding that should be used to decode each string.
ImplicitContainer<T> errors
Specifies the response when an input string can't be converted using the indicated encoding. One of:
* `'strict'`: Raise an exception for any illegal substrings.
* `'replace'`: Replace illegal substrings with `replacement_char`.
* `'ignore'`: Skip illegal substrings.
ImplicitContainer<T> replacement_char
The replacement codepoint to be used in place of invalid substrings in `input` when `errors='replace'`.
object name
A name for the operation (optional).
Returns
object
A tuple of `N+1` dimensional tensors `(chars, start_offsets)`.

* `chars` is a `string` tensor with shape `[D1...DN, (num_chars)]`.
* `start_offsets` is an `int64` tensor with shape `[D1...DN, (num_chars)]`.

The returned tensors are tf.Tensors if `input` is a scalar, or tf.RaggedTensors otherwise.

#### Example:
```python
>>> input = [s.encode('utf8') for s in (u'G\xf6\xf6dnight', u'\U0001f60a')]
>>> result = tf.strings.unicode_split_with_offsets(input, 'UTF-8')
>>> result[0].tolist()  # character substrings
[['G', '\xc3\xb6', '\xc3\xb6', 'd', 'n', 'i', 'g', 'h', 't'], ['\xf0\x9f\x98\x8a']]
>>> result[1].tolist()  # offsets
[[0, 1, 3, 5, 6, 7, 8, 9, 10], [0]]
```
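One way to read "align the characters with the original byte sequence" is that consecutive start offsets delimit each character's bytes. The plain-Python sketch below rebuilds the character substrings from a byte string and its start offsets; it is an illustration under that reading, and `slices_from_offsets` is a hypothetical helper, not part of the API.

```python
def slices_from_offsets(data, start_offsets):
    """Cut `data` into per-character byte slices using start byte offsets."""
    # Each character ends where the next one starts; the last ends at len(data).
    end_offsets = list(start_offsets[1:]) + [len(data)]
    return [data[start:end] for start, end in zip(start_offsets, end_offsets)]


data = "G\u00f6\u00f6dnight".encode("utf-8")
offsets = [0, 1, 3, 5, 6, 7, 8, 9, 10]  # offsets from the example above
print(slices_from_offsets(data, offsets))
```

The slices match the character substrings in the example, so the offsets alone are enough to map any character index back to a byte range in the input.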

Tensor unicode_transcode(IGraphNodeBase input, string input_encoding, string output_encoding, string errors, int replacement_char, bool replace_control_characters, string name)

Transcode the input text from a source encoding to a destination encoding.

The input is a string tensor of any shape. The output is a string tensor of the same shape containing the transcoded strings. Output strings are always valid unicode. If the input contains invalid encoding positions, the `errors` attribute sets the policy for how to deal with them. If the default error-handling policy is used, invalid formatting will be substituted in the output by the `replacement_char`. If the errors policy is `ignore`, any invalid encoding positions in the input are skipped and not included in the output. If it is set to `strict`, any invalid formatting will result in an InvalidArgument error.

This operation can be used with `output_encoding = input_encoding` to enforce correct formatting for inputs even if they are already in the desired encoding.

If the input is prefixed by a Byte Order Mark needed to determine encoding (e.g. if the encoding is UTF-16 and the BOM indicates big-endian), then that BOM will be consumed and not emitted into the output. If the input encoding is marked with an explicit endianness (e.g. UTF-16-BE), then the BOM is interpreted as a non-breaking-space and is preserved in the output (including always for UTF-8).

The end result is that if the input is marked as an explicit endianness the transcoding is faithful to all codepoints in the source. If it is not marked with an explicit endianness, the BOM is not considered part of the string itself but as metadata, and so is not preserved in the output.
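The Byte Order Mark rule above has a close analogue in Python's built-in codecs, which can serve as a quick illustration. This is only an analogy: the op uses ICU converters, not Python codecs, so the sketch below shows the same idea rather than the op's actual implementation.

```python
import codecs

# "hi" encoded as big-endian UTF-16, prefixed with a big-endian BOM.
with_bom = codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")

# No explicit endianness: the BOM selects the byte order and is consumed.
implicit = with_bom.decode("utf-16")

# Explicit endianness: the BOM bytes decode to U+FEFF and stay in the text.
explicit = with_bom.decode("utf-16-be")

# UTF-8 has no byte order to determine, so a leading U+FEFF is kept.
utf8_roundtrip = "\ufeffhi".encode("utf-8").decode("utf-8")

print(repr(implicit), repr(explicit), repr(utf8_roundtrip))
```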
Parameters
IGraphNodeBase input
A `Tensor` of type `string`. The text to be processed. Can have any shape.
string input_encoding
A `string`. Text encoding of the input strings. This is any of the encodings supported by ICU ucnv algorithmic converters. Examples: `"UTF-16", "US ASCII", "UTF-8"`.
string output_encoding
A `string` from: `"UTF-8", "UTF-16-BE", "UTF-32-BE"`. The unicode encoding to use in the output. Multi-byte encodings will be big-endian.
string errors
An optional `string` from: `"strict", "replace", "ignore"`. Defaults to `"replace"`. Error handling policy when there is invalid formatting found in the input. The value of 'strict' will cause the operation to produce an InvalidArgument error on any invalid input formatting. A value of 'replace' (the default) will cause the operation to replace any invalid formatting in the input with the `replacement_char` codepoint. A value of 'ignore' will cause the operation to skip any invalid formatting in the input and produce no corresponding output character.
int replacement_char
An optional `int`. Defaults to `65533`. The replacement character codepoint to be used in place of any invalid formatting in the input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the standard unicode replacement character, U+FFFD (decimal 65533).

Note that for UTF-8, passing a replacement character expressible in 1 byte, such as ' ', will preserve string alignment to the source since invalid bytes will be replaced with a 1-byte replacement. For UTF-16-BE and UTF-16-LE, any 1 or 2 byte replacement character will preserve byte alignment to the source.
bool replace_control_characters
An optional `bool`. Defaults to `False`. Whether to replace the C0 control characters (00-1F) with the `replacement_char`.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `string`.
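The three `errors` policies and the alignment note for `replacement_char` can be sketched with Python's own codecs. The `transcode` function below is a hypothetical stand-in written for illustration; it is not the TensorFlow op, and ICU may group invalid bytes differently from Python in more complex inputs.

```python
def transcode(data: bytes, input_encoding: str, output_encoding: str,
              errors: str = "replace", replacement_char: int = 0xFFFD) -> bytes:
    """Hypothetical stand-in built on Python codecs, not the TensorFlow op."""
    text = data.decode(input_encoding, errors=errors)
    if errors == "replace" and replacement_char != 0xFFFD:
        # Python always substitutes U+FFFD, so swap in the requested code point.
        # Limitation of this sketch: a genuine U+FFFD in the input is swapped too.
        text = text.replace("\ufffd", chr(replacement_char))
    return text.encode(output_encoding)


bad = b"caf\xff!"  # 0xFF is never valid in UTF-8

replaced = transcode(bad, "utf-8", "utf-8")                   # default policy
ignored = transcode(bad, "utf-8", "utf-8", errors="ignore")   # drop invalid byte
aligned = transcode(bad, "utf-8", "utf-8", replacement_char=ord(" "))

try:
    transcode(bad, "utf-8", "utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(replaced, ignored, aligned, strict_failed)
```

With a one-byte replacement such as a space, the output has the same byte length as the source, which is the alignment property described for `replacement_char`.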

object unicode_transcode_dyn(object input, object input_encoding, object output_encoding, ImplicitContainer<T> errors, ImplicitContainer<T> replacement_char, ImplicitContainer<T> replace_control_characters, object name)

Transcode the input text from a source encoding to a destination encoding.

The input is a string tensor of any shape. The output is a string tensor of the same shape containing the transcoded strings. Output strings are always valid unicode. If the input contains invalid encoding positions, the `errors` attribute sets the policy for how to deal with them. If the default error-handling policy is used, invalid formatting will be substituted in the output by the `replacement_char`. If the errors policy is `ignore`, any invalid encoding positions in the input are skipped and not included in the output. If it is set to `strict`, any invalid formatting will result in an InvalidArgument error.

This operation can be used with `output_encoding = input_encoding` to enforce correct formatting for inputs even if they are already in the desired encoding.

If the input is prefixed by a Byte Order Mark needed to determine encoding (e.g. if the encoding is UTF-16 and the BOM indicates big-endian), then that BOM will be consumed and not emitted into the output. If the input encoding is marked with an explicit endianness (e.g. UTF-16-BE), then the BOM is interpreted as a non-breaking-space and is preserved in the output (including always for UTF-8).

The end result is that if the input is marked as an explicit endianness the transcoding is faithful to all codepoints in the source. If it is not marked with an explicit endianness, the BOM is not considered part of the string itself but as metadata, and so is not preserved in the output.
Parameters
object input
A `Tensor` of type `string`. The text to be processed. Can have any shape.
object input_encoding
A `string`. Text encoding of the input strings. This is any of the encodings supported by ICU ucnv algorithmic converters. Examples: `"UTF-16", "US ASCII", "UTF-8"`.
object output_encoding
A `string` from: `"UTF-8", "UTF-16-BE", "UTF-32-BE"`. The unicode encoding to use in the output. Multi-byte encodings will be big-endian.
ImplicitContainer<T> errors
An optional `string` from: `"strict", "replace", "ignore"`. Defaults to `"replace"`. Error handling policy when there is invalid formatting found in the input. The value of 'strict' will cause the operation to produce an InvalidArgument error on any invalid input formatting. A value of 'replace' (the default) will cause the operation to replace any invalid formatting in the input with the `replacement_char` codepoint. A value of 'ignore' will cause the operation to skip any invalid formatting in the input and produce no corresponding output character.
ImplicitContainer<T> replacement_char
An optional `int`. Defaults to `65533`. The replacement character codepoint to be used in place of any invalid formatting in the input when `errors='replace'`. Any valid unicode codepoint may be used. The default value is the standard unicode replacement character, U+FFFD (decimal 65533).

Note that for UTF-8, passing a replacement character expressible in 1 byte, such as ' ', will preserve string alignment to the source since invalid bytes will be replaced with a 1-byte replacement. For UTF-16-BE and UTF-16-LE, any 1 or 2 byte replacement character will preserve byte alignment to the source.
ImplicitContainer<T> replace_control_characters
An optional `bool`. Defaults to `False`. Whether to replace the C0 control characters (00-1F) with the `replacement_char`.
object name
A name for the operation (optional).
Returns
object
A `Tensor` of type `string`.
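The effect of `replace_control_characters` can be sketched in plain Python. The helper below is hypothetical and reads the documented range (00-1F) literally, so it also replaces tab and newline; whether the op treats every code point in that range identically is an assumption of this sketch.

```python
def replace_control_characters(text: str, replacement_char: int = 0xFFFD) -> str:
    """Replace C0 control characters (U+0000..U+001F) with `replacement_char`.

    Hypothetical helper for illustration; not part of the library.
    """
    replacement = chr(replacement_char)
    return "".join(replacement if ord(ch) <= 0x1F else ch for ch in text)


default_cleaned = replace_control_characters("a\tb\x00c")
custom_cleaned = replace_control_characters("a\tb\x00c", replacement_char=ord("?"))
print(repr(default_cleaned), repr(custom_cleaned))
```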

Tensor unsorted_segment_join(IGraphNodeBase inputs, IGraphNodeBase segment_ids, IGraphNodeBase num_segments, string separator, string name)

Joins the elements of `inputs` based on `segment_ids`.

Computes the string join along segments of a tensor. Given `segment_ids` with rank `N` and `data` (the `inputs` tensor) with rank `N+M`:

`output[i, k1...kM] = strings.join([data[j1...jN, k1...kM]])`

where the join is over all `[j1...jN]` such that `segment_ids[j1...jN] = i`. Strings are joined in row-major order.
Parameters
IGraphNodeBase inputs
A `Tensor` of type `string`. The input to be joined.
IGraphNodeBase segment_ids
A `Tensor`. Must be one of the following types: `int32`, `int64`. A tensor whose shape is a prefix of data.shape. Negative segment ids are not supported.
IGraphNodeBase num_segments
A `Tensor`. Must be one of the following types: `int32`, `int64`. A scalar.
string separator
An optional `string`. Defaults to `""`. The separator to use when joining.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `string`.
Show Example
import tensorflow as tf

inputs = [['Y', 'q', 'c'], ['Y', '6', '6'], ['p', 'G', 'a']]
output_array = tf.strings.unsorted_segment_join(inputs=inputs,
                                                segment_ids=[1, 0, 1],
                                                num_segments=2,
                                                separator=':')
# output_array ==> [['Y', '6', '6'], ['Y:p', 'q:G', 'c:a']]

inputs = ['this', 'is', 'a', 'test']
output_array = tf.strings.unsorted_segment_join(inputs=inputs,
                                                segment_ids=[0, 0, 0, 0],
                                                num_segments=1,
                                                separator=':')
# output_array ==> ['this:is:a:test']
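To make the join rule concrete, here is a minimal pure-Python reference sketch. It is not the library's implementation: the function below is a hypothetical re-implementation that only handles 1-D `segment_ids` with 1-D or 2-D `inputs` given as nested lists, and it reproduces the two documented examples.

```python
def unsorted_segment_join(inputs, segment_ids, num_segments, separator=""):
    """Pure-Python reference for 1-D `segment_ids` and 1-D or 2-D `inputs`.

    Hypothetical sketch for illustration; not the TensorFlow kernel.
    """
    if inputs and isinstance(inputs[0], str):
        # 1-D case: collect each segment's strings in input order.
        groups = [[] for _ in range(num_segments)]
        for value, seg in zip(inputs, segment_ids):
            groups[seg].append(value)
        return [separator.join(group) for group in groups]

    # 2-D case: join column by column across the rows of each segment.
    num_cols = len(inputs[0]) if inputs else 0
    groups = [[[] for _ in range(num_cols)] for _ in range(num_segments)]
    for row, seg in zip(inputs, segment_ids):
        for k, value in enumerate(row):
            groups[seg][k].append(value)
    return [[separator.join(col) for col in seg_cols] for seg_cols in groups]


table = [['Y', 'q', 'c'], ['Y', '6', '6'], ['p', 'G', 'a']]
joined_2d = unsorted_segment_join(table, [1, 0, 1], 2, separator=':')
joined_1d = unsorted_segment_join(['this', 'is', 'a', 'test'], [0, 0, 0, 0], 1,
                                  separator=':')
print(joined_2d)
print(joined_1d)
```

Rows 0 and 2 share segment id 1, so their entries are joined column by column in the order they appear, while row 1 is alone in segment 0 and passes through unchanged.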

object unsorted_segment_join_dyn(object inputs, object segment_ids, object num_segments, ImplicitContainer<T> separator, object name)

Joins the elements of `inputs` based on `segment_ids`.

Computes the string join along segments of a tensor. Given `segment_ids` with rank `N` and `data` (the `inputs` tensor) with rank `N+M`:

`output[i, k1...kM] = strings.join([data[j1...jN, k1...kM]])`

where the join is over all `[j1...jN]` such that `segment_ids[j1...jN] = i`. Strings are joined in row-major order.
Parameters
object inputs
A `Tensor` of type `string`. The input to be joined.
object segment_ids
A `Tensor`. Must be one of the following types: `int32`, `int64`. A tensor whose shape is a prefix of data.shape. Negative segment ids are not supported.
object num_segments
A `Tensor`. Must be one of the following types: `int32`, `int64`. A scalar.
ImplicitContainer<T> separator
An optional `string`. Defaults to `""`. The separator to use when joining.
object name
A name for the operation (optional).
Returns
object
A `Tensor` of type `string`.
Show Example
import tensorflow as tf

inputs = [['Y', 'q', 'c'], ['Y', '6', '6'], ['p', 'G', 'a']]
output_array = tf.strings.unsorted_segment_join(inputs=inputs,
                                                segment_ids=[1, 0, 1],
                                                num_segments=2,
                                                separator=':')
# output_array ==> [['Y', '6', '6'], ['Y:p', 'q:G', 'c:a']]

inputs = ['this', 'is', 'a', 'test']
output_array = tf.strings.unsorted_segment_join(inputs=inputs,
                                                segment_ids=[0, 0, 0, 0],
                                                num_segments=1,
                                                separator=':')
# output_array ==> ['this:is:a:test']

Tensor upper(IGraphNodeBase input, string encoding, string name)

Converts all lowercase characters in `input` to their uppercase equivalents.
Parameters
IGraphNodeBase input
A `Tensor` of type `string`.
string encoding
An optional `string`. Defaults to `""`.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `string`.
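The role of the `encoding` parameter is not documented above. The sketch below illustrates one plausible reading, stated here as an assumption: an empty `encoding` treats the input as ASCII bytes, while `"utf-8"` uppercases Unicode characters. The `upper` function is a hypothetical pure-Python stand-in, not the TensorFlow op.

```python
def upper(data: bytes, encoding: str = "") -> bytes:
    """Hypothetical stand-in; the meaning of `encoding` is an assumption."""
    if encoding == "":
        # bytes.upper() only maps the ASCII letters a-z.
        return data.upper()
    if encoding == "utf-8":
        # Decode first so non-ASCII letters are uppercased as characters.
        return data.decode("utf-8").upper().encode("utf-8")
    raise ValueError("unsupported encoding in this sketch: %r" % encoding)


cafe = "caf\xe9".encode("utf-8")          # b'caf\xc3\xa9'
ascii_upper = upper(cafe)                 # non-ASCII bytes left untouched
unicode_upper = upper(cafe, "utf-8")      # the accented letter is uppercased too
print(ascii_upper, unicode_upper)
```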

object upper_dyn(object input, ImplicitContainer<T> encoding, object name)

Converts all lowercase characters in `input` to their uppercase equivalents.
Parameters
object input
A `Tensor` of type `string`.
ImplicitContainer<T> encoding
An optional `string`. Defaults to `""`.
object name
A name for the operation (optional).
Returns
object
A `Tensor` of type `string`.

Public properties

PythonFunctionContainer bytes_split_fn get;

PythonFunctionContainer format_fn get;

PythonFunctionContainer length_fn get;

PythonFunctionContainer lower_fn get;

PythonFunctionContainer ngrams_fn get;

PythonFunctionContainer regex_full_match_fn get;

PythonFunctionContainer split_fn_ get;

PythonFunctionContainer substr_fn get;

PythonFunctionContainer unicode_decode_fn get;

PythonFunctionContainer unicode_decode_with_offsets_fn get;

PythonFunctionContainer unicode_encode_fn get;

PythonFunctionContainer unicode_script_fn get;

PythonFunctionContainer unicode_split_fn get;

PythonFunctionContainer unicode_split_with_offsets_fn get;

PythonFunctionContainer unicode_transcode_fn get;

PythonFunctionContainer unsorted_segment_join_fn get;

PythonFunctionContainer upper_fn get;