TokenTextSplitter
Property | Pattern | Type | Deprecated | Definition | Title/Description |
---|---|---|---|---|---|
+ implementation | No | const | No | - | Implementation |
- chunk_size | No | integer | No | - | Chunk Size |
- chunk_overlap | No | integer | No | - | Chunk Overlap |
- keep_separator | No | boolean | No | - | Keep Separator |
- strip_whitespace | No | boolean | No | - | Strip Whitespace |
- encoding_name | No | string | No | - | Encoding Name |
- model | No | string | No | - | Model |
- allowed_special | No | Combination | No | - | Allowed Special |
- disallowed_special | No | Combination | No | - | Disallowed Special |
1. Property implementation
Title: Implementation
Type | const |
Required | Yes |
Specific value: "TokenTextSplitter"
2. Property chunk_size
Title: Chunk Size
Type | integer |
Required | No |
Default | 4000 |
Description: Maximum size of each returned chunk, counted in tokens for this splitter
3. Property chunk_overlap
Title: Chunk Overlap
Type | integer |
Required | No |
Default | 200 |
Description: Overlap between consecutive chunks, counted in tokens for this splitter
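The interaction between chunk_size and chunk_overlap can be illustrated with a minimal sketch. It assumes sizes are counted in tokens, as the splitter's name suggests, and that chunks come from sliding a fixed-size window over the token sequence. `window_tokens` is a hypothetical helper written for this example, not a function of any library, and the integers stand in for real token ids.

```python
from typing import List, Sequence


def window_tokens(
    tokens: Sequence[int], chunk_size: int, chunk_overlap: int
) -> List[List[int]]:
    """Split a token sequence into windows of at most chunk_size tokens.

    Consecutive windows share chunk_overlap tokens. Illustrative only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[List[int]] = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
        start += step
    return chunks


# Ten placeholder token ids, windows of 4 tokens sharing 1 token.
example = window_tokens(list(range(10)), chunk_size=4, chunk_overlap=1)
print(example)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

With the defaults listed above (chunk_size 4000, chunk_overlap 200), the same arithmetic gives a window that advances 3800 units at a time.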
4. Property keep_separator
Title: Keep Separator
Type | boolean |
Required | No |
Default | false |
Description: Whether to keep the separator in the chunks
5. Property strip_whitespace
Title: Strip Whitespace
Type | boolean |
Required | No |
Default | true |
Description: If True, strips whitespace from the start and end of every document
6. Property encoding_name
Title: Encoding Name
Type | string |
Required | No |
Default | "gpt2" |
Description: Name of the token encoding to use, for example "gpt2"
7. Property model
Title: Model
Type | string |
Required | No |
Default | null |
Description: Model name
8. Property allowed_special
Title: Allowed Special
Type | Combination |
Required | No |
Additional properties | [Any type: allowed] |
Default | [] |
Description: Allowed special tokens
8.1. Property item 0
Type | const |
Required | No |
Specific value: "all"
8.2. Property item 1
Type | array of string |
Required | No |
Array restrictions | |
---|---|
Min items | N/A |
Max items | N/A |
Items unicity | False |
Additional items | False |
Tuple validation | See below |
Each item of this array must be | Description |
---|---|
item 1 items | - |
8.2.1. item 1 items
Type | string |
Required | No |
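The combination type above accepts either the constant "all" or an array of strings, and disallowed_special uses the same shape. A minimal sketch of that check follows. `validate_special` is a hypothetical helper for illustration, not part of the schema tooling, and the token string in the usage lines is only a sample value.

```python
from typing import Any, List, Union

# Either the literal string "all" or a list of token strings.
SpecialTokens = Union[str, List[str]]


def validate_special(value: Any, field: str) -> SpecialTokens:
    """Return value unchanged if it is "all" or a list of strings, else raise."""
    if isinstance(value, str) and value == "all":
        return value
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return value
    raise ValueError(f'{field} must be "all" or a list of strings, got {value!r}')


# Both documented defaults pass: [] for allowed_special, "all" for disallowed_special.
validate_special([], "allowed_special")
validate_special("all", "disallowed_special")
validate_special(["<|endoftext|>"], "allowed_special")
```

Any other string, or a list containing non-string items, is rejected with a ValueError that names the offending field.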
9. Property disallowed_special
Title: Disallowed Special
Type | Combination |
Required | No |
Additional properties | [Any type: allowed] |
Default | "all" |
Description: Disallowed special tokens
9.1. Property item 0
Type | const |
Required | No |
Specific value: "all"
9.2. Property item 1
Type | array of string |
Required | No |
Array restrictions | |
---|---|
Min items | N/A |
Max items | N/A |
Items unicity | False |
Additional items | False |
Tuple validation | See below |
Each item of this array must be | Description |
---|---|
item 1 items | - |
9.2.1. item 1 items
Type | string |
Required | No |
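Putting the properties together, a complete configuration can be sketched as a Python dictionary that starts from the documented defaults. `DEFAULTS` and `build_config` are hypothetical names for this example. Rejecting unknown keys is a choice made by this sketch, since the tables above do not say whether extra top-level properties are allowed.

```python
import copy
from typing import Any, Dict

# Defaults copied from the property tables above.
DEFAULTS: Dict[str, Any] = {
    "implementation": "TokenTextSplitter",  # required, constant value
    "chunk_size": 4000,
    "chunk_overlap": 200,
    "keep_separator": False,
    "strip_whitespace": True,
    "encoding_name": "gpt2",
    "model": None,
    "allowed_special": [],
    "disallowed_special": "all",
}


def build_config(**overrides: Any) -> Dict[str, Any]:
    """Merge user overrides into a fresh copy of the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Assumption of this sketch: unknown keys are treated as mistakes.
        raise KeyError(f"unknown properties: {sorted(unknown)}")
    config = {**copy.deepcopy(DEFAULTS), **overrides}
    if config["implementation"] != "TokenTextSplitter":
        raise ValueError('implementation must be "TokenTextSplitter"')
    return config


# Override only the sizes; every other property keeps its default.
config = build_config(chunk_size=512, chunk_overlap=64)
```

Because implementation is the only required property, a hand-written configuration needs at least that key. The sketch fills it in automatically for brevity.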