SpacyTextSplitter
| Property | Pattern | Type | Deprecated | Definition | Title/Description |
|---|---|---|---|---|---|
| + implementation | No | const | No | - | Implementation |
| - chunk_size | No | integer | No | - | Chunk Size |
| - chunk_overlap | No | integer | No | - | Chunk Overlap |
| - keep_separator | No | boolean | No | - | Keep Separator |
| - strip_whitespace | No | boolean | No | - | Strip Whitespace |
| - separator | No | string | No | - | Separator |
| - pipeline | No | string | No | - | Pipeline |
| - max_length | No | integer | No | - | Max Length |
1. Property implementation
Title: Implementation
| Type | const |
| Required | Yes |
Specific value: "SpacyTextSplitter"
2. Property chunk_size
Title: Chunk Size
| Type | integer |
| Required | No |
| Default | 4000 |
Description: Maximum size of each returned chunk, measured in characters like chunk_overlap
3. Property chunk_overlap
Title: Chunk Overlap
| Type | integer |
| Required | No |
| Default | 200 |
Description: Overlap in characters between chunks
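To illustrate how chunk_size and chunk_overlap relate, here is a minimal sketch using a plain character window. It is not the splitter's actual algorithm (which splits text with a spaCy pipeline before building chunks); the function name `sliding_chunks` is hypothetical and only demonstrates that consecutive chunks share `chunk_overlap` characters.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Simplified illustration: fixed-size character windows with overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


# Each chunk repeats the last character of the previous one (overlap of 1).
print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

With the defaults above (4000 and 200), the same arithmetic means each window would advance by 3800 characters in this simplified model.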
4. Property keep_separator
Title: Keep Separator
| Type | boolean |
| Required | No |
| Default | false |
Description: Whether to keep the separator in the chunks
5. Property strip_whitespace
Title: Strip Whitespace
| Type | boolean |
| Required | No |
| Default | true |
Description: If true, strips whitespace from the start and end of every document
6. Property separator
Title: Separator
| Type | string |
| Required | No |
| Default | "\n\n" |
Description: Separator to split on
7. Property pipeline
Title: Pipeline
| Type | string |
| Required | No |
| Default | "en_core_web_sm" |
Description: Name of the spaCy pipeline to use
8. Property max_length
Title: Max Length
| Type | integer |
| Required | No |
| Default | 1000000 |
Description: Maximum number of characters to process
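The properties and defaults listed above can be summarized in a small configuration sketch. The class `SpacyTextSplitterConfig` and the helper `load_config` are hypothetical names introduced for illustration; they are not part of any library, and rejecting unknown keys is an assumption of this sketch rather than something the schema states.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class SpacyTextSplitterConfig:
    """Hypothetical container mirroring the properties documented above."""

    implementation: str  # required; must equal "SpacyTextSplitter"
    chunk_size: int = 4000
    chunk_overlap: int = 200
    keep_separator: bool = False
    strip_whitespace: bool = True
    separator: str = "\n\n"
    pipeline: str = "en_core_web_sm"
    max_length: int = 1_000_000

    def __post_init__(self) -> None:
        # "implementation" is a const property: only one value is valid.
        if self.implementation != "SpacyTextSplitter":
            raise ValueError('implementation must be "SpacyTextSplitter"')


def load_config(raw: dict[str, Any]) -> SpacyTextSplitterConfig:
    """Build a config from a plain dict, filling in the documented defaults."""
    known = {f.name for f in fields(SpacyTextSplitterConfig)}
    unknown = set(raw) - known
    if unknown:
        # Assumption of this sketch: unknown keys are treated as errors.
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    if "implementation" not in raw:
        raise ValueError("implementation is required")
    return SpacyTextSplitterConfig(**raw)


config = load_config({"implementation": "SpacyTextSplitter", "chunk_size": 1000})
print(config.chunk_size, config.chunk_overlap)  # 1000 200
```

Only `implementation` has to be supplied; every other property falls back to the default given in its table.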