SpacyTextSplitter

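| Property | Pattern | Type | Deprecated | Definition | Title/Description |
| --- | --- | --- | --- | --- | --- |
| + implementation | No | const | No | - | Implementation |
| - chunk_size | No | integer | No | - | Chunk Size |
| - chunk_overlap | No | integer | No | - | Chunk Overlap |
| - keep_separator | No | boolean | No | - | Keep Separator |
| - strip_whitespace | No | boolean | No | - | Strip Whitespace |
| - separator | No | string | No | - | Separator |
| - pipeline | No | string | No | - | Pipeline |
| - max_length | No | integer | No | - | Max Length |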

1. Property implementation

Title: Implementation

Type: const
Required: Yes

Specific value: "SpacyTextSplitter"

2. Property chunk_size

Title: Chunk Size

Type: integer
Required: No
Default: 4000

Description: Maximum size of chunks to return

3. Property chunk_overlap

Title: Chunk Overlap

Type: integer
Required: No
Default: 200

Description: Overlap in characters between chunks
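The interaction between chunk_size and chunk_overlap can be illustrated with a minimal sketch. It assumes the text has already been split into pieces (for example, sentences) and packs them greedily into chunks; the function name `merge_pieces` is hypothetical, and this is not necessarily how the splitter itself is implemented.

```python
def merge_pieces(pieces, chunk_size=4000, chunk_overlap=200, separator="\n\n"):
    """Greedily pack pieces into chunks of at most chunk_size characters.

    Hypothetical illustration only: trailing pieces of a finished chunk
    (up to chunk_overlap characters) are carried into the next chunk.
    A single piece longer than chunk_size is emitted as its own oversized chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    def joined_len(items):
        # Length of the chunk that joining these pieces would produce.
        return len(separator.join(items))

    chunks = []
    current = []  # pieces in the chunk currently being built
    for piece in pieces:
        if current and joined_len(current + [piece]) > chunk_size:
            chunks.append(separator.join(current))
            # Drop leading pieces until the carried-over tail fits the
            # overlap budget and leaves room for the incoming piece.
            while current and (
                joined_len(current) > chunk_overlap
                or joined_len(current + [piece]) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


if __name__ == "__main__":
    demo = merge_pieces(
        ["aaaa", "bbbb", "cccc", "dddd"],
        chunk_size=10,
        chunk_overlap=4,
        separator=" ",
    )
    print(demo)
```

With a non-zero overlap, the last piece of each chunk reappears at the start of the next one, which preserves context across chunk boundaries at the cost of some duplicated text.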

4. Property keep_separator

Title: Keep Separator

Type: boolean
Required: No
Default: false

Description: Whether to keep the separator in the chunks

5. Property strip_whitespace

Title: Strip Whitespace

Type: boolean
Required: No
Default: true

Description: If True, strips whitespace from the start and end of every document

6. Property separator

Title: Separator

Type: string
Required: No
Default: "\n\n"

Description: Separator to split on

7. Property pipeline

Title: Pipeline

Type: string
Required: No
Default: "en_core_web_sm"

Description: spaCy pipeline to use

8. Property max_length

Title: Max Length

Type: integer
Required: No
Default: 1000000

Description: Maximum number of characters to process
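As a rough sketch of how a configuration matching this schema might be checked and completed with the documented defaults, consider the following. The names `DEFAULTS` and `resolve_config` are hypothetical, and rejecting unknown properties is an assumption; the schema above does not say whether extra properties are allowed.

```python
# Defaults copied from the property descriptions above.
DEFAULTS = {
    "chunk_size": 4000,
    "chunk_overlap": 200,
    "keep_separator": False,
    "strip_whitespace": True,
    "separator": "\n\n",
    "pipeline": "en_core_web_sm",
    "max_length": 1_000_000,
}


def resolve_config(config):
    """Validate a config dict and fill in defaults (hypothetical helper)."""
    # "implementation" is the only required property and must be this constant.
    if config.get("implementation") != "SpacyTextSplitter":
        raise ValueError('implementation must be "SpacyTextSplitter"')

    resolved = dict(DEFAULTS)
    for key, value in config.items():
        if key == "implementation":
            continue
        if key not in DEFAULTS:
            # Assumption: unknown properties are treated as errors.
            raise ValueError(f"unknown property: {key}")
        expected = type(DEFAULTS[key])
        # bool is a subclass of int in Python, so reject it for integer fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            raise ValueError(f"{key} must be of type {expected.__name__}")
        resolved[key] = value
    return resolved


if __name__ == "__main__":
    print(resolve_config({"implementation": "SpacyTextSplitter", "chunk_size": 1000}))
```

The resolved dictionary contains every optional property, so downstream code can read values such as `chunk_overlap` without checking whether the user supplied them.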