MarkdownHeaderTextSplitter

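| Property | Pattern | Type | Deprecated | Definition | Title/Description |
| --- | --- | --- | --- | --- | --- |
| + implementation | No | const | No | - | Implementation |
| - chunk_size | No | integer | No | - | Chunk Size |
| - chunk_overlap | No | integer | No | - | Chunk Overlap |
| - keep_separator | No | boolean | No | - | Keep Separator |
| - strip_whitespace | No | boolean | No | - | Strip Whitespace |
| + headers_to_split_on | No | array of array | No | - | Headers To Split On |
| - return_each_line | No | boolean | No | - | Return Each Line |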

1. Property implementation

Title: Implementation

Type: const
Required: Yes

Specific value: "MarkdownHeaderTextSplitter"

2. Property chunk_size

Title: Chunk Size

Type: integer
Required: No
Default: 4000

Description: Maximum size of chunks to return

3. Property chunk_overlap

Title: Chunk Overlap

Type: integer
Required: No
Default: 200

Description: Overlap in characters between chunks
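
To make the relationship between chunk_size and chunk_overlap concrete, here is a minimal sketch that cuts a string into fixed-size character windows, each sharing chunk_overlap characters with the previous one. The function name `sliding_chunks` is hypothetical, and this is only an illustration of the two parameters, not the splitter's actual algorithm, which may also respect separators and headers.

```python
def sliding_chunks(text: str, chunk_size: int = 4000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into character windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. Illustrative only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each window starts chunk_size - chunk_overlap characters after the previous one, so an overlap equal to or larger than the chunk size would never advance; the sketch rejects that case.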

4. Property keep_separator

Title: Keep Separator

Type: boolean
Required: No
Default: false

Description: Whether to keep the separator in the chunks

5. Property strip_whitespace

Title: Strip Whitespace

Type: boolean
Required: No
Default: true

Description: If True, strips whitespace from the start and end of every document

6. Property headers_to_split_on

Title: Headers To Split On

Type: array of array
Required: Yes

Description: Headers we want to track, e.g., #, ##, etc.

Array restrictions
Min items: N/A
Max items: N/A
Items unicity: False
Additional items: False
Tuple validation: See below

Each item of this array must be: headers_to_split_on items (no description)

6.1. headers_to_split_on items

Type: array
Required: No

Array restrictions
Min items: 2
Max items: 2
Items unicity: False
Additional items: False
Tuple validation: See below

Each item of this array must be, in order:
headers_to_split_on items item 0 (no description)
headers_to_split_on items item 1 (no description)

6.1.1. headers_to_split_on items item 0

Type: string
Required: No

6.1.2. headers_to_split_on items item 1

Type: string
Required: No
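
The shape described above (an array whose items are arrays of exactly two strings) can be checked with a few lines of code. The sketch below is a hand-written check, not a full JSON Schema validator, and the helper name `check_headers_to_split_on` is hypothetical. The example values assume that the first string is the Markdown header marker and the second is a label for that header level, as in the LangChain splitter of the same name; the schema itself does not state what the two strings mean.

```python
def check_headers_to_split_on(value) -> list[str]:
    """Return a list of problems; an empty list means the shape matches."""
    if not isinstance(value, list):
        return ["headers_to_split_on must be an array"]

    problems = []
    for i, item in enumerate(value):
        if not isinstance(item, (list, tuple)) or len(item) != 2:
            problems.append(f"item {i}: expected an array of exactly 2 entries")
        elif not all(isinstance(part, str) for part in item):
            problems.append(f"item {i}: both entries must be strings")
    return problems


# Example configuration; the labels "Header 1" and "Header 2" are illustrative.
config = {
    "implementation": "MarkdownHeaderTextSplitter",
    "headers_to_split_on": [["#", "Header 1"], ["##", "Header 2"]],
    "return_each_line": False,
}

print(check_headers_to_split_on(config["headers_to_split_on"]))  # []
```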

7. Property return_each_line

Title: Return Each Line

Type: boolean
Required: No
Default: false

Description: Return each line with its associated headers
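
As a rough illustration of what return_each_line changes, the sketch below tracks the most recent header at each configured level and attaches those headers to the content that follows. It is a simplified stand-in written for this page, not the library's implementation: the function name `split_by_headers` and the output dictionaries are hypothetical, header pairs are assumed to be (marker, label), and fenced code blocks are not handled.

```python
def split_by_headers(markdown, headers_to_split_on, return_each_line=False):
    """Group lines under their enclosing headers (simplified sketch)."""
    active = {}   # label -> text of the most recent header at that level
    buffer = []   # content lines collected under the current headers
    results = []

    def flush():
        # Emit the buffered lines as one chunk with a copy of the headers.
        if buffer:
            results.append({"content": "\n".join(buffer), "metadata": dict(active)})
            buffer.clear()

    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = next(
            ((m, label) for m, label in headers_to_split_on if stripped.startswith(m + " ")),
            None,
        )
        if match:
            flush()
            marker, label = match
            # A new header closes any header at the same or a deeper level.
            for m, other in headers_to_split_on:
                if len(m) >= len(marker):
                    active.pop(other, None)
            active[label] = stripped[len(marker):].strip()
        elif return_each_line:
            results.append({"content": stripped, "metadata": dict(active)})
        else:
            buffer.append(stripped)
    flush()
    return results


doc = "# Intro\nhello\nworld\n## Details\nmore"
pairs = [("#", "Header 1"), ("##", "Header 2")]

grouped = split_by_headers(doc, pairs)
per_line = split_by_headers(doc, pairs, return_each_line=True)
print(len(grouped), len(per_line))  # 2 3
```

With the default (false), lines under the same headers are merged into one chunk; with return_each_line set to true, every non-empty content line becomes its own entry carrying the same header metadata.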