Automatic

CSV File Splitting

Handle massive spreadsheets with automatic row-by-row splitting for precise search results

What is CSV File Splitting?

When you upload a large CSV or Excel file, George AI automatically splits it into individual markdown files, one per row. A search then returns the exact row you're looking for instead of the entire file.

No configuration needed. Upload a 100,000-row product catalog, and George AI handles everything automatically.

Maximum Tested

732K Rows

Largest successfully processed file

Memory Usage

~1 KB

Constant, regardless of file size

Search Precision

1 Row

Each search result = 1 record

How It Works

Upload CSV: Any size file
Auto-Split: One file per row
Embed: Vector search enabled
Search: Find exact rows

Example: Product Catalog (10,000 Products)

Input:

SKU,Name,Price,Stock
P-001,Widget A,29.99,150
P-002,Widget B,39.99,75
...
P-10000,Widget Z,19.99,200

Output (File Structure):

products.md (summary)
parts/
  0/ (rows 1-100)
    1.md, 2.md, ..., 100.md
  1/ (rows 101-200)
  2/ (rows 201-300)
  ...

Bucketed Storage: Files organized into folders of 100 for efficient access

Benefits

Semantic Search Precision

Each row becomes one semantic chunk. When you search for "red t-shirt size M", you get that exact product row, not a 50,000-row file.

Memory Efficiency

A streaming architecture processes files with constant ~1 KB memory usage, regardless of file size, so files with 700K+ rows are handled without performance degradation.
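As a rough illustration of why streaming keeps memory flat, the sketch below reads a CSV one record at a time with Python's standard `csv` module. It is not George AI's actual implementation; the function name `iter_rows` and the sample data are assumptions made for this example.

```python
import csv
import io
from typing import Iterator, TextIO


def iter_rows(source: TextIO) -> Iterator[tuple[int, dict[str, str]]]:
    """Yield (row_number, record) pairs lazily, numbering rows from 1.

    Only the current record is held in memory, so memory use does not
    grow with the number of rows in the file.
    """
    reader = csv.DictReader(source)  # first line is used as the header
    for row_number, record in enumerate(reader, start=1):
        yield row_number, record


# Hypothetical sample input, mirroring the product catalog example.
sample = io.StringIO(
    "SKU,Name,Price,Stock\n"
    "P-001,Widget A,29.99,150\n"
    "P-002,Widget B,39.99,75\n"
)

for row_number, record in iter_rows(sample):
    # Each record could be written out as its own file at this point.
    print(row_number, record["SKU"])
```

Because the generator never builds a list of all rows, the same loop works whether the file has ten rows or several hundred thousand.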

Fast Pagination

Bucketed storage (100 files per directory) enables fast UI navigation. Metadata caching makes browsing split files instant.

Enrichment-Ready

Each row is a list item. Add enrichment fields to extract additional data (e.g., "Product Category" from description). Perfect for data cleaning.

Viewing Split Files

Markdown File Selector

After upload, navigate to the file in your library:

Open the library containing your CSV file
Click on the file name
Use the Markdown File Selector dropdown to choose which row to view
The dropdown shows "Summary", "Row 1", "Row 2", etc.

Pagination Controls

For files with many rows, pagination controls appear automatically:

Navigate between rows using previous/next buttons
Jump to specific row numbers
View summary file to see total row count and column names

Configuration

Automatic - No configuration needed!

CSV file splitting is enabled by default for all libraries. Just upload and go.

Advanced: Library Settings

For power users, the behavior is controlled by a setting in the library configuration:

| Setting | Default Value | Description |
| --- | --- | --- |
| `splitByCsvRows` | Enabled | Automatically split CSV/Excel files by rows |

Technical Details

File Storage Structure

Storage Layout:

/storage/libraries/{libraryId}/files/{fileId}/
  main.md                  # Summary file
  parts/0/1.md             # Row 1
  parts/0/2.md             # Row 2
  ...
  parts/0/100.md           # Row 100
  parts/1/101.md           # Row 101 (new bucket)
  ...
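The bucket for a given row follows from the layout above: rows 1-100 land in bucket 0, rows 101-200 in bucket 1, and so on. A minimal sketch of that mapping, assuming 1-based row numbers and a bucket size of 100 (the helper name `part_path` is hypothetical, not part of George AI):

```python
from pathlib import PurePosixPath

BUCKET_SIZE = 100  # files per bucket directory, as described in the layout


def part_path(row_number: int) -> PurePosixPath:
    """Return the path of a row's markdown file relative to the file folder."""
    if row_number < 1:
        raise ValueError("row numbers start at 1")
    # Integer division on the zero-based index selects the bucket.
    bucket = (row_number - 1) // BUCKET_SIZE
    return PurePosixPath("parts") / str(bucket) / f"{row_number}.md"


print(part_path(1))    # parts/0/1.md
print(part_path(100))  # parts/0/100.md
print(part_path(101))  # parts/1/101.md
```

Subtracting 1 before dividing is what keeps row 100 in bucket 0 rather than pushing it into bucket 1.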

Markdown Format per Row

Each row becomes a structured markdown file:

# Row 1

**SKU:** P-001

**Name:** Widget A

**Price:** 29.99

**Stock:** 150
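The format above can be produced with a few lines of code. The following is only a sketch of one way to render a row, assuming the record arrives as a column-to-value mapping; `row_to_markdown` is a hypothetical name used for illustration.

```python
def row_to_markdown(row_number: int, record: dict[str, str]) -> str:
    """Render one CSV record as a markdown document.

    The heading carries the row number; each column becomes a bold
    label followed by its value, separated by blank lines.
    """
    lines = [f"# Row {row_number}"]
    for column, value in record.items():
        lines.append(f"**{column}:** {value}")
    return "\n\n".join(lines) + "\n"


record = {"SKU": "P-001", "Name": "Widget A", "Price": "29.99", "Stock": "150"}
print(row_to_markdown(1, record))
```

Columns are emitted in the order they appear in the mapping, which for a CSV reader is the header order.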

Embedding Strategy

One Chunk per Row: Each markdown file = one semantic chunk
Batch Processing: Embeddings generated in parallel batches for speed
Part Number Tracking: Each embedding stores its row number for retrieval
Summary Embedding: Main file (column headers + stats) also embedded
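To make the batching and part number tracking concrete, here is a simplified sketch. It processes batches sequentially rather than in parallel, and the embedding call is a stand-in: `fake_embed_batch`, `embed_rows`, and the record shape are all assumptions for this example, not George AI's API.

```python
from typing import Callable, Iterable, Iterator

Vector = list[float]
Chunk = tuple[int, str]  # (part number, markdown text of that row)


def batched(items: Iterable[Chunk], size: int) -> Iterator[list[Chunk]]:
    """Group an iterable of chunks into lists of at most `size` items."""
    batch: list[Chunk] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter batch
        yield batch


def embed_rows(
    chunks: Iterable[Chunk],
    embed_batch: Callable[[list[str]], list[Vector]],
    batch_size: int = 64,
) -> list[dict]:
    """Embed one chunk per row and keep the part number with each vector."""
    records = []
    for batch in batched(chunks, batch_size):
        vectors = embed_batch([text for _, text in batch])
        for (part, _), vector in zip(batch, vectors):
            records.append({"part": part, "vector": vector})
    return records


def fake_embed_batch(texts: list[str]) -> list[Vector]:
    """Placeholder embedder: returns the text length as a 1-d vector."""
    return [[float(len(text))] for text in texts]


records = embed_rows([(1, "a"), (2, "bb"), (3, "ccc")], fake_embed_batch, batch_size=2)
print([r["part"] for r in records])  # [1, 2, 3]
```

Storing the part number next to each vector is what lets a search hit be traced back to a single row file.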

Common Use Cases

Product Catalogs

Upload supplier product lists (50,000+ products). Search finds exact SKUs. Enrich to extract missing data (category, brand). Export to an e-commerce platform via automations.

Inventory Lists

Process warehouse inventory spreadsheets. Search by location, product code, or description. Track stock levels across multiple warehouses.

Customer & Contact Lists

Import CRM exports. Search by name, company, or email. Enrich with additional data from web APIs. Clean and deduplicate records.

Transaction & Order Logs

Process order history CSVs (100K+ transactions). Search by order number, customer, or date. Analyze patterns with AI enrichments.