ETL Import Chunk

ETL Import Chunk stores raw extracted data in manageable segments. Each chunk contains a portion of the total data from an ETL Import Batch, stored in JSONL (JSON Lines) format for efficient processing.

Note: This DocType is system-generated during data extraction. Chunks cannot be created or modified manually.

Field Reference

Chunk Information

| Field        | Type | Description                                       |
| ------------ | ---- | ------------------------------------------------- |
| Name         | Auto | System-generated unique identifier                |
| Import Batch | Link | Parent ETL Import Batch this chunk belongs to     |
| Seq No       | Int  | Sequential number of this chunk within the batch  |
| Row Count    | Int  | Number of data records in this chunk              |
| Bytes        | Int  | Size of the raw data in bytes                     |
| Checksum     | Data | Data integrity verification hash                  |

Data Storage

| Field         | Type       | Description                                         |
| ------------- | ---------- | --------------------------------------------------- |
| Raw JSONL     | Long Text  | Raw extracted data in JSON Lines format             |
| Processed     | Check      | Whether this chunk has been processed by transforms |
| Error Message | Small Text | Any errors encountered during chunk creation        |

JSONL Format

Raw data is stored as JSONL (JSON Lines) - one JSON object per line:

{"customer_id": 1001, "name": "Acme Corp", "email": "contact@acme.com"}
{"customer_id": 1002, "name": "Beta LLC", "email": "info@beta.com"}
{"customer_id": 1003, "name": "Gamma Inc", "email": "sales@gamma.com"}

Chunk Processing

During transformation, the system:

  1. Reads each chunk sequentially by Seq No
  2. Processes each JSONL line as a separate record
  3. Applies field mapping and business logic
  4. Creates/updates target DocType records
  5. Logs results in ETL Transform Events
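
For intuition, the loop looks roughly like the sketch below. It uses standard Frappe ORM calls (frappe.get_all, frappe.get_doc, frappe.db.commit); the fieldnames are assumed from the field labels above, and apply_mapping/upsert_target are hypothetical stand-ins for the actual transform logic:

import json
import frappe

def apply_mapping(record: dict) -> dict:
    """Hypothetical stand-in for the configured field mapping and business logic."""
    return record

def upsert_target(mapped: dict) -> None:
    """Hypothetical stand-in that would create or update the target DocType record."""
    pass

def process_batch(batch_name: str) -> None:
    """Walk a batch's chunks in Seq No order and process each JSONL line."""
    chunks = frappe.get_all(
        "ETL Import Chunk",
        filters={"import_batch": batch_name, "processed": 0},
        fields=["name"],
        order_by="seq_no asc",
    )
    for row in chunks:
        chunk = frappe.get_doc("ETL Import Chunk", row.name)
        for line in chunk.raw_jsonl.splitlines():
            if not line.strip():
                continue
            record = json.loads(line)   # one record per JSONL line
            upsert_target(apply_mapping(record))
        chunk.db_set("processed", 1)    # mark the chunk done
        frappe.db.commit()              # commit once per chunk, not per record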

Performance Considerations

Chunk Size Optimization

  • Small chunks (100-500 records): Lower memory usage, more database commits
  • Large chunks (2000-5000 records): Higher memory usage, fewer commits
  • Recommended: 1000 records per chunk for most use cases
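
For intuition, splitting extracted records into chunks amounts to something like this hypothetical helper, using the recommended 1000-record default:

import json
from typing import Iterable, Iterator

def make_chunks(records: Iterable[dict], chunk_size: int = 1000) -> Iterator[str]:
    """Group extracted records into JSONL payloads of up to chunk_size rows."""
    buffer = []
    for record in records:
        buffer.append(json.dumps(record))
        if len(buffer) >= chunk_size:
            yield "\n".join(buffer)   # one chunk's Raw JSONL content
            buffer = []
    if buffer:
        yield "\n".join(buffer)       # final, possibly smaller, chunk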

Memory Management

  • Chunks are processed sequentially to minimize memory usage
  • Raw JSONL is loaded into memory only during processing
  • Database connections are properly closed after each chunk

Viewing Chunk Data

To examine raw extracted data:

  1. Navigate to ETL Import Batch
  2. Click "View Chunks" to see all chunks for the batch
  3. Open individual chunks to view Raw JSONL content
  4. Use browser search to find specific records within chunks
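
Chunks can also be inspected programmatically, for example from a bench console. The snippet below uses standard Frappe ORM calls; the batch name is a placeholder and the fieldnames are assumed from the field labels above:

import frappe

# List a batch's chunks in Seq No order ("BATCH-0001" is a placeholder name)
chunks = frappe.get_all(
    "ETL Import Chunk",
    filters={"import_batch": "BATCH-0001"},
    fields=["name", "seq_no", "row_count", "bytes", "processed"],
    order_by="seq_no asc",
)

# Peek at the first few lines of the first chunk's raw data
doc = frappe.get_doc("ETL Import Chunk", chunks[0].name)
for line in doc.raw_jsonl.splitlines()[:5]:
    print(line)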

Troubleshooting

Common Issues

  • Empty chunks: Check source query filters and data availability
  • Large chunk sizes: May cause memory issues during transformation
  • Malformed JSON: Usually indicates source data encoding problems
  • Processing errors: Review ETL Transform Event logs for details

Data Validation

Before transformation, verify that:

  • Chunk row counts match the expected data volume
  • The JSONL format is valid (each line is proper JSON)
  • Required fields are present in source records
  • Data types match expected formats
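
A quick pre-flight pass over a single chunk might look like the following sketch. The checksum comparison assumes a SHA-256 hex digest of the raw JSONL; the algorithm actually used by the Checksum field may differ:

import hashlib
import json

def validate_chunk(raw_jsonl: str, expected_rows: int, expected_checksum: str = "") -> None:
    """Check row count, per-line JSON validity, and (optionally) the checksum."""
    lines = [line for line in raw_jsonl.splitlines() if line.strip()]
    if len(lines) != expected_rows:
        raise ValueError(f"Expected {expected_rows} rows, found {len(lines)}")
    for i, line in enumerate(lines, start=1):
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Line {i} is not valid JSON: {exc}") from exc
    if expected_checksum:
        # Assumption: SHA-256 hex digest; the real Checksum field may use another algorithm
        digest = hashlib.sha256(raw_jsonl.encode("utf-8")).hexdigest()
        if digest != expected_checksum:
            raise ValueError("Checksum mismatch: possible data corruption")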

Related DocTypes

  • ETL Import Batch: Parent batch containing multiple chunks
  • ETL Transform Run: Processes chunks during transformation
  • ETL Transform Event: Logs individual chunk processing results