Import Chunk
ETL Import Chunk stores raw extracted data in manageable segments. Each chunk holds a portion of the total data from an ETL Import Batch, stored in JSONL (JSON Lines) format for efficient processing.
Note: Records of this DocType are generated by the system during data extraction. Chunks cannot be created or modified manually.
Field Reference
Chunk Information
| Field | Type | Description |
|---|---|---|
| Name | Auto | System-generated unique identifier |
| Import Batch | Link | Parent ETL Import Batch this chunk belongs to |
| Seq No | Int | Sequential number of this chunk within the batch |
| Row Count | Int | Number of data records in this chunk |
| Bytes | Int | Size of raw data in bytes |
| Checksum | Data | Data integrity verification hash |
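The Checksum field lets you confirm that a chunk's stored payload has not been altered between extraction and transformation. The exact hashing algorithm is not specified here; the sketch below assumes a SHA-256 digest of the Raw JSONL text, and the function name is illustrative:

```python
import hashlib

def compute_chunk_checksum(raw_jsonl: str) -> str:
    """Hex digest of the chunk's raw JSONL payload (assumes SHA-256)."""
    return hashlib.sha256(raw_jsonl.encode("utf-8")).hexdigest()

# Recompute and compare against the stored Checksum value
raw = '{"customer_id": 1001, "name": "Acme Corp"}\n'
print(compute_chunk_checksum(raw))
```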
Data Storage
| Field | Type | Description |
|---|---|---|
| Raw JSONL | Long Text | Raw extracted data in JSON Lines format |
| Processed | Check | Whether this chunk has been processed by transforms |
| Error Message | Small Text | Any errors encountered during chunk creation |
JSONL Format
Raw data is stored as JSONL (JSON Lines), with one JSON object per line:
{"customer_id": 1001, "name": "Acme Corp", "email": "contact@acme.com"}
{"customer_id": 1002, "name": "Beta LLC", "email": "info@beta.com"}
{"customer_id": 1003, "name": "Gamma Inc", "email": "sales@gamma.com"}
Chunk Processing
During transformation, the system (a simplified sketch follows this list):
- Reads each chunk sequentially by Seq No
- Processes each JSONL line as a separate record
- Applies field mapping and business logic
- Creates/updates target DocType records
- Logs results in ETL Transform Events
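The transform API itself is not documented on this page, so the following is only a hedged sketch of the per-chunk loop. It assumes a chunk object exposing `raw_jsonl` and `processed`, and uses `apply_mapping`, `upsert_target_record`, and `log_event` as hypothetical stand-ins for the real field-mapping, DocType-write, and ETL Transform Event logging logic:

```python
import json

def process_chunk(chunk, apply_mapping, upsert_target_record, log_event):
    """Process one chunk's JSONL payload (illustrative sketch only)."""
    results = {"created": 0, "updated": 0, "errors": 0}
    for line in chunk.raw_jsonl.splitlines():
        if not line.strip():
            continue                         # ignore blank lines
        try:
            record = json.loads(line)        # one JSONL line = one source record
            mapped = apply_mapping(record)   # field mapping and business logic
            created = upsert_target_record(mapped)
            results["created" if created else "updated"] += 1
        except Exception:
            results["errors"] += 1           # bad line; keep processing the rest
    chunk.processed = True
    log_event(chunk, results)                # per-chunk summary, like an ETL Transform Event
    return results
```

Chunks would then be iterated in ascending Seq No order, calling a function like this once per chunk.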
Performance Considerations
Chunk Size Optimization
- Small chunks (100-500 records): Lower memory usage, more database commits
- Large chunks (2000-5000 records): Higher memory usage, fewer commits
- Recommended: 1000 records per chunk for most use cases (see the splitting sketch below)
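To make the trade-off concrete, here is a minimal sketch of splitting an extracted record list into fixed-size chunks; the function name and defaults are illustrative, not the system's actual extraction code:

```python
import json

def split_into_chunks(records, chunk_size=1000):
    """Yield (seq_no, raw_jsonl, row_count, byte_size) per chunk (illustrative)."""
    for seq_no, start in enumerate(range(0, len(records), chunk_size), start=1):
        batch = records[start:start + chunk_size]
        raw_jsonl = "\n".join(json.dumps(r) for r in batch)
        yield seq_no, raw_jsonl, len(batch), len(raw_jsonl.encode("utf-8"))

# Example: 2500 records at the recommended size produce 3 chunks (1000 / 1000 / 500)
rows = [{"customer_id": i} for i in range(2500)]
for seq_no, _, row_count, size in split_into_chunks(rows):
    print(seq_no, row_count, size)
```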
Memory Management
- Chunks are processed sequentially to minimize memory usage (see the iteration sketch below)
- Raw JSONL is loaded into memory only during processing
- Database connections are properly closed after each chunk
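A minimal sketch of this pattern, assuming the Frappe framework and guessing the internal fieldnames (`import_batch`, `seq_no`) from the field labels above; loading chunks one at a time keeps only the current Raw JSONL payload in memory:

```python
import frappe

def iter_chunks(batch_name):
    """Yield a batch's chunks one at a time, in Seq No order (fieldnames assumed)."""
    names = frappe.get_all(
        "ETL Import Chunk",
        filters={"import_batch": batch_name},   # fieldname assumed from the "Import Batch" label
        order_by="seq_no asc",                  # fieldname assumed from the "Seq No" label
        pluck="name",
    )
    for name in names:
        chunk = frappe.get_doc("ETL Import Chunk", name)  # only this chunk's payload in memory
        yield chunk
        frappe.db.commit()  # commit after each chunk instead of holding one long transaction
```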
Viewing Chunk Data
To examine raw extracted data:
- Navigate to ETL Import Batch
- Click "View Chunks" to see all chunks for the batch
- Open individual chunks to view Raw JSONL content
- Use browser search to find specific records within chunks (or query them from a console, as sketched below)
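For ad-hoc inspection outside the UI, a chunk can also be read programmatically. A hedged console sketch, again assuming the Frappe framework, with a hypothetical chunk name and the `raw_jsonl` fieldname guessed from the field label:

```python
import json
import frappe

chunk = frappe.get_doc("ETL Import Chunk", "CHUNK-0001")  # hypothetical chunk name
for line in chunk.raw_jsonl.splitlines():
    record = json.loads(line)
    if record.get("customer_id") == 1002:   # find one specific source record
        print(record)
```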
Troubleshooting
Common Issues
- Empty chunks: Check source query filters and data availability
- Large chunk sizes: May cause memory issues during transformation
- Malformed JSON: Usually indicates source data encoding problems
- Processing errors: Review ETL Transform Event logs for details
Data Validation
Before transformation, verify the following (a validation sketch is shown below):
- Chunk row counts match the expected data volume
- JSONL format is valid (each line is proper JSON)
- Required fields are present in source records
- Data types match expected formats
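A minimal sketch covering the first three checks, assuming the chunk's Raw JSONL text and declared Row Count are available as plain values; the required field names are illustrative:

```python
import json

def validate_chunk(raw_jsonl, expected_row_count, required_fields=("customer_id", "name")):
    """Return a list of validation problems for one chunk (illustrative)."""
    problems = []
    lines = [line for line in raw_jsonl.splitlines() if line.strip()]
    if len(lines) != expected_row_count:
        problems.append(f"row count {len(lines)} != declared Row Count {expected_row_count}")
    for lineno, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: malformed JSON ({exc})")
            continue
        missing = [f for f in required_fields if f not in record]
        if missing:
            problems.append(f"line {lineno}: missing fields {missing}")
    return problems
```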
Related DocTypes
- ETL Import Batch: Parent batch containing multiple chunks
- ETL Transform Run: Processes chunks during transformation
- ETL Transform Event: Logs individual chunk processing results