Scheduled ETL Job

ETL Job provides automated scheduling for ETL processes. It combines an ETL Data Source and an ETL Transform Map into a scheduled job that runs extraction and transformation automatically.

Field Reference

Job Configuration

Field         Type   Required  Description
Display Name  Data   Yes       Human-readable name for this job
Application   Link   Yes       My Application this job belongs to
Active        Check  No        Enable/disable job execution (default: checked)

ETL Configuration

Field          Type  Required  Description
Data Source    Link  Yes       ETL Data Source to extract data from
Transform Map  Link  Yes       ETL Transform Map to apply during transformation

Scheduling

Field            Type    Required  Description
Event Frequency  Select  Yes       How often to run (All, Hourly, Daily, Weekly, Monthly, Yearly, Cron, Custom)
Time             Time    No        What time of day to run (default: 00:00:00)

Custom Frequency Fields

Available when Event Frequency = "Custom"

Field             Type    Description
Custom Frequency  Select  Daily, Weekly, Monthly, or Yearly
Weekday           Select  Day of week (for Weekly)
Day               Select  Day of month (for Monthly/Yearly)
Month             Select  Month (for Yearly)

Cron Schedule

Available when Event Frequency = "Cron"

Field        Type  Description
Cron Format  Data  Standard cron expression (e.g., "0 2 * * *" for daily at 2 AM)
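
Putting the field reference together, a job can also be created from code. The following is a sketch assuming a Frappe-style backend; the DocType name "ETL Job" and the snake_case field names are inferred from the labels above, not confirmed against the actual schema.

    import frappe

    # Illustrative only: field names are guesses derived from the field labels.
    job = frappe.get_doc({
        "doctype": "ETL Job",
        "display_name": "Daily Customer Sync",
        "data_source": "Customer Database Extract",
        "transform_map": "Customer Import Mapping",
        "event_frequency": "Daily",
        "time": "02:00:00",
        "active": 1,
    })
    job.insert()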

How ETL Jobs Work

When you save an ETL Job, the system automatically:

  1. Creates a Server Script with the scheduler event type (a sketch follows this list)
  2. Generates execution code that calls the ETL pipeline:

    mantera.run_etl_job("data_source_name", "transform_map_name")

  3. Configures scheduling based on the Event Frequency settings
  4. Manages the Server Script lifecycle (updating or deleting the script when the job changes)
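
For illustration, a daily job's generated artifact is roughly equivalent to creating this Server Script by hand. This is a sketch assuming Frappe's standard Server Script DocType; the exact document the system generates may differ.

    import frappe

    # Roughly what the auto-generated Server Script looks like as a document.
    script = frappe.get_doc({
        "doctype": "Server Script",
        "name": "Daily Customer Sync",   # mirrors the job's Display Name
        "script_type": "Scheduler Event",
        "event_frequency": "Daily",      # from the job's Event Frequency
        "disabled": 0,                   # follows the job's Active checkbox
        "script": 'mantera.run_etl_job("data_source_name", "transform_map_name")',
    })
    script.insert()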

ETL Pipeline Execution

Each scheduled execution (sketched in pseudocode after this list):

  1. Extracts data from the configured Data Source
  2. Creates Import Batch with extracted data in chunks
  3. Applies Transform Map to create/update target records
  4. Logs results in Transform Run with detailed events
  5. Runs in background to avoid blocking the system
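
In pseudocode, one execution looks roughly like this. Every helper below is hypothetical and exists only to mirror the five steps above; none of it is the actual API.

    # Hypothetical helpers throughout; this only illustrates the flow.
    def run_etl_job(data_source_name, transform_map_name):
        source = get_data_source(data_source_name)            # 1. extract
        batch = create_import_batch(source, chunk_size=1000)  # 2. chunked Import Batch
        run = start_transform_run(transform_map_name, batch)
        for chunk in batch.chunks:
            apply_transform_map(run, chunk)                   # 3. create/update targets
            log_transform_events(run, chunk)                  # 4. detailed event log
        finalize_transform_run(run)                           # 5. all of this runs in a
                                                              #    background worker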

Actions

Run ETL Job

Manually triggers job execution for testing purposes. The job runs in the background queue with a 1-hour timeout.
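
Behind the scenes this is equivalent to enqueueing the pipeline entry point in a background worker, roughly as follows. This is a sketch; the dotted method path is an assumption.

    import frappe

    # Manual trigger sketch: run in the background queue with a 1-hour timeout.
    # The dotted path to run_etl_job is an assumption, not a confirmed API.
    frappe.enqueue(
        "mantera.run_etl_job",
        queue="long",
        timeout=3600,  # 1 hour, matching the action described above
        data_source_name="data_source_name",
        transform_map_name="transform_map_name",
    )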

Frequency Options

Standard Frequencies

  • All: Runs every minute (use with caution)
  • Hourly: Every hour at minute 0
  • Daily: Once per day at specified time
  • Weekly: Once per week on Sunday at specified time
  • Monthly: Once per month on the 1st at specified time
  • Yearly: Once per year on January 1st at specified time

Long-Running Variants

  • Hourly Long, Daily Long, etc.: Same schedules as the standard frequencies, but with an extended timeout for large datasets

Custom Scheduling

Provides more granular control:

  • Daily: Specify exact time
  • Weekly: Choose day of week and time
  • Monthly: Choose day of month and time
  • Yearly: Choose month, day, and time

Cron Expressions

Full flexibility using standard cron syntax:

*  *  *  *  *
┬  ┬  ┬  ┬  ┬
│  │  │  │  └ day of week (0-6) (0 is Sunday)
│  │  │  └──── month (1-12)
│  │  └─────── day of month (1-31)
│  └────────── hour (0-23)
└───────────── minute (0-59)
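
To sanity-check an expression before saving, you can preview its next run times with the croniter library. This is a hedged example; croniter is a common cron parser for Python, not necessarily what this system uses internally.

    from datetime import datetime
    from croniter import croniter

    # Preview the next three runs of "0 9,13,17 * * 1-5"
    # (9 AM, 1 PM, and 5 PM on weekdays).
    schedule = croniter("0 9,13,17 * * 1-5", datetime(2024, 1, 1))
    for _ in range(3):
        print(schedule.get_next(datetime))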

Example Configurations

Daily Customer Import

Display Name: Daily Customer Sync
Data Source: Customer Database Extract  
Transform Map: Customer Import Mapping
Event Frequency: Daily
Time: 02:00:00

Weekly Sales Report

Display Name: Weekly Sales Data
Data Source: Sales API Endpoint
Transform Map: Sales Record Transform
Event Frequency: Custom
Custom Frequency: Weekly
Weekday: Monday
Time: 08:00:00

Complex Cron Schedule

Display Name: Business Hours API Sync
Data Source: External CRM API
Transform Map: Contact Data Transform
Event Frequency: Cron
Cron Format: 0 9,13,17 * * 1-5
(Runs at 9 AM, 1 PM, and 5 PM on weekdays)

Monitoring Jobs

Job Status

  • Active jobs appear in the Server Script list with matching names
  • Inactive jobs have disabled Server Scripts
  • Job execution logs appear in Scheduled Job Log DocType

Troubleshooting

  • Check Error Log for background job failures
  • Review ETL Transform Run records for processing details
  • Monitor ETL Import Batch status for extraction issues
  • Verify Data Connection is active and accessible
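
From a bench console, recent failures and runs can be pulled like this. This is a sketch: Error Log is a standard DocType, but the "status" field on ETL Transform Run is an assumption.

    import frappe

    # Ten most recent background-job errors.
    errors = frappe.get_all(
        "Error Log",
        fields=["method", "creation"],
        order_by="creation desc",
        limit=10,
    )

    # Recent failed transform runs; the "status" field name is an assumption.
    failed_runs = frappe.get_all(
        "ETL Transform Run",
        filters={"status": "Failed"},
        order_by="creation desc",
        limit=10,
    )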

Best Practices

Scheduling

  • Avoid overlapping executions for the same data source
  • Schedule during low-traffic hours for better performance
  • Use timeouts appropriate to the data volume (e.g., the Long frequency variants for large datasets)
  • Consider time zones when setting execution times

Error Handling

  • Monitor job execution logs regularly
  • Set up email notifications for critical failures
  • Test with small datasets before full production runs
  • Have rollback procedures for data quality issues

Performance

  • Optimize chunk sizes based on data volume
  • Index coalesce fields in target DocTypes
  • Use incremental extraction where possible
  • Archive old Import Batches and Transform Runs periodically
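
Archiving can be as simple as a periodic cleanup script. The sketch below deletes old batches; the retention period and permissions handling are up to you.

    import frappe
    from frappe.utils import add_days, now_datetime

    # Delete Import Batches older than 90 days; "creation" is Frappe's
    # standard timestamp column. Apply the same pattern to ETL Transform Run.
    cutoff = add_days(now_datetime(), -90)
    for name in frappe.get_all(
        "ETL Import Batch",
        filters={"creation": ["<", cutoff]},
        pluck="name",
    ):
        frappe.delete_doc("ETL Import Batch", name, ignore_permissions=True)
    frappe.db.commit()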

Security Considerations

  • ETL Jobs inherit permissions from the Server Script system
  • Background execution runs with system-level privileges
  • Ensure proper data validation in Transform Maps
  • Monitor for unauthorized schedule changes
  • Use SSL connections for all external data sources

Related DocTypes

  • ETL Data Source: Defines what data to extract
  • ETL Transform Map: Defines how to transform and load data
  • Server Script: Automatically generated scheduler script
  • ETL Import Batch: Created during each job execution
  • ETL Transform Run: Logs each transformation operation