Salesforce Data Loader: Complete Setup and Usage Guide

Salesforce Data Loader is the standard tool for bulk data operations: inserting, updating, upserting, deleting, and exporting up to 5 million records per run. It is free to download from Setup in any org with API access, yet it trips up new users on a handful of configuration details. This guide covers them all.

Installation

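Download from your Salesforce org: Setup → Integrations → Data Loader. The download gives you the version certified for your current Salesforce release. Data Loader needs a Java runtime: Java 11 or later, and recent releases require Java 17, so check the release notes for your version. Neither macOS nor Windows ships with Java, so install a JDK (Salesforce has recommended Azul Zulu OpenJDK) before running the installer.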

For production migrations, use the command-line mode (covered at the end of this guide) — it's scriptable, loggable, and doesn't require a UI session.

Operations Overview

Operation	When to Use
Insert	New records only; Salesforce generates the ID
Update	Existing records; requires Salesforce ID in CSV
Upsert	Insert new + update existing using an External ID field
Delete	Moves records to Recycle Bin; requires Salesforce ID
Hard Delete	Bypasses Recycle Bin; irreversible (Bulk API only; requires the "Bulk API Hard Delete" permission)
Export / Export All	SOQL query to CSV; Export All includes soft-deleted records

The Upsert + External ID Pattern

This is the most important pattern for migrations. Create a text field on each object (e.g., Legacy_Id__c) and mark it as an External ID, ideally also Unique. Load your data using Upsert, matching on Legacy_Id__c. Run the same file twice: the first run inserts, and the second run updates the same records in place instead of creating duplicates. This makes migrations idempotent and re-runnable.
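The pattern only works if every row carries exactly one non-blank external ID, so it can help to check the CSV before loading it. The following is a minimal sketch, not part of Data Loader itself; the function name check_external_ids and the sample data are made up for illustration, and the column name Legacy_Id__c is taken from the example above.

```python
import csv
import io
from collections import Counter


def check_external_ids(csv_text, id_column="Legacy_Id__c"):
    """Return (blank_row_numbers, duplicated_ids) for the external ID column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    blanks = []
    counts = Counter()
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        value = (row.get(id_column) or "").strip()
        if not value:
            blanks.append(row_number)
        else:
            counts[value] += 1
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return blanks, duplicates


# Hypothetical sample file: one repeated ID (A-1) and one blank ID.
sample = (
    "Legacy_Id__c,Name\n"
    "A-1,Acme\n"
    "A-2,Globex\n"
    "A-1,Acme Corp\n"
    ",Initech\n"
)
blanks, duplicates = check_external_ids(sample)
print("blank rows:", blanks)
print("duplicate IDs:", duplicates)
```

Rows flagged here are worth fixing in the source data before the upsert, since a blank or repeated external ID cannot be matched to a single record.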

Batch Size Settings

In Settings → Settings, configure the batch size. Defaults:

Insert/Update/Upsert: 200 (SOAP API, which is also its maximum) or 2,000 (Bulk API, maximum 10,000)
Query request size: 500 (maximum 2,000)

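For most migrations, use the Bulk API with a batch size of 2,000. If you enable Bulk API 2.0, Salesforce splits the job into batches itself, so the batch size setting is not used. For objects with complex triggers, reduce the batch size to 200 or enable serial mode for the Bulk API to avoid record-lock errors.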
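To get a feel for what the batch size means in practice, the arithmetic can be sketched as below. This is an illustrative calculation only: batch_count is a hypothetical helper, and the 1 million record volume is just an example figure.

```python
def batch_count(record_count, batch_size):
    """Number of batches needed to send record_count rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division: a partial final batch still counts as one.
    return (record_count + batch_size - 1) // batch_size


records = 1_000_000
for size in (200, 2000):
    print(f"batch size {size}: {batch_count(records, size)} batches")
```

Smaller batches mean more round trips but fewer records competing for the same locks at once, which is the trade-off behind dropping to 200 for trigger-heavy objects.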

Mapping Fields

After selecting the object and CSV file, Data Loader shows a column-mapping screen. Map each CSV column to the target Salesforce field. Save the mapping as an .sdl file — you'll reuse it when the migration needs to run again.
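An .sdl file is a plain-text list of CSVColumn=SalesforceField pairs, so it can also be generated from a script when mappings are kept under version control. The sketch below assumes column names without spaces or special characters (which would need properties-style escaping); render_sdl and the column names are hypothetical.

```python
def render_sdl(mapping):
    """Render CSV-column -> Salesforce-field pairs as .sdl text."""
    lines = ["#Mapping values"]
    for csv_column, salesforce_field in mapping.items():
        lines.append(f"{csv_column}={salesforce_field}")
    return "\n".join(lines) + "\n"


# Hypothetical mapping for an Account load.
account_mapping = {
    "LEGACY_ID": "Legacy_Id__c",
    "NAME": "Name",
    "INDUSTRY": "Industry",
}
sdl_text = render_sdl(account_mapping)
print(sdl_text)
```

Comparing a generated file against one saved from the Data Loader mapping screen is a quick way to confirm the format for your version.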

For lookup relationships, the CSV column should contain either the Salesforce ID of the related record OR an External ID if you're using External ID matching. You cannot use a Name field as a relationship reference in Data Loader.

Reading the Success and Error Files

Every run produces two CSVs, a success file and an error file; Data Loader adds a timestamp to each file name. Success rows include the record ID, and error rows include the Salesforce error message. Common errors and their fixes:

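REQUIRED_FIELD_MISSING: a required field is null or missing from the CSV
FIELD_CUSTOM_VALIDATION_EXCEPTION: a validation rule blocked the record; fix the data, or temporarily deactivate the rule for the load
DUPLICATE_VALUE: a unique field (often the External ID) already holds this value on another record, or the same External ID appears twice in the file; deduplicate the file or switch to Upsert. Duplicate rules raise DUPLICATES_DETECTED instead.
INVALID_CROSS_REFERENCE_KEY: a lookup ID doesn't exist in the org or isn't valid for that field; load parent records first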
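On a large load it is usually quicker to count errors by type than to read the error file row by row. The sketch below assumes the message sits in a column named ERROR and begins with the status code followed by a colon; both are assumptions to verify against your own error file, since the exact message format can differ between the SOAP and Bulk APIs. The function name summarize_errors and the sample rows are invented.

```python
import csv
import io
from collections import Counter


def summarize_errors(csv_text, error_column="ERROR"):
    """Count error rows by the text before the first colon in the message."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        message = (row.get(error_column) or "").strip()
        # Assumed format: "STATUS_CODE:human-readable detail".
        code = message.split(":", 1)[0] or "UNKNOWN"
        counts[code] += 1
    # Most frequent error first.
    return counts.most_common()


# Hypothetical error file contents.
sample_errors = (
    "Name,ERROR\n"
    "Acme,REQUIRED_FIELD_MISSING:Required fields are missing: [Industry]\n"
    "Globex,INVALID_CROSS_REFERENCE_KEY:invalid cross reference id\n"
    "Initech,REQUIRED_FIELD_MISSING:Required fields are missing: [Industry]\n"
)
for code, count in summarize_errors(sample_errors):
    print(code, count)
```

Fixing the most frequent error first and reloading only the error file keeps each retry small.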

Command-Line Mode

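For production migrations, script Data Loader with a process-conf.xml configuration file. Each process is defined there as a named bean, and process.name on the command line selects which one to run: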

java -cp dataloader.jar com.salesforce.dataloader.process.ProcessRunner \
  process.name=insertAccounts \
  sfdc.username=admin@yourorg.com \
  sfdc.password=password+securitytoken

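The inline password above is for illustration only. In production, generate a key file with the encrypt utility that ships with Data Loader, encrypt the password with that key, and put the encrypted value in sfdc.password with process.encryptionKeyFile pointing at the key file. Never pass plaintext passwords in production scripts.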
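Because command-line mode is scriptable, a small wrapper can run a named process and capture its output in a log file. This is a rough sketch under stated assumptions: the jar name and class come from the command above, credentials are assumed to live in the configuration files rather than on the command line, and build_command and run_process are hypothetical helpers.

```python
import subprocess


def build_command(process_name, jar_path="dataloader.jar", extra_args=()):
    """Build the java invocation for one named Data Loader process."""
    return [
        "java",
        "-cp",
        jar_path,
        "com.salesforce.dataloader.process.ProcessRunner",
        f"process.name={process_name}",
        *extra_args,
    ]


def run_process(process_name, log_path, jar_path="dataloader.jar"):
    """Run the process, appending stdout and stderr to log_path."""
    command = build_command(process_name, jar_path=jar_path)
    with open(log_path, "a", encoding="utf-8") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Show the command without running it.
print(" ".join(build_command("insertAccounts")))
```

The exit code alone does not say whether individual rows failed, so a wrapper like this should still be followed by a check of the error file.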

Sumit Kumar Singh

Independent Salesforce Consultant

Data Loader is my tool of choice for migrations up to 1 million records. Above that, I use Bulk API directly or a dedicated ETL platform.