Copy data from and to Salesforce - Azure Data Factory & Azure Synapse (2023)

APPLIES TO: Azure Data Factory, Azure Synapse Analytics

This article outlines how to use Copy Activity in Azure Data Factory and Azure Synapse pipelines to copy data from and to Salesforce. It builds on the Copy Activity overview article that presents a general overview of the copy activity.

Supported capabilities

This Salesforce connector is supported for the following capabilities:

Supported capabilities | IR
Copy activity (source/sink) | ① ②
Lookup activity | ① ②

① Azure integration runtime ② Self-hosted integration runtime

For a list of data stores that are supported as sources or sinks, see the Supported data stores table.

Specifically, this Salesforce connector supports:

  • Salesforce Developer, Professional, Enterprise, or Unlimited editions.
  • Copying data from and to Salesforce production, sandbox, and custom domain.

Note

This function supports copy of any schema from the above mentioned Salesforce environments, including the Nonprofit Success Pack (NPSP).

The Salesforce connector is built on top of the Salesforce REST/Bulk API. When copying data from Salesforce, the connector automatically chooses between the REST and Bulk APIs based on the data size: when the result set is large, the Bulk API is used for better performance. You can explicitly set the API version used to read and write data via the apiVersion property in the linked service. When copying data to Salesforce, the connector uses Bulk API v1.

Note

The connector no longer sets a default version for the Salesforce API. For backward compatibility, if a default API version was set before, it keeps working. In that case, the default value is 45.0 for source and 40.0 for sink.

Prerequisites

API permission must be enabled in Salesforce.

Salesforce request limits

Salesforce has limits for both total API requests and concurrent API requests. Note the following points:

  • If the number of concurrent requests exceeds the limit, throttling occurs and you see random failures.
  • If the total number of requests exceeds the limit, the Salesforce account is blocked for 24 hours.

You might also receive the "REQUEST_LIMIT_EXCEEDED" error message in both scenarios. For more information, see the "API request limits" section in Salesforce developer limits.

Get started

To perform the Copy activity with a pipeline, you can use one of the following tools or SDKs:

  • The Copy Data tool
  • The Azure portal
  • The .NET SDK
  • The Python SDK
  • Azure PowerShell
  • The REST API
  • The Azure Resource Manager template

Create a linked service to Salesforce using UI

Use the following steps to create a linked service to Salesforce in the Azure portal UI.

  1. Browse to the Manage tab in your Azure Data Factory or Synapse workspace and select Linked Services, then select New.

  2. Search for Salesforce and select the Salesforce connector.

  3. Configure the service details, test the connection, and create the new linked service.

Connector configuration details

The following sections provide details about properties that are used to define entities specific to the Salesforce connector.

Linked service properties

The following properties are supported for the Salesforce linked service.

Property | Description | Required
type | The type property must be set to Salesforce. | Yes
environmentUrl | Specify the URL of the Salesforce instance. Default is "https://login.salesforce.com". To copy data from sandbox, specify "https://test.salesforce.com". To copy data from a custom domain, specify, for example, "https://[domain].my.salesforce.com". | No
username | Specify a user name for the user account. | Yes
password | Specify a password for the user account. Mark this field as a SecureString to store it securely, or reference a secret stored in Azure Key Vault. | Yes
securityToken | Specify a security token for the user account. To learn about security tokens in general, see Security and the API. The security token can be skipped only if you add the Integration Runtime's IP to the trusted IP address list on Salesforce. When using Azure IR, refer to Azure Integration Runtime IP addresses. For instructions on how to get and reset a security token, see Get a security token. Mark this field as a SecureString to store it securely, or reference a secret stored in Azure Key Vault. | No
apiVersion | Specify the Salesforce REST/Bulk API version to use, e.g. 52.0. | No
connectVia | The integration runtime to be used to connect to the data store. If not specified, it uses the default Azure Integration Runtime. | No

Example: Store credentials

{
    "name": "SalesforceLinkedService",
    "properties": {
        "type": "Salesforce",
        "typeProperties": {
            "username": "<username>",
            "password": {
                "type": "SecureString",
                "value": "<password>"
            },
            "securityToken": {
                "type": "SecureString",
                "value": "<security token>"
            }
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

Example: Store credentials in Key Vault

{
    "name": "SalesforceLinkedService",
    "properties": {
        "type": "Salesforce",
        "typeProperties": {
            "username": "<username>",
            "password": {
                "type": "AzureKeyVaultSecret",
                "secretName": "<secret name of password in AKV>",
                "store": {
                    "referenceName": "<Azure Key Vault linked service>",
                    "type": "LinkedServiceReference"
                }
            },
            "securityToken": {
                "type": "AzureKeyVaultSecret",
                "secretName": "<secret name of security token in AKV>",
                "store": {
                    "referenceName": "<Azure Key Vault linked service>",
                    "type": "LinkedServiceReference"
                }
            }
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}
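
The linked service definitions above can also be assembled programmatically. The following Python sketch builds the same JSON shape from the documented properties. The function name build_salesforce_linked_service and its parameters are hypothetical helpers introduced for illustration and are not part of any SDK; passing apiVersion as a string is also an assumption.

```python
import json


def build_salesforce_linked_service(name, username, password, security_token=None,
                                    key_vault_linked_service=None,
                                    environment_url=None, api_version=None):
    """Assemble a Salesforce linked service definition (hypothetical helper).

    If key_vault_linked_service is given, password and security_token are
    treated as Azure Key Vault secret names; otherwise they are embedded as
    SecureString values.
    """
    def secret(value):
        if key_vault_linked_service is not None:
            return {
                "type": "AzureKeyVaultSecret",
                "secretName": value,
                "store": {
                    "referenceName": key_vault_linked_service,
                    "type": "LinkedServiceReference",
                },
            }
        return {"type": "SecureString", "value": value}

    type_properties = {"username": username, "password": secret(password)}
    # Optional properties are emitted only when a value is supplied.
    if security_token is not None:
        type_properties["securityToken"] = secret(security_token)
    if environment_url is not None:
        type_properties["environmentUrl"] = environment_url
    if api_version is not None:
        type_properties["apiVersion"] = api_version
    return {
        "name": name,
        "properties": {"type": "Salesforce", "typeProperties": type_properties},
    }


# Example: a sandbox connection whose secrets live in Key Vault.
sandbox_service = build_salesforce_linked_service(
    name="SalesforceLinkedService",
    username="<username>",
    password="<secret name of password in AKV>",
    security_token="<secret name of security token in AKV>",
    key_vault_linked_service="<Azure Key Vault linked service>",
    environment_url="https://test.salesforce.com",
    api_version="52.0",  # passed as a string here; this is an assumption
)
print(json.dumps(sandbox_service, indent=4))
```

Omitting key_vault_linked_service produces the SecureString variant instead, so one helper covers both examples.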

Dataset properties

For a full list of sections and properties available for defining datasets, see the Datasets article. This section provides a list of properties supported by the Salesforce dataset.

To copy data from and to Salesforce, set the type property of the dataset to SalesforceObject. The following properties are supported.

Property | Description | Required
type | The type property must be set to SalesforceObject. | Yes
objectApiName | The Salesforce object name to retrieve data from. | No for source, Yes for sink

Important

The "__c" part of API Name is needed for any custom object.

Example:

{
    "name": "SalesforceDataset",
    "properties": {
        "type": "SalesforceObject",
        "typeProperties": {
            "objectApiName": "MyTable__c"
        },
        "schema": [],
        "linkedServiceName": {
            "referenceName": "<Salesforce linked service name>",
            "type": "LinkedServiceReference"
        }
    }
}

Note

For backward compatibility: when you copy data from Salesforce, a dataset of the previous "RelationalTable" type keeps working, but you see a suggestion to switch to the new "SalesforceObject" type.

Property | Description | Required
type | The type property of the dataset must be set to RelationalTable. | Yes
tableName | Name of the table in Salesforce. | No (if "query" in the activity source is specified)

Copy activity properties

For a full list of sections and properties available for defining activities, see the Pipelines article. This section provides a list of properties supported by Salesforce source and sink.

Salesforce as a source type

To copy data from Salesforce, set the source type in the copy activity to SalesforceSource. The following properties are supported in the copy activity source section.

Property | Description | Required
type | The type property of the copy activity source must be set to SalesforceSource. | Yes
query | Use the custom query to read data. You can use a Salesforce Object Query Language (SOQL) query or a SQL-92 query. See more tips in the query tips section. If query is not specified, all the data of the Salesforce object specified in "objectApiName" in the dataset is retrieved. | No (if "objectApiName" in the dataset is specified)
readBehavior | Indicates whether to query the existing records, or query all records including the deleted ones. If not specified, the default behavior is the former. Allowed values: query (default), queryAll. | No

Important

The "__c" part of API Name is needed for any custom object.

Example:

"activities":[
    {
        "name": "CopyFromSalesforce",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<Salesforce input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "SalesforceSource",
                "query": "SELECT Col_Currency__c, Col_Date__c, Col_Email__c FROM AllDataType__c"
            },
            "sink": {
                "type": "<sink type>"
            }
        }
    }
]

Note

For backward compatibility: when you copy data from Salesforce, a copy source of the previous "RelationalSource" type keeps working, but you see a suggestion to switch to the new "SalesforceSource" type.

Note

Salesforce source doesn't support proxy settings in the self-hosted integration runtime, but sink does.

Salesforce as a sink type

To copy data to Salesforce, set the sink type in the copy activity to SalesforceSink. The following properties are supported in the copy activity sink section.

Property | Description | Required
type | The type property of the copy activity sink must be set to SalesforceSink. | Yes
writeBehavior | The write behavior for the operation. Allowed values are Insert and Upsert. | No (default is Insert)
externalIdFieldName | The name of the external ID field for the upsert operation. The specified field must be defined as "External ID Field" in the Salesforce object. It can't have NULL values in the corresponding input data. | Yes for "Upsert"
writeBatchSize | The row count of data written to Salesforce in each batch. | No (default is 5,000)
ignoreNullValues | Indicates whether to ignore NULL values from input data during a write operation. Allowed values are true and false. True: Leave the data in the destination object unchanged when you do an upsert or update operation. Insert a defined default value when you do an insert operation. False: Update the data in the destination object to NULL when you do an upsert or update operation. Insert a NULL value when you do an insert operation. | No (default is false)
maxConcurrentConnections | The upper limit of concurrent connections established to the data store during the activity run. Specify a value only when you want to limit concurrent connections. | No
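
The effect of ignoreNullValues on an upsert or update can be pictured with a small model. The Python sketch below is purely illustrative and is not connector code: apply_write is a hypothetical function, records are plain dictionaries, and only the upsert/update path described in the table is modeled (the insert-time default value is not).

```python
def apply_write(existing, incoming, ignore_null_values):
    """Model an upsert/update of one record under ignoreNullValues.

    existing: current field values in the destination object.
    incoming: field values from the input data; None stands for NULL.
    """
    result = dict(existing)
    for field, value in incoming.items():
        if value is None and ignore_null_values:
            # true: the destination value is left unchanged
            continue
        # false (or a non-NULL value): the destination is overwritten
        result[field] = value
    return result


existing = {"Name": "Contoso", "Phone": "555-0100"}
incoming = {"Name": "Contoso Ltd", "Phone": None}

kept = apply_write(existing, incoming, ignore_null_values=True)
cleared = apply_write(existing, incoming, ignore_null_values=False)
print(kept)     # Phone keeps its previous value
print(cleared)  # Phone is set to None (NULL)
```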

Example: Salesforce sink in a copy activity

"activities":[
    {
        "name": "CopyToSalesforce",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<Salesforce output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "<source type>"
            },
            "sink": {
                "type": "SalesforceSink",
                "writeBehavior": "Upsert",
                "externalIdFieldName": "CustomerId__c",
                "writeBatchSize": 10000,
                "ignoreNullValues": true
            }
        }
    }
]
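
A sink definition like the one above can be checked before deployment. The following Python sketch builds the sink section and enforces the documented rule that externalIdFieldName is required for Upsert. The names build_salesforce_sink and batch_count are hypothetical, and the batch estimate is plain arithmetic on writeBatchSize, not a statement about how the connector schedules requests.

```python
import math

ALLOWED_WRITE_BEHAVIORS = {"Insert", "Upsert"}


def build_salesforce_sink(write_behavior="Insert", external_id_field_name=None,
                          write_batch_size=5000, ignore_null_values=False):
    """Return the copy activity sink section as a dict (hypothetical helper)."""
    if write_behavior not in ALLOWED_WRITE_BEHAVIORS:
        raise ValueError("writeBehavior must be Insert or Upsert")
    if write_behavior == "Upsert" and not external_id_field_name:
        raise ValueError("externalIdFieldName is required for Upsert")
    sink = {
        "type": "SalesforceSink",
        "writeBehavior": write_behavior,
        "writeBatchSize": write_batch_size,
        "ignoreNullValues": ignore_null_values,
    }
    if external_id_field_name:
        sink["externalIdFieldName"] = external_id_field_name
    return sink


def batch_count(row_count, write_batch_size):
    """Estimate how many batches a given number of rows is split into."""
    return math.ceil(row_count / write_batch_size)


upsert_sink = build_salesforce_sink("Upsert", "CustomerId__c", 10000, True)
print(upsert_sink)
print(batch_count(25000, upsert_sink["writeBatchSize"]))  # 3
```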

Query tips

Retrieve data from a Salesforce report

You can retrieve data from Salesforce reports by specifying a query as {call "<report name>"}. An example is "query": "{call \"TestReport\"}".
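
The inner double quotes of the report call have to be escaped when the query is embedded in JSON. As a minimal sketch, the Python snippet below lets json.dumps do the escaping; report_query is a hypothetical helper, and it assumes the report name itself contains no double quotes.

```python
import json


def report_query(report_name):
    """Build the {call "<report name>"} query string for a Salesforce report."""
    return '{call "%s"}' % report_name


# json.dumps escapes the inner double quotes for the JSON definition.
source_fragment = json.dumps({"query": report_query("TestReport")})
print(source_fragment)  # {"query": "{call \"TestReport\"}"}
```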

Retrieve deleted records from the Salesforce Recycle Bin

To query the soft deleted records from the Salesforce Recycle Bin, you can specify readBehavior as queryAll.
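
As an illustration, the source section for such a copy might be assembled as follows. This is a sketch only: build_salesforce_source is a hypothetical helper, and the SOQL text is an example query rather than a required form.

```python
ALLOWED_READ_BEHAVIORS = {"query", "queryAll"}


def build_salesforce_source(query=None, read_behavior="query"):
    """Return the copy activity source section as a dict (hypothetical helper)."""
    if read_behavior not in ALLOWED_READ_BEHAVIORS:
        raise ValueError("readBehavior must be query or queryAll")
    source = {"type": "SalesforceSource", "readBehavior": read_behavior}
    # query is optional when objectApiName is set on the dataset.
    if query is not None:
        source["query"] = query
    return source


# queryAll includes soft-deleted records, which the query can then filter on.
deleted_source = build_salesforce_source(
    "SELECT Id, Name FROM Account WHERE IsDeleted = True",
    read_behavior="queryAll",
)
print(deleted_source)
```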

Difference between SOQL and SQL query syntax

When copying data from Salesforce, you can use either a SOQL query or a SQL query. These two have different syntax and functionality support, so do not mix them. The SOQL query is recommended because it is natively supported by Salesforce. The following table lists the main differences:

Syntax | SOQL Mode | SQL Mode
Column selection | Need to enumerate the fields to be copied in the query, e.g. SELECT field1, field2 FROM objectname | SELECT * is supported in addition to column selection.
Quotation marks | Field/object names cannot be quoted. | Field/object names can be quoted, e.g. SELECT "id" FROM "Account"
Datetime format | Refer to details here and samples in next section. | Refer to details here and samples in next section.
Boolean values | Represented as False and True, e.g. SELECT … WHERE IsDeleted=True. | Represented as 0 or 1, e.g. SELECT … WHERE IsDeleted=1.
Column renaming | Not supported. | Supported, e.g.: SELECT a AS b FROM ….
Relationship | Supported, e.g. Account_vod__r.nvs_Country__c. | Not supported.

Retrieve data by using a where clause on the DateTime column

When you specify the SOQL or SQL query, pay attention to the DateTime format difference. For example:

  • SOQL sample: SELECT Id, Name, BillingCity FROM Account WHERE LastModifiedDate >= @{formatDateTime(pipeline().parameters.StartTime,'yyyy-MM-ddTHH:mm:ssZ')} AND LastModifiedDate < @{formatDateTime(pipeline().parameters.EndTime,'yyyy-MM-ddTHH:mm:ssZ')}
  • SQL sample: SELECT * FROM Account WHERE LastModifiedDate >= {ts'@{formatDateTime(pipeline().parameters.StartTime,'yyyy-MM-dd HH:mm:ss')}'} AND LastModifiedDate < {ts'@{formatDateTime(pipeline().parameters.EndTime,'yyyy-MM-dd HH:mm:ss')}'}
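
To see what the two samples expand to at run time, the Python sketch below renders the same filters outside the pipeline expression language. The helper names soql_datetime and sql_timestamp are hypothetical, and the sketch assumes the window boundaries are expressed in UTC.

```python
from datetime import datetime, timezone


def soql_datetime(value):
    """Format as an unquoted SOQL dateTime literal, e.g. 2023-01-01T00:00:00Z."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def sql_timestamp(value):
    """Format as a SQL timestamp escape, e.g. {ts'2023-01-01 00:00:00'}."""
    text = value.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return "{ts'%s'}" % text


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 1, 2, tzinfo=timezone.utc)

soql = (
    "SELECT Id, Name, BillingCity FROM Account "
    f"WHERE LastModifiedDate >= {soql_datetime(start)} "
    f"AND LastModifiedDate < {soql_datetime(end)}"
)
sql = (
    "SELECT * FROM Account "
    f"WHERE LastModifiedDate >= {sql_timestamp(start)} "
    f"AND LastModifiedDate < {sql_timestamp(end)}"
)
print(soql)
print(sql)
```

Note that the SOQL literal is not quoted, while the SQL form wraps the value in the {ts'...'} escape.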

Error of MALFORMED_QUERY: Truncated

If you hit the "MALFORMED_QUERY: Truncated" error, it is normally because the data contains a JunctionIdList type column, and Salesforce has a limitation on supporting such data with a large number of rows. To mitigate, try to exclude the JunctionIdList column or limit the number of rows to copy (you can partition the copy into multiple copy activity runs).
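
One possible way to partition the copy is by time window, for example on LastModifiedDate, with one copy activity run per window. The Python sketch below only computes the window boundaries; split_windows is a hypothetical helper, and partitioning by date is an assumption made for illustration, not something the connector prescribes.

```python
from datetime import datetime


def split_windows(start, end, parts):
    """Split [start, end) into contiguous, equally sized half-open windows."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if end <= start:
        raise ValueError("end must be later than start")
    step = (end - start) / parts
    # The final boundary is set to end explicitly so the windows cover the range.
    bounds = [start + step * i for i in range(parts)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


windows = split_windows(datetime(2023, 1, 1), datetime(2023, 1, 5), 4)
for window_start, window_end in windows:
    # Each pair could feed the StartTime and EndTime parameters of one run.
    print(window_start.isoformat(), window_end.isoformat())
```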

Data type mapping for Salesforce

When you copy data from Salesforce, the following mappings are used from Salesforce data types to interim data types within the service internally. To learn about how the copy activity maps the source schema and data type to the sink, see Schema and data type mappings.

Salesforce data type | Service interim data type
Auto Number | String
Checkbox | Boolean
Currency | Decimal
Date | DateTime
Date/Time | DateTime
Email | String
ID | String
Lookup Relationship | String
Multi-Select Picklist | String
Number | Decimal
Percent | Decimal
Phone | String
Picklist | String
Text | String
Text Area | String
Text Area (Long) | String
Text Area (Rich) | String
Text (Encrypted) | String
URL | String

Note

The Salesforce Number type maps to the Decimal type in Azure Data Factory and Azure Synapse pipelines as a service interim data type. The Decimal type honors the defined precision and scale. For data whose decimal places exceed the defined scale, the value is rounded off in preview data and in the copy. To avoid such precision loss in Azure Data Factory and Azure Synapse pipelines, consider increasing the decimal places to a reasonably large value on the Custom Field Definition Edit page of Salesforce.
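
The precision loss can be reproduced with Python's decimal module. This is an illustration of rounding to a fixed scale, not the service's implementation: round_to_scale is a hypothetical helper, and the half-up rounding mode is an assumption, since the rounding mode used by the service is not stated here.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_scale(value, scale):
    """Round a decimal string to the given number of decimal places."""
    quantum = Decimal(1).scaleb(-scale)  # scale 2 gives a quantum of 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


# A field defined with 2 decimal places loses the trailing digits.
print(round_to_scale("12.3456", 2))  # 12.35
# Raising the scale to 4 preserves the full value.
print(round_to_scale("12.3456", 4))  # 12.3456
```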

Lookup activity properties

To learn details about the properties, check Lookup activity.

Next steps

For a list of data stores supported as sources and sinks by the copy activity, see Supported data stores.

Article information

Author: Annamae Dooley

Last Updated: 11/19/2022
