Pages

Friday, January 20, 2017

Choose Your Own Adventure - Dirty Dozen showdown with the REST API vs SOAP API vs BULK API

You're an external system to Salesforce. Stuff happened and now there are a dozen dirty records that need to be updated in Salesforce to reflect the changes. An active Salesforce Session ID (a.k.a access token) that can be used to make API calls with is available. All the records have the corresponding Salesforce Ids, so a direct update can be performed. Ignore for the moment that the records might also be deleted and in the recycle bin or fully deleted (really truly gone).

To further complicate matters, there is a quagmire of triggers, workflow, and validation on the objects in Salesforce. This is a subscriber org for a managed package, so you can't just fix those.

Which API do you use to update those records in Salesforce?
Pick a path:

  1. You use REST API PATCH requests to update records. Turn to page 666
  2. You use the REST API composite batch resource to update records. Turn to page 78
  3. You use the REST API composite tree resource to update the records. Turn to page √–1
  4. You use the SOAP API update() call. Turn to page 42
  5. You use the Bulk API to update them. Turn to page page 299792458
  6. You hand craft an Apex REST web service to do the processing. Turn to page 0

REST API PATCH requests

There are 12 records and the API will only allow you to PATCH one at a time. So that's 12 API calls.

You die a slow and painful death. GAME OVER

Try Again?

Postmortem:

Each request round trips to Salesforce, processes all the triggers,workflow, validation on each individual record, and returns the result. Individually each request is only a couple of seconds, but collectively they take way too long for the waiting user.

Request

POST /services/data/v38.0/sobjects/OpportunityLineItem/00k7000000eaaZBAAY HTTP/1.1
Host: na5.salesforce.com
Authorization: Bearer 00D700000000001!AQ0AQOzUlrjD_NotARealSession_x61fsbSS6GGWJ123456789mKjmhS0myiYYK_sW_zba
Content-Type: application/json

Request Body

{"End_Date__c": "2017-01-19"}

204 Response: Time (2,018 to 2,758 ms) multiplied by twelve records gives 24,216 to 33,096 ms

REST API Composite batch

You learnt your lesson with the individual REST API calls (or maybe you came straight here), so switch to a single composite batch call. This will give you one round trip to the server.

You die a (slightly less, but still very much) slow and painful death. GAME OVER

Try Again?

Postmortem:

You're down to one API request, which is good. But less than desirable things are happening in Salesforce. Each sub request in the batch is splitting into a separate transaction.

There is still a big penalty to pay for running the accumulation of triggers and other gunk one record at a time. The trigger bulkification can't help you is they are all separate transactions.

Also, don't forget that you can only do 25 records per batch. Not such a problem with 12 records, but it has limited scaling potential.

Request

POST /services/data/v38.0/composite/batch HTTP/1.1
Host: na5.salesforce.com
Authorization: Bearer 00D700000000001!AQ0AQOzUlrjD_StillNotARealSession_x61fsbSS6GGWJ123456789mKjmhS0myiYYK
Content-Type: application/json

Request Body

{
 "batchRequests": [{
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZBAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZCAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZDAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZEAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZFAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  },
                //...
  
 ]
}

Response: Time (20,053 ms)

{
    "hasErrors": false,
    "results": [
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        //...
    ]
}

Bonus

Look at the log duration for each sub request. They appear to be the accumulation of time for the entire API request rather than each individual sub transaction. It certainly confused me for a bit.

REST API Composite tree

Currently (as at Spring '17) it can work with up to 200 records, which is a good start. However, the composite tree resource is only for creating records, not updating them.

You die of embarrassment from trying to use an incompatible API. GAME OVER

Try Again?

Postmortem:

Always check the documentation first.

SOAP API update call

SOAP, are you sure? That API's been rattling around since 2004 in API v5.0.

Success, the records are all updated in a reasonable timeframe.

Try something else?

Review:

One POST request, and 4262 ms later you have a response. Processing time does increase with each record added, but nowhere near the overhead of the previous REST API's.

POST Request to https://na5.salesforce.com/services/Soap/u/38.0

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:partner.soap.sforce.com" xmlns:urn1="urn:sobject.partner.soap.sforce.com">
   <soapenv:Header>
      <urn:SessionHeader>
         <urn:sessionId>00D700000000001!AQ0AQOzUlrjD_SessionIdCleanedWithSoap_x61fsbSS6GGWJ123456789mKjmhS0myiYYK_sW_zba</urn:sessionId>
      </urn:SessionHeader>
   </soapenv:Header>
   <soapenv:Body>
      <urn:update>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZBAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZCAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZDAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZEAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <!-- ... -- >

      </urn:update>
   </soapenv:Body>
</soapenv:Envelope>

Response

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns="urn:partner.soap.sforce.com">
   <soapenv:Header>
      <LimitInfoHeader>
         <limitInfo>
            <current>465849</current>
            <limit>6700000</limit>
            <type>API REQUESTS</type>
         </limitInfo>
      </LimitInfoHeader>
   </soapenv:Header>
   <soapenv:Body>
      <updateResponse>
         <result>
            <id>00k7000000eaaZBAAY</id>
            <success>true</success>
         </result>
         <result>
            <id>00k7000000eaaZCAAY</id>
            <success>true</success>
         </result>
         <result>
            <id>00k7000000eaaZDAAY</id>
            <success>true</success>
         </result>
         <!-- ... -->
            
      </updateResponse>
   </soapenv:Body>
</soapenv:Envelope>

Bulk API

It's primarily billed as a way to asynchronously load large sets of data into Salesforce. Let's see how we go with only 12...

You have a harrowing brush with death by API ceremony. If the asynchronous gods favor you it is a timely update. Otherwise disgruntled users tear you limb from limb as they get fed up of waiting for the results to come back.

Try something else?

Results:

There are five API calls to be made to complete this operation on a good day. If things go bad then you might be waiting longer than expected. You need to keep polling the API for the job to complete before you can get the results back. You're also burning five API calls where you could be using one to complete the entire operation.

Create Job

Request

POST /services/async/38.0/job HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_NothingToSeeHere_x61fsbSS6GGWJ123456789mKjmhS0my
Content-Type: application/xml

Request Body

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <operation>update</operation>
    <object>OpportunityLineItem</object>
    <contentType>CSV</contentType>
</jobInfo>

Response Time (617 ms)

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo
    xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75070000003qVrHAAU</id>
    <operation>update</operation>
    <object>OpportunityLineItem</object>
    <createdById>00570000004uCVJAA2</createdById>
    <createdDate>2017-01-19T23:08:06.000Z</createdDate>
    <systemModstamp>2017-01-19T23:08:06.000Z</systemModstamp>
    <state>Open</state>
    <concurrencyMode>Parallel</concurrencyMode>
    <contentType>CSV</contentType>
    <numberBatchesQueued>0</numberBatchesQueued>
    <numberBatchesInProgress>0</numberBatchesInProgress>
    <numberBatchesCompleted>0</numberBatchesCompleted>
    <numberBatchesFailed>0</numberBatchesFailed>
    <numberBatchesTotal>0</numberBatchesTotal>
    <numberRecordsProcessed>0</numberRecordsProcessed>
    <numberRetries>0</numberRetries>
    <apiVersion>38.0</apiVersion>
    <numberRecordsFailed>0</numberRecordsFailed>
    <totalProcessingTime>0</totalProcessingTime>
    <apiActiveProcessingTime>0</apiActiveProcessingTime>
    <apexProcessingTime>0</apexProcessingTime>
</jobInfo>

Add a Batch to the Job

Request

POST /services/async/38.0/job/75070000003qVrHAAU/batch HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_HereIsSomeWorkToDo_x61fsbSS6GGWJ123456mKjmhS0myiYYK_sW_zba
Content-Type: text/csv

Request Body

Id,End_Date__c
"00k7000000eaaZBAAY","2017-01-19"
"00k7000000eaaZCAAY","2017-01-19"
"00k7000000eaaZDAAY","2017-01-19"
"00k7000000eaaZEAAY","2017-01-19"
"00k7000000eaaZFAAY","2017-01-19"
"00k7000000eaaYDAAY","2017-01-19"
"00k7000000eaaZQAAY","2017-01-19"
"00k7000000eaaZpAAI","2017-01-19"
"00k7000000eaaa4AAA","2017-01-19"
"00k7000000eaaZkAAI","2017-01-19"
"00k7000000eaaZlAAI","2017-01-19"
"00k7000000eaaXKAAY","2017-01-19"

Response time: 964 ms

<?xml version="1.0" encoding="UTF-8"?>
<batchInfo
   
    xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75170000005cAFMAA2</id>
    <jobId>75070000003qVrHAAU</jobId>
    <state>Queued</state>
    <createdDate>2017-01-19T23:15:21.000Z</createdDate>
    <systemModstamp>2017-01-19T23:15:21.000Z</systemModstamp>
    <numberRecordsProcessed>0</numberRecordsProcessed>
    <numberRecordsFailed>0</numberRecordsFailed>
    <totalProcessingTime>0</totalProcessingTime>
    <apiActiveProcessingTime>0</apiActiveProcessingTime>
    <apexProcessingTime>0</apexProcessingTime>
</batchInfo>

Close the Job

Request

POST /services/async/38.0/job/75070000003qVrHAAU HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_AnotherApiCall_ReallyQ_x61fsbSS6GGWJ56789mKjmhS0myiYYK_sW_zba
Content-Type: application/xml; charset-UTF-8

Request Body

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <state>Closed</state>
</jobInfo>

Response time: 1291 ms

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo
   
    xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75070000003qVrHAAU</id>
    <operation>update</operation>
    <object>OpportunityLineItem</object>
    <createdById>00570000004uCVJAA2</createdById>
    <createdDate>2017-01-19T23:08:06.000Z</createdDate>
    <systemModstamp>2017-01-19T23:08:06.000Z</systemModstamp>
    <state>Closed</state>
    <concurrencyMode>Parallel</concurrencyMode>
    <contentType>CSV</contentType>
    <numberBatchesQueued>0</numberBatchesQueued>
    <numberBatchesInProgress>0</numberBatchesInProgress>
    <numberBatchesCompleted>0</numberBatchesCompleted>
    <numberBatchesFailed>1</numberBatchesFailed>
    <numberBatchesTotal>1</numberBatchesTotal>
    <numberRecordsProcessed>0</numberRecordsProcessed>
    <numberRetries>0</numberRetries>
    <apiVersion>38.0</apiVersion>
    <numberRecordsFailed>0</numberRecordsFailed>
    <totalProcessingTime>0</totalProcessingTime>
    <apiActiveProcessingTime>0</apiActiveProcessingTime>
    <apexProcessingTime>0</apexProcessingTime>
</jobInfo>

Check the Batch Status

Request

GET /services/async/38.0/job/75070000003qVrHAAU/batch/75170000005cAFMAA2 HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_LosingTheWillToLive_x61fsbSS6GGWJ126789mKjmhS0myiYYK_sW_zba

Response time: 242 ms

<?xml version="1.0" encoding="UTF-8"?>
<batchInfo
   
    xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75170000005cAFMAA2</id>
    <jobId>75070000003qVrHAAU</jobId>
    <state>Completed</state>
    <createdDate>2017-01-19T23:27:54.000Z</createdDate>
    <systemModstamp>2017-01-19T23:27:56.000Z</systemModstamp>
    <numberRecordsProcessed>12</numberRecordsProcessed>
    <numberRecordsFailed>1</numberRecordsFailed>
    <totalProcessingTime>1889</totalProcessingTime>
    <apiActiveProcessingTime>1741</apiActiveProcessingTime>
    <apexProcessingTime>1555</apexProcessingTime>
</batchInfo>

Retrieve the Batch Results

Request

GET /services/async/38.0/job/75070000003qVrHAAU/batch/75170000005cAFMAA2/result HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_AreWeThereYet_x61fsbSS6GGWJ123456789mKjmhS0myiYYK_sW_zba

Response time: 236 ms

"Id","Success","Created","Error"
"00k7000000eaaZBAAY","true","false",""
"00k7000000eaaZCAAY","true","false",""
"00k7000000eaaZDAAY","true","false",""
"00k7000000eaaZEAAY","true","false",""
"00k7000000eaaZFAAY","true","false",""
"00k7000000eaaYDAAY","true","false",""
"00k7000000eaaZQAAY","true","false",""
"00k7000000eaaZpAAI","true","false",""
"00k7000000eaaa4AAA","true","false",""
"00k7000000eaaZkAAI","true","false",""
"00k7000000eaaZlAAI","true","false",""
"00k7000000eaaXKAAY","true","false",""

Review:

With only a single call to check the batch status it came back at a respectable 3350 ms total for all the API calls. That doesn't include any of the overhead on the client side. There could be some variance here while waiting for they Async job to complete.

Apex REST Web Service

OK, I'll be honest, after all those BULK API calls I'm exhausted. Also, I can't just deploy an Apex Web Service to the production org where I was bench marking against.

Your fate is ambiguous because the narrator was to lazy to test it. Go to page 0.

Try something else? or Try again?

Review:

Performance is probably "pretty good"™ with only one API call and one transaction that can use the bulkification in the triggers. However, you'll need to define the interface, maintain the code, create tests and mocks.

Revised Results

I had some time to revisit this, create an Apex REST web service in the sandbox, and test it.

It takes a bit more effort to create the Apex class with the associated test methods and then deploy them to production. The end result is a timely response.

Revised Review:

In the ideal world the Apex REST web service would be streamlined to the operation being performed. I sort of cheated a bit and created it to have the same signature as the composite batch API. It also bypasses any sort of error checking or handling.

@RestResource(urlMapping='/compositebatch/*')
global class TestRestResource {

    @HttpPatch
    global static BatchRequestResult updateOlis() {
        
        RestRequest req = RestContext.request;
        BatchRequest input = (BatchRequest)JSON.deserialize(req.requestBody.toString(), BatchRequest.class);
        
        BatchRequestResult result = new BatchRequestResult();
        result.hasErrors = false;
        result.results = new List<BatchResult>();
        
        List<OpportunityLineItem> olisToUpdate = new List<OpportunityLineItem>();
        for(BatchRequests br : input.batchRequests) {
            olisToUpdate.add(br.richInput);
            Id oliId = br.url.substringAfterLast('/');
            br.richInput.Id = oliId;
            
            result.results.add(new BatchResult(204));
        }
        System.debug('Updating: ' + olisToUpdate.size() + ' records');
        
        // Should be using Database.update so any errors could be split out.
        update olisToUpdate;  
        
       return result;
    }
    
    global class BatchRequest {
        public List<BatchRequests> batchRequests;
    }
    
    global class BatchRequests {
        public String method;
        public String url;
        public OpportunityLineItem richInput;
    }
    
    global class BatchRequestResult {
        boolean hasErrors;
        List<BatchResult> results;
    }
    
    global class BatchResult {
        public integer statusCode;
        public string result;
        
        public BatchResult(integer status) {
            this.statusCode = status;
        }
    }
    
}

This can then use exactly the same request that the composite batch did.

Response: Time (3,362 ms) against a sandbox Org

To give a relative benchmark in the same sandbox, the SOAP API took 3,172 ms. That gives a time of around 4,500 ms in "production time".

Summary

Lets recap how long it took to update our dozen dirty records:

  • REST API PATCH requests — 24,216 to 33,096 ms
  • REST API Composite batch — 20,053 ms
  • REST API Composite tree — n/a for updates
  • SOAP API update call — 4262 ms
  • Bulk API — 3350 ms = 617 ms + 964 ms + 1291 ms + n*242 ms + 236 ms
  • Apex REST Web Service — 4,517 ms (extrapolated from sandbox)

I was expecting the SOAP API to fare better against the Bulk API with such a small set of records and one API call versus five. But they came out pretty comparable.

Certainly as the number of records increases the Bulk API should leave the SOAP API in the dust. Especially with the SOAP API needing to start batching ever 200 records.

The other flavors of the REST API are pretty awful when updating multiple records of the same type as they get processed in individual transactions. To be fair, that's not what they are intended for.

Your results will vary significantly as the subscriber org I was testing against had some pretty funky triggers going on. Those triggers were magnifying the impact of sub request transaction splitting by the composite batch processing. I wouldn't usually classify 4 second responses as "timely". It's all relative.

Also, I could have been more rigorous in how the timing measurements were made. E.g. trying multiple times, etc... It's pretty difficult to get consistent times when there are so many variables in a multi-tenanted environment. Repeated calls could easily create ± 500 ms variance between calls.

The idea did occur to me to Allow REST API composite batch subrequests to be processed in one transaction. That would overcome the gap in the REST API where a small number of related records could be updated in one API call.


See Also:

3 comments:

  1. Great article. I kept meaning to test REST vs SOAP performance with all the factors removed (plain dev org, mostly do queries). There is an excellent tool called httpstat where you can see actual server processing time.

    I suspect REST would loose here again, just because of extra JSON deserialisation layer, but probably it won't be much.

    ReplyDelete
  2. If am sending a data more than 131070 bytes using /Composite/ API getting a error that "data is too large". I have created a trigger to handle this scenario that before insert substring the data to less than 131000 bytes and then send remaining data as attachment. But In case of large data it is not hitting the trigger directly it is sending the response to the ipad that"data is too large" .can you help me to fix this issue.????

    ReplyDelete