
Friday, March 24, 2017

The Mad Catter - Salesforce Predictive Vision Services

Disclaimer

No animals were harmed in the creation of this blog post or the associated presentation. The standard disclaimer applies.

I've got a problem. A cat problem to be precise. While I'm more of a dog person, I don't particularly mind cats. However, recently there has been a bit of an influx of them with the neighbors. There are at least a half dozen of them that roam freely around the neighborhood. Not the end of the world, but they have a nasty habit of leaving presents on the lawn for the kids to find. Things like the following:

In short, they poop everywhere and leave the remains of birds lying around. Things that aren't so great to step on or for the kids to find when playing in the garden.

I spoke with the immediate neighbor who owns two of the cats, and he suggested spraying them with water to deter them. While that did indeed prove to be a very effective, amusing, and satisfying approach to move them on, it required 24-hour vigilance as they just kept coming back.

Get a Dog

My wife keeps saying we should just get a dog. I guess getting a dog to chase the cats away is an option. But it seems like training a dog to use the hose might be more trouble in the long run.

Technology to the Rescue

Thankfully I found a handy device that attaches to the end of the hose and activates a sprinkler head when a built-in motion detector is set off.

Perfect! Great! A cat comes into range, then a sudden noise and spray of water sends it off towards someone else's lawn to do its cat business. Problem solved, and I can go back to doing more fun activities.

Except there was one small problem. The PIR motion sensor didn't particularly care what was moving in front of it. Cats, birds, the kids on their way to school, a courier with a parcel for me, a tree in the wind, the mother-in-law. It would spray them all regardless of whether I wanted it to or not.

Salesforce Predictive Vision Service

Technology wasn't solving my problem. I needed to use more of it!

I recalled a recent presentation by the Salesforce Developers team - Build Smarter Apps with New Predictive Vision Service. The short version of that presentation is that you can train a deep learning image classifier with a set of training images. Then when you give it a new image it will give you probabilities about what is likely in the picture. I've created a quick start unmanaged package for it to save you going through most of the install steps.
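
As a rough idea of what a prediction looks like from Apex, here's a minimal sketch assuming the Vision helper that the quick start package pulls in from the MetaMind apex-utils repo. The class and method names here are assumptions and may differ from the current code in that repo.

// Assumed helper names based on the MetaMind apex-utils sample code.
// The access token would come from the JWT bearer flow using the private
// key provided at signup (the token helper below is hypothetical).
String accessToken = VisionController.getAccessToken();
List<Vision.Prediction> predictions = Vision.predictUrl(
    'https://example.com/possible-cat.jpg', accessToken, 'GeneralImageClassifier');
for (Vision.Prediction prediction : predictions) {
    System.debug(prediction.label + ' => ' + prediction.probability);
}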

To make this work I needed a large collection of cat images to create a training dataset and build a model from. Luckily for me, providing pictures of cats is something that the internet excels at.

The second challenge with using the predictive vision services is managing how many images I am going to send through to the service. If I just point a web camera out the window it could be capturing 30+ frames per second. Not really practical to send off each frame to the service for identification when there might be nothing of interest happening 99% of the time.

Motion detection

I had a few options here.

Option one would be to stick with the basic PIR motion sensor, but it would still generate a ton of false positives that would need to pass through the image recognition. A simple cool-down timer would help, but the image captured immediately after the first motion is detected would likely only catch the subject as it is just entering the frame.

Option two: since I'm going to need a camera to capture the image for prediction anyway, I might as well get some use out of it for detecting the motion too. Because of the initial processing step I can exclude motion from certain areas, such as a driveway or a tree that often moves in the wind. There can also be a slight delay between when the motion is detected and when the prediction image is captured, giving the subject time to move into the frame.

The prototype solution looks like this:

  1. A webcam that can be pointed at the area to be monitored.
  2. Motion Detection software to process the video feed and determine where the movement is and the magnitude. The magnitude is useful, as particularly small subjects like birds can be completely ignored.
  3. The ability for that software to pass frames of interest off to the Salesforce Predictive Vision Service. This is a simple REST POST request using an access token.
  4. If the probability from the frame indicates a Cat is present, send a signal to the Raspberry Pi.
  5. On the signal, the Raspberry Pi activates the GPIO pin connected to a relay.
  6. When activated, the relay supplies power to the existing automated sprinkler, which fires immediately on power-up when the sensitivity is set to maximum. Another option here is directly connecting a solenoid water valve to the hose line.

When all put together the end result looks something like this:

The Einstein bust with terminator-esque glowing red eyes was part of the presentation I gave on this topic.

While filming that video I inadvertently live tested it on myself as well. An aging fitting on the hose connector to the sprinkler had come loose outside at the tap. So I went out to fix that, restored the water pressure to the sprinkler, then walked back to the laptop to resume the test. Only when I checked the motion detection screen did I realize it had captured my image passing in front of the sprinkler. Thankfully the predictive vision services came back indicating I didn't resemble a cat and the sprinkler didn't activate. Success!

Refinements

It occurred to me that there were further improvements that could be made.

The first and easiest change was to activate on things other than cats. It can be equally selective in activating on a wandering neighbor's dog, squirrels, general wildlife, etc.

I needed a way to deal with unfortunate false positives, such as a person wearing something with a picture of a cat on it. These can partially be avoided by looking at all the probabilities that Einstein returns and having thresholds against each label, e.g. activate on any cat prediction above 50% unless there is also a prediction indicating a person in the field of view. Images kept from the activations could also be used to further refine the examples in the dataset.
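
A rough sketch of that activation check (illustrative only; the actual app isn't written in Apex, and these names are made up):

// Activate when any watched label clears its threshold, unless a suppressing
// label (e.g. anything person-like) appears anywhere in the predictions.
public static Boolean shouldActivate(Map<String, Double> predictionsByLabel,
        Map<String, Double> activateThresholds, Set<String> suppressLabels) {
    for (String label : predictionsByLabel.keySet()) {
        if (suppressLabels.contains(label)) {
            return false;
        }
    }
    for (String label : activateThresholds.keySet()) {
        if (predictionsByLabel.containsKey(label)
            && predictionsByLabel.get(label) >= activateThresholds.get(label)) {
            return true;
        }
    }
    return false;
}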

These first two refinements are actually present in the video above. When using the general image classifier it typically identifies the stuffed cat as a teddy bear. So in the small section to the bottom right of the app you can mark labels to activate on and labels that will override and prevent the activation.

Other changes I might consider making:

The motion sensor could be retained and introduced as the step prior to activating the video feed. This would increase the time between the target entering the area and the sprinkler activating, but would save a lot of processing cycles spent looking at an unchanging image.

If I forgo some of the more processing-intensive motion tracking, the whole solution could be moved onto the Raspberry Pi. That would make it a much more economical solution.

However, another option with the motion detection still in place would be to crop the frame image to just the area where the motion was detected. This should lead to much higher prediction accuracy as it is only focusing on the moving subject.

When real world testing commences with live subjects I'll need to add a video capture option to cover the time from just before the sprinkler is activated till just after it switches off. I think the results will be worth the extra effort.

I have a range of other devices that could easily be activated via the relay attached to the Raspberry Pi. One such device is an ultrasonic pest repeller. Perhaps combined with a temperature sensor as a slightly kinder deterrent on cold nights.

User Group Presentation

I gave a talk to the Sydney developer user group on this project. The slides, as they were:


I still feel the need to settle on a name for the project. Options include:

  • The Mad Catter (after the elusive Catter Trailhead badge)
  • The Einstein Cannon
  • The Cattinator (after the general theme of the presentation)



Thursday, March 2, 2017

Salesforce SOAP Callout debugging trickery

Here's a handy practice when making SOAP callouts from Salesforce and handling errors.

When a callout goes pear-shaped and you get an exception, keep track of the request parameters by JSON-serializing them and storing the result in a custom object.
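
The capture side is only a few lines in the callout's catch block. A minimal sketch, assuming the Error_Details__c object and ReferenceData__c long text field used in the query below:

try {
    service.UpdateOrder(credential, order);
} catch (Exception ex) {
    // Keep the serialized request so it can be replayed from another org later.
    insert new Error_Details__c(ReferenceData__c = JSON.serialize(order));
    // Handle or log the failure as appropriate for the application.
}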

Then in the dev packaging org you can rehydrate the same request by deserializing the JSON from the custom object and making the same callout. Because you are now in a dev org you can see the raw SOAP message in the CALLOUT_REQUEST logging.

// Pull the serialized request parameters back off the error record
string jsonData = [Select ReferenceData__c from Error_Details__c where ID = 'a084000000w6ReO'].ReferenceData__c;

// Rehydrate the original request object
SoapWebService.Order order = (SoapWebService.Order)JSON.deserialize(jsonData, SoapWebService.Order.class);

// Populate the credential as required for the target service
SoapWebService.ServiceCredential credential = new SoapWebService.ServiceCredential();

// Replay the callout; the raw SOAP now shows up in the CALLOUT_REQUEST log events
SoapWebService.BasicHttpBinding_IConnectNS service = new SoapWebService.BasicHttpBinding_IConnectNS();
service.UpdateOrder(credential, order);

From there you can take the raw SOAP request over to something like SOAP UI to debug it further.

Friday, February 10, 2017

Visualforce Quick Start with the Salesforce Predictive Vision Services

Salesforce recently released the Salesforce Predictive Vision Services Pilot. You can watch the corresponding webinar.

I went through the Apex Quick Start steps and thought I could simplify them a bit. At the end of the process you should have the basic Apex and a Visualforce page to test image predictions against.

Steps

  1. Sign up for a predictive services account using a Developer Org. The instructions here are fairly straightforward.
    Go to https://metamind.io/ and use the Free Sign Up link. OAuth to your dev org. Download the resulting predictive_services.pem file that contains your private key and make note of the "you've signed up with" email address. You will need the file later, and the email address if your org user's email address differs.
    • Note: the signup is associated with the User's email address, not the username. So you might get conflicts between multiple dev orgs sharing the same email address.
  2. Upload your predictive_services.pem private key to the same developer org into Files and title it 'predictive_services'. This title is used by the Apex code to look up the private key (see the sketch after these steps).
  3. Install the unmanaged package that I created (Requires Spring '17).
    I've pulled the required parts together from https://github.com/salesforceidentity/jwt and https://github.com/MetaMind/apex-utils. I've also made some modifications to the Visualforce page and corresponding controller to give more flexibility in defining the image URL.
  4. Browse to the PredictService Visualforce Tab.
  5. Press the [Vision From Url] button.
  6. Examine the predictions against the General Image Model Class List.
  7. Change the Image URL to something publicly accessible and repeat the previous couple of steps as required.
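
For reference, the key uploaded in step 2 gets read back out of Files by its title before being used in the JWT bearer flow. A minimal sketch of that lookup (the code in the package may differ slightly):

// Assumes the .pem was uploaded to Files with the exact title 'predictive_services'.
String privateKey = [SELECT Body FROM ContentVersion
                     WHERE Title = 'predictive_services'
                     ORDER BY CreatedDate DESC
                     LIMIT 1].Body.toString();
// privateKey then feeds the JWT bearer flow that exchanges it for the access
// token sent with each prediction request.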

Thursday, February 2, 2017

FuseIT SFDC Explorer 3.5.17023.3 - The more logging edition

The latest v3.5 release of the FuseIT SFDC Explorer is out and contains a couple of new features around Apex Debug logs.

The Challenge and Premise

Understanding an Apex log can require understanding events occurring at vastly different time scales.

At the very fine end, each event timestamp is supplemented with the elapsed time in nanoseconds since the start of the request. At the other end is the duration of the entire log itself, which can span seconds or even minutes of execution time.

By my figuring that is more than 10 orders of magnitude difference.

To try and put that into perspective...

Duration: Example (using something very fast)
1 nanosecond: Light travels 30 centimeters (12 inches)
1 minute: Light travels 17,990,000 kilometers (11,180,000 miles)

So while in a nanosecond reflected light could travel from your hand to your eyes, in a minute it could travel between the earth and moon 46 times, or around the circumference of the earth almost 450 times. Yes, I'm playing a bit fast and loose using the speed of light in a vacuum, but you get the general gist of how vastly different a nanosecond duration is to seconds or minutes. It takes one billion nanoseconds to make a second, and that is a very big number when you are dealing with log events.

That's enough of a detour trying to make the point that we are dealing with periods of time at vastly different scales. I'll now take a similarly cavalier approach to how I'm going to address this challenge.

The human brain processes visual data 60,000 times faster than text

That's an awesome quote for what I'm trying to demonstrate, except it doesn't seem to be backed up by any actual research. Let's roll with it anyway.

When looking at a log it is useful to see how an event's timing relates to the events immediately around it and where it sits in the overall transaction. To that end, I've started plotting the debug log events out in a timeline view under the core log.

"But Daniel" you say, "The Developer Console Log Inspector has had the Execution Overview for yonks, why do we need another log viewer?"
To which I reply, "Are you British? Because yonks sounds like a unit of time you would hear about when watching something from the BBC." and "How on earth did you include a hyperlink in speech? That's some next level DOM injection right there."

My primary reason for making a log parser has always been that the Developer Console is of no use to you if you can't load the log of interest into it. Logs don't just come directly from the console. They get emailed to you in the "Developer script exception" emails, or from a well-meaning admin. They get saved to disk and then examined days after the fact. In cases like these the Developer Console can't help you at all.

While the FuseIT SFDC Explorer will happily load logs captured directly in the org, it can also have them pasted straight in and parse them all the same.

Debug log Timeline view

I've deliberately tried to avoid making a carbon copy of the existing Developer Console functionality. What would be the point? Instead I've looked for a way to visualize all the events in one timeline view. Of course, with some things occurring so closely together the finer details get lost. Where I've found it useful is:

  • in identifying clumps of events,
  • where an event sits in relation to the rest of the log, and
  • to jump to events of importance quickly.

Let's look at the timeline that came out of a test class run. The log had reached the 2 MB limit and covered 13,000 events over 39,500 lines. One of the test methods failed, and we want to hone in on that in the log.

Note the bang icon in the middle of the timeline. Clicking on that takes us straight to the FATAL_ERROR in question.

Debug log Tree view

The Developer Console provides both the Stack Tree and Execution Stack for the currently selected event. I've always found these a little odd, to be honest, in how they slightly disconnect from the actual log events. E.g. USER_DEBUG becomes "debug".

Let's start with something simple. Execute anonymous for a for loop that does 8 iterations of a debug statement.

for(integer i = 0; i < 8; i++) {
    System.debug(i);
}

The Developer Console shows the 8 debug statements, all with a duration of 0.01 ms except for one that took 0.06 ms. The Execution Stack shows similar details for the currently selected event.

What can we see from the same code in the FuseIT SFDC Explorer? That depends on how you filter the log.

If you keep the default settings and open the log with [Prefilter Log] enabled, various events like SYSTEM_METHOD_ENTRY and SYSTEM_METHOD_EXIT will be completely omitted. This makes the log easier to work with, but mucks with the event durations. With logs you can easily tell when something happened, but to get an accurate duration you need a BEGIN/END or ENTRY/EXIT pair of events. Hence the duration of the first USER_DEBUG seems excessively long, as it was measured from the prior event.

If you keep all the log events in then you get a tree with very similar figures. The main difference being that you can see the ENTRY/EXIT pairs.

Real World example

Have a look at the Apex CPU time limit exceeded in tidy trigger pattern question on the Salesforce StackExchange without skipping down to the answer (NO CHEATING). Grab the apex log they attached and try to figure out what the likely cause of the CPU limit exception is.


Read on when you've figured it out...


Here's what I can tell you from the log timeline.

Notice the recurring pattern of red (before update triggers), green (validation), orange (after update triggers), and purple (workflow). As per the question they are updating 2956 Account records, so the records are processed in batches of 200. You can also see where the skipped log section is (the exclamation mark about 3/4 of the way along) and the FATAL_ERRORs at the end of the log.

Looking at one of those batches in the tree view, I can see that the triggers themselves are relatively quick and that the longest duration of any of the code units is for the workflow. Definitely the smoking gun to investigate first.

I like to think that the combination of the timeline and treeview made isolating the problem much easier. Especially considering the Developer Console wasn't available in this case.

The forward looking statements

It's still very much a work in progress.

The biggest thing that stands out to me at the moment is the color coding for events. I want similar events to have similar colors, important events to stand out, less important events to fade away, and the CODE_UNIT color categories not to conflict with the event colors. This is a tricky thing to do when you struggle to name more than the standard 16 colors supported by the Windows VGA palette.

The accuracy of the duration measurements is important. In the current 3.5 release the elapsed times were all converted to C# TimeSpans, which lack nanosecond accuracy. In the next release I'll do all the calculations from the raw nanoseconds and convert to TimeSpans only when needed for display.

Friday, January 20, 2017

Choose Your Own Adventure - Dirty Dozen showdown with the REST API vs SOAP API vs BULK API

You're an external system to Salesforce. Stuff happened and now there are a dozen dirty records that need to be updated in Salesforce to reflect the changes. An active Salesforce Session ID (a.k.a access token) that can be used to make API calls with is available. All the records have the corresponding Salesforce Ids, so a direct update can be performed. Ignore for the moment that the records might also be deleted and in the recycle bin or fully deleted (really truly gone).

To further complicate matters, there is a quagmire of triggers, workflow, and validation on the objects in Salesforce. This is a subscriber org for a managed package, so you can't just fix those.

Which API do you use to update those records in Salesforce?
Pick a path:

  1. You use REST API PATCH requests to update records. Turn to page 666
  2. You use the REST API composite batch resource to update records. Turn to page 78
  3. You use the REST API composite tree resource to update the records. Turn to page √–1
  4. You use the SOAP API update() call. Turn to page 42
  5. You use the Bulk API to update them. Turn to page 299792458
  6. You hand craft an Apex REST web service to do the processing. Turn to page 0

REST API PATCH requests

There are 12 records and the API will only allow you to PATCH one at a time. So that's 12 API calls.

You die a slow and painful death. GAME OVER

Try Again?

Postmortem:

Each request round trips to Salesforce, processes all the triggers, workflow, and validation on each individual record, and returns the result. Individually each request is only a couple of seconds, but collectively they take way too long for the waiting user.

Request

PATCH /services/data/v38.0/sobjects/OpportunityLineItem/00k7000000eaaZBAAY HTTP/1.1
Host: na5.salesforce.com
Authorization: Bearer 00D700000000001!AQ0AQOzUlrjD_NotARealSession_x61fsbSS6GGWJ123456789mKjmhS0myiYYK_sW_zba
Content-Type: application/json

Request Body

{"End_Date__c": "2017-01-19"}

204 Response: Time (2,018 to 2,758 ms) multiplied by twelve records gives 24,216 to 33,096 ms

REST API Composite batch

You learnt your lesson with the individual REST API calls (or maybe you came straight here), so switch to a single composite batch call. This will give you one round trip to the server.

You die a (slightly less, but still very much) slow and painful death. GAME OVER

Try Again?

Postmortem:

You're down to one API request, which is good. But less than desirable things are happening in Salesforce. Each sub request in the batch is split into a separate transaction.

There is still a big penalty to pay for running the accumulation of triggers and other gunk one record at a time. The trigger bulkification can't help you if they are all separate transactions.

Also, don't forget that you can only do 25 records per batch. Not such a problem with 12 records, but it has limited scaling potential.

Request

POST /services/data/v38.0/composite/batch HTTP/1.1
Host: na5.salesforce.com
Authorization: Bearer 00D700000000001!AQ0AQOzUlrjD_StillNotARealSession_x61fsbSS6GGWJ123456789mKjmhS0myiYYK
Content-Type: application/json

Request Body

{
 "batchRequests": [{
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZBAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZCAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZDAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZEAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  }, {
   "method": "PATCH",
   "url": "v38.0/sobjects/OpportunityLineItem/00k7000000eaaZFAAY",
   "richInput": {
    "End_Date__c": "2017-01-19"
   }
  },
                //...
  
 ]
}

Response: Time (20,053 ms)

{
    "hasErrors": false,
    "results": [
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        {
            "statusCode": 204,
            "result": null
        },
        //...
    ]
}

Bonus

Look at the log duration for each sub request. They appear to be the accumulation of time for the entire API request rather than each individual sub transaction. It certainly confused me for a bit.

REST API Composite tree

Currently (as at Spring '17) it can work with up to 200 records, which is a good start. However, the composite tree resource is only for creating records, not updating them.

You die of embarrassment from trying to use an incompatible API. GAME OVER

Try Again?

Postmortem:

Always check the documentation first.

SOAP API update call

SOAP, are you sure? That API's been rattling around since 2004 in API v5.0.

Success, the records are all updated in a reasonable timeframe.

Try something else?

Review:

One POST request, and 4,262 ms later you have a response. Processing time does increase with each record added, but nowhere near the overhead of the previous REST APIs.

POST Request to https://na5.salesforce.com/services/Soap/u/38.0

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:partner.soap.sforce.com" xmlns:urn1="urn:sobject.partner.soap.sforce.com">
   <soapenv:Header>
      <urn:SessionHeader>
         <urn:sessionId>00D700000000001!AQ0AQOzUlrjD_SessionIdCleanedWithSoap_x61fsbSS6GGWJ123456789mKjmhS0myiYYK_sW_zba</urn:sessionId>
      </urn:SessionHeader>
   </soapenv:Header>
   <soapenv:Body>
      <urn:update>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZBAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZCAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZDAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
         <urn:sObjects>
            <urn1:type>OpportunityLineItem</urn1:type>
            <urn1:fieldsToNull></urn1:fieldsToNull>
            <urn1:Id>00k7000000eaaZEAAY</urn1:Id>
            <urn1:End_Date__c>2017-01-19</urn1:End_Date__c>
         </urn:sObjects>
          <!-- ... -->

      </urn:update>
   </soapenv:Body>
</soapenv:Envelope>

Response

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns="urn:partner.soap.sforce.com">
   <soapenv:Header>
      <LimitInfoHeader>
         <limitInfo>
            <current>465849</current>
            <limit>6700000</limit>
            <type>API REQUESTS</type>
         </limitInfo>
      </LimitInfoHeader>
   </soapenv:Header>
   <soapenv:Body>
      <updateResponse>
         <result>
            <id>00k7000000eaaZBAAY</id>
            <success>true</success>
         </result>
         <result>
            <id>00k7000000eaaZCAAY</id>
            <success>true</success>
         </result>
         <result>
            <id>00k7000000eaaZDAAY</id>
            <success>true</success>
         </result>
         <!-- ... -->
            
      </updateResponse>
   </soapenv:Body>
</soapenv:Envelope>

Bulk API

It's primarily billed as a way to asynchronously load large sets of data into Salesforce. Let's see how we go with only 12...

You have a harrowing brush with death by API ceremony. If the asynchronous gods favor you it is a timely update. Otherwise disgruntled users tear you limb from limb as they get fed up of waiting for the results to come back.

Try something else?

Results:

There are five API calls to be made to complete this operation on a good day. If things go bad then you might be waiting longer than expected. You need to keep polling the API for the job to complete before you can get the results back. You're also burning five API calls where you could be using one to complete the entire operation.

Create Job

Request

POST /services/async/38.0/job HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_NothingToSeeHere_x61fsbSS6GGWJ123456789mKjmhS0my
Content-Type: application/xml

Request Body

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <operation>update</operation>
    <object>OpportunityLineItem</object>
    <contentType>CSV</contentType>
</jobInfo>

Response Time (617 ms)

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo
    xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75070000003qVrHAAU</id>
    <operation>update</operation>
    <object>OpportunityLineItem</object>
    <createdById>00570000004uCVJAA2</createdById>
    <createdDate>2017-01-19T23:08:06.000Z</createdDate>
    <systemModstamp>2017-01-19T23:08:06.000Z</systemModstamp>
    <state>Open</state>
    <concurrencyMode>Parallel</concurrencyMode>
    <contentType>CSV</contentType>
    <numberBatchesQueued>0</numberBatchesQueued>
    <numberBatchesInProgress>0</numberBatchesInProgress>
    <numberBatchesCompleted>0</numberBatchesCompleted>
    <numberBatchesFailed>0</numberBatchesFailed>
    <numberBatchesTotal>0</numberBatchesTotal>
    <numberRecordsProcessed>0</numberRecordsProcessed>
    <numberRetries>0</numberRetries>
    <apiVersion>38.0</apiVersion>
    <numberRecordsFailed>0</numberRecordsFailed>
    <totalProcessingTime>0</totalProcessingTime>
    <apiActiveProcessingTime>0</apiActiveProcessingTime>
    <apexProcessingTime>0</apexProcessingTime>
</jobInfo>

Add a Batch to the Job

Request

POST /services/async/38.0/job/75070000003qVrHAAU/batch HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_HereIsSomeWorkToDo_x61fsbSS6GGWJ123456mKjmhS0myiYYK_sW_zba
Content-Type: text/csv

Request Body

Id,End_Date__c
"00k7000000eaaZBAAY","2017-01-19"
"00k7000000eaaZCAAY","2017-01-19"
"00k7000000eaaZDAAY","2017-01-19"
"00k7000000eaaZEAAY","2017-01-19"
"00k7000000eaaZFAAY","2017-01-19"
"00k7000000eaaYDAAY","2017-01-19"
"00k7000000eaaZQAAY","2017-01-19"
"00k7000000eaaZpAAI","2017-01-19"
"00k7000000eaaa4AAA","2017-01-19"
"00k7000000eaaZkAAI","2017-01-19"
"00k7000000eaaZlAAI","2017-01-19"
"00k7000000eaaXKAAY","2017-01-19"

Response time: 964 ms

<?xml version="1.0" encoding="UTF-8"?>
<batchInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75170000005cAFMAA2</id>
    <jobId>75070000003qVrHAAU</jobId>
    <state>Queued</state>
    <createdDate>2017-01-19T23:15:21.000Z</createdDate>
    <systemModstamp>2017-01-19T23:15:21.000Z</systemModstamp>
    <numberRecordsProcessed>0</numberRecordsProcessed>
    <numberRecordsFailed>0</numberRecordsFailed>
    <totalProcessingTime>0</totalProcessingTime>
    <apiActiveProcessingTime>0</apiActiveProcessingTime>
    <apexProcessingTime>0</apexProcessingTime>
</batchInfo>

Close the Job

Request

POST /services/async/38.0/job/75070000003qVrHAAU HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_AnotherApiCall_ReallyQ_x61fsbSS6GGWJ56789mKjmhS0myiYYK_sW_zba
Content-Type: application/xml; charset=UTF-8

Request Body

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <state>Closed</state>
</jobInfo>

Response time: 1291 ms

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75070000003qVrHAAU</id>
    <operation>update</operation>
    <object>OpportunityLineItem</object>
    <createdById>00570000004uCVJAA2</createdById>
    <createdDate>2017-01-19T23:08:06.000Z</createdDate>
    <systemModstamp>2017-01-19T23:08:06.000Z</systemModstamp>
    <state>Closed</state>
    <concurrencyMode>Parallel</concurrencyMode>
    <contentType>CSV</contentType>
    <numberBatchesQueued>0</numberBatchesQueued>
    <numberBatchesInProgress>0</numberBatchesInProgress>
    <numberBatchesCompleted>0</numberBatchesCompleted>
    <numberBatchesFailed>1</numberBatchesFailed>
    <numberBatchesTotal>1</numberBatchesTotal>
    <numberRecordsProcessed>0</numberRecordsProcessed>
    <numberRetries>0</numberRetries>
    <apiVersion>38.0</apiVersion>
    <numberRecordsFailed>0</numberRecordsFailed>
    <totalProcessingTime>0</totalProcessingTime>
    <apiActiveProcessingTime>0</apiActiveProcessingTime>
    <apexProcessingTime>0</apexProcessingTime>
</jobInfo>

Check the Batch Status

Request

GET /services/async/38.0/job/75070000003qVrHAAU/batch/75170000005cAFMAA2 HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_LosingTheWillToLive_x61fsbSS6GGWJ126789mKjmhS0myiYYK_sW_zba

Response time: 242 ms

<?xml version="1.0" encoding="UTF-8"?>
<batchInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
    <id>75170000005cAFMAA2</id>
    <jobId>75070000003qVrHAAU</jobId>
    <state>Completed</state>
    <createdDate>2017-01-19T23:27:54.000Z</createdDate>
    <systemModstamp>2017-01-19T23:27:56.000Z</systemModstamp>
    <numberRecordsProcessed>12</numberRecordsProcessed>
    <numberRecordsFailed>1</numberRecordsFailed>
    <totalProcessingTime>1889</totalProcessingTime>
    <apiActiveProcessingTime>1741</apiActiveProcessingTime>
    <apexProcessingTime>1555</apexProcessingTime>
</batchInfo>

Retrieve the Batch Results

Request

GET /services/async/38.0/job/75070000003qVrHAAU/batch/75170000005cAFMAA2/result HTTP/1.1
Host: na5.salesforce.com
X-SFDC-Session: Bearer 00D700000000001!AQ0AQOzUlrjD_AreWeThereYet_x61fsbSS6GGWJ123456789mKjmhS0myiYYK_sW_zba

Response time: 236 ms

"Id","Success","Created","Error"
"00k7000000eaaZBAAY","true","false",""
"00k7000000eaaZCAAY","true","false",""
"00k7000000eaaZDAAY","true","false",""
"00k7000000eaaZEAAY","true","false",""
"00k7000000eaaZFAAY","true","false",""
"00k7000000eaaYDAAY","true","false",""
"00k7000000eaaZQAAY","true","false",""
"00k7000000eaaZpAAI","true","false",""
"00k7000000eaaa4AAA","true","false",""
"00k7000000eaaZkAAI","true","false",""
"00k7000000eaaZlAAI","true","false",""
"00k7000000eaaXKAAY","true","false",""

Review:

With only a single call to check the batch status it came back at a respectable 3,350 ms total for all the API calls. That doesn't include any of the overhead on the client side. There could be some variance here while waiting for the async job to complete.

Apex REST Web Service

OK, I'll be honest, after all those Bulk API calls I'm exhausted. Also, I can't just deploy an Apex web service to the production org I was benchmarking against.

Your fate is ambiguous because the narrator was too lazy to test it. Go to page 0.

Try something else? or Try again?

Review:

Performance is probably "pretty good"™ with only one API call and one transaction that can use the bulkification in the triggers. However, you'll need to define the interface, maintain the code, create tests and mocks.

Revised Results

I had some time to revisit this, create an Apex REST web service in the sandbox, and test it.

It takes a bit more effort to create the Apex class with the associated test methods and then deploy them to production. The end result is a timely response.

Revised Review:

In the ideal world the Apex REST web service would be streamlined to the operation being performed. I sort of cheated a bit and created it to have the same signature as the composite batch API. It also bypasses any sort of error checking or handling.

@RestResource(urlMapping='/compositebatch/*')
global class TestRestResource {

    @HttpPatch
    global static BatchRequestResult updateOlis() {
        
        RestRequest req = RestContext.request;
        BatchRequest input = (BatchRequest)JSON.deserialize(req.requestBody.toString(), BatchRequest.class);
        
        BatchRequestResult result = new BatchRequestResult();
        result.hasErrors = false;
        result.results = new List<BatchResult>();
        
        List<OpportunityLineItem> olisToUpdate = new List<OpportunityLineItem>();
        for(BatchRequests br : input.batchRequests) {
            olisToUpdate.add(br.richInput);
            Id oliId = br.url.substringAfterLast('/');
            br.richInput.Id = oliId;
            
            result.results.add(new BatchResult(204));
        }
        System.debug('Updating: ' + olisToUpdate.size() + ' records');
        
        // Should be using Database.update so any errors could be split out.
        update olisToUpdate;  
        
       return result;
    }
    
    global class BatchRequest {
        public List<BatchRequests> batchRequests;
    }
    
    global class BatchRequests {
        public String method;
        public String url;
        public OpportunityLineItem richInput;
    }
    
    global class BatchRequestResult {
        boolean hasErrors;
        List<BatchResult> results;
    }
    
    global class BatchResult {
        public integer statusCode;
        public string result;
        
        public BatchResult(integer status) {
            this.statusCode = status;
        }
    }
    
}

This can then use exactly the same request that the composite batch did.

Response: Time (3,362 ms) against a sandbox Org

To give a relative benchmark, in the same sandbox the SOAP API took 3,172 ms versus 4,262 ms in production. Scaling the Apex REST result by that same ratio gives a time of around 4,500 ms in "production time".

Summary

Let's recap how long it took to update our dozen dirty records:

  • REST API PATCH requests — 24,216 to 33,096 ms
  • REST API Composite batch — 20,053 ms
  • REST API Composite tree — n/a for updates
  • SOAP API update call — 4,262 ms
  • Bulk API — 3,350 ms = 617 ms + 964 ms + 1,291 ms + n*242 ms + 236 ms
  • Apex REST Web Service — 4,517 ms (extrapolated from sandbox)

I was expecting the SOAP API to fare better against the Bulk API with such a small set of records and one API call versus five. But they came out pretty comparable.

Certainly as the number of records increases the Bulk API should leave the SOAP API in the dust. Especially with the SOAP API needing to start batching every 200 records.

The other flavors of the REST API are pretty awful when updating multiple records of the same type as they get processed in individual transactions. To be fair, that's not what they are intended for.

Your results will vary significantly as the subscriber org I was testing against had some pretty funky triggers going on. Those triggers were magnifying the impact of sub request transaction splitting by the composite batch processing. I wouldn't usually classify 4 second responses as "timely". It's all relative.

Also, I could have been more rigorous in how the timing measurements were made. E.g. trying multiple times, etc... It's pretty difficult to get consistent times when there are so many variables in a multi-tenanted environment. Repeated calls could easily create ± 500 ms variance between calls.

The idea did occur to me to allow REST API composite batch subrequests to be processed in one transaction. That would overcome the gap in the REST API, letting a small number of related records be updated in a single API call and a single transaction.



Tuesday, January 10, 2017

JavaScript Security for Visualforce

I thought I'd touch on some of the security considerations that should be made when working with JavaScript from Visualforce.

Cross Site Scripting (XSS)

The risk of cross site scripting is always something to be aware of when developing web applications and needs to be considered when using JavaScript in Visualforce as well. Generally speaking, you want to prevent untrusted user input from being reflected back into JavaScript code.

As an example - say you were trying to read a page parameter into JavaScript with the following (Example only - DON'T DO THIS):

<apex:page>
    <script>var foo = '{!$CurrentPage.parameters.userparam}';</script>
</apex:page>

If you load this in the browser and include &userparam=bar in the query string then the resulting HTML is the harmless assignment var foo = 'bar';

Great! But what if someone puts something malicious into the URL? Something like 1';alert('All%20your%20Salesforce%20are%20belong%20to%20us');var%20foo='2. Once decoded and reflected into the page, the script block becomes var foo = '1';alert('All your Salesforce are belong to us');var foo='2'; and the alert fires as soon as the page loads.

So definitely not what we wanted. Notice how the apostrophe characters in the user input allow the code to take on an entirely different meaning. It would be very easy to extend the example to submit the current session cookies to an external resource - compromising your Session Id and pretty much everything else from there.

The solution here is to use JSENCODE to encode any text before reflection into JavaScript. This will use a backslash to escape any unsafe JavaScript characters, such as apostrophe (').

<apex:page>
    <script>var foo = '{!JSENCODE($CurrentPage.parameters.userparam)}';</script>
</apex:page>


Javascript Remoting with escape: false

When using JavaScript Remoting, beware of using {escape: false} in the configuration and then loading the result into the DOM. Much like the basic XSS example above, it can be used to inject executing JavaScript into the page.


Avoid the urge to hack the Salesforce DOM

Not so long ago developers would put JavaScript in the sidebar so it would load with every page. This could then be used to manipulate the standard Salesforce DOM to do all sorts of things, such as showing/hiding components, adding additional validation, or changing their presentation.

Salesforce took umbrage with this approach as it opened up all sorts of consistency and security problems. As such, they shut the practice down in the Summer '15 release. Needless to say, those who had used the sidebar to manipulate the page found their workaround hack solution no longer working. Best to avoid these sorts of unsupported shenanigans if you can as they will be closed off sooner or later.


Protect the Session Id / AccessToken

The Session ID is the key to the kingdom. If someone can get hold of yours they can interact with Salesforce like you would*. If you can avoid exposing it in Visualforce then do so. E.g. Use JavaScript Remoting rather than interacting with the APIs directly.
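
As a minimal sketch of that remoting approach (the controller and method names here are illustrative), the page calls an @RemoteAction and never needs a session id in its markup:

global with sharing class AccountLookupController {
    // Called from the Visualforce page via JavaScript Remoting.
    @RemoteAction
    global static List<Account> findAccounts(String namePrefix) {
        String match = namePrefix + '%';
        return [SELECT Id, Name FROM Account WHERE Name LIKE :match LIMIT 10];
    }
}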

Depending on the context where you request it, you can get a "second class session id" that doesn't grant you the same level of access as a full "first class" UI session type.

For instance, a Session ID obtained from Visualforce can be used to make API calls, but can't be used to access the full web UI (such as via frontdoor.jsp).

Checking the User Session Information page can show the different Session Types that get created. In the example below note the TempVisualforceExchange Session that was created from the Parent UI Session.


* There are some exceptions on if they can use it if "Lock sessions to the IP address from which they originated" or "Enforce login IP ranges on every request" are enabled.

Static Resource rather than CDN

Using a CDN such as Google Hosted Libraries or the Microsoft Ajax Content Delivery Network to bring in something like jQuery is appealing, but can open you up to security problems and overall make the AppExchange security review more troublesome than it needs to be. Instead, consider using a zip file in a static resource and referencing the contents of the zip using URLFOR.


Friday, January 6, 2017

Trigger recursion giving me a bad day

Salesforce trigger recursion is always fun. Especially when the problem is occurring in a subscribers org with your managed package triggers. This is a follow on to my previous post Preventing trigger recursion and handling a Workflow field update but with an added twist.

As per the previous post, I needed a trigger recursion mechanism that could prevent an infinite loop but still handle subsequent changes made to the Opportunity by workflows. Using the hashCode of the Opportunity worked.
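
A minimal sketch of that hash-based guard, assuming a handler class wrapping the trigger (names are illustrative rather than the managed package's actual code):

// Hashes of Opportunity states already handled in this transaction.
// A workflow field update changes the record's state, so the new hash
// lets the same record through for another pass.
private static Set<Integer> processedHashes = new Set<Integer>();

public static List<Opportunity> filterUnprocessed(List<Opportunity> opps) {
    List<Opportunity> toProcess = new List<Opportunity>();
    for (Opportunity opp : opps) {
        Integer stateHash = System.hashCode(opp);
        if (!processedHashes.contains(stateHash)) {
            processedHashes.add(stateHash);
            toProcess.add(opp);
        }
    }
    return toProcess;
}

Then I got the following error from the managed package trigger after insert: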

System.DmlException: Upsert failed. First exception on row 0; first error: DUPLICATE_VALUE, duplicate value found: Namespace__OpportunityIdDynamicPropertyIdUnique__c duplicates value on record with id: a0uc0000003RsMh: []

This was odd, how could the after insert trigger be falling over on existing records that used the Opportunity Id in the composite key? The Opportunity had only just been inserted.

The subscriber org in this case had created their own after insert trigger on Opportunity that created additional OpportunityLineItem records. This resulted in a sequence of events that went something like:

  1. DML Insert Opportunity
  2. Subscriber's After Insert Trigger on Opportunity
    1. DML Insert OpportunityLineItems
    2. Subscriber's After Update Trigger on Opportunity
    3. Managed Package After Update Trigger on Opportunity
  3. Managed Package After Insert Trigger on Opportunity

Note how the subscriber's after insert/update triggers fired first, and the resulting changes to Opportunity.Amount from the new OpportunityLineItem records updated the Opportunity records. As a result, the managed package's After Update trigger fired before its After Insert trigger.

The sequence of events again, to hopefully provide some clarity:

  1. Opportunity is inserted
  2. Subscriber after insert trigger code inserts OpportunityLineItems for the Opportunity
  3. Managed package Opportunity After Update trigger fires and inserts records related to the Opportunity
  4. Managed package Opportunity After Insert trigger fires and attempts to insert records without checking for existing records (because it is an insert, so there shouldn't be any related records yet - which doesn't hold up in practice).
  5. The insertion of the related records fails as they were already created by the Opportunity After Update trigger that occurred when the OpportunityLineItems were inserted.

I'd made the incorrect assumption in the After Insert trigger that I didn't need to check for existing records with a lookup to the Opportunity, as they couldn't exist yet before the insert transaction completed.
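
The defensive version of that After Insert logic looks up any related records that already exist, even though on a "pure" insert there shouldn't be any. A minimal sketch, with an illustrative Related_Record__c object standing in for the managed package's real one:

// Illustrative object and field names; the real managed package objects differ.
Map<Id, Related_Record__c> existingByOpp = new Map<Id, Related_Record__c>();
for (Related_Record__c rel : [SELECT Id, Opportunity__c FROM Related_Record__c
                              WHERE Opportunity__c IN :Trigger.newMap.keySet()]) {
    existingByOpp.put(rel.Opportunity__c, rel);
}

List<Related_Record__c> toInsert = new List<Related_Record__c>();
for (Opportunity opp : (List<Opportunity>) Trigger.new) {
    if (!existingByOpp.containsKey(opp.Id)) {
        toInsert.add(new Related_Record__c(Opportunity__c = opp.Id));
    }
}
insert toInsert;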

In hindsight it is clear enough, but it wasn't much fun to figure out from the observed behavior. It's also something I was sure I'd encountered previously, and sure enough, I'd run into it before in A Tale of Two Triggers.

So, repeating the moral of the story from my previous post in the hopes I'll remember it:

Don't assume that the After Insert trigger will occur before the After Update trigger.