Wednesday, May 20, 2020

Dreamforce 2019 Presentation - Reducing the Cost of Enforcing CRUD and FLS in the ESAPI

At Dreamforce this last year (2019) Chris Peterson and I gave a theater presentation on combining the new Apex Security.stripInaccessible method with the existing ESAPI library for enforcing Create, Read, Update, Delete (CRUD) and field level security (FLS) in Apex.

Unfortunately the Dreamforce theater sessions weren't recorded in 2019 as they were in previous years 🤦‍♂️. To make up for that, below are the session slides. I'll also expand on the key points in this blog post. These will be my words rather than Chris's, although I'll try to cover most of the same content.

What is the ESAPI?

The ES stands for Enterprise Security. And the API... Well, hopefully you know what an API is.

The Salesforce ESAPI is a port of a Java library created by OWASP (Open Web Application Security Project).

To address why you would want it in your Salesforce org, here is a quote from the OWASP ESAPI project page:

The ESAPI libraries are designed to make it easier for programmers to retrofit security into existing applications. The ESAPI libraries also serve as a solid foundation for new development.

These seem like noble goals. We want to improve the security of existing applications and implement newer applications with the same level of security from the start.

The three core areas that the Salesforce ESAPI addresses are (with a rough usage sketch after the list):

  1. Input Validation - is a given string a valid date? Is it a valid credit card number? A valid redirection URL?
    • E.g. ESAPI.validator.SFDC_isValidDate
  2. Output Encoding - is it safe to render the content back to the user's browser via HTML?
    • E.g. ESAPI.encoder.SFDC_URLENCODE
  3. Access Control - enforce the built in access control mechanisms: CRUD, FLS, and Sharing.
    • insertAsUser/updateAsUser
    • DML on a limited set of fields
    • Override sharing for a single DML operation
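
To make those three areas a bit more concrete, here is that rough usage sketch. I'm writing the calls from memory of the project README, so treat the exact method signatures (and the placeholder variable names) as assumptions to verify against the repo rather than a definitive reference:

// Input validation - check an untrusted string before treating it as a date.
Boolean validDate = ESAPI.validator().SFDC_isValidDate(untrustedDateString);

// Output encoding - make a value safe to embed in a URL sent back to the browser.
String encoded = ESAPI.encoder().SFDC_URLENCODE(untrustedValue);

// Access control - DML as the running user, limited to the named fields.
ESAPI.accessController().insertAsUser(newContact, new List<String>{'LastName', 'Email'});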

(Re) Introducing the ESAPI

Salesforce originally released their version of the ESAPI library in 2010. In 2016 they added the new ESAPI.encoder.SFDC_BASE64_URLENCODE method. Other than that, maintenance and new development had stalled.

In 2019 Chris Peterson and Jake Meredith from Salesforce took ownership of the GitHub repo. Even better, they are accepting pull requests.

One of the first steps in rejuvenating the repo was to increase the code coverage on the security-specific test methods from 54% to 93%. And, perhaps more importantly, to add a number of meaningful assertions and negative test cases along the way. Overall project test coverage is now up by 39%.

One of the particular challenges with this was writing out-of-the-box test cases using only the built-in Profiles and sObjects. The test cases needed to be portable to any org, so they couldn't rely on a specific custom Profile existing. At the time the "Read Only" profile was the most restrictive system profile available. Going forward I might revisit this with the still-to-GA Minimum Access profile.

With the better test coverage in place it was then possible to overhaul how the field and object level security was enforced. More on that later...

A Recap of the Security.stripInaccessible() method

The Spring '20 release included the GA version of the new Security.stripInaccessible method.

This new method provides a streamlined way to check both field and object level data permissions for the following access types:

  • AccessType.READABLE - Check the fields of an sObject for read access.
  • AccessType.CREATABLE - Check the fields of an sObject for create access.
  • AccessType.UPDATABLE - Check the fields of an sObject for update access.
  • AccessType.UPSERTABLE - Check the fields of an sObject for insert and update access.

The SObjectAccessDecision Object

After calling stripInaccessible() an SObjectAccessDecision object is returned. This provides a number of helpful methods.

getRecords() provides a new List<SObject> that has all the inaccessible fields removed. These are also detached from the source sObjects.

Two additional methods getModifiedIndexes() and getRemovedFields() provide details about which records and specific fields were modified.
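
Putting those together, a minimal sketch of the call and the decision object looks something like this (the Contact and its field values are just placeholder data):

List<Contact> incoming = new List<Contact>{
    new Contact(LastName = 'Appleseed', Email = 'appy@example.com')
};

// Strip any fields the running user can't create.
SObjectAccessDecision decision = Security.stripInaccessible(
    AccessType.CREATABLE, incoming);

// New, detached records with the inaccessible fields removed.
List<SObject> safeToInsert = decision.getRecords();

// Which record indexes were modified, and which fields were removed per sObject type.
System.debug(decision.getModifiedIndexes());
System.debug(decision.getRemovedFields());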

How it works in practice to enforce security requirements

Using the newer stripInaccessible method has a number of advantages. It will cover all possible sObject field types. It will check relationship fields, including nested relationships. Sub-queries and polymorphic lookups are also covered.

The example in the image above shows a new Opportunity for "Appy's App" that was generated in the trusted system mode. As such, it could set the custom Standing__c and Value__c fields. This Opportunity is then passed through stripInaccessible with the AccessType.CREATABLE parameter. This does all the hard work for us of checking the user's Profiles, Permission Sets, Permission Set Groups, muting permission sets, etc. The output Opportunity has the fields that the user doesn't have create access to completely removed. The stripped fields aren't null: they're really undefined. This is important when it comes to subsequent DML, as we don't want to inadvertently clear fields out.
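
One way to see that "undefined rather than null" behaviour for yourself is to look at which fields are actually populated on the returned record. A small sketch of that check - Standing__c and Value__c are the custom fields from the slide scenario, so substitute whatever exists in your org:

Opportunity source = new Opportunity(Name = 'Appy\'s App', StageName = 'Prospecting',
    CloseDate = Date.today(), Standing__c = 'Excellent', Value__c = 100);

SObjectAccessDecision decision = Security.stripInaccessible(
    AccessType.CREATABLE, new List<Opportunity>{ source });
Opportunity stripped = (Opportunity) decision.getRecords()[0];

// If the user lacks create access to the custom field it is simply absent from the
// populated field map - it isn't present with a null value.
System.debug(stripped.getPopulatedFieldsAsMap().containsKey('Standing__c'));
System.debug(stripped.isSet('Standing__c'));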

Other advantages:

  • It's particularly useful when handling untrusted input (like from JavaScript controllers)
  • It's also great for gracefully degrading UI experiences like SFDC does natively

Example walk-through of the code changes to the ESAPI

The video shows the core structural changes that were made to the ESAPI methods to use stripInaccessible.

Measuring the Performance changes - Methodology

Beyond enforcing the security requirements, the next important consideration is the change in CPU limit usage and heap usage. Usually you trade more CPU for less heap or vice versa. If we can cut both down we are doing well!

The general goal is always to see equal or better performance while enforcing the same security requirements. There may be some performance tradeoffs, but the security can't be compromised. The coverage and assertions from the automated tests ensure we are still enforcing the same security requirements.

The performance differences in terms of Apex limits were measured using test harness classes and Adrian Larson's LimitsProfiler framework. Performing multiple runs of the current/baseline implementation against the new version using the stripping methods allows the relative changes in limits to be measured.
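
I won't try to reproduce the LimitsProfiler internals from memory, but the underlying idea is simply sampling the Limits class before and after the code under test and comparing the deltas between the baseline and stripping-based runs. A stripped-down sketch of that idea (not the actual framework API):

Integer cpuBefore = Limits.getCpuTime();
Integer heapBefore = Limits.getHeapSize();

// ... run one iteration of the baseline or stripInaccessible-based ESAPI code here ...

System.debug('CPU ms consumed: ' + (Limits.getCpuTime() - cpuBefore));
System.debug('Heap bytes consumed: ' + (Limits.getHeapSize() - heapBefore));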

During the testing it was important to have the debug logging completely off as it would otherwise affect the outcomes.

The testing below was done in ALL_OR_NONE mode. This is more demanding than the alternative BEST_EFFORT mode as it requires per-field checks.

To allow the limits testing framework to repeatedly insert multiple records, Savepoints were used as part of the setup and teardown steps. This prevented hitting the storage limits while still measuring the performance differences in the ESAPI.
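
The savepoint approach boils down to wrapping each measured insert so the records never accumulate in the org. A minimal sketch, assuming the harness drives the iterations and supplies the contactsToInsert list:

Savepoint sp = Database.setSavepoint();
try {
    // The DML being measured - in the real harness this goes through the ESAPI.
    insert contactsToInsert;
} finally {
    // Undo the inserts so repeated iterations don't chew through storage limits.
    Database.rollback(sp);
}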

Measured results for bulk inserting Contacts

The performance difference can vary greatly based on scenario:

  • There is a negligible performance difference when checking object CRUD permissions. Note: the Apex ESAPI was still enforcing these against the sObject Schema.SObjectType until the fix in 224 (Spring '20).
  • There is a significant performance difference if there are a large number of requested fields that aren’t set on all the sObjects.

Measured improvements over 25 iterations inserting 200 Contacts with 33 standard fields:

  • 25% Less CPU usage
  • 18.6% Less Heap usage

Links and Resources

Thursday, February 13, 2020

FuseIT SFDC Explorer 3.13.20008.1 - Winter '20

Another roundup of some of the changes to the FuseIT SFDC Explorer since the 3.12.19175.1 release.

Redesigned Apex Tests selector

The Apex test selection control has been reworked to include a top level node for the last known state of each test method. This allows you to monitor the accumulated test results from a single location. If you redeploy one of the test cases in question it will reset to an unknown status.

Updates to the Debug Log Executed Units

In the last release I put in the first version of the Executed Units control. This release builds on that.

It is now possible to mix and match the different log types. So you can focus on, for example, just the SOQL queries.

The same toolstrip includes a button to export the executed units data.

Smaller generated Wsdl2Apex classes

If you've ever looked closely at the Apex classes that Wsdl2Apex generates you'll notice a recurring string - the namespace of the root element. Depending on the WSDL it's not uncommon for this same string to be repeated thousands of times in the generated Apex class. This update will now generate a single static final String to handle the repeated usages.
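
As a sketch of what the change looks like in a generated stub - the class, element, namespace, and constant names here are invented, and the real output depends on your WSDL:

public class exampleComStubs {
    // Previously this literal was repeated inline in every *_type_info array.
    private static final String s_ns0 = 'http://example.com/services/v1';

    public class GetWidgetRequest {
        public String widgetId;
        private String[] widgetId_type_info = new String[]{'widgetId', s_ns0, null, '1', '1', 'false'};
        private String[] apex_schema_type_info = new String[]{s_ns0, 'true', 'false'};
        private String[] field_order_type_info = new String[]{'widgetId'};
    }
}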

New controls for exploring Execution overlays, including heap dumps

The developer console currently provides a viewer for execution overlay output created by an ApexExecutionOverlayAction record. This includes the Heap Dump, which has all sorts of interesting details about the data stored in heap/memory for the transaction at the point in time the overlay was captured.

Right now it mostly just replicates the existing developer console functionality and allows browsing the stored objects by type.

Other changes 3.13

  • [cs32197] SFDCExplorer: Option to save a debug log directly to disk.
  • [cs32152] SFDCExplorer: Auto add metadata files when adding package components. Update display of Metadata Deploy results.
  • [cs32150] SFDCExplorer: Allow the target save directory to be specified when doing an unpackaged metadata export (if the working directory isn't accessible)
  • [cs32149] SFDCExplorer: Use internal list of known key prefixes with the Entity Explorer
  • [cs32148] SFDCExplorer: Option on running SOQL queries to include deleted rows
  • [cs32003] Update to Winter '20 release. v47.0
  • [cs31838] Wsdl2Apex: Handle an operation binding with no output. Raising a warning for a missing portType rather than an exception.
  • [cs31820] SFDCExplorer: Hide JavaScript errors during OAuth login

Wednesday, November 27, 2019

Dreamforce 2019 Roundup / Summary

The recap for this Dreamforce is going to come across as a bit of a photo journal as I go back through my phone's timeline to piece back together what I got up to. Somehow, even on my 6th time attending this conference, things still went past at a frantic pace.

Table of Contents

  1. Preconference
  2. Day One
    • Main Keynote
    • Meet the Developers
  3. Day Two
    • Developer Keynote
    • My Presentation
    • True to the Core
    • Dreamfest
  4. Day Three
  5. Day Four

Preconference

A visit to the Salesforce Tower for the Tooling Partner meeting and the Dreamforce speaker party.

Day One - 20th Nov

Main keynote

Meet the Developers

  • This session wasn't recorded, but there were a number of good questions raised. I asked about possible ways to streamline the reporting and monitoring of GACKs.

Platform Cache: Why Implementing Platform Cache Is Just Like Riding a Bike

Day Two - 21st Nov

Developer Keynote

Reducing the Cost of Enforcing CRUD and FLS in the ESAPI

  • Presentation on updates to the ESAPI by Chris and myself

True to the Core

Entity Interfaces

Dreamfest

  • Bands this year were Beck and Fleetwood Mac

Day Three - 22nd

Open conversation about DX

Everything You Need to Know on Apex Debuggers

Quip and Into the blue parties

Day 4 - 23rd

Salesforce Evergreen & Evergreen Functions: Evented Serverless Consumer Apps (2)

See Also

Friday, October 25, 2019

Dreamforce 2019 Session picks

Here are some of my current picks for Dreamforce 2019 sessions. I'm aiming for a mix of developer related topics in areas I want to learn more about plus anything that sounds informative. It isn't an exhaustive list and there are certainly some other sessions that I'll be adding.

Most important session that I'm definitely going to attend

I might be biased, but this session is on my must attend list. Yes, that's the same thing I said last year.

This year Chris Peterson and I will cover how taking up stripping can improve application security!
Think of combining the existing rigid security focus of the Salesforce Enterprise Security API (ESAPI) with the flexibility of the new Security.stripInaccessible() method into a hybrid that is both formally secure and stripped down on CPU usage!
  • Reducing the cost of enforcing CRUD and FLS in the ESAPI
    Reworking the ESAPI to use the new pilot Apex feature to dramatically reduce the cost of checking FLS in Apex. Explore how the new Security.stripInaccessible (pilot) method can be utilized to enforce FLS during CRU[D] operations in the context of the existing ESAPI library.

Keynotes

Meet The *'s

Apex

Customer 360

Salesforce DX

APIs

Platform Events

Platform

Einstein / AI

ISV

Fun

Lightning

  • TODO - Need something to fill out this area to be a well rounded developer. Lightning Roadmap maybe?

See also

Monday, June 24, 2019

FuseIT SFDC Explorer 3.12.19175.1 - Summer '19

Another roundup of some of the changes to the FuseIT SFDC Explorer since the 3.11.19071.3 release.

Fix for Salesforce breaking the Partner API

One of the primary reasons to use the Partner API in an application is that it is org agnostic. It doesn't really care which org you are connecting to and will adjust to fit the shape of the org it is currently working with. So as long as you don't change the API version in the URL it will keep working indefinitely. Or at least, that is the theory...

Sadly, recent changes around the Real-Time Event Monitoring Beta caused metadata on a JSON field type to leak back into older Salesforce API versions. See Known Issue: DescribeSObject call using the v45.0 Partner API are failing due to a complex type(json) that isn't defined in the WSDL.

By updating to v46.0 of the Partner API we now avoid this issue. In older versions it manifests as a "There is an error in XML document" exception when making the describe call.

View Executed Units from an Apex Log

It's true, the Developer Console already has this information in a sub tab under the Execution Overview. Right now my version just shows the execution count for each Apex method and the basic execution time stats. If it isn't showing up, try turning on the timeline or parse tree first. The debug log will need to show the METHOD_ENTRY and METHOD_EXIT entries, so the Apex Code logging level will need to be FINE or finer.

Wsdl2Apex : Make it easier to debug out stub objects

Currently, if you try and debug out any of the proxy objects that get created for a SOAP callout you end up with a ton of extra *_type_info arrays. While necessary for WebServiceCallout.invoke, they have zero use when debugging the state of the request. They are generally constant and don't really reflect the useful state details.

I'd previously written about using JSON serialization to make repeating a SOAP callout easier. Then it occurred to me: why not make all the _type_info arrays transient? Then they won't even appear in the debug output. That's exactly what I've done with the current Wsdl2Apex build.
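
In generated-stub terms the change is as simple as it sounds: each *_type_info array picks up the transient keyword, so it drops out of the serialized debug output while remaining available to the callout machinery. A trimmed example with invented element names:

public class GetWidgetResponse {
    public String widgetName;
    public Decimal widgetPrice;

    // Transient: no longer included when the response object is serialized for
    // debugging, but still present for WebServiceCallout.invoke to use.
    private transient String[] widgetName_type_info = new String[]{'widgetName', 'http://example.com/services/v1', null, '0', '1', 'false'};
    private transient String[] widgetPrice_type_info = new String[]{'widgetPrice', 'http://example.com/services/v1', null, '0', '1', 'false'};
    private transient String[] apex_schema_type_info = new String[]{'http://example.com/services/v1', 'true', 'false'};
    private transient String[] field_order_type_info = new String[]{'widgetName', 'widgetPrice'};
}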

Sadly, transient doesn't work for object.toString(). I did raise an idea for it - Provide a mechanism to exclude object variables/properties from toString()

Reciprocal Logging Event selection and highlighting

There are additional context menu options on a debug log to navigate between related opening and closing events. When applicable you can also select the parent event.

Other changes 3.12

  • Update all API calls to use v46.0 Summer '19
  • Support for monitoring multiple test jobs.
  • ApexTestRunner: New async methods for checking test status. Beta throttled test runner.
  • Fix bug where the 10th column in SOQL results was always being blanked out.
  • Data Export CLI: Allow for parameters to vary in order.
  • Expanded support for deploying different Metadata types.
  • MetadataServiceWrapper - Support for .layout metadata types. Improve creation of package zips - only create the required folders.
  • Option to extract login credentials from a URL formed with "un" and "pw" query string parameters.
  • Wsdl2Apex: Handle SimpleType parameters. Warn if a Union simpleType is encountered.
  • Fix serverURL construction from sfdx cli logins.
  • Handle exceptions with SFDX logins

Tuesday, April 9, 2019

Trailhead: Deep Learning and Natural Language Processing

Recently I tackled the Deep Learning and Natural Language Processing Trailhead module.

I can best summarize the experience with an image based on the Suppose you have one rabbit meme explanation of arithmetic:

To be fair, the first unit in the module does call out some prerequisites:

This is an advanced topic, and this module assumes you have a basic understanding of machine learning vocabulary, some experience with Python, and at least a little hands-on experience working with machine learning data and algorithms. If you don’t already have that background, you can get yourself up to speed using the following resources.

I personally think I meet those prerequisites. While I don't work with Python day to day, the syntax is familiar enough that I figured I could fake it till I make it.

The first two units were reasonably straightforward.

However, by the third unit - Apply Deep Learning to Natural Language Processing - it started to get complicated. In particular, with the Hands-on Logistic Regression question 6, and to a lesser extent question 7. These both required completing several TODO lines of Python to derive the loss after 100 epochs.

Let's look at the code from the first part that needed to be completed in the TODO sections:

The challenge here is, as with all programming, that all the steps need to be completed successfully before you will get the expected answer. Get any of them wrong and things will go pear-shaped fast. I could save you the pain of solving this challenge and provide you the full script, but that doesn't really go with the spirit of Trailhead. Instead I'll add some debugging output at various points to show the state of the tensors, to hopefully make it clearer what should be happening. At least then if things start to go off track you can pick up the problem immediately.

Exercise 6

TODO: Generate 2 clusters of 100 2d vectors, each one distributed normally, using only two calls of randn()

print(classApoints)
tensor([[ 0.3374, -0.1778],
        [-0.3035, -0.5880],
        [ 0.3486,  0.6603],
        [-0.2196, -0.3792],
        #...
        [-0.7952, -0.9178],
        [ 0.4187, -1.1123],
        [ 1.1227,  0.2646],
        [-0.4698,  1.0866],
        [-0.8892,  0.7647]])
assert(classApoints.size() == torch.Size([100, 2]))
print(classBpoints)
tensor([[ 0.4771,  0.7203],
        [-0.0215,  1.0731],
        [-0.1408, -0.5394],
        [-1.2782, -0.8107],
        #...
        [ 1.1051, -0.5454],
        [ 0.1073,  0.8727],
        [-1.2800, -0.4619],
        [ 1.4342, -1.2103],
        [ 1.3834,  0.0324]])
assert(classBpoints.size() == torch.Size([100, 2]))

TODO: Add the vector [1.0,3.0] to the first cluster and [3.0,1.0] to the second.

print(classApoints)
tensor([[ 1.3374,  2.8222],
        [ 0.6965,  2.4120],
        [ 1.3486,  3.6603],
        [ 0.7804,  2.6208],
        #...
        [ 0.2048,  2.0822],
        [ 1.4187,  1.8877],
        [ 2.1227,  3.2646],
        [ 0.5302,  4.0866],
        [ 0.1108,  3.7647]])
print(classBpoints)
tensor([[ 3.4771,  1.7203],
        [ 2.9785,  2.0731],
        [ 2.8592,  0.4606],
        [ 1.7218,  0.1893],
        #...
        [ 4.1051,  0.4546],
        [ 3.1073,  1.8727],
        [ 1.7200,  0.5381],
        [ 4.4342, -0.2103],
        [ 4.3834,  1.0324]])

TODO: Concatenate these two clusters along dimension 0 so that the points distributed around [1.0, 3.0] all come first

print(inputs)
tensor([[ 1.3374,  2.8222],
        [ 0.6965,  2.4120],
        [ 1.3486,  3.6603],
        [ 0.7804,  2.6208],
        #...
        [ 4.1051,  0.4546],
        [ 3.1073,  1.8727],
        [ 1.7200,  0.5381],
        [ 4.4342, -0.2103],
        [ 4.3834,  1.0324]])

print(inputs.size())
torch.Size([200, 2])

TODO: Create a tensor of target values, 0 for points for the first cluster and 1 for the points in the second cluster.

print(classA)
tensor([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0])

print(classA.size())
torch.Size([100])
print(classB)
tensor([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1])

print(classB.size())
torch.Size([100])

# TODO: Initialize a Linear layer to output scores for each class given the 2d examples

print(model)
Linear(in_features=2, out_features=2, bias=True)

# TODO: Define your loss function

Here you need to decide if you are going to use:

  1. MSE Loss for Regression
  2. NLLLoss for Classification
  3. CrossEntropyLoss for Classification

Worst case here, you could just try each one at a time until you get an answer that matches the expected answers in the Trailhead challenge.

Finishing up exercise 6

After that the rest should fall into place fairly easily based on the prior examples.

Exercise 7: Logistic Regression with a Neural Network

Firstly, the instructions when I completed this included this:

# forward takes self and x as input
#         passes x through linear_to_hidden, then a tanh activation function,
#         and then hidden_to_linear and returns the output

I suspect that should actually be "and then hidden_to_output and returns the output".

The torch.tanh function is required for forward.

Initialize your new network to have in_size 2, hidden_size 6, and out_size 2

You are going to use the NeuralNet class that you just completed defining here.

Define your loss function

As with exercise 6, you can just try the 3 example loss functions to find a good fit for the expected answers.

Finishing up exercise 7

Most of the other parts for this question all fall into place based on the prior example.

Results

It would be accurate to say that this last unit took me way longer than the 90 minutes that Trailhead indicated. But dammit I earned those 100 measly points.

Monday, April 1, 2019

Trailhead Electric Imp plus plus

After completing the Trailhead Electric Imp project I was left wanting an excuse to access a number of other features of the imp001 hardware, namely the accelerometer and pressure sensor. Yet there didn't seem to be many opportunities for a static fridge to fully utilize them.

While I was experimenting with this back in May 2018 the #BeABuilder challenge happened and I got sidetracked using the hardware for that instead. Somewhere in that process I never got around to hitting publish on this blog post, so here we are a year or so later with a better-late-than-never post.

Firstly I expanded the data model in Salesforce so I'd have fields to receive the additional data. This included the acceleration and pressure readings.

Out of interest I tried getting the IoT Orchestration Traffic into a Lightning component. I didn't have much luck at the time and it didn't really like being iframed in.

Hunting through Salesforce Labs I found the salesforce-iot-toolkit. This provided some great visualizations via platform events as the data came in from the fridge. It directly monitored the platform events to drive the graphs. I found I needed to use the source-based version rather than the AppExchange version, which was missing a number of features.

Dramatic reenactment of sensor usage

Modified Agent Nut Source

A few points of interest:

  1. My source was the version before they added proper refresh token support to handle expired sessions. So you probably don't want my version of getStoredCredentials()
  2. Added #require statements to access additional sensors.
  3. Dropped the READING_INTERVAL_SEC down to 1 second to improve the accelerometer data.
  4. Retrieved the X,Y, and Z axis acceleration values.
  5. Retrieved the pressure reading and the additional temperature reading from that sensor as well.
  6. Set the RGB color based on the current orientation indicated by the acceleration.
  7. The LIS2DH12 sensor needed a non-default I2C address of 0x32.

Modified Salesforce Platform Event Trigger Code

Nothing fancy here. Just collect the additional properties then map them to the new custom fields.

trigger SmartFridgeReadingReceived on Smart_Fridge_Reading__e (after insert) {
    List<SmartFridge__c> records = new List<SmartFridge__c>();
    for (Smart_Fridge_Reading__e event : Trigger.New) {
        SmartFridge__c record = new SmartFridge__c();
        record.deviceId__c = event.deviceId__c;
        record.temperature__c = event.temperature__c;
        record.humidity__c = event.humidity__c;
        record.door__c = event.door__c;
        record.ts__c = event.ts__c;
        record.accX__c = event.accX__c;
        record.accY__c = event.accY__c;
        record.accZ__c = event.accZ__c;
        record.pressure__c = event.pressure__c;
        record.temp2__c = event.temp2__c;
        records.add(record);
    }
    insert records;
}