Tuesday, April 9, 2019

Trailhead: Deep Learning and Natural Language Processing

Recently I tackled the Deep Learning and Natural Language Processing Trailhead module.

I can best summarize the experience with an image based on the "Suppose you have one rabbit" meme explanation of arithmetic:

To be fair, the first unit in the module does call out some prerequisites:

This is an advanced topic, and this module assumes you have a basic understanding of machine learning vocabulary, some experience with Python, and at least a little hands-on experience working with machine learning data and algorithms. If you don’t already have that background, you can get yourself up to speed using the following resources.

I personally think I meet those prerequisites. While I don't work with Python day to day, the syntax is familiar enough that I figured I could fake it till I make it.

The first two units were reasonably straightforward.

However, by the third unit, Apply Deep Learning to Natural Language Processing, it started to get complicated. In particular, with the Hands-on Logistic Regression question 6 and, to a lesser extent, question 7. Both required completing several TODO lines of Python to derive the loss after 100 epochs.

Let's look at the code from the first part that needed to be completed in the TODO sections:

The challenge here is, as with all programming, that every step needs to be completed successfully before you will get the expected answer. Get any of them wrong and things will go pear-shaped fast. I could save you the pain of solving this challenge and provide you with the full script, but that doesn't really go with the spirit of Trailhead. Instead I'll add some debugging outputs at various points to show the state of the tensors, to hopefully make it clearer what should be happening. At least then, if things start to go off track, you can pick up the problem immediately.
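If you want to do the same kind of checkpointing in your own notebook, a tiny helper along these lines is enough to dump a tensor's shape and first few rows after each step (the name debug_tensor is just my own, not part of the Trailhead notebook):

def debug_tensor(name, t):
    # Show the tensor's name, shape, and first few rows so each TODO
    # can be sanity-checked against the expected outputs below.
    print(name, tuple(t.size()))
    print(t[:4])

# e.g. debug_tensor("classApoints", classApoints)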

Exercise 6

TODO: Generate 2 clusters of 100 2d vectors, each one distributed normally, using only two calls of randn()

print(classApoints)
tensor([[ 0.3374, -0.1778],
        [-0.3035, -0.5880],
        [ 0.3486,  0.6603],
        [-0.2196, -0.3792],
        #...
        [-0.7952, -0.9178],
        [ 0.4187, -1.1123],
        [ 1.1227,  0.2646],
        [-0.4698,  1.0866],
        [-0.8892,  0.7647]])
assert(classApoints.size() == torch.Size([100, 2]))
print(classBpoints)
tensor([[ 0.4771,  0.7203],
        [-0.0215,  1.0731],
        [-0.1408, -0.5394],
        [-1.2782, -0.8107],
        #...
        [ 1.1051, -0.5454],
        [ 0.1073,  0.8727],
        [-1.2800, -0.4619],
        [ 1.4342, -1.2103],
        [ 1.3834,  0.0324]])
assert(classBpoints.size() == torch.Size([100, 2]))

TODO: Add the vector [1.0,3.0] to the first cluster and [3.0,1.0] to the second.

print(classApoints)
tensor([[ 1.3374,  2.8222],
        [ 0.6965,  2.4120],
        [ 1.3486,  3.6603],
        [ 0.7804,  2.6208],
        #...
        [ 0.2048,  2.0822],
        [ 1.4187,  1.8877],
        [ 2.1227,  3.2646],
        [ 0.5302,  4.0866],
        [ 0.1108,  3.7647]])
print(classBpoints)
tensor([[ 3.4771,  1.7203],
        [ 2.9785,  2.0731],
        [ 2.8592,  0.4606],
        [ 1.7218,  0.1893],
        #...
        [ 4.1051,  0.4546],
        [ 3.1073,  1.8727],
        [ 1.7200,  0.5381],
        [ 4.4342, -0.2103],
        [ 4.3834,  1.0324]])

TODO: Concatenate these two clusters along dimension 0 so that the points distributed around [1.0, 3.0] all come first

print(inputs)
tensor([[ 1.3374,  2.8222],
        [ 0.6965,  2.4120],
        [ 1.3486,  3.6603],
        [ 0.7804,  2.6208],
        #...
        [ 4.1051,  0.4546],
        [ 3.1073,  1.8727],
        [ 1.7200,  0.5381],
        [ 4.4342, -0.2103],
        [ 4.3834,  1.0324]])

print(inputs.size())
torch.Size([200, 2])

TODO: Create a tensor of target values, 0 for points in the first cluster and 1 for points in the second cluster.

print(classA)
tensor([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0])

print(classA.size())
torch.Size([100])
print(classB)
tensor([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1])

print(classB.size())
torch.Size([100])

TODO: Initialize a Linear layer to output scores for each class given the 2d examples

print(model)
Linear(in_features=2, out_features=2, bias=True)
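For what it's worth, that printed repr is what you get from a layer built roughly like this (assuming the notebook has torch.nn imported as nn):

import torch.nn as nn

# A linear layer mapping each 2d input point to a score for each of the 2 classes.
model = nn.Linear(2, 2)
print(model)  # Linear(in_features=2, out_features=2, bias=True)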

TODO: Define your loss function

Here you need to decide if you are going to use:

  1. MSE Loss for Regression
  2. NLLLoss for Classification
  3. CrossEntropyLoss for Classification

Worst case, you could just try each one in turn until you get an answer that matches the expected answers in the Trailhead challenge.
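For reference, instantiating each of the three candidates looks roughly like this; which one lines up with the expected loss values is left as part of the exercise:

import torch.nn as nn

# The three candidates from the list above; only one will match the expected answers.
mse_loss = nn.MSELoss()                     # 1. regression
nll_loss = nn.NLLLoss()                     # 2. classification, expects log-probabilities
cross_entropy_loss = nn.CrossEntropyLoss()  # 3. classification, expects raw scores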

Finishing up exercise 6

After that the rest should fall into place fairly easily based on the prior examples.

Exercise 7: Logistic Regression with a Neural Network

Firstly, the instructions when I completed this included this comment:

# forward takes self and x as input
# passes x through linear_to_hidden, then a tanh activation function,
# and then hidden_to_linear and returns the output

I suspect that should actually be "and then hidden_to_output and returns the output".

The torch.tanh function is required for forward.

Initialize your new network to have in_size 2, hidden_size 6, and out_size 2

You are going to use the NeuralNet class that you just completed defining here.
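Pulling the quoted comments and sizes together, my reading of what the class should look like is roughly this (a sketch of my interpretation, not the module's official solution):

import torch
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(self, in_size, hidden_size, out_size):
        super().__init__()
        # Two linear layers with a tanh non-linearity between them.
        self.linear_to_hidden = nn.Linear(in_size, hidden_size)
        self.hidden_to_output = nn.Linear(hidden_size, out_size)

    def forward(self, x):
        # x -> linear_to_hidden -> tanh -> hidden_to_output
        return self.hidden_to_output(torch.tanh(self.linear_to_hidden(x)))

model = NeuralNet(2, 6, 2)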

Define your loss function

As with exercise 6, you can just try the three example loss functions to find one that fits the expected answers.

Finishing up exercise 7

Most of the other parts of this question fall into place based on the prior examples.

Results

It would be accurate to say that this last unit took me way longer than the 90 minutes Trailhead indicated. But dammit, I earned those 100 measly points.

Monday, April 1, 2019

Trailhead Electric Imp plus plus

After completing the Trailhead Electric Imp project I was left wanting an excuse to access a number of other features of the imp001 hardware, namely the accelerometer and pressure sensor. Yet there didn't seem to be many opportunities for a static fridge to fully utilize them.

While I was experimenting with this back in May 2018 the #BeABuilder challenge happened, and I got sidetracked using the hardware for that instead. Somewhere in that process I never got around to hitting publish on this blog post, so here we are a year or so later with a better-late-than-never post.

Firstly, I expanded the data model in Salesforce so I'd have fields to receive the additional data. This included the acceleration and pressure readings.

Out of interest I tried getting the IoT Orchestration Traffic into a Lightning component. I didn't have much luck at the time, and it didn't really like being iframed in.

Hunting through Salesforce Labs I found the salesforce-iot-toolkit. This provided some great visualizations as the data came in from the fridge; it directly monitored the platform events to drive the graphs. I found I needed to use the source-based version rather than the AppExchange version, which was missing a number of features.

Dramatic reenactment of sensor usage

Modified Agent Nut Source

A few points of interest:

  1. My source was the version before they added proper refresh token support to handle expired sessions, so you probably don't want my version of getStoredCredentials().
  2. Added #require statements to access additional sensors.
  3. Dropped the READING_INTERVAL_SEC down to 1 second to improve the accelerometer data.
  4. Retrieved the X, Y, and Z axis acceleration values.
  5. Retrieved the pressure reading, plus the additional temperature reading that sensor provides.
  6. Set the RGB color based on the current orientation indicated by the acceleration.
  7. The LIS2DH12 sensor needed a non-default I2C address of 0x32.

Modified Salesforce Platform Event Trigger Code

Nothing fancy here. Just collect the additional properties and map them to the new custom fields.

trigger SmartFridgeReadingReceived on Smart_Fridge_Reading__e (after insert) {
    List<SmartFridge__c> records = new List<SmartFridge__c>();
    for (Smart_Fridge_Reading__e event : Trigger.New) {
        SmartFridge__c record = new SmartFridge__c();
        record.deviceId__c = event.deviceId__c;
        record.temperature__c = event.temperature__c;
        record.humidity__c = event.humidity__c;
        record.door__c = event.door__c;
        record.ts__c = event.ts__c;
        record.accX__c = event.accX__c;
        record.accY__c = event.accY__c;
        record.accZ__c = event.accZ__c;
        record.pressure__c = event.pressure__c;
        record.temp2__c = event.temp2__c;
        records.add(record);
    }
    insert records;
}

Tuesday, March 12, 2019

FuseIT SFDC Explorer 3.11.19071.3 - Spring '19

Another roundup of some of the changes to the FuseIT SFDC Explorer since the 3.9.18190.1 release.

Bypass sObject types based on suffix to speed up metadata retrieval

With the update to v45.0 of the APIs there is now describe metadata coming back for Change Data Capture with the suffix __ChangeEvent. This can be useful, except that every custom object also gets a change event. If you have a large number of custom objects it can quickly become overwhelming.

I've added a filter option to the sObject metadata browser that defaults to only showing __c, __mdt, and __e. Other options can be turned on if you want to see them. A refresh is required for the changes to take effect.

Expanded Asynchronous Test Results view

After the async tests have finished running you will see the total duration by Apex class and method. This can be useful for identifying which test methods contribute the most to the overall test run time.

Expose the logging levels in the menu

You can alter the current logging levels for the current DebugLevel and then use "Update Current TraceFlag" to apply them to Salesforce.

The "Disable Logging" menu option will remove the current TraceFlag, which will stop further logging from occurring.

Wsdl2Apex

There have been a number of changes to the Wsdl2Apex processing to support more edge cases:

  1. Allow for schema imports via schema imports (nested imports). Includes checks for circular references.
  2. If the complex type can't be found in the current schema, search the imported schemas for it.
  3. If a SimpleType has a restriction that isn't primitive, search the underlying types for the primitive type that can be converted to Apex.
  4. Detect repeated imported schema target namespaces.
  5. Use common code to create Apex class names from complex types.
  6. Add generated Apex comments to classes that extend a base type.
  7. Attempt to compile the XmlSchema to resolve groups so they can be automatically expanded.
  8. Include warning messages on web methods where the return type can't be resolved from the namespace.

Include Apex class name in code coverage breakdown

The Apex class name and test method name will be included with the class coverage breakdown details.

JWT Authentication for Salesforce sessions

It is now possible to use the OAuth 2.0 JWT Bearer Token flow to authenticate with Salesforce. I'll expand on the steps to do this shortly, which will include configuring the connected app and registering the X509 Certificate.
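In the meantime, for anyone new to the flow itself, the gist of it is signing a short-lived JWT with the certificate's private key and exchanging it at the token endpoint. A rough Python sketch of the protocol (using the PyJWT and requests libraries purely for illustration; the explorer does the equivalent in .NET, and the consumer key, username, and key file below are placeholders):

import time
import jwt        # PyJWT
import requests

consumer_key = "3MVG9..."          # connected app consumer key (placeholder)
username = "user@example.com"      # user to get a session for (placeholder)
with open("server.key") as f:      # private key matching the uploaded certificate
    private_key = f.read()

claims = {
    "iss": consumer_key,
    "sub": username,
    "aud": "https://login.salesforce.com",  # or https://test.salesforce.com for sandboxes
    "exp": int(time.time()) + 300,          # short-lived assertion
}
assertion = jwt.encode(claims, private_key, algorithm="RS256")

response = requests.post(
    "https://login.salesforce.com/services/oauth2/token",
    data={
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": assertion,
    },
)
print(response.json())  # access_token and instance_url on success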

Other changes in 3.11

  • Middle click to hide tab. Improve tab hiding.
  • Improve support for using the ToolingAPI in the EntityViewer.
  • If directly updating a nullable int in the SOQL results treat it as such (rather than a string)
  • Format a VF_PAGE_MESSAGE in the log timeline
  • Format display of ApexResult and HeapDump from the Tooling API SOQL Queries.
  • Allow for a CultureInfo.DateTimeFormat.TimeSeparator other than ":". Force the SOQL building code to use the InvariantCulture so the required format is applied.
  • Update core SalesforceSession to use v45.0 (Spring 19) APIs
  • For the Apex Log Stack, include the Soql Execute Count
  • Option to run the most recent async test cases again after a successful metadata deployment.
  • EntityViewer - Allow for ToolingAPI entities.
  • Event Log Viewer. Only show the first 4000 rows of data.
  • Log Timeline control. Show icons for Test.startTest() and Test.stopTest() if they are detected.
  • Handle failure to query for recent apex test runs.
  • When generating anonymous Apex to get the current Session ID include the SalesforceBaseUrl
  • Code Generation - If generating without custom fields, exclude picklist values containing "__". These typically represent relationships to custom sObjects.
  • Code Generation - Detect and skip duplicate picklist value entries.
  • Code Generation - Fix the generated properties for *Picklist to set based on the enum value rather than the description attribute.
  • Code Generation - Detect references to sObjects that aren't exposed via the metadata.
  • Regenerate sObjects (plus associated services and data sources) for API v44.0.
  • CodeGeneration - If configured, exclude relationships to custom fields.
  • Don't generate types for __ChangeEvent, __History, __Share
  • CodeGeneration - generate "extends" for Apex classes extending other classes.
  • Expand details for WebServiceCallout.invoke with comments.
  • ApexClass (for generation) - Track if the Apex classes extends a base class.
  • FieldService - When extracting child elements from a query result skip nodes that aren't XmlElements.
  • ToolingServiceWrapper - Add describeGlobalCached() that caches describeGlobal() to improve performance.
  • SalesforceSession - If an exception occurs with a WebRequest include the complete URL in the Exception.Data
  • Internal caching of the Tooling API DescribeGlobalResult Metadata
  • Auto-size columns in SOSL & single page SOQL results
  • Improve usage of direct Csv loading. Support bulk load for larger operations. Allow the primary sObject type to be defined.
  • Change the Apex log layout with the treeview between horizontal and vertical.
  • Define logging levels via a common menu. Handle hitting the maximum debug log count.
  • Capture SSL/TLS config issues and provide help link to dedicated help page.
  • When checking metadata deployment successes, highlight changed components vs. unchanged.