This post is co-authored by Daryl Martis, Director of Product, Salesforce Einstein AI.
This is the second post in a series discussing the integration of Salesforce Data Cloud and Amazon SageMaker. In Part 1, we show how the Salesforce Data Cloud and Einstein Studio integration with SageMaker allows businesses to access their Salesforce data securely using SageMaker and use its tools to build, train, and deploy models to endpoints hosted on SageMaker. The endpoints are then registered to the Salesforce Data Cloud to activate predictions in Salesforce.
In this post, we expand on this topic to demonstrate how to use Einstein Studio for product recommendations. You can use this integration for traditional models as well as large language models (LLMs).
Solution overview
In this post, we demonstrate how to create a predictive model in SageMaker to recommend the next best product to your customers by using historical data such as customer demographics, marketing engagements, and purchase history from Salesforce Data Cloud.
We use the following sample dataset. To use this dataset in your Data Cloud, refer to Create Amazon S3 Data Stream in Data Cloud.
The following attributes are needed to create the model:
- Club Member – Whether the customer is a club member
- Campaign – The campaign the customer is a part of
- State – The state or province the customer resides in
- Month – The month of buy
- Case Count – The number of cases raised by the customer
- Case Type Return – Whether the customer returned any product within the last year
- Case Type Shipment Damaged – Whether the customer had any shipments damaged in the last year
- Engagement Score – The level of engagement the customer has (response to mailing campaigns, logins to the online store, and so on)
- Tenure – The tenure of the customer relationship with the company
- Clicks – The average number of clicks the customer has made within a week prior to purchase
- Pages Visited – The average number of pages the customer has visited within a week prior to purchase
- Product Purchased – The actual product purchased
- Id – The ID of the record
- DateTime – The timestamp of the dataset
The product recommendation model is built and deployed on SageMaker and is trained using data in the Salesforce Data Cloud. The following steps give an overview of how to use the new capabilities launched in SageMaker for Salesforce to enable the overall integration:
- Set up the Amazon SageMaker Studio domain and OAuth between Salesforce and the AWS accounts.
- Use the newly launched capability of the Amazon SageMaker Data Wrangler connector for Salesforce Data Cloud to prepare the data in SageMaker without copying the data from Salesforce Data Cloud.
- Train a recommendation model in SageMaker Studio using training data that was prepared using SageMaker Data Wrangler.
- Package the SageMaker Data Wrangler container and the trained recommendation model container in an inference pipeline so the inference request can use the same data preparation steps you created to preprocess the training data. The real-time inference call data is first passed to the SageMaker Data Wrangler container in the inference pipeline, where it’s preprocessed and passed to the trained model for product recommendation. For more information about this process, refer to New – Introducing Support for Real-Time and Batch Inference in Amazon SageMaker Data Wrangler. Although we use a specific algorithm to train the model in our example, you can use any algorithm that you find appropriate for your use case.
- Use the newly launched SageMaker provided project template for Salesforce Data Cloud integration to streamline implementing the preceding steps by providing the following templates:
  - An example notebook showcasing data preparation, building, training, and registering the model.
  - The SageMaker provided project template for Salesforce Data Cloud integration, which automates creating a SageMaker endpoint hosting the inference pipeline model. When a version of the model in the Amazon SageMaker Model Registry is approved, the endpoint is exposed as an API with Amazon API Gateway using a custom Salesforce JSON Web Token (JWT) authorizer. API Gateway is required to allow Salesforce Data Cloud to make predictions against the SageMaker endpoint using a JWT token that Salesforce creates and passes with the request when making predictions from Salesforce. JWT can be used as a part of OpenID Connect (OIDC) and OAuth 2.0 frameworks to restrict client access to your APIs.
- After you create the API, we recommend registering the model endpoint in Salesforce Einstein Studio. For instructions, refer to Bring Your Own AI Models to Salesforce with Einstein Studio.
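To make the JWT step in the list above more concrete, here is a minimal Python sketch of how the claims carried inside a token can be inspected. It is not the authorizer that the project template deploys: the helper names and the sample claims (`iss`, `aud`) are illustrative assumptions, and the sketch deliberately skips signature verification, which a real authorizer must perform before trusting any claim.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    return json.loads(_b64url_decode(parts[1]))


# Unsigned sample token with hypothetical claims, for illustration only.
sample_token = ".".join([
    _b64url_encode({"alg": "RS256", "typ": "JWT"}),
    _b64url_encode({"iss": "https://example.my.salesforce.com",
                    "aud": "example-client-id"}),
    "signature-placeholder",
])
claims = decode_jwt_claims(sample_token)
```

A custom authorizer would typically compare claims such as the issuer and audience against expected values after verifying the signature, and reject the request otherwise.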
The following diagram illustrates the solution architecture.
Create a SageMaker Studio domain
First, create a SageMaker Studio domain. For instructions, refer to Onboard to Amazon SageMaker Domain. Note down the domain ID and the execution role that is created and will be used by your user profile. You add permissions to this role in subsequent steps.
The following screenshot shows the domain we created for this post.
The following screenshot shows the example user profile for this post.
Set up the Salesforce connected app
Next, we create a Salesforce connected app to enable the OAuth flow from SageMaker Studio to Salesforce Data Cloud. Complete the following steps:
- Log in to Salesforce and navigate to Setup.
- Search for App Manager and create a new connected app.
- Provide the following inputs:
  - For Connected App Name, enter a name.
  - For API Name, leave as default (it’s automatically populated).
  - For Contact Email, enter your contact email address.
  - Select Enable OAuth Settings.
  - For Callback URL, enter https://<domain-id>.studio.<region>.sagemaker.aws/jupyter/default/lab, and provide the domain ID that you captured while creating the SageMaker domain and the Region of your SageMaker domain.
- Under Selected OAuth Scopes, move the following from Available OAuth Scopes to Selected OAuth Scopes and choose Save:
  - Manage user data via APIs (api)
  - Perform requests at any time (refresh_token, offline_access)
  - Perform ANSI SQL queries on Salesforce Data Cloud data (cdp_query_api)
  - Manage Salesforce Customer Data Platform profile data (cdp_profile_api)
  - Access the identity URL service (id, profile, email, address, phone)
  - Access unique user identifiers (openid)
For more information about creating a connected app, refer to Create a Connected App.
- Return to the connected app and navigate to Consumer Key and Secret.
- Choose Manage Consumer Details.
- Copy the key and secret.
You may be asked to log in to your Salesforce org as part of the two-factor authentication here.
- Navigate back to the Manage Connected Apps page.
- Open the connected app you created and choose Manage.
- Choose Edit Policies and change IP Relaxation to Relax IP restrictions, then save your settings.
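The callback URL entered in the connected app follows a fixed pattern, so it can be assembled from the domain ID and Region you noted earlier. The following Python sketch shows this; the function name and the sample domain ID and Region are hypothetical placeholders, not values from this post.

```python
def build_callback_url(domain_id: str, region: str) -> str:
    """Fill in the SageMaker Studio callback URL pattern for the connected app."""
    if not domain_id or not region:
        raise ValueError("both domain_id and region are required")
    return f"https://{domain_id}.studio.{region}.sagemaker.aws/jupyter/default/lab"


# Placeholder values; substitute your own domain ID and Region.
callback_url = build_callback_url("d-exampledomain", "us-east-1")
```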
Configure SageMaker permissions and lifecycle rules
Create a secret in AWS Secrets Manager
Enable OAuth integration with Salesforce Data Cloud by storing credentials from your Salesforce connected app in AWS Secrets Manager:
- On the Secrets Manager console, choose Store a new secret.
- Select Other type of secret.
- Create your secret with the following key-value pairs:
- Add a tag with the key sagemaker:partner and your choice of value.
- Save the secret and note the ARN of the secret.
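If you prefer to script the console steps above, the following Python sketch assembles the arguments for the Secrets Manager `create_secret` call. The secret name, the tag value, and the `client_secret` key are assumptions made for illustration (`client_id` is the key this post refers to later); use the key-value pairs required by the integration, not this placeholder set.

```python
import json


def build_create_secret_args(name: str, credentials: dict, tag_value: str) -> dict:
    """Build keyword arguments for boto3's secretsmanager create_secret call."""
    return {
        "Name": name,
        "SecretString": json.dumps(credentials),
        # Tag key from the step above; the value is your choice.
        "Tags": [{"Key": "sagemaker:partner", "Value": tag_value}],
    }


# Hypothetical placeholders; never hard-code real credentials in source files.
secret_args = build_create_secret_args(
    name="salesforce-connected-app-example",
    credentials={"client_id": "<consumer-key>", "client_secret": "<consumer-secret>"},
    tag_value="salesforce",
)

# To create the secret for real (requires boto3 and AWS credentials):
#   import boto3
#   response = boto3.client("secretsmanager").create_secret(**secret_args)
#   print(response["ARN"])  # note the ARN for the following steps
```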
Configure a SageMaker lifecycle rule
The SageMaker Studio domain execution role will require AWS Identity and Access Management (IAM) permissions to access the secret created in the previous step. For more information, refer to Creating roles and attaching policies (console).
- On the IAM console, attach the following policies to their respective roles (these roles will be used by the SageMaker project for deployment):
  - Add the policy AmazonSageMakerPartnerServiceCatalogProductsCloudFormationServiceRolePolicy to the service role AmazonSageMakerServiceCatalogProductsCloudformationRole.
  - Add the policy AmazonSageMakerPartnerServiceCatalogProductsApiGatewayServiceRolePolicy to the service role AmazonSageMakerServiceCatalogProductsApiGatewayRole.
  - Add the policy AmazonSageMakerPartnerServiceCatalogProductsLambdaServiceRolePolicy to the service role AmazonSageMakerServiceCatalogProductsLambdaRole.
- On the IAM console, navigate to the SageMaker domain execution role.
- Choose Add permissions and select Create an inline policy.
- Enter the following policy in the JSON policy editor:
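As a rough sketch only, assuming the execution role needs nothing more than read access to the one secret, a policy of this kind could be generated as follows. The function name and sample ARN are hypothetical, and the actions and conditions the integration actually requires may be broader than the single `secretsmanager:GetSecretValue` action assumed here.

```python
import json


def build_secret_read_policy(secret_arn: str) -> dict:
    """Return an IAM policy document that allows reading a single secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: read-only access is sufficient for this sketch.
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": secret_arn,
            }
        ],
    }


# Placeholder ARN; use the ARN you noted when saving the secret.
policy_json = json.dumps(
    build_secret_read_policy(
        "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-secret-AbCdEf"
    ),
    indent=2,
)
```

Printing `policy_json` gives text that can be pasted into the JSON policy editor and adjusted from there.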
SageMaker Studio lifecycle configuration provides shell scripts that run when a notebook is created or started. The lifecycle configuration will be used to retrieve the secret and import it to the SageMaker runtime.
- On the SageMaker console, choose Lifecycle configurations in the navigation pane.
- Choose Create configuration.
- Leave the default selection Jupyter Server App and choose Next.
- Give the configuration a name.
- Enter the following script in the editor, providing the ARN for the secret you created earlier:
- Choose Submit to save the lifecycle configuration.
- Choose Domains in the navigation pane and open your domain.
- On the Environment tab, choose Attach to attach your lifecycle configuration.
- Choose the lifecycle configuration you created and choose Attach to domain.
- Choose Set as default.
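The lifecycle configuration itself is a shell script, but the retrieval step it performs can be illustrated in Python. The sketch below is an assumption-laden illustration rather than the script used in this post: it takes a Secrets Manager client as a parameter, calls `get_secret_value`, and parses the JSON secret string. The fake client and the sample ARN exist only so the example runs without AWS access.

```python
import json


def load_connected_app_secret(secrets_client, secret_arn: str) -> dict:
    """Fetch a JSON secret from Secrets Manager and return it as a dict."""
    response = secrets_client.get_secret_value(SecretId=secret_arn)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Offline stand-in for boto3.client('secretsmanager'), for illustration."""

    def get_secret_value(self, SecretId):
        # Mimics the shape of the real response for the fields used above.
        return {
            "ARN": SecretId,
            "SecretString": json.dumps({"client_id": "<consumer-key>"}),
        }


secret = load_connected_app_secret(
    FakeSecretsClient(),
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-secret-AbCdEf",
)
```

With real credentials, passing `boto3.client("secretsmanager")` instead of the fake client performs the same lookup against your account.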
If you are a returning user to SageMaker Studio, upgrade to the latest Jupyter and SageMaker Data Wrangler kernels to ensure that Salesforce Data Cloud is enabled.
This completes the setup to enable data access from Salesforce Data Cloud to SageMaker Studio to build AI and machine learning (ML) models.
Create a SageMaker project
To start using the solution, first create a project using Amazon SageMaker Projects. Complete the following steps:
- In SageMaker Studio, under Deployments in the navigation pane, choose Projects.
- Choose Create project.
- Choose the project template called Model deployment for Salesforce.
- Choose Select project template.
- Enter a name and optional description for your project.
- Enter a model group name.
- Enter the name of the Secrets Manager secret that you created earlier.
- Choose Create project.
The project may take 1–2 minutes to initiate.
You can see two new repositories. The first one is for sample notebooks that you can use as is or customize to prepare, train, create, and register models in the SageMaker Model Registry. The second repository is for automating the model deployment, which includes exposing the SageMaker endpoint as an API.
- Choose clone repo for both repositories.
For this post, we use the product recommendation example, which can be found in the sagemaker-<YOUR-PROJECT-NAME>-p-<YOUR-PROJECT-ID>-example-nb/product-recommendation directory that you just cloned. Before we run the product-recommendation.ipynb notebook, let’s do some data preparation to create the training data using SageMaker Data Wrangler.
Prepare data with SageMaker Data Wrangler
Complete the following steps:
- In SageMaker Studio, on the File menu, choose New and Data Wrangler flow.
- After you create the data flow, choose (right-click) the tab and choose Rename to rename the file.
- Choose Import data.
- Choose Create connection.
- Choose Salesforce Data Cloud.
- For Name, enter salesforce-data-cloud-sagemaker-connection.
- For Salesforce org URL, enter your Salesforce org URL.
- Choose Save + Connect.
- In the Data Explorer view, select and preview the tables from the Salesforce Data Cloud to create and run the query to extract the required dataset.
- Your query will look like the following. You can use the table name that you used while importing data in Salesforce Data Cloud.
- Choose Create dataset.
Creating the dataset may take some time.
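A query of this kind is a plain SQL `SELECT` over the columns you need. As a hedged illustration, the Python sketch below assembles such a statement from a column list. The table name `example_product_purchase__dll` is a hypothetical placeholder, and the column list simply reuses field names mentioned in the transformation steps of this post; adjust both to match the objects in your own Data Cloud.

```python
from typing import Sequence


def build_select_query(table: str, columns: Sequence[str]) -> str:
    """Assemble a simple SELECT statement for the listed columns."""
    if not columns:
        raise ValueError("at least one column is required")
    return f"SELECT {', '.join(columns)} FROM {table}"


# Hypothetical table name; the column names follow the fields used in this post.
query = build_select_query(
    "example_product_purchase__dll",
    ["id__c", "state__c", "case_count__c", "tenure", "clicks__c",
     "month__c", "club__member__c", "campaign__c", "product_purchased__c"],
)
```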
In the data flow view, you can now see a new node added to the visual graph.
For more information on how you can use SageMaker Data Wrangler to create Data Quality and Insights Reports, refer to Get Insights On Data and Data Quality.
SageMaker Data Wrangler offers over 300 built-in transformations. In this step, we use some of these transformations to prepare the dataset for an ML model. For detailed instructions on how to implement these transformations, refer to Transform Data.
- Use the Manage columns step with the Drop column transform to drop the column id__c.
- Use the Handle missing step with the Drop missing transform to drop rows with missing values for various features. We apply this transformation on all columns.
- Use a custom transform step to create categorical values for the state__c, case_count__c, and tenure features. Use the following code for this transformation:
- Use the Process numeric step with the Scale values transform and choose Standard scaler to scale the clicks__c, engagement__score, and pages__visited__c features.
- Use the Encode categorical step with the One-hot encode transform to convert categorical variables to numeric for the case__type__return___c, case__type_shipment__damaged, month__c, club__member__c, and campaign__c features (all features except clicks__c, engagement__score, pages__visited__c, and product_purchased__c).
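To show what the bucketing, scaling, and encoding steps above do to the values, here is a small plain-Python sketch. It is not the Data Wrangler implementation and not the custom transform used in this post: the bucket thresholds, labels, and sample values are hypothetical, and the scaler divides by the population standard deviation, which may differ slightly from the convention Data Wrangler uses.

```python
from statistics import mean, pstdev


def bucketize(value, upper_bounds, labels):
    """Return the label of the first bucket whose upper bound covers the value.

    `labels` needs one more entry than `upper_bounds`; the last label
    catches everything above the highest bound.
    """
    for bound, label in zip(upper_bounds, labels):
        if value <= bound:
            return label
    return labels[-1]


def standard_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Turn each category into a dict of 0/1 indicator columns."""
    categories = sorted(set(values))
    return [{c: int(v == c) for c in categories} for v in values]


# Hypothetical thresholds and sample rows, for illustration only.
tenure_bucket = bucketize(3, upper_bounds=[1, 5],
                          labels=["new", "established", "long-term"])
scaled_clicks = standard_scale([1.0, 2.0, 3.0])
encoded_month = one_hot(["Jan", "Feb", "Jan"])
```

Scaling puts features such as clicks and pages visited on a comparable range, and one-hot encoding gives each category its own numeric column so that no artificial ordering is implied.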
Model building, training, and deployment
To build, train, and deploy the model, complete the following steps:
- Return to the SageMaker project, open the product-recommendation.ipynb notebook, and run a processing job to preprocess the data using the SageMaker Data Wrangler configuration you created.
- Follow the steps in the notebook to train a model and register it to the SageMaker Model Registry.
- Make sure to update the model group name to match the model group name that you used while creating the SageMaker project.
To locate the model group name, open the SageMaker project that you created earlier and navigate to the Settings tab.
Similarly, the flow file referenced in the notebook must match the flow file name that you created earlier.
- For this post, we used product-recommendation as the model group name, so we update the notebook with product-recommendation as the model group name.
After the notebook is run, the trained model is registered in the Model Registry. To learn more about the Model Registry, refer to Register and Deploy Models with Model Registry.
- Select the model version you created and update its status to Approved.
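The approval can also be done programmatically. The following Python sketch wraps the SageMaker `update_model_package` call, which accepts a model package ARN and an approval status. The function name, the fake client, and the sample ARN are hypothetical and are included only so the example runs without AWS access.

```python
def approve_model_version(sagemaker_client, model_package_arn: str) -> None:
    """Mark a registered model version as Approved in the Model Registry."""
    sagemaker_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )


class FakeSageMakerClient:
    """Offline stand-in for boto3.client('sagemaker') that records calls."""

    def __init__(self):
        self.calls = []

    def update_model_package(self, **kwargs):
        self.calls.append(kwargs)
        return {"ModelPackageArn": kwargs["ModelPackageArn"]}


fake_sagemaker = FakeSageMakerClient()
approve_model_version(
    fake_sagemaker,
    "arn:aws:sagemaker:us-east-1:111122223333:model-package/product-recommendation/1",
)
```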
Now that you have approved the registered model, the SageMaker Salesforce project deploy step will provision and trigger AWS CodePipeline.
CodePipeline has steps to build and deploy a SageMaker endpoint for inference containing the SageMaker Data Wrangler preprocessing steps and the trained model. The endpoint will be exposed to Salesforce Data Cloud as an API through API Gateway. The following screenshot shows the pipeline prefixed with Sagemaker-salesforce-product-recommendation-xxxxx. We also show you the endpoints and API that get created by the SageMaker project for Salesforce.
If you want, you can take a look at the CodePipeline deploy step, which uses AWS CloudFormation scripts to create the SageMaker endpoint and API Gateway with a custom JWT authorizer.
When pipeline deployment is complete, you can find the SageMaker endpoint on the SageMaker console.
You can explore the API Gateway created by the project template on the API Gateway console.
Choose the link to find the API Gateway URL.
You can find the details of the JWT authorizer by choosing Authorizers on the API Gateway console. You can also go to the AWS Lambda console to review the code of the Lambda function created by the project template.
To discover the schema to be used while invoking the API from Einstein Studio, choose Information in the navigation pane of the Model Registry. You will see an Amazon Simple Storage Service (Amazon S3) link to a metadata file. Copy and paste the link into a new browser tab URL.
Let’s look at the file without downloading it. On the file details page, choose the Object actions menu and choose Query with S3 Select.
Choose Run SQL query and take note of the API Gateway URL and schema because you will need this information when registering with Einstein Studio. If you don’t see an APIGWURL key, either the model wasn’t approved, deployment is still in progress, or deployment failed.
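If you download the metadata file instead of querying it in place, the same check can be scripted. The sketch below assumes the file is a flat JSON object with a top-level `APIGWURL` key; the layout, the `schema` key name, and the sample content are assumptions for illustration and may not match the actual file.

```python
import json
from typing import Optional


def extract_api_details(metadata_text: str) -> Optional[dict]:
    """Return the API URL and schema, or None when APIGWURL is absent."""
    metadata = json.loads(metadata_text)
    if "APIGWURL" not in metadata:
        # Model not approved, deployment still running, or deployment failed.
        return None
    return {"url": metadata["APIGWURL"], "schema": metadata.get("schema")}


# Hypothetical file content, for illustration only.
sample_metadata = json.dumps({
    "APIGWURL": "https://example-api-id.execute-api.us-east-1.amazonaws.com/prod",
    "schema": {"input": ["clicks__c"], "output": ["product_purchased__c"]},
})
details = extract_api_details(sample_metadata)
pending = extract_api_details(json.dumps({"status": "in progress"}))
```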
Use the Salesforce Einstein Studio API for predictions
Salesforce Einstein Studio is a new and centralized experience in Salesforce Data Cloud that data science and engineering teams can use to easily access their traditional models and LLMs used in generative AI. Next, we set up the API URL and client_id that you set in Secrets Manager earlier in Salesforce Einstein Studio to register and use the model inferences in Salesforce Einstein Studio. For instructions, refer to Bring Your Own AI Models to Salesforce with Einstein Studio.
Clean up
To delete all the resources created by the SageMaker project, on the project page, choose the Action menu and choose Delete.
To delete the resources (API Gateway and SageMaker endpoint) created by CodePipeline, navigate to the AWS CloudFormation console and delete the stack that was created.
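The stack deletion can also be scripted with the CloudFormation `delete_stack` call. In the following Python sketch, the function name, the fake client, and the stack name are hypothetical; look up the actual stack name on the CloudFormation console before deleting anything.

```python
def delete_deployment_stack(cloudformation_client, stack_name: str) -> None:
    """Request deletion of the stack that holds the endpoint and API resources."""
    cloudformation_client.delete_stack(StackName=stack_name)


class FakeCloudFormationClient:
    """Offline stand-in for boto3.client('cloudformation') that records calls."""

    def __init__(self):
        self.deleted = []

    def delete_stack(self, StackName):
        self.deleted.append(StackName)


fake_cloudformation = FakeCloudFormationClient()
# Placeholder stack name, for illustration only.
delete_deployment_stack(fake_cloudformation, "example-salesforce-deploy-stack")
```

Stack deletion is asynchronous, so the resources may remain visible on the console for a short time after the call returns.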
Conclusion
In this post, we explained how you can build and train ML models in SageMaker Studio using SageMaker Data Wrangler to import and prepare data that is hosted on the Salesforce Data Cloud. We used the newly launched Salesforce Data Cloud JDBC connector in SageMaker Data Wrangler and the first-party Salesforce template in the SageMaker provided project template for Salesforce Data Cloud integration. The SageMaker project template for Salesforce enables you to deploy the model, create the endpoint, and secure an API for a registered model. You then use the API to make predictions in Salesforce Einstein Studio for your business use cases.
Although we used the example of product recommendation to showcase the steps for implementing the end-to-end integration, you can use the SageMaker project template for Salesforce to create an endpoint and API for any SageMaker traditional model and LLM that is registered in the SageMaker Model Registry. We look forward to seeing what you build in SageMaker using data from Salesforce Data Cloud and how you empower your Salesforce applications using SageMaker hosted ML models!
This post is a continuation of the series regarding Salesforce Data Cloud and SageMaker integration. For a high-level overview and to learn more about the business impact you can make with this integration approach, refer to Part 1.
About the authors
Daryl Martis is the Director of Product for Einstein Studio at Salesforce Data Cloud. He has over 10 years of experience in planning, building, launching, and managing world-class solutions for enterprise customers including AI/ML and cloud solutions. He has previously worked in the financial services industry in New York City. Follow him on https://www.linkedin.com/in/darylmartis.
Rachna Chadha is a Principal Solutions Architect AI/ML in Strategic Accounts at AWS. Rachna is an optimist who believes that ethical and responsible use of AI can improve society in the future and bring economic and social prosperity. In her spare time, Rachna likes spending time with her family, hiking, and listening to music.
Ife Stewart is a Principal Solutions Architect in the Strategic ISV segment at AWS. She has been engaged with Salesforce Data Cloud over the last 2 years to help build integrated customer experiences across Salesforce and AWS. Ife has over 10 years of experience in technology. She is an advocate for diversity and inclusion in the technology field.
Dharmendra Kumar Rai (DK Rai) is a Sr. Data Architect, Data Lake & AI/ML, serving strategic customers. He works closely with customers to understand how AWS can help them solve problems, especially in the AI/ML and analytics space. DK has many years of experience in building data-intensive solutions across a range of industry verticals, including high-tech, FinTech, insurance, and consumer-facing applications.
Marc Karp is an ML Architect with the SageMaker Service team. He focuses on helping customers design, deploy, and manage ML workloads at scale. In his spare time, he enjoys traveling and exploring new places.