In this article, we are going to create a machine learning model on Amazon SageMaker, using Amazon SageMaker Studio and Autopilot, to predict house prices.
1. Upload dataset to Amazon S3
First, we need to upload the house pricing dataset. You can download the training and testing datasets here.
From the search bar in the AWS Management Console, search for S3.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Create a bucket and upload the training dataset.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
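The same upload can also be scripted. The following is a minimal sketch, assuming boto3 is installed, AWS credentials are configured, and the bucket already exists. The function names, bucket name, and file name are illustrative and not part of the original walkthrough.

```python
from pathlib import Path
from typing import Optional


def build_s3_uri(bucket: str, key: str) -> str:
    """Return the s3:// URI that later steps can use to reference the object."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def upload_dataset(local_path: str, bucket: str, key: Optional[str] = None) -> str:
    """Upload a local CSV file to an existing bucket and return its S3 URI."""
    # Imported lazily so build_s3_uri works even where boto3 is not installed.
    import boto3

    # Default to the file name when no object key is given.
    key = key or Path(local_path).name
    boto3.client("s3").upload_file(local_path, bucket, key)
    return build_s3_uri(bucket, key)
```

For example, `upload_dataset("train.csv", "my-house-prices-bucket")` would upload the file under the key `train.csv`, where both names are placeholders.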
2. Set up a SageMaker domain
Now, from the search bar, search for SageMaker Studio.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
SageMaker Studio requires a SageMaker domain, so we need to create one. Click on Create a SageMaker domain.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Choose Quick setup and type a domain name.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Further down on the same page, specify a profile name and create an execution role.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Select the role you just created and click on Submit.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Once the domain and profile are created, you can launch SageMaker Studio.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
3. Import the dataset on SageMaker and add some preprocessing steps
From SageMaker Studio, click on Data Wrangler from the left sidebar.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Click on Import data.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
From Data sources, choose Amazon S3.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Select the bucket where you uploaded the dataset.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Select the dataset and then click on Import.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
In the Data types * Transform section, we can add some steps for preprocessing. Click on + Add step.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
In the search bar, type "remove col" and click on the first result.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Select Drop column in Transform and Id in Columns to drop. Then, click on Add.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
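To make the effect of this step concrete, here is a small local sketch of what dropping the Id column does to the data. It uses only the Python standard library and is an illustration of the operation, not what Data Wrangler runs internally. The function name and the sample values are hypothetical.

```python
import csv
import io


def drop_column(csv_text: str, column: str) -> str:
    """Return csv_text without the named column (the header row is required)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Raises ValueError if the column is not present in the header.
    index = rows[0].index(column)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow(row[:index] + row[index + 1:])
    return out.getvalue()
```

Dropping Id makes sense because it is only a row identifier and carries no information about the price.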
Click on Export and train.
In the Training tab, click on Export and train.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
4. Create an Autopilot experiment
Type an experiment name, toggle on Auto split data, and click on Next.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Select SalePrice as Target and the rest of the columns as Features. Then, click on Next.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Choose Auto and click on Next.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Type the endpoint name, select Regression as the problem type, and choose MSE as the objective metric. Then, click on Next.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Check if the configuration details are correct and click on Create experiment.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
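For readers who prefer the API over the wizard, the configuration above maps roughly onto the `create_auto_ml_job` call in boto3. This is a sketch under assumptions: the job name, S3 URIs, and role ARN are placeholders, the Studio wizard may use a different API version internally, and this call does not reproduce the wizard's auto split or automatic deployment options.

```python
def build_automl_request(job_name, train_s3_uri, output_s3_uri, role_arn,
                         max_candidates=100):
    """Build the keyword arguments for sagemaker.create_auto_ml_job."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [{
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": train_s3_uri}
            },
            # Target column chosen in the wizard.
            "TargetAttributeName": "SalePrice",
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ProblemType": "Regression",
        "AutoMLJobObjective": {"MetricName": "MSE"},
        # Assumption: cap the number of trials; 100 is only an illustrative default.
        "AutoMLJobConfig": {"CompletionCriteria": {"MaxCandidates": max_candidates}},
        "RoleArn": role_arn,
    }


def start_experiment(request):
    """Submit the job and return its ARN (requires boto3 and AWS credentials)."""
    import boto3  # lazy import keeps build_automl_request usable without boto3

    return boto3.client("sagemaker").create_auto_ml_job(**request)["AutoMLJobArn"]
```

Keeping the request builder separate from the call makes it easy to review the configuration before anything is submitted.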
Now, wait until the experiment finishes. The best trial appears at the top of the list. In this case, the experiment ran 100 trials.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
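The waiting step can also be done programmatically. The sketch below polls `describe_auto_ml_job` until the job reaches a terminal status and then reads the best candidate from the response. The function names and the polling interval are assumptions, and the job name is whatever you chose when creating the experiment.

```python
import time


def best_candidate(description):
    """Extract (name, metric, value) from a describe_auto_ml_job response.

    Returns None while the job has not completed.
    """
    if description.get("AutoMLJobStatus") != "Completed":
        return None
    best = description["BestCandidate"]
    metric = best["FinalAutoMLJobObjectiveMetric"]
    return best["CandidateName"], metric["MetricName"], metric["Value"]


def wait_for_experiment(job_name, poll_seconds=60):
    """Poll the Autopilot job until it finishes, then return its best candidate."""
    import boto3  # lazy import; requires AWS credentials at call time

    client = boto3.client("sagemaker")
    while True:
        description = client.describe_auto_ml_job(AutoMLJobName=job_name)
        status = description["AutoMLJobStatus"]
        if status in ("Completed", "Failed", "Stopped"):
            break
        time.sleep(poll_seconds)
    if status != "Completed":
        raise RuntimeError(f"Autopilot job ended with status {status}")
    return best_candidate(description)
```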
In the Job profile tab, you can see how long the experiment took to finish. In this case, it took 46 minutes.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
You can see the model details for the best trial. In the Trials tab, click on View model details. This page includes a chart of feature importance.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
5. Deploy ML model
Go back to the Trials tab, check the best model from the table, and click on Deploy model.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Choose Make real-time predictions and type the endpoint name. Then, click on Deploy.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Now, wait until the endpoint is deployed. In this case, it took 19 minutes to finish the deployment.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
6. Invoke deployed ML model
In the AWS Settings tab, you can copy the URL of the deployed endpoint.
In order to invoke the URL, we need the Access Key and Secret Key. So, we can create a user and assign the AmazonSageMakerFullAccess permission policy. To do that, follow these steps:
- Go to IAM Management Console
- Users
- Click on Add Users
- Type the username
- In the Set permissions step, choose Attach policies directly and search for AmazonSageMakerFullAccess in the permissions policies list (a permissions boundary only limits the maximum permissions and does not grant any)
- Review the details for the user and click on Create user
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
Once the user is created, click on it in the users list. Then, go to the Security credentials tab and create an access key. Copy the access key and secret key.
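The console steps for creating the user and its access key correspond to three IAM API calls. This is a hedged sketch: the function name is hypothetical, the client is expected to be created with `boto3.client("iam")` by a principal allowed to manage IAM users, and the username is a placeholder. AmazonSageMakerFullAccess is broader than invoking one endpoint requires, so a narrower policy may be preferable outside a lab.

```python
# ARN of the AWS managed policy named in this walkthrough.
SAGEMAKER_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"


def create_invoker_user(iam_client, username):
    """Create a user, attach the SageMaker policy, and return its key pair.

    iam_client is expected to behave like boto3.client("iam").
    """
    iam_client.create_user(UserName=username)
    iam_client.attach_user_policy(
        UserName=username, PolicyArn=SAGEMAKER_FULL_ACCESS_ARN
    )
    key = iam_client.create_access_key(UserName=username)["AccessKey"]
    # The secret is only returned once, so store it safely right away.
    return key["AccessKeyId"], key["SecretAccessKey"]
```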
We're going to invoke the URL from Postman, but you can do it from other similar software as well.
Select POST as the HTTP method. In the Authorization tab, select AWS Signature as type. Then, paste the access key and secret key. Also, type sagemaker as Service Name.
In the Headers tab, change the Content-Type to text/csv.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
In the Body tab, select raw and Text, then paste any record from the testing dataset. Don't forget to remove the Id and SalePrice values from the record. The request is now ready to send.
![Deploy a Regression model through AWS SageMaker studio using Autopilot]()
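The same request can be sent from Python instead of Postman. The following sketch assumes boto3 is installed and that the access key and secret key are available to it, for example through the standard AWS environment variables. boto3 then signs the request itself. The function names and the sample record are illustrative.

```python
import csv
import io


def build_payload(record, exclude=("Id", "SalePrice")):
    """Serialize one record as a single header-less CSV line.

    Columns listed in exclude are skipped, matching the manual step of
    removing the Id and SalePrice values before sending the record.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([value for name, value in record.items() if name not in exclude])
    # Drop the trailing line terminator added by the writer.
    return out.getvalue()[:-1]


def predict(endpoint_name, payload):
    """Send a text/csv payload to the endpoint and return the raw response text."""
    import boto3  # lazy import; requires AWS credentials at call time

    response = boto3.client("sagemaker-runtime").invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    return response["Body"].read().decode("utf-8")
```

The column order in the payload has to match the order of the feature columns used for training, so build the record from a row of the testing dataset rather than by hand.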
NOTE: After this lab, if you want to delete the created endpoint, visit this page.
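As an alternative to the console, the endpoint can be removed with a few API calls. This sketch goes one step beyond the note above: besides the endpoint, it also deletes the endpoint configuration and the models behind it, which is an assumption about what you want cleaned up. The function name is hypothetical, and the client is expected to come from `boto3.client("sagemaker")`.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint, its configuration, and the models it serves.

    Returns the names of the deleted resources in deletion order.
    """
    # Look up the related resources before deleting anything.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return [endpoint_name, config_name, *model_names]
```

A real-time endpoint is billed while it is running, so deleting it after the lab avoids ongoing charges.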
Thanks for reading
Thank you very much for reading. I hope you found this article interesting and that it will be useful in the future. If you have any questions or ideas you would like to discuss, it will be a pleasure to collaborate and exchange knowledge.