Building Machine Learning Models with AWS SageMaker


🚀 Software Geek | DevOps Engineer 🛠️ Hi, I'm Sahil Patil, a passionate DevOps wizard dedicated to transforming code into cash by building scalable, high-performing, and reliable systems. With a knack for solving complex problems, I thrive on turning chaos into cloud-based efficiency through the seamless integration of DevOps practices and cloud solutions. My toolkit includes Kubernetes 🐳, Docker 🐋, and Terraform ⚙️, which I use to design robust, secure, and efficient infrastructure. Linux 🐧 is my playground, where I excel in troubleshooting and optimizing environments. AWS ☁️ serves as my canvas for crafting innovative cloud architectures. 🏆 Achievements: 🎓 Awarded with Prime Minister Scholarship with All India Rank 2032. 💼 Selected for an internship at LRDE DRDO, Bengaluru. 🏅 Received Gaurav Puraskar from Defence Welfare, India. 📜 Received KSB Scholarships from Kendriya Sainik Board, New Delhi. 🌱 What Drives Me: I'm committed to continuous learning and staying ahead in the ever-evolving tech landscape. I actively participate in DevOps and cloud community meetups 🤝 to network with industry experts and exchange insights, helping me refine my skills and broaden my perspective. Let's connect and collaborate to build something remarkable! 🚀

AWS SageMaker makes it easy to build, train, and deploy machine learning (ML) models in the cloud. It provides everything needed to develop ML models, from data preparation to deployment. Let's go through the process step by step. 🚀


Step 1: Setting Up SageMaker 💻

First, log in to your AWS Management Console and navigate to Amazon SageMaker. You'll find various options like notebooks, training jobs, and model deployments.

To get started, you can use SageMaker Studio (a web-based IDE) or SageMaker Notebook Instances (managed Jupyter notebooks).

  1. Go to SageMaker in AWS.

  2. Click "Notebook Instances" and create a new one.

  3. Choose an instance type (e.g., ml.t2.medium for beginners).

  4. Attach an IAM role with permissions for S3 and SageMaker.

  5. Start the instance and open Jupyter Notebook. 🎉
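
The console steps above can also be scripted. The sketch below is one possible way to do it with boto3's SageMaker client. The function names, the instance name, and the role ARN are hypothetical placeholders, and the client is passed in as a parameter so the helpers can be tried without AWS credentials.

```python
def notebook_instance_request(name, role_arn, instance_type="ml.t2.medium", volume_gb=5):
    """Build the keyword arguments for the create_notebook_instance API call."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,  # small instance type, as suggested above
        "RoleArn": role_arn,            # IAM role with S3 and SageMaker permissions
        "VolumeSizeInGB": volume_gb,
    }


def create_notebook_instance(sm_client, name, role_arn):
    """Create the notebook instance and wait until it is in service."""
    sm_client.create_notebook_instance(**notebook_instance_request(name, role_arn))
    waiter = sm_client.get_waiter("notebook_instance_in_service")
    waiter.wait(NotebookInstanceName=name)


# Example usage (assumes AWS credentials and a real role ARN):
#   import boto3
#   create_notebook_instance(
#       boto3.client("sagemaker"),
#       "my-first-notebook",
#       "arn:aws:iam::123456789012:role/YourSageMakerRole",
#   )
```

Once the waiter returns, the instance can be opened from the SageMaker console just like one created by hand.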


Step 2: Preparing the Data 📊

Good ML models need quality data. Amazon SageMaker integrates with Amazon S3, where you can store datasets.

  1. Upload data to S3:

    • Go to the S3 console.

    • Create a bucket and upload your CSV or JSON dataset.

  2. Load data in a notebook:

import boto3
import pandas as pd

s3_bucket = "your-bucket-name"
file_key = "your-dataset.csv"

s3_client = boto3.client('s3')
obj = s3_client.get_object(Bucket=s3_bucket, Key=file_key)
df = pd.read_csv(obj['Body'])
df.head()

  3. Preprocess the data:

    • Handle missing values

    • Normalize numerical features

    • Convert categorical data into numerical form
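
As a rough illustration of those three preprocessing steps, here is a minimal pandas sketch. The function names and the column names in the usage example are hypothetical, and median filling, min-max scaling, and one-hot encoding are just one reasonable choice for each step. The label is moved to the first column and the CSV is written without a header because that is the layout the built-in XGBoost algorithm expects for CSV training data.

```python
import pandas as pd


def preprocess(df, label_col):
    """Return an all-numeric copy of df with the label as the first column."""
    features = df.drop(columns=[label_col])
    numeric_cols = [c for c in features.columns
                    if pd.api.types.is_numeric_dtype(features[c])]
    categorical_cols = [c for c in features.columns if c not in numeric_cols]

    for col in numeric_cols:
        # Handle missing values with the column median
        column = features[col].fillna(features[col].median())
        # Normalize to the [0, 1] range (min-max scaling)
        span = column.max() - column.min()
        features[col] = (column - column.min()) / span if span else 0.0

    for col in categorical_cols:
        # Treat missing categories as their own "missing" category
        features[col] = features[col].fillna("missing")
    if categorical_cols:
        # Convert categorical columns into one-hot indicator columns
        features = pd.get_dummies(features, columns=categorical_cols, dtype=int)

    return pd.concat([df[label_col], features], axis=1)


def to_xgboost_csv(df):
    """Serialize without header or index (label first, features after)."""
    return df.to_csv(header=False, index=False)


# Example usage with a hypothetical "price" label column:
#   clean = preprocess(df, "price")
#   clean.to_csv("train.csv", header=False, index=False)
```

The resulting file can then be uploaded to the S3 bucket and used as the training input in the later steps.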


Step 3: Selecting an ML Algorithm 🤖

AWS SageMaker provides built-in algorithms like XGBoost, Random Cut Forest, and DeepAR. You can also bring your own models using TensorFlow, PyTorch, or Scikit-learn.

For example, to use the XGBoost algorithm:

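import sagemaker
from sagemaker.estimator import Estimator

sagemaker_session = sagemaker.Session()
# IAM role that SageMaker assumes for training and hosting. Inside a SageMaker
# notebook, sagemaker.get_execution_role() returns the attached role.
role = "your-iam-role"

# Look up the built-in XGBoost container image for the session's region
container = sagemaker.image_uris.retrieve("xgboost", sagemaker_session.boto_region_name, "1.5-1")

xgb = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    sagemaker_session=sagemaker_session
)

# num_round is required by the built-in XGBoost algorithm; these values are examples
xgb.set_hyperparameters(objective="reg:squarederror", num_round=100)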

Step 4: Training the Model 🏋️‍♂️

Once the data is preprocessed and the algorithm is selected, we can start training the model. SageMaker provisions the training instances, runs the job on them, and shuts them down when the job finishes.

To train the model:

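from sagemaker.inputs import TrainingInput
xgb.fit({"train": TrainingInput("s3://your-bucket/train-data/", content_type="text/csv")})  # declare CSV input explicitly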

This launches a training job on a SageMaker ML instance. When training completes, the model artifact (model.tar.gz) is stored in Amazon S3.
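
To confirm where the model ended up, the training job can be inspected through boto3's SageMaker client. This is a small sketch, not the only way to do it: the function name is hypothetical, the job name is whatever appears in the SageMaker console or in the fit() log output, and the client is passed in so the helper can be exercised without AWS access.

```python
def training_job_summary(sm_client, job_name):
    """Return the status, model artifact location, and billed time of a job."""
    job = sm_client.describe_training_job(TrainingJobName=job_name)
    return {
        "status": job["TrainingJobStatus"],
        # The artifact path is only present once the job has produced a model
        "model_artifact": job.get("ModelArtifacts", {}).get("S3ModelArtifacts"),
        "billable_seconds": job.get("BillableTimeInSeconds"),
    }


# Example usage (assumes AWS credentials and an existing training job):
#   import boto3
#   print(training_job_summary(boto3.client("sagemaker"), "your-training-job-name"))
```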


Step 5: Deploying the Model 🚀

Once the model is trained, you can deploy it as a real-time endpoint using SageMaker's hosting services.

predictor = xgb.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large"
)

This creates an API endpoint, allowing applications to send requests and get predictions.


Step 6: Making Predictions 📈

After deployment, you can send test data to the model and get predictions.

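from sagemaker.serializers import CSVSerializer

# The built-in XGBoost container expects CSV input, so send the rows as CSV
predictor.serializer = CSVSerializer()

test_data = [[5.1, 3.5, 1.4, 0.2]]  # Example input: one row with four features
response = predictor.predict(test_data)
print(response)  # Raw bytes containing the predicted value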

This will return the predicted result based on the trained model.
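
Applications outside the notebook usually do not have the predictor object and call the endpoint through the SageMaker runtime API instead. The sketch below shows one way this might look with boto3. The function names are hypothetical, and the response parsing assumes the model returns one number per input row, which will not hold for every objective.

```python
def rows_to_csv(rows):
    """Serialize feature rows as CSV text, one row per line."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def predict_from_app(runtime_client, endpoint_name, rows):
    """Send rows to the endpoint and return the predictions as floats."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=rows_to_csv(rows),
    )
    body = response["Body"].read().decode("utf-8")
    # Predictions may come back separated by commas or newlines
    return [float(value) for value in body.replace("\n", ",").split(",") if value.strip()]


# Example usage (assumes AWS credentials and a deployed endpoint):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   print(predict_from_app(runtime, "your-endpoint-name", [[5.1, 3.5, 1.4, 0.2]]))
```

Passing the client in as an argument keeps the helper easy to test with a stub and lets the application reuse a single client across requests.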


Step 7: Monitoring and Scaling 📊

AWS SageMaker offers monitoring tools to track performance, logs, and errors using Amazon CloudWatch.
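
For example, endpoint traffic can be read from CloudWatch with boto3. The sketch below sums the Invocations metric from the AWS/SageMaker namespace over a recent time window. The function names are hypothetical, and "AllTraffic" is assumed as the variant name, which is the default the SageMaker Python SDK uses when deploying.

```python
from datetime import datetime, timedelta, timezone


def invocation_metrics_query(endpoint_name, minutes=60, variant_name="AllTraffic", now=None):
    """Build the get_metric_statistics arguments for endpoint invocations."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocations",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 300,  # five-minute buckets
        "Statistics": ["Sum"],
    }


def total_invocations(cloudwatch_client, endpoint_name, minutes=60):
    """Sum the invocation counts reported over the last `minutes` minutes."""
    query = invocation_metrics_query(endpoint_name, minutes)
    response = cloudwatch_client.get_metric_statistics(**query)
    return sum(point["Sum"] for point in response["Datapoints"])


# Example usage (assumes AWS credentials and a deployed endpoint):
#   import boto3
#   print(total_invocations(boto3.client("cloudwatch"), "your-endpoint-name"))
```

Swapping the metric name for ModelLatency or Invocation5XXErrors gives a similar view of latency and server-side errors.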

You can also scale the endpoint by increasing instance counts for high traffic.

predictor.update_endpoint(initial_instance_count=2, instance_type="ml.m5.large")  # instance_type must be passed along with the count
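
Instead of changing the instance count by hand, the endpoint can be registered with Application Auto Scaling so that the count follows traffic. The following is a sketch under assumptions: the function names, the policy name, the capacity limits, and the target of 100 invocations per instance are illustrative values, not recommendations, and "AllTraffic" is assumed as the variant name.

```python
def scaling_requests(endpoint_name, min_instances=1, max_instances=4,
                     invocations_per_instance=100.0, variant_name="AllTraffic"):
    """Build the arguments for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = {**common, "MinCapacity": min_instances, "MaxCapacity": max_instances}
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }
    return target, policy


def enable_autoscaling(autoscaling_client, endpoint_name, **options):
    """Register the endpoint variant and attach a target-tracking policy."""
    target, policy = scaling_requests(endpoint_name, **options)
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)


# Example usage (assumes AWS credentials and a deployed endpoint):
#   import boto3
#   enable_autoscaling(boto3.client("application-autoscaling"), "your-endpoint-name")
```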

For cost savings, you can use serverless inference or batch transform instead of real-time endpoints.


Step 8: Cleaning Up 🧹

Once done, delete unnecessary resources to avoid costs.

predictor.delete_endpoint()

Also, stop any running notebook instances to save money.
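
If the notebook kernel has been restarted and the predictor object is gone, the same cleanup can be done by name through boto3. This is a hedged sketch: the function name and the resource names in the usage example are hypothetical, and it assumes the models behind the endpoint are not shared with other endpoints.

```python
def clean_up(sm_client, endpoint_name, notebook_names=()):
    """Delete an endpoint with its config and models, then stop notebooks."""
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)

    # The endpoint is the billed resource; the config and models are leftovers
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for variant in config["ProductionVariants"]:
        sm_client.delete_model(ModelName=variant["ModelName"])

    stopped = []
    for name in notebook_names:
        notebook = sm_client.describe_notebook_instance(NotebookInstanceName=name)
        # Only instances that are currently running can be stopped
        if notebook["NotebookInstanceStatus"] == "InService":
            sm_client.stop_notebook_instance(NotebookInstanceName=name)
            stopped.append(name)
    return stopped


# Example usage (assumes AWS credentials and existing resources):
#   import boto3
#   clean_up(boto3.client("sagemaker"), "your-endpoint-name", ["my-first-notebook"])
```

The function returns the names of the notebook instances it actually stopped, which makes it easy to log what was shut down.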


Summary ✨

AWS SageMaker simplifies the entire ML workflow:

✅ Data preparation (S3 storage, preprocessing)
✅ Model training (built-in algorithms, custom models)
✅ Model deployment (real-time API, batch inference)
✅ Monitoring & scaling (CloudWatch, auto-scaling)

With SageMaker, you can build powerful ML models without managing infrastructure manually. 🚀🔥
