Prerequisites
Active AWS Account
Active Email Address
Introduction
One of the many tasks I have had to complete as a DevOps engineer is monitoring the URLs of apps in production and sending email alerts to stakeholders whenever the URLs go down. In this article, I will share how I achieved this using AWS Lambda Functions, EventBridge, and SNS.
When given a task, I usually evaluate the tools available and consider the best possible solution. Sometimes, I brainstorm with fellow DevOps engineers to refine ideas.
For this task, I considered two approaches:
Combine both monitoring and alerting in a single Python script.
Decouple monitoring and alerting, delegating the alerting functionality to a separate AWS service.
I chose the second option because it avoided the need to send emails directly from my script. Using my email address for this purpose would appear unprofessional, and stakeholders might ignore emails from an unfamiliar source. Instead, leveraging an AWS SNS topic with an email subscription ensured that alerts would be perceived as official and trustworthy.
The solution involved writing a Lambda function that runs every 5 minutes. The function sends HTTP requests to the specified URLs. If a URL returns a status code other than 200 after multiple attempts, a message is published to the SNS topic, which triggers an email to the subscribed stakeholders.
Step-by-Step Implementation
Set Up the SNS Topic
Create a standard SNS topic in the AWS Console.
Add one or more email subscriptions to the topic, specifying the stakeholders' email addresses.
Copy the ARN of the SNS topic for later use.
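The console steps above can also be scripted. The following is a minimal sketch using boto3's SNS client, not the method used in this article; the helper name `create_alert_topic`, the topic name, and the email address are all hypothetical. The client is passed in as an argument so the helper can be exercised with a stub.

```python
def create_alert_topic(sns_client, topic_name, emails):
    """Create a standard SNS topic and subscribe each email address to it."""
    # create_topic returns the existing ARN if you already own a topic with this name
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    for email in emails:
        # Each recipient must click the confirmation link that SNS emails to them
        sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn


# Example usage (requires AWS credentials; the names below are hypothetical):
#   import boto3
#   arn = create_alert_topic(
#       boto3.client("sns"), "url-health-alerts", ["stakeholder@example.com"]
#   )
#   print(arn)  # keep this ARN for the Lambda function
```

Subscriptions stay in a pending state until each recipient confirms, so no alerts are delivered to an address before its confirmation link is clicked.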
Set Up the Lambda Function
Create a new Lambda function in the AWS Console with basic settings.
Prepare your Python code and dependencies:
The main code file must be named lambda_function.py.
Install any required dependencies locally before packaging the function.
Zip the lambda_function.py file along with its dependencies and upload it to the Lambda function.
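As an illustration of the packaging step, here is a small sketch that builds the archive with Python's standard `zipfile` module. It assumes the dependencies were installed into a hypothetical `package` directory (for example with `pip install requests --target package`) and that lambda_function.py was copied into the same directory. The helper name `build_lambda_zip` is made up for this example.

```python
import zipfile
from pathlib import Path


def build_lambda_zip(source_dir, zip_path):
    """Zip everything under source_dir so that files sit at the archive root."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir: Lambda looks for
                # lambda_function.py at the top level of the archive
                archive.write(path, path.relative_to(source_dir).as_posix())
    return zip_path


# Example usage (write the zip outside source_dir so it does not include itself):
#   build_lambda_zip("package", "function.zip")
```

The important detail is that lambda_function.py ends up at the root of the archive rather than inside a nested folder, otherwise the default handler lambda_function.lambda_handler cannot be found.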
Main Lambda Function Code
Below is the core Python code for the Lambda function:

```python
import json

import boto3
import requests

# URLs to monitor
IP_ADDRESSES = [
    "",
    "",
    "",
]

client = boto3.client('sns')


def publish_to_sns(message):
    # Publish the alert to the SNS topic (TopicArn is the ARN copied earlier)
    client.publish(
        TopicArn='',
        Message=message,
        Subject='API Server Health Check!!!'
    )


def lambda_handler(event, context):
    for IP in IP_ADDRESSES:
        print(f'this is the endpoint {IP}')
        api_handler(f"{IP}")
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }


def api_handler(url):
    status_code = 0
    print('api handler called')
    try:
        # Send GET request to the URL
        response = requests.get(url, timeout=40)
        print(f"GET request to {url} returned {response.text}")

        # Get the response status code
        status_code = response.status_code

        # Alert if the response status is anything other than 200
        if response.status_code != 200:
            message = (
                f"⚠ {url} returned a status: {response.status_code} UNHEALTHY\n\n"
                "NB: URL is DOWN!!\n"
            )
            publish_to_sns(message)
    except requests.exceptions.RequestException as e:
        # Connection errors and timeouts also count as a failed health check
        message = f"❗Failed to reach {url}: {str(e)}"
        publish_to_sns(message)

    return {
        'statusCode': 200,
        'endpoint': url,
        'body': json.dumps('Health check complete'),
        'Application status': 'Healthy' if status_code == 200 else 'Unhealthy',
    }
```
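The introduction mentions alerting only after multiple attempts, while the function above issues a single request per URL. One possible way to add retries is sketched below. The helper name `check_with_retries`, the `get_status` callable, and the default attempt count and delay are assumptions for illustration, not part of the original function.

```python
import time


def check_with_retries(get_status, url, attempts=3, delay_seconds=2):
    """Return None if any attempt yields 200, else a description of the last failure."""
    last_failure = None
    for attempt in range(1, attempts + 1):
        try:
            status = get_status(url)
            if status == 200:
                return None  # healthy, no alert needed
            last_failure = f"status {status}"
        except Exception as exc:  # e.g. requests.exceptions.RequestException
            last_failure = f"error: {exc}"
        if attempt < attempts:
            time.sleep(delay_seconds)  # brief pause before the next attempt
    return last_failure


# Example usage inside the Lambda function (hypothetical wiring):
#   failure = check_with_retries(
#       lambda u: requests.get(u, timeout=10).status_code, url
#   )
#   if failure is not None:
#       publish an alert for this URL
```

With retries in place, the worst-case run time is roughly the number of URLs multiplied by the attempts and the per-request timeout, so that product should stay below the Lambda timeout.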
Adjust Lambda Timeout
Increase the Lambda function timeout to 1 minute. The default timeout of 3 seconds is insufficient for making multiple requests to URLs, especially if any of them are slow to respond.
Add a Trigger
Configure an AWS EventBridge trigger for the Lambda function.
Use a scheduled expression to run the function every 5 minutes. For example:
- Schedule Expression: cron(0/5 * * * ? *)
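For readers who prefer scripting over the console, the timeout and trigger settings described above could be applied with boto3 roughly as follows. This is a sketch under assumptions: the helper name `schedule_health_check`, the rule name, and the target Id are hypothetical, and the clients are passed in so the function can be tried against stubs first.

```python
SCHEDULE = "cron(0/5 * * * ? *)"  # every 5 minutes, as in the article


def schedule_health_check(events_client, lambda_client, function_name, function_arn,
                          rule_name="url-health-check-every-5-min", timeout_seconds=60):
    """Set the function timeout and attach an EventBridge schedule to it."""
    # Raise the timeout from the 3-second default
    lambda_client.update_function_configuration(
        FunctionName=function_name, Timeout=timeout_seconds
    )
    # Create (or update) the scheduled rule
    rule_arn = events_client.put_rule(
        Name=rule_name, ScheduleExpression=SCHEDULE, State="ENABLED"
    )["RuleArn"]
    # Allow EventBridge to invoke the function; the console does this for you,
    # and re-running with the same StatementId raises a conflict error
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # Point the rule at the function
    events_client.put_targets(
        Rule=rule_name, Targets=[{"Id": "health-check-lambda", "Arn": function_arn}]
    )
    return rule_arn


# Example usage (requires AWS credentials; the function name and ARN are placeholders):
#   import boto3
#   schedule_health_check(
#       boto3.client("events"), boto3.client("lambda"),
#       "my-health-check", "<function ARN>",
#   )
```

An equivalent schedule can be written as rate(5 minutes) if the cron syntax feels harder to read.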
Test and Deploy
Test the function with sample URLs that are known to be down.
Verify that stakeholders receive email alerts for any URLs that fail the health check.
Replace the sample URLs with the actual production URLs and deploy the function.
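Before pointing the function at real endpoints, the alert path can be rehearsed locally. The sketch below is not the deployed code: it swaps `requests` for the standard library's `urllib`, starts a throwaway local server that always answers 503, and collects alerts in a list instead of publishing to SNS. All names in it (`AlwaysDown`, `status_of`, `run_check`, `demo`) are hypothetical.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysDown(BaseHTTPRequestHandler):
    """Stand-in for an unhealthy endpoint: every GET returns 503."""

    def do_GET(self):
        self.send_response(503)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the console quiet


def status_of(url):
    """Return the HTTP status code for url, bypassing any configured proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; the code is still the status


def run_check(url, publish):
    """Call publish(message) when the URL does not answer with 200."""
    status = status_of(url)
    if status != 200:
        publish(f"{url} returned a status: {status} UNHEALTHY")
    return status


def demo():
    """Run one check against the local 503 server and return (status, alerts)."""
    server = HTTPServer(("127.0.0.1", 0), AlwaysDown)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    alerts = []
    try:
        status = run_check(f"http://127.0.0.1:{server.server_port}/", alerts.append)
    finally:
        server.shutdown()
        server.server_close()
    return status, alerts
```

Calling `demo()` should yield a 503 status and exactly one collected alert, which mirrors what stakeholders would receive by email once the real SNS topic is wired in.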
Conclusion
In this article, I’ve outlined how I successfully implemented URL monitoring using AWS Lambda and email alerting with AWS SNS. This approach is scalable, professional, and leverages AWS’s powerful serverless ecosystem.
By following the steps described, you can implement a similar solution for monitoring and alerting in dynamic environments. I hope you found this guide helpful and insightful. Thank you for reading!