
E-commerce Platform - Image Service GCP to AWS
What is Image Service:
It's an asynchronous image resize service. When you view a product image on the e-commerce platform, you request it with a URL such as https://image-service.com/image/1?size=100x200. The service tries to find the image with id 1 and size 100x200. If it can't find an existing one, it sends out a job to resize the image in the background.
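The lookup-then-enqueue flow described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the in-memory dictionary and queue stand in for the real image store and job queue, the names `get_image` and `run_worker` are hypothetical, and returning `None` on a miss is an assumption, since the text does not say what the service responds with while the resize is pending.

```python
from collections import deque

# Hypothetical in-memory stand-ins for the real image store and job queue.
resized_images = {}    # (image_id, size) -> image bytes
resize_jobs = deque()  # pending (image_id, size) resize jobs


def get_image(image_id, size):
    """Return the resized image if it exists; otherwise enqueue a resize job."""
    key = (image_id, size)
    if key in resized_images:
        return resized_images[key]
    if key not in resize_jobs:  # avoid queuing the same job twice
        resize_jobs.append(key)
    return None  # assumption: the caller decides what to serve on a miss


def run_worker():
    """Drain the queue; the 'resize' is a placeholder, not real image processing."""
    while resize_jobs:
        image_id, size = resize_jobs.popleft()
        resized_images[(image_id, size)] = f"image-{image_id}@{size}".encode()


first = get_image(1, "100x200")   # miss: a job is queued in the background
run_worker()                      # background worker produces the resized image
second = get_image(1, "100x200")  # hit: the resized image is returned
```

The first request pays the cost of a miss, and every later request for the same id and size is served from the store.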
Why we did this:
- Reduce the data transfer fee between two cloud providers.
- Focus on AWS, since the team mainly uses AWS.
- Replace manual certificate renewal with AWS Certificate Manager.
In this project, I contributed the following:
- Moved the architecture from GCP to AWS (GKE to EKS). Modified the RoR application to use AWS Web Identity to get SQS permissions and interact with SQS from EKS.
- Moved the GCP Cloud Function to AWS Lambda. This included using a Lambda Layer to store the image processing library and implementing the Lambda function that processes images.
- Worked around Lambda's disk space limitation. Image Service uses disk to store temporary artifacts, but /tmp has a 250 MB limit, and using EFS would have been costly. Because the /tmp disk is shared only within the same Lambda function, we used Terraform to deploy multiple Lambda functions.
- Used k8s-cloudwatch-adapter to scale our service out and in. I later replaced k8s-cloudwatch-adapter with KEDA. The autoscaling mechanism can handle a workload averaging 50k requests per minute (rpm) and peaking at 80k rpm.
- Optimized the CloudFront cache rule. In the beginning, our application relied on the User-Agent header to handle requests. Because the header had to be forwarded to the origin, it greatly reduced cache effectiveness. I changed our application to get the required data from the query string instead of the header. As a result, the cache hit rate increased from 50% to 90%.
- Optimized memory usage. I observed the application and found that it had a memory fragmentation issue, so I changed our memory allocator to jemalloc. By doing so, we reduced memory usage by 80%.
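To give a feel for the autoscaling mentioned in the list above, here is a minimal sketch of the target-value arithmetic that metric-driven scalers roughly follow: run enough replicas that each one handles at most a target share of the load. The function name `desired_replicas`, the per-pod capacity of 2,000 rpm, and the replica bounds are illustrative assumptions. The actual metric and thresholds used in the project are not stated here.

```python
import math


def desired_replicas(current_rpm, rpm_per_pod, min_replicas=1, max_replicas=100):
    """Target-value scaling: enough pods that each handles at most rpm_per_pod."""
    if rpm_per_pod <= 0:
        raise ValueError("rpm_per_pod must be positive")
    wanted = math.ceil(current_rpm / rpm_per_pod)
    # Clamp to the configured bounds, as autoscalers typically do.
    return max(min_replicas, min(max_replicas, wanted))


# rpm_per_pod=2_000 is an assumed capacity, not a measured value.
average = desired_replicas(50_000, 2_000)  # 25 pods at the average load
peak = desired_replicas(80_000, 2_000)     # 40 pods at the peak load
print(average, peak)
```

Under these assumed numbers the service would move between 25 and 40 pods as traffic rises from the average to the peak.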
Architecture
Result
After we disabled CloudFront header forwarding, the cache started to work.
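Why header forwarding hurts caching can be shown with a simplified model. When a header is forwarded to the origin, CloudFront treats its value as part of the cache key, so the same image URL is cached separately for every distinct User-Agent. The sketch below is only an illustration of that effect: `cache_key` is a hypothetical helper, not a CloudFront API, and the User-Agent strings are made up.

```python
from urllib.parse import urlsplit


def cache_key(url, headers, forwarded_headers=()):
    """Build a simplified cache key: path, query string, and forwarded headers."""
    parts = urlsplit(url)
    header_part = tuple((name, headers.get(name, "")) for name in forwarded_headers)
    return (parts.path, parts.query, header_part)


# Three clients request the same image; only their User-Agent differs.
sample_requests = [
    ("https://image-service.com/image/1?size=100x200", {"User-Agent": "Chrome/120"}),
    ("https://image-service.com/image/1?size=100x200", {"User-Agent": "Safari/17"}),
    ("https://image-service.com/image/1?size=100x200", {"User-Agent": "Firefox/121"}),
]

with_header = {cache_key(u, h, ("User-Agent",)) for u, h in sample_requests}
without_header = {cache_key(u, h) for u, h in sample_requests}
print(len(with_header), len(without_header))  # 3 1
```

With the header in the key, three requests produce three cache entries and three origin fetches. With only the path and query string, they share one entry.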
After we changed the Memory Allocator, memory usage decreased by 80%.