
E-commerce Platform - Image Service GCP to AWS
What is Image Service:
It's an asynchronous image-resizing service. When you request a product image on the e-commerce platform, you use a URL such as https://image-service.com/image/1?size=100x200. The service looks for the image with ID 1 at size 100x200. If it cannot find an existing one, it dispatches a background job to resize the image.
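The lookup-or-enqueue flow described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the in-memory store and queue, the names get_image and run_worker, and the placeholder "resize" step are all assumptions. What the real service returns on a miss is not stated here, so the sketch simply returns None.

```python
from collections import deque

# Hypothetical in-memory stand-ins for the image store and the job queue.
resized_images = {}    # (image_id, size) -> image bytes
resize_jobs = deque()  # pending (image_id, size) jobs


def get_image(image_id, size):
    """Return the resized image if it exists; otherwise enqueue a resize job."""
    key = (image_id, size)
    image = resized_images.get(key)
    if image is None:
        # Avoid queueing the same resize twice while a job is still pending.
        if key not in resize_jobs:
            resize_jobs.append(key)
        return None
    return image


def run_worker():
    """Drain the queue; a real worker would resize the original image here."""
    while resize_jobs:
        image_id, size = resize_jobs.popleft()
        # Placeholder payload standing in for the resized image bytes.
        resized_images[(image_id, size)] = f"image-{image_id}-{size}".encode()
```

In this sketch, the first call to get_image(1, "100x200") returns None and queues a job; after run_worker() has processed the queue, the same call returns the stored bytes.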
Why we did this:
- Reduce data transfer fees between the two cloud providers.
- Focus on AWS, since the team mainly uses AWS.
- Replace manual certificate renewal with AWS Certificate Manager.
In this project, I contributed the following:
- Moved the architecture from GCP to AWS (GKE to EKS). Modified the RoR application to use AWS Web Identity to obtain SQS permissions and interact with SQS from EKS.
- Moved the GCP Cloud Function to AWS Lambda. This included using a Lambda Layer to store the image processing library and implementing a Lambda function to process images.
- Worked around Lambda's disk space limitation. Image Service uses disk to store temporary artifacts, but /tmp has a 250 MB limit, and using EFS would have been costly. Therefore, we used Terraform to deploy multiple Lambda functions, because the /tmp disk is only shared within the same Lambda function.
- Used k8s-cloudwatch-adapter to scale our service out and in, and later replaced it with KEDA. The autoscaling mechanism can handle a workload averaging 50k rpm and peaking at 80k rpm.
- Optimized the CloudFront cache rule. Initially, our application relied on the User-Agent header to handle requests, which greatly hurt caching. I changed the application to read the required data from the query string instead of the header, which increased the cache hit rate from 50% to 90%.
- Optimized memory usage. I observed the application and found that it had a memory fragmentation issue, so I changed the memory allocator to jemalloc. This reduced memory usage by 80%.
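To give a feel for the autoscaling item above: KEDA drives the Kubernetes Horizontal Pod Autoscaler, whose documented rule is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies that rule to the 50k and 80k rpm figures. The per-pod target of 2,000 rpm and the replica counts are hypothetical, since the actual scaling metric and thresholds are not stated here, and the HPA's tolerance band and stabilization window are ignored.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """Kubernetes HPA rule: ceil(currentReplicas * current / target)."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical per-pod target; the real threshold is not given in the write-up.
TARGET_RPM_PER_POD = 2000

# 25 pods serving the 80k rpm peak see 3,200 rpm each, so the rule scales out.
scale_out = desired_replicas(25, 80_000 / 25, TARGET_RPM_PER_POD)

# 40 pods serving the 50k rpm average see 1,250 rpm each, so the rule scales in.
scale_in = desired_replicas(40, 50_000 / 40, TARGET_RPM_PER_POD)
```

Under these assumed numbers, the service would settle at 40 pods during the peak and 25 pods at average load.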
Architecture

Result
After we disabled CloudFront header forwarding, the cache started to work.
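A simplified model shows why forwarding the User-Agent header hurts caching: every forwarded header becomes part of the cache key, so the same image URL is cached separately for each distinct User-Agent value. The code below is an illustrative sketch, not CloudFront's actual algorithm; the functions cache_key and hit_rate and the sample requests are hypothetical.

```python
from urllib.parse import urlsplit


def cache_key(url, headers, forwarded_headers=()):
    """Build a simplified cache key from the path, query string, and forwarded headers."""
    parts = urlsplit(url)
    header_part = tuple(
        (name.lower(), headers.get(name, "")) for name in sorted(forwarded_headers)
    )
    return (parts.path, parts.query, header_part)


def hit_rate(requests, forwarded_headers=()):
    """Replay requests against an unbounded cache and return the fraction of hits."""
    seen = set()
    hits = 0
    for url, headers in requests:
        key = cache_key(url, headers, forwarded_headers)
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(requests)


# Hypothetical traffic: one image URL requested by three different clients.
URL = "https://image-service.com/image/1?size=100x200"
requests = [
    (URL, {"User-Agent": "browser-a"}),
    (URL, {"User-Agent": "browser-b"}),
    (URL, {"User-Agent": "browser-c"}),
    (URL, {"User-Agent": "browser-a"}),
]

with_user_agent = hit_rate(requests, forwarded_headers=("User-Agent",))  # 0.25
without_headers = hit_rate(requests)                                     # 0.75
```

With the header in the key, only the repeated browser-a request hits the cache; once the key depends on the path and query string alone, every request after the first is a hit.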

After we changed the Memory Allocator, memory usage decreased by 80%.

