E-commerce Platform - Image Service GCP to AWS

What is Image Service:

It's an asynchronous image resizing service. When you view a product image on the e-commerce platform, you request it through a URL such as https://image-service.com/image/1?size=100x200. The service looks for the image with ID 1 at size 100x200. If it can't find an existing one, it sends out a job to resize the image in the background.
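
The lookup-then-enqueue flow described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the in-memory `resized_images` dictionary, the `resize_jobs` queue, and the `get_image` function are all hypothetical stand-ins, and returning `None` on a miss is an assumption (the real service may respond differently while the resize job runs).

```python
from collections import deque

# Hypothetical in-memory stand-ins for the image store and the job queue.
resized_images = {("1", "100x200"): b"...resized bytes..."}
resize_jobs = deque()


def get_image(image_id, size):
    """Return the resized image if it exists; otherwise enqueue a resize job."""
    key = (image_id, size)
    image = resized_images.get(key)
    if image is not None:
        return image
    # Not found: hand the work to a background worker and return right away.
    # Returning None here is an assumption made for this sketch.
    resize_jobs.append(key)
    return None
```

Calling `get_image("1", "100x200")` returns the stored bytes, while `get_image("2", "50x50")` returns `None` and leaves one job on the queue for a background worker to pick up.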

Why we did this:

  • Reduce the data transfer fees between the two cloud providers.
  • Focus on AWS, since the team mainly uses AWS.
  • Replace manual certificate renewal with AWS Certificate Manager.

In this project, I contributed the following:

  • Migrated the architecture from GCP to AWS (GKE to EKS). Modified the RoR application to obtain SQS permissions through AWS Web Identity and interact with SQS from EKS.

  • Migrated the GCP Cloud Function to AWS Lambda. This included using a Lambda Layer to store the image processing library and implementing a Lambda function to process images.

  • Worked around Lambda's disk space limitation. Image Service uses disk to store temporary artifacts, but /tmp is limited to 250 MB. Using EFS would have been costly, so we used Terraform to deploy multiple Lambda functions, because /tmp is only shared within the same Lambda function.

  • Used k8s-cloudwatch-adapter to scale our service out and in, and later replaced it with KEDA. The autoscaling mechanism handles a workload averaging 50k rpm, with peaks of 80k rpm.

  • Optimized the CloudFront cache rules. Initially, our application relied on the User-Agent header to handle requests, which greatly reduced cache effectiveness. I changed the application to read the required data from the query string instead of the header. As a result, the cache hit rate increased from 50% to 90%.

  • Optimized memory usage. I observed the application and found that it had a memory fragmentation issue, so I changed the memory allocator to jemalloc. This reduced memory usage by 80%.
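
To illustrate why forwarding the User-Agent header hurts caching, here is a simplified model of a CDN cache key. It is a sketch under stated assumptions, not CloudFront's actual algorithm: the `cache_key` and `hit_rate` functions and the sample requests are hypothetical, and the only behavior assumed is that any forwarded header becomes part of the cache key.

```python
from urllib.parse import urlsplit


def cache_key(url, headers, forwarded_headers):
    """Build a cache key from the path, the query string, and forwarded headers."""
    parts = urlsplit(url)
    forwarded = tuple(
        (name, headers.get(name, "")) for name in sorted(forwarded_headers)
    )
    return (parts.path, parts.query, forwarded)


def hit_rate(requests, forwarded_headers):
    """Replay requests against an empty cache and return the fraction of hits."""
    seen = set()
    hits = 0
    for url, headers in requests:
        key = cache_key(url, headers, forwarded_headers)
        if key in seen:
            hits += 1
        else:
            seen.add(key)  # first request for this key is a miss
    return hits / len(requests)


# Hypothetical traffic: the same image requested by different browsers.
URL = "https://image-service.com/image/1?size=100x200"
requests = [
    (URL, {"User-Agent": "Chrome/120"}),
    (URL, {"User-Agent": "Firefox/121"}),
    (URL, {"User-Agent": "Safari/17"}),
    (URL, {"User-Agent": "Chrome/120"}),
]

with_header = hit_rate(requests, ["User-Agent"])  # 0.25
without_header = hit_rate(requests, [])           # 0.75
```

In this toy model, every distinct User-Agent value creates its own cache entry for the same image, so only the repeated Chrome request is a hit. Once the header is out of the key and the needed data travels in the query string, every request after the first is served from cache.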

Architecture

[Figure: Image Service architecture, GCP to AWS]

Result

After we disabled CloudFront header forwarding, the cache started to work.

After we changed the Memory Allocator, memory usage decreased by 80%.