Projects

NetShare Web Interface for a Machine Learning IP Header Trace Generator

Built a REST web interface with React (Next.js), Tailwind CSS, Django, and SQLite to make the machine-learning-based header trace generator accessible and user-friendly.

Implemented JWT for user authorization. Dockerized the web application for fast deployment.

Upgraded the machine learning backend to communicate with the web interface via gRPC and to run multiple training tasks simultaneously using Python multiprocessing.

Budget Desktop Application Development

Developed an Electron desktop application for expense and income tracking, with a React frontend in TypeScript and a Spring Boot backend built with Maven.

Created an extensible backend framework that supports data plugins for storing data in CSV files, MySQL, and MongoDB, and exposes REST APIs that frontend display plugins use to visualize the data as charts with EChartJS.

Wrote unit tests with test doubles using Jest, JUnit, and Mockito, and automated testing with GitHub continuous integration.

Incorporated Proxy and Template design patterns along with inheritance and polymorphism into software construction.

Relational Database Management System Development

Built an RDBMS in C++17 with CMake and GTest, leveraging modern C++ techniques including smart pointers, move semantics, RAII, and OOP.

Developed a buffer pool manager that caches physical disk pages in memory using the LRU-K replacement policy.
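
For illustration, a minimal Python sketch of LRU-K victim selection follows. The project itself is written in C++; the class and method names here are hypothetical, and details such as pinning and thread safety are left out.

```python
from collections import defaultdict, deque


class LRUKReplacer:
    """Tracks the last k access times per frame and picks eviction victims."""

    def __init__(self, k):
        self.k = k
        self.clock = 0                      # logical timestamp
        self.history = defaultdict(deque)   # frame_id -> up to k most recent access times

    def record_access(self, frame_id):
        self.clock += 1
        hist = self.history[frame_id]
        hist.append(self.clock)
        if len(hist) > self.k:
            hist.popleft()                  # keep only the k most recent accesses

    def evict(self):
        """Evict the frame with the largest backward k-distance.

        Frames with fewer than k recorded accesses are treated as having
        infinite distance and are evicted first, earliest first access first.
        """
        if not self.history:
            return None

        def priority(item):
            _, hist = item
            has_k = len(hist) >= self.k
            # hist[0] is the k-th most recent access when has_k is True,
            # otherwise it is the frame's earliest access.
            return (has_k, hist[0])

        victim, _ = min(self.history.items(), key=priority)
        del self.history[victim]
        return victim
```

With k = 2, a frame touched only once is evicted before frames touched twice, which protects pages with a demonstrated reuse pattern from one-off scans.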

Created a database index using an extendible hash table that dynamically grows and shrinks to ensure efficient data retrieval.
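
The growth side of extendible hashing can be sketched in Python as below. This is a simplified illustration, not the project's C++ code: names are hypothetical, shrinking (bucket merging and directory halving) is omitted, and it assumes keys do not all share the same hash value.

```python
class Bucket:
    def __init__(self, local_depth, capacity):
        self.local_depth = local_depth
        self.capacity = capacity
        self.items = {}


class ExtendibleHashTable:
    def __init__(self, bucket_capacity=2):
        self.global_depth = 0
        self.bucket_capacity = bucket_capacity
        self.directory = [Bucket(0, bucket_capacity)]

    def _index(self, key):
        # The low global_depth bits of the hash select the directory slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)             # full bucket: split and retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; slot i + 2^d initially aliases slot i.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, bucket.capacity)
        high_bit = 1 << (bucket.local_depth - 1)
        # Redirect the slots whose newly significant bit is set.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Move the entries that now belong to the sibling.
        for k in list(bucket.items):
            if hash(k) & high_bit:
                sibling.items[k] = bucket.items.pop(k)
```

Only the overflowing bucket is split, so the cost of growth stays local; the directory doubles only when that bucket's local depth already equals the global depth.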

Engineered a query execution engine based on the Volcano processing model to optimize query plans and execute SQL queries.
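
As a rough illustration of the Volcano (iterator) model, each operator below exposes open/next/close and pulls one tuple at a time from its child. The operator classes and the sample rows are hypothetical and written in Python rather than the project's C++.

```python
class SeqScan:
    """Leaf operator: yields rows from an in-memory table one at a time."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None                     # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Filter:
    """Passes through only the child rows that satisfy the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        row = self.child.next()
        while row is not None and not self.predicate(row):
            row = self.child.next()
        return row

    def close(self):
        self.child.close()


class Projection:
    """Keeps only the requested columns of each child row."""

    def __init__(self, child, columns):
        self.child = child
        self.columns = columns

    def open(self):
        self.child.open()

    def next(self):
        row = self.child.next()
        if row is None:
            return None
        return {c: row[c] for c in self.columns}

    def close(self):
        self.child.close()


def execute(plan):
    """Drive the root operator until it is exhausted."""
    plan.open()
    out = []
    row = plan.next()
    while row is not None:
        out.append(row)
        row = plan.next()
    plan.close()
    return out
```

A plan such as Projection(Filter(SeqScan(rows), predicate), columns) corresponds to a simple SELECT ... WHERE query; because every operator shares the same interface, operators compose freely.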

Implemented multi-version optimistic concurrency control (MVOCC) to achieve snapshot isolation for transaction management.

AFS-style Distributed Caching File System

Engineered a TCP-based RPC protocol adhering to idempotent, at-most-once semantics.
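
One common way to obtain at-most-once behavior is to tag each request with a client ID and request ID and have the server replay the cached reply for duplicates instead of re-executing. The Python sketch below shows only that idea; the names are hypothetical, the TCP transport is omitted, and the cache is never pruned.

```python
class AtMostOnceServer:
    """Executes each (client_id, request_id) at most once; duplicates get the cached reply."""

    def __init__(self, handlers):
        self.handlers = handlers            # method name -> callable
        self.completed = {}                 # (client_id, request_id) -> reply

    def handle(self, client_id, request_id, method, *args):
        key = (client_id, request_id)
        if key in self.completed:
            # Retransmission of a request that already ran: do not run it again.
            return self.completed[key]
        reply = self.handlers[method](*args)
        self.completed[key] = reply
        return reply
```

A client that times out and resends the same request ID therefore observes the original result, so retries are safe even for operations that are not naturally idempotent.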

Developed an RPC stub generator in C to serialize and deserialize parameters.

Built a distributed file system in AFS-1 style, featuring file freshness verification upon opening and a "last close after write wins" strategy for data consistency.

Developed a multithreaded Java server that supports concurrent file updates and restoring old versions of files.

Constructed a client-side proxy with a pluggable LRU cache by interposing on standard file system calls through a custom shared library.

CloudFS: Hybrid File System on Local SSD and AWS S3 Storage

Engineered a hybrid file system on the FUSE (Filesystem in Userspace) framework that integrates local SSD and AWS S3 cloud storage. Optimized storage efficiency by placing small files on the local SSD and large files in the cloud.

Built a file segmentation and deduplication component based on the Rabin fingerprinting algorithm, significantly reducing cloud storage usage.
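
The sketch below illustrates content-defined segmentation plus deduplication in Python. It is a stand-in, not the project's code: it uses a simple polynomial rolling hash rather than true Rabin fingerprinting over GF(2), and the window size, chunk limits, and all names are assumed values chosen for a small demo.

```python
import hashlib

WINDOW = 16                 # bytes covered by the rolling hash (assumed)
BASE = 257
MOD = (1 << 31) - 1
MASK = (1 << 6) - 1         # boundary when the low 6 bits of the hash are zero
MIN_CHUNK = 32              # must be >= WINDOW so boundary checks see a full window
MAX_CHUNK = 256


def split_chunks(data):
    """Cut data where the rolling hash of the last WINDOW bytes matches MASK."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)        # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class DedupStore:
    """Stores each distinct segment once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.segments = {}

    def put(self, data):
        recipe = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            self.segments.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe                       # ordered digests that rebuild the file

    def get(self, recipe):
        return b"".join(self.segments[d] for d in recipe)
```

Because boundaries depend on content rather than fixed offsets, an insertion near the start of a file tends to change only nearby segments, and the unchanged segments are stored once and shared by reference.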

Implemented a snapshot feature for the entire file system, achieving minimal cloud storage consumption through the use of references to file segments.

Improved performance by adding an LRU cache of file segments on the local file system, reducing network transfers.

Provided failure resilience with a write-ahead log that captures system state, allowing consistent recovery across file system remounts and reboots.

Advanced AWS Auto Scaling

Orchestrated the auto-scaling of a machine learning application on AWS, optimizing resource allocation to handle peak demand while minimizing costs, using Terraform for infrastructure management. Implemented three distinct scaling methods:

Auto Scaling Group with Load Balancer and CloudWatch Integration: Configured an AWS Auto Scaling Group behind a load balancer to scale dynamically on CPU utilization metrics from CloudWatch. Improved resilience by fine-tuning health checks so that unexpected EC2 instance crashes are handled robustly.

Lambda-Driven Scaling with AWS Step Functions: Developed a Python-based Lambda function to define scaling logic for Auto Scaling Group, triggered periodically by a finite state machine in AWS Step Functions.

Lambda and Docker Container: Dockerized the machine learning application and pushed the image to AWS ECR, then scaled automatically through Lambda container invocations. This was the most cost-effective and responsive of the three methods: it reacts quickly to workload changes, and pay-per-use pricing significantly reduces expenses.

Scalable Distributed Machine Learning on AWS with Apache Spark

Orchestrated an Apache Spark cluster of 17 AWS c7g.xlarge spot instances to run a binary classification machine learning job using sparse logistic regression with an L2-regularized gradient descent algorithm.
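
For reference, a single-machine Python sketch of L2-regularized logistic regression trained by batch gradient descent on sparse features is shown below. It is not the distributed Spark job; the function names, learning rate, and regularization strength are illustrative assumptions.

```python
import math


def sigmoid(z):
    # Numerically stable for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(examples, num_features, lr=0.1, l2=0.01, iterations=100):
    """examples: list of (features, label), features is {index: value}, label is 0 or 1."""
    w = [0.0] * num_features
    n = len(examples)
    for _ in range(iterations):
        grad = {}                           # sparse gradient of the mean log loss
        for features, label in examples:
            z = sum(w[j] * x for j, x in features.items())
            err = sigmoid(z) - label
            for j, x in features.items():
                grad[j] = grad.get(j, 0.0) + err * x / n
        for j in range(num_features):
            # Gradient step on the log loss plus the L2 penalty (l2 / 2) * w[j]^2.
            w[j] -= lr * (grad.get(j, 0.0) + l2 * w[j])
    return w


def predict(w, features):
    return sigmoid(sum(w[j] * x for j, x in features.items()))
```

Only the nonzero features of each example contribute to the data gradient, which is what makes a very large feature space tractable when individual examples are sparse.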

Extracted, transformed, and loaded (ETL) a corpus of the first 400 WET files (70 GB) from the December 2016 Crawl Archive, creating a refined dataset for training a Latent Dirichlet Allocation (LDA) model and computing corpus statistics in just 19 minutes.

Trained the machine learning model with 882,774,562 features for 2 iterations in 33 minutes, leveraging an inverted index and a partition-based join to optimize performance and memory usage on c7g.xlarge instances with 8 GB of memory.

Page-mapping Log-structured SSD Flash Translation Layer

Designed and implemented a flash translation layer that maps contiguous logical addresses to physical pages in SSDs.

Used a log-structured mapping strategy within only 214 KB of memory to enhance SSD endurance and improve write performance. Integrated four garbage collection policies: FIFO, LRU, greedy, and cost-benefit.

Reduced write amplification and further enhanced SSD endurance by moving from block-level to page-level mapping and adopting a greedy garbage collection policy.
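
To make two of the garbage collection policies concrete, here is a hedged Python sketch of victim selection. The Block fields are hypothetical, and the cost-benefit score uses the ratio from log-structured file systems, (1 - u) * age / (1 + u), which is an assumption about the exact formula.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    valid_pages: int
    pages_per_block: int
    last_write_time: int


def _candidates(blocks):
    # A block whose pages are all valid frees no space when cleaned.
    return [b for b in blocks if b.valid_pages < b.pages_per_block]


def pick_greedy(blocks):
    """Greedy: clean the block with the fewest valid pages to copy out."""
    return min(_candidates(blocks), key=lambda b: b.valid_pages, default=None)


def pick_cost_benefit(blocks, now):
    """Cost-benefit: weigh reclaimed space and block age against copy cost."""

    def score(b):
        u = b.valid_pages / b.pages_per_block   # utilization in [0, 1)
        age = now - b.last_write_time
        return (1 - u) * age / (1 + u)

    return max(_candidates(blocks), key=score, default=None)
```

Greedy minimizes the pages copied during each cleaning, which directly lowers write amplification, while cost-benefit additionally favors older blocks whose remaining valid data is less likely to be overwritten soon.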

High-performance Dynamic Memory Allocator

Developed custom implementations of malloc, calloc, realloc, and free. Achieved an average throughput of 10,804 Kops/sec and a memory utilization of 74.2%.

Implemented segregated free lists, using doubly linked lists to categorize and manage free memory blocks by size, which speeds up allocation and reduces fragmentation.

Improved memory utilization by 15% by introducing 16-byte mini blocks and eliminating footers in allocated blocks.

Applied a better-fit search algorithm to the segregated free lists, striking an effective balance between throughput and memory utilization.

Multiprocessing Linux Shell

Developed a Linux shell in C that executes and controls both foreground and background processes using multiprocessing and the execve system call.

Integrated I/O redirection, enabling the shell to redirect input and output streams from and to files.
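
The usual POSIX pattern behind such redirection is fork, then dup2 onto the standard descriptors in the child, then execve. The sketch below shows that pattern through Python's os module rather than the shell's C code; run_redirected is a hypothetical helper, and it handles only a single foreground command.

```python
import os
import sys


def run_redirected(argv, stdout_path=None, stdin_path=None):
    """Run argv[0] in a child process with optional file redirection; return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: rewire descriptors 0 and 1, then replace the process image.
        try:
            if stdin_path is not None:
                fd = os.open(stdin_path, os.O_RDONLY)
                os.dup2(fd, 0)
                os.close(fd)
            if stdout_path is not None:
                flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
                fd = os.open(stdout_path, flags, 0o644)
                os.dup2(fd, 1)
                os.close(fd)
            os.execve(argv[0], argv, dict(os.environ))
        except OSError:
            pass
        os._exit(127)                       # reached only if open or exec failed
    # Parent: wait for the foreground child to finish.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

For example, run_redirected([sys.executable, "-c", "print('hello')"], stdout_path="out.txt") writes the child's output to out.txt. The redirection is set up after fork and before execve, so the parent's own descriptors are left untouched.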

Managed child processes and ensured robust error handling using Linux signal handlers.