Interests
My interests lie in Machine Learning & AI, Natural Language Processing, Multimodal AI, and scalable, production-level machine learning systems.
Projects and Research
-
DeepFake Detection using Explainable AI
Implemented a deepfake detection model based on XceptionNet with explainable-AI techniques (LIME and Grad-CAM). Processed the FaceForensics++ and Celeb-DF datasets by extracting frames from the videos. Focused on interpretability by visualizing the model's decision-making.
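The interpretability idea can be illustrated with a simplified occlusion-sensitivity sketch, a crude stand-in for LIME/Grad-CAM rather than the project's actual pipeline; the `predict` function, image size, and patch size below are all toy assumptions:

```python
import numpy as np

def occlusion_map(image, predict, patch=8):
    """Gray out each patch and record how much the model's 'fake'
    score drops -- a LIME-like perturbation explanation (sketch)."""
    h, w = image.shape
    base = predict(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            occluded = image.copy()
            occluded[i*patch:(i+1)*patch, j*patch:(j+1)*patch] = 0.5
            heat[i, j] = base - predict(occluded)
    return heat

# toy stand-in classifier: the 'fake' evidence lives in the top-left quadrant
predict = lambda img: float(img[:16, :16].mean())
img = np.full((32, 32), 0.9)
heat = occlusion_map(img, predict)   # high values mark influential regions
```

Regions whose occlusion changes the score the most are the ones the model relies on, which is the intuition behind the saliency visualizations.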
-
LOLgorithm
Humor is a fascinating and puzzling area of study in teaching computers to understand human language. The aim of this project was to understand how syntactic, semantic, and contextual embeddings affect model performance, and whether combining them yields better predictions.
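The feature-combination idea can be sketched as concatenating the different views of a sentence before classification; the extractors below are toy statistics standing in for the project's learned embeddings:

```python
import numpy as np

def syntactic_feats(tokens):
    # surface/syntax-level statistics: sentence length, mean token length
    return np.array([len(tokens), sum(len(t) for t in tokens) / len(tokens)])

def semantic_feats(tokens):
    # toy "embedding": character-code statistics averaged over tokens
    vals = [np.mean([ord(c) for c in t]) for t in tokens]
    return np.array([np.mean(vals), np.std(vals)])

def combined_feats(tokens):
    # concatenation lets a downstream classifier weigh both views at once
    return np.concatenate([syntactic_feats(tokens), semantic_feats(tokens)])

x = combined_feats("why did the chicken cross the road".split())
```

A classifier trained on `x` can then exploit whichever feature family carries the humor signal, which is what the ablation in the project compares.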
-
News Classification with LLMs and Prompt Engineering
News classification is important in information retrieval and media analysis.
Dataset: AG News (AG's News Corpus). It contains the 4 largest classes from the original corpus. Each class has 30,000 training samples and 1,900 test samples, for 120,000 training samples and 7,600 test samples in total. The classes are: world, sports, business, and science/technology.
Prompting strategies:
Baseline (zero-shot prompting): Large Language Models (LLMs) are typically trained on extensive datasets, allowing them to work effectively with direct inputs and outputs. When prompted directly, an LLM can often generate a reasonably accurate response in a single attempt. In simple terms, the model is presented with a straightforward question without any surrounding context or background information and is expected to provide an answer.
(1) Few-shot prompting: a small number of examples ("shots") is also provided in the prompt. This guides the model's decisions on new data and is useful when annotated data for the task is limited.
(2) Chain-of-thought (CoT) prompting: enables complex reasoning through intermediate reasoning steps. Instead of asking isolated questions, prompts build upon previous responses, forming a logical and connected chain of thought that maintains a coherent conversation.
(3) Zero-shot CoT (Kojima et al., 2022): appends the phrase "Let's think step by step" to the original prompt. This directive encourages the model to break complex problems into manageable components, facilitating more coherent and reasoned responses.
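The three prompting strategies amount to different ways of assembling the same query string; a minimal sketch, where the label set comes from AG News but the exact wording is an assumption:

```python
LABELS = ["world", "sports", "business", "sci/tech"]

def zero_shot(headline):
    # baseline: the bare question, no examples or context
    return (f"Classify the headline into one of {LABELS}.\n"
            f"Headline: {headline}\nLabel:")

def few_shot(headline, examples):
    # a handful of labelled shots precede the query
    shots = "\n".join(f"Headline: {h}\nLabel: {l}" for h, l in examples)
    return (f"Classify the headline into one of {LABELS}.\n"
            f"{shots}\nHeadline: {headline}\nLabel:")

def zero_shot_cot(headline):
    # the Kojima et al. trigger phrase appended to the baseline prompt
    return zero_shot(headline) + " Let's think step by step."

p = zero_shot_cot("Stocks rally as earnings beat forecasts")
```

The resulting strings are what gets sent to the LLM; only the surrounding scaffolding changes between strategies.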
-
EEG-based Activity Recognition
Electroencephalograms (EEGs) record the spontaneous electrical activity of the brain. They can be used to augment human sensory functions or to control robotic devices. To perform these functions, a Brain-Computer Interface (BCI) must classify EEG patterns as corresponding to a certain task and relay that information to the device of interest. This project focuses on BCI Competition III Dataset V, where the goal is to classify three mental tasks online: imagination of repetitive self-paced left-hand movements (left, class 2), imagination of repetitive self-paced right-hand movements (right, class 3), and generation of words beginning with the same random letter (word, class 7). The data were collected from 32 electrodes at a sampling rate of 512 Hz. Modeling techniques used: SVM, kNN, Hidden Markov Models, LSTM, BiLSTM.
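A typical first step before feature extraction or sequence modeling is slicing the 512 Hz multichannel signal into overlapping windows; a minimal sketch, where the 1 s window and 0.5 s hop are assumptions, not the competition's protocol:

```python
import numpy as np

def windows(signal, fs=512, win_s=1.0, hop_s=0.5):
    """Slice a (samples, channels) EEG array into overlapping windows,
    each of which becomes one classification instance."""
    win, hop = int(fs * win_s), int(fs * hop_s)
    return np.stack([signal[i:i + win]
                     for i in range(0, len(signal) - win + 1, hop)])

eeg = np.zeros((512 * 4, 32))   # 4 s of toy data from 32 electrodes
w = windows(eeg)                # (n_windows, samples_per_window, channels)
```

Each window can then be summarized (e.g. band-power features for SVM/kNN) or fed as a sequence to the LSTM/BiLSTM models.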
-
3D Reconstruction with Single View Metrology
This project was part of the coursework for ECE 558 - Digital Imaging Systems at NC State and implements the paper "Single View Metrology" (Criminisi, Reid and Zisserman, ICCV 1999) to convert a 2D image into a 3D model. The project is divided into 4 parts: (1) acquire a 3D perspective image; (2) calculate the vanishing points associated with the objects in the image; (3) compute the projection and homography matrices; (4) obtain a texture map for each plane using warping. Finally, the 3D object is visualized with Blender.
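The vanishing-point step can be sketched with homogeneous coordinates: two image lines that are parallel in the scene intersect at the vanishing point, and both the line and the intersection are cross products. The example points below are made up for illustration:

```python
import numpy as np

def line_through(p, q):
    # homogeneous line through two image points
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(l1, l2):
    # intersection of two homogeneous lines; for images of parallel
    # scene edges, this intersection is the vanishing point
    v = np.cross(l1, l2)
    return v[:2] / v[2]

# two image lines converging at (100, 50)
l1 = line_through((0, 0), (100, 50))
l2 = line_through((0, 100), (100, 50))
vp = vanishing_point(l1, l2)
```

In practice more than two edges per direction are used and the intersections are averaged or solved by least squares to reduce noise.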
-
Blob Detection
Blob detection is a major component of image processing and computer vision. In this project, a blob detector was implemented from scratch following the standard steps: generated a Laplacian-of-Gaussian filter, built a Laplacian scale space via convolution, performed non-maximum suppression, and drew circles around the maxima.
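The first two steps, building the LoG filter and computing its response, can be sketched as follows (constant factors are omitted and the naive convolution is for clarity, not speed; the disc image and sigma are toy choices):

```python
import numpy as np

def log_kernel(sigma):
    """Discrete Laplacian-of-Gaussian filter, shifted to zero mean
    so that flat image regions give zero response."""
    size = int(2 * np.ceil(3 * sigma) + 1)
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    r2 = x**2 + y**2
    g = np.exp(-r2 / (2 * sigma**2))
    k = (r2 / sigma**2 - 2) * g      # LoG shape, constants dropped
    return k - k.mean()

def conv2(img, k):
    """Naive 'same' 2-D convolution with zero padding (sketch, not fast)."""
    kh, kw = k.shape
    pad = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.empty(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (pad[i:i + kh, j:j + kw] * k).sum()
    return out

# toy image: a bright disc (blob) of radius 5 at the center
yy, xx = np.mgrid[:31, :31]
img = ((yy - 15) ** 2 + (xx - 15) ** 2 <= 25).astype(float)
resp = conv2(img, log_kernel(3.5))   # strong extremal response at the blob
```

Repeating this over a range of sigmas gives the scale space, and non-maximum suppression over (x, y, sigma) then localizes each blob and its radius.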
-
Terrain Identification
The ability to walk efficiently, safely, and attentively is a natural human trait that is disrupted by lower-limb amputations. To restore basic walking function, amputees often rely on prosthetic devices. This project develops a system that identifies the terrain using data from inertial measurement units (IMUs) in the prosthetic leg. The training data includes the following 4 classes: (0) standing or walking on solid ground, (1) going down stairs, (2) going up stairs, and (3) walking on grass. Multiple sessions were conducted for different users, each with a start time of 0, and all labels were combined for visualization. There is a clear class imbalance, with class 0 having by far the most instances. Classification of the sequential data was done with LSTMs and their variants.
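One common way to counter the class imbalance is inverse-frequency class weighting in the loss; a minimal sketch, where the label counts below are illustrative and not the real dataset's:

```python
import numpy as np

# toy label distribution mimicking the imbalance (class 0 dominates)
labels = np.array([0] * 700 + [1] * 80 + [2] * 90 + [3] * 130)
classes, counts = np.unique(labels, return_counts=True)

# inverse-frequency class weights, usable in a weighted cross-entropy loss:
# majority classes get weights below 1, minority classes above 1
weights = counts.sum() / (len(classes) * counts)
```

Passing such weights to the loss makes errors on the rare stair classes cost more than errors on the dominant solid-ground class.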
-
Video Streaming Service on AWS
The primary objective of this project was to delve deeper into the fundamental concepts of cloud computing and its associated components. Cloud computing is an ever-evolving field and is useful in Machine Learning and AI as well. This technical document is written from the perspective of a cloud architect and covers the end-to-end design of cloud applications and services. The project builds a Video Streaming Platform-as-a-Service similar to Netflix, across several sections: (1) Define business requirements and link them to corresponding technical requirements, justifying the necessity of each technical requirement for a specific business need. (2) Discuss trade-offs between business requirements, acknowledging engineering constraints (there is no free lunch in engineering). (3) Compare the major cloud service providers and select AWS for the video streaming service based on varied criteria. (4) Detail the foundational building blocks of the design and the corresponding AWS services. (5) Reference the AWS Well-Architected Framework. (6) Present an architectural diagram and discuss the six pillars outlined in the AWS documentation. (7) Incorporate design principles and best practices. (8) Conduct an experiment with Kubernetes to explore load balancing and autoscaling, using Locust to generate load and observing the K8s pods autoscale in response to load fluctuations. The project extensively references the AWS pillar documentation for comprehensive insights into cloud architecture.
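The autoscaling experiment can be reproduced with a few commands along these lines (a sketch: the deployment name `video-web`, the CPU threshold and replica bounds, and the Locust user counts are all hypothetical, and `locustfile.py` must define the load shape):

```shell
# scale the pods on CPU pressure: between 2 and 10 replicas, targeting 70% CPU
kubectl autoscale deployment video-web --cpu-percent=70 --min=2 --max=10

# generate load with Locust: 200 simulated users, spawned 20 per second
locust -f locustfile.py --headless -u 200 -r 20 --host http://<service-ip>

# watch the HorizontalPodAutoscaler add and remove pods as load fluctuates
kubectl get hpa -w
```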