Previous Quarters
Previous final examination defense schedule
An archive of selected previous School of STEM master’s degree theses and projects.
Select a master’s program to navigate to candidates:
Master of Science in Computer Science & Software Engineering
SUMMER 2024
Wednesday, July 24
TASNIM BASHAR
Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Enhancing Image Inpainting with a Novel Deep Learning Fusion
Image inpainting, the art of seamlessly filling missing or damaged parts of an image, has seen remarkable progress with deep learning. However, achieving consistently high-quality restorations across diverse image content remains a challenge. This project presents a novel image inpainting framework that harnesses the strengths of partial convolutions, Generative Adversarial Networks, and self-attention mechanisms. Our approach utilizes partial convolutions to effectively address irregular holes and preserve intricate image details. A Generative Adversarial Network architecture is incorporated to encourage the generation of realistic and visually plausible image content. Furthermore, integrating self-attention enables the model to capture long-range dependencies and contextual information within the image, leading to more coherent and higher-quality reconstructions. Evaluations using established image quality metrics demonstrate that our framework achieves superior performance compared to existing state-of-the-art methods, confirming the effectiveness of this innovative fusion in significantly enhancing image quality and restoration precision.
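As a concrete illustration of the partial-convolution idea mentioned above, the following is a minimal, self-contained sketch assuming a PyTorch-style implementation; the layer sizes, normalization details, and example inputs are illustrative assumptions, not the project's actual code.

```python
# A minimal sketch of a single partial-convolution step (after Liu et al.),
# assuming a PyTorch-style implementation; sizes and normalization details
# are illustrative, not the project's actual code.
import torch
import torch.nn.functional as F

def partial_conv2d(x, mask, weight, bias=None, padding=1):
    """x: (N, C, H, W) features; mask: (N, 1, H, W) with 1 = valid pixel, 0 = hole."""
    kh, kw = weight.shape[2], weight.shape[3]
    ones = torch.ones(1, 1, kh, kw, device=x.device)
    valid = F.conv2d(mask, ones, padding=padding)        # valid pixels per window
    scale = (kh * kw) / valid.clamp(min=1.0)             # re-normalize by hole coverage
    out = F.conv2d(x * mask, weight, padding=padding) * scale
    if bias is not None:
        out = out + bias.view(1, -1, 1, 1)
    new_mask = (valid > 0).float()                       # holes shrink layer by layer
    return out, new_mask

# Example: a 3x3 partial convolution over a 64x64 RGB image with a square hole.
x = torch.randn(1, 3, 64, 64)
mask = torch.ones(1, 1, 64, 64)
mask[:, :, 20:40, 20:40] = 0.0
w = torch.randn(16, 3, 3, 3)
features, updated_mask = partial_conv2d(x, mask, w)
```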
GREESHMA SREE PARIMI
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Enhancing Collaborative Language Analysis and Study in MeTILDA
The “Global Predictors of Language Endangerment and the Future of Linguistic Diversity” study predicts that over 1,500 languages will become extinct by 2100. Blackfoot is one such endangered language, spoken mainly in Alberta, Canada, and Montana, U.S.A. Blackfoot is a pitch accent language in which the meaning of a word varies depending on the pitch used when speaking it, which makes the language challenging to document and teach. To address this, researchers from the University of Washington Bothell (UW) and the University of Montana (UM) have collaborated on an interdisciplinary project named MeTILDA (Melodic Transcription in Language Documentation and Application). It is a cloud-based system that analyzes the pronunciation of individual Blackfoot words, generates Pitch Art, and assists in documenting, teaching, and learning Blackfoot.
This capstone project primarily aims to enhance the MeTILDA application’s analysis, collaboration, and security features. To improve analytical capabilities, we developed the Pitch Art Version Control System, allowing users to work on and save multiple versions of the same Pitch Art. For improved collaboration, we integrated a communication mode within the application, enabling users to interact with both MeTILDA and non-MeTILDA users. To strengthen security, we implemented role-based access control for all features of the MeTILDA application and introduced email verification for new user registrations. Furthermore, substantial improvements were made to the project documentation, particularly in the areas of front-end components and database information.
HARPREET KOUR
Chair: Dr. Wooyoung Kim
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Predictive Modelling of Substance Abuse: Analysing Key Features with Machine Learning
Substance abuse remains a critical public health challenge with multifaceted implications for individuals, families, and communities worldwide. This project explores how machine learning techniques can predict and classify substance abuse behaviours, focusing on alcohol, tobacco, marijuana, and nicotine vaping. Leveraging 2021–2022 data from the Behavioural Risk Factor Surveillance System (BRFSS), we aim to identify the key predictors of substance abuse and the groups at higher risk, taking into account factors such as adverse childhood experiences, financial conditions, and mental health status.
Prediction models were developed using four types of machine learning algorithms: linear models (Logistic Regression and SVM), tree-based models (Random Forest, XGBoost, and AdaBoost), neural networks (MLP and CNN), and clustering. Respondents were randomly divided into training and testing samples, and the performance of all the models was compared using accuracy, precision, recall, AUC, and false positive rate. The study included 31,060 respondents, of whom 5,867 (19%) were found to be substance abusers. Of the respondents who reported substance abuse, 62.93% were between the ages of 18 and 64, 60.61% were male, and 84.76% were non-Hispanic White. Random Forest was the best-performing model with an AUC of 0.86, followed by XGBoost (AUC 0.85). The most important factors for substance abuse were BMI, male sex, lower income levels, young adult age group, lower education levels, poor mental health, and adverse childhood experiences. Data mining methods were useful in examining patterns across demographics, health conditions, and lifestyle behaviours to understand the co-morbidities associated with substance abuse.
Another goal of this project was to highlight the importance of collaboration between domain experts and machine learning practitioners and to assess its impact on the results compared to when domain experts are not involved. Their contributions to feature selection and data interpretability were instrumental in improving the models. We prioritised model interpretability to foster trust and refine understanding. Additionally, our project introduces a novel approach to interpretability by analysing misclassified data, offering insights into substance abuse dynamics. These results can be used to generate further hypotheses for research, increase public awareness, and help provide targeted substance abuse education.
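For readers unfamiliar with the evaluation setup described above, the following is a minimal sketch of a comparable train/test pipeline; it uses synthetic data in place of BRFSS and only one of the listed models (Random Forest), so the class balance, feature counts, and settings are illustrative assumptions.

```python
# A minimal sketch of a train/test evaluation of the kind described above,
# using synthetic data in place of BRFSS; all settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, precision_score, recall_score

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.81, 0.19],
                           random_state=0)                  # roughly 19% positive class
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

print("AUC      :", roc_auc_score(y_te, proba))
print("Precision:", precision_score(y_te, pred))
print("Recall   :", recall_score(y_te, pred))
```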
Friday, July 26
JEFFREY MCCREA
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Enhancement of Agent Performance with Q-Learning
Graphs store data from various domains, including social networks, biological networks, transportation systems, and computer networks. As these graphs grow in size and complexity, single-machine solutions become impractical due to limitations in computational resources. Distributed graph computing addresses these challenges by leveraging multiple machines to process and analyze large-scale graphs collaboratively.
This capstone project investigates the enhancement of distributed graph computing performance in the Multi-Agent Spatial Simulation (MASS) library by integrating Q-learning for computing shortest path, closeness centrality, and betweenness centrality on distributed large-scale dynamic graphs. This approach is compared to traditional and agent-based graph computing algorithms. Previous approaches in the MASS framework relied on large populations of unintelligent agents to exhaustively traverse graphs to compute solutions, making them inefficient when faced with dynamic graph data. By leveraging Q-learning and the distributed agent-based graph capabilities of MASS, we aim to optimize the decision-making processes of distributed agents, thus improving computational efficiency and accuracy.
Experimental results demonstrate that the adaptive learning mechanism of Q-learning, coupled with the MASS library, allows agents to dynamically adjust to changing graph structures, leading to a more robust and scalable distributed graph computing solution. This research contributes to the field of distributed systems and artificial intelligence by providing an innovative approach to enhancing multi-agent intelligence for graph computing tasks.
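The following is a minimal, single-machine sketch of tabular Q-learning for shortest-path search on a toy weighted graph; it illustrates the general technique only and does not reflect the MASS agents, distribution, or reward design used in the project (the graph, reward, and hyperparameters are assumptions for illustration).

```python
# Tabular Q-learning for shortest paths on a toy weighted graph; a generic
# illustration only, not the project's distributed agent-based implementation.
import random

graph = {0: {1: 4, 2: 1}, 1: {3: 1}, 2: {1: 2, 3: 5}, 3: {}}   # node -> {neighbor: weight}
target = 3
Q = {(u, v): 0.0 for u in graph for v in graph[u]}
alpha, gamma, eps = 0.5, 0.95, 0.2

for _ in range(2000):
    u = random.choice([n for n in graph if graph[n]])          # random start with outgoing edges
    while u != target:
        nbrs = list(graph[u])
        v = random.choice(nbrs) if random.random() < eps else max(nbrs, key=lambda n: Q[(u, n)])
        reward = -graph[u][v]                                   # shorter edges cost less
        future = 0.0 if v == target else max(Q[(v, w)] for w in graph[v])
        Q[(u, v)] += alpha * (reward + gamma * future - Q[(u, v)])
        u = v

# A greedy walk over the learned Q-values should recover the shortest path 0 -> 2 -> 1 -> 3.
```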
Monday, July 29
SAURAV JAYAKUMAR
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: GraphConv: Geometric Deep Learning for Multiple Conformation Generation from Electron Density Images
In the realm of cryo-electron microscopy (cryo-EM) structural analysis, the precise prediction of molecular conformations within datasets stands as a fundamental endeavor. Despite strides made in deep learning methodologies, existing solutions often yield volumes of suboptimal quality. Addressing this critical limitation, our research introduces GraphConv, an innovative encoder model designed to embed particle images into a latent space, thereby supplanting the conventional encoder utilized by CryoDRGN. This novel approach employs a Graph Neural Network (GNN) architecture featuring multiple GraphConv and Convolutional layers, aimed at capturing richer information from particle images and faithfully reconstructing corresponding 3D volumes. Rigorous testing across two authentic datasets and three simulated datasets underscores the efficacy of our model, showcasing marked enhancements in reconstruction quality. Notably, our findings reveal enhancements in resolution by up to 20% compared to CryoDRGN. By harnessing the power of GNNs, our methodology heralds significant advancements in the fidelity and accuracy of output volumes, thereby contributing to the ongoing refinement of cryo-EM structural analysis methodologies.
Tuesday, July 30
SUGAM JAISWAL
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Thesis: Development of Personality Adaptive Conversational AI For Mental Health Therapy Using LLMs
Many individuals with mental health issues cannot access professional help due to reasons such as lack of awareness, limited availability, and high costs. Conversational agents present a viable alternative for delivering mental health support that is accessible, affordable, and scalable. However, the effectiveness of these agents can vary among users, as different users have different personality traits, such as extraversion and agreeableness, which influence how they interact with chatbots. Therefore, it is important to develop therapy chatbots that adapt to individual personalities. In this study, we highlight the significant role of Personality Adaptive Conversational Agents (PACAs) in mental healthcare. We designed an architecture around traditional ML models and open-source LLMs to build a PACA for mental health (based on the existing iCare project). We built a functional prototype based on it and conducted a user study, which concluded that personality adaptability is a critical feature for mental health chatbots.
During this research, we were able to build a personality classifier that achieved an average F1-score of 0.96 across the Big Five personality dimensions – Agreeableness, Extraversion, Openness, Conscientiousness, and Neuroticism, and successfully integrated that with an open-source LLM to generate adapted responses. The remote user study demonstrated that 95% of the test users found the responses from the adaptive chatbot relevant to their situation compared to 30% for the non-adaptive chatbot, and 55% of users agreed that the responses felt suited according to their personality as opposed to 15% for the non-adaptive chatbot. With this study, we have shown that it is feasible to create free, accessible, and personalized mental healthcare solutions and that the adoption of PACAs could represent a pivotal step toward making mental healthcare more personalized and widely available.
ANI AVETIAN
Chair: Dr. Annuska Zolyomi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: The Views Application: Introducing Augmented Reality into Travel Applications
The travel and tourism industry is growing worldwide and experiencing a resurgence following the COVID-19 global pandemic travel restrictions. When people travel to unfamiliar places, they need to orient themselves to their surroundings, find their way around, and gain information about the local culture. Traditional travel applications, like websites and smartphone apps, are used for arranging travel and providing information. However, the information and services provided by websites and apps are presented out of context with the real-world surroundings of the traveler. Emerging technology, specifically Augmented Reality (AR), presents an opportunity to enhance the traveler’s experience with real-time information that is easy to access and contextually relevant.
Here we introduce a novel AR travel application called “Views.” This app uses image detection to allow users to easily learn about the landmarks they visit. Once detection is complete, users enter an AR environment where they can interact with their surroundings. A fact is displayed in AR with the help of Artificial Intelligence (AI) working in the background. This provides an experience that a user can effortlessly incorporate into their travel plans and activities. Additionally, this project entailed identifying potential cybersecurity risks that come with these kinds of applications and developing a potential solution to them. We examined clickjacking attacks in AR environments and developed a solution to combat them using image detection. As we envision a future where wearable devices may become the norm, applications like this will be at the forefront of these innovations. Using this application as a stepping stone, we begin to understand the important aspects of AR travel user experience and implementation approaches for building usable and secure AR features.
Thursday, August 1
HARIKA CHADALAVADA
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: I’poyít – Blackfoot Language Learning Application
The Blackfoot language, a vital part of the cultural heritage of the Blackfoot tribes of the Great Plains in North America, is facing imminent risk of extinction with only about 3,000 fluent speakers remaining. This paper details the development of an application aimed at revitalizing the Blackfoot language by leveraging modern technology to make learning accessible and engaging. The application was developed using the versatile Flutter framework, ensuring seamless cross-platform compatibility.
The primary goal of this application was to build a comprehensive solution that addresses and fills the gaps identified in existing Blackfoot language learning applications and provides a more effective and engaging learning experience. To achieve this goal, the application features distinct student and admin login interfaces, with Firebase serving as the backend to manage user data and interactions efficiently. The application provides a comprehensive suite of interactive learning modules that include vocabulary exercises with flashcards and audio pronunciation guides, along with phrase modules that facilitate practical language use in everyday contexts. To enhance the learning experience, the application incorporates gamified elements such as experience points, badges for progress, and quizzes that test and reinforce language skills. A leaderboard and a discussion forum are integrated to foster a community of learners who motivate each other through shared achievements and discussions. Significantly, the content within the application is curated by a linguistics expert known for her extensive research in Blackfoot phonology to ensure that the educational material is authentic and culturally resonant. This project not only offers a practical solution to the preservation of the Blackfoot language but also serves as a model for preserving other endangered languages, ultimately demonstrating how digital technology can be harnessed to safeguard linguistic diversity.
ARSHEYA RAJ
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: NeuroGaming: Innovative Rehabilitation for Upper Limb Neurologic Conditions Using Mixed-Reality Simulation Games and EEG/EMG Biofeedback
Recent advances in Augmented Reality (AR), Mixed Reality (MR), Electroencephalogram (EEG), and Electromyogram (EMG) offer significant opportunities in medicine and neuroscience. This research aims to use these technologies to aid stroke patients with upper limb extremity weakness. This work extends the Edge Computing Ecosystem for Neuroscience Patients’ Rehabilitation, part of the ‘Stroke Rehabilitation Project’ by University of Washington Bothell Engineering, University of Washington Seattle Neuroscience, and Rehabilitation Medicine at Harborview Medical Center (UWHM). Recently, UW Bothell’s CSSE has also contributed, focusing on solutions for stroke patient rehabilitation. Traditional motor rehabilitation is costly, resource-intensive, and often monotonous, reducing patient engagement. We introduce “NeuroGaming”, an interactive approach that engages elements of AR/MR games to enhance rehabilitation programs and improve patient outcomes. Using augmented and mixed reality technologies, interactive environments can be created on mobile devices, providing engaging and motivating experiences for patients. AR simulates real-world scenarios, offering a safe and fun way to practice tasks and aid in rehabilitation.
We used EEG and EMG sensors to conduct experiments and to collect data in a controlled environment targeting a reduced set of representative relevant motor tasks. The data were processed using various signal processing and statistical techniques, which in combination with the MR / AR game can be used to build a novel feedback and guidance system. This system is a building block of our “NeuroRehab” ecosystem, which will use various ML models and algorithms for sequential prediction, with the aim of guiding patients through an optimal rehabilitation path within the game environment.
Results showed that the combination of Frequency Filtering, ICA, and ERP with FNN and SVM models has so far yielded the best accuracy for classifying the motor tasks in EEG and EMG data. These findings contribute to the field of stroke rehabilitation for upper limb extremity weakness. This work also contributes to a larger project that aims for a better understanding and rehabilitation of other neurological ailments by offering insights into different hand gestures using EMG and EEG data and by creating a framework for data processing and feedback systems.
Friday, August 2
NOURA ALROOMI
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Automated Expansion of Sketch Datasets
Free-hand sketches offer a unique reflection of human perception and creativity, playing a crucial role in various computer vision and machine learning applications. These sketches are used in image and sketch recognition algorithms, sketch-based retrieval systems, and generative art neural networks, enhancing our understanding of artistic expression. However, the limited variety of object categories in existing sketch datasets restricts their applications and the development of robust machine-learning models. This limitation is primarily due to the manual curation required for these datasets, which is time-consuming and slows the addition of new sketches. Automated systems using advanced computer vision techniques present a solution to enhance the diversity and quality of sketch datasets efficiently.
In this capstone project, we introduce an Automated Dataset Expansion System designed to streamline and automate the process of adding new categories to sketch datasets. Our system employs a web-based platform that integrates user-generated sketches with an advanced auto-expansion pipeline. This pipeline consists of two phases: the first phase involves synthetically generating baseline sketches from photos using a pre-trained Photo-Sketching model, effectively capturing the structural attributes of simpler objects. The second phase evaluates the similarity of user-generated sketches to these baselines using both Structural Similarity Index Measure (SSIM)-based and Convolutional Neural Network (CNN)-based encoder similarity metric pipelines. Our findings indicate that the CNN-based encoder with the Cosine Similarity measure provides consistent and reliable performance across categories, achieving the highest precision and true positive rates. This reliable method was subsequently implemented in our system. The enriched datasets produced by this system can support innovative machine learning and computer vision applications, fostering advancements in these fields.
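As an illustration of the two similarity pipelines compared above, the sketch below contrasts SSIM on raw sketches with cosine similarity of CNN-encoder embeddings; the ResNet18 encoder is an illustrative stand-in for the project's encoder, and a recent torchvision is assumed.

```python
# SSIM on raw grayscale sketches vs. cosine similarity of CNN embeddings.
# The ResNet18 encoder is an illustrative stand-in, not the project's model.
import torch
from skimage.metrics import structural_similarity as ssim
from torchvision import models, transforms

def ssim_score(a, b):
    """a, b: grayscale sketches as 2-D numpy arrays in [0, 1]."""
    return ssim(a, b, data_range=1.0)

encoder = models.resnet18(weights=None)       # untrained stand-in; requires recent torchvision
encoder.fc = torch.nn.Identity()              # use pooled features as the embedding
encoder.eval()
prep = transforms.Compose([transforms.ToTensor(), transforms.Resize((224, 224))])

def cosine_score(a, b):
    """a, b: HxWx3 uint8 images; returns cosine similarity of their embeddings."""
    with torch.no_grad():
        ea = encoder(prep(a).unsqueeze(0))
        eb = encoder(prep(b).unsqueeze(0))
    return torch.nn.functional.cosine_similarity(ea, eb).item()
```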
Monday, August 5
AISHWARYA PANI
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: I’poyít – Platform for Learning Blackfoot Language
Preserving endangered languages is essential for maintaining cultural identity. Blackfoot is categorized as endangered, with nearly 2,900 speakers remaining. Existing solutions for preserving Blackfoot lack engagement, accessibility, community learning, and gamified experiences. To address this, we developed “I’poyít – Blackfoot Language Learning Application,” a cross-platform app designed to facilitate learning and preservation. In collaboration with the University of Montana, the application features content from researchers and linguists.
The primary objective of our development was to establish a codebase that is both maintainable and robust. This culturally sensitive and interactive application utilizes the Flutter framework for cross-platform compatibility and Firebase for backend services, which provide real-time updates and data storage. State management was handled using Riverpod, which improved performance by decoupling UI components from business logic. Loading performance was optimized through asynchronous programming, ensuring minimal delays and seamless navigation. The real-time database capabilities of Firebase provided instant synchronization and updating of user data. Flutter’s integration with Firebase enabled users to upload audio files, which were then instantly accessible to all users. The features were designed to be easily extendable, allowing new functionality to be added with minimal changes to existing code.
The key features of the application include vocabulary flashcards, a phrases learning module with audio translations from native speakers, and customizable quizzes with detailed visual analytics to provide an immersive experience. Gamified elements, such as XP points, a leaderboard, and a robust notification system, ensure daily study engagement. User profile management and discussion forums stimulate community engagement. A content management system was designed for efficient phrase management, batch uploading, and fault-tolerant multimedia integration. “I’poyít” demonstrates the potential for cross-platform accessibility to improve language learning and preserve the Blackfoot language. By combining modern technology with cultural insights, the application offers an engaging and effective learning experience, highlighting the potential for similar solutions to support other endangered languages.
GNANA SANJANA KILLI
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Facilitating Endangered Language Revitalization: An E-Learning Platform for Blackfoot
The preservation and revitalization of endangered languages are crucial for maintaining cultural diversity and heritage. This project introduces an innovative e-learning platform specifically designed for the revitalization of endangered languages, with a particular focus on the Blackfoot language, which is currently at risk of extinction. The platform aims to engage indigenous communities, educators specializing in language revitalization, and linguists through interactive language learning facilitated by a user-centric design.
The development of the platform was driven by the need for accessible, engaging, and pedagogically sound language learning tools that overcome the limitations of traditional methods. Key features of the platform include secure account creation and management; seamless access to courses, lessons, and assignments; an intuitive and adaptive user interface for various devices, developed using React.js; and robust data management using SQL databases, ensuring the security and integrity of user data. The system’s architecture supports multimedia content managed through AWS, accommodating diverse teaching methods that appeal to different learning styles.
Throughout its development, the platform has evolved through continuous stakeholder feedback, leading to significant enhancements such as updated administrative controls for user management, enhanced security measures, and enriched interactive content. This adaptive approach has significantly improved the platform’s effectiveness. Future enhancements will focus on expanding the range of languages offered, integrating adaptive learning technologies, and enhancing peer-to-peer interaction features. This project not only contributes to the field of language revitalization but also serves as a significant model for the further development of educational technologies that respect and promote cultural heritage.
Tuesday, August 6
SONAL YADAV
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Clustered Federated Learning for Next-Point Prediction in Mobility Datasets
Individual trajectories vary significantly, making the task of next-point prediction challenging due to privacy concerns and potential convergence issues in machine learning models. To address these challenges, I added functionality to the Inflorescence framework, developed by Professor Mashhadi’s Lab, enabling trajectory datasets to leverage clustered federated learning (CFL) strategies. This project evaluates the performance of CFL on state-of-the-art mobility datasets, GeoLife and MDC, demonstrating its robustness compared to traditional federated learning. Additionally, the study analyzes which types of clients benefit the most from CFL, finding that high-entropy clients, characterized by more non-identical and random datasets, experience the greatest advantages. This enhanced functionality is now part of the open Python package Inflorescence, which extends Flower.
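The clustering step at the heart of clustered federated learning can be illustrated with a generic sketch like the one below, which groups clients by the similarity of their model updates and averages per cluster; this is not the Inflorescence or Flower API, and the client updates are random stand-ins.

```python
# Generic illustration of the CFL idea: cluster clients by their model updates,
# then aggregate per cluster instead of training a single global model.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
client_updates = [rng.normal(size=1000) for _ in range(20)]   # flattened weight deltas (stand-ins)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    np.stack(client_updates))

cluster_models = {
    c: np.mean([u for u, l in zip(client_updates, labels) if l == c], axis=0)
    for c in set(labels)
}  # one federated average per cluster
```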
JOSHUA MEDVINSKY
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Exploring the Influence of Human Factors on the Effective Utilization of Password Managers
In today’s digital realm, password managers are important tools for securely managing login credentials. Amidst an array of options like LastPass, RoboForm, Dashlane, 1Password, and Bitwarden, these tools not only handle online account credentials but also extend their utility to desktop application credentials.
Yet despite their proven effectiveness, password managers remain underused, which raises concerns about online security given the growing threat of password attacks. This capstone project explores the impact of user education on effective password manager use. By addressing this challenge, it aims to improve security practices and bridge existing gaps in cybersecurity knowledge.
Through an extensive literature review, the project identifies a notable research gap regarding the influence of targeted user education on password manager effectiveness. By exploring user motivations and concerns, this study seeks to contribute valuable insights to the cybersecurity field.
The project’s methodology entails in-person interviews with students from the University of Washington Bothell, divided into a group receiving user education and a control group. Data collection suggests that targeted education may significantly enhance password manager adoption and usage.
This project shows significant improvements among participants who received targeted education. It emphasizes the critical role of organizations in ensuring their users understand password manager benefits through practical application, which improves cybersecurity measures and mitigates data breach risks. Moving forward, the project highlights the importance of exploring the long-term effects and organizational scalability of educational interventions to strengthen cybersecurity defenses.
Thursday, August 8
SUNJIL GAHATRAJ
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Thesis: Enhancing Neurological Rehabilitation: Combining Gaming, Robotics and Machine Learning in an Edge Computing Environment
Recent advancements in Electroencephalography (EEG) technology have made it more accessible, opening new avenues for collecting and studying EEG data. In the context of neurological rehabilitation, it enables us to develop new strategies to enhance this experience for patients recovering from conditions such as stroke, spine surgery, and nerve damage. At the same time, the continued evolution of hardware and the rapid onset of High Performance Computing (HPC) have allowed smaller devices to become more computationally powerful, making Edge Computing (EC) environments ubiquitous and suitable for many applications.
The goal of this research is to explore the combined use of Computer Vision (CV), Gaming, Machine Learning (ML), and robotics technologies to develop improved rehabilitation approaches for neurology patients. Specifically, our target is the gamification of some aspects of this area of rehabilitation to make it more engaging and help patients achieve their goals. Our approach is to investigate the feasibility of this strategy by studying EEG data collected from healthy subjects to analyse brain signals associated with specific hand movements. The EEG data is then processed, and ML models are used to predict these hand movements, which can then be used to guide a Raspberry Pi robot. In addition, a camera is used to monitor hand gestures. In the future, together with medical expert input, these experiments can help train such a robot to follow optimal paths based on patient hand exercises. In our experiments, we used a reduced set of hand movements and healthy subjects as a proxy for real patients. Initial results using CNN and FNN models are encouraging, showing high precision and accuracy in predicting hand movements, indicating that our research serves as a proof of concept and that continued exploration is justified.
This research is novel interdisciplinary work spanning computer science, engineering, and neuroscience, carried out by the Computing and Software Systems (CSS) division at the University of Washington Bothell in collaboration with University of Washington Seattle Neuroscience and Rehabilitation Medicine at Harborview Medical Center (UWHM).
HONGYANG LIU
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: A Comprehensive Categorization Framework for Interactive Fiction Games
Interactive Fiction (IF) games are digital experiences that merge storytelling with interactive gameplay, allowing players to navigate and influence story-driven adventures. These games have evolved significantly, integrating advanced visual and interactive elements alongside traditional textual narratives, making them an intriguing area of study. However, there are currently few structured frameworks designed for the systematic classification of IF games, and it can be challenging to analyze these games holistically.
This thesis presents a comprehensive categorization framework for IF games, designed to facilitate systematic classification and analysis. Based on features derived from a human-computer interface, story genre, game mechanics, and business model, the framework supports the classification of IF games into distinct categories. This structured approach allows feature-based examination and facilitates the holistic analysis of IF games and their evolution.
Validation for the proposed framework involved three rounds of sampling and categorizing IF games. The first round sampled popular IF games developed based on well-established game engines to demonstrate the fundamental robustness of the framework. The second round sampled popular IF games over time for insights into potential trends as IF games continue to develop and evolve. The third round was based on popular IF game series and traditional action-adventure series to examine potential similarities between the two genres.
The three rounds of sampling and categorizing reveal potential patterns and trends that enhance our understanding of IF games. Key findings include the shift from text-only to image-based and even animation-based presentation, the shift from a single defined ending to multiple defined endings, the shift from little or no support for stats and resource management toward more sophisticated support, and the potential overlap and merging of IF and action-adventure games.
These findings demonstrate that the proposed framework is an effective tool for systematic analysis that can offer valuable insights into the development and trends of IF games. Since classification involves subjectivity, future work should repeat the process based on stakeholders with distinct backgrounds, e.g., publishers, developers, and gamers. Additionally, the proposed framework is but a first step and should be continuously reviewed and refined.
ESHA GAVALI
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Enhancing the Performance of GNN and Utilizing 3D Instance Segmentation for Ligand Binding Site Prediction
This study addresses the challenge of accurately predicting ligand binding sites (LBS) on proteins, a critical aspect of structure-based drug design. Ligand binding site prediction is crucial for designing effective drugs and understanding protein functions, benefiting pharmaceutical companies, biotechnologists, and researchers by accelerating drug discovery and improving therapeutic interventions. We employ and improve Graph Neural Networks (GNNs) and innovative 3D point cloud instance segmentation to refine and advance LBS prediction methods. This research demonstrates significant enhancements in predictive accuracy by evaluating these methods on widely used datasets. Our novel clustering algorithm, which combines density-based and fuzzy clustering, notably improves the definition and identification of ligand binding sites without prior knowledge of the number of clusters. This methodology allows for more precise predictions, effectively managing binding sites’ overlapping nature. Implementing instance segmentation further delineates individual binding pockets, offering a more granular understanding of ligand-protein interactions. The results illustrate that our approaches meet the current state-of-the-art for ligand binding site prediction and support their potential utility in real-world pharmaceutical applications. Future work will focus on refining these methods and extending their application to molecular docking studies.
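The benefit of density-based clustering noted above (no need to pre-set the number of clusters) can be illustrated with a minimal sketch; the 3-D points below are random stand-ins for predicted pocket points, and the project's combination with fuzzy clustering and a GNN is not shown.

```python
# Density-based clustering of candidate binding-site points; a generic
# illustration with random stand-in coordinates, not the project's pipeline.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
pocket_a = rng.normal(loc=[0, 0, 0], scale=0.5, size=(40, 3))
pocket_b = rng.normal(loc=[5, 5, 5], scale=0.5, size=(35, 3))
points = np.vstack([pocket_a, pocket_b])            # candidate binding-site atoms

labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(points)
print("clusters found:", len(set(labels) - {-1}))   # no cluster count specified in advance
```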
Friday, August 9
KARAN BHATT
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Privacy Concerns and User Perception in Targeted Digital Advertising: A Survey Project
This study investigates the relationship between users’ perceptions of privacy control and transparency in data usage practices and their subsequent trust and acceptance of targeted advertising. The purpose of this research is to understand how perceived control over personal data and transparency in data usage influence user trust and acceptance of targeted ads, thereby aiding in the design of more user-friendly and trustworthy advertising strategies. Utilizing an extensive literature review and a comprehensive survey, data were collected on users’ awareness of data collection practices, perceived control over personal data, trust in advertisers, and acceptance of targeted advertisements. Additional questions addressed concerns regarding data privacy, the effectiveness of privacy-enhancing technologies, and user preferences for regulatory measures. The primary hypothesis posits that users’ perception of control over their data significantly influences their acceptance and trust in targeted digital advertising, while the secondary hypothesis examines how transparency in data usage and collection practices affects users’ trust and comfort with targeted advertising. The survey results highlight critical factors influencing user acceptance and trust in targeted advertising. The findings of this study contribute to the body of knowledge on digital marketing and privacy, offering practical recommendations for advertisers on designing more user-friendly and trustworthy advertising strategies. By understanding the importance of perceived control and transparency, advertisers can improve their practices to better align with user expectations, thereby enhancing the overall effectiveness and acceptance of targeted advertising.
NIKHITA TITHI
Chair: Dr. Wooyoung Kim
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Interpretable AutoML Web Application
The adoption of automated machine learning (AutoML) platforms across interdisciplinary fields, such as bioinformatics, has been limited despite their potential to broaden access to machine learning tools and accelerate scientific discovery. This paper investigates the specific challenges and barriers that hinder the widespread utilization of AutoML platforms among bioinformatics researchers. The iBioML framework, developed by Dr. Wooyoung Kim, aims to enhance the effectiveness and efficiency of machine learning applications in bioinformatics by providing an interactive and interpretable machine learning infrastructure. Despite the promise of AutoML to simplify and accelerate data analysis, its adoption remains constrained due to the complexity of setup, registration processes, subscription models of commercial platforms, and the steep learning curve associated with their advanced features.
Our research delves into the current state of AutoML applications, exploring the factors influencing researchers’ decisions to use or not use these platforms. Through a comprehensive analysis of commercial and open-source solutions such as Google Cloud AI Platform, Amazon SageMaker, Microsoft Azure ML, DataRobot, and H2O.ai, we evaluate their features, user experience, and interpretable capabilities. A survey was conducted among bioinformatics researchers to understand their needs and the barriers they face. Our findings revealed significant challenges related to the usability and accessibility of these platforms, with non-technical users particularly struggling with the initial setup, unclear subscription models, and the high cost of commercial platforms. Additionally, while commercial platforms offer advanced data analysis and visualization features, they often require extensive resources and commitment, which can be a deterrent. Non-commercial platforms like H2O and Orange provide essential AutoML functionalities but are limited by their user interface complexities and performance issues with large datasets.
To address these challenges, we developed a new application within the iBioML framework, integrating enhanced AutoML and interpretability features based on user feedback and comprehensive analysis. Our new application simplifies the setup process, offers clear usage guidelines, and provides robust data visualization tools, built-in AutoML capabilities, and extensive interpretability. By improving these aspects, we aim to empower bioinformatics researchers to leverage AutoML for more accurate, reliable, and interpretable models. This research contributes to the advancement of AutoML not only in bioinformatics but also in other fields, offering targeted strategies to enhance its accessibility, usability, and effectiveness.
SPRING 2024
Tuesday, May 14
NAIMA NOOR
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Discovery Hall 464
Thesis: Fairness in Continual Federated Learning
Continual Federated Learning (CFL) is a distributed machine learning technique that enables multiple clients to collaboratively train a shared model without sharing their data, while also adapting to new classes without forgetting previously learned ones. Currently, there are limited evaluation models and metrics for measuring fairness in CFL, and ensuring fairness over time can be challenging as the system evolves. To address this, our study explores temporal fairness in CFL, examining how the fairness of the model can be influenced by the selection and participation of clients over time.
We introduce novel fairness metrics—Delta Accuracy Fairness (DAF) and Delta Forgetting Fairness (DFF)—specifically designed to ensure temporal fairness in a CFL context. Additionally, we propose a set of client selection strategies that enhance the temporal fairness of the CFL model by addressing disparities in knowledge retention. Through comprehensive analysis, we demonstrate that while no single strategy guarantees perfect temporal fairness, the Low Participation and Low Average strategies consistently outperform others in terms of stability and equity. Furthermore, our findings underscore the adaptability of the Dynamic strategy, which shows significant promise in certain tasks. These insights pave the way for refining client selection strategies, enhancing CFL’s fairness, and fostering more equitable learning environments.
Wednesday, May 15
SHENYAN CAO
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Discovery Hall 464
Project: An Incremental Enhancement of Agent-Based Graph Database
In the domain of big data analytics, graph Database (DB) is vital for managing complex data structures. This project focuses on enhancing an existing agent-based graph DB within the MASS Java framework. Motivated by the limitations of the existing agent-based graph DB, this project aims to enrich its capability to handle data with more detailed property information, aligning with the Property Graph Model. Through a comparative analysis of popular industry graph DBs such as Neo4j, RedisGraph, JanusGraph, and ArangoDB, this project establishes design principles focusing on the adoption of the Property Graph Model, the Cypher query language, in-memory distributed graph structures, and agent utilization. The project provides detailed insights into the design and implementation processes, including parsing Cypher queries into an Abstract Syntax Tree (AST), planning execution strategies, and comprehensive testing to ensure system functionality and reliability. Overall, the project demonstrates the successful extension of the agent-based graph DB to handle complex and interconnected data structures, accurate execution of CREATE and MATCH Cypher queries, and outlines plans for future development.
VEDANTI PAWAR
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Adversarial Defense: Implementing and Evaluating Multi-Layered Strategies Against Adversarial Attacks
Deep learning (DL) has become a cornerstone in image classification tasks across various industries, notably in the development of autonomous driving systems, where it significantly enhances vehicle perception and decision-making capabilities. However, reliance on single defense mechanisms often falls short in safeguarding these models against sophisticated adversarial attacks. This research investigates the potential of combining various defense strategies to enhance the robustness of DL models, focusing on the ResNet34 and ResNet50 architectures. By employing widely-used attack methods, this study simulates real-world threats to assess whether these combined defenses can improve model accuracy and security. Testing these strategies on different network architectures across various datasets, the analysis determines the impact of each defense combination along with their computational costs. The findings provide valuable insights into which strategies are most effective in different settings, guiding the development of more resilient DL systems against sophisticated attacks.
Thursday, May 16
WUBE ALEMAYEHU TUFFA
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Transfer Learning in Neural Machine Translation for Low-Resource Languages
This project paper explores the impact of transfer learning and pre-trained models on improving Neural Machine Translation (NMT) between the low-resource language, Amharic, and the resource-rich language, English. Given the unique challenges associated with NMT for low-resource languages, this study proposes to use two innovative architectures: the Concerted Training NMT (CTNMT) and a Bert-fused NMT model, aimed at improving translation quality. These models are evaluated against a conventional transformer model to determine their ability to effectively leverage pre-trained knowledge for language translation tasks.
The experimental approach employs the fairseq and neurST toolkits to conduct controlled experiments, with translation accuracy assessed through BLEU scores. The research consolidates two smaller corpora into an expanded Amharic-English dataset, ensuring robustness and integrity for model training and evaluation while safeguarding against data leakage into the test set. The CTNMT architecture utilizes rate scheduling and a dynamic switch to maximize learning from BERT through sophisticated training methodologies. Meanwhile, the Bert-fused model leverages BERT’s capabilities by embedding it within a custom-built sequence-to-sequence encoder-decoder framework.
The results suggest that both innovative models are effective, with the Bert-fused model achieving higher BLEU scores in both Amharic-English and English-Amharic translations compared to the baseline transformer. While the CTNMT model performed well in English-Amharic translation, it was not applicable in the opposite direction. These findings highlight the potential of pre-trained models to improve the quality of Neural Machine Translation (NMT), especially for languages with limited linguistic resources. In particular, the success of these models validates the hypothesis that integrating deep bidirectional language understanding can substantially enhance translation quality, presenting a notable advancement in the field of machine translation.
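For reference, BLEU scoring of the kind reported above can be computed with sacrebleu as in the minimal sketch below; the sentences are placeholders, and the project's actual scoring ran through the fairseq/neurST pipelines rather than this standalone call.

```python
# Standalone BLEU computation with sacrebleu; placeholder sentences only.
import sacrebleu

hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]   # one list per reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```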
Monday, May 20
YUAN MA
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: An Implementation of Multi-User Distributed Shared Graph
Many real-world applications, such as social or biological networks, can be modeled as graphs. With the increasing size of graphs, graph databases have also become a popular research area. Graph databases usually require maintaining the original structure of the graph over distributed disks, or preferably over distributed memory, to function. Compared to popular data-streaming tools that need to disassemble the graph into text before processing, it is reasonable to introduce agent-based graph computing, in which we deploy agents to graphs without modifying the original shape. In this research, we introduce using the Multi-Agent Spatial Simulation (MASS) library for graph computing. Currently, most agent-based modeling (ABM) libraries, including MASS, focus on parallelization of ABM simulation programs. However, database systems need to accept, handle, and protect many queries from different users simultaneously, a capability MASS has not yet provided. Therefore, this project aims at implementing a high-performance multi-user distributed shared graph and adding this feature to the MASS library. We conducted research on many popular data-streaming tools and distributed caches. By addressing the challenges they face in programming graph applications, we proposed and implemented a high-performance distributed shared graph structure within the MASS library. Through a performance and programmability comparison between MASS and Hazelcast (which has a distributed HashMap data structure, thus enabling distributed graph construction), we demonstrated that MASS GraphPlaces is faster when processing graph queries and at the same time offers an easier way to program graph applications such as Triangle Counting.
KENNETH TRAN
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: LSTAR Framework: Lightweight Framework for Standardizing Tests for Adversarial Robustness
The role of neural networks in various tasks has exploded in recent years, becoming prevalent in many safety-critical applications. However, improving neural network robustness has become a challenge due to the existence of adversarial examples—imperceptible perturbations to the inputs of machine learning models that mislead classifiers into producing incorrect outputs. While there have been numerous advancements in crafting adversarial attacks and defenses, research on the basis of adversarial examples has notably lagged behind, largely due to the computational difficulty of analyzing high-dimensional spaces. This inherent difficulty has led researchers to construct models for understanding adversarial examples divergent from conventional paradigms, with some relying on commonly used frameworks while others utilize their own tailored frameworks to meet their unique needs. Consequently, replicating and building upon research in this field presents a significant challenge.
In this paper, we present a modular, lightweight framework to assist researchers in addressing these challenges by providing a comprehensive approach to evaluating machine learning models through a standardized experimentation platform. We present several potential hypotheses regarding the basis of adversarial examples and utilize our framework to verify them more robustly under complex attacks and datasets through controlled experiments. Our experimental results indicate that geometric causes directly affect the robustness of machine learning models, while statistical factors amplify the effects of adversarial attacks. These findings provide a baseline for further studies to better understand the phenomenon of adversarial examples, allowing researchers to design more robust machine learning models.
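A simple example of the adversarial perturbations such a framework evaluates is the fast gradient sign method (FGSM); the sketch below uses a toy classifier and an arbitrary epsilon, and is not drawn from the LSTAR framework itself.

```python
# Minimal FGSM sketch: perturb inputs in the direction that increases the loss.
# Toy classifier and epsilon are illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # toy stand-in classifier
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, eps=0.1):
    """Return x perturbed by the sign of the loss gradient, clipped to [0, 1]."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

x = torch.rand(8, 1, 28, 28)                 # batch of fake images in [0, 1]
y = torch.randint(0, 10, (8,))
x_adv = fgsm(x, y)                           # small perturbation, often misclassified
```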
Tuesday, May 21
UTKARSH DARBARI
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Discovery Hall 464
Project: Analyzing and Optimizing Fairness in Spatial Temporal Predictive Privacy Models
Sharing spatial-temporal data, which includes individuals’ locations and movements over time, requires careful privacy considerations to prevent re-identification from unique trajectories. This study addresses this challenge by integrating fairness principles into privacy-preserving models for such data. Existing models prioritize the balance between privacy and utility (predictive accuracy) but often neglect the impact on different user groups. This can lead to potential discrimination against users based on their mobility patterns.
We propose FairMoPAE, a method that incorporates fairness metrics into the Mo-PAE model, a framework known for anonymizing spatial-temporal data. FairMoPAE leverages techniques to evaluate and improve fairness during anonymization. These techniques analyze the entropy difference between original and anonymized trajectories, ensuring a more balanced trade-off between privacy and utility fairness for all users. By incorporating fairness metrics and optimizing hyperparameters, FairMoPAE aims to mitigate potential biases in the Mo-PAE model and contribute to the development of more equitable and socially responsible practices for spatial-temporal data analysis.
Wednesday, May 22
ASHISH NAGAR
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Empowering Mobile Learning in Resource-Constrained Communities: A Technical Exploration of Luna mHealth’s Development Process and Mobile UI Solutions
The rapid proliferation of mobile phones in resource-constrained settings presents a unique opportunity to leverage mobile health (mHealth) applications to improve healthcare accessibility. This research addresses the critical question of how to optimize mobile learning in areas such as the Comarca Ngäbe-Buglé in Panama, where barriers to health care and education are pronounced due to low literacy levels, limited economic access to services, and the scarcity of healthcare facilities. The vision is to create a tailored mobile application framework that can enhance the delivery of health education in these underserved areas, supporting interactive and multimedia content even in offline environments.
The imperative for this study stems from the need to bridge the digital divide and extend health education to remote and marginalized communities, where traditional healthcare delivery models fail to meet local needs. In the Comarca Ngäbe-Buglé, for example, the maternal mortality rate is 58 times higher than the national average, and preventable diseases remain the leading causes of death. This region’s challenges underscore the potential impact of accessible and culturally relevant health education.
My contributions focused on a variety of modern software development techniques, such as using a component-driven architecture and focusing on efficient image file handling and content module parsing. I utilized JSON Schema for structured data interchange and worked on the Luna mobile app’s UI rendering. I followed SOLID design principles to ensure robustness and scalability. Through iterative testing, we have developed a robust foundation for the Luna mHealth framework that will, when complete, empower content creators, regardless of their technical expertise, to develop interactive mobile learning modules. This approach will help democratize the development process and ensure that the modules are adaptable to the unique cultural and learning contexts of the target user base.
PURVA AVINASH PATIL
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Exploring the Dynamics of ‘Show and Discuss’ Sessions: Visualizing Interactive Dialogues and Rapid Feedback Cycles in a Software Engineering Course
The badge challenges in the Software Engineering Studio undergraduate course at the University of Washington Bothell are done in a ‘Show and Discuss’ format that provides a dynamic platform for collaborative learning. The interactive format, in which students show their ongoing work and engage in adaptive and emergent discussions with professors, allows for rapid feedback cycles that are known to be beneficial pedagogically and in other design contexts such as software engineering. Our research aimed to visualize and quantify the iterative and adaptive nature of those sessions to better explore the cognitive work occurring in the ‘Show and Discuss’ sessions. Using Transana software, we established a categorization system of 24 types of dialogs found in these interactions between professors and students. We then used this categorization system to code four video recordings of badge challenges for each of three different badge levels, covering a total of 7 hours and 23 minutes across the twelve sessions. Finally, we created a visualization to help uncover interaction patterns within and across the challenge sessions. In particular, the visualizations make apparent the highly interactive and probing nature of these ‘Show and Discuss’ sessions.
SHAHRUZ MANNAN
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Analysis and Improvement of MASS-based GIS
Geographical Information Systems (GIS) are important in several sectors due to their ability to perform the basic functions needed to capture, manage, analyze, and visualize spatial data. However, the increasing complexity and volume of geospatial data pose a challenge to traditional GIS processing techniques. Therefore, enhanced computational strategies should be investigated to meet demanding requirements for timely and scalable analysis. This project focuses on improving the existing integration of the Multi-Agent Spatial Simulation (MASS) library with GIS, focusing particularly on computational geometry problems used within queries. The work includes a comprehensive analysis of the existing MASS-GIS system identifying its inefficiencies. It proposes strategies for improvement, implementations, and benchmarks of existing and newly identified computational geometry problems, including range search, convex hull, largest empty circle, and Euclidean shortest path, using both the Message Passing Interface (MPI) and MASS for parallel processing, and conducts performance evaluations assessing CPU scalability, spatial scalability, and execution efficiency. The findings reveal that while the MASS implementations enhance the organization and execution of spatial queries, they present challenges in CPU scalability compared to traditional MPI-based systems. Notably, the MASS framework demonstrated substantial improvements in managing the existing computational geometry problems with enhanced CPU and spatial scalability compared to previous implementations in the MASS-GIS system.
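One of the computational geometry kernels named above, the 2-D convex hull, can be sketched on a single machine as below (Andrew's monotone chain); the MASS and MPI versions in the project are distributed and agent-based, which this snippet does not show.

```python
# Andrew's monotone chain convex hull; a single-machine illustration of one
# of the geometry kernels named above, not the distributed implementation.
def convex_hull(points):
    """points: list of (x, y) tuples; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                                   # build lower hull
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):                         # build upper hull
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

print(convex_hull([(0, 0), (2, 0), (1, 1), (2, 2), (0, 2)]))  # corner points only
```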
Thursday, May 23
THOMAS PINKAVA
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Discovery Hall 464
Thesis: Deep Reinforcement Learning for Data-Agnostic Post-Training Debiasing of Black-Box Machine Learning Models
As reliance on Machine Learning systems in real-world decision-making processes grows, ensuring these systems are free of bias against sensitive demographic groups is of increasing importance. Existing techniques for automatically debiasing ML models generally require access to either the models’ internal architectures, the models’ training datasets, or both. In this paper we outline the reasons why such requirements are disadvantageous, and present an alternative novel debiasing system that is both data- and model-agnostic. We implement this system as a Reinforcement Learning Agent and employ it to debias four target ML model architectures over three datasets. Our results show performance comparable to data- and/or model-gnostic state-of-the-art debiasers.
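The abstract does not spell out how the agent measures bias, so the sketch below only illustrates the kind of signal a data- and model-agnostic debiaser can compute: a group-fairness metric derived solely from a black-box model's predictions. The metric shown (statistical parity difference) is one standard choice, not necessarily the one used in the thesis.

```python
# Sketch of a group-fairness signal computable from a black-box model's outputs
# alone (no access to weights or training data): statistical parity difference.
# How the thesis's RL agent actually scores bias is not detailed in the
# abstract; this is only an illustration of a model-agnostic measurement.
import numpy as np

def statistical_parity_difference(y_pred, sensitive):
    """P(y_hat=1 | group A) - P(y_hat=1 | group B); 0 means parity."""
    y_pred = np.asarray(y_pred)
    sensitive = np.asarray(sensitive)
    rate_a = y_pred[sensitive == 1].mean()
    rate_b = y_pred[sensitive == 0].mean()
    return rate_a - rate_b

# Example: query the opaque model, then score its outputs for bias.
y_hat = np.array([1, 0, 1, 1, 0, 0, 1, 0])    # black-box predictions
groups = np.array([1, 1, 1, 1, 0, 0, 0, 0])   # sensitive attribute per sample
print(statistical_parity_difference(y_hat, groups))
```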
WARREN LIU
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Programmability and Performance Enhancement of MASS CUDA
Agent-based modeling (ABM) has proven valuable across various fields for capturing the intricacies and heterogeneity of real-world systems. However, as ABM simulations become more sophisticated and larger in scale, the need for efficient parallelization arises. Graphics Processing Units (GPUs) have emerged as a compelling alternative for parallelizing ABM simulations, offering high computational power and parallelism. In this project, we aimed to enhance the MASS CUDA library, a GPU-accelerated ABM framework, by improving its programmability and performance. We implemented essential agent functions, redesigned data structures to enable coalesced memory access, and introduced a dynamic attribute setting mechanism. These enhancements led to significant improvements in programmability and performance, as demonstrated through benchmarking against the previous version of MASS CUDA and a competing library, FLAME GPU 2, using five diverse applications. The evaluation showcased MASS CUDA’s effectiveness in terms of programmability, performance, and scalability. The improved programmability and performance of MASS CUDA enable users to focus on the modeling aspects of their simulations while harnessing the computational capabilities of GPUs. By offering a scalable and accessible framework for GPU-accelerated ABM, MASS CUDA has the potential to accelerate scientific discovery and decision-making processes in numerous fields.
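MASS CUDA itself is a C++/CUDA library; as a purely conceptual illustration of the data-structure redesign mentioned above, the Python/NumPy sketch below contrasts an array-of-structures layout with the structure-of-arrays layout that makes coalesced GPU memory access possible. It is an analogy, not the library's actual code.

```python
# Conceptual illustration (NumPy, not the C++/CUDA library itself) of
# array-of-structures vs. structure-of-arrays layouts behind "coalesced memory
# access": when each agent attribute lives in its own contiguous array, threads
# processing neighboring agents touch neighboring memory addresses.
import numpy as np

n_agents = 1_000

# Array of structures: one record per agent; attributes interleaved in memory.
aos = np.zeros(n_agents, dtype=[("x", "f4"), ("y", "f4"), ("energy", "f4")])

# Structure of arrays: one contiguous array per attribute.
soa = {
    "x": np.zeros(n_agents, dtype=np.float32),
    "y": np.zeros(n_agents, dtype=np.float32),
    "energy": np.zeros(n_agents, dtype=np.float32),
}

# Updating a single attribute for all agents streams through contiguous memory
# in the SoA layout, which is the access pattern GPUs coalesce efficiently.
soa["energy"] += 0.5
aos["energy"] += 0.5  # same result, but strided access over interleaved records
```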
MICHELLE DEA
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Discovery Hall 464
Project: An Agent-Based Graph Database Benchmarking Program
Graphs can be used to represent complex relationships between different entities, stored as edges and nodes in a graph database. This representation makes it easier and more flexible to query connected data items and identify insights from those relationships. Two common graph database systems are Neo4j and ArangoDB. These systems store graph data differently, as Neo4j is a native graph database and ArangoDB is a multi-model database. Both database systems rely on storing data on disk, which can be slow for data retrieval when the data is not in memory. A graph database using the Multi-Agent Spatial Simulation (MASS) library is being implemented to pursue CPU and spatial scalability by leveraging distributed memory to store graph data. This project aims to provide a benchmarking protocol for performance testing of MASS compared to Neo4j and ArangoDB. This will identify the current strengths and weaknesses of MASS and provide a standard benchmarking tool for future researchers to use. The work includes a random graph generator, built on data pulled from real-world applications, that takes user input regarding the topology and size of the graph to be generated; a standard set of queries in both Cypher and AQL; a manual for testing in Neo4j; and a script for testing in ArangoDB. The graph sizes used in testing are 1K, 10K, 20K, and 30K nodes for spatial scalability evaluation via graph traversal. CPU scalability of MASS is evaluated on a cluster of eight computing nodes using a 10K-node graph.
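The sketch below illustrates the two benchmarking ingredients described above in miniature: a parameterized random graph generator (here via networkx) and a traversal query expressed for each system. The graph labels, parameters, and query shapes are illustrative assumptions, not the project's actual benchmark suite.

```python
# Sketch of the benchmarking ingredients described above: a parameterized
# random graph generator plus equivalent traversal queries in Cypher (Neo4j)
# and AQL (ArangoDB). Sizes, labels, and query shapes are illustrative.
import networkx as nx

def generate_graph(num_nodes=10_000, avg_degree=4, seed=42):
    """Random graph whose size and density are user parameters."""
    num_edges = num_nodes * avg_degree // 2
    return nx.gnm_random_graph(num_nodes, num_edges, seed=seed)

g = generate_graph(1_000)
print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")

# Equivalent traversal from a start node, expressed for each system.
cypher = """
MATCH (start:Node {id: $startId})-[*1..3]->(reached:Node)
RETURN count(DISTINCT reached)
"""
aql = """
FOR v IN 1..3 OUTBOUND @startId GRAPH 'benchmarkGraph'
  COLLECT WITH COUNT INTO reached
  RETURN reached
"""
```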
MINGJUN MA
Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Optimizing Continuity of Applications with Parallel Machine Learning Models During Edge Server Handovers
The development of deep learning has continually benefited various fields, such as autonomous driving and real-time video applications. As deep learning models become increasingly complex, applications composed of deep learning models often require substantial computational resources. These computing resources can be allocated within an edge computing network. However, due to the mobility of end devices, handovers can occur when an end device moves from one signal zone to another. This transition can interrupt the inference process of deep learning applications, leading to temporary service disruptions. Frequent handovers can increase the latency of services, affecting the overall user experience. This issue is critical for real-time applications, where timely and accurate data is essential for making immediate decisions. To improve inference quality during handover, we designed a solution and a corresponding prototype system to address this challenge. Our objective is to optimize the scheduling algorithm for both non-handover and handover scenarios. For non-handover scenarios, we optimized the system by reducing inference times. During handovers, we focus on maximizing the benefit of inference. We evaluated our design against the greedy solution and found that our approach saves more inference time and yields greater benefit than the greedy solution. The results demonstrate that our solution improves inference quality in both handover and non-handover scenarios.
Tuesday, May 28
ABDUL-MUIZZ IMTIAZ
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Detecting Toxic Emotes Across Twitch Channels
Twitch is a popular live-streaming platform with a niche language that is very different from traditional English. The language used on Twitch is marked by frequent grammatical errors as well as by the abundant use of emotes, which are emoji-like icons that can be used to express different emotions. This means that someone unfamiliar with the language used on Twitch may not comprehend the content of chat messages.
Pioneering research in the field of natural language processing (NLP) on Twitch proposed different techniques for sentiment analysis of Twitch comments. This project extends that work by extracting toxic emotes in a Twitch channel from chat logs and then using an embedding space of emotes created with Word2Vec to detect toxic emotes in other popular channels.
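The sketch below illustrates the embedding-space idea in miniature with gensim's Word2Vec: train embeddings on tokenized chat messages, then flag emotes that sit close to known-toxic seed emotes. The emote names, tiny corpus, and similarity threshold are placeholders, not the project's actual data or cutoffs.

```python
# Sketch (gensim Word2Vec) of the emote-embedding idea: train embeddings on
# tokenized chat messages, then flag emotes whose vectors lie near known-toxic
# seed emotes. Emote names and the threshold are placeholders.
from gensim.models import Word2Vec

chat_messages = [
    ["OMEGALUL", "that", "play", "was", "awful"],
    ["KEKW", "he", "choked", "again"],
    ["PogChamp", "insane", "clutch"],
    # ... many more tokenized chat lines from the channel's logs
]

model = Word2Vec(sentences=chat_messages, vector_size=100, window=5,
                 min_count=1, workers=4)

seed_toxic = ["OMEGALUL"]   # emotes already labeled toxic in the source channel
threshold = 0.6             # illustrative cosine-similarity cutoff

for emote in ["KEKW", "PogChamp"]:
    score = max(model.wv.similarity(emote, seed) for seed in seed_toxic)
    if score >= threshold:
        print(f"{emote}: similar to toxic seeds (cosine={score:.2f})")
```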
PARKER FORD
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Real-Time Rendering of Atmospheric Clouds
Rendering realistic clouds is an important aspect of creating believable virtual worlds. The detailed shapes and complex light interactions present in clouds make this a daunting task to complete in a real-time application. Our solution, based on Schneider’s volumetric rendering and noise generation framework for low-altitude cloudscapes, supports increased realism and performance in cloud rendering. For efficient approximations of radiance measurements, we adopt Hillaire’s energy-conserving integration method for light scattering. To simulate the effect of multiple light scattering, we follow Wrenninge’s approach for computing the multi-bounce diffusion of light within a volume. To capture the details of light interreflection off microscopic water droplets, the complex behavior of Mie scattering is approximated with Jendersie and d’Eon’s phase function modeling technique. To capture these details at nominal computational cost, we introduce a temporal anti-aliasing strategy that unifies pixel and volumetric sampling. The pixel area sampling integrates a blue-noise distribution with an n-rooks offset, while volumetric samples follow a stratification strategy, amortizing results over n frames.
The resulting system is capable of rendering scenes consisting of expansive cloudscapes well within real-time requirements, achieving frame rates between 2 and 3 milliseconds on a standard laptop. Users can adjust parameters to control various types of low-altitude cloud formations and weather conditions, with presets available for easily transitioning between settings. Our unique combination of techniques adopted in the volumetric rendering process enhances both efficiency and visual fidelity where the novel approach to volumetric temporal anti-aliasing efficiently and effectively unifies the sampling of pixel areas and volumetric intervals. Looking forward, this technique could be adapted for real-time applications such as video games or flight simulations. Further improvements could refine the cloud modeling system, incorporating procedural generation for high-altitude clouds, thus broadening the range of cloudscapes that can be represented. Additionally, recent work by Schneider has shown the potential for voxel-based cloud modeling. This modeling approach could be paired with our volumetric rendering method to further improve the appearance of the clouds.
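As a toy illustration of the amortized stratified sampling described above (not the project's shader code), the sketch below jitters one sample per stratum along a normalized ray interval and shifts the strata by a per-frame phase, so the union of n consecutive frames covers the interval evenly.

```python
# Toy illustration (NumPy) of amortizing stratified volumetric samples over n
# frames: each frame takes one jittered sample per stratum, offset by a
# per-frame phase. This mirrors the idea described above, not the actual
# real-time implementation.
import numpy as np

def volume_samples(frame_index, num_strata=8, num_frames=4, rng=None):
    """Return normalized sample positions in [0, 1) along a ray for one frame."""
    rng = rng or np.random.default_rng(frame_index)
    strata = (np.arange(num_strata) + rng.random(num_strata)) / num_strata
    phase = (frame_index % num_frames) / (num_frames * num_strata)
    return (strata + phase) % 1.0

for f in range(4):
    print(f"frame {f}:", np.round(volume_samples(f), 3))
```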
GURKIRAT SINGH GULIANI
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: MeTILDA and MultiLinguify for Language Learning
The capstone project focuses on the development and enhancement of language learning applications, including MeTILDA (Melodic Transcription in Language Documentation and Application) and MultiLinguify.
MeTILDA is an ongoing project in our research group. It is a web-based application for endangered language documentation and education. My work focused on enhancing the user experience, resolving major bugs, and adopting Azure DevOps for project management. In addition, significant collaborative effort was invested in developing comprehensive project documentation that provides detailed insights into the development process, feature implementations, and project management strategies.
The capstone also developed MultiLinguify, a cross-platform mobile application built from scratch. MultiLinguify supports the learning of different languages and enables multiple learning modes such as speaking and writing. With the target users being children aged 5-11, the UI has been kept simple and intuitive. To promote self-learning, learners can draw characters, practice their pronunciation, and get real-time feedback. With scalability in mind, the current design framework accommodates future enhancements such as gamification, notifications, and additional regional languages.
AGUSTIN CASTILLO
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.; Discovery Hall 464
Thesis: Evaluating the Effectiveness of the Convolutional LSTM Neural Network for Simulations in Computational Fluid Dynamics
Computational Fluid Dynamics (CFD) is an important part of engineering design, with applications in diverse areas, such as aeronautics, civil engineering, and medicine. Although its practical application is widespread, critical challenges hinder its utilization. These challenges are mainly related to computing resource requirements and long execution times of the fluid flow simulations. Recently, Machine Learning has achieved success in multiple domains, and Deep Learning techniques have been applied to enhance CFD. However, most of these techniques address a small fraction of the CFD process, for instance, approximating parameters for the algorithm to reduce error or focusing only on principal component analysis of the fluid flow. Compared with DL applications in other fields like Computer Vision, this application area is relatively unexplored, and little research has been done on complete end-to-end models to simulate the fluid flow evolution when interacting with an obstacle. This research evaluates the effectiveness of the Convolutional LSTM (ConvLSTM) neural network for Computational Fluid Dynamics simulations when creating Reduced Order Models (ROM) and simulating turbulent fluid flows interacting with an obstacle. We propose a novel end-to-end Artificial Neural Network (ANN) model architecture based entirely on ConvLSTM that can successfully predict the spatiotemporal evolution of a fluid flow. This data-driven approach achieves results similar to a classical CFD method with direct numerical simulation in a quarter of its execution time. The model can potentially accelerate CFD simulations, leading to a faster engineering development process. By providing rapid preliminary results for prototype testing, engineers can explore more design ideas without having to wait days or even weeks for simulation results.
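The sketch below shows the general shape of a ConvLSTM-based next-frame predictor for flow fields in TensorFlow/Keras: a stack of ConvLSTM2D layers maps a short history of simulated frames to the next frame. The grid size, channel count, and layer widths are illustrative assumptions, not the thesis's exact architecture.

```python
# Minimal sketch (TensorFlow/Keras) of a ConvLSTM-based next-frame predictor:
# a short history of flow-field frames in, the predicted next frame out.
# Shapes and layer sizes are illustrative, not the thesis's architecture.
import tensorflow as tf
from tensorflow.keras import layers

T, H, W, C = 10, 64, 64, 2   # time steps, grid height/width, velocity components

model = tf.keras.Sequential([
    tf.keras.Input(shape=(T, H, W, C)),
    layers.ConvLSTM2D(32, kernel_size=3, padding="same", return_sequences=True),
    layers.ConvLSTM2D(32, kernel_size=3, padding="same", return_sequences=False),
    layers.Conv2D(C, kernel_size=3, padding="same"),   # predicted next frame
])
model.compile(optimizer="adam", loss="mse")
model.summary()

# Training would pair sliding windows of simulated frames with the frame that
# follows each window, e.g. model.fit(x_windows, y_next_frames, ...).
```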
Wednesday, May 29
GAGNEET SACHDEVA
Chair: Professor Mark Kochanski
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Strengthening the Teaching Tools Platform through CI/CD deployment
The Software Engineering Studio at the University of Washington Bothell serves as an innovative platform where students engage in live software projects, fostering real-world experience in software development. The “Teaching Tools” website, a product of this studio, is designed as a full-stack solution that significantly enhances collaboration between students and faculty. It focuses on refining grading processes and optimizing the exchange of feedback, directly addressing inefficiencies within the existing Canvas Learning Management System. This capstone project aims to further improve the educational experience at the University by identifying and resolving critical gaps in Canvas, specifically targeting issues that hinder user efficiency and complicate routine tasks.
My aim for this project is to construct a robust Continuous Integration (CI) and Continuous Deployment (CD) pipeline for the Teaching Tools website. This involves automating the build and testing processes and ensuring reliable deployment of the website to Azure cloud services, thereby extending its accessibility to a broader audience. The approach incorporates CI/CD best practices leveraged through the Azure DevOps platform to enhance deployment efficiency. A strategic branching model supports this framework, maintaining a stable main branch for production releases while facilitating ongoing development in feature branches. This pipeline will not only streamline updates and feature integrations but also enable quicker releases, ensuring that enhancements and bug fixes improve user experiences in a timely fashion. By automating tests and deployments, the project reduces manual errors and increases the productivity of the development team. This allows them to focus more on creating innovative features and less on the mechanics of the deployment process. Ultimately, this infrastructure supports a dynamic educational tool, adapting quickly to the evolving needs of educators and students at the University of Washington Bothell, making it an indispensable asset in the educational landscape.
SHAUN STANGLER
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Building the Foundations for Extending Local First Software Techniques with Luna mHealth
The Luna mHealth project aims to bridge the healthcare information divide by developing a framework of content authoring and rendering applications tailored for remote and underserved populations. Our focus lies on addressing challenges in accessing medical educational content in areas with limited technology, connectivity, and literacy levels.
In regions like the Comarca reservation in Panama, standard mobile application systems face hurdles in distribution and functionality due to limited internet connectivity and access to app stores. To mitigate these issues, we employ local-first software techniques, prioritizing data sovereignty and offline functionality. Our approach extends to building the foundations for local peer-to-peer distributed file transfer applications using Bluetooth networking for seamless operation in low-connectivity environments with content distribution and cloud telemetry propagation.
To achieve these goals, we performed a complete redesign of the Luna architecture and built foundational components such as base storage systems, logging and telemetry libraries, the core object model, and presentation parsing. We also built a mobile Bluetooth-based file transfer proof-of-concept application to showcase low-touch, configuration-based device pairing and application-level Bluetooth service and characteristic control using custom native Flutter device plugins.
The Luna project is driven by a commitment to innovation and inclusivity in healthcare technology, aiming to empower remote communities with tailored and localized self-service medical education content. Through partnerships with organizations providing outreach services, Luna seeks to supplement health education in underserved communities, leveraging mobile applications to enhance training content for providers and empower families with self-education opportunities.
Thursday, May 30
SHUBHAM SHANTARAM PATIL
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Discovery Hall 464
Project: Performance Enhanced Drowsiness Detection for Drivers Using Inception V3 and Haar Cascade
This work on a performance-enhanced driver drowsiness detection system aims to protect drivers around the globe who are at risk of falling into sleepy or drowsy states, which severely compromise road safety and contribute to numerous road accidents in America every year. A study published by the NHTSA (National Highway Traffic Safety Administration) of the U.S. Department of Transportation estimates that in 2017, 91,000 police-reported crashes involved drowsy drivers. These crashes led to an estimated 50,000 people injured and nearly 800 deaths. For this capstone project, we propose a drowsiness detection system that helps overcome challenges found in the previous studies mentioned in related works. The implemented system employs hybrid state-of-the-art algorithms, the Inception V3 deep CNN and Haar Cascade, which effectively analyze facial and behavioral positions as regions of interest. For training, we used a large dataset of 100K images of open and closed eyes from different human subjects, captured under various lighting conditions while driving, so that the system could become more accurate at detecting such states. Our system achieved a test accuracy of 92.35%. The model’s performance can also be evaluated by examining its Receiver Operating Characteristic (ROC) curve. The ROC curve generated by our model has an Area Under the Curve (AUC) of 0.70, which indicates that our system performs better than random guessing; AUC values between 0.7 and 0.8 typically represent good performance, although the model may still have limitations in certain contexts.
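The sketch below illustrates the two components named above in a generic form: OpenCV's pretrained Haar cascade to locate eye regions, and an ImageNet-pretrained Inception V3 backbone with a small new head for open/closed eye classification. Input sizes, the placeholder frame, and the head layers are illustrative assumptions, not the project's exact configuration.

```python
# Sketch (Keras + OpenCV) of the two components named above: a Haar cascade to
# locate eye regions and an ImageNet-pretrained Inception V3 backbone
# fine-tuned to classify eye state. Sizes and head layers are illustrative.
import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# 1) Region of interest: OpenCV ships a pretrained Haar cascade for eyes.
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")
frame = np.zeros((480, 640, 3), dtype=np.uint8)       # stand-in for a camera frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# 2) Eye-state classifier: frozen Inception V3 features plus a small new head.
base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                         input_shape=(150, 150, 3))
base.trainable = False
model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),             # open vs. closed eye
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
print(len(eyes), "eye regions detected;", model.count_params(), "parameters")
```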
Friday, May 31
KARAN CHOPRA
Chair: Professor Mark Kochanski
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Applying Software Engineering To Develop Features In The Teaching Tool
Canvas is an all-inclusive learning management system (LMS) that runs on the web and is intended to help with digital learning by giving institutions the ability to efficiently oversee online instruction. It gives teachers the resources they need to design, present, and evaluate online courses, and it gives students the chance to engage in classes, monitor their skill growth, and get feedback on their academic achievement. Canvas is a key platform in the field of digital education, with features designed to meet the various needs of educational institutions and their learning communities. Even with its extensive capability, there is still room for improvement to raise the bar for both teacher and student productivity and learning outcomes.
Seeing this room for growth, this capstone project takes the form of an effort to create a full-stack application that runs in the browser. Through feature enhancements and a comparative study of current functionalities, this project aims to deliver new features that follow strict software engineering principles. The development of multiple stand-alone features for the Teaching Tools program is the main goal of this project. These features include importing and exporting quizzes, refactoring previous code to make it more comprehensible, redesigning the architecture of the application, and working on the UI development of new features. These elements are intended to be smoothly integrated with Canvas, enhancing the University of Washington Bothell’s virtual learning environment. Additionally, acting as a mentor to undergraduate students and fostering communication and collaboration are key components of this initiative. The project intends to give all users a more dynamic and engaging learning experience by directly integrating these advancements into Canvas. In addition, this project is dedicated to applying good software engineering principles, ensuring an efficient design and implementation that places a premium on a clear and captivating user interface. With this project, the hope is to raise the bar for digital learning platforms and create a setting where education and technology meet to improve the educational experience.
WINTER 2024
Friday, January 26
MATTHEW WOERNER
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: Identifying and Addressing the Gap Between How Students and Professionals Read Code
This project investigated and addressed two questions: a) how do students and professional software developers read novel codebases, and b) how can we help students learn to read code better?
Our Spring 2023 study used semi-structured interviews and code reading exercises to identify and quantify several differences in the ways students and professional software developers read novel codebases. Students tended to face more difficulty with these reading tasks than the professionals did, due to an apparent lack of a structured code reading process and an overreliance on unverified assumptions about the code. We focused on three particular anti-patterns. Our interview data also indicated that the lack of a structured code reading process complicates the transition into a professional setting after graduation, requiring new professional software developers to learn these skills on the job.
Based upon the results, we developed a module to teach students a structured way to read code in novel codebases, and to assess their improvement. The module was integrated into the Fall 2023 quarter of CSS 390 (Software Engineering Studio). Students worked their way through a variety of formative exercises leading up to a final summative assessment where they were evaluated on their performance improvement throughout the module as well as how they compared to a prior group of students given a similar assessment in the Spring quarter. Comparing the number of code reading anti-patterns exhibited by both groups, we found that the students who completed the module were significantly less likely to trace into files outside of the code path, were more likely to follow all stack traces in a code reading challenge, and were less likely to make uncorrected misinterpretations about a codebase.
Wednesday, February 21
JARDI A. MARTINEZ JORDAN
Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: Graph-based Modeling and Simulation of Emergency Services Communication Systems
Emergency Services Communication Systems (ESCS) are evolving into Internet Protocol (IP)-based communication networks, promising enhancements to their function, availability, and resilience. This increase in complexity and cyber-attack surface demands a better understanding of these systems’ breakdown dynamics under extreme circumstances. Existing ESCS research largely overlooks simulation, and the little work that exists focuses primarily on specific cybersecurity threats and neglects critical factors such as the non-stationarity of call arrivals. This paper introduces a robust, adaptable graph-based simulation framework and essential mathematical models for ESCS simulation. The framework uses a graph representation of ESCS networks where each vertex is a communicating finite-state machine that exchanges messages along edges and whose behavior is governed by a discrete event queuing model. Call arrival burstiness and its connection to emergency incidents are modeled through a cluster point process. The models’ applicability is demonstrated through simulations of the Seattle Police Department ESCS. Ongoing work is developing GPU implementations of these models and exploring the use of simulations in cybersecurity tabletop exercises.
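The sketch below illustrates one simple form of a cluster point process for bursty call arrivals: parent "incidents" arrive as a homogeneous Poisson process, and each incident spawns a Poisson-distributed cluster of calls at exponentially decaying delays. The rates and delay scales are illustrative; the thesis's fitted parameters and exact model family are not given in the abstract.

```python
# Sketch (NumPy) of a cluster point process for bursty call arrivals: Poisson
# parent incidents, each spawning a Poisson-distributed cluster of calls at
# exponentially decaying delays. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def simulate_call_arrivals(horizon_s=3600, incident_rate=0.01,
                           mean_calls_per_incident=5, mean_delay_s=120):
    """Return sorted call arrival times (seconds) over one simulated hour."""
    n_incidents = rng.poisson(incident_rate * horizon_s)
    incident_times = rng.uniform(0, horizon_s, n_incidents)
    calls = []
    for t0 in incident_times:
        n_calls = rng.poisson(mean_calls_per_incident)
        delays = rng.exponential(mean_delay_s, n_calls)
        calls.extend(t0 + delays)
    return np.sort([t for t in calls if t <= horizon_s])

arrivals = simulate_call_arrivals()
print(f"{len(arrivals)} calls; first few: {np.round(arrivals[:5], 1)}")
```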
Thursday, February 29
MOHAMMED KHALEELUR REHMAN
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Multi-tier System Performance Optimization
The College Affordability Model represents a critical data visualization tool tailored to assess the expenses associated with attending colleges in the United States. This web-based system serves as a valuable resource for policymakers, empowering them with comprehensive insights derived from exploring existing data. The architecture of the tool encompasses three distinct tiers: the database, backend, and frontend, each contributing to its functionality.
The existing College Affordability Model faced performance challenges due to additional processing during plot computations in various scenarios. Consequently, this project aimed at a thorough analysis, understanding, and optimization of the system’s performance. Drawing inspiration from industry case studies on performance optimization, a novel design solution was derived to address the identified issues. The project implemented this solution, incorporating appropriate tools for result analysis. The result is a promising prototype verification system that can be extended to the entire application, ensuring a marked improvement in user interaction responsiveness.
Friday, March 1
CHENGJUN XI
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Web Education Platform for Endangered Languages
Language extinction is a concerning problem in the current world. More than 1500 languages may disappear by the end of this century. Melodic Transcription in Language Documentation and Application (MeTILDA) is a web toolset developed to help document and teach endangered pitch-accent languages. A pitch-accent language is a language in which a change of pitch accent may change the meaning of a word. The MeTILDA system can visualize and document pitch movement using a novel perceptual scale. Nevertheless, its education functionality is primitive, supporting learning only by syllable and word.
The project presented in this paper focuses on extending the education functions of the MeTILDA system. It is a Content Management System (CMS) comprising six sub-systems, namely course, lesson, discussion, assignment, quiz, and grading. It supports four major aspects of language education: listening, speaking, reading, and writing. In addition, it integrates the Pitch Art component of the MeTILDA system to facilitate visualization of pitch movements in audio recordings. Moreover, it inherits the cloud-based architecture of the MeTILDA system, so it can be easily integrated with the existing MeTILDA system to better support language education. With this project, the MeTILDA system will offer complete language education functionality with the unique advantage of automatic pitch visualization.
Monday, March 4
GARY LAM
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Mushroom Identification Application Using Machine Learning
Mushroom foraging is a simple hobby that has few material requirements but demands a great deal of knowledge to perform safely and successfully. Attempting to forage without sufficient experience can lead to serious health consequences or, in rare cases, death. Typically, beginners gain this knowledge by learning from an already experienced guide or consulting extensive field guides. However, new advances in image recognition and deep learning make a new kind of mushroom identification tool possible.
This project presents such a tool, consisting of a backend prediction model and a front-end application for users to interface with. Using existing databases of labeled fungi images that are further filtered and processed, a convolutional neural network is trained using transfer learning and employed as the prediction model. The front-end application allows a user to upload an image and receive the most probable predictions. Unlike existing fungus identification applications, this prediction is accompanied by detailed descriptions of the predicted species’ physical characteristics and other identifying features. In addition, it allows the user to input their own observations of the specimen and highlights matches within the known features of the predicted species. With this process, a burgeoning forager can learn to spot the distinguishing characteristics of certain species. In addition, more experienced users can employ the application to organize their notes and gain a reference to confirm their own identifications. The user inputs describing the image features are also saved, with user permission, and can be used to improve the machine learning model or build a new model capable of recognizing individual features of fungi.
MATTHEW MUNSON
Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Real-Time Evaluation of Simulated Aircraft Instruments Using Machine Vision
While digital displays and glass cockpits have become widespread in modern aircraft, analog instruments remain. These gauges can be challenging to digitize or integrate into automated safety systems. This work investigates the application of machine vision to evaluate aircraft instruments of varying complexity. For ease of acquisition, training data was recorded from a flight simulator and used to train neural networks. The resulting models have high accuracy when evaluating single pointer gauges in lighting conditions similar to the training data set, as well as with entirely different lighting conditions. Performance remains robust even with more complex instruments, such as dual pointer airspeed gauges and attitude indicators, although occasional misinterpretations of gauge pointers occur. Attempts to train models to identify instrument positions from panned and zoomed input video using labeled bounding boxes were not successful as the resulting models had low accuracy. Potential future work on this system includes applying it to real-life aircraft and integration with safety systems, including detection of instrument display failures.
XIANG LI
Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Agile Data Recording Architecture for Complex Scientific Simulations
Simulation development and eScience are driven by the complex questions that scientists and engineers want to answer. A simulation-driven or eScience investigation is an iterative process — as answers are found, new questions are created. Consequently, the development of simulation and eScience software involves rapid iteration, and the data that investigators want to capture from such software frequently changes.
Traditionally, new simulation data characteristics require development of new software modules or modification of existing ones to facilitate the recording of the updated data. This brings two disadvantages. First, scientists and engineers must invest significant time and resources into understanding and addressing data recording nuances with each iteration of their investigation. Second, the procedures developed during each iteration are of limited use in the next. This is particularly problematic in large-scale projects that involve various simulation types, where managing multiple data recording systems becomes a significant overhead. To address these issues, we have developed a flexible and scalable data recording architecture that supports a wide range of simulations and data types. This architecture was realized by redesigning the data recording subsystem within the Graphitti simulator, and we assessed the flexibility and reusability of this redesigned system by evaluating the lines of code (LOC) and examining its maintainability. We observed that recording various new variables within existing simulations required no new lines of code in the updated data recording subsystem (a reduction of 100 percent compared to the old one). This result shows that the new architecture significantly reduces development needs for saving and updating simulation data across different simulation projects, as well as for modifying variables within existing simulation models. Additionally, we demonstrate that this approach can record additional data types with minimal changes (2 lines of code), thus broadening its ability to support fundamental data types that were not previously accommodated by the data recording subsystem. Overall, our new lightweight data-recording architecture met our project goal of supporting various simulations without requiring the development of additional software.
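Graphitti itself is a C++ simulator; the Python sketch below is only an analogy of the registration idea behind such a flexible recorder: a simulation variable registers itself with a generic recorder once, and recording a new quantity then requires only that single registration call rather than a new recording module. Class and method names are invented for the illustration.

```python
# Python analogy (Graphitti itself is C++) of a registration-based recorder:
# any variable registers a getter once; the recorder samples and stores all
# registered variables generically, so new quantities need no new modules.
class Recorder:
    def __init__(self):
        self._variables = {}   # name -> zero-argument getter
        self._history = {}     # name -> list of sampled values

    def register(self, name, getter):
        self._variables[name] = getter
        self._history[name] = []

    def capture(self):
        for name, getter in self._variables.items():
            self._history[name].append(getter())

    def dump(self):
        return self._history


class ToySimulation:
    def __init__(self, recorder):
        self.step_count = 0
        self.energy = 100.0
        # Adding a new recorded quantity is a single registration line:
        recorder.register("energy", lambda: self.energy)
        recorder.register("step", lambda: self.step_count)

    def advance(self):
        self.step_count += 1
        self.energy *= 0.99


rec = Recorder()
sim = ToySimulation(rec)
for _ in range(3):
    sim.advance()
    rec.capture()
print(rec.dump())
```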
Wednesday, March 6
NIKHIL CHAUHAN
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: GhostCrowdShare: A Privacy-Preserving Video Sharing Application
GhostCrowdShare introduces a unique approach to secure video sharing. This project implements the client-server architectures described in https://faculty.washington.edu/lagesse/publications/ppvs.pdf to enhance privacy while sharing media online. Its main aim is to allow private video sharing among specific groups based on mutual presence at an event. It ensures confidentiality until co-presence is confirmed, which holds immense value for both personal and professional realms.
Its standout feature is computing similarities between homomorphically encrypted videos, a method grounded in the above research paper. This involves encrypting videos and computing similarity between the encrypted videos. The system is designed to scale both horizontally and vertically. Its components handle user authentication, event creation and joining, video compression and encryption, similarity calculation between encrypted videos, and downloads for eligible parties, all communicating via APIs.
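As an illustration of what "similarity between encrypted videos" can look like, the sketch below uses TenSEAL, a Python library built on Microsoft SEAL (not necessarily the binding used in this project), to encrypt two placeholder video feature vectors under CKKS and compute their dot product without decrypting them.

```python
# Illustrative sketch using TenSEAL (a Python library over Microsoft SEAL, not
# necessarily the binding used in this project): encrypt two video feature
# vectors under CKKS and compute their similarity on ciphertexts.
import tenseal as ts

context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

video_a_features = [0.12, 0.87, 0.33, 0.45]   # placeholder per-video fingerprints
video_b_features = [0.10, 0.90, 0.30, 0.50]

enc_a = ts.ckks_vector(context, video_a_features)
enc_b = ts.ckks_vector(context, video_b_features)

encrypted_similarity = enc_a.dot(enc_b)        # computed on ciphertexts
print(encrypted_similarity.decrypt())          # only the key holder can read it
```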
This project also addresses a major pain point for previous researchers using Microsoft SEAL for homomorphic encryption, who had to either write wrappers on top of the original SEAL library or use py-seal, a limited implementation of the original library.
In essence, GhostCrowdShare paves the way for future secure video sharing, bridging the gap between privacy needs and content sharing in today’s digital age, and it also acts as a framework for any other application with similar requirements.
Thursday, March 7
PAUL JINWOO LEE
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Detecting Nasal, Glottal, and Breathy Tones in Acoustic Speech using Phonetic Features and Machine Learning
Discerning subjective tones in languages with limited speech data presents a considerable challenge. This study utilizes machine learning models to analyze and demonstrate potential trends in nasal, glottal, and breathy tones, aiming for applicability across diverse speakers and languages with limited speech data.
The research leverages larger datasets to uncover patterns in acoustic features related to the tones across an array of word-based speech samples. Categorization of word samples is based on A1-P0 amplitude shift calculations for nasality and harmonic amplitude slopes for glottal and breathy tones. Manual labeling of datasets facilitates the training of machine learning ensembles, which are then tested against both split datasets and external datasets, including those from different languages.
Utilizing a silence-splicing algorithm on the Common Voice’s Development library, we extract a dataset of 7,397 English words for feature calculations related to tone detection. Machine learning models are trained on 70% of the features and tested on the remaining 30%, demonstrating initial accuracy rates of 99.99%, 99.99%, and 96.50% for nasal, glottal, and breathy tones, respectively. Testing against the Common Voice’s Test library of 7,208 English words yielded varying prediction accuracies (68.10%, 94.56%, and 95.38%) for the respective tones. Testing against the limited samples of the endangered Blackfeet language produced prediction accuracies of 55.55%, 100%, and 100% for the respective tones.
While glottal and breathy tones demonstrate more consistent performances across different datasets, nasality calculations exhibit a wide range of standard deviation values possibly due to the subjective nature of this feature. The findings emphasize the nuanced nature of calculating nasality across numerous speakers of even the same language. Challenges persist in precise tone distinctions across different languages, underscoring the ongoing need for refinement and additional audio features.
Friday, March 8
CHLOE MA
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Beyond Current Boundaries: Integrating Deep Learning and AlphaFold for Enhanced Protein Structure Prediction from Low-Resolution Cryo-EM Maps
Constructing atomic models from cryo-electron microscopy (cryo-EM) maps is a crucial yet intricate task in structural biology. While advancements in deep learning, such as convolutional neural networks (CNNs) and graph neural networks (GNNs), have spurred the development of sophisticated map-to-model tools like DeepTracer and ModelAngelo, their efficacy notably diminishes with low-resolution maps beyond 4 Å. To address this shortfall, our research introduces DeepTracer-LowResEnhance, an innovative framework that synergizes a deep learning-enhanced map refinement technique with the power of AlphaFold. This methodology is designed to markedly improve the construction of models from low-resolution cryo-EM maps. DeepTracer-LowResEnhance was rigorously tested on a set of 37 protein cryo-EM maps, with resolutions ranging between 2.5 to 8.4 Å, including 22 maps with resolutions lower than 4 Å. The outcomes were compelling, demonstrating that 95.5% of the low-resolution maps exhibited a significant uptick in the count of accurately predicted residues. This denotes a pronounced improvement in atomic model building for low-resolution maps. Additionally, a comparative analysis alongside Phenix’s auto-sharpening functionality delineates DeepTracer-LowResEnhance’s superior capability in rendering more detailed and precise atomic models, thereby pushing the boundaries of current computational structural biology methodologies.
AUTUMN 2023
Thursday, November 30
Arun Sarma
Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Biosignal Based Side Channel Attack to Infer Android Pattern Lock Using Deep Learning
The growing popularity of wearable Internet of Things (IoT) devices has led to significant security and privacy concerns. The health data that these devices collect can be used to infer private and sensitive user information via side channel attacks. This is especially true for users of Brain-Computer Interfaces (BCIs), which measure brain activity via Electroencephalography (EEG) signals, and sometimes muscle movement via Electromyography (EMG) signals, in Human-Computer Interaction (HCI) applications. Studies show attacks have been constructed to infer various sensitive information such as PINs and passwords from BCI users’ biosignal data. However, to our knowledge, no side channel attacks have been demonstrated on popular alternative authentication methods such as Android Pattern Lock. Existing research shows that Android Pattern Lock can be cracked using video-based and acoustic side channel attacks. However, these attacks require direct observation of the victim or access to the victim’s smartphone sensors, which can be locked behind strict device permissions. Motivated by the vulnerabilities of consumer-grade IoT devices that record biosignal data, we propose a novel side channel attack where recorded EEG and EMG signals are analyzed using deep learning techniques to infer a victim’s Android Pattern Lock. Our experiment results show that our side channel attack detects when a user is unlocking their phone via Pattern Lock with 98.97% accuracy and infers the drawn pattern with 99.97% accuracy. The general swipe directions of the user’s finger drawing the unlock pattern are inferred with 93.64% accuracy.
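The abstract does not describe the network architecture, so the sketch below only shows the general shape of a deep-learning classifier over fixed-length EEG/EMG windows: a small 1D CNN mapping each window to a pattern class. The window length, channel count, and number of pattern classes are placeholders.

```python
# Generic sketch (Keras) of a deep-learning classifier over fixed-length
# EEG/EMG windows; the project's actual architecture and label space are not
# specified in the abstract, so all sizes here are placeholders.
import tensorflow as tf
from tensorflow.keras import layers

window_len, n_channels, n_patterns = 256, 8, 12   # illustrative dimensions

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window_len, n_channels)),
    layers.Conv1D(32, kernel_size=7, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(n_patterns, activation="softmax"),   # one class per unlock pattern
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```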
Thursday, December 7
CHENGCHENG YANG
Chair: Dr. Annuska Zolyomi
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: User-Centered Redesign of Pediatric Online Portal
Clinicians, educators, and families need access to trustworthy information about developmental-behavioral pediatrics (DBP); however, credible online information is limited due to the scarcity of medical professionals specializing in this field. To address this issue, a pediatrician with extensive knowledge of DBP created an online portal specifically designed to provide clinicians, educators, and parents with access to DBP-related resources. This project focuses on redesigning the portal to effectively cater to the diverse needs and values of multiple stakeholders, each possessing unique knowledge and information requirements regarding developmental and behavioral disabilities. Through the implementation of user-centered design principles, the portal was enhanced to promote trust and alleviate anxiety, particularly among families with children facing developmental and behavioral challenges.
This project established guidelines for redesigning a health portal based on a comprehensive literature review and competitive analysis. Key design guidelines included conveying the organization’s ‘real-world’ aspect, ensuring clarity of purpose, user-friendly navigation, consistent layout and color scheme, enhanced visual appeal with bright images, effective use of fonts and colors for conveying information, and ensuring accessibility for dyslexia and color blindness. Additionally, a thorough understanding of user needs was gained by analyzing key stakeholders, developing personas, crafting scenarios, constructing journey maps, and employing information architecture techniques.
To continuously refine the design throughout the process, participatory design, interviews, and usability testing were employed to identify and address emerging design challenges and user requirements. Primary users shared common requirements for user-friendliness and intuitive navigation. However, parents placed greater emphasis on the search bar function, as they were not familiar with DBP terminology and sought a quick and efficient way of locating relevant information. Additionally, parents expressed a strong preference for websites displaying the logo of a well-known hospital, as it instilled a sense of trust and credibility. By iteratively refining design solutions, the project improved the user experience, ultimately ensuring that the final design of the portal effectively met the needs of its real-world diverse user base in a trustworthy manner. The outstanding achievements, including a 300% reduction in search time and task incompleteness, and the significant improvements in System Usability Scale (SUS) scores for both primary users (174%) and the general public (226%), underscore the impactful and transformative nature of the user-centered design approach.
WEN-JUI CHENG
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Facilitating Physical Elements of Fine Arts Education Through Motion-Controlled Simulation
For traditional, in-person academia, students often face learning difficulties due to uncontrollable conditions in the real world. In particular, students who cannot attend classes in person due to sickness, family, or other important obligations lose the opportunity to practice physical skills that cannot be experienced in remote settings but are nevertheless essential to their field of study. An optimal solution should allow students to learn through physical interactions with class elements without being present in a classroom setting. This study provides a starting point by developing a virtual, motion-controlled environment that facilitates the process of drawing or painting with various art tools. Using Virtual Reality (VR), we created a simulation that enabled users to draw or paint using seven unique tools and ten different colors. By measuring our simulation’s usability and immersion through five central metrics (Effectiveness, Efficiency, Satisfaction, Fidelity, and Quality), we discovered that our virtual environment provided an adequate user experience as software but lacked accuracy relative to real-world elements. Given more time in the future, we hope to enhance the immersion and intuitiveness of our simulation to seamlessly translate virtual experiences to the real world.
JESSE LEU
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Exploring Data Preprocessing and Statistical Analysis Strategies for Intracranial Hemorrhage Detection Based on Ultrasound Tissue Pulsatility Imaging
Traumatic Brain Injury (TBI) is a serious health concern, impacting brain function with potential consequences ranging from temporary challenges to severe, life-threatening intracranial hemorrhages. Timely detection is crucial for ensuring prompt and targeted care that leads to improved patient outcomes. While conventional diagnostic methods such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) have constraints such as limited portability, high costs and the need for skilled technicians to gather the data, the investigation into Ultrasound Imaging, specifically Tissue Pulsatility Imaging (TPI), offers a viable alternative as it overcomes the aforementioned issues. However, unlike CT and MRI techniques that capture static images (essentially “snapshots” of tissues), Ultrasound technology gathers a continuous series of measurements over time, akin to a movie with multiple frames. Subsequently, these dynamic measurements must undergo processing to simulate static images resembling those obtained through CT or MRI. Hence, the data collected through ultrasound remains challenging to process and interpret; specifically, it poses difficulties for immediate utilization by Machine Learning (ML) strategies and data analysis methods such as component analysis. Consequently, additional preprocessing steps are essential to extract aspects from this data relevant to our work.
This project focuses on analyzing data collected from experiments involving patients who have suffered TBI, leveraging TPI to examine brain and tissue displacements across cardiac cycles. The primary goal is to study tissue displacement patterns in both healthy and injured brains to try to find features and metrics that can help us differentiate between them using ML and data analysis techniques. It is of particular interest to look at TBI patients who have suffered critical bleeding. The overarching objective of this project is to enhance and automate existing methodologies, and the work consists of two main phases:
In the initial phase, a comprehensive study of the data processing pipeline was conducted to identify bottlenecks and optimization opportunities. Given the multidisciplinary nature of the project, the data is intricate, requiring substantial effort and external knowledge for relevant content extraction. Notably, the project automated the process of downloading data from the cloud drive, organized it into folders by patients, and generated representative data for further analysis. In the subsequent phase, we explored the potential of identifying and analyzing displacement metrics to enhance intracranial hemorrhage detection. This involved the use of statistics, data visualization, and other techniques such as Component Analyses, and ML spatial models. Furthermore, we extended previous research by exploring how the identification of certain displacement values (such as minimum and maximum) can contribute to the identification and differentiation of intracranial hemorrhage.
During the course of this project, we have identified opportunities and limitations for optimizing the data preprocessing pipeline, mainly related to the dependencies and structure of the collected data. From a data analysis perspective, as we built upon prior research, we observed that peak displacement values can be utilized to distinguish between TBI and healthy patient data. These findings lay the groundwork for further investigation and refinement of ML models to enhance the accuracy of intracranial hemorrhage detection.
SUMMER 2023
Monday, July 24
SREJA BABU
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Gardening Mobile Application Powered by Machine Learning and Artificial Intelligence Technologies
Many research studies show that gardening is a helpful hobby for improving physical and mental health. However, it is well known that gardening itself requires a lot of time, effort, and knowledge to be successful. While most of the information needed to be good at gardening is spread across various plant resources online, it takes a lot of effort for gardeners to gather it all in one place for the plants of their interest.
In this study, we researched different models and features to identify a combination of a model and features that provides higher accuracy in plant species identification, thereby offloading some of the human hard work to automation and technology, resulting in green gardens and happy people. We built a plant database cross-platform application based on image identification research and the Generative Pre-trained Transformer (GPT). The application allows users to identify plant species using digital image processing and machine learning techniques and to automatically add the plants into a “Virtual Garden.” For the plant recognition aspect, we are able to identify the plant species with an accuracy of 91.6% using the Support Vector Machine (SVM) model. The SVM model plugs directly into our backend REST server. The key advantage of our backend implementation is the possibility of swapping our current classifier model with an improved model at any time, thereby keeping the option to improve our accuracy with support for more plants. Currently, our application is capable of identifying 32 different plant species.
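The sketch below shows the general scikit-learn pipeline for an SVM species classifier of the kind described above, trained on precomputed image feature vectors. The project's actual feature extraction and 32-species dataset are not reproduced; random data stands in so the pipeline runs end to end.

```python
# Sketch (scikit-learn) of an SVM species classifier over precomputed image
# feature vectors. Random placeholder data stands in for the project's actual
# features and 32-species dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(640, 128))          # placeholder feature vectors per image
y = rng.integers(0, 32, size=640)        # placeholder labels for 32 species

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```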
Once the user adds plants to the Virtual Garden, our application fetches detailed information about each plant from the plant database that we built. This information comes from our automatically populated plant database built on top of Generative Pre-trained Transformer Large Language Models (LLMs). The GPT-3.5-Turbo LLM has been integrated with our backend application, and a continuous process (DB populator) updates our internal plant database. The DB populator, the research on models and features, and the questionnaire prompt set are the key contributions of our research. The DB populator uniquely enables our plant database to scale to new plants and species when more data becomes available later. We leverage the fine-grained gardening information obtained with the help of GPTs to notify users when to water or replenish the soil, along with other useful information. These notifications offload much of the planning that a gardener typically has to perform, converting it into a set of simple instructions that are easy to follow.
The uniqueness of our application compared to other applications is that it leverages GPT LLMs to automate the process of information gathering, as opposed to relying heavily on user contributions and plant experts as existing applications do. Based on our research, our application uses an SVM model with a unique set of features that achieves higher accuracy than the other feature sets evaluated. The results from the usability study showed that this application significantly reduced the work and time needed to gather all the information. It also provided crisp information compared to information gathered manually. For future work, we intend to add more social features and to measure whether the app helps users stick with gardening longer than they would without it. Overall, this application serves as a very useful tool that facilitates an enjoyable gardening experience for gardeners.
Thursday, July 27
KANIKA SARASWAT
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Quickcheck Application for Mobile Platforms: Architectural and Release Planning
According to the Visual Health Initiative (VHI) of the Centers for Disease Control and Prevention (CDC), 6.8% of children under the age of 18 have been diagnosed with visual impairment. It is estimated that 60% of children with learning disabilities actually have undiagnosed vision problems, and 80% of learning is visual. It is crucial to identify and diagnose any potential vision impairments and eye illnesses as early as possible, since young children cannot self-report their vision impairment. When children struggle with visual tasks, they may not be able to associate their difficulty with an issue with their eyes. Adults often exacerbate the issue by misconstruing the symptom as a learning disability. A quick examination termed a vision screening, commonly known as an eye test, searches for suspected vision issues and eye conditions.
QuickCheck is a mobile application that enables school nurses to screen students for suspected vision health concerns. It is not intended to replace an eye exam, but rather to provide students with a “quick” check-up so that children who show signs of vision issues can be referred to professionals for the necessary eye exams.
The application underwent comprehensive testing using specialized tools, including UI automation tools and load testing tools such as Apache JMeter. This rigorous testing process facilitated an in-depth evaluation of the app’s performance, user interface, and responsiveness, ensuring its reliability and efficiency in various usage scenarios. Based on a series of rigorous test runs using these criteria, it was determined that the application has reached a state of readiness suitable for initial launch so that we may proceed with clinical testing.
Friday, July 28
MARY EYVAZI
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Efficient Receipt Understanding Using Model Compression Techniques For a Cost-Sharing Application
In today’s fast-paced and interconnected business world, managing expenses and processing receipts efficiently is a pressing concern, not only for organizations but also for individuals navigating personal and shared expense systems. The traditional manual approach to receipt processing is laborious and error-prone, resulting in inefficiencies and potential financial discrepancies that can adversely affect both individuals and businesses alike.
This project seeks to address these challenges by focusing on the development of a cost-effective solution to automate receipt processing, with a specific emphasis on cost-sharing systems. We also shed light on the complexities associated with managing shared expenses and identify the limitations of existing deep learning models, particularly concerning their high computational and storage demands.
To overcome these obstacles, we propose the use of two model compression techniques: knowledge distillation and model quantization. By leveraging these methods, we aim to significantly reduce the size and computational requirements of the visual document models used in receipt processing. Through rigorous evaluation, we assess the effectiveness of these techniques, ensuring that the resulting solution maintains high accuracy and performance standards.
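The report does not give implementation details, but as a rough sketch of the two techniques named above, the following PyTorch fragment shows one distillation step from a hypothetical teacher to a smaller student, followed by post-training dynamic quantization; the model sizes and loss weights are illustrative assumptions.
# Sketch: knowledge distillation step plus dynamic quantization (illustrative, not the project's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))  # large model (hypothetical)
student = nn.Sequential(nn.Linear(512, 64), nn.ReLU(), nn.Linear(64, 10))    # compact model (hypothetical)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # softmax temperature for distillation

x = torch.randn(32, 512)              # stand-in for receipt/document features
labels = torch.randint(0, 10, (32,))  # stand-in field labels

with torch.no_grad():
    teacher_logits = teacher(x)
student_logits = student(x)

# Distillation loss: soft targets from the teacher plus the usual hard-label loss.
soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                F.softmax(teacher_logits / T, dim=1), reduction="batchmean") * (T * T)
hard = F.cross_entropy(student_logits, labels)
optimizer.zero_grad()
(0.7 * soft + 0.3 * hard).backward()
optimizer.step()

# Post-training dynamic quantization shrinks the distilled student's linear layers to int8.
quantized_student = torch.quantization.quantize_dynamic(student, {nn.Linear}, dtype=torch.qint8)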
To demonstrate the practicality and efficiency of our proposed solution, we develop a cost-sharing application that showcases its seamless integration into real-world scenarios. Our ultimate objective is to democratize receipt processing by providing a more accessible and affordable solution for end-users, empowering both individuals and organizations to streamline expense management processes and make informed financial decisions with confidence. With this project, we aspire to foster greater financial transparency and alleviate the burden of manual receipt processing, thus enabling individuals and businesses to thrive in today’s dynamic economic landscape.
Thursday, August 3
CHRISTIAN ROLPH
Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Semaphore: Mobile Application for the Hearing-Impaired Using Peer-To-Peer Connections in an Ad Hoc Network
Few mobile applications exist to help the deaf and hearing impaired communicate, and those that do exist typically rely on the Internet to function. This creates a problem for the deaf community when they want to use their app in a location that has poor or no Internet service. This capstone project aims to develop a mobile application for the deaf that can be used without the Internet. The proposed solution uses Bluetooth Low Energy (BLE) as the underlying network protocol to allow direct peer-to-peer message passing. The CoreBluetooth framework, provided by Apple, serves as the primary interface between the application and BLE functionality. The project builds on this protocol to create an ad hoc mesh network, allowing peers that are not directly within Bluetooth range of one another to communicate. The implementation targets the iOS operating system on mobile devices so that most users can run the application on their smartphones. It allows for real-time translation of speech to text and two-way communication among a network of connected users.
The application was tested in several key areas including transcription accuracy, scalability, usability, and resource efficiency. Transcription testing primarily focused on ensuring that the speech-to-text functionality of the application was of a high enough quality to support everyday conversation. The application depends heavily on the on-device voice-to-text APIs provided by the iOS operating system, which generally performed very well. Scalability testing focused on how well the application could handle multiple users in a single chatroom, and how many chatrooms could be created simultaneously without interfering with one another. Usability testing was conducted as a beta test with real users who were asked to evaluate their experience on a feedback form. Finally, resource efficiency testing focused on evaluating the application’s impact on battery life compared to that of other popular apps.
Overall, this project met its goal to provide a usable offline communication mechanism for the deaf community. It demonstrates that BLE is a reasonable choice as an underlying network protocol for this purpose. This project’s ad hoc network demonstrates potential for applications in other areas including disaster relief, military applications, and Internet of Things devices. Future research and work can build on this project to expand the use of Bluetooth to create such networks.
Monday, August 7
ZUODONG WANG
Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Multiple Vehicle Task Scheduling for Vehicle Based On-Demand Mobile Edge Server
The rapid growth of mobile devices and the increasing demand for real-time data processing have led to the emergence of mobile edge computing (MEC) as a promising solution to address the limitations of traditional cloud computing. MEC leverages the proximity of edge servers to mobile users to provide low-latency and high-bandwidth services. In this context, the efficient dispatch and scheduling of vehicle-based, on-demand mobile edge servers (VOMES) have gained significant attention. This report proposes a vehicle movement and task allocation approach for VOMES.
The objective is to maximize the total operating profit while considering the operational costs and mobility constraints of the VOMES. To achieve this, we develop a mixed-integer linear programming (MILP) formulation that considers various parameters, including the computational capacity of the VOMES, the processing requirements of the tasks, the vehicle mobility patterns, and the operational costs. By formulating the problem as an MILP, we enable the use of optimization techniques to find the optimal task allocation and scheduling solution. To handle the dynamic nature of the VOMES environment, we propose two approaches. In the first approach, an initial schedule is generated based on the current knowledge of the tasks and the VOMES locations. In the second approach, the schedule is updated periodically to adapt to changes in task arrivals and VOMES availability. To facilitate dynamic scheduling, we employ a heuristic algorithm that considers the capacity required by each task, the VOMES mobility patterns, and the proximity of the VOMES to the task locations.
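As an illustration of what such a formulation can look like (not the authors’ exact model, and with hypothetical profits, demands, and capacities), a stripped-down task-to-vehicle assignment MILP can be written with a solver such as PuLP:
# Simplified task-to-vehicle assignment MILP (illustrative only, not the authors' formulation).
import pulp

tasks = ["t1", "t2", "t3"]
vehicles = ["v1", "v2"]
profit = {("t1", "v1"): 8, ("t1", "v2"): 6, ("t2", "v1"): 5,
          ("t2", "v2"): 7, ("t3", "v1"): 4, ("t3", "v2"): 9}   # hypothetical operating profits
demand = {"t1": 3, "t2": 2, "t3": 4}                           # hypothetical task compute requirements
capacity = {"v1": 5, "v2": 6}                                  # hypothetical VOMES capacities

x = pulp.LpVariable.dicts("assign", (tasks, vehicles), cat="Binary")
prob = pulp.LpProblem("vomes_scheduling", pulp.LpMaximize)
prob += pulp.lpSum(profit[t, v] * x[t][v] for t in tasks for v in vehicles)
for t in tasks:      # each task is served by at most one vehicle
    prob += pulp.lpSum(x[t][v] for v in vehicles) <= 1
for v in vehicles:   # per-vehicle capacity constraint
    prob += pulp.lpSum(demand[t] * x[t][v] for t in tasks) <= capacity[v]

prob.solve()
print({(t, v): x[t][v].value() for t in tasks for v in vehicles})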
The proposed approach has been evaluated through extensive simulations using realistic mobility and operation constraints. The results demonstrate that the proposed approach achieves significant improvements in terms of operation profit compared to baseline scheduling strategies.
Wednesday, August 9
RAMI H ABED
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Skurupuru – A Secure, Mobile-First Schoolpooling App
Traffic congestion wastes time and increases emissions. Around schools, it creates hazards for students and staff and clogs arterials for other commuters during drop-off and pick-up times. In response to this problem, the city of Bellevue developed a transportation demand management (TDM) program called SchoolPool. However, Bellevue’s Districtwide Travel Survey reveals that the same proportion of people carpooled in 2022 as in 2017 – 11% – despite 42% of parents expressing interest in carpooling in another 2017 survey. While various approaches to increasing carpooling have been considered over the years, Bellevue schools still lack a viable technical solution to the problem.
We develop Skurupuru (from the Japanese rendering of “school pool”), a secure, mobile-first, featureful, and brandable cross-platform app built on a Firebase backend. Skurupuru primarily aims to facilitate carpooling to and from schools. Skurupuru is designed in response to requirements elicited from city of Bellevue staff, incorporates stakeholder input from school staff and parents, addresses limits in previous technical solutions, and is mindful of findings in carpooling and schoolpooling studies.
Friday, August 11
REMYA MAVILA KIZHAKKEVEETTIL
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Icare – A Virtual Assistant for Mental Health Powered by AI
This presentation describes the design and implementation of a virtual assistant that supports mental health using artificial intelligence and machine learning. Mental-health-related issues are currently increasing among individuals for a variety of reasons, and sharing their feelings with someone who cares plays a major role in resolving these issues. Virtual assistants that can simulate human conversations using artificial intelligence can be used very effectively to communicate with individuals facing such challenges. iCare is an application integrated with a virtual assistant, or chatbot, intended to support individuals suffering from mental-health-related issues. The iCare virtual assistant provides a safe, private, virtual environment for users to share their feelings and receive empathetic responses that improve their mental condition. The virtual assistant relies on machine learning algorithms to formulate responses for users. The bot understands the user’s query and triggers an accurate response as text or speech with the help of natural language processing. iCare takes a different approach from current solutions by combining multiple techniques to provide accurate responses to its users. The project is designed to support a range of users, including those suffering from anxiety or depression and individuals who are unhappy and need help improving how they feel.
HARLEEN KAUR BHAMRAH
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Data Modeler: UWB Web Based Learning Tool for Database Modeling
Object-Role Modeling (ORM) serves as a robust technique for teaching and implementing database and object-oriented design. Its visual representation of real-world entities and emphasis on semantics make it a valuable resource for students and professionals seeking to grasp database design concepts swiftly. Similarly, Logical Data Modeling (LDM) is widely embraced for its ease of learning and adaptability to changes, as it is supported by numerous modeling tools and frameworks. However, many available modeling tools lack comprehensive support for ORM, and Microsoft VisioModeler is incompatible with new operating systems. To address this limitation, the project focuses on developing a web-based application that supports ORM, LDM, and SQL conversion and generation, following software engineering principles, enhancing features, and conducting comparative analysis. The main goal is to implement essential features for building ORM models in the initial phase of the database modeling process, while also diligently examining the system to address bugs and non-functional aspects effectively. Additionally, we prioritize clean code practices, Test-Driven Development (TDD), logging, and exploring and implementing exception-handling enhancements. The project also emphasizes the learnings and decisions made throughout these tasks.
APURVA SHARMA
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Data Modelling Tool: A Tool to Create Database Models and their Automatic Conversion
The tool primarily focuses on ORM (Object Role Modeling) and LDM (Logical Data Model) techniques. It offers students a practical and visual approach to explore various modeling concepts, create ORM models, convert them to LDM representations, and generate SQL scripts. The tool bridges the gap between theoretical concepts and real-world applications, providing students with hands-on experience and a deeper understanding of database design principles. Developed as an initiative by Professor Mark Kochanski, the “Data Modeling Tool” offers an interactive and visual learning environment for students, facilitating their understanding, application, and exploration of various modeling techniques. The project is inspired by VisioModeler 3.1, which supported ORM modeling comprehensively but is no longer available for use.
This paper presents the development of the tool’s features and functionality, along with the implementation of a new architectural design to enhance the forward and reverse engineering of ORM, LDM, and SQL models.
SPRING 2023
Tuesday, May 2
AMITA RAJPUT
Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Checking Security Design Patterns In Source Code
A big challenge for software developers, engineers, and programmers is that the software they write may be subject to attacks by hostile actors. One way to address this problem is to use Security Design Patterns (SDPs), but many software engineers are unaware of these patterns or do not have a proper understanding of them. Our research group has been working on finding existing SDPs in source code, to help software engineers determine whether they are missing any SDPs. My project builds on this by not only finding additional SDPs in source code but also checking whether they are correctly implemented. Through these studies, I will dig into the broader question of whether software developers are unaware of SDPs or know about them but implement them incorrectly. Improvement in this area of research will help programmers identify errors in both existing and new programs quickly and fix vulnerabilities faster and more efficiently. Hundreds of thousands of software engineers and programmers at large technology companies such as Norton, Microsoft, Oracle, and Adobe, writing thousands of lines of source code every day, will benefit from this research. An automated process will help them save hundreds of person-hours every week and redirect that time to more value-adding tasks. It brings higher productivity and efficiency to organizations and provides a more robust defense against outside attacks on an organization’s proprietary data. This helps keep users’ private data safe, which in turn helps organizations retain their credibility and market share among their customers.
Tuesday, May 16
YIFEI YANG
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Agents Visualization and Web GUI Development in MASS Java
Multi-Agent Spatial Simulation (MASS) is an agent-based modeling (ABM) library that supports parallelized simulation over a distributed computing cluster. Places is a multi-dimensional array of elements, each called a Place, which are dynamically allocated within the cluster. Agents is a set of execution instances that can reside on a Place and migrate to any other Place using global indices. The MASS UI consists of two parts: InMASS and MASS-Cytoscape. InMASS allows users to execute commands line by line in an interactive way and provides users with additional features. MASS-Cytoscape enables users to visualize Places and Agents in Cytoscape. However, the current implementation of InMASS modified MASS internals so heavily that it became incompatible with the latest versions of MASS, the current visualization is limited to a single computing node and to agent existence, and the recent MASS does not have a web interface to simplify operations. To address these problems, the goals of this project are: (1) re-engineering the current implementation of InMASS; (2) developing Place visualization for 2D continuous space, binary trees, and quad trees, and improving the current Agent visualization; and (3) designing an all-in-one web GUI for InMASS. To accomplish the first goal, we re-implemented InMASS features, including dynamic loading, checkpointing/rollback, and agent history tracking, and optimized the current codebase. These modifications open the possibility of future expansion of InMASS and allow InMASS to serve all MASS users. For the second goal, the project extended the current Places and Agents visualization to distributed settings with more descriptive Agents information, and optimized the operation logic of the MASS control panel. These additions and optimizations make it easier to use and analyze simulations. Finally, the implementation of the web interface enables users to monitor their clusters and provides a basic framework for future developers to add more practical functions.
POOJA PAL
Chair: Mark Kochanski
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Enhancements in Teaching Tools: An Application to Simplify the Complexities in Course Management
Canvas is a web-based learning management system, or LMS, that allows institutions to manage digital learning, educators to create and present online learning materials and assess student learning, and students to engage in courses and receive feedback about skill development and learning achievement. Canvas offers features specifically designed to meet a variety of institutional, educational, and learning needs. However, Canvas can be improved with new features that increase the productivity of instructors and students using the system. This capstone project is part of a team effort developing a browser-based full-stack application that supports new features, following software engineering principles, performing feature enhancements, and conducting comparative analysis. The project’s focus is to build several independent features in the Teaching Tools application as a first step towards making it a component that can be embedded within Canvas, improving the digital learning experience at the University of Washington Bothell while practicing software engineering principles to ensure efficient system design and a good user experience.
Thursday, May 18
BRANDON VASSION
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Investigating Constrained Objects in AR for Validation of Real-life Models
Augmented Reality (AR) studies the approaches that enhance reality by integrating virtual content into the physical environment in real-time. In the simplest form, virtual objects in the physical environment are stationary, where AR applications serve as powerful tools for visualization. The support of interaction with objects in the environment brings the AR application from being passive for observing the augmented world to one where the user can actively explore. When the interactions follow intuitive physical world constraints, an AR application, or constraint-based AR, can immerse users in a realistic augmented world.
We categorize existing constraint-based AR by the relationship between and interaction of the objects being constrained: virtual objects constrained by virtual objects, physical by virtual, and virtual by physical. This straightforward classification provides insights into the types of and potentials for useful applications. For example, virtual by virtual can describe the pages of a virtual book being constrained where the corresponding interaction would be the flipping of the virtual pages. In contrast, physical by virtual would mean placing a physical coffee cup over the virtual book. Lastly, virtual by physical would be placing and pushing the virtual book on an actual physical desktop. The subtle and yet crucial differences are that in the first case the objects and the interactions could also be carried out in a purely virtual 3D world, that physical by virtual has practical implementation challenges, and that virtual by physical presents an interesting opportunity for immersing and engaging users.
This project investigates using virtual by physical constraint-based AR to validate the functionality and visuals of real-life models. We observe and identify common and representative real-world interaction constraints to include: 1D sliding, 2D planar sliding, hinged rotation, and the potential for combining these constraints. The project examines the functionality, interactability, and integration of these constraints in practical applications, in this case, a home decoration setting. With the results from an initial technology investigation, aiming to achieve accuracy and reliability in interactions, we have chosen marker-based AR through Vuforia with Unity3D. We have derived a systematic workflow for creation and have demonstrated successful integration of virtual objects into the real world with relevant constraints by corresponding physical objects. Our prototype results are various versions of an augmented room with distinct decorative virtual objects that are constrained by relevant physical objects where the interactions are intuitive and integrations essentially seamless. These rooms support multiple constrained objects functioning in the same environment.
Our categorization points to a well-defined AR application domain, virtual by physical, for investigation. The success of the augmented rooms demonstrates the usefulness of this category of constraint-based AR applications in validating functionality and visuals. Lastly and significantly, our formulated workflow for constructing virtual by physical constraint-based AR applications serves as an efficient and effective template for future investigations into this domain.
Monday, May 22
IRENE LALIN WACHIRAWUTTHICHAI
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Multi-Stream, Multi-Modal Dataset Viewer That Supports the Navigational Work of Wide-Field Ethnography Research
Wide-Field Ethnography (WFE) refers to an approach of gathering and analyzing large datasets of videos, audio files, screen captures, photos, and other artifacts related to the intricate intermingling of human subjects with computer systems and the social relationships and collaborations among these entities. WFE datasets are high in volume, containing multiple types of data and multiple data sources capturing the same events or moments of interest. For instance, the BeamCoffer dataset has 6 terabytes of video and audio recordings of software developers at work, videos of their computer screens, and thousands of photographs. The sheer volume of data gathered and its modal diversity make it hard to navigate the dataset to find the moments that are meaningful to the research question, especially if one wants to simultaneously play more than one video or audio file to concurrently see and hear different perspectives of the action unfolding at a particular moment in time. There are currently no tools that offer a reasonable way to navigate a WFE dataset. This project describes a software system built to help researchers navigate large multi-stream, multi-modal datasets effectively and efficiently: the WFE Navigator.
KOROSH MOOSAVI
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Domain-Specific NLP Recommendation System for Live Chat
Twitch.tv is one of the oldest and most popular livestreaming platforms in use today, and a unique culture of emote usage and niche language has developed on it. Emotes are custom-made images or GIFs used in chat with varying degrees of access. Emotes render standard forms of English NLP ineffective, even when using models trained on data from social media posts that include traditional emoji. The largest prior study created a Word2Vec model of the 100 most popular emotes across Twitch for sentiment analysis. This project builds on that work by creating a chat recommender system with a model trained on more recent data. The system finds similar emotes in a new channel for users based on their available emotes, allowing for easier onboarding and moderation in the chat. Users are also recommended new channels based on the usage of emotes in a channel they are already familiar with.
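As a rough sketch of the underlying technique (with made-up chat messages and emote names rather than the project’s dataset), a Word2Vec model over tokenized chat logs can surface similar emotes like this:
# Minimal Word2Vec sketch for finding similar emotes (hypothetical chat data).
from gensim.models import Word2Vec

# Each message is tokenized into words and emote names drawn from chat logs.
messages = [["PogChamp", "that", "play"], ["LUL", "nice", "one"],
            ["PogChamp", "LUL", "insane"], ["Kappa", "sure", "buddy"]]

model = Word2Vec(sentences=messages, vector_size=50, window=3, min_count=1, epochs=20)

# Recommend emotes in a new channel that are closest to ones the user already knows.
print(model.wv.most_similar("PogChamp", topn=3))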
Tuesday, May 23
SAHANA PANDURANGI RAGHAVENDRA
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Agent-Based GIS Queries
A Geographical Information System (GIS) is a vital software tool used across numerous domains to help store, manage, analyze, and visualize geospatial data. One of the core functions of the GIS is its ability to query, enabling scientists and researchers to analyze and discover underlying patterns and associations among various data layers. However, it is extremely time and computationally intensive to process complex spatial GIS queries on a single standalone system sequentially. Therefore, in this capstone project we parallelize GIS queries using an agent-based parallelization framework, Multi-Agent Spatial Simulation (MASS), and further explore the idea of incorporating computational geometry algorithms such as closest pair of points, range search and minimum spanning tree to GIS queries using agent propagation.
The major motivation behind integrating the MASS library with GIS queries stems from previous research comparing MASS with other popular big data streaming tools. That research observed that agent-based computation using MASS yielded competitive performance and intuitive parallelization when introduced into data structures such as graphs. To verify this hypothesis of agents’ superiority, we would now like to utilize MASS Agents in GIS queries, where agents apply computational geometry algorithms to answer GIS queries through propagation over MASS Places spread across different computing nodes.
The significant contributions of this capstone project are to demonstrate GIS queries as a practical application of agent-based data analysis. Further, this project focuses on migrating the previous implementation of the MASS-GIS system from Amazon Web Services (AWS) to the University of Washington Bothell computational cluster consisting of 24 computing nodes to achieve scalability and fine-grained partitioning of the GIS datasets suitable for agent-based parallel GIS queries. Sequential and parallel, attribute and spatial GIS queries are designed and implemented in this project using contextual query language (CQL) modules from GeoTools (an open-source GIS package) and MASS. Additionally, we extend and integrate the previous research on computational geometry algorithms using MASS into GIS queries. Algorithms such as the closest pair of points are incorporated into GIS queries to find the closest cities within a certain distance of a given city. Likewise, range search is used to find all the cities in a given country given the geographical bounds of that country, and the minimum spanning tree is extended to find the shortest path between two points on a map. Lastly, we evaluate the performance of parallel agent-based GIS queries implemented using MASS. The results show that agent-based GIS queries using MASS-CQL and the closest pair of points algorithm are time efficient. Furthermore, MASS-based GIS queries using the computational geometry algorithms of the closest pair of points and range search provide 100% accuracy. However, better optimization techniques need to be applied to improve the performance of agent-based GIS queries using the range search algorithm.
JASKIRAT KAUR
Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Improving the Accuracy of Mapping Vulnerabilities to Security Design Patterns
The increasing incidence of software vulnerabilities poses a significant threat to businesses and individuals worldwide. According to a threat report by Nuspire, 2022 was a record-breaking year for cyber threats, thus making mitigating vulnerabilities even more important. Identifying and mitigating vulnerabilities is challenging due to their complexity and the varied and increasing number of potential security threats that threaten the integrity of the software. Many researchers have proposed methods to identify vulnerabilities. Seyed Mohammad Ghaffarian and Hamid Reza Shahriari used data mining and machine learning techniques to discover vulnerabilities in their paper Software Vulnerability Analysis and Discovery Using Machine-Learning and Data-Mining Techniques: A Survey. Similar to their work, a substantial amount of work has been done on discovering vulnerabilities and doing analysis on them but not much has been done to predict security patterns to mitigate vulnerabilities.
To discover security design patterns for security vulnerabilities, Sayali Kudale developed a project that predicts security patterns using keyword extraction and text similarity techniques. This capstone study extends her work. It proposes techniques and measures different similarity metrics to improve precision, extending the Common Weakness Enumeration (CWE) dataset by including the Open Worldwide Application Security Project (OWASP) Top Ten data in each CWE vulnerability description. We also manually verified the ground truth data using the mitigations described by LeBlanc et al. in the book “24 Deadly Sins of Software Security”. To draw comparisons, we worked with four datasets: 1. the security design document; 2. the Common Weakness Enumeration (CWE) vulnerabilities; 3. the extended dataset, which includes both CWE and OWASP data; and 4. the ground truth data.
To implement this, we applied the keyword extraction technique Rapid Automatic Keyword Extraction (RAKE) to extract keywords from the security pattern and CWE descriptions and map them to each other. Different similarity measures were then applied to compute similarity metrics for the mapping. We selected the measures that gave the best results and tested them again on the two datasets to compare precision. The evaluation results indicate that the extended dataset gives better precision and accuracy.
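A minimal sketch of this general pipeline, assuming the rake_nltk and scikit-learn packages (with NLTK data installed) and using invented description text rather than the study’s datasets, looks roughly like this:
# Sketch of the general approach: RAKE keyword extraction followed by a cosine-similarity score.
from rake_nltk import Rake
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

cwe_text = "Improper neutralization of input during web page generation (cross-site scripting)."      # invented
pattern_text = "The input validator pattern sanitizes and validates all external input before use."   # invented

rake = Rake()
rake.extract_keywords_from_text(cwe_text)
cwe_keywords = " ".join(rake.get_ranked_phrases())
rake.extract_keywords_from_text(pattern_text)
pattern_keywords = " ".join(rake.get_ranked_phrases())

# Similarity between the vulnerability keywords and the security-pattern keywords.
vectors = TfidfVectorizer().fit_transform([cwe_keywords, pattern_keywords])
print(cosine_similarity(vectors[0], vectors[1]))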
CONNOR BENJAMIN BROWNE
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Evaluating the Effectiveness of Preprocessing Methods on Motor Imagery Classification Accuracy in EEG Data
Classification of motor imagery tasks is of significant interest in brain-computer interfacing today. Electroencephalograph (EEG) data contains a large amount of noise that obfuscates the signal associated with these motor imagery tasks. Various preprocessing techniques exist to increase the signal-to-noise ratio, allowing for more accurate classification. The effectiveness of these techniques varies between motor imagery tasks and across environments, so there is a need to evaluate them in many different environments and with different motor imagery tasks. This thesis investigates the effectiveness of several preprocessing techniques and classification models for classifying four different motor imagery tasks from EEG data. Specifically, frequency filtering, ICA, and CSP are evaluated using Naive Bayes, kNN, Linear SVM, RBF SVM, LDA, Random Forest, and an MLP neural network.
To control for the environment, data was collected from student volunteers in short sessions designed to demonstrate either eye blinking, eye rolling, jaw clenching, or neck turning, with each task following its own session procedure. Motor imagery tasks in the data were evaluated for frequency and amplitude commonalities using continuous wavelet transforms and Fourier transforms. Preprocessing techniques were then iteratively applied to these datasets and evaluated using an ML model, with accuracy, F1, precision, and recall as the evaluation metrics.
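As a simplified illustration of such a pipeline (with synthetic epoched data, the ICA step omitted for brevity, and parameters that are assumptions rather than the thesis’s settings), band-pass filtering, CSP, and a Naive Bayes classifier can be combined as follows:
# Simplified pipeline sketch: band-pass filtering, CSP spatial filtering, Naive Bayes classification.
import numpy as np
from scipy.signal import butter, filtfilt
from mne.decoding import CSP
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 8, 250))   # 40 epochs, 8 channels, 250 samples (synthetic stand-in data)
y = rng.integers(0, 2, 40)              # two motor imagery classes

# Frequency filtering: keep the 8-30 Hz band often used for motor imagery (fs = 250 Hz assumed).
b, a = butter(4, [8, 30], btype="bandpass", fs=250)
X_filt = filtfilt(b, a, X, axis=-1)

# CSP reduces each epoch to a small set of discriminative spatial components before classification.
clf = make_pipeline(CSP(n_components=4), GaussianNB())
print(cross_val_score(clf, X_filt, y, cv=5, scoring="accuracy").mean())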
Results showed that the combination of Frequency Filtering, ICA, and CSP with the Naive Bayes model yielded the highest accuracy and F1 for all motor imagery tasks. These findings contribute to the field of EEG signal processing and could have potential applications in the development of brain-computer interfaces. It also directly contributes to a greater project in spatial neglect rehabilitation by providing novel insights to common artifacts in EEG data, as well as to the creation of a framework for data processing in real-time and offline.
Wednesday, May 24
ANIRUDH POTTURI
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Programmability and Performance Analysis of Distributed and Agent-Based Frameworks
In big data, the emphasis shifts from the raw textual presentation of data to the structure and spatial distribution of the data. Computational geometry is an area of particular interest for studying the structure and distribution of data. Thus, we propose using agent-based modeling (ABM) libraries for big data to leverage the benefits of parallelization and to support the creation of complex data structures like graphs and trees. ABMs offer a unique and intuitive approach to solving problems by simulating the structural elements over an environment and using agents to break these problems down through swarming, propagation, collisions, and more. For this research, we introduce using Multi-Agent Spatial Simulation (MASS) for big data and compare the programmability and performance of MASS Java against Hadoop MapReduce and Apache Spark. We chose six different applications in computational geometry, implemented using all three frameworks, and conducted a formal analysis of the applications through a comprehensive set of tests. We developed tools to perform code analysis and compute metrics, such as the number of lines of code (LoC) and McCabe’s cyclomatic complexity, to analyze programmability. From a quantitative perspective, in most cases we found that MASS demanded less coding than MapReduce, while Spark required the least. While the cyclomatic complexity of MASS applications was higher in some cases, components of Spark and MapReduce applications were highly cohesive. From a qualitative viewpoint, MASS applications required fine-tuning that resulted in significant improvements, while MapReduce and Spark offered very limited performance enhancement options. The performance of MASS directly correlates with the distribution of the data, unlike MapReduce and Spark, whose performance is not affected by it.
BALAJI RAM MOHAN CHALLAMALLA
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Yeast Analysis Tool
This paper presents an improvement to an application called the Yeast Analysis tool, which was developed by Dr. Lagesse to assist a group of biology researchers in their yeast analysis research project. The researchers aim to understand the cell division and chemical composition of yeast cells using fluorescent proteins. To achieve this, they need to examine microscopic images of yeast cells and measure the distance between their nuclei. However, this is a tedious and error-prone task: they must segment the images manually, input the data, and check the accuracy, and even then they cannot be sure of their outcomes. To provide a better approach, Dr. Lagesse created the Yeast Analysis tool, an automated image analysis method that can perform the task for them. It can segment yeast cells in images and measure the distance between their nuclei with high precision and speed, and it uses deep learning techniques to learn from the data and enhance its performance. It is a valuable tool for the researchers.
However, the Yeast Analysis tool is not perfect. It has large methods, and most of the code is written in a single file, which makes it complicated and obscure. It has bugs that cause errors and crashes, and it has limitations that need refinement, all of which make it hard to use and maintain. This paper focuses on re-architecting and refactoring the project, improving the GUI, and resolving its bugs. We followed best practices such as iterative development, requirements management, and an agile software development model while working on this project.
We propose a plugin-based architecture in which an application is created from a collection of different, reusable components that do not rely on one another but can still be assembled dynamically. This helps extend the functionality of the application without affecting its core structure. Refactoring the code included making the methods modular and removing duplicated code, which improved code readability and increased the application’s performance by 7.4%. Improving the GUI and making the application bug-free makes it easier to use and increases user productivity. In a GUI survey, users said the new GUI is user-friendly and rated it 4.3 out of 5. In conclusion, the Yeast Analysis tool is now more user-friendly, reliable, and efficient. It will help the researchers achieve their goals faster and more easily, and it will help advance science and technology in various fields.
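The following minimal sketch illustrates the plugin-registry idea described above; the class and step names are hypothetical and are not the tool’s actual API.
# Minimal plugin-registry sketch: components register themselves and are assembled at runtime.
class PluginRegistry:
    def __init__(self):
        self._plugins = {}

    def register(self, name):
        def decorator(cls):
            self._plugins[name] = cls
            return cls
        return decorator

    def create(self, name, *args, **kwargs):
        return self._plugins[name](*args, **kwargs)

registry = PluginRegistry()

@registry.register("segmentation")
class SegmentationPlugin:               # hypothetical analysis step
    def run(self, image):
        return f"segmented {image}"

@registry.register("distance")
class NucleiDistancePlugin:             # hypothetical analysis step
    def run(self, image):
        return f"measured nuclei distance in {image}"

# The core application assembles whatever plugins are registered, without hard-coding them.
for step in ("segmentation", "distance"):
    print(registry.create(step).run("cell_image_01.tif"))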
SIDHANT BANSAL
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Refactoring Virtual Reality-Based Orthoptic Toolkit
According to the Centers for Disease Control (CDC) under the Vision Health Initiative, it is noted that approximately 6.8% of children under the age of eighteen years in the United States are diagnosed with vision problems. Vision problems can severely impact a child’s learning.
Strabismus (crossed eyes) is one of the most common eye conditions in children. If left untreated, it can lead to amblyopia, commonly known as lazy eye. To regain binocular vision, a person with strabismus requires training in five levels of fusion skills, each level indicating progression in ability and vision complexity. The existing toolkit uses virtual reality (VR) to provide an environment for individualized, supervised therapy that helps children suffering from strabismus regain binocular vision. The toolkit has four applications that may be useful for improving vision: luster, simultaneous perception, sensory fusion, and motor fusion. Since each of these is currently a separate application, the toolkit does not adhere to its overall non-functional requirements. This project aims to evaluate and provide an architecture that will support the non-functional requirements, i.e., maintainability, portability, and extensibility.
Thursday, May 25
HARSHIT RAJVAIDYA
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: An Agent-based Graph Database
Graph databases are a type of NoSQL database that uses graph structures to store and represent data. Unlike traditional relational databases that use tables and rows, graph databases use nodes and edges to represent relationships between data items, which allows for more flexible and efficient querying of complex, connected data. Graph databases provide the functional capability of querying large numbers of interconnected data schemas, such as social networks and biological networks. In this project, we aim to build a graph database using the MASS (Multi-Agent Spatial Simulation) library, which relies on Places and Agents as its core components. The MASS library already supports a graph data structure (GraphPlaces) that is distributed over a cluster of computing nodes; however, the previous implementation worked only on specific graph types. This project implements graph creation from CSV files to make inputs as generic as possible. We also implement a query-parsing engine that takes OpenCypher queries as input and parses them into method calls on MASS GraphPlaces. On top of that, we have implemented four types of queries (including WHERE-clause, aggregate, and multi-relationship queries) in order to verify the graph database and to perform query benchmarks. Each benchmark measures query latency, graph creation time, and the spatial scalability of all the queries. The performance measurements are performed on a cluster of eight computing nodes, and spatial scalability is measured using a Twitch monthly dataset, which contains more than 7k nodes and more than 20k edges. The results show significant improvements in query latency and spatial scalability as we increase the number of computing nodes.
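A toy illustration of the query-parsing idea (handling only one Cypher-like query form, with hypothetical method names rather than the actual MASS GraphPlaces API) might look like this:
# Toy sketch: translating a simple Cypher-style query into method calls on a graph store (hypothetical API).
import re

def parse_match(query):
    # Handles only the form: MATCH (n:Label) WHERE n.prop = 'value' RETURN n
    m = re.match(r"MATCH \(n:(\w+)\) WHERE n\.(\w+) = '([^']*)' RETURN n", query)
    if not m:
        raise ValueError("unsupported query")
    return {"label": m.group(1), "prop": m.group(2), "value": m.group(3)}

class GraphStore:                                     # stand-in for a distributed set of vertices
    def __init__(self, vertices):
        self.vertices = vertices
    def filter_vertices(self, label, prop, value):    # hypothetical method name
        return [v for v in self.vertices
                if v["label"] == label and v.get(prop) == value]

graph = GraphStore([{"label": "User", "name": "ada"}, {"label": "User", "name": "bob"}])
plan = parse_match("MATCH (n:User) WHERE n.name = 'ada' RETURN n")
print(graph.filter_vertices(plan["label"], plan["prop"], plan["value"]))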
VENKATA RAMANI SRILEKHA BANDARU
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Parallelization of Bio-inspired Computing Algorithms Using MASS JAVA
The exponential growth of big data has posed significant challenges for traditional optimization algorithms in effectively processing and extracting meaningful insights from large-scale datasets. In this context, bio-inspired computing has emerged as a promising approach, drawing inspiration from natural systems and phenomena. By mimicking biological processes such as evolution, swarm behavior, and natural selection, bio-inspired algorithms offer innovative solutions for optimizing data processing, pattern recognition, classification, clustering, and other tasks related to big data analytics.
Parallelizing bio-inspired computing algorithms is crucial for achieving improved performance and scalability: it accelerates the optimization process and enhances the efficiency of solving challenging problems. Multi-Agent Spatial Simulation (MASS) is an agent-based modeling library that has been used extensively to parallelize a variety of simulations and data analysis applications. Building on this foundation, this project explores the advantages of using MASS Java to parallelize computationally complex bio-inspired algorithms.
This project presents the applications of algorithm designs for agent-based versions of Swarm Based Computation, Evolutionary Computation and Ecological Computation Algorithms. In addition to the designs of the algorithms, we present an analysis of programmability and performance comparing MASS Java to another agent based modelling framework named Repast Simphony.
FIONA VICTORIA STANLEY JOTHIRAJ
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Phoenix: Federated Learning for Generative Diffusion Model
Generative AI has made impressive strides in enabling users to create diverse and realistic visual content such as images, videos, and audio. However, training generative models on large centralized datasets can pose challenges in terms of data privacy, security, and accessibility. Federated learning is an approach that uses decentralized techniques to collaboratively train a shared deep learning model while retaining the training data on individual edge devices to preserve data privacy. This paper proposes a novel method for training a Denoising Diffusion Probabilistic Model (DDPM) across multiple data sources using federated learning techniques. Diffusion models, a newly emerging generative model, show promising results in achieving superior quality images than Generative Adversarial Networks (GANs). Our proposed method Phoenix is an unconditional diffusion model that leverages strategies to improve the data diversity of generated samples even when trained on data with statistical heterogeneity (Non-IID data). We demonstrate how our approach outperforms the default diffusion model in a federated learning setting. These results are indicative that high-quality samples can be generated by maintaining data diversity, preserving privacy, and reducing communication between data sources, offering exciting new possibilities in the field of generative AI.
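As background, the federated averaging step that underlies approaches like this can be sketched as follows; this is a generic FedAvg illustration with hypothetical client weights, not Phoenix’s actual aggregation strategy.
# Illustrative federated averaging (FedAvg) over per-client model weights (not Phoenix itself).
import numpy as np

def fed_avg(client_weights, client_sizes):
    # Weighted average of per-client parameter arrays, weighted by local dataset size.
    total = sum(client_sizes)
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Two hypothetical clients, each holding locally trained weights for a two-layer model.
client_a = [np.ones((4, 4)), np.zeros(4)]
client_b = [np.full((4, 4), 3.0), np.ones(4)]
global_weights = fed_avg([client_a, client_b], client_sizes=[100, 300])
print(global_weights[0][0, 0])   # 0.25 * 1.0 + 0.75 * 3.0 = 2.5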
Friday, May 26
JEFFREY ALEXANDER KYLLO
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Inflorescence: A Framework for Evaluating Fairness with Clustered Federated Learning
Measuring and ensuring machine learning model fairness is especially difficult in federated learning (FL) settings where the model developer is not privy to client data. This project explores how the application of clustered FL strategies, which are designed to handle data distribution skew across federated clients, affects model fairness when the skew is correlated with privileged group labels. The study report presents empirical simulation results quantifying the extent to which clustered FL impacts various group and individual fairness metrics and introduces a Python package called Inflorescence (“a cluster of flowers”) that extends Flower, an open-source FL framework, with several clustered FL strategies from the literature.
PRIANKA BANIK
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Automatic Fake News Detection
With the proliferation of fake news, automatic fake news detection has become an important research area in recent years. However, teaching computers to differentiate between fake and credible news is complex; one of the main challenges is training computers with an abstract understanding of natural language. This project introduces a web framework that is capable of classifying fake and real news, employing three different approaches. The first approach uses a TF-IDF vectorizer and a Multinomial Naive Bayes classifier to identify fake news based on the significance of the words appearing in the news text. The second approach uses a count vectorizer in place of the TF-IDF vectorizer, which emphasizes the frequency of words occurring in the news article. As a third strategy, an LSTM (long short-term memory) neural network is implemented along with a word embedding technique to improve classification accuracy. Experimental results compare these three models with some existing works, and a comparative analysis of multiple fake news detection techniques is presented to demonstrate the effectiveness of the proposed system.
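A minimal sketch of the first approach, using scikit-learn with invented headlines rather than the project’s corpus, is shown below; the second approach would simply swap a CountVectorizer in for the TfidfVectorizer.
# Sketch of the first approach: TF-IDF features with a Multinomial Naive Bayes classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled headlines; a real corpus would contain thousands of articles.
texts = ["scientists confirm water on the moon",
         "celebrity spotted with aliens in secret base",
         "city council approves new transit budget",
         "miracle pill cures every disease overnight"]
labels = [0, 1, 0, 1]   # 0 = credible, 1 = fake

model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["new study finds miracle cure hidden by doctors"]))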
Tuesday, May 30
SANJAY VARMA PENMETSA
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Collaborative Rhythm Analysis For Endangered Languages
Nearly 40% of the 7,000 languages in the world are expected to become extinct if no efforts are made to preserve them. To preserve an indigenous language and the heritage and culture associated with it, there is a significant need to analyze and document the language. Blackfoot is an endangered language with approximately 3,000 native speakers left, used in the regions of Alberta in Canada and Montana in the U.S.A. As a pitch accent language, Blackfoot conveys word meaning through pitch patterns in addition to the spelling of words, which makes the language especially difficult to learn and teach. To address this need, MeTILDA (Melodic Transcription in Language Documentation and Application) was developed by Dr. Min Chen’s research group in collaboration with researchers at the University of Montana. It is a cloud-based platform that supports the documentation and analysis of the Blackfoot language.
The primary goal of this capstone project is to enhance collaboration and data reuse on the MeTILDA platform. To achieve this goal, we implemented several key features designed to improve user engagement and increase the overall efficiency of the platform. First, to improve collaboration, our project allows multiple users to work together to create a Pitch Art on the Create page. Second, we introduced enhancements to the file system, including the ability to share files with different levels of access on the My Files page. Finally, to improve data reusability, we made significant improvements to the way Pitch Arts are saved to the Collections page; specifically, users can now modify saved Pitch Arts.
To ensure the quality of our implementation, we conducted extensive unit and load testing to identify any bugs or performance issues that could impact user experience. Additionally, we conducted a usability study with a diverse group of participants to evaluate the effectiveness of the new features. The results of the study indicate that our improvements streamline the workflow and improve overall productivity on the MeTILDA platform. Furthermore, we published a paper at ACM ICMR 2023 with details for replicating and evaluating several main MeTILDA functions. Given the urgency of endangered language research, our ICMR paper helps share resources and knowledge among interested individuals in academic and local communities, and enables the operation, customization, and extension of our toolsets.
MEGANA REDDY BODDAM
Chair: Dr. Wooyoung Kim
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Interpretation of a Residual Neural Network and an Inception Convolutional Neural Network Through Concept Whitening Layers
Deep learning models are difficult to interpret because they are complex, non-linear, high-dimensional algorithms. This paper’s goal is to contribute to interpreting one of these deep learning models: convolutional neural networks. Interpretive analysis is performed in the context of predicting Hepatocellular Carcinoma (HCC), the most common type of primary liver cancer, from liver tissue histopathology images. The convolutional neural network models analyzed are a 50-layer residual neural network and an Inception convolutional network. The results from the predictive training and testing of the models show that model accuracy remains the same regardless of whether the interpretive training technique of concept whitening layers is added. Additionally, the results show greater interpretive power with concept whitening layers added to the model through post-hoc analysis methods, specifically inter-concept similarity rating, intra-concept similarity rating, concept importance rating, and feature vector displays.
TYLER CHOI
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Enhancing Search and Rescue Operations: A Pragmatic Application of User-Centered Development
This capstone paper investigates the development of a software solution tailored for search and rescue (SAR) operations, with a particular emphasis on evaluating the implementation and effectiveness of user-centered development (UCD) principles. Initially, the project aimed to create a Virtual Reality (VR) Interactive Topographical Mapping System. This phase resulted in the research and development of a sophisticated VR prototype, incorporating a comprehensive suite of features that facilitated live, interactive topographical mapping within a 3D virtual environment.
The objectives of UCD involve placing users’ needs and requirements at the forefront of the design process, ensuring that solutions not only possess technical prowess but also deliver value and impact for the target audience. However, despite the numerous technical accomplishments of the VR project, end-user feedback from stakeholders, such as forest firefighters, revealed the necessity for a solution that better aligned with their real-world requirements. These users required direct observation of ground and vegetation conditions to make informed decisions about mission trajectories, a capability unattainable with the VR application. This insight led to a pivotal shift in our approach, redirecting the project towards the development of a targeted desktop application explicitly designed to address the operational needs of Search and Rescue (SAR) personnel.
The resulting product is a desktop application accessible through both a Graphical User Interface (GUI) and a Command Line Interface (CLI), with development centered on continuous end-user engagement and feedback. This solution offers two distinct interfaces catering to different end-users, prioritizing a concise UI and output while avoiding unnecessary complexity and irrelevant details.
In evaluating the implementation of UCD, the project demonstrates that adopting a user-centric approach can enhance the efficiency and effectiveness of SAR operations, emphasizing users’ preference for utility over visual and graphical elements. Furthermore, the project’s evolution from a cutting-edge VR system to a specialized desktop application provides insights into the broader fields of computer science and emergency response.
In future work, this report investigates potential enhancements, illustrating a sustained commitment to continuous improvement and alignment with user requirements. The accomplishments of the VR project, despite the pivot, attest to the importance of innovation and exploration in software development. Additionally, the project underscores the vital role of UCD in crafting solutions that combine technical utility with a focus on addressing real-world challenges.
Wednesday, May 31
RAGHAV NASWA
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Conversational AI to Deliver Family Intervention Training for Mental Health Caregivers
Mental health issues are prevalent in the United States, affecting 22.8% of adults in 2022. Unfortunately, a significant proportion (55.6%) of these adults did not receive treatment. Effective communication characterized by empathy is essential for enhancing the well-being of individuals with mental health issues. Family intervention training can empower friends and family members to provide in-home treatment to mental health patients. However, many caregivers lack the necessary training to engage with patients in a compassionate and understanding manner. To address this issue, a conversational AI chatbot was developed to train caregivers in empathetic communication. The chatbot engages in interactive conversations with caregivers and offers guidance on compassionate and empathetic communication. The chatbot was designed to be interactive, user-friendly, and accessible to caregivers. Our study demonstrates that conversational AI can serve as a valuable tool for training caregivers, leading to improved patient outcomes through enhanced communication skills.
FAHMEEDHA APPARAWTHAR AZMATHULLAH
Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Design of Energy-Efficient Offloading Schemes for Opportunistic Vehicular Edge Computing
In edge computing, computation tasks are offloaded to edge servers, which can be either stationary or mobile. Stationary Edge Servers (SES) are usually located at the edge of a network, such as at a cellular tower, and can provide computing resources for nearby users or devices with low communication latency. When more devices connect to stationary edge servers, they cannot easily scale up due to their limited design capacity, resulting in degraded performance. In contrast, Vehicular Edge Servers (VES) are a type of mobile edge computing server usually deployed on vehicles. VES can provide low-latency computing services by bringing computing resources even closer to the users with greater flexibility, and they overcome the drawback of stationary edge servers by providing services in areas that stationary edge servers may be unable to reach. These benefits ideally satisfy the performance requirements of latency-sensitive but computing-intensive mobile applications such as pervasive AI, augmented reality, and virtual reality. When designing computation offloading strategies for vehicular edge computing systems supported by opportunistic VES, one challenge is handling the tradeoff between the time-varying availability of VES resources and the limited energy of mobile devices. In this project, we formulated an optimization problem that considers the VES capacity constraints and mobile users’ energy constraints for solving the offloading problem in opportunistic vehicular computing systems. To solve the formulated problem, we designed and implemented a solution using CVXPY, a convex optimization solver, along with three heuristic approaches: greedy, round-robin, and moderate offloading methods. We conducted extensive simulations, and the results demonstrate the effectiveness of the proposed algorithm in improving mobile users’ energy usage while maintaining the expected quality of computing tasks.
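As a simplified sketch of this kind of formulation (a fractional relaxation with hypothetical workloads, energies, and capacities, not the project’s exact problem), CVXPY can express the offloading decision as follows:
# Simplified CVXPY sketch: fractional offloading minimizing device energy under VES capacity limits.
import cvxpy as cp
import numpy as np

n_users, n_ves = 3, 2
task_cycles = np.array([4.0, 6.0, 5.0])        # hypothetical task workloads
local_energy = np.array([8.0, 12.0, 10.0])     # energy to run each task fully on-device (hypothetical)
ves_capacity = np.array([7.0, 6.0])            # hypothetical VES compute capacities

x = cp.Variable((n_users, n_ves), nonneg=True)  # fraction of each task offloaded to each VES
offloaded = cp.sum(x, axis=1)

energy = cp.sum(cp.multiply(local_energy, 1 - offloaded))   # energy spent on the local remainder
constraints = [offloaded <= 1,
               x.T @ task_cycles <= ves_capacity]            # per-VES capacity limit

problem = cp.Problem(cp.Minimize(energy), constraints)
problem.solve()
print(x.value)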
CAMERON KLINE-SHARPE
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Thesis: Technical and Clinical Approaches For Implementing a Vision Screening Tool
Detecting vision problems is a challenging task, especially in children and in underserved or rural communities. This is in part due to the difficulty of obtaining useful indications of vision problems that would cause a child to be sent to an eye doctor. Modern vision screening approaches are either hard to scale, expensive, or limited in applicability. The aim of this thesis was to clinically test QuickCheck, a vision screening mobile application aimed at overcoming these limitations, and to determine future development and testing plans based on the results of those tests. This was accomplished by continuing the development of QuickCheck from past work to a clinically testable state, completing several clinical tests of the application in different settings, analyzing the results of those trials, and determining what future work is needed on the application and in future clinical trials to get the application ready for distribution.
After four clinical tests across two different testing sites, QuickCheck’s performance was measured using testing time, test accuracy, specificity, and sensitivity, and an analysis of error types and causes was also performed. While QuickCheck was able to detect most individuals who had vision problems, this work determined that further testing and development is needed to decrease the false negative error rate, improve testing time, and increase study sample size to ensure that QuickCheck is ready for deployment as a screening tool.
DIVYA KAMATH
Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Migrating a Complex, CPU-GPU Based Simulator to Modern C++ Standards
Software engineering encompasses not just the act of writing code, but also the act of maintaining it. Maintainability can be improved in a number of ways; one such way is updating the codebase to incorporate newer language features. This project focuses on updating Graphitti, a graph-based CPU-GPU simulator, to leverage modern C++ features and idioms. The objectives include enhancing reusability, reducing technical debt, and addressing serialization and deserialization limitations, all while monitoring the performance impact of these changes.
The updated Graphitti codebase demonstrates improved memory management, enhanced reusability, and reduced technical debt, without sacrificing performance. This project has also paved the way for smoother integration of serialization and deserialization for all objects within Graphitti.
Thursday, June 1
NAYANA YESHLUR
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Thesis: Using Data Analysis to Detect Intracranial Hemorrhage Through Ultrasound Tissue Pulsatility Imaging
Traumatic Brain Injury (TBI) is a type of injury that affects how the brain functions. TBI can lead to short-term problems or to more severe long-term problems, including various types of intracranial hemorrhage, some of which can result in death. For this reason, detecting intracranial hemorrhages early in patients can help provide faster and more appropriate care, potentially improving patient outcomes. While CT and MRI are the more traditional methods of diagnosing intracranial hemorrhage, they have certain drawbacks that ultrasound imaging can overcome. This work utilizes data collected from experiments on TBI patients using an ultrasound technique known as Tissue Pulsatility Imaging (TPI), specifically data on the displacement of brain and other tissues over the cardiac cycle. The aim of this research is to use such data to understand the differences between healthy brain displacement and the brain displacement of TBI patients with dangerous bleeding in the brain. In addition, we explore if and how identifying the points of maximum and minimum displacement can further aid the identification of intracranial hemorrhage. The identification of these displacement points has emerged as a significant objective in this study, as they hold the potential to uncover crucial distinctions between states of wellness and illness. Furthermore, their utility in future research lies in assessing the consistency of these findings when applied to a broader dataset.
KISHAN NAGENDRA
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Maya: An Open-Source Framework for Creating Educational Mobile Applications for Low-Tech Communities
This capstone project introduces “Maya,” an open-source framework aimed at assisting content creators in developing mobile applications to disseminate educational and awareness-related information to members of low-tech and low-literacy communities with limited or no internet access. The framework automates the transformation of PowerPoint presentations into mobile applications. The framework currently consists of two stages. In the first stage, the content creator feeds a PowerPoint presentation file to a user-friendly executable software system that extracts relevant information from the presentation and generates a single extract folder that contains both: a) a JSON file containing the metadata from the PowerPoint file, and b) a sub-folder with all the media from the PowerPoint file. The metadata includes information such as text, fonts, hyperlinks to pages, and paths to the media in the sub-folder. In the second stage, this extracted folder serves as input to another application that uses that information to create a mobile application that replicates the layout, images, text, and features from the original PowerPoint presentation.
The design for the Maya framework is based on the specifications provided for the Luna mhealth project (Luna), an initiative by Eliana Socha and Jon Socha. Luna aims to develop and deploy a low-tech mobile application to raise awareness about prenatal and postnatal health among the indigenous tribes in the Comarca Ngäbe-Buglé region of Panama. The development of Maya was based on the insights gained during the design and development of a non-generic mobile app that implemented the functionality in the original PowerPoint mock-up provided by Eliana and Jon Socha. Developing the Luna mobile app motivated the creation of the generic Maya framework. By utilizing the Maya framework, educational content creators without knowledge of mobile development can create powerful educational mobile applications for underserved communities across the globe, without the need to write any code.
JASON CHEN
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:30 A.M.
Thesis: Protein Structure Refinement via DeepTracer and AlphaFold2
Understanding the structures of proteins has numerous applications, such as vaccine development. Building protein structures from experimental electron density maps by manual effort is slow and labor-intensive; therefore, machine learning approaches have been proposed to automate this process. However, most experimental maps are not at atomic resolution, so the densities of side-chain residues are insufficient for computer vision-based machine learning methods to precisely determine the correct amino acid type when the protein sequence is not provided. On the other hand, methods that utilize evolutionary information from protein sequences to predict structures, like AlphaFold2, have recently achieved groundbreaking accuracy but often require manual effort to refine the results. We propose a method, DeepTracer-Refine, which automatically splits AlphaFold’s structure and aligns the pieces to DeepTracer’s model to improve AlphaFold’s result. We tested our method on 39 multi-domain proteins and increased the average residue coverage from 78.2% to 90.0% and the average lDDT score from 0.67 to 0.71. We also compared DeepTracer-Refine against another method, Phenix’s AlphaFold refinement, to demonstrate that our method not only performs better when the initial AlphaFold model is less precise but also exceeds Phenix in run-time performance.
YIWEI TU
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Interactive Watercolor Painting
Watercolor, a well-known style of artistic painting, is appealing due to the translucent patterns formed by the spreading of its water-based coloring solution. These translucent patterns are produced by two basic brushing techniques, wet-on-wet and wet-on-dry, and due to the stochastic nature of liquid mixture motion, each watercolor painting is distinctive. For newcomers, creating watercolor paintings is challenging: a soiled canvas or a wrong brush stroke cannot be corrected because of the limitations of the paper and the inconvenience of the tools.
To offer more freedom in watercolor creation, the simulation of watercolor painting has been extensively studied, including physically-based methods. The physically-based approach reproduces watercolor patterns by emulating the physical dynamics of paint and water flow and rendering an image based on the simulated results. The Lattice-Boltzmann method (LBM) is favored by researchers in watercolor fluid dynamics simulation due to its computational efficiency, its stability in dealing with complex boundaries, its incorporation of microscopic interactions, and its potential for parallel implementation.
The project follows Chu and Tai’s approach of modeling hydrodynamics with LBM and uses the Kubelka-Munk (KM) reflectance rendering method proposed by Curtis et al. Compared to other methods, this approach strikes a balance between the accuracy of the simulation model and execution time. LBM models fluid flow as a continuous propagation and collision process on a discrete lattice, where the many lattice sites can be processed in parallel. By trading off some realism, LBM can produce relatively realistic watercolor results, while the parallel processing significantly reduces the simulation time.
The implementation of the physically-based simulation requires a platform that supports parallel processing at the per-pixel level, the user’s interactive drawing activities, and rendering of the simulation results. Unity3D, a cross-platform game engine, was chosen for the implementation because of its support for user-defined HLSL shaders that can process pixel operations in parallel, a well-designed editor that accommodates complex parameter adjustments, and pre-built pipelines that render the simulation results.
The implemented simulation system consists of four components: fluid injection, fluid flow simulation, pigment movement, and pigment composition. The first component receives the water applied to the digital canvas by the brush and updates the edges between the wet and dry canvas. The fluid flow simulation component then simulates the diffusion of the fluid with a semi-rebounding LBM scheme. The pigment movement component calculates the movement of the paint. As the final step, the pigment composition component renders the resulting image based on the KM model, with the transmittance and reflectance of the pigment layers as input parameters.
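For readers unfamiliar with LBM, the following NumPy sketch shows the basic propagation-and-collision cycle on a D2Q9 lattice under assumed parameters. It is a generic illustration only; the project itself implements Chu and Tai’s scheme, including its semi-rebounding boundary handling, in per-pixel HLSL shaders.

# Highly simplified D2Q9 Lattice-Boltzmann step in NumPy, illustrating the
# propagation (streaming) and collision process. Parameters are assumed.
import numpy as np

NX, NY = 64, 64
# D2Q9 lattice velocities and weights
c = np.array([(0,0),(1,0),(0,1),(-1,0),(0,-1),(1,1),(-1,1),(-1,-1),(1,-1)])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
tau = 0.8                                    # relaxation time (assumed)

f = np.ones((9, NX, NY)) * w[:, None, None]  # distribution functions at rest
f[:, NX//2, NY//2] += 0.05 * w               # small fluid "injection" in the center

def equilibrium(rho, ux, uy):
    feq = np.empty_like(f)
    for i, (cx, cy) in enumerate(c):
        cu = cx * ux + cy * uy
        feq[i] = w[i] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*(ux**2 + uy**2))
    return feq

for _ in range(100):
    # streaming: propagate each distribution along its lattice velocity
    for i, (cx, cy) in enumerate(c):
        f[i] = np.roll(np.roll(f[i], cx, axis=0), cy, axis=1)
    # collision: relax toward local equilibrium (BGK)
    rho = f.sum(axis=0)
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    f += (equilibrium(rho, ux, uy) - f) / tau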
The provided watercolor simulation system allows for the manipulation of brush, paper, and simulation settings. The system can produce images based on the two basic brushing techniques, with popular watercolor patterns such as edge darkening, purposeful backruns, and pigment granulation obtained by adjusting the brush settings. The mixture of multiple pigments can result in new and distinct colors when the KM model is used. Paper and simulation parameters can be adjusted to allow painting on canvases that do not exist in the physical world.
A novice painter can use the features of our system to make simple watercolor paintings. The system demonstrated that the KM model can render watercolor visuals with color blending on the canvas, and that LBM can reduce the computational load of fluid simulation while maintaining realism. The results of this project provide a solid foundation for additional, in-depth research into watercolor simulation.
JIAQI ZHAN
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Refactoring of EYE Toolbox, a Web-Based Medical System
The EYE Toolbox is a web-based medical system offering services for the testing, diagnosis, and therapy of patients’ learning-related vision problems, and it has been evolving for 10 years. It has been implemented and is currently in use in a single clinical setting, but it now needs to be updated and prepared for wider distribution across multiple clinics. To prepare this software for more general use and proper maintenance, detailed analysis and review are required, and a new, refactored version needs to be developed to ensure that all functional and non-functional requirements of the system can be maintained properly, that it remains HIPAA-compliant, and that it uses modern cloud-based architectures.
A review of the existing code base showed the importance of a focused refactoring effort: during previous software development cycles (which used code-and-fix and iterative development processes), the system did not leverage standardized data structures or refined programming methods, such as extensible classes mapping to different types of users or generalized methods covering multiple use cases. In addition, with limited attention to code maintenance, development over a long time span has left the system with code redundancy and legacy code awaiting deprecation. As the application environment widens to support a greater number of users and new use demands, the system needs to be reviewed and refactored to ensure it meets the required standards of adaptability, scalability, maintainability, and performance, and to ensure stable support for its evolving working environment.
This project aims to determine how to tailor a practical and suitable refactoring plan for the EYE Toolbox system, to generate automated tests that verify the accuracy of refactored code based on its front-end and back-end effects, to evaluate to what degree the refactoring plan improves the system, and to analyze whether the refactored system meets the requirements for system evolution and distribution in a broader application environment. To design, implement, and evaluate the customized refactoring plan, this project applied a code review approach, used refactoring methods for the PHP back-end logic and the HTML web pages, and measured the refactoring results through derived metrics of adaptability, scalability, maintainability, and performance.
Based on the evaluation results, the benefits and limitations of the refactoring plan were analyzed, and directions for further improvement, including additional refactoring tasks and recommended evaluation metrics, were discussed. This project enriches the exploration of systematic refactoring approaches for medical systems and can inspire researchers and offer guidance to new team members in future refactoring work on similar or related medical systems.
Friday, June 2
POOJA NADAGOUDA
Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Scaling and Parallelizing LDA-GA (Latent Dirichlet Allocation – Genetic Algorithm) for Data Provenance Reconstruction
The task of ensuring the reliability of sources and the accuracy of information is becoming increasingly challenging because the Internet makes it easy to create, duplicate, modify, and delete data. Hence the importance of Data Provenance Reconstruction, which attempts to create an estimated provenance for existing datasets when no provenance information was previously recorded. The provenance-reconstruction approach proposed by the Provenance and Traceability Research Group is based on the Latent Dirichlet Allocation Genetic Algorithm (LDA-GA); it uses the Mallet library, was implemented in Java, and achieved satisfactory results when applied to small datasets, but as dataset sizes increased, performance degraded. To improve accuracy and performance, GALDAR-C++, a multi-library extensible solution for topic modeling in C++, was developed; compared to the Java implementation, this solution, using WarpLDA, offered satisfactory results. To improve performance further, a parallel computing strategy based on the Message Passing Interface (MPI) was applied to both the serial Java and C++ versions by parallelizing the LDA calls in each generation of the LDA genetic algorithm. Both parallel implementations gave substantial performance improvements over their respective serial implementations, but both were limited to using 9 nodes in parallel because the genetic algorithm supported 9 populations. To scale the parallel solution further, we implemented a scaled genetic algorithm supporting 12 and 24 populations using 12 and 24 computing nodes for both the Java and C++ versions. In addition, the previous serial and parallel solutions did not provide much improvement in accuracy: for larger datasets of 5K articles, accuracy was as low as 8%. We therefore extended our scaled parallel LDA-GA Java version to improve accuracy by providing the genetic algorithm with initial LDA parameters (topic count and iteration count) proportional to the size of the dataset and by applying a cosine filter to the LDA-GA clusters. Depending on dataset size, this strategy improves accuracy by more than 3x over the previous serial and parallel solutions and improves performance by 4x to 8x over the previous serial solution. The results obtained make this a viable solution for future studies on provenance reconstruction, especially for larger datasets.
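The parallelization pattern described above, evaluating many LDA configurations of one genetic-algorithm generation at the same time, can be sketched with mpi4py as below. This is a generic illustration with a placeholder fitness function; it is not the project’s Java or C++ implementation, and the real system would run LDA (e.g., Mallet or WarpLDA) and score clusters against provenance ground truth.

# Illustrative sketch only: evaluating one generation of LDA parameter
# candidates in parallel with mpi4py, in the spirit of the parallel LDA-GA.
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    # one (topic_count, iteration_count) candidate per worker, e.g. 12 or 24
    population = [(random.randint(5, 50), random.randint(100, 1000))
                  for _ in range(size)]
else:
    population = None

candidate = comm.scatter(population, root=0)   # each rank gets one candidate

def evaluate(topics, iterations):
    # placeholder fitness; the real system runs LDA and scores the clusters
    return 1.0 / (abs(topics - 20) + 1) + iterations * 1e-4

fitness = evaluate(*candidate)
results = comm.gather((candidate, fitness), root=0)

if rank == 0:
    best = max(results, key=lambda r: r[1])
    print("best candidate this generation:", best)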
WINTER 2023
Friday, March 3
PURRNIMA DAYANANDAM
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: A Layered Model for Embedding Knowledge Management within Project Management: A Case Study
Knowledge management systems (KMS) and project management (PM) techniques are important for providing effective and high-quality products and services. These methods are evolving to be more complex and impactful, especially with more recent efforts to identify and build solutions that create a synergistic approach between the two.
This study explores the unique needs of existing projects and the need for an efficient onboarding process for new students, in order to recommend solutions that improve PM, the onboarding process, and software development practices at EYE Research Group [1].
The study consists of two parts: 1) classification and evaluation of ongoing projects, and 2) descriptive statistics and inductive coding of an electronic survey conducted among 20 current and former students at EYE Research Group [1].
First, the ongoing projects were classified into five different cases. These cases vary along key attributes such as technology, stage of project, environmental setting, testing procedures, and end user, describing the uniqueness of each project. Second, an electronic survey collected data to analyze overall satisfaction and gather recommendations for improving the existing onboarding processes, managing software development projects, and improving software development work practices.
Results from the classification of ongoing projects into five unique cases provided an understanding of each project’s strategy in terms of refactoring, clinical testing, new feature development, and new project planning. The project attributes then offered an initial insight into each project’s existing knowledge base by capturing the stage of the project, the technology with which the software was developed, the testing methods followed, and the targeted users. Finally, the current project goals clarified the direction of each project.
Results from the electronic survey suggest that while the majority of new students onboarded with the existing system were satisfied, several changes are needed to make the system more efficient. These include having a project manager to set and follow up on goals, offering at least three features in the onboarding system, having a set meeting agenda, having clearly defined team and leadership roles, and using a tracking tool to outline tasks and deadlines. The respondents also recommended keeping project documents up to date and improving the document storage system, and suggested that tutorials for the software development tools be offered in the form of videos.
Recommendations were made for implementing a layered project knowledge management (PKM) model for managing project knowledge. This layered model is a collection of a PKM plan [2], KMS tools [3], knowledge processes (KP) [4], and a project knowledge base [5]. Recommendations were also made for an onboarding portal that introduces a three-step onboarding plan [6] to direct the student to the appropriate core knowledge elements (and associated resources) needed for their particular project type and specific attributes. The three steps are: 1) Getting Started; 2) Overview of the Projects, Project Documents, and Onboarding Buddy; and 3) Settling In.
Overall, embedding an effective KMS approach within project management (PM) might enhance the project management knowledge and promote an efficient onboarding process of new students.
Monday, March 6
LEUNG TSAN NG
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Parallelization of Computational Geometry Algorithms
Computational geometry is the study of efficient algorithms for solving geometric problems. The aim is to optimize these algorithms because real-world applications often involve very large datasets, for which even a small improvement in an algorithm’s time complexity can save a great deal of time. Parallelization is another way to shorten computation time: instead of optimizing the time complexity, we can assign sub-problems to different computing nodes and let them compute individually, so that the division of workload greatly reduces the computation time. However, many computational geometry algorithms, such as Convex Hull, Voronoi Diagram, and Closest Pair, have been researched for many years, and their sequential versions have been tuned to perform very well, which makes it difficult to compare parallel and sequential versions. One major advantage of parallel programming is the amount of available space: as we scale up the number of computing resources, resources such as memory and disk space increase, so parallel programs can handle much larger datasets. In this project, we focus on extending previous students’ projects on the Euclidean Shortest Path and Voronoi Diagram problems. We improve these two problems by using different approaches to handle larger amounts of data. The implementation is done with MapReduce, Spark, and Multi-Agent Spatial Simulation (MASS), three frameworks with their own distinct features. This project shows their performance on different geometric problems and compares their advantages and disadvantages.
Thursday, March 9
SATYAVADA VENKATA NAGA SIRISHA
Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Explore UWB in Augmented Reality
This paper presents a smartphone application for the Android platform that uses Augmented Reality (AR) to build an immersive experience for people visiting the UW Bothell campus, helping them understand and interact with their surroundings. The application has two main features: 1) displaying nearby points of interest (POIs) as the user points the camera in a certain direction, and 2) showing navigation instructions in AR when the user selects a POI. The challenge for AR applications in geolocation mapping is rendering AR content tied to a specific GPS coordinate. Our solution is based on Google ARCore’s Geospatial API, which uses a Visual Positioning System (VPS) to precisely locate the user’s real-world GPS coordinates by matching the surrounding imagery against Google Street View, and ties AR content to the detected location. Existing methods for AR navigation store a virtual map in memory to render location-specific AR content, and therefore have high space and computational complexity; the use of the Geospatial API alleviates this problem. This project serves as a proof of concept for outdoor AR navigation on campuses. Currently, the application shows only navigation signs in AR; it can be extended to show richer AR and graphical content, and it can be combined with existing techniques for indoor navigation to offer a seamless AR experience anywhere on campus.
Friday, March 10
AHMED NADA
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Visual Question Answering
Visual Question Answering (VQA) is a comprehensive artificial intelligence (AI) and computer vision (CV) task that answers questions about the visual content of an image. For instance, questions like “What color is the bus?” or “How many people are in the photo?” can be answered using VQA. The task has been shown to be important and useful in various domains, such as autonomous vehicles (finding parking spots and understanding parking signs), medical imaging (identifying tumors), as well as virtual assistants and search engines. To meet the need for such tools, researchers have developed models trained to solve VQA problems. In this project, we present a web-based VQA tool that consists of two components: a front-end application for submitting an image and a question, and a back-end application that uses a trained model to respond to the user’s request. We trained a VQA model on Google Colab using the VQA v2 dataset and state-of-the-art pre-trained models to extract image features and question embeddings. Without resorting to partial credit or word similarity metrics, this approach achieved an accuracy of 48%, a 9% improvement over the results obtained with static word embeddings.
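A common way to combine precomputed image features with a question embedding is late fusion followed by classification over an answer vocabulary. The PyTorch sketch below shows that general pattern; the feature dimensions, layer sizes, and answer vocabulary size are assumptions and do not describe the exact model trained in this project.

# Minimal sketch of a fusion-style VQA classifier (assumed architecture):
# precomputed image features and a question embedding are concatenated
# and mapped to logits over an answer vocabulary.
import torch
import torch.nn as nn

class SimpleVQA(nn.Module):
    def __init__(self, img_dim=2048, q_dim=768, hidden=1024, num_answers=3000):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(img_dim + q_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden, num_answers),
        )

    def forward(self, img_feat, q_emb):
        fused = torch.cat([img_feat, q_emb], dim=-1)   # late fusion by concatenation
        return self.classifier(fused)                  # logits over candidate answers

model = SimpleVQA()
img_feat = torch.randn(4, 2048)   # e.g. pooled CNN features (assumed size)
q_emb = torch.randn(4, 768)       # e.g. sentence-level question embedding (assumed size)
logits = model(img_feat, q_emb)
print(logits.shape)               # torch.Size([4, 3000])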
Wednesday, March 15
POOJA CHHAGANLAL TANK
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
8:00 A.M.
Project: Continuation of Model Databases: A tool for designing and learning database systems
Object-Role Modeling (ORM) is a powerful technique for teaching database and object-oriented design. Its visual representation of real-world entities and its focus on semantics help students understand database design concepts quickly. Logical data modeling (LDM) is another widely accepted technique that is easy to learn, adapts well to changes, and is supported by many modeling tools and frameworks. However, many tools that are readily available and easy for students to use do not support ORM as comprehensively as Microsoft VisioModeler 3.1, which is no longer compatible with current operating systems. To address this issue, this capstone project is part of a team effort to develop a browser-based app that supports ORM, LDM, and SQL conversion and generation by following software engineering principles, implementing feature enhancements, and performing comparative analysis. The project’s focus is to provide forward and reverse engineering between ORM, LDM, and SQL, ensuring efficient and accurate software system design.
AUTUMN 2022
Tuesday, November 15
SAMRIDHI AGRAWAL
Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Optimized Provenance Reconstruction for Machine-generated data
A great deal of data is created, deleted, copied, and modified easily over the Internet, which makes it difficult to establish the authenticity and credibility of that data. It is therefore important to reconstruct the provenance of data that has lost its provenance information. There are techniques that help recover the metadata from which provenance can be reconstructed; however, many systems fail to capture provenance because they lack provenance capture mechanisms such as source file repositories or file storage systems. The Provenance and Traceability Research Group, which proposed the provenance-reconstruction approach, has numerous projects on reconstructing provenance.
The current research (OneSource) applies various reconstruction techniques to machine-generated datasets, using attributes such as file size, the semantic meaning of the content, and the word count of the files. OneSource improves provenance reconstruction for git commit histories as machine-generated datasets. The OneSource algorithm uses a multi-funneling approach that includes data cleaning in Python, topic modeling and cosine similarity for clustering, and a lineage algorithm with known endpoints to achieve higher accuracy in recovering valid provenance information. OneSource generates ground truth data by extracting the commit history and file versions of a git repository. To assess the model’s performance, it is evaluated on datasets with varying data sizes and file counts. OneSource reconstructs the provenance of clusters and the relationships of files within each cluster (cluster derivation). The evaluation results indicate that OneSource can reconstruct cluster provenance with 90% precision and cluster derivation with 66% precision using cosine similarity as the clustering method, a 60% accuracy improvement for cluster derivation over the existing technique. In the future, parallelization for larger datasets and optimizations to the lineage algorithm may further improve the model’s performance.
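The cosine-similarity clustering signal mentioned above can be illustrated with a few lines of scikit-learn, as in the sketch below. The file names and contents are placeholders, and the full OneSource pipeline (data cleaning, topic modeling, and the lineage algorithm) is not reproduced.

# Illustrative only: grouping file versions by content similarity with TF-IDF
# and cosine similarity, one ingredient of the clustering described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = {
    "report_v1.txt": "initial draft of the provenance report",
    "report_v2.txt": "revised draft of the provenance report with results",
    "notes.txt": "meeting notes about unrelated scheduling",
}

names = list(docs)
tfidf = TfidfVectorizer().fit_transform(docs.values())
sim = cosine_similarity(tfidf)

threshold = 0.3   # assumed cut-off for putting two files in the same cluster
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if sim[i, j] >= threshold:
            print(f"{names[i]} <-> {names[j]} (similarity {sim[i, j]:.2f})")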
Thursday, December 1
DI WANG
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Real-time Cloth Simulation
Computer-animated cloth is commonplace in video games and filmmaking. Believable cloth animation greatly improves the sense of immersion for video gamers, and simulated cloth can seamlessly blend into live-action footage, allowing filmmakers to generate the desired visuals by adjusting parameters. Real-time cloth simulation allows a user to interact with the material, making virtual cloth try-on experiences possible. All of these use cases prefer fast solutions, ideally achieving believable cloth animation in real time. It is, however, challenging to simulate cloth both accurately and efficiently, due to its effectively unlimited deformation scenarios and complex inner mechanics. This project studied two solutions for real-time cloth simulation: 1) an iterative approach using a linear solver, which fully exploits the GPU’s parallel processing architecture and efficiently solves cloth with thousands of vertices in real time, and 2) a nonlinear solver based on the Projective Dynamics global-local optimization technique. We implemented an interactive application to demonstrate the two solutions and assessed their quality based on the generality, correctness, and efficiency of the results. Our results show that both solvers are capable of generating believable real-time cloth animations in a wide range of testing scenarios: they interactively react to changes in cloth attributes and in internal and external forces, can be properly illuminated and texture mapped, and can interact with other objects in the scene, e.g., with proper collision handling. We also investigated the known conditions under which the solvers generate incorrect results: the instability of the linear solver with overly stiff spring constants, and the stiff self-bending of the nonlinear solver, which prevents realistic wrinkles. Through experimentation and theoretical reasoning, we determined that the linear solver’s instability is inevitable, while the nonlinear solver’s stiff bending can be improved with a more sophisticated energy definition. To evaluate the solvers’ efficiency, we recorded actual runtimes and derived each solver’s performance as a function of cloth resolution. Our results verify the expected algorithmic complexity: within the GPU-supported range, the linear solver’s runtime remains roughly constant as the resolution increases, while the nonlinear solver’s runtime grows at an O(N³) rate.
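For orientation, the NumPy sketch below steps a greatly simplified explicit mass-spring sheet with structural springs only, pinned along one row and pulled by gravity. It illustrates the cloth model that both solvers operate on, not the GPU linear solver or the Projective Dynamics solver themselves; all constants are assumed.

# Greatly simplified explicit mass-spring cloth step (illustration only).
import numpy as np

N = 16                                   # cloth resolution: N x N vertices
k, damping, dt = 50.0, 0.98, 1.0 / 60.0  # spring stiffness, damping, timestep (assumed)
rest = 1.0 / (N - 1)

pos = np.stack(np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N)), axis=-1)
pos = np.concatenate([pos, np.zeros((N, N, 1))], axis=-1)   # flat sheet in the xy-plane
vel = np.zeros_like(pos)
gravity = np.array([0.0, 0.0, -9.8])

def spring_forces(p):
    f = np.zeros_like(p)
    for axis in (0, 1):                              # structural springs only
        d = np.diff(p, axis=axis)
        length = np.linalg.norm(d, axis=-1, keepdims=True)
        force = k * (length - rest) * d / np.maximum(length, 1e-9)
        pad = [(0, 0)] * 3
        pad[axis] = (0, 1); f += np.pad(force, pad)  # pull each vertex toward its neighbor
        pad[axis] = (1, 0); f -= np.pad(force, pad)  # equal and opposite reaction
    return f

for _ in range(200):
    acc = spring_forces(pos) + gravity
    vel = damping * (vel + dt * acc)
    vel[0, :] = 0.0                     # pin one row of vertices
    pos = pos + dt * vel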
KEVIN WANG
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Performance and Programmability Comparison of Parallel Agent-Based Modeling (ABM) Libraries
Agent-based modeling (ABM) allows researchers in the social, behavioral, and economic (SBE) sciences to use software to model complex problems and environments that involve thousands to millions of interacting agents. These models require significant computing power and memory to handle the high numbers of agents. Various research groups have implemented parallelized ABM libraries which allow models to utilize multiple computing nodes to improve performance with higher problem sizes. However, it is not clear which of these libraries provides the best-performing models and which is the easiest to develop a model with. The goal of this project is to compare the performance and programmability of three current parallel ABM libraries, MASS C++, RepastHPC, and FLAME. The Distributed Systems Lab at the University of Washington Bothell developed Multi-Agent Spatial Simulation (MASS) C++ as a C++-based ABM library. Different research groups developed RepastHPC and FLAME before MASS C++, and SBE researchers have successfully used these libraries to create agent-based models. To measure performance, we designed a set of seven benchmark programs covering various problems in the SBE sciences, and implemented each of them three times using MASS C++, RepastHPC, and FLAME. We compared the average execution times of the three implementations for each benchmark to determine which library performed the best. We found that certain benchmarks would perform better with MASS C++ compared to RepastHPC, while for other benchmarks the opposite was true. However, we found that across all benchmarks FLAME had the worst performance since it could not handle the same parameters given to the MASS C++ and RepastHPC implementations. To measure programmability, we performed a static code analysis and manual code review of each benchmark implementation to assess the three libraries quantitatively and qualitatively. We found that in terms of quantitative metrics, none of the three libraries was conclusively more programmable than the others. However, MASS C++ and RepastHPC may have more desirable qualities for developing agent-based models compared to FLAME.
Friday, December 2
JASON KOZODOY
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Automobile Retrieval Price Predictive System using LightGbm with Permutation Feature Importance
The vehicle market includes hybrid, gasoline, and electric models, and the varying features across these vehicle types create a unique problem for vehicle selection and price prediction. We built an automobile retrieval price predictive system that lets users access results for differently powered cars that share similar vehicle features. Our system focuses on selecting vehicles of multiple powertrain types for users and predicting prices for the selected vehicles; it also makes similar recommendations based on past vehicle selections. Our capstone project compares four regression models: LightGbm, FastForest, Ordinary Least Squares, and Online Gradient Descent. The four models cover ensemble and linear machine learning models on automobile datasets for price prediction. After the comparison, we selected the LightGbm regression model for our personalized retrieval prediction system; it achieved a price prediction accuracy of 0.97 in our regression evaluation with cross-validation. Furthermore, we record permutation feature importance scores within our system to show how feature importance differs after predictive learning. The system displays these rankings alongside the car results, allowing users to learn how different features influence the predicted prices; low, medium, and high rankings for the vehicle features give users insight into which features most influence price predictions.
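The pairing of a gradient-boosted regressor with permutation feature importance can be sketched in Python as below (the project itself uses ML.NET trainers). The synthetic vehicle features, price formula, and hyperparameters are assumptions for illustration only.

# Python analogue of the described approach: fit a LightGBM regressor on
# tabular vehicle features, then rank features by permutation importance.
import numpy as np
from lightgbm import LGBMRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(2010, 2024, n),      # model year (synthetic)
    rng.uniform(10, 300, n),          # engine power in kW (synthetic)
    rng.integers(0, 3, n),            # powertrain: 0 gas, 1 hybrid, 2 electric (synthetic)
])
y = 5000 + 800 * (X[:, 0] - 2010) + 90 * X[:, 1] + 3000 * X[:, 2] + rng.normal(0, 2000, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LGBMRegressor(n_estimators=200).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for name, score in zip(["year", "power_kw", "powertrain"], result.importances_mean):
    print(f"{name}: {score:.3f}")
print("R^2 on held-out data:", round(model.score(X_te, y_te), 3))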
LUYAO WANG
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Thesis: Real-Time Hatch Rendering
Hatching has been a common and popular artistic drawing style for centuries. In computer graphics rendering, hatching has been investigated as one of the many Non-Photorealistic Rendering solutions. However, existing hatch rendering solutions are typically based on simplistic illumination models, and real-time 3D hatch-rendered applications are rarely seen in interactive systems such as games and animations. This project studies the existing hatch rendering solutions, identifies the most appropriate one, develops a real-time hatch rendering system, and improves upon existing results in three areas: support general illumination and hatch tone computation related to observed artistic styles, unify spatial coherence support for Tonal Art Maps and mipmaps, and demonstrate support for animation.
The existing hatch rendering solutions can be categorized into texture-based and primitive-based methods. These solutions can be derived in object or screen space. Based on our background research, we chose to examine the texture-based object-space method presented by Praun et al. The approach inherits the advantage of object-space temporal coherence. The object-space spatial incoherence is addressed by the introduction of the Tonal Art Map (TAM). The texture-based solution ensures that the rendering results resemble actual artists’ drawings.
The project investigated the solution proposed by Praun et al. based on two major components: TAM generation as an off-line pre-computation and real-time rendering via a Multi-Texture Blending shader.
The TAM construction involves building a two-dimensional structure, vertically to address spatial coherence as projected object size changes and horizontally to capture hatch tone changes. This unique structure enables the support for smooth transitions during zoom and illumination changes. We have generalized the levels in the vertical dimension of a TAM to integrate with results from traditional mipmaps to allow customization based on spatial coherence requirements. Our TAM implementation also supports the changing of hatch styles such as 90-degree or 45-degree cross hatching.
The Multi-Texture Blending shader reproduced the results from Praun et al. in real time. Our rendered results present objects with seamless hatch strokes that appear natural and resemble hand-drawn hatch artwork. Our implementation integrates and supports interactive manipulation of effects from general illumination models, including specularity, light source types, variable hatch and object colors, and rendering of surface textures as cross hatching. Additionally, we investigated trade-offs between per-vertex and per-fragment tone computation and found that the smoothness of the hatching is better captured by the per-vertex computation, with its lower sampling rate and interpolation. Finally, the novel integration of TAMs and traditional mipmaps allows customizable spatial coherence support, yielding smooth hatch strokes and texture transitions in animations during object size and illumination changes.
Monday, December 5
JONATHAN LEE
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Psychosis iREACH: Reach for Psychosis Treatment using Artificial Intelligence
Psychosis iREACH aims to optimize the delivery of evidence-based cognitive behavioral therapy to family caregivers who have a loved one with psychosis. It is an accessible digital platform that can utilize the user’s intent and entities to determine the appropriate response. The platform is implemented based on an artificial intelligence and natural language understanding (NLU) framework, RASA. We developed the web application of the platform, and the chatbot has been integrated into the platform to collect data and evaluate performance. The link to the website is https://psychosisireach.uw.edu/.
Thursday, December 8
MEGHNA REDDY
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Audio Classifier for Melodic Transcription in Language Documentation and Application (MeTILDA)
There are about 7000 languages around the world, and 42% of them are considered endangered due to the declining number of speakers. Blackfoot is one such language, spoken primarily in northwestern Montana and southern Alberta. MeTILDA (Melodic Transcription in Language Documentation and Application) is a collaborative platform created for researchers, teachers, and students to interact, teach, and learn endangered languages; it is currently being developed around the Blackfoot language. Deep learning has progressed rapidly in the field of audio classification and has shown potential to serve as a tool for linguistic researchers documenting and analyzing endangered languages. This project creates a web application for researchers of Blackfoot that supports the deep learning research efforts behind MeTILDA. The application focuses on the automatic classification of different sounds in Blackfoot, specifically vowels and consonants, and provides three main functionalities. The dataset preparation section allows the user to create datasets of vowels and consonants easily, reducing manual effort. The feature extraction section allows the user to extract their choice of audio features, such as Mel-Frequency Cepstral Coefficients, spectrograms, and spectral features, for further processing and model re-training. The audio classifier section allows the user to automatically obtain instances of vowels and consonants in user-provided Blackfoot audio files. The audio classifier uses an optimized ANN with audio spectral features and Mel-Frequency Cepstral Coefficients as input features and achieves an accuracy of 89%. This application reduces the manual effort and time-intensive tasks faced by researchers of Blackfoot and can be extended to classify other sounds in the future.
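The kind of feature extraction and classification described above can be sketched as below: MFCCs summarized per clip and fed to a small fully connected network. The file names, label set, and network size are placeholders, not the project’s dataset or its optimized ANN.

# Sketch only: MFCC features per clip plus a small fully connected classifier.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def mfcc_features(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                 # one fixed-length vector per clip

# hypothetical labeled clips: 0 = vowel, 1 = consonant (placeholder file names)
clips = [("clip_vowel_01.wav", 0), ("clip_cons_01.wav", 1)]
X = np.array([mfcc_features(path) for path, _ in clips])
y = np.array([label for _, label in clips])

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500).fit(X, y)
print(clf.predict(X))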
Friday, December 9
CAROLINE TSUI
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Agent-based Graph Applications in MASS Java and Comparison with Spark
Graph theory is constantly evolving as it is applied to mathematics, science, and technology, with active applications in communication networks, computer science (algorithms and computation), and operations research (scheduling). Research on the realization and optimization of graph algorithms is of great significance to various fields. However, due to the increasing size of today’s databases, the volume of datasets (which can be represented as graphs for graph-theoretic analysis) in academia and industry has reached the level of petabytes (1,024 terabytes) or even exabytes (1,024 petabytes). Analyzing and processing massive graphs has become a principal task in many fields, and it is very challenging to process such rapidly growing, huge graphs in a reasonable amount of time with limited computing and memory resources. To meet the need for improved performance, parallel frameworks that support graph computing have emerged one after another. However, it is unclear how these parallelization libraries differ in performance and programmability for graph theory research or graph application development. The goal of this project is to compare the performance and programmability of two parallel libraries for graph programming: MASS Java, developed by the DSLab at the University of Washington Bothell, and Spark (including Spark GraphX), developed by the AMPLab at the University of California, Berkeley. To balance performance and programmability, we used MASS Java and Spark to design and develop Graph Bridge, Minimum Spanning Tree, and Strongly Connected Components applications in each library, for a total of six graph applications. After three rounds of running the applications and comparing their performance, the results show that for the Graph Bridge application Spark performs slightly better than MASS Java, while for the Minimum Spanning Tree and Strongly Connected Components applications MASS Java performs slightly better. Because MASS Java provides agents, it can more flexibly handle vertex-based regional operations and pass data between agents, whereas Spark is not an agent-based library; however, for the Graph Bridge application, which requires depth-first traversal to obtain results, the agent advantage of MASS is not reflected. To measure programmability, we performed quantitative and qualitative evaluations. The results show that the programmability of the two libraries is similar, but from the user’s point of view MASS Java is more intuitive and better suited for developing graph applications.
CHRIS LEE
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Extending a Cloud-Based System for Endangered Language Analysis and Documentation
With 40% of the world’s 7000 languages considered endangered, there is a significant need to document and analyze these languages to preserve them and their associated cultures and heritage. Blackfoot, spoken by approximately 2800 speakers in Alberta, Canada and Montana in the United States, is one such endangered language. Classified as a pitch accent language, the meaning of a Blackfoot word is based not only on the spelling of the word but also on the pitch patterns of the spoken word, which makes it challenging to teach and learn. To overcome this challenge and aid in the revitalization of the Blackfoot language, we collaborated with researchers at the University of Montana to develop a cloud-based system known as MeTILDA (Melodic Transcription in Language Documentation and Application). The goal of this project is to modernize the technologies originally used in the MeTILDA system, extend its analytic capabilities to incorporate the study of rhythm, and improve its data reuse and collaboration capability by persisting the data used to create the visual aids called Pitch Art. The proposed features will benefit linguistic researchers in furthering their understanding of the Blackfoot language, facilitate teachers in developing curricula for language acquisition, and help students take advantage of a teacher’s guided learning plan. With this system, we aim to provide an extensible platform for future development supporting the documentation and preservation of other endangered languages.
SUMMER 2022
Monday, August 1
JUNJIE LIU
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: COVID-19 Fake News Detector
COVID-19, caused by the coronavirus SARS-CoV-2, has triggered a pandemic that has impacted people’s everyday lives for more than two years. With the rapid spread of online communication and social media platforms, fake news related to COVID-19 is growing quickly and propagating misleading information to the public. To tackle this challenge and stop the spread of fake news regarding COVID-19, this project builds an online software detector specifically for COVID-19 news that classifies whether a news item is trustworthy. The intellectual contributions of this project are summarized below:
- This project specifically focuses on fake news detection for COVID-19 related news. Since it is generally difficult to train one generic model for all domains, the common practice is to fine-tune a base model to adapt it to the specific domain context.
- A data collection mechanism to obtain recent COVID-19 fake news data and keep the model up to date.
- Performance comparisons between different models: traditional machine learning models, ensemble machine learning models, and state-of-the-art transfer learning models (a simple non-transfer baseline is sketched after this list).
- From an engineering perspective, the project will be the first online fake news detection website to focus on COVID-19 related fake news.
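As referenced in the list above, a traditional machine learning baseline for this task can be sketched in a few lines. The texts and labels below are placeholders, and this TF-IDF plus logistic regression pipeline is only a simple baseline, not the fine-tuned transfer model the project compares against.

# Simple baseline sketch (not the project's fine-tuned model): TF-IDF +
# logistic regression for labeling COVID-19 news items as trustworthy or fake.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Health officials report vaccine trial results in peer-reviewed study.",
    "Miracle cure eliminates the virus overnight, doctors hate it.",
]
labels = [0, 1]   # 0 = trustworthy, 1 = fake (placeholder data)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["New overnight cure spreads on social media"]))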
ANDREW NELSON
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Project: Real-time Caustic Illumination
Caustic illumination is the natural phenomenon that occurs when light rays bend as they pass through transparent objects and focus onto receiver objects. One might notice this effect on the ocean floor as light rays pass through the water and focus on the floor. Rendering this effect in a simulated environment would provide an extra touch of realism in applications that are meant to fully immerse a user in the experience. Traditionally, caustic illumination is simulated with offline ray tracing solutions that simulate the physical phenomenon of transporting photon particles through refraction and depositing the results on the receiving object. While this approach can yield accurate results, it is computationally intensive, and these ray tracing solutions can only be rendered in batches. To support caustics in real time, the calculations must simulate the natural phenomenon of photons traveling through transparent objects in every rendering frame without slowing down the application. This project focuses on rendering caustics in real time using a multi-pass rendering solution developed by Shah et al. Their approach constructs a caustic map in every frame which is used by subsequent rendering frames to create the final effect. The goal of this project was to develop an application that renders caustics and supports user interaction in real time. Our implementation uses the Unity game engine to successfully create the desired effect while maintaining a minimum frame rate of thirty frames per second.
Thursday, August 4
KALUAD ABDULBASET SANYOUR
Chair: Dr. Wooyoung Kim
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: The Role of Machine Learning Algorithms in Editing Genes of Rare Diseases
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is an adaptive immunity mechanism in prokaryotes. Scientists have discovered that it is a programmable system that can be used to edit the genes of various species, which allows us to edit genes causing some rare diseases. CRISPR is associated with the Cas9 protein, which causes double-stranded breaks in DNA. Cas9 binds to a gRNA that guides it to a specific site to be edited. Although gRNAs are versatile and easy to design, they lack accuracy in determining editable sites, which can misguide Cas9 to the wrong location and cause changes in other genes. Hence, the CRISPR process needs to find an ideal gRNA that guides Cas9 to the intended on-target site and avoids off-target sites. Various machine learning (ML) algorithms can play an important role in evaluating gRNAs for the CRISPR mechanism, and many computational tools have recently been developed to predict the cleavage efficiency of gRNA designs. This project provides an overview and comparative analysis of various machine and deep learning (MDL)-based methods that are effective in predicting CRISPR gRNA on-target activity. The comparison results show that a hybrid approach combining deep learning with other ML algorithms produced excellent results.
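The basic shape of the ML methods surveyed above can be illustrated with a small sketch: guide sequences are encoded numerically (here, one-hot by position) and a regressor is fit to measured on-target activity. The sequences, activity scores, and model choice below are made up for illustration and do not correspond to any specific tool compared in the project.

# Illustrative sketch: one-hot encoded 20-nt guides and a regressor for
# on-target activity prediction. All data below are hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

BASES = "ACGT"

def one_hot(guide):
    vec = np.zeros(len(guide) * 4)
    for i, base in enumerate(guide):
        vec[i * 4 + BASES.index(base)] = 1.0
    return vec

guides = ["ACGTACGTACGTACGTACGT", "TTGCAATTGGCCAATTGGCC", "GGGGCCCCAAAATTTTACGT"]
activity = [0.82, 0.35, 0.56]     # hypothetical measured on-target efficiencies

X = np.array([one_hot(g) for g in guides])
model = GradientBoostingRegressor().fit(X, activity)
print(model.predict([one_hot("ACGTACGTACGTACGTACGA")]))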
Monday, August 8
SYED ABDULLAH
Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Remote Onboarding for Software Engineers: From “Forming” to “Performing”
Onboarding is defined as the process by which a new employee joins, learns about, integrates into, and becomes a contributing member of a team. Successful onboarding is essential for moving a team from the Forming to the Performing stage. It helps increase the new hire’s job satisfaction, improves the team’s performance, and reduces turnover (which brings the team back to the Forming stage). With remote work being the new norm in software engineering, remote onboarding brings a unique set of challenges.
In this project, I aim to identify the main challenges faced during remote onboarding for software engineers, specifically for role-specific onboarding that happens in the team domain, and provide recommendations on improving this onboarding process. To achieve these aims, I conducted a qualitative interview study and activity exercise with software engineers who have gone through remote onboarding. Nine interviews were conducted with software engineers ranging from junior software engineers to senior software engineers and software engineering managers. I analyzed these interviews to gain insights into factors affecting onboarding. From the interviews, I identified a hierarchy of needs, in which I classified the needs of the new hire into basic needs and needs required for excellence. Needs such as access to tools, clarity of tasks and knowledge were categorized as basic needs to do the work, whereas mentorship, relationship building, and collaboration transform the onboarding into an excellent experience. I then further linked these needs to 5 main themes that emerged during the interviews for having an effective onboarding: (i) having an effective onboarding buddy; (ii) the ability to create relationships with team members and other stakeholders; (iii) being provided with up to date and organized documentation and onboarding plan; (iv) the manager’s ability to listen and adapt to remote needs; and (v) a team culture which enables team members to communicate effectively and get unblocked quickly. Based on the interviews’ analysis together with insights from the literature, I developed checklists for recommended best practices for effective onboarding. A checklist was developed for each of the main onboarding stakeholders i.e., manager, onboarding buddy and new hire, along with a template of an onboarding plan. Using these checklists will help improve the effectiveness and consistency of remote onboarding for software engineering new hires.
Tuesday, August 9
ASHWINI ARUN RUDRAWAR
Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Evaluating the impact of GPU API evolution on application performance and software quality
Researchers and engineers have started to build parallel computing systems using sequential CPU plus parallel GPU programs. In recent years, an increasing number of GPU hardware devices have become available on the market, along with a number of software solutions that support them, and substantial work is required to identify the best combination of hardware and software for building heterogeneous solutions. One combination developers use is NVIDIA GPUs with the CUDA APIs. With the rapid architectural changes in GPU hardware, the behavior of the related APIs also changes, and applications built using prior versions of the APIs suffer considerable regression due to backward compatibility limitations. This thesis evaluates the evolution of NVIDIA GPUs and CUDA APIs with the help of Graphitti, a graph-based heterogeneous CUDA/C++ simulator. It identifies the advantages, limitations, and underlying behavior of a subset of APIs, explores them in the context of performance, compatibility, ease of development, and code readability, and discusses how this process helped implement a software change compatible with the simulator. The thesis documents the implementation of two features, separate compilation and link-time optimization, in the simulator, and how this implementation will help users write modular code in Graphitti. It also shows that there is almost no performance overhead for one of the largest neural network simulations in Graphitti. The implementation offers flexibility and scope to enhance the heterogeneous nature of Graphitti, which will help simulate much larger networks.
Wednesday, August 10
ANDREW HITOSHI NAKAMURA
Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: Macromolecular Modeling: Integrating DNA/RNA into DeepTracer’s Prediction Pipeline
DeepTracer is a fully automatic deep learning-based method for fast de novo determination of multi-chain protein complex structures from high-resolution cryo-electron microscopy (cryo-EM) density maps. The macromolecular pipeline extends DeepTracer’s functions by adding a segmentation step and pipeline steps to predict nucleic acids from the density. Segmentation uses a Convolutional Neural Network (CNN) to separate the densities of the two types of macromolecules, amino acids and nucleotides. Two U-Nets are trained to predict amino acid and nucleotide atoms in order to predict the structure from the density data. The nucleotide U-Net was trained with a sample of 163 cryo-EM maps containing nucleotide density, and it identifies the positions of phosphate, sugar carbon 4, and sugar carbon 1 atoms. When compared to Phenix’s pipeline, the amino acids show favorable RMSD metrics, and the nucleotides show comparable phosphate and nucleotide correlation coefficient (CC) metrics. The trained nucleotide U-Net model primarily focuses on double-stranded DNA/RNA. Future work involves utilizing more density map data to train the nucleotide U-Net to detect single-stranded DNA/RNA and removing phosphate outliers in postprocessing to improve the nucleic acid prediction.
ALEX XIE
Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Improving the Quality of Inference during Edge Server Switch for Applications using Chained DNN Models
Recent advances in deep neural networks (DNN) have substantially benefited intelligent applications, for example, real-time video analytics. More complex DNN models typically require a more robust computing capacity. Unfortunately, the considerable computation resource requirements of DNNs make the inference on resource-constrained mobile devices challenging. Edge intelligence is a paradigm solving this issue by offloading DNN inference tasks from mobile devices to more powerful edge servers. Due to user mobility, however, one challenge for such mobile intelligent services is maintaining the quality of service during the handover between edge servers. To address this problem, we propose in this report a solution to help improve the quality of inference for real-time video analytics applications that use chained DNN models. The scheme comprises two sub-schemes: (1) a non-handover scheme that determines the optimal offloading decisions with the shortest end-to-end inference latency, and (2) a handover scheme that improves the inference quality by maximizing the usage of mobile devices for the most useful inference outcomes. We evaluated the proposed scheme using a DNN-based real-time traffic monitoring application via testbed and simulation experiments. The results show that our solution can improve the inference quality by 57% during handovers compared to a greedy algorithm-based solution.
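To make the non-handover decision concrete, the sketch below (in Python, with made-up timing numbers) illustrates the kind of split-point search such a scheme might perform for a chain of DNN stages: keep the first k stages on the device, ship the intermediate output, and finish on the edge server, choosing the k with the lowest end-to-end latency. This is only an illustration of the idea, not the report’s actual algorithm.

```python
# Illustrative split-point search for a chained DNN under offloading.
# All timing numbers are hypothetical; the real scheme also accounts for
# handover and device-utility considerations described in the abstract.
device_ms   = [30, 45, 60]     # per-stage latency on the mobile device
server_ms   = [5, 8, 12]       # per-stage latency on the edge server
transfer_ms = [20, 15, 10, 2]  # cost of shipping the output produced after stage k

def best_split(device_ms, server_ms, transfer_ms):
    best = None
    for k in range(len(device_ms) + 1):   # k = number of stages kept on-device
        total = sum(device_ms[:k]) + transfer_ms[k] + sum(server_ms[k:])
        if best is None or total < best[1]:
            best = (k, total)
    return best

k, latency = best_split(device_ms, server_ms, transfer_ms)
print(f"run first {k} stage(s) on device; end-to-end latency ~ {latency} ms")
```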
Thursday, August 11
MICHAEL J. WAITE
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Mobile-Ready Expression Analysis
The field of computerized facial expression analysis has grown rapidly in recent years, with multiple commercial solutions and a plethora of research being produced. However, there has not been much focus on this technology’s use in disability assistance. Studies have shown that an inability to read facial expressions can have a drastic negative impact on a person’s life, presenting a clear need for tools to help those impacted. Most work in this field focuses on analytic performance over computational performance. This project aims to create an application that can be used by people with disabilities to read facial expressions in situations where they cannot, with a focus on computational performance to allow for real-time analysis. By utilizing a simplified methodology inspired by classic object detection methods such as SIFT and SURF, we found that our emotional analysis can achieve a computational performance of 100 milliseconds per image while retaining an overall accuracy of 64% when evaluated on the CK+ database. We hope that in the future our system can be further developed to produce greater accuracy with minimal loss in computational performance using machine learning.
SPRING 2022
Wednesday, May 4
DAT TIEN LE
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: Emulated Autoencoder: A Time-Efficient Image Denoiser for Defense of Convolutional Neural Networks against Evasion Attacks
As Convolutional Neural Networks (CNN) have become essential to modern applications such as image classification on social networks or self-driving vehicles, evasion attacks targeting CNNs can lead to damage for users. Therefore, there has been a rising amount of research focusing on defending against evasion attacks. Image denoisers have been used to mitigate the impact of evasion attacks; however, there is not a sufficiently broad view of the use of image denoisers as adversarial defenses in image classification due to a lack of trade-off analysis. Thus, trade-offs, including training time, image reconstruction time, and loss of benign F1-scores of CNN classifiers, of a group of image denoisers are explored in this thesis. Additionally, Emulated Autoencoder (EAE), which is the proposed method of this thesis to optimize trade-offs for high volume classification tasks, is evaluated alongside state-of-the-art image denoisers in both the gray-box and white-box threat model. EAE outperforms most image denoisers in both the gray-box and white-box threat models while drastically reducing training and image reconstruction time compared to the state-of-the-art denoisers. As a result, EAE is more appropriate for securing high-volume classification applications of images.
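As background for readers unfamiliar with denoiser-based defenses, the minimal Python/Keras sketch below shows the general pattern of training an image denoiser on (perturbed, clean) pairs and filtering inputs before classification. The architecture, noise model, and dataset are illustrative assumptions, not the thesis’s Emulated Autoencoder design.

```python
# A minimal sketch of a denoising autoencoder used as a pre-classification
# filter. Architecture, noise model, and dataset are illustrative assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_denoiser(shape=(32, 32, 3)):
    inp = layers.Input(shape=shape)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2, padding="same")(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    out = layers.Conv2D(shape[-1], 3, activation="sigmoid", padding="same")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

# Train on (perturbed, clean) pairs; at inference time, inputs pass through
# the denoiser before reaching the CNN classifier.
(x_train, _), _ = tf.keras.datasets.cifar10.load_data()
x_train = x_train[:5000].astype("float32") / 255.0
x_noisy = np.clip(x_train + np.random.normal(0, 0.05, x_train.shape), 0, 1).astype("float32")

denoiser = build_denoiser()
denoiser.fit(x_noisy, x_train, epochs=1, batch_size=128)
cleaned = denoiser.predict(x_noisy[:8])  # feed `cleaned` to the classifier
```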
Wednesday, May 18
NIRALI GUNDECHA
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Lambda and Reduction method implementation for MASS Library
MASS is a parallelizing library that provides multi-agent and spatial simulation over a cluster of computing nodes. The goal of this capstone is to reduce the communication overhead for data and make the user experience effortless, thereby improving the efficiency of MASS.
This paper introduces two new features, lambda and reduction methods, and their implementation to achieve these goals. To our knowledge, no other agent-based library provides this feature to date, making it a unique contribution to agent-based libraries. This paper validates the lambda and reduce methods using the MASS library.
The implementation of the lambda method gives users the flexibility of using the MASS library frictionlessly. Using lambda methods, users can describe new functionality on the fly and obtain results instantaneously. On top of the lambda feature, the reduce method performs a reduction operation over any type of user or Agent data. The operation can be anything the user wants, such as max, min, or sum.
The data collection step is itself described as a lambda method. Using the reduce method, users can perform a reduction in a single line of code, which improves code reliability and cleanliness. These features remove the hassle of writing blocks of code and of getting involved in agents’ behavior over a cluster of nodes, which makes them distinctive as well as innovative. The lambda and reduce method implementation is a unique contribution to agent-based libraries and their users.
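For readers unfamiliar with the pattern, the conceptual sketch below (written in Python for brevity, although MASS itself is a Java library) shows the lambda-plus-reduce idea: a user-supplied lambda extracts a value from each agent and a reduction operator folds those values into a single result. The class and function names here are hypothetical and do not reflect the MASS API.

```python
# Conceptual lambda + reduce sketch: `collect` pulls a value out of each agent,
# and `op` (sum, min, max, ...) folds the values into one result.
# Agent and reduce_agents are hypothetical names, not the MASS API.
from functools import reduce

class Agent:
    def __init__(self, energy):
        self.energy = energy

def reduce_agents(agents, collect, op):
    """Apply `collect` to every agent, then fold the values with `op`."""
    return reduce(op, (collect(a) for a in agents))

agents = [Agent(e) for e in (3, 7, 2, 9)]
total = reduce_agents(agents, lambda a: a.energy, lambda x, y: x + y)  # 21
peak  = reduce_agents(agents, lambda a: a.energy, max)                 # 9
print(total, peak)
```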
PALLAVI SHARMA
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Text Synthesis
With the explosion of data in the digital domain, manually synthesizing long texts to extract important information is a laborious and time-consuming task. Mobile-based text synthesis systems that can take text input and extract important information can be very handy and would reduce the overall time and effort required for manual text synthesis. In this work, a novel system is developed that enables users to extract summaries and keywords from long texts in real time using a cross-platform mobile application. The mobile application uses a hybrid approach based on feature extraction and unsupervised learning for generating quality summaries. In this paper, 10 sentence features are used for feature extraction. A hybrid technique based on machine learning with semantic methods is used to extract keywords/key-phrases from the source text. This application also allows users to manage, share, and listen to the information extracted from the input text. Additional features, such as allowing users to draft error-free notes, improve the user experience. To test the reliability of this system, experimental and research evaluations were carried out on the DUC 2002 dataset using ROUGE metrics. Results demonstrate a 51% F-score, which is higher than state-of-the-art methods used for extractive summarization on the same dataset. The hybrid approach used for keyword/key-phrase extraction was tested for the validity of the resulting keywords. The application could produce proper keywords in the form of phrases and words with an accuracy of 70%.
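As an illustration of feature-based extractive summarization in general (not the project’s actual 10-feature model), the short Python sketch below scores sentences with two simple features, word frequency and position, and keeps the top-ranked ones.

```python
# Toy extractive summarizer: score each sentence by average word frequency and
# by position, then keep the top-k sentences in their original order.
import re
from collections import Counter

def summarize(text, k=2):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(i, sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        tf = sum(freq[t] for t in tokens) / max(len(tokens), 1)  # frequency feature
        pos = 1.0 / (i + 1)                                      # position feature
        return tf + pos

    ranked = sorted(enumerate(sentences), key=lambda p: score(*p), reverse=True)
    chosen = sorted(i for i, _ in ranked[:k])
    return " ".join(sentences[i] for i in chosen)

print(summarize("Text synthesis saves time. Manual reading is slow. "
                "Summaries keep key sentences. Keywords describe topics."))
```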
Thursday, May 19
ZHIYANG ZHOU
Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Facial Recognition on Android Devices with Convolutional Neural Networks and Federated Learning
Machine Learning (ML) and Artificial Intelligence (AI) are widely applied in many modern services and products we use. Facial Recognition (FR) is a powerful ML application that has been used extensively in various fields. Traditionally, however, the models are trained on photos crawled from the World Wide Web (WWW), and they are often biased towards celebrities and the Caucasian population. Centralized Learning (CL), one of the most popular training techniques, requires all data to be on the central server to train ML models. However, it comes with additional privacy concerns as the server takes ownership of end-user data. In this project, we first use Convolutional Neural Networks (CNN) to develop an FR model that can classify 7 demographic groups using the FairFace image dataset, which has a more balanced and diverse distribution of ordinary face images across racial groups. To further extend training accessibility and protect sensitive personal data, we propose a novel Federated Learning (FL) system using Flower as the backend and Android phones as edge devices. The pre-trained models are first converted to TensorFlow Lite models, which are then deployed to each Android phone to continue learning on-device from additional subsets of FairFace. Training takes place in real time, and only the weights are communicated to the server for model aggregation, thus separating user data from the server. In our experiments, we explore various centralized model architectures to achieve an initial accuracy of 52.9% with a model lightweight enough to continue improving to 68.6% in the Federated Learning environment. Application requirements on Android are also measured to validate the feasibility of our approach in terms of CPU, memory, and energy usage. As for future work, we hope the system can be scaled to enable training across thousands of devices and gain a filtering algorithm to counter adversarial attacks.
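The on-device training loop in such a system typically follows the federated client pattern sketched below in Python using Flower’s NumPyClient interface. The model, data, and server address are placeholders, and exact method signatures vary slightly across Flower versions, so this is an assumption-laden sketch rather than the project’s deployed Android code.

```python
# Conceptual Flower client in which only model weights leave the device.
# Model, data, and server address are placeholders.
import numpy as np
import flwr as fl
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(7, activation="softmax"),   # 7 demographic groups
])
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])

# Stand-in for the device's local FairFace subset.
x_local = np.random.rand(32, 32, 32, 3).astype("float32")
y_local = np.random.randint(0, 7, size=32)

class FaceClient(fl.client.NumPyClient):
    def get_parameters(self, config):
        return model.get_weights()

    def fit(self, parameters, config):
        model.set_weights(parameters)                   # receive global weights
        model.fit(x_local, y_local, epochs=1, verbose=0)
        return model.get_weights(), len(x_local), {}    # send weights only

    def evaluate(self, parameters, config):
        model.set_weights(parameters)
        loss, acc = model.evaluate(x_local, y_local, verbose=0)
        return loss, len(x_local), {"accuracy": acc}

fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=FaceClient())
```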
Friday, May 20
VISHNU MOHAN
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Automated Agent Migration Over Structured Data
Agent-based data discovery and analysis views big-data computing as the result of agent interactions over the data. It performs better on a structured dataset by keeping the structure in memory and moving agents over the space. The key is how to automate agent migration in a way that simplifies scientists’ data analysis. We implemented this navigational feature in the Multi-Agent Spatial Simulation (MASS) library. First, this paper presents eight automatic agent navigation functions, each of which we identified, designed, and implemented in MASS Java. Second, we present the performance improvements made to the existing agent lifecycle management functions that migrate, spawn, and terminate agents. Third, we measure the execution performance and programmability of the new navigational functions in comparison to the previous agent navigation. The performance evaluation shows that the overall latency of all four benchmark applications improved with the new functions. The programmability evaluation shows that the new implementations reduced user lines of code (LOC) and made the code more intuitive and semantically closer to the original algorithm. The project successfully carried out two goals: (1) design and implement automatic agent navigation functions and (2) make performance improvements to the current agent lifecycle management functions.
CARL ANDERS MOFJELD
Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Adaptive Acceleration of Inference Services at the Network Edge
Deep neural networks (DNN) have enabled dramatic advancements in applications such as video analytics, speech recognition, and autonomous navigation. More accurate DNN models typically have higher computational complexity. However, many mobile devices do not have sufficient resources to complete inference tasks using the more accurate DNN models under strict latency requirements. Edge intelligence is a strategy that attempts to solve this issue by offloading DNN inference tasks from end devices to more powerful edge servers. Some existing works focus on optimizing the inference task allocation and scheduling on edge servers to help reduce the overall inference latency. One key aspect of the problem is that the number of requests, the latency constraints they have, and network connection quality will change over time. These factors all impact the latency budget for inference computation. As a result, the DNN model that maximizes inference quality while meeting latency constraints can change as well. To address this opportunity, other works have focused on dynamically adapting the inference quality. Most such works, though, do not solve the problem of how to allocate and schedule tasks across multiple edge servers, as the former group does. In this work, we propose combining strategies from both areas of research to serve applications that use deep neural networks to perform inference on offloaded video frames. The goals of the system are to maximize the accuracy of inference results and the number of requests the edge cluster can serve while meeting latency requirements of the applications. To achieve the design goals, we propose heuristic algorithms to jointly adapt model quality and route inference requests, leveraging techniques that include model selection, dynamic batching, and frame resizing. We evaluated the proposed system with both simulated and testbed experiments. Our results suggest that by combining techniques from both areas of research, our system is able to meet these goals better than either approach alone.
Monday, May 23
ISHPREET TALWAR
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
9:00 A.M.
Project: Recycle Helper – A Cross-Platform mobile application to Aid Recycling
With the growth of the population on the planet, the amount of waste generated has also increased. Such waste, if not handled correctly, can cause environmental issues. One of the solutions to this problem is recycling. Recycling is the process of collecting and processing materials that would otherwise be thrown away as trash and turning them into new products. It can benefit the community and the environment. Recycling can be considered an umbrella term for the 3R’s – Reduce, Reuse, and Recycle. A wide variety of items are present in the surrounding environment in different states and conditions, which makes recycling complex: knowing the correct way to recycle each item can be overwhelming and time-consuming. To help solve this problem, this paper proposes a cross-platform mobile application that promotes recycling. It helps users by providing them with recycling instructions for different product categories. The application allows the user to capture or choose an image of an item using the phone camera or gallery. It uses software engineering methodologies and machine learning to predict the item and provide the relevant recycling instructions. The application is able to detect and predict items with an accuracy of 81.06%, using a Convolutional Neural Network (CNN) model. To motivate and engage users in recycling, the application allows the user to set a monthly recycling goal, track their progress, and view their recycling history. The application is user-friendly and will help promote correct recycling in a less time-consuming manner.
Wednesday, May 25
YAN HONG
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Graph Streaming in MASS Java
This project facilitates graph streaming in agent-based big data computing, where agents find the shape or attributes of a huge graph. Analyzing and processing massive graphs has become an important task in many domains because many real-world problems, such as biological networks and neural networks, can be represented as graphs. Those graphs can have millions of vertices and edges, and it is quite challenging to process such a huge graph with limited resources in a reasonable timeframe. The MASS (Multi-Agent Spatial Simulation) library already supports a graph data structure (GraphPlaces) that is distributed over a cluster of computing nodes. However, when processing a big graph, we may still encounter two problems. The first is the construction overhead that delays the actual computation. The second is limited resources that slow down graph processing. To solve these two problems, we implemented graph streaming in MASS Java, which repetitively reads a portion of a graph and processes it while reading the next portion. It supports the HIPPIE and MATSim file formats as input graph files. We also implemented two graph streaming benchmarks, Triangle Counting and Connected Components, to verify the correctness and evaluate the performance of graph streaming. These two programs were executed with 1 – 24 computing nodes, demonstrating significant CPU- and memory-scalable performance improvements. We also compared the performance with the non-streaming solution. Graph streaming avoids explosive growth of the agent population and loads only a small portion of the graph at a time, both of which make efficient use of limited memory.
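To illustrate the streaming idea behind the Triangle Counting benchmark, the Python sketch below reads an edge list in chunks and counts triangles incrementally, so the full graph never needs to be resident at once. The file name, format, and chunk size are simplified assumptions, unrelated to MASS’s GraphPlaces implementation.

```python
# Illustrative streaming triangle count: read an edge list in fixed-size chunks
# and update adjacency incrementally, so the whole graph is never loaded at once.
# "edges.txt" (one "u v" pair per line) is a placeholder input file.
from collections import defaultdict
from itertools import islice

def stream_edges(path, chunk_size=100_000):
    with open(path) as f:
        while True:
            chunk = list(islice(f, chunk_size))
            if not chunk:
                break
            yield [tuple(map(int, line.split())) for line in chunk]

def count_triangles(path):
    adj = defaultdict(set)
    triangles = 0
    for chunk in stream_edges(path):
        for u, v in chunk:
            if v in adj[u]:
                continue                       # skip duplicate edges
            triangles += len(adj[u] & adj[v])  # common neighbours close triangles
            adj[u].add(v)
            adj[v].add(u)
    return triangles

print(count_triangles("edges.txt"))
```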
Thursday, May 26
BRETT BEARDEN
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: Redesigning the Virtual Academic Advisor System, Backend Optimizations, and Implementing a Python and Machine Learning Engine
Community college students aspire to continue their education at a 4-year college or university. The process of navigating college can be complex, let alone figuring out transfer requirements for individual schools. Assisting students in this process requires specialized knowledge for specific departments and majors. Colleges with smaller budgets do not have funds for additional academic advising staff, so the task gets passed to the teaching faculty. Student academic planning is a time-consuming process that can detract from an instructor’s time needed to focus on their current courses and students. For years, a team of students at the University of Washington Bothell has been working on a Virtual Academic Advisor (VAA) system to automate the process of generating student academic plans in support of Everett Community College (EvCC). The goal of the VAA system is to reduce the amount of time an instructor sits with an individual student during academic advisement. However, the VAA system is not yet complete, and a few roadblocks were preventing it from moving forward. The work proposed in this capstone focuses on redesigning the previous VAA system to remove fundamental flaws in how data related to scheduling academic plans is stored. A new system architecture is designed to allow backend optimizations to be conducted. Cross-language support will give the VAA system the ability to communicate with Python for conducting machine learning research. The proposed work brings the VAA system closer to completion and ready for deployment to support EvCC.
SANA SUSE
Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Project: Classifying Urban Regions in Satellite Imagery Using the Bag of Words Methodology
Satellite imagery has become more accessible over the years in terms of both availability and quality, though the analysis of such images has not kept pace. To investigate the analysis process, this work explores the detection of urban area boundaries from satellite imagery. The ground truth values of these boundaries were collected from the U.S. Census Bureau’s geospatial urban area dataset and were used to train a classification model using the Bag of Words methodology. During training and testing, 1000×1000 pixel patches were used for classification. The resulting classification accuracy was between 85% and 90%, and urban areas were classified with higher confidence than non-urban areas. Most of the sub-images that were classified with lower confidence lie in the transition areas between urban and non-urban regions. In addition to the low confidence in these transition areas, the patch sizes are quite large, so they are not helpful for delineating granular details in urban area boundaries.
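For readers unfamiliar with the Bag of Words methodology applied to imagery, the Python sketch below shows the generic pipeline: cluster local descriptors into a visual vocabulary, represent each patch as a word histogram, and train a classifier on the histograms. Descriptor choice, vocabulary size, and the random toy data are illustrative assumptions, not the project’s configuration.

```python
# Generic Bag of (visual) Words pipeline on toy data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

np.random.seed(0)

def local_descriptors(patch, win=16):
    """Slice a patch into small windows and flatten each into a descriptor."""
    h, w = patch.shape[:2]
    return np.array([patch[i:i + win, j:j + win].ravel()
                     for i in range(0, h - win + 1, win)
                     for j in range(0, w - win + 1, win)])

def bow_histogram(patch, vocab):
    words = vocab.predict(local_descriptors(patch))
    hist, _ = np.histogram(words, bins=np.arange(vocab.n_clusters + 1))
    return hist / hist.sum()

# Toy stand-ins for 1000x1000 satellite patches and urban/non-urban labels.
patches = [np.random.rand(128, 128) for _ in range(20)]
labels = np.random.randint(0, 2, size=20)

vocab = KMeans(n_clusters=50, n_init=10).fit(
    np.vstack([local_descriptors(p) for p in patches]))
X = np.array([bow_histogram(p, vocab) for p in patches])
clf = SVC(probability=True).fit(X, labels)      # confidence via predict_proba
print(clf.predict_proba([bow_histogram(patches[0], vocab)]))
```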
TIANHUI NIE
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Visualization of 2D Continuous Spaces and Trees in MASS Java
MASS is an Agent-Based Modeling (ABM) library. It supports parallelized simulation over a distributed computing cluster. The Place objects in these simulations can be thought of as the environment where agents interact with each other. Places can mimic different data structures to simulate various interaction environments, such as graphs, multi-dimensional arrays, trees, and continuous spaces.
However, continuous spaces and trees are usually complex for programmers to debug and verify, so this project focuses on visualizing these data structures. They are available in the MASS library and can be instantiated within InMASS, which enables Java’s JShell interface to execute code line by line in an interactive fashion. InMASS also provides additional functionality, including checkpointing and rollback, that helps programmers view their simulations better. MASS allows Places and agents to be transferred to Cytoscape for visualization. Cytoscape is an open-source network visualization tool initially developed to analyze biomolecular interaction networks. Expanded Cytoscape MASS plugins build a MASS control panel in the Cytoscape application, helping users visualize graphs, continuous spaces, and trees in Cytoscape.
This project successfully realized the visualization of MASS binary trees, quad trees, and 2D continuous spaces with Cytoscape. It also enhanced the MASS-Cytoscape integration and optimized the MASS control panel. These data structure visualizations provide an easier way for other users to learn the MASS library and debug their code.
Friday, May 27
MARÉ SIELING
Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: AGENT-BASED DATABASE WITH GIS
Geographic Information Systems (GIS) create, manage, analyse, and map data. These systems are used to find relationships and patterns between pieces of data separated by large geographic distances. GIS data can be extremely large, and analysing it can be laborious while consuming a substantial amount of resources. By distributing the data and processing it in parallel, the system consumes fewer resources and improves performance.
The Multi-Agent Spatial Simulation (MASS) library applies agent-based modelling to big data analysis over distributed computing nodes through parallelisation. GeoTools is a GIS system that is installed on a single node and processes data on that node. By creating a distributed GIS from GeoTools with the MASS library, results are produced faster and more effectively than traditional GIS systems located on a single node.
This paper discusses the efficacy of coupling GIS and MASS through agents that render fragments of feature data as layers on places, returning the fragments to be combined for a completed image. It also discusses distributing and querying the data, returning results by running a query language (CQL). Image quality is retained when panning and zooming without major loss of performance by rerendering visible sections of the map through agents and parallelisation. Results show that coupling GIS and MASS significantly improves the efficiency and scalability of a GIS system.
LIWEN FAN
Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Realistic Fluid Rendering in Real-Time
Real-time realistic fluid rendering is important because fluid is ubiquitous and can be found in many Computer Generated Imagery (CGI) applications, such as video games and movies. However, realism in fluid rendering can be complex due to the fact that fluid does not have a concrete physical form or shape. There are many existing solutions in modeling the movement and the appearance of fluid. The movement of fluid focuses on simulating motions such as waves, ripples, and dripping. The appearance, or rendering, of fluid aims to reproduce the physical illumination process to include effects including reflection, refraction, and highlights. Since these solutions focus on addressing different aspects of modeling fluid, it is important to clearly understand application requirements when choosing among these.
This project focuses on the appearance, or the rendering, of fluid. We analyze existing solutions in detail and adopt the one most suitable for real-time realistic rendering. With a selected solution, we explore implementation options based on modern graphics hardware. More specifically, we focus on graphics hardware that can be programmed through popular interactive graphical applications, for the reasons of supporting interactive modeling, a high-level shading language, and fast turnaround debugging cycles. The solution proposed by van der Laan et al. in their 2009 I3D research article is the one chosen for this project. Our analysis shows that their approach is the most suitable because of its real-time performance, high-quality rendered results, and, very importantly, the implementation details it provides.
The graphics system and hardware evaluation led to the Unity game engine. This is our choice of implementation platform due to its friendly interactive 3D functionalities, high-level shading language support, and support for efficient development cycles. In particular, the decision is based on Unity’s support of Scriptable Render Pipeline (SRP) functionality where the details of an image generation process can be highly customized. The SRP offers flexibility with ease of customizing shaders, and control of number of passes in processing the scene geometry for each generated image. In our implementation, the SRP is configured to compute the values to all of the parameters in the fluid model via separate rendering passes.
Our implementation is capable of rendering fluid realistically in real time, where the users have control over the actual fluid appearance. The delivered system supports two types of simple fluid motion: waves and ripples. The rendered fluid successfully captures the intrinsic color of the fluid under Fresnel reflection, the reflection of environmental elements, and highlights from the light sources. In addition, to provide users with full control over the rendered results, a friendly interface is supported. To demonstrate the system, we have configured it to showcase our fluid rendering under some common conditions, including a swimming pool, a muddy pond, a green algae creek, and colored fluid in a flowery environment.
Wednesday, June 1
YILIN CAI
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Project: Model Extraction Attacks Effectiveness And Defenses
Machine learning is developing quickly in the data industry, and many technology companies that have the resources to collect huge datasets and train models are starting to provide pre-trained models as services for profit. The cost of training a good model for business use is high because huge training datasets may not be easily accessible and training the model itself requires a lot of time and effort. The increased value of a pre-trained model motivates attackers to conduct model extraction attacks, which focus on extracting valuable information from the target model or constructing a clone close to the target model for free use, by only making queries to the victim. The goal of this experiment is to explore the vulnerability exploited by proposed model extraction attacks and to evaluate their effectiveness by comparing attack results as the victim model and its target datasets become more complex. We first construct datasets for the attacks by making queries to the victim model; some attacks also propose specific strategies for selecting queries. Then, we execute the attack either by running it from scratch or by using an existing test framework. We run the attacks with different victim models and datasets and compare the results. The results show that attacks which extract information from a model are effective on simpler models but not on more complex ones, and that the difficulty of producing a cheaper clone model increases, with the attacker possibly needing more knowledge beyond query information from the victim, as the victim model and its target datasets become more complex. Potential defenses and their weaknesses are also discussed after the experiment.
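The core query-based extraction setup can be illustrated with the small Python sketch below: the attacker labels self-chosen inputs with the victim’s predictions, trains a clone on those pairs, and then measures agreement with the victim. The models and data are toy stand-ins, not the attacks evaluated in this project.

```python
# Toy query-based model extraction: label attacker-chosen inputs with the
# victim's predictions and train a clone on those (input, prediction) pairs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                       random_state=0).fit(X[:1000], y[:1000])

# The attacker sees only query responses, never the victim's data or weights.
rng = np.random.default_rng(0)
queries = rng.uniform(X.min(), X.max(), size=(1000, 20))
stolen_labels = victim.predict(queries)

clone = DecisionTreeClassifier(random_state=0).fit(queries, stolen_labels)
agreement = (clone.predict(X[1000:]) == victim.predict(X[1000:])).mean()
print(f"clone agrees with victim on {agreement:.1%} of held-out inputs")
```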
PRATIK GOSWAMI
Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Project: Virtual Reality based Orthoptics for Binocular Vision Disorders
The Centers for Disease Control and Prevention noted that approximately 6.8% of children under the age of 18 years in the United States are diagnosed with vision problems significant enough to impact learning. Binocular Disorders can lead to headaches, blurry vision, double vision, loss of coordination, fatigue, and the inability to track objects, thereby severely impacting a child’s ability to learn. Without intervention, vision problems can lead to suppression of vision in the affected eye. Vision Therapy or Orthoptics is meant to help individuals recover their eyesight. It aims to retrain the user to achieve Binocular Fusion using therapeutic exercises. Binocular Fusion refers to the phenomenon of perceiving a single fused image when each eye is presented with a different image. Virtual Reality (VR) shows a lot of potential as an orthoptics medium. VR headsets can isolate the user from the physical world, reduce real-world distractions, provide a dichoptic display where each eye can be presented with a different input, and provide a customized therapy experience for the user.
Although several VR applications exist with a focus on orthoptics, clinicians report that these applications fail to strike a balance between therapy and entertainment. These applications can be too entertaining for the user and thus distract them from the therapy goals.
As a part of the EYE Research Group, I have developed 2 applications which, when added to the previously developed applications, form a VR toolkit for providing vision therapy to individuals diagnosed with Binocular Disorders. Each application in the toolkit focuses on a level of Binocular Fusion. The 2 applications I developed focus on the third and fourth levels of fusion – Sensory Fusion and Motor Fusion. The project has been successfully developed using the Unity Game Engine along with the Oculus VR plugin. All decisions about the controls and features were made after analyzing feedback from and interviews with the therapists at the EYE See Clinic. Key design decisions have also been the outcome of demonstrations and trials of the prototypes at the ACTION Forum 2021. The forum was attended by therapists, students, and researchers in the field of orthoptics.
Although the applications have been successfully developed and have been approved by the therapists at the EYE See Clinic, a clinical study is required to test the usability and the effectiveness of the tools as a therapy tool. As of May 16th, 2022, all applications have been successfully developed, tested, and approved by Dr. Alan Pearson, the clinical advisor to the EYE Research Group. A case study was proposed, reviewed and approved by the UW IRB and the UW Human Subjects Division (HSD) board. The results of the study will be beneficial for future research.
FRANZ ANTHONY VARELA
Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.
Thesis: The Effects of Hybrid Neural Networks on Meta-Learning Objectives
Historically, models do not generalize well when they are trained solely on a dataset/task’s objective, despite the plethora of data and computing available in the modern digital era. We propose that this is at least partially because the representations of the model are inflexible when learned in this setting; in this paper, we experiment with a hybrid neural network architecture that has an unsupervised model at its head (the Knowledge Representation module) and a supervised model at its tail (the Task Inference module) with the idea that we can supplement the learning of a set of related tasks with a reusable knowledge base. We analyze the two-part model in the contexts of transfer learning, few-shot learning, and curriculum learning, and train on the MNIST and SVHN datasets. The results of the experiment demonstrate that our architecture on average achieves a similar test accuracy as the E2E baselines, and sometimes marginally better in certain experiments depending on the subnetwork combination.
NHUT PHAN
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.
Thesis: Deep Learning Methods to Identify Intracranial Hemorrhage Using Tissue Pulsatility Ultrasound Imaging
Traumatic Brain Injury (TBI) is a serious medical condition that occurs when a person experiences trauma to the head, resulting in intracranial hemorrhage (bleeding) and potential deformation of head-enclosed anatomical structures. Detecting these abnormalities early is the key to saving lives and improving survival outcomes. Standard methods of detecting intracranial hemorrhage are Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). However, they are not readily available on the battlefield or in low-income settings. A team of researchers from the University of Washington developed a novel ultrasound signal processing technique called Tissue Pulsatility Imaging (TPI) that operates on raw ultrasound data collected using a hand-held, tablet-like ultrasound device. This research aims to build segmentation deep-learning models that take TPI data as input and detect the skull, ventricles, and intracranial hemorrhage in a patient’s head. We employed the U-Net architecture and four of its variants for this purpose. Results show that the proposed methods can segment the brain-enclosing skull and are relatively successful in ventricle detection, while more work is needed to produce a model that can reliably segment intracranial hemorrhage.
Friday, June 3
MONALI KHOBRAGADE
Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.
Project: EcoTrip Planner – An Android App
The emergence of online travel websites like TripAdvisor, Priceline, Expedia, and KAYAK allows users to book accommodations online without the hassle of going through an agent. Users no longer wait in queues to get flight tickets to their favorite destinations. They can also get a good idea about a vacation destination from online travel websites, something that earlier depended solely on an agent’s guidance. Users can book flights, hotels, and restaurants using these websites. In short, using online travel websites, they can plan a vacation trip after manually evaluating all the options, such as price, flight timings and availability, hotel location, food options, and nearby places to check out. However, a recent study indicates that the abundant options available in online travel agencies are overwhelming to users. The main challenge is that these websites do not provide a holistic trip plan, including flight and hotel accommodation, within the user’s budget. In this paper, we intend to provide a trip plan with flight travel and hotel stay suggestions under the user’s given budget by using personalized factors and analyzing user experience. The aim of this project is to develop an Android mobile application that helps users plan trips under a given budget and fight information overload. Our application asks users for the vacation destination and the budget they can afford, as well as their preferred hotel location, hotel stars, and ratings. It then analyzes the budget and uses heuristic models and natural language processing to recommend the best available travel and lodging. For travel, it suggests a round-trip plan from the current location to the destination, and for hotels, it suggests the top 3 hotels with a personalized user experience. The system also extracts the top 5 keywords from the hotel reviews, which give users an overall idea about the hotel. This Android application will help users plan a trip, including flight travel and hotel accommodation, in minutes.
WILLIAM OTTO THOMAS
Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.
Thesis: Human Cranium, Brain Ventricle and Blood Detection Using Machine Learning on Ultrasound Data
Any head-related injury can be very serious and may be classified as a traumatic brain injury (TBI), which can be a result of intracranial hemorrhaging. TBI is one of the most common injuries in or around a battlefield and can be caused by both direct and indirect impacts. While assessing a brain injury in a well-equipped hospital is typically a trivial task, the same cannot be said about a TBI assessment in a non-hospital environment. Typically, a computed tomography (CT) machine is used to diagnose TBI. However, this project demonstrates how ultrasound can be used to predict where skull, ventricles, and bleeding occur. The Pulsatility Research Group at the University of Washington has conducted three years of data collection and research to create a procedure that diagnoses TBI in a field situation. In this paper, machine learning methodologies are used to predict these CT-derived features. The results of this research show that, with adequate data and collection methods, skull, ventricles, and potentially blood can be detected by applying machine learning to ultrasound data.
Master of Science in Cybersecurity Engineering
SUMMER 2024
Monday, August 5
KRISHNA PAUDEL
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
8:45 A.M.
Project: MalBuster: An Integrated Approach to Securing Systems through Port Monitoring and Malware Detection
The importance of computer security in today’s digital world cannot be overstated. As cyber threats become more sophisticated and widespread, strong security measures are essential to protect information and ensure smooth operations. This is where traditional anti-malware solutions come in, but our approach takes a different route through the continuous monitoring of network ports for any signs of malicious activity. While most malware scanners merely examine files, MalBuster does its work by detecting malware that tries to communicate with external systems by monitoring system ports in real time and identifying suspicious processes.
MalBuster works by continuously watching the local machine’s network ports for inbound and outbound connections, capturing data packets in real time, linking them to the corresponding processes, and classifying the processes using both static and predictive malware analysis techniques. The tool also features a user interface where users can access the generated reports and act on suspected malware. The tool is implemented using various Python libraries to perform static and predictive ML analysis on the processes. The data for analysis and testing were collected from real network traffic, malware sample repositories, and normal software applications. Benchmarking various KPIs, our results show that MalBuster is effective at detecting threats accurately, processing data quickly, and using minimal system resources across various test scenarios. These results underline its ability to provide proactive threat detection and mitigation, improving the accuracy and reliability of current security measures. There is scope for improvement through further research to fine-tune the algorithms and expand MalBuster’s capabilities, ensuring even better threat detection and protection.
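As a rough illustration of the port-monitoring loop described above, the Python sketch below uses the psutil library to enumerate active connections, map them to their owning processes, and flag candidates for deeper analysis. The suspicion heuristic is a placeholder and does not represent MalBuster’s static or ML classification logic.

```python
# Sketch of a connection-monitoring loop with psutil; `looks_suspicious` is a
# placeholder for real static/ML analysis of the flagged process.
import time
import psutil

def looks_suspicious(proc, conn):
    # Placeholder heuristic: flag processes talking to remote hosts on
    # unusually high ports.
    return conn.raddr and conn.raddr.port > 49000

def scan_once():
    for conn in psutil.net_connections(kind="inet"):
        if conn.pid is None:
            continue
        try:
            proc = psutil.Process(conn.pid)
        except psutil.NoSuchProcess:
            continue
        if looks_suspicious(proc, conn):
            print(f"[!] {proc.name()} (pid {conn.pid}) -> {conn.raddr}")

while True:
    scan_once()
    time.sleep(5)   # re-scan the connection table every few seconds
```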
ANURODH ACHARYA
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Examining the Interplay of Psychological Factors in Remote Workers’ Cybersecurity Practices
This research examines the interplay of psychological factors influencing the cybersecurity practices of remote workers, focusing on how perceived severity, perceived vulnerability, self-efficacy, response efficacy, and response cost can impact both protection intention and protection behaviour. A survey conducted with a diverse sample of remote workers showed that perceived severity and perceived vulnerability significantly enhance protection intentions, which in turn strongly predict protection behaviours. Self-efficacy and response efficacy positively influence protection intentions, while perceived response cost impacts protection intention but not actual protective behaviours.
This study also identifies common cybersecurity challenges including network and device security and highlights prevalent issues such as malware attacks and data breaches. Survey participants showed a high level of caution in dealing with suspicious emails, often opting for verification and reporting to the IT departments. Emotional responses to the cybersecurity roles also varied with many of the respondents feeling either positive or neutral showing a general sense of awareness and responsibility. The findings also highlight the important role that psychological factors play in shaping cybersecurity behaviours among the remote workers and further suggest that targeted intervention can greatly improve the cybersecurity practices in remote work environments.
Wednesday, August 7
GAUTAM KUMAR
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
8:45 A.M.
Project: Foodereum: Revolutionizing Food Supply Chains with Blockchain Authentication
Blockchain technology has the potential to revolutionize supply chain management by enhancing transparency, traceability, and security. The proposed work suggests adopting a private Ethereum-based blockchain system to record, track, and verify food products throughout the supply chain. The immutability of blockchain ensures that transactions cannot be altered or concealed, as they are recorded in a distributed ledger accessible to all authorized parties. This transparency allows receivers to access the entire history of a product on the blockchain network, improving food quality and safety. By implementing this blockchain-based solution, the food industry aims to increase consumer appeal and overcome challenges such as centralized systems, third-party involvement, high costs, food waste due to spoilage and contamination, lack of accountability, and communication gaps between supply chain partners.
However, the system faces security threats such as cross-site scripting (XSS) attacks, sensitive information exposure, and hardcoded credentials. To mitigate these risks, the project incorporates robust input validation, output encoding, secure storage, and strong authentication mechanisms, including two-factor authentication and strong password policies. Regular security audits, penetration testing, and software updates are also implemented to identify and address vulnerabilities. These security enhancements, combined with blockchain’s inherent benefits, create a more robust and trustworthy system for food supply chain management, increasing transparency and efficiency and providing a significantly more secure environment for all stakeholders involved in the food supply chain.
MANUEL M. DUARTE
Chair: Professor Mark Kochanski
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Analysis of Enhancing Security Measures in Docker Containers: A Threat Model Approach
This project aims to explore new mitigation strategies and understand the risks associated with deploying Docker as an architectural solution. We researched various security issues with Docker container images and reviewed Docker’s architecture and its application usage. In addition to this analysis, we investigated industry best practices to understand what can mitigate these threats effectively. We created a threat model specifically for Docker components and analyzed threats using STRIDE analysis.
Based on our findings, we present a few recommendations for enhancing the secure use of Docker in application deployment, e.g., when using Docker Hub, always run a security scanner tool before deploying any images and disable unused services in the Dockerfile; we also mention other good security practices that should be applied in the early stages of development.
Our future work will involve further experiments focused on using real-time ML anomaly detection algorithms, such as support vector machines (SVM) and Isolation Forest, to detect vulnerabilities early on in SSL protocols and other attacks on root access, as well as using machine learning to improve certificate management. Continued research in this area is essential to deepen our understanding of Docker container security and to develop robust defenses against emerging threats.
Keywords: Docker, Dockerfile, Anomaly detection, Encryption, STRIDE, DREAD.
SPRING 2024
Wednesday, May 15
TIMOTHY LUM
Chair: Dr. William Erdly
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.; Discovery Hall 464
Project: The Howdu (“How do I…?”) Project: A Knowledge Management System to Support Cybersecurity Implementation
Knowledge capture and distribution has been a perennial human endeavor since prehistory. In the modern era, the internet has facilitated exponential growth in the volume of information available; however, it remains a sub-ideal resource for providing consistent, structured instruction on how best to implement effective cyber defenses.
In this capstone, we evaluate the technical pressures and missing linkages that have prevented such guidance from reaching defenders who need it to mitigate cyberattacks. We then build a cyber defense knowledge repository that allows users to create a cumulative snapshot of their experiences and insights.
In limited testing, Howdu facilitated a 61% speedup of an arbitrary and partially obfuscated task for those performing it (Practitioners). It further altered the subjective perception of this task from “Confusing”, “Frustrating”, and “Ambiguous” to a more positive outlook of “Easy”, “Comprehensible”, and “Fun”. These initial results suggest the application’s ability to improve network defense by aiding defender efficiency, decreasing stress, and reducing burnout.
Future Works include an integration of the system for grading information – called the Trust Index – and implementation of a system for translating knowledge articles across languages – called the Gnosetta. Other long term goals include containerization of the application and reductions in third-party service reliance.
Monday, May 20
KEVIN HUANG
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Targeted Cybersecurity Awareness and Training Programs for Two Asiatic Minority Groups in the U.S.
Cybersecurity education provides the knowledge and skills necessary for consumers to navigate the digital world safely. This knowledge is not equally accessible for all consumers. There is currently a lack of cybersecurity education available for Asian minority groups in the U.S. Language and cultural barriers make it difficult for consumers in these groups to receive the knowledge that they need. The goal of our project was to create effective cybersecurity training programs for the Chinese and Vietnamese minority groups. We accomplished this by first creating education materials through incorporating the latest research and guidelines for password security and social engineering prevention. We then found participants to attend our training sessions through the help of nonprofit organizations. After conducting the training sessions, our results confirmed that there is a demand from the Chinese and Vietnamese minority groups for cybersecurity education. Our analysis also showed that the training sessions made an impact on the participants’ intentions to improve their password hygiene and be more vigilant against social engineering attempts. The insight gained from this project can be used to expand the research and development of cybersecurity education to different Asian minority groups, additional cybersecurity topics, and additional cities across the U.S.
Thursday, May 23
CHALERMWAT PUAPOLTHEP
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Privacy-Preserving Source Code Similarity Detection
Reusing or sharing source code can violate license agreements and lead to lawsuits for an organization. These infringements can be identified using source code similarity detection tools. However, existing methods lack sufficient protection for the privacy of the source code, and due to privacy concerns, a software creator might be hesitant to share their source code for investigation. Therefore, we propose a source code similarity detection technique that preserves the privacy of the source code by applying a normalizing process to standardize the source code before hashing it. Our comparison algorithm employs fuzzy hashing, which enables us to evaluate similarity based on the hash values. Our proof-of-concept application efficiently detects Type I and II code clones with minor time trade-offs. To verify the trade-off, we ran an experiment comparing the time usage with and without the normalizing process. Additionally, the accuracy of Type II code clone detection was investigated. The results showed an improvement in precision and recall, with an average time overhead of 0.0005 seconds compared to the process without normalizing.
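The normalize-then-fuzzy-hash idea can be sketched as follows in Python, using the ssdeep bindings as an assumed dependency. The normalization shown is far cruder than the project’s and serves only to show why Type I/II clones end up with matching hashes.

```python
# Crude normalization (drop comments, canonicalize identifiers, collapse
# whitespace) followed by fuzzy hashing, so renamed/reformatted clones compare
# as highly similar. ssdeep is an assumed dependency.
import re
import ssdeep

KEYWORDS = {"int", "return", "for"}

def normalize(source: str) -> str:
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)      # block comments
    source = re.sub(r"//[^\n]*", "", source)                   # line comments
    source = re.sub(r"\b[A-Za-z_]\w*\b",
                    lambda m: m.group() if m.group() in KEYWORDS else "ID",
                    source)                                     # canonical identifiers
    return re.sub(r"\s+", " ", source).strip()                  # collapse whitespace

a = "int total(int n) { int s = 0; for (int i = 0; i < n; i++) s += i; return s; }"
b = "int sum(int count) { int acc = 0; for (int j = 0; j < count; j++) acc += j; return acc; }"

score = ssdeep.compare(ssdeep.hash(normalize(a)), ssdeep.hash(normalize(b)))
print(f"similarity score: {score}/100")   # identical normalized strings -> 100
```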
Friday, May 24
CHHEANG DUONG
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
8:45 A.M.
Project: Generating Synthetic Data and Evaluating its Privacy
The field of Generative AI has continued to grow in popularity in recent years as organizations seek new ways to leverage synthetic data to optimize their processes. This technology is especially useful in areas like healthcare, where sharing data for research needs to balance keeping information private with making sure the data is still useful. The deployment of generative AI for data anonymization prior to public release is increasingly being considered by organizations. Synthetic data offers the benefit of preserving privacy while maintaining the utility of the data for research purposes. However, the assumption that synthetic data does not contain real data can create a false sense of security if the underlying machine learning models are inadequately configured. Thus, organizations must be able to assess the privacy risk of the synthetic data.
This study sought to establish the foundations for a comprehensive framework that evaluates the privacy risks associated with synthetic data. Our approach incorporates previous works on synthetic data generation and privacy assessment into a single workflow consisting of three phases. First, our Data Synthesis module generates synthetic data utilizing Conditional Tabular Generative Adversarial Networks (CTGAN) or Differentially Private CTGAN (DP-CTGAN). Next, our Privacy Attack module runs Membership Inference Attacks (MIAs) against the synthetic data to identify potential privacy leaks. As one of the three types of privacy attack strategies, MIAs aim to reverse engineer the machine learning model and deduce if a data sample is part of the original data. Finally, our Privacy Evaluation module analyzes the Data Utility and Privacy Defense of the synthetic datasets by translating the results of the Privacy Attack module into quantitative metrics.
The results of our testing provide valuable insights for future research in optimizing and adapting our workflow. For example, current literature on synthetic data privacy recommends leveraging Differential Privacy to reduce privacy risk, but it does not validate data utility; low data utility would essentially render the synthetic data useless. Our results found that while Differential Privacy significantly reduced the probability of a successful privacy attack, data utility also decreased considerably. This finding supports the need for further optimization of our Data Synthesis module to assist users in generating statistically similar synthetic data. In the future, we may be able to build a robust framework that would allow organizations to confidently generate and release synthetic datasets for public consumption without compromising individual privacy.
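Under stated assumptions (the ctgan package for generation and a crude nearest-neighbor distance test standing in for a real membership inference attack), the Python sketch below illustrates the first two phases of such a workflow on toy tabular data; it is not the project’s Data Synthesis or Privacy Attack module.

```python
# Generate synthetic rows with CTGAN, then compare how close members vs.
# non-members sit to the synthetic data as a crude membership signal.
import numpy as np
import pandas as pd
from ctgan import CTGAN
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
real = pd.DataFrame({
    "age": rng.integers(20, 80, 500),
    "charges": rng.exponential(5000, 500).round(2),
    "smoker": rng.choice(["yes", "no"], 500),
})
members, non_members = real.iloc[:250], real.iloc[250:]

model = CTGAN(epochs=10)
model.fit(members, discrete_columns=["smoker"])   # train only on "members"
synthetic = model.sample(500)

# Membership signal: distance from each real row to its nearest synthetic row.
numeric = ["age", "charges"]
nn = NearestNeighbors(n_neighbors=1).fit(synthetic[numeric])
d_members, _ = nn.kneighbors(members[numeric])
d_non, _ = nn.kneighbors(non_members[numeric])
print(f"mean NN distance  members: {d_members.mean():.1f}  "
      f"non-members: {d_non.mean():.1f}")
```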
SAJA ALSULAMI
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Thesis: A Study on the Effectiveness of Education and Fear Appeal to Prevent Spear Phishing of Online Users
A spear phishing attack is considered one of the most elaborate attacks in social engineering. It presupposes that an attacker designs a scam to obtain the personal information of specific users from their social media accounts. It involves a preliminary analysis of targeted users and their online behaviors, needed to persuade them that a malicious link or attachment was sent by a trusted person. This attack implies that human beings are the weakest link within a security system and that their vulnerabilities can be exploited. The most detrimental consequences of spear phishing attacks are financial losses, network compromises, loss of login credentials, and malware installation.
This quantitative study aims to examine the impact of education and fear appeals on users’ knowledge and ability to identify spear phishing attacks. Three interventions were implemented: an educational intervention, a fear appeal intervention, and a combined educational-fear appeal intervention; a control group was used for comparison purposes. The study was conducted as an online experiment with 726 participants, who were randomly assigned to the four groups. After the interventions, a test was administered to evaluate their knowledge and ability to identify spear phishing attacks and to compare the efficacy of each intervention group (educational, fear appeal, and combined educational-fear appeal) to the control group. The findings revealed no statistically significant differences in mean test scores across the four groups. The results indicate that further research is needed to develop an effective intervention program that would considerably enhance users’ knowledge of spear phishing attacks and their resilience to them.
Tuesday, May 28
ANTHONY JESUS BUSTAMANATE SUAREZ
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.; Discovery Hall 464
Thesis: Advancing Deep Packet Inspection in SDNs: A Comparative Analysis of P4 and OpenFlow Programmability
This thesis undertakes a critical examination of Deep Packet Inspection (DPI) capabilities within Software-Defined Networking (SDN) frameworks, emphasizing the comparative efficacy of the P4 programming language against the conventional OpenFlow protocol. OpenFlow, while foundational in SDN's evolution, exhibits notable constraints in the DPI domain, primarily due to its limited packet inspection depth, confined largely to the transport layer (Layer 4). In contrast, this research advocates for the adoption of P4 for its flexibility and programmability, extending DPI functionality to the application layer (Layer 7) and thereby addressing, and potentially surpassing, OpenFlow's limitations.
Employing a methodical approach, this study harnesses Open vSwitch and BMv2 (Behavioral Model version 2) switches to simulate real-world network scenarios. These simulations facilitate a head-to-head comparison of OpenFlow and P4 in executing DPI tasks, focusing in particular on HTTP and SQL traffic, common vectors for network threats. Through a comprehensive suite of protocols including OpenFlow, gRPC (Google Remote Procedure Call), and P4Runtime, the research crafts a robust DPI framework, complemented by a custom-developed controller designed for the BMv2 and P4 ecosystem. This controller marks a significant step in demonstrating DPI's operational viability and efficiency within an SDN environment, leveraging application-layer traffic management. The research culminates in three implementations of Deep Packet Inspection within SDN, each benchmarked to measure its advantages and disadvantages. With these implementations and benchmarks, we aim not only to validate P4's superiority over OpenFlow in managing DPI tasks but also to adapt packet-processing techniques dynamically to the ever-evolving landscape of network threats. By advancing SDN functionality beyond traditional layer boundaries, this thesis contributes significantly to the discourse on network security, management, and optimization, paving the way for future innovations in increasingly complex network environments.
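For readers unfamiliar with what Layer 7 inspection entails, the sketch below is a minimal, stand-alone Python illustration of application-layer payload matching for the two protocol families the study targets. It is not the P4 or OpenFlow code used in the thesis; the signature patterns and the sample packet are hypothetical placeholders.

```python
import re

# Hypothetical signatures for the two traffic types examined in the study.
SIGNATURES = {
    "http_request": re.compile(rb"^(GET|POST|PUT|DELETE) .+ HTTP/1\.[01]", re.MULTILINE),
    "sql_injection": re.compile(rb"(UNION\s+SELECT|OR\s+1=1|DROP\s+TABLE)", re.IGNORECASE),
}

def inspect_payload(payload: bytes) -> list[str]:
    """Return the names of signatures that match the application-layer payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]

sample = b"GET /items?id=1 UNION SELECT password FROM users HTTP/1.1\r\nHost: example.com\r\n"
print(inspect_payload(sample))   # ['http_request', 'sql_injection']
```

OpenFlow match fields stop at transport-layer headers, so logic of this kind has to live in the controller or, as the thesis argues, be expressed directly in a P4 pipeline.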
Wednesday, May 29
EDUARD RASKIN
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: American Automobile Association Penetration Test
Penetration testing simulates attacks against systems, networks, and applications. Such audits are a cornerstone of successful attack surface management programs, since they help organizations identify and close security gaps and meet industry-specific security standards and requirements. This document outlines the learning, preparation, technical work, and results of a black box security review conducted on the American Automobile Association of Washington's publicly accessible network infrastructure and company website. The goal of the assessment was to identify vulnerabilities that an outside attacker could leverage to gain unauthorized access and to assess the overall security posture of AAAWA's external perimeter. The test analyzed AAAWA's digital footprint, enumerated publicly accessible servers and authentication portals, and examined infrastructure and web application vulnerabilities. The work included an authentication, access control, and user management assessment. The key deliverable was a detailed penetration test report that outlined findings and provided recommendations based on the identified risks.
AUTUMN 2023
Friday, December 8
NEIL PRAKASAM
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.
Thesis: A System for Secure and Categorized Video-Sharing
Online video sharing is a phenomenon that continues to be used by an ever-growing share of the population. Preserving the privacy of videos shared online is of utmost importance, but there is one use case that has not yet been covered by current mainstream video sharing platforms. This project aims to provide the ability to categorize whether multiple videos are of the same event, so that users can share them only amongst others who were also present at the event and have video evidence. The main method of categorization is DNA sequencing, where video files are converted into literal DNA so they can be sorted into four categories: videos of the same event, the same space, the same activity, or completely different videos. The research has shown rather lackluster results; further optimization would be needed to distinguish among the four categories, let alone to determine whether two videos are of the same event. This paper introduces and implements multiple methods of doing so.
JAMES EARRON COOPER
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Peer Validated Proof-of-Presence for Crowdsensing Applications and Other Location Critical Apps/Services
An issue prevalent in crowdsourcing applications, but applicable to many other types of apps and services, is the challenge of authenticating a user's location, an issue sometimes referred to as proof-of-presence. A notable concern in crowdsourcing apps is the submission of data that was not legitimately collected by malicious users seeking the incentives being offered. Relying solely on GPS for location determination can be risky given the susceptibility of smartphones to GPS spoofing. Alternative methods, such as trilateration of cellular signals and databases of pre-mapped Wi-Fi access points, exist but often provide low accuracy.
This project attempts to address the proof-of-presence issue by comparing contextual data about the environment collected from all users to determine the ground truth and identify potential outliers. The system uses user locations and data about detected Wi-Fi signals to construct a dynamic "map" of Wi-Fi access points and employs a variety of probabilistic techniques to assign each user a "score" indicating the likelihood of the inputs being erroneous or fabricated. While the system achieves its goals under simplified and ideal laboratory conditions, real-world scenarios pose significant challenges. This project lays the foundation for a more robust and sophisticated system capable of addressing those challenges.
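As a toy illustration of the outlier-scoring idea (not the project's probabilistic techniques), the sketch below corroborates each user's reported Wi-Fi observations against everyone else's; the BSSIDs and users are hypothetical.

```python
from collections import Counter

# Hypothetical reports: each user submits the set of Wi-Fi BSSIDs observed at the claimed location.
reports = {
    "user_a": {"aa:01", "aa:02", "aa:03", "aa:04"},
    "user_b": {"aa:01", "aa:02", "aa:04", "aa:05"},
    "user_c": {"aa:02", "aa:03", "aa:04"},
    "user_d": {"ff:10", "ff:11"},   # likely fabricated: shares nothing with the others
}

# Build a simple "map": how many users report each access point.
ap_counts = Counter(bssid for seen in reports.values() for bssid in seen)

def presence_score(seen):
    """Fraction of a user's observations corroborated by at least one other user."""
    corroborated = sum(1 for bssid in seen if ap_counts[bssid] > 1)
    return corroborated / len(seen) if seen else 0.0

for user, seen in reports.items():
    print(user, round(presence_score(seen), 2))   # user_d scores 0.0 and stands out as an outlier
```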
SUMMER 2023
Thursday, July 27
SETH DON-HAO PHAM
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Evaluating Player Engagement in a Choose-Your-Own-Adventure Game Illustrating Personal Cybersecurity Awareness
The professional environment has seen great success in adapting serious games to raise cybersecurity awareness and skills in the workplace. These games provide scenario-driven experiences allowing players to explore and interact with cybersecurity skills without real-world consequences. Enterprise training requirements ensure employees engage with the games, a function not present for personal cybersecurity, leading the average person to not engage with this type of game in their free time. With an organically engaging game, the integration of cybersecurity scenarios can be introduced in an inviting context to the casual player, leading to higher engagement in cybersecurity awareness. This project evaluates the effectiveness of a cybersecurity game designed to entertain and engage players while increasing their cybersecurity awareness. Based on the feedback from an initial test group, three core concepts were critical for player engagement and enjoyment: an easy-to-handle UI; a fun, exciting story; and player text-length preference. The survey evaluated player preferences and the effect of their evaluation on the game. In addition, participants evaluated the game’s effectiveness based on the framework used in the previous study.
CLAIRE ANNA JENNINGS
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Developing an Entertaining Choose-Your-Own-Adventure Game Illustrating Personal Cybersecurity Awareness
Cybersecurity awareness and education improvements in professional settings substantially increased with the development of serious cybersecurity games. These games engage interactively with the player to teach awareness and skills around cybersecurity and show higher engagement rates over other types of corporate training. These games primarily train, not entertain, and the average person chooses entertainment over voluntarily playing a serious game. While 93% of American adults use the internet and 72% use social media, most Americans approach cybersecurity with fear, confusion, or apathy, resulting in a reluctance to find training for personal cybersecurity awareness and education. This project combines entertainment and cybersecurity awareness by using a popular Hollywood storytelling approach, Save the Cat, with cybersecurity lessons to create an entertaining useful game. Iterative development, informed by user studies, ensures the game maintains a “fun” factor while improving the player’s cybersecurity awareness. Integrating cybersecurity knowledge into stories, with the primary goal of entertaining, allows the average American adult to improve their personal cybersecurity through a positive experience. Based on project results, not all useful games need to be serious games, and the average person does not need to seek cybersecurity knowledge to gain awareness.
Friday, August 11
HARSHAVARDHAN KAKARLA
Chair: Dr. Yang Peng
Candidate: Master of Science in Cybersecurity Engineering
5:45 P.M.
Project: Quick Connection/Handoff In An Opportunistic Vehicular Edge Computing Environment
The rise of connected vehicles has led to an explosion of data generated by vehicular networks. However, transmitting this data to centralized cloud servers can result in high latency and network congestion. Vehicular Edge Computing (VEC) presents a promising solution by offloading data processing to edge servers situated in close proximity to vehicles. This project introduces a VEC framework that focuses on optimizing data offloading in connected vehicles, with a particular emphasis on quick connection and seamless handoff to ensure efficient edge computing in opportunistic environments.
The objective of this project is to design and implement a VEC framework that facilitates rapid data connections between connected vehicles and edge servers. The proposed system employs an intelligent, context-aware decision-making algorithm to dynamically choose edge servers based on network conditions, performance parameters, and vehicle proximity. By adopting this approach, the project aims to decrease data transmission time, reduce handoff delays, and enhance overall data processing efficiency, contributing to improved performance in opportunistic vehicular edge computing environments.
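A minimal sketch of what such a context-aware selection step might look like is shown below; the scoring weights, normalization ranges, and server values are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass

@dataclass
class EdgeServer:
    name: str
    rssi_dbm: float      # signal strength at the vehicle (proxy for proximity)
    latency_ms: float    # measured round-trip time
    load: float          # 0.0 (idle) .. 1.0 (saturated)

def handoff_score(s: EdgeServer, w_rssi=0.4, w_latency=0.4, w_load=0.2) -> float:
    """Higher is better. Each factor is normalized to a rough 0..1 range; weights are assumptions."""
    rssi_term = (s.rssi_dbm + 90) / 60            # maps roughly -90..-30 dBm to 0..1
    latency_term = max(0.0, 1 - s.latency_ms / 200)
    load_term = 1 - s.load
    return w_rssi * rssi_term + w_latency * latency_term + w_load * load_term

candidates = [
    EdgeServer("esp32-a", rssi_dbm=-48, latency_ms=35, load=0.7),
    EdgeServer("esp32-b", rssi_dbm=-65, latency_ms=20, load=0.2),
]
best = max(candidates, key=handoff_score)
print("hand off to:", best.name)
```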
The project leverages Internet of Things (IoT) communication protocols, including MQTT, to establish real-time connections between the client, cloud service, and edge servers. Multiple algorithms are used to optimize edge server selection for seamless handoffs. Android devices and ESP32 microcontroller modules act as clients and edge servers, while AWS IoT Core and DynamoDB serve as the cloud services. The implementation uses Java for Android application development and C++ for Arduino IDE programming.
Comprehensive experimentation and testing of the VEC framework were conducted by installing the APK on multiple Android devices used as clients, with ESP32 microcontrollers acting as edge servers. The results showcase significant improvements in connection and handoff delays. The system demonstrates exceptional performance in managing varying network conditions and ensuring seamless data processing at edge servers, enhancing the overall efficiency of connected vehicles.
The project’s findings highlight the effectiveness of Vehicular Edge Computing in addressing the challenges of connection handoffs in opportunistic environments. By optimizing data connection and enabling seamless handoff, the VEC framework contributes to faster data processing, reduced latency, and enhanced overall network efficiency. The project’s emphasis on quick connection and handoff in an opportunistic setting underscores the potential of edge computing to cater to the dynamic nature of vehicular networks, making it a valuable contribution to the field of connected vehicle technology.
Future work will focus on extending the VEC framework to accommodate a larger number of connected vehicles and edge servers, enabling scalability for city-wide smart transportation applications. Additionally, the integration of advanced handoff algorithms and context-aware decision-making techniques could further enhance the system's performance and adaptability in opportunistic environments. Conducting field trials with live vehicular data would validate the framework's effectiveness in real-world scenarios and provide insights into its practical deployment.
SPRING 2023
Friday, May 26
NEIL PRAKASAM
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
8:45 A.M.
Thesis: A System for Secure and Categorized Video-Sharing
Online video sharing is a phenomenon that continues to be used by an ever-growing share of the population. Preserving the privacy of videos shared online is of utmost importance, but there is one use case that has not yet been covered by current mainstream video sharing platforms. This project aims to provide the ability to categorize whether multiple videos are of the same event, so that users can share them only amongst others who were also present at the event and have video evidence. The main method of categorization is DNA sequencing, where video files are converted into literal DNA so they can be sorted into four categories: videos of the same event, the same space, the same activity, or completely different videos. The research has shown promising results that can be further optimized to distinguish among the four categories and, ultimately, to determine whether two videos are of the same event.
NIHARIKA JAIN
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: FortifyML: Machine Learning Security Recommender for Everyone
Deep Neural Network (DNN) models have achieved remarkable performance in various applications, ranging from language translation to autonomous vehicles. However, studies have shown that DNN models are vulnerable to adversarial attacks, wherein malicious inputs are carefully crafted to deceive the models. Adversarial attacks pose a significant threat to the reliability and security of DNN models, especially in critical applications such as robotics, finance, text-to-speech, healthcare, and even national security. The number of research publications focused on adversarial attacks and defenses, and the sophistication of their approaches, has grown immensely since the first such publication in 2013. Even with an abundance of research papers on adversarial attacks, there is a lack of tools or systems that can coherently and systematically point a researcher or user to the specific defenses that could strengthen their individual use case, especially for beginners in the machine learning domain. In this paper, we extend FortifyML, an existing machine learning security recommender for everyone. We accomplished the following with this project: 1) successfully extended the recommender system to support DNN models in the Natural Language Processing (NLP) domain, making it a valuable tool for researchers and practitioners in the field of machine learning; 2) simplified the user interface to make the system accessible to everyone, including beginners; 3) added suggested links to articles and academic papers to direct users to additional details regarding potential attacks or mitigation strategies; and 4) based the system's recommendations on real-world statistics collected by running actual attacks and defenses in contained environments, so that they can act as guidelines out of the box. As a result, FortifyML will help machine learning engineers and practitioners secure their models and applications without the need for explicit security knowledge.
Tuesday, May 30
SANGYOON JIN
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: A Study on the Password Usage Behavior of Online Users
This project aims to measure online users’ password usage behavior and examine the relationship between it and its antecedent factors based on Protection Motivation Theory (PMT). PMT is a representative theory that explains the process of changing protection motivation according to threat messages in the health science field and has been extended to the area of information security. This project uses PMT to explain how motivation for password usage behavior is formed against the threat of password compromise. While previous studies applying PMT have observed its relevance concerning protective behavioral intentions, such as the intention to comply with information security guidelines, this project focuses on the relevance concerning password usage behavior in terms of password security strength.
The Qualtrics survey platform and Amazon Mechanical Turk were used to create and distribute a survey. In addition, the survey draws on the Rockyou.txt file, which contains passwords leaked from past online users. The results suggest that different password usage behaviors can be identified according to the characteristics of each user. In addition, multiple regression analysis reveals relationships between the PMT model and password usage behavior. At the same time, we found that the explanatory power of the antecedents can be enhanced in an extended PMT model that also considers the information security climate of the organization to which the online users belong. These findings suggest the need to consider new research models for future research in the field of password-based information security. Furthermore, these results can contribute to providing customized password policies to organizations that ultimately need to improve information security.
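For readers unfamiliar with this style of analysis, the sketch below shows a multiple regression of a password-strength proxy on hypothetical PMT construct scores using statsmodels. The constructs, effect sizes, and data are placeholders, not the survey results.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 300  # hypothetical respondents

# Hypothetical PMT construct scores (e.g., Likert-scale means) and an outcome proxy
# for password strength; real survey data would replace these.
perceived_severity      = rng.normal(5.0, 1.0, n)
perceived_vulnerability = rng.normal(4.0, 1.2, n)
response_efficacy       = rng.normal(4.5, 1.1, n)
self_efficacy           = rng.normal(4.2, 1.0, n)
password_strength = (0.3 * response_efficacy + 0.4 * self_efficacy
                     + rng.normal(0, 1.0, n))

X = sm.add_constant(np.column_stack(
    [perceived_severity, perceived_vulnerability, response_efficacy, self_efficacy]))
model = sm.OLS(password_strength, X).fit()
print(model.summary())   # coefficients and p-values for each antecedent
```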
PAUL BEARD
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Two Factor Message Authentication for Entry/Exit Operations During Autonomous Vehicle Platooning
As the use of autonomous vehicles becomes more prevalent, ensuring secure and reliable communication between the vehicles is crucial. One important aspect of this communication is message authentication during the docking and undocking process, which involves verifying the identity of the vehicles so that the origin of the message can be established.
An evolution of that autonomy involves vehicular operations known as platooning. This transit method involves multiple autonomous vehicles that are connected, either logically or physically, and behave as a single unit. A particularly vulnerable period during the platooning process occurs when a vehicle enters or exits the platoon. This capstone paper discusses available methods to securely communicate during the entry and exit functions of those platooning operations.
The performance of various authentication methods has been analyzed based on security, computational complexity, and communication overhead. Additionally, the implementation feasibility of each method has been assessed for the docking/undocking process.
Overall, this paper will contribute to the body of knowledge on secure message authentication in autonomous vehicles and provide insights into the best practices for ensuring secure and trustworthy communication between autonomous vehicles during the docking and undocking process. Ultimately, this will help ensure the safety and security of autonomous vehicles and their passengers.
Thursday, June 1
ADITYA SIDDALINGASWAMY
Chair: Dr. Erika Parsons
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: Cybersecurity Framework for an Edge Computing Medical Application for Stroke Patient Rehabilitation
This project aims to design and develop a cybersecurity framework for an Edge Computing ecosystem used in a medical setting. In the scope of this project, the focus is a medical application for the rehabilitation of stroke patients. Such patients often experience spatial neglect, a condition that significantly affects their functional recovery and quality of life, making rehabilitation crucial. This work is part of a larger project to make use of the aforementioned Edge Computing ecosystem, which, in the long term, is geared toward making the rehabilitation experience more enjoyable for patients while collecting data to help medical providers monitor patient progress and design individualized treatment plans. The design of the ecosystem is based on technologies such as Edge Computing, Cloud Computing, IoT, Kubernetes, and, from a medical standpoint, Electroencephalography. The use of all these technologies, particularly in a medical environment, means that it is of the utmost importance to address cybersecurity risks to ensure patient data security and privacy. The project's goal is to create a strong cybersecurity framework that protects patient data from unauthorized access and prevents data breaches while promoting collaboration among healthcare providers and technology experts. The project focused on studying the current importance of cybersecurity in the medical industry, the potential applications of edge computing, and the importance of collaboration and teamwork in developing technological solutions. By achieving the cybersecurity work objectives, the project has the potential to enable current and future efforts to improve the quality of stroke patient rehabilitation methods.
Friday, June 2
MATT DIOSO
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: CSI: Channel State Investigation, A Device Localization System based on Physical Layer Properties
Hidden streaming devices are becoming a more widespread issue as these types of devices become smaller and more accessible. Existing methods for localizing these devices in an environment require a user to traverse the area of interest while monitoring network traffic to draw correlations between digital spikes and physical location. These methods report a localization time ranging from 5 to 30 minutes depending on the size of the environment and the number of reference points throughout the area.
This work presents a system that greatly reduces this localization time and removes the need for the user to traverse an entire area, thus enabling detection in a wider variety of locations and situations. RGB images are used to derive depth information about the environment, which provides the bounds within which the streaming device can be located. Localization time is greatly reduced by leveraging Channel State Information (CSI), a physical-layer characteristic of transmitted signals that has been shown to be more temporally stable than RSSI and provides richer, fine-grained data from which to learn position (a minimal sketch of this kind of CSI-based position regression follows the list below). The results from this work show the following:
- Localization precision within 1.9m of device’s true location
- 0.98 F1 score with 0.98 recall and precision
- Removed physical requirement for users to traverse an area for localization efforts
- Localization estimation time greatly reduced from 5 – 30 minutes down to 30 seconds
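The sketch below illustrates the general shape of a CSI-based position regression pipeline (feature vectors in, coordinates out) using scikit-learn on synthetic data. It is not the system described above; with random placeholder data the reported error is meaningless, and only the pipeline structure is indicative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
# Hypothetical dataset: each row is a CSI amplitude vector (one value per subcarrier)
# captured at a known transmitter position; positions are (x, y) in meters.
n_samples, n_subcarriers = 2000, 56
csi = rng.normal(size=(n_samples, n_subcarriers))
positions = rng.uniform(0, 10, size=(n_samples, 2))

X_train, X_test, y_train, y_test = train_test_split(csi, positions, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

pred = model.predict(X_test)
errors = np.linalg.norm(pred - y_test, axis=1)   # Euclidean localization error per sample
print(f"median localization error: {np.median(errors):.2f} m")
```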
BALAPRASANTH RAMISETTY
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Developing Individual Awareness on Phishing Towards Mitigating Increased Cases of Email Phishing
In today’s digital landscape, safeguarding our online safety and security is of utmost importance. This capstone report delves into the issue of phishing attempts, specifically focusing on email phishing, and explores effective measures to prevent and mitigate their impact. To gather insights, a comprehensive questionnaire was administered using the user-friendly Qualtrics survey platform. Participants were engaged with an embedded awareness video and provided valuable feedback and perspectives. The findings highlight the widespread occurrence of phishing attempts and underscore the significance of understanding their characteristics, identifying suspicious elements in emails, and recognizing different types of phishing attacks. Education and awareness emerge as critical factors in empowering individuals to effectively combat phishing attempts. The research findings contribute to the existing body of knowledge on phishing prevention, offering practical recommendations for individuals and organizations to bolster their resilience. By leveraging the Qualtrics platform and incorporating an awareness video, the survey methodology comprehensively captures participants’ perspectives, providing a deeper understanding of email phishing. This capstone report serves as a valuable resource for individuals, organizations, and security professionals seeking to tackle the persistent threat of phishing attacks. It presents insights, trends, strategies, and preventive measures to safeguard personal and sensitive information.
WINTER 2023
Thursday, February 28
VINCENT SCHIARELLI
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Evaluating the Public Perception of a Blockchain-Based Election
The concept of voter confidence was introduced into the political domain after the contentious recount of the 2000 United States presidential election results in Florida. Twenty years after this election, the concept of voter confidence has made headlines again as a record-low number of voters express confidence that votes will be accurately cast and counted nationwide. Even in the absence of specific security concerns regarding vote tabulation, the low voter confidence elicited by our existing voting infrastructure has impacts on our democratic institutions. As an alternative to existing voting infrastructure, some have proposed incorporating blockchain solutions into electoral systems. While blockchain could add transparency through mechanisms such as the public ledger and decentralized accounting, blockchain's impact on voter confidence may not be straightforward. This project seeks to evaluate the public's confidence in the ability of a blockchain-based voting system to fairly and accurately tabulate votes. To measure this confidence, the Technology Acceptance Model was leveraged to quantify the relationships between individuals and their perception of blockchain technology, and a between-groups experiment was performed to measure these relationships.
Thursday, March 2
CAROLYNE SIBOE
Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
8:45 A.M.
Project: Towards Zero Trust Model Using Software Defined Perimeter in 5G Based IoT Networks
The Internet of Things (IoT) has emerged as one of the most significant technologies in recent years, with the vision of providing ubiquitous connectivity to devices, things, and people 'anytime' and 'anywhere'. This increased connectivity of many devices generates massive volumes of data requiring high bandwidth. The evolution of fifth-generation cellular networking (5G), with its high throughput, extended coverage, reliable communication, and lower latency, has the potential to meet the continuously increasing demands of future IoT services and is thus becoming a major driving force for IoT. While the use of 5G technology for the Internet of Things is gaining interest, research also points out various security issues in these networks. These attacks include advanced scanning, denial-of-service (DoS) attacks, wireless jamming, sensitive data exposure, brute-force attacks, DDoS, man-in-the-middle attacks, and session hijacking, among others. These security problems, if not addressed, will severely impede the deployment of 5G-based IoT networks. Since 5G technology and its integration with IoT networks continue to evolve, there exists limited research on the security of these networks. While some of the initial security research in these networks has focused on authentication and AI-based privacy solutions, most of these solutions are conceptual in nature, with no real implementation. The aim of this capstone project is to simulate and implement a zero-trust security framework based on Software Defined Perimeter (SDP) that can be used to prevent security attacks in 5G-based IoT networks. We leverage the existing open source solution provided by Waverley Labs to evaluate and verify whether it works on 5G SA (stand-alone) based IoT. The proposed framework ensures end-to-end security that helps reduce the risk exposure of IoT devices in a 5G SA network. Our results show that the proposed zero-trust SDP solution successfully protects an IoT application server running on a 5G SA network against various attacks and vulnerabilities. This confirms that SDP is an effective way of implementing a zero-trust security framework in 5G IoT based applications.
Monday, March 6
JAYANT JIRAGE
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
10:00 A.M.
Project: Ethical Penetration Test for AAA Washington
Penetration testing, which simulates an attack from the viewpoint of the attacker, is a well-known ethical hacking technique for actively examining and assessing the security of a network or information system. Modern industry standards suggest that vulnerability scanning and penetration testing are crucial for a company to maintain a secure posture as cyberattacks escalate. Organizations functioning as payment gateways must conduct frequent penetration testing on their infrastructure to comply with industry standards. AAA Washington has expressed the need for an external penetration test to assess their internet-facing resources. This project aimed to carry out such a test by establishing a custom penetration testing methodology for the organization while creating a repeatable procedure for subsequent work. The test identified weak configurations and vulnerabilities in assets controlled by AAA Washington and exploited them. The findings, executive summary, and detailed remediation recommendations were forwarded to AAA Washington for remedial action. Following the completion of corrective actions, a validation test was conducted to ensure all vulnerabilities had been patched per the specified remedial actions, completing the penetration testing life cycle.
Tuesday, March 7
MATTHEW HEWITT
Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Connecting Security Design Patterns To Source Code
All software in the tech industry needs to be secure. The source code of each piece of software contains sections that implement security patterns meant to strengthen the security of the overall product. Security patterns serve as the reusable building blocks of a secure software architecture, providing solutions to particular recurring security problems in different contexts. Incomplete or nonstandard implementations of security patterns are enticing to attackers and open the door for vulnerabilities within the infrastructure to be discovered and exploited. Therefore, being able to detect these security patterns improves the quality of security features and prevents vulnerabilities from occurring in the future. In this paper, we propose a means of connecting security design patterns to source code through the use of software tools such as Visual Paradigm, PatternScout, CodeOntology, and Apache Jena. By using open source Java projects as a testing ground, it is possible to discover security design patterns scattered throughout a project, saving programmers and software engineers valuable time they would otherwise spend finding them manually. We discuss the underlying architecture of PatternScout, its functionality for converting UML diagrams to SPARQL queries, and how CodeOntology can take source code from each open source project and generate a meaningful set of RDF triples for parsing. The results are queries that represent portions of source code where the security design patterns exist, simplifying the search for these patterns across various projects and allowing quicker action in ensuring a more secure piece of software.
Thursday, March 9
JOSEPH KHAMSENE TSAI
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Identification and Operationalization of Key Threats and Mitigations for the Cybersecurity Risk Management of Home Users
The negative impact of cybercrime against organizations and home users is on the rise. While organizations can utilize dedicated risk management teams to leverage robust and holistic risk management frameworks to address cybersecurity risks, the standard home user does not have access to such resources. This research leveraged the Delphi Technique over three separate surveys sent to cybersecurity professionals to identify what key threats and corresponding mitigations would be most important to home users. The identified threats and mitigations were then operationalized into a personalized security recommendations tool. The tool allows users to answer questions related to their security preferences and priorities and returns relevant results for management of their own security risk. As a result of this research, future research opportunities have been identified in the realm of cybersecurity risk management for both home users and organizations.
AUTUMN 2022
Wednesday, November 30
PETER VAN EENOO
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: Securing WireGuard with an HSM
WireGuard is a popular, secure, and relatively new VPN implementation that has seen widespread adoption. WireGuard's basic key management in the reference implementation leaves some weaknesses that threat actors could exploit to steal keys, compromising a user's identity or exploiting their privileged access. In my project, I combined the industry-standard practice of isolating sensitive data with cutting-edge support for Curve25519 keys on an HSM. I created a WireGuard-compatible fork called WireGuard-HSM, which uses the PKCS#11 interface to securely access a user's private key and perform privileged operations on a USB security key. After performing two threat model analyses and comparing the results, I show how my modifications improve the security of the WireGuard system by decreasing the attack surface and mitigating two vulnerabilities when the host computer is compromised. WireGuard-HSM's security improvements come without a noticeable performance penalty.
SUMMER 2022
Thursday, July 14
CHRISTIAN DUNHAM
Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Thesis: Adversarial Trained Deep Learning Poisoning Defense: SpaceTime
Smart homes, hospitals, and industrial complexes are increasingly reliant on Internet of Things (IoT) technology to unlock doors, regulate insulin pumps, or operate critical national infrastructure. While these technologies have enabled improvements that were not achievable before IoT, the increased adoption of IoT has also expanded the attack surface and increased the security risks in these landscapes. Diverse IoT protocols and networks have proliferated, allowing these tiny sensors with limited resources both to create new edge networks and to deploy at depth into conventional internet stacks. The diverse nature of IoT devices and their networks has disrupted traditional security solutions.
Intrusion Detection Systems (IDS) are one security mechanism that must adopt a new paradigm to provide measurable security in this technological evolution. The diverse resource limitations of IoT devices and their enhanced need for data privacy complicates centralized machine learning models used by modern IDS for IoT environments. Federated Learning (FL) has drawn recent interest adapting solutions to meet the requirements of the unevenly distributed nodes in IoT environments. A federated anomaly-based IDS for IoT adapts to the computational restraints, privacy needs, and heterogeneous nature of IoT networks.
However, many recent studies have demonstrated that federated models are vulnerable to poisoning attacks. The goal of this research is to harden federated learning models in IoT environments against poisoning attacks. To the best of our knowledge, poisoning defenses do not exist for IoT. Existing solutions to defend against poisoning attacks in other domains commonly utilize spatial similarity measurements such as Euclidean distance (ED), cosine similarity (CS), and other pairwise measures to identify poisoning attacks.
Poisoning attack methodologies have also adapted to IoT, evolving in ways that defeat these existing defensive solutions. This evolution creates a need for new defensive methodologies. In this work, we develop SpaceTime, a deep learning recurrent neural network that uses a four-dimensional spacetime manifold to distinguish federated participants. SpaceTime is built on a many-to-one time series regression architecture to provide an adversarially trained defense for federated learning models. Simulation results show that SpaceTime exceeds previous solutions against Byzantine and Sybil label-flipping, backdoor, and distributed backdoor attacks in an IoT environment.
SPRING 2022
Monday, May 16
MATTHEW SELL
Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: Designing an Industrial Cybersecurity Program for an Operational Technology Group
The design of a cybersecurity program for an Information Technology (“IT”) group is well documented by a variety of international standards, such as those provided by the U.S. National Institute of Standards and Technology (“NIST”) 800-series Special Publications. However, for those wishing to apply standard information security practices in an Operational Technology (“OT”) environment that supports industrial control and support systems, guidance is seemingly sparse.
For example, a search of a popular online retailer for textbooks on the implementation of an industrial cybersecurity program revealed only seven books dedicated to the subject, with another two acting as “how-to” guides for exploiting vulnerabilities in industrial control systems. Some textbooks cover the high-level topics of developing such a program, but only describe the applicable standards, policies, and tools in abstract terms. It is left as an exercise to the reader to explore these concepts further when developing their own industrial cybersecurity program.
This project expands on the abstract concepts described in textbooks like those previously mentioned by documenting the implementation of an industrial cybersecurity program for a local manufacturing firm. The project started with hardware and software asset inventories, followed by a risk assessment and gap analysis, and then implemented mitigating controls using a combination of manual and automated procedures. Security posture of the OT group was constantly evaluated against corporate security goals, the project-generated risk assessment, and NIST SP800-171 requirements. Improvements in security posture and compliance to corporate requirements were achieved in part through alignment with existing policies and procedures developed by the organization’s IT group, with the balance implemented and documented by the author of this project. The materials generated by this project may be used to assist other organizations starting their journey towards securing their industrial control assets.
Friday, May 20
JAYNIE A. SHORB
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: Malicious USB Cable Exposer
Universal Serial Bus (USB) cables are ubiquitous, with many uses connecting a wide variety of devices such as audio, visual, and data entry systems and charging batteries. Electronic devices have decreased in size over time, and they are now small enough to fit within the housing of a USB connector. There are harmless 100 W USB cables with embedded E-marker chips that communicate power delivery capabilities for sourcing and sinking current to charge mobile devices quickly. However, some companies have designed malicious hardware implants containing keyloggers and other nefarious programs in an effort to extract data from victims. Any system compromise that can be implemented with a keyboard is possible with such implants. This project designs a malicious hardware implant detector that senses the current draw of a USB cable, exposing these insidious designs. The Malicious USB Exposer is a hardware circuit implementation with common USB connectors to plug in the device under test (DUT). It provides power to the DUT and uses a current sensor to determine the current draw of the cable. The output is a red LED bargraph that shows whether the DUT is compromised: unless the DUT contains LEDs internally, any red LED output indicates compromise. Active long USB cables intended to drive long distances produce a false positive and are not supported. The minimum current sensed is 10 mA, which is outside the range of normal USB cables with LEDs (4-6 mA) and E-marker chips (1 mA). Though there is another malicious USB detector on the market, it is created by a malicious USB cable supplier and designed to detect their own cable. This project provides an open source solution for screening USB cables and uncovering a range of compromised cables from different vendors.
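The current thresholds quoted above can be restated as a simple software classification rule; the sketch below does exactly that and nothing more. The actual detector is an analog circuit driving an LED bargraph, so this is only an illustrative summary of the reported current ranges.

```python
def classify_usb_cable(current_ma: float) -> str:
    """Rough classification of a device-under-test by idle current draw,
    using the thresholds reported above (values in milliamperes)."""
    if current_ma >= 10.0:
        return "suspicious: possible hidden implant"
    if 4.0 <= current_ma <= 6.0:
        return "normal: cable with indicator LED"
    if current_ma >= 0.5:
        return "normal: E-marker chip"
    return "normal: passive cable"

for reading in (0.2, 1.0, 5.0, 23.0):
    print(f"{reading:5.1f} mA -> {classify_usb_cable(reading)}")
```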
Wednesday, May 25
ANKITA CHAKRABORTY
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.
Project: Exploring Adversarial Robustness Using TextAttack
Deep neural networks (DNNs) are susceptible to adversarial examples, which force deep learning classifiers to make incorrect predictions on input samples. In the visual domain, these perturbations are typically indistinguishable to human perception, resulting in disagreement between the classifications made by people and by state-of-the-art models. In the natural language domain, on the other hand, small perturbations are readily perceptible, and the change of a single word can substantially affect a document's semantics. In our approach, we perform ablation studies to analyze the robustness of various attacks in the NLP domain and formulate ways to alter the factor of "robustness", leading to more diverse adversarial text attacks. This work relies heavily on TextAttack (a Python framework for adversarial attacks, data augmentation, and adversarial training in NLP) to gauge the robustness of various models under attack from pre-existing or fabricated attacks. We offer various strategies to generate adversarial examples for text classification models that avoid out-of-context and unnaturally complex token replacements, which are easily identifiable by humans. We compare the results of our project with two baselines: random and pre-existing recipes. Finally, we conduct human evaluations with thirty-two volunteers from diverse backgrounds to verify semantic and grammatical coherence. Our research project proposes three novel attack recipes, namely USEHomogyphSwap, InputReductionLeven, and CompositeWordSwaps. Not only do these attacks reduce the prediction accuracy of current state-of-the-art deep learning models to 0% with a small number of queries, but they also craft text that is largely imperceptible to human annotators.
Wednesday, June 1
ROCHELLE PALTING
Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: A Methodology for Testing Intrusion Detection Systems for Advanced Persistent Threat Attacks
Advanced Persistent Threats (APTs) are well-resourced, highly skilled, adaptive, malicious actors who pose a major threat to the security of an organization's critical infrastructure and sensitive data. An Intrusion Detection System (IDS) is one type of mechanism used to detect attacks. Testing with a current and realistic intrusion dataset, promptly detecting and correlating malicious behavior at various attack stages, and utilizing relevant metrics are critical to effectively testing an IDS for APT attack detection. Testing with outdated and unrealistic data would yield results unrepresentative of the IDS's ability to detect real-world APT attacks. In this project, we present a testing methodology with a recommended procedure for preparing the intrusion dataset along with recommended evaluation metrics. Our proposed testing methodology incorporates a software program we developed which dynamically retrieves real-world intrusion examples compiled in the MITRE ATT&CK knowledge base, presents the list of known APT tactics and techniques for user selection into a scenario, and exports the attack scenario to an output file consisting of the selected APT tactics and techniques. Our testing methodology, along with the attack scenario generator, provides IDS testers with guidance on testing with a current and realistic dataset and with additional evaluation data points to improve the IDS under test. The benefits afforded to IDS testers include time saved in dataset preparation and improved reliability in evaluating their IDS's APT detection.
Thursday, June 2
CHRISTOPHER COY
Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.
Project: Multi-platform User Activity Digital Forensics Intelligence Collection
In today’s interconnected world, computing devices are employed for all manner of professional and personal activity, from implementing business processes and email communications to online shopping and web browsing. While most of this activity is legitimate, there are user actions that violate corporate policy or constitute criminal activity, such as clicking a link in a phishing email or downloading child sexual abuse material.
When a user is suspected of violating policies or law, a digital forensic analyst is typically brought in to investigate the traces of user activity on a system in an effort to confirm or refute the suspected activity.
Digital forensics analysts need the capability to quickly and easily collect and process key user activity artifacts that enable rapid analysis and swift decision making. The FORINT project was developed to provide digital forensics analysts with this very capability across multiple operating systems.
SARIKA RAMESH BHARAMBE
Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.
Project: New Approach towards Self-destruction of Data in Cloud
One of the most pressing issues faced by the cloud service industry is ensuring data privacy and security. Dealing with data in a cloud environment that leverages shared resources, as well as offering reliable and secure cloud services, necessitates a strong encryption solution with no or minimal performance impact. One approach to this issue is to introduce self-destruction of data, which mainly aims at protecting shared data. Encrypting files is a simple way to protect personal or commercial data. Using a hybrid RSA-AES algorithm, we propose a time-based self-destruction method to address the above difficulties and improve file encryption performance and security using file split functionality. Each data owner must set an expiration limit on content shared for collaboration, which takes effect once the file is uploaded to the cloud. Once the user-specified expiration period has passed, the sensitive information is securely self-destructed.
In this approach, we introduce the use of cloud channels, which help increase data security as we split the bits of each word and upload them in encrypted format. For this purpose, we use ThingSpeak, a cloud platform for visualization, analysis, and sharing of data through public and private channels. We experimentally test the performance overhead of our approach with ThingSpeak and use realistic tests to demonstrate the viability of our solution for enhancing the security of cloud-based data storage. For encryption and decryption, we use a hybrid RSA-AES algorithm. The results of the various experiments performed show that this approach offers higher efficiency, increased accuracy, better performance, and security benefits.
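A minimal sketch of the general hybrid RSA-AES plus expiry idea is shown below, using the Python cryptography library. It is not the project's ThingSpeak and file-splitting implementation; the key sizes, lifetime value, and the fact that expiry is only checked client-side here are illustrative assumptions (in practice the check would sit with whichever party holds or serves the wrapped key).

```python
import os
import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

def seal(plaintext: bytes, recipient_public_key, lifetime_seconds: int) -> dict:
    """Hybrid encryption with an embedded expiry: AES-GCM for the data, RSA-OAEP for the AES key."""
    aes_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    expires_at = int(time.time()) + lifetime_seconds
    aad = str(expires_at).encode()                     # bind the expiry time to the ciphertext
    ciphertext = AESGCM(aes_key).encrypt(nonce, plaintext, aad)
    wrapped_key = recipient_public_key.encrypt(aes_key, OAEP)
    return {"ciphertext": ciphertext, "nonce": nonce,
            "wrapped_key": wrapped_key, "expires_at": expires_at}

def open_sealed(blob: dict, private_key) -> bytes:
    if time.time() > blob["expires_at"]:
        raise PermissionError("content has expired and should be destroyed")
    aes_key = private_key.decrypt(blob["wrapped_key"], OAEP)
    return AESGCM(aes_key).decrypt(blob["nonce"], blob["ciphertext"],
                                   str(blob["expires_at"]).encode())

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
blob = seal(b"sensitive shared document", key.public_key(), lifetime_seconds=3600)
print(open_sealed(blob, key))
```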
Master of Science in Electrical & Computer Engineering
SPRING 2024
Tuesday, May 14
CHERYL KUNG
Chair: Dr. Walter Charczenko
Candidate: Master of Science in Electrical Engineering
3:30 P.M.; Discovery Hall 464
Thesis: Low-Cost 3-D Printed Helical Antenna with Dielectric Support
Satellite-based internet connections require high-directivity, millimeter-wave phased array antennas to receive and transmit signals effectively. Phased array antennas for millimeter waves have historically been very expensive to manufacture. Exploring low-cost methods for manufacturing high-directivity antennas may bring down the costs of these systems, allowing more equitable access to the internet.
Helical antennas are a type of high-directivity antenna that can be used for these purposes. However, helical antennas are difficult to manufacture and scale due to the three-dimensional (3-D) shape of the helix conductor. New 3-D printing technology allows the creation of a dielectric support for the helical antenna element. This adds mechanical rigidity to the antenna and is feasible for high-volume manufacturing at a lower cost.
This thesis explores the design of a low-cost helical antenna using a 3-D printed dielectric core for mechanical support. The research in this thesis concludes that it is possible to design a helical antenna using low-cost dielectric materials with high relative permittivity at microwave frequencies. As a proof of principle, a 5 GHz helical antenna embedded in a solid dielectric was designed and modeled using electromagnetic field simulation software. At 5 GHz, the software simulations can be compared to helical antennas that are manufactured on conventional 3-D printers and commonly used resin dielectrics. The conclusion and results of the computer simulations show that helical antennas with dielectric support will radiate in the axial mode with high directivity and circular polarization.
SUMMER 2023
Friday, August 11
PRITAM BHANDARI
Chair: Dr. Seungkeun Choi
Candidate: Master of Science in Electrical Engineering
11:00 A.M.
Thesis: Characterization of Nafion Based Resistive RAM Devices
The development of computers in the modern era has escalated the race toward powerful and efficient memory devices. In the past hundred years, we have gone from using punch cards for a mere kilobyte of storage to an enormous capability of storing terabytes of data. This progress has mainly been possible due to advancements in the field of non-volatile memory devices and fabrication technologies. Through advanced miniaturization techniques and new materials, we have been able to dramatically reduce the size of memory devices while increasing storage capacity and computing performance.
Despite this remarkable progress, we are reaching a point of slower growth in the computing performance of MOSFET-based non-volatile memory devices, and it is becoming very difficult to further decrease their size. Hence, the next-generation memory technology must have the following features to meet the high computing performance demands of the artificial intelligence era: low power consumption, fast switching, non-volatility, and high-density fabrication. Resistive random-access memory (ReRAM) meets all of those requirements and is therefore considered one of the best candidates for next-generation memory technologies.
In this research, a ReRAM device with Nafion as the switching layer was fabricated. To characterize the resistive switching performance, Nafion was annealed at three different temperatures: 30°C, 60°C, and 90°C. To study the effect of different electrodes, we used two different bottom electrodes (Au and Cu) and Al as the top electrode. The devices with Cu as the bottom electrode exhibit good resistive switching properties, while the devices with Au as the bottom electrode show little or negligible switching. We found that switching performance is best when Nafion is annealed at 60°C. However, the experiments show wide variation in device performance even on the same substrate, indicating the importance of uniform film thickness and Nafion quality.
SUMMER 2022
Tuesday, August 9
MOOSA RAZA
Chair: Dr. Seungkeun Choi
Candidate: Master of Science in Electrical Engineering
8:45 A.M.
Thesis: Multilevel Resistive Switching in a Metal Oxide Semiconductor based on MoO3
Over the years, resistive random-access memory (ReRAM) has received considerable attention among emerging memory technologies due to its simple structure and fabrication process, cost-effective development, low power consumption, scalability, high throughput, and other attractive memory characteristics.
Multilevel switching operation of the stacked ReRAM device based on MoO3 was investigated using the compliance current control method, with the device exhibiting a memory storage density of 2 bits per cell. The device realized bipolar resistive switching with the high-resistance state (HRS) varying between 11.7 Ω and 90 Ω and the low-resistance state (LRS) between 3.89 Ω and 47 Ω, read at 0.01 V during the endurance characterization, exhibiting a variable OFF/ON ratio between 1.6 and 15. The device also showed insignificant variation in the switching voltages, with the set voltage (Vset) between 0.22 V and 0.27 V and the reset voltage (Vreset) between 0.15 V and 0.30 V, observed over 11 resistive switching cycles when swept between -0.5 V and +0.5 V.
In addition, the unique resistive switching behavior of a novel lateral ReRAM device based on MoO3 is reported, showing multiple set and reset voltages in both the positive and negative voltage regimes and maintaining consistency across the switching voltages; i.e., Vset A, Vset B, Vset C, Vreset A, and Vreset B were observed around -40 V, 40 V, -10 V, 40 V, and -40 V, respectively, throughout the 105 switching cycles. The device also exhibits a self-compliance property at much smaller currents of around a microampere (≅ 0.9 μA), making it suitable for a wide range of power applications. Further investigation is required to determine the plausible applications of the unique resistive switching properties achieved with the lateral ReRAM.
AUTUMN 2021
Friday, December 3
SHARMILA DEVI KANNIVELU
Chair: Dr. Sunwoong S. Kim
Candidate: Master of Science in Electrical Engineering
8:45 A.M.
Thesis: Privacy-Preserving Image Filtering and Thresholding Using Numerical Methods for Homomorphically Encrypted Numbers
Homomorphic encryption (HE) is an important cryptographic technique that allows one to directly perform computation on encrypted data without decryption. In HE-based applications using digital images, a user often encrypts a private image captured on a local device. This image can contain noise that negatively affects the results of HE-based applications. To solve this problem, this thesis paper proposes a HE-based locally adaptive Wiener filter (HELAWF). For small-sized encrypted input data, pixels that have no dependency when sliding a window are encoded into the same ciphertext. For division in the adaptive filter, which is not supported by conventional HE schemes, a numerical approach is adopted. Image thresholding is a method of segmenting a region of interest and is used in many real-world applications. Typically, image thresholding contains a comparison operation, but this operation is not supported in conventional HE schemes. To solve this problem, a numerical approach for comparison operation is used in the proposed HE-based image thresholding (HETH). The proposed HELAWF and HETH designs are integrated and implemented as a proof-of-concept client-server model. In practical HE schemes, the number of consecutive multiplications on encrypted data is limited. Therefore, the number of iterations of the numerical methods used in the integrated design is carefully chosen. To the best of the authors’ knowledge, this thesis paper is the first work that applies approximate division and comparison operation over encrypted data to image processing algorithms. The proposed solutions can address important privacy issues in image processing applications in internet-of-things and cyber-physical systems, where many devices are connected through a vulnerable network.
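The numerical approach mentioned above can be illustrated in plaintext: both division and comparison are replaced by iterations that use only additions and multiplications, which are the operations HE schemes support. The sketch below shows two such iterations; the iteration counts, initial guess, and input-scaling assumptions are illustrative, and the thesis applies this style of approximation over ciphertexts rather than over plain floats.

```python
def approx_reciprocal(a: float, iterations: int = 10, initial: float = 0.01) -> float:
    """Newton-Raphson iteration x <- x * (2 - a * x), which converges to 1/a using only
    additions and multiplications. 'initial' must satisfy 0 < initial < 2/a to converge."""
    x = initial
    for _ in range(iterations):
        x = x * (2 - a * x)
    return x

def approx_compare(a: float, b: float, k: int = 15) -> float:
    """Smooth approximation of the comparison (a > b): repeatedly squash the difference with
    the polynomial f(t) = 1.5*t - 0.5*t**3 so the result approaches 1, 0, or 0.5.
    Assumes the inputs have been pre-scaled so that a - b lies roughly in [-1, 1]."""
    t = a - b
    for _ in range(k):
        t = 1.5 * t - 0.5 * t ** 3
    return (t + 1) / 2

print(approx_reciprocal(37.0))    # ~0.027027, i.e. approximately 1/37
print(approx_compare(0.8, 0.3))   # ~1.0, i.e. 0.8 > 0.3
```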
SPRING 2021
Tuesday, June 1
COURTNEY CHAN CHHENG
Chair: Dr. Denise Wilson
Candidate: Master of Science in Electrical Engineering
5:45 P.M.
Thesis: Abnormal Gait Detection using Wearable Hall-Effect Sensors
Abnormalities and irregularities in walking (gait) are predictors and indicators of both disease and injury. Gait has traditionally been monitored and analyzed in clinical settings using complex video (camera-based) systems, pressure mats, or a combination thereof. Wearable gait sensors offer the opportunity to collect data in natural settings and to complement data collected in clinical settings, thereby offering the potential to improve quality of care and diagnosis for those whose gait varies from healthy patterns of movement. This paper presents a gait monitoring system designed to be worn on the inner knee or upper thigh. It consists of low-power Hall-effect sensors positioned on one leg and a compact magnet positioned on the opposite leg. Wireless data collected from the sensor system were used to analyze stride width, stride width variability, cadence, and cadence variability for four different individuals engaged in normal gait, two types of abnormal gait, and two types of irregular gait. Using leg gap variability as a proxy for stride width variability, 81% of abnormal or irregular strides were accurately identified as different from normal strides. Cadence was surprisingly 100% accurate in identifying strides that strayed from normal, but variability in cadence provided no useful information. This highly sensitive, non-contact Hall-effect sensing method for gait monitoring offers the possibility of detecting visually imperceptible gait variability in natural settings. These nuanced changes in gait are valuable for predicting early stages of disease and for indicating progress in recovering from injury.
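As a rough illustration of how cadence and its variability can be derived from a periodic sensor trace (not the study's data or exact pipeline), the sketch below detects magnet-pass peaks in a synthetic Hall-effect-like signal and computes strides per minute and the coefficient of variation of the stride interval.

```python
import numpy as np
from scipy.signal import find_peaks

# Hypothetical sensor magnitude trace sampled at 100 Hz; sharp peaks occur when the magnet
# on the opposite leg passes the sensor, i.e., roughly once per stride.
fs = 100
t = np.arange(0, 30, 1 / fs)
signal = np.abs(np.sin(2 * np.pi * 0.9 * t)) ** 8 \
         + 0.05 * np.random.default_rng(3).normal(size=t.size)

peaks, _ = find_peaks(signal, height=0.5, distance=int(0.4 * fs))
stride_times = np.diff(peaks) / fs                        # seconds between successive passes
cadence = 60.0 / stride_times.mean()                      # strides per minute
variability = stride_times.std() / stride_times.mean()    # coefficient of variation

print(f"cadence ~ {cadence:.1f} strides/min, variability ~ {variability:.2%}")
```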
WINTER 2021
Friday, March 12
RUOHAO “EDDIE” LI
Chair: Dr. Kaibao Nie
Candidate: Master of Science in Electrical Engineering
11:00 A.M.
Thesis: Improving Keywords Spotting in Noise with Augmented Dataset from Vocoded Speech and Speech Denoising
As more electronic devices include an on-device Keywords Spotting (KWS) system, producing and deploying trained models for keyword detection is becoming more demanding. Dataset preparation is one of the most challenging and tedious tasks in keywords spotting, as it requires a significant amount of time to obtain raw or segmented speech audio. In this thesis, we first propose a data augmentation strategy that uses a speech vocoder to artificially generate vocoded speech with different numbers of channels. Such a strategy can increase the dataset size by at least two-fold, depending on the use case. With the new features introduced by the different numbers of vocoder channels, a convolutional neural network (CNN) KWS system trained on the augmented dataset showed promising improvement when evaluated under a +10 dB SNR noisy condition. The same results were confirmed in a hardware implementation, demonstrating that vocoded speech used for data augmentation has the potential to improve KWS on microcontrollers. We further propose a neural-network-based speech denoising system that uses the Weighted Overlap-Add (WOLA) algorithm for feature extraction to enable more efficient processing. The proposed speech denoising system learns a regression from noisy speech to clean speech, converting noisy speech (as input) into clean speech (as output), so the input to the proposed KWS system is relatively clean speech. Furthermore, by changing the training target to vocoded speech, the same denoising system can convert noisy speech (as input) into vocoded speech (as output). The combination of speech denoising and vocoded-speech data augmentation achieved relatively high accuracy when evaluated under the +10 dB SNR noisy condition.
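One plausible way to picture the augmentation step, assuming a noise-excited channel vocoder of the kind used in cochlear-implant simulations, is sketched below; the band edges, filter order, carrier type, and sampling-rate assumptions are illustrative and not taken from the thesis.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

# Minimal noise-excited channel vocoder: split the signal into bands, extract
# each band's envelope, and re-impose it on band-limited noise. Band edges,
# filter order, and carrier are assumptions; assumes fs >= 16 kHz so the top
# edge stays below the Nyquist frequency.

def vocode(x, fs, n_channels=8, f_lo=100.0, f_hi=7000.0):
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    rng = np.random.default_rng(0)
    out = np.zeros(len(x), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        envelope = np.abs(hilbert(band))              # per-band temporal envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += envelope * carrier                     # envelope-modulated noise band
    return out / (np.max(np.abs(out)) + 1e-12)

# Augmentation idea: each clean keyword clip yields extra training examples,
# e.g. vocode(clip, fs, 8) and vocode(clip, fs, 16), at least doubling the set.
```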
SPRING 2019
Friday, June 7
FEIFAN LAI
Chair: Dr. Kaibao Nie
Candidate: Master of Science in Electrical Engineering
11:00 A.M.
Thesis: Intelligent Background Sound Event Detection and Classification Based on WOLA Spectral Analysis in Hearing Devices
Audio signals from real-life hearing devices typically contain background noise. The purpose of this thesis is to build a system model that can automatically separate background noise from noisy speech and then classify the background sound into predefined event categories. The thesis proposes using the weighted overlap-add (WOLA) algorithm for feature extraction and a feed-forward neural network for sound event detection. In this approach, a signal-energy trough detection algorithm is used to separate out speech gaps, which primarily contain background noise. To further analyze the noise signal's spectrum, the WOLA algorithm extracts spectral features by transforming a frame of the time-domain signal into frequency-domain data represented in 22 channels. A feed-forward neural network with one hidden layer is then used to recognize each event's distinctive spectral feature pattern and produce classification decisions based on confidence values. Recordings of 11 realistic background noise scenes (cafe, station, hallway …), mixed with human speech at a signal-to-noise ratio (SNR) of 5 dB, are used for training, so the neural network learns the mapping between spectral feature characteristics and sound event categories. After training, the neural network classifier is evaluated by measuring the accuracy of event classification. The overall detection accuracy reached 96%, while the event 'hallway' had the lowest detection rate at 85%. This detection algorithm can also improve noise reduction in hearing devices by applying distinct compensation gains that attenuate the noise-dominated frequency bands for each predefined event. In a preliminary evaluation experiment, the application of these gain patterns proved effective in reducing background noise, and combining them with an instant gain pattern produced improved results with noticeably attenuated noise and smooth spectral cues in the processed audio output.
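The sketch below illustrates the overall pipeline shape described above: per-frame spectral features pooled into 22 channels, an energy-trough rule for isolating speech gaps, and a one-hidden-layer classifier producing per-frame confidences. An FFT magnitude spectrum stands in for the WOLA filterbank here, and the layer sizes, trough threshold, and activations are assumptions for illustration only.

```python
import numpy as np

# Illustrative pipeline: frame the signal, pool spectral magnitudes into 22
# channels, keep low-energy (speech-gap) frames, classify with one hidden layer.
# An FFT stands in for the WOLA filterbank; thresholds and sizes are assumed.

def frame_features(x, frame_len=512, hop=256, n_channels=22):
    feats, energies = [], []
    win = np.hanning(frame_len)
    for start in range(0, len(x) - frame_len, hop):
        frame = x[start:start + frame_len] * win
        mag = np.abs(np.fft.rfft(frame))
        bands = np.array_split(mag, n_channels)       # crude 22-channel pooling
        feats.append(np.log10([b.mean() + 1e-9 for b in bands]))
        energies.append(float(np.sum(frame ** 2)))
    return np.array(feats), np.array(energies)

def speech_gap_frames(energies, rel_threshold=0.1):
    # "Energy trough" rule: keep frames well below the median frame energy,
    # which mostly contain background noise rather than speech.
    return energies < rel_threshold * np.median(energies)

def classify(feats, W1, b1, W2, b2):
    # One hidden layer with sigmoid units, softmax output over the event classes.
    h = 1.0 / (1.0 + np.exp(-(feats @ W1 + b1)))
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)            # per-frame confidence values
```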
SUMMER 2018
Friday, August 3
MALIA STEWARD
Chair: Dr. Seungkeun Choi
Candidate: Master of Science in Electrical Engineering
3:30 P.M.
Thesis: Development of Corrugated Wrinkle Surface for an Organic Solar Cell
There has been great interest in organic photovoltaics (OPVs) due to their potential for low-cost, high-throughput, large-area solar cells with a flexible form factor. Hence, the power conversion efficiency (PCE) of OPVs has improved dramatically over the past two decades. Although the PCE of OPVs now exceeds 10%, the efficiency of these thin-film solar cells is fundamentally limited by the ability of the photoactive layer to absorb the incident sunlight. The external quantum efficiency (EQE) describes this ability and rarely exceeds 70% for state-of-the-art OPVs, implying that only 70% of incident photons contribute to photocurrent generation. The EQE can be improved by trapping more light in the active layer, which is very challenging for thin-film photovoltaics.
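To make the EQE figure concrete, a back-of-the-envelope estimate of the short-circuit current implied by a flat 70% EQE is sketched below; the photon-flux value (roughly the AM1.5G flux below an assumed ~800 nm absorption edge) is an illustrative assumption, not a measurement from this work.

```python
# Rough link between EQE and photocurrent: short-circuit current density equals
# the elementary charge times the EQE-weighted incident photon flux. All numbers
# here are illustrative assumptions, not results from this thesis.

Q = 1.602e-19          # elementary charge (C)
PHOTON_FLUX = 1.6e21   # photons / (m^2 * s), ~AM1.5G below an assumed 800 nm edge
EQE = 0.70             # fraction of incident photons collected as carriers

j_sc = Q * EQE * PHOTON_FLUX               # A / m^2
print(f"J_sc ~ {j_sc / 10:.0f} mA/cm^2")   # ~18 mA/cm^2; raising EQE raises this directly
```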
In this research, I investigated optimization of organic solar cell fabrication by tuning a charge-carrier transport layer and developed a new metallization method to replace the vacuum-deposited silver electrode with electroplated copper, which is less expensive and better suited to industrial manufacturing. I also investigated a number of methods for fabricating an optimal wrinkle structure that can serve as a light-trapping vehicle for organic solar cells. I fabricated wrinkles on SU-8 polymer by controlling the softness of the SU-8. While wrinkles are generally produced after metal deposition, I found that a more suitable wrinkle profile can be fabricated before the metal deposition. Future work will focus on developing reproducible, scalable, and high-throughput wrinkle fabrication with an optimum profile and on demonstrating highly efficient organic solar cells through the enhanced light trapping provided by the wrinkles.