Thesis defence /computer-science/ en Master’s Thesis Presentation • Cryptography, Security, and Privacy (CrySP) • Compiler Support for Constant-Time Programs in LLVM /computer-science/events/masters-thesis-presentation-crysp-compiler-support-for-constant-time-programs-in-llvm <span class="field field--name-title field--type-string field--label-hidden">Master’s Thesis Presentation • Cryptography, Security, and Privacy (CrySP) • Compiler Support for Constant-Time Programs in LLVM</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Wed, 07/02/2025 - 15:43</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This master’s thesis presentation will take place online.</span></span></h2> <p><span><span><strong>Mehdi Aghakishiyev, Master’s candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisors</strong>: Professors N. Asokan, Meng Xu</span></span></p> <p><span><span>Side-channel attacks aim to extract sensitive information by monitoring the additional information generated during program execution, such as execution time or power consumption. Certain coding patterns, such as using secret data in control flow and memory addressing instructions, cause the execution time of the program to vary based on secret input, making the program vulnerable to timing-based side-channel attacks. 
Constant-time programming offers a defence against such attacks; however, it is difficult to implement manually as it requires tracking secret data through complex program logic.</span></span></p> <p><span><span>In this thesis, we propose an automated approach to generate constant-time programs based on static analysis and program transformations. First, we use taint tracking to monitor the flow of secret input through the program and mark branching and memory addressing instructions that depend on secret data. Then, we apply program transformation techniques such as branch linearization to remove these dependencies and produce constant-time code. We perform our analysis and transformations on LLVM IR and implement our tool as part of the LLVM Pass Infrastructure.</span></span></p> <p><span><span>To evaluate our tool’s effectiveness, we apply our analysis and transformations to programs from the OISA benchmark. We validate our results through BliMe, an architecture performing hardware-enforced taint tracking to prevent side-channel attacks.</span></span></p> <hr /><p><span><span><a href="https://uwaterloo.zoom.us/j/99555441781">Attend this master’s thesis presentation virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Wed, 02 Jul 2025 19:43:32 +0000 Joe Petrik 3982 at /computer-science Master’s Thesis Presentation • Formal Methods • Local Theories and Efficient Partial Quantifier Elimination /computer-science/events/masters-thesis-presentation-formal-methods-local-theories-and-efficient-partial-quantifier-elimination <span class="field field--name-title field--type-string field--label-hidden">Master’s Thesis Presentation • Formal Methods • Local Theories and Efficient Partial Quantifier Elimination</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> 
<span class="field field--name-created field--type-created field--label-hidden">Thu, 06/26/2025 - 11:59</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This master’s thesis presentation will take place in DC 2310 and online.</span></span></h2> <p><span><span><strong>Estifanos Getachew, Master’s candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisors</strong>: Professors Arie Gurfinkel, Richard Trefler</span></span></p> <p><span><span>Quantifier elimination is used in various automated reasoning tasks, including quantified SMT solving, exists/forall solving, program synthesis, model checking, and constrained Horn clause (CHC) solving. Complete quantifier elimination, however, is computationally intractable for many theories. The recent algorithm QEL shows a promising approach to approximate quantifier elimination, which has resulted in improvements in solver performance. QEL performs partial quantifier elimination with a completeness guarantee that depends on a certain semantic property of the given formula.</span></span></p> <p><span><span>In this thesis, we study local theories, focusing on their proof theoretic and semantic characterization. We identify a subclass of local theories in which partial quantifier elimination can be performed efficiently. By considerably generalizing the previous approach, we present T-QEL, a parametrized polynomial time algorithm that is relatively complete for this class of theories. The algorithm utilizes the proof theoretic characterization of the theories, which is based on restricted derivations. 
Finally, we prove that T-QEL is sound in general and relatively complete with respect to the identified class of theories.</span></span></p> <hr /><p><span><span>To attend this master’s thesis presentation in person, please go to DC 2310. You can also <a href="https://uwaterloo.zoom.us/j/98744965137">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Thu, 26 Jun 2025 15:59:20 +0000 Joe Petrik 3977 at /computer-science PhD Defence • Human-Computer Interaction • Concerto: Elementary VR/AR Interactions Augmented by Bimanual Input /computer-science/events/phd-defence-hci-concerto-elementary-vr-ar-interactions-augmented-by-bimanual-input <span class="field field--name-title field--type-string field--label-hidden">PhD Defence • Human-Computer Interaction • Concerto: Elementary VR/AR Interactions Augmented by Bimanual Input</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Thu, 06/19/2025 - 16:11</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This PhD defence will take place in DC 2314 and online.</span></span></h2> <p><strong><span><span>Futian Zhang, PhD candidate</span></span></strong><br /><em><span><span>David R. 
Cheriton School of Computer Science</span></span></em></p> <p><span><span><strong>Supervisors</strong>: Professors Jian Zhao, Keiko Katsuragawa</span></span></p> <p><span><span>While unimanual input is widely adopted in Virtual Reality/Augmented Reality systems with head-mounted displays, some unimanual interactions still require multiple steps to complete. Although the total task completion time may not be long, a high frequency of usage will still increase the overall activity duration and reduce user efficiency. However, when the user is executing unimanual input with the dominant hand, the non-dominant hand is mostly left unused, even though it could potentially help execute some steps synchronously with the dominant hand while maintaining an intuitive user experience. By harnessing the capabilities of both hands simultaneously, bimanual input can extend the interaction dimension, complementing existing unimanual methods for elementary tasks, like pointing, locomotion, and command access, which are frequently used in short-time tasks. This thesis, Concerto, explores how bimanual input can be leveraged to improve the performance of elementary tasks.</span></span></p> <p><span><span>Three projects examine how bimanual input can improve efficiency in VR/AR.</span></span></p> <p><span><span>In the first project, I present a new interaction technique called Conductor, which aims to improve pointing, one of the fundamental tasks in VR/AR, by leveraging bimanual input. Conductor is an intuitive, intersection-based 3D pointing technique where users utilize their non-dominant hand to adjust the cursor distance along a ray while pointing with their dominant hand. I evaluate Conductor against Raycursor, a state-of-the-art VR pointing technique, and demonstrate that Conductor offers superior performance in selection tasks.</span></span></p> <p><span><span>In the second project, I examine how Conductor can be adapted to another critical VR task: locomotion. 
I introduce Fly The Moon To Me (or Locomoontion for short), a technique that allows users to create a preview copy of the object they wish to approach, reposition it ideally using Conductor, and then seamlessly align the original object with this preview, along with the rest of the virtual environment. A teleportation experiment, in which participants locate a box and place another smaller box inside it, shows that Locomoontion is faster and requires less physical effort than the traditional Point & Teleport method, even with the Point & Tug modification.</span></span></p> <p><span><span>In the third project, I explore how bimanual input can improve command selection tasks through shortcut mechanisms. I introduce the Drum Menu, a shortcut menu inspired by traditional marking menus, featuring three input methods with both unimanual and bimanual options for 4-item and 8-item VR controller layouts. Users can select commands by rotating the joystick, drawing a stroke, or pointing in specific directions. Bimanual input provides simultaneous access to two menu levels. A controlled user study shows that bimanual variants are faster than unimanual ones in the 4-item layout. Users favour the bimanual joystick menu for the 4-item layout, though the 8-item layout shows increased error rates for advanced users.</span></span></p> <p><span><span>Together, this thesis pushes the boundaries of bimanual input, employing new interaction techniques to enhance the productivity of VR/AR users.</span></span></p> <hr /><p><span><span>To attend this PhD defence in person, please go to DC 2314. 
You can also <a href="https://uwaterloo.zoom.us/j/95773861145">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Thu, 19 Jun 2025 20:11:28 +0000 Joe Petrik 3972 at /computer-science PhD Defence • Software Engineering • Precise and Scalable Constraint-based Type Inference for Incomplete Java Code Snippets in the Age of Large Language Models /computer-science/events/phd-defence-se-precise-scalable-constraint-based-type-inference-for-incomplete-java-code-snippets-in-age-of-llms <span class="field field--name-title field--type-string field--label-hidden">PhD Defence • Software Engineering • Precise and Scalable Constraint-based Type Inference for Incomplete Java Code Snippets in the Age of Large Language Models</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Wed, 06/18/2025 - 16:07</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This PhD defence will take place in DC 2310 and online.</span></span></h2> <p><span><span><strong>Yiwen Dong, PhD candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisor</strong>: Professor Chengnian Sun</span></span></p> <p><span><span>Online code snippets are prevalent and are useful for developers. These snippets are commonly shared on websites such as Stack Overflow to illustrate programming concepts. 
However, these code snippets are frequently incomplete. In Java code snippets, type references are typically expressed using simple names, which can be ambiguous. Identifying the exact types used requires fully qualified names typically provided in import statements. Despite their importance, such import statements are only available in 6.88% of Java code snippets on Stack Overflow. To address this challenge, this thesis explores constraint-based type inference to recover missing type information. It also proposes a dataset for evaluating the performance of type inference techniques on Java code snippets, particularly large language models (LLMs). In addition, the scalability of the initial inference technique is improved to enhance applicability in real-world scenarios.</span></span></p> <p><span><span>The first study introduces SnR, a constraint-based type inference technique to automatically infer the exact type used in code snippets and the libraries containing the inferred types, to compile and therefore reuse the code snippets. Initially, SnR builds a knowledge base of APIs, i.e., various facts about the available APIs, from a corpus of Java libraries. Given a code snippet with missing import statements, SnR automatically extracts typing constraints from the snippet, solves the constraints against the knowledge base, and returns a set of APIs that satisfies the constraints to be imported into the snippet. When evaluated on the StatType-SO benchmark suite, which includes 267 Stack Overflow code snippets, SnR significantly outperforms the state-of-the-art tool Coster. SnR correctly infers 91.0% of the import statements, which makes 73.8% of the snippets compilable, compared to Coster’s 36.0% and 9.0%, respectively.</span></span></p> <p><span><span>The second study evaluates type inference techniques, particularly of LLMs. Although LLMs demonstrate strong performance on the StatType-SO benchmark, the dataset has been publicly available on GitHub since 2017. 
If LLMs were trained on StatType-SO, then their performance may not reflect how the models would perform on novel, real-world code, but rather result from recalling examples seen during training. To address this, this thesis introduces ThaliaType, a new, previously unreleased dataset containing 300 Java code snippets. Results reveal that LLMs exhibit a significant drop in performance when generalizing to unseen code snippets, with up to 59% decrease in precision and up to 72% decrease in recall. To further investigate the limitations of LLMs in understanding the execution semantics of the code, semantic-preserving code transformations were developed. Analysis showed that LLMs performed significantly worse on code snippets that are syntactically different but semantically equivalent. Experiments suggest that the strong performance of LLMs in prior evaluations was likely influenced by data leakage in the benchmarks, rather than a genuine understanding of the semantics of code snippets.</span></span></p> <p><span><span>The third study enhances the scalability of constraint-based type inference by introducing Scitix. Constraint-solving becomes computationally expensive using a large knowledge base in the presence of unknown types (e.g. user-defined types) in code snippets. To improve scalability, Scitix represents certain unknown types as Any, ignoring such types during constraint solving. Then an iterative constraint-solving approach saves on computation and skips constraints involving unknown types. Extensive evaluations show that these insights improve both performance and scalability compared to SnR. Specifically, Scitix achieves F1-scores of 96.6% and 88.7% on StatType-SO and ThaliaType, respectively, using a large knowledge base of over 3,000 jars. In contrast, SnR consistently times out, yielding F1-scores close to 0%. Even with the smallest knowledge base, where SnR does not time out, Scitix reduces the number of errors by 79% and 37% compared to SnR. 
Furthermore, even with the largest knowledge base, Scitix reduces error rates by 20% and 78% compared to state-of-the-art LLMs.</span></span></p> <hr /><p><span><span>To attend this PhD defence in person, please go to DC 2310. You can also <a href="https://uwaterloo.zoom.us/j/92161542549">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Wed, 18 Jun 2025 20:07:15 +0000 Joe Petrik 3970 at /computer-science PhD Defence • Data Systems | Differential Privacy • Towards Practicable and Efficient Private Learning /computer-science/events/phd-defence-data-systems-differential-privacy-towards-practicable-and-efficient-private-learning <span class="field field--name-title field--type-string field--label-hidden">PhD Defence • Data Systems | Differential Privacy • Towards Practicable and Efficient Private Learning</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Wed, 06/18/2025 - 09:47</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This PhD defence will take place in DC 2310 and online.</span></span></h2> <p><span><span><strong>Shubhankar Mohapatra, PhD candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisor</strong>: Professor Xi He</span></span></p> <p><span><span>Data privacy is one of the top concerns in data science. 
The notion of privacy that is most used in practice is differential privacy. It offers a guaranteed bound on the loss of privacy even under worst-case assumptions. Multiple algorithms have been built to perform learning tasks such as training machine learning models or generating synthetic data using differential privacy. In practice, these algorithms must run within the limit of an assigned privacy budget. This budget asks practitioners to limit the number of computations on the private dataset, including all routine procedures such as data cleaning, hyperparameter tuning, and model training. Several tools can perform these tasks in isolation when the dataset is non-private. However, these tools do not translate easily to differential privacy and often do not consider the cumulative privacy costs.</span></span></p> <p><span><span>In this thesis, we explore various pragmatic problems that a data science practitioner may face when deploying a differentially private learning framework from data collection to model training. In particular, we are interested in real-world data quality problems such as missing data, inconsistent data, wrongly labelled data, and machine learning pipeline requirements such as hyperparameter tuning. We envision building a general-purpose private learning framework that can handle real data as input and can be used in learning tasks such as generating a highly accurate private machine learning model or creating a synthetic version of the dataset with end-to-end differential privacy guarantees. We envision our work will make differentially private learning more accessible to data science practitioners and easily deployable in day-to-day applications.</span></span></p> <hr /><p><span><span>To attend this PhD defence in person, please go to DC 2310. 
You can also <a href="https://uwaterloo.zoom.us/j/97189612659">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Wed, 18 Jun 2025 13:47:02 +0000 Joe Petrik 3968 at /computer-science PhD Defence • Algorithms and Complexity • Optimal Graph Streaming Algorithms and Further Advances in Modern Models of Computation /computer-science/events/phd-defence-algorithms-and-complexity-optimal-graph-streaming-algorithms-and-further-advances-in-modern-models-of-computation <span class="field field--name-title field--type-string field--label-hidden">PhD Defence • Algorithms and Complexity • Optimal Graph Streaming Algorithms and Further Advances in Modern Models of Computation</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Fri, 06/13/2025 - 10:05</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This PhD defence will take place in DC 2314 and online.</span></span></h2> <p><span><span><strong>Vihan Shah, PhD candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisor</strong>: Professor Sepehr Assadi</span></span></p> <p><span><span>The rise of large-scale datasets across domains such as social networks, biological systems, and the web has made it increasingly important to understand how core graph problems can be solved under tight resource constraints. 
As these datasets grow, traditional algorithms that assume random access to the input become increasingly infeasible. This thesis explores how to process massive graphs efficiently under modern models of computation that address these limitations. The primary focus is on the streaming model, and this thesis also explores other modern models, including sublinear-time, fully dynamic, and oracle-based models.</span></span></p> <p><span><span>The first part of the thesis develops space-optimal algorithms for fundamental graph problems in the streaming setting. We study approximate minimum cut, k-vertex connectivity, maximum matching, minimum vertex cover, and correlation clustering across different streaming models—including insertion-only, dynamic, and random-order streams. By establishing tight upper and lower bounds—often matching up to constant or polylogarithmic factors—these results resolve several open questions in the streaming literature and characterize entire space-approximation trade-offs for some of these problems.</span></span></p> <p><span><span>The second part of the thesis expands the exploration to other modern models of computation, including sublinear-time algorithms, fully dynamic algorithms, and oracle-based models, such as learning-augmented algorithms and streaming verification. We begin by studying the 4-cycle counting problem in the fully dynamic model and show an improvement over the previous best algorithm, demonstrating that the previously assumed natural bound was not tight. In the sublinear setting, we examine the problem of estimating matching size and prove strong lower bounds against non-adaptive algorithms. For the maximum independent set problem in the learning-augmented model, we develop a new algorithm that achieves a significantly improved approximation factor in polynomial time. Lastly, we explore the streaming verification model, focusing primarily on connectivity problems. 
Together, these contributions deepen our understanding of the fundamental limits and possibilities of algorithm design for massive data under constrained computational resources.</span></span></p> <hr /><p><span><span>To attend this PhD defence in person, please go to DC 2314. You can also <a href="https://uwaterloo.zoom.us/j/96385496153">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Fri, 13 Jun 2025 14:05:34 +0000 Joe Petrik 3964 at /computer-science Master’s Thesis Presentation • Human-Computer Interaction • Exploring How AI-Suggested Politeness Strategies Influence Email Writing and Social Perception Among Native and Non-Native Speakers /computer-science/events/masters-thesis-presentation-hci-exploring-ai-suggested-politeness-strategies-influence-email-writing-social-perception-native-non-native-speakers <span class="field field--name-title field--type-string field--label-hidden">Master’s Thesis Presentation • Human-Computer Interaction • Exploring How AI-Suggested Politeness Strategies Influence Email Writing and Social Perception Among Native and Non-Native Speakers</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Thu, 06/12/2025 - 15:37</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This master’s thesis presentation will take place in DC 2310 and online.</span></span></h2> <p><span><span><strong>Zibo Selena Zhang, 
Master’s candidate</strong></span></span><br /><em><span><span>David R. Cheriton School of Computer Science</span></span></em></p> <p><span><span><strong>Supervisor</strong>: Professor Jian Zhao</span></span></p> <p><span><span>As AI writing assistants are increasingly used for interpersonal communication, they may have profound impacts on interpersonal relationships. Politeness is one important aspect of social communication that is grounded in people’s perception of relational dynamics and significantly shapes social interactions. We investigate how politeness strategies in AI-generated suggestions affect people’s email writing and alter their perception of the social situation.</span></span></p> <p><span><span>Through a within-subject online experiment (N = 52), we found that human writers tend to mirror the type of politeness strategies used in the AI suggestions when writing their own messages. Non-native English speakers are more affected by AI compared to native speakers, and this greater susceptibility to AI influence is partly mediated by higher reliance on AI tools. In addition, writers’ social perception is also influenced by AI. When writers are exposed to more deferential politeness strategy suggestions, they tend to perceive the social relationship as more distant. These findings highlight the need for better design of AI writing assistants that account for social contexts and individual differences.</span></span></p> <hr /><p><span><span>To attend this master’s thesis presentation in person, please go to DC 2310. 
You can also <a href="https://uwaterloo.zoom.us/j/98192658638">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Thu, 12 Jun 2025 19:37:55 +0000 Joe Petrik 3962 at /computer-science PhD Defence • Bioinformatics • Advancing Proteomic Analyses with Graph-Based Deep Learning: Protein Inference and DIA De Novo Peptide Sequencing /computer-science/events/phd-defence-bioinformatics-advancing-proteomic-analyses-graph-based-deep-learning-protein-inference-dia-de-novo-peptide-sequencing <span class="field field--name-title field--type-string field--label-hidden">PhD Defence • Bioinformatics • Advancing Proteomic Analyses with Graph-Based Deep Learning: Protein Inference and DIA De Novo Peptide Sequencing</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Tue, 06/10/2025 - 09:55</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This PhD defence will take place online.</span></span></h2> <p><span><span><strong>Zheng Ma, PhD candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisors</strong>: Professors Ali Ghodsi, Ming Li</span></span></p> <p><span><span>Proteomic analysis plays a central role in unraveling the complex molecular underpinnings of biological systems. 
However, traditional approaches to protein inference and peptide sequencing have been hampered by challenges such as data complexity, label scarcity, and spectral noise. In this thesis, we leverage advanced deep learning techniques to address these challenges, thereby expanding the efficacy of proteomic analyses.</span></span></p> <p><span><span>Our work is organized around three major contributions. First, we introduce GraphPI, a novel protein inference framework that redefines the inference problem as a node classification task within a tripartite graph structure. In GraphPI, proteins, peptides, and peptide-spectrum matches (PSMs) are modeled as interconnected nodes, while edges incorporate features such as peptide identification scores and a specialized peptide-sharing attribute. By harnessing a tailored graph neural network (GNN) architecture inspired by GraphSAGE, our approach effectively aggregates and propagates information across heterogeneous node types. Critically, GraphPI is trained in a semi-supervised manner using pseudo-labels generated from established protein inference methods, combined with hard negative decoy information. This training process not only circumvents the typical bottleneck of limited labeled data but also yields protein scores that generalize across diverse datasets, all while substantially reducing computational overhead relative to Bayesian network–based approaches. Experimental evaluations on multiple benchmark datasets demonstrate that GraphPI delivers competitive accuracy with significant speed improvements, thus paving the way for real-time applications in large-scale proteomic studies.</span></span></p> <p><span><span>Second, we present DIANovo, an innovative deep learning method designed to tackle the inherent complexities of Data-Independent Acquisition (DIA) data for de novo peptide sequencing. 
Unlike conventional de novo approaches that often struggle with the multiplexed nature of DIA spectra, DIANovo incorporates a suite of strategies to manage coelution and spectral noise. Our approach begins by constructing a spectrum graph that captures the mass differences between peaks. Next, a Transformer-based encoder, enhanced with Rotary Positional Embeddings (RoPE), processes the graph by encoding these mass differences along its edges, effectively treating the spectrum graph as fully connected. Furthermore, DIANovo introduces a coelution-aware pretraining stage, where the model is first optimized to predict ion types from coeluting peptides. This pretraining step equips the network with a nuanced understanding of spectral interferences, thereby improving the fidelity of subsequent peptide sequence predictions. In addition, a two-stage decoding strategy is employed: the first stage identifies an optimal path through the spectrum graph, while the second refines this path to generate a final amino acid sequence by filling in mass tags. Comparative analyses against state-of-the-art methods reveal that DIANovo achieves significant improvements in both amino acid and peptide recall, especially when applied to high-quality narrow-window DIA data obtained from next-generation instruments such as the Orbitrap Astral. Moreover, we investigate whether DIA identifies more peptides than DDA in de novo sequencing by comparing their performance on the same biological sample under varying acquisition modes and parameters. Our results demonstrate that DIA only outperforms DDA when employing narrower isolation windows.</span></span></p> <p><span><span>The third component of this thesis presents a comprehensive theoretical analysis that sheds light on the performance limits of peptide identification methods. 
By linking the signal-to-noise profile to peptide identification accuracy, our study elucidates the inherent trade-offs between Data-Dependent Acquisition (DDA) and DIA strategies. We derive quantitative metrics to predict peptide identification performance under a range of experimental conditions, and these predictions are validated against empirical data. This framework not only explains why Astral DIA data can provide superior peptide coverage in certain scenarios but also delineates the conditions under which peptide identification is most favorable. These insights are crucial for guiding the design of future mass spectrometry experiments and for optimizing computational pipelines in proteomic research.</span></span></p> <p><span><span>Collectively, the three contributions of this thesis demonstrate the transformative potential of integrating deep learning with advanced computational frameworks in proteomics. GraphPI and DIANovo both showcase how novel neural network architectures can overcome longstanding challenges in protein inference and de novo peptide sequencing, while the theoretical analysis provides a foundation for understanding and further refining these methodologies. The experimental results across multiple datasets underscore the robustness, efficiency, and generalizability of our approaches, suggesting that deep learning–based strategies will play an increasingly central role in the future of proteomic analysis.</span></span></p> <p><span><span>In conclusion, this work not only advances the state-of-the-art in protein and peptide identification but also offers practical solutions for handling large-scale, complex proteomic data. 
By bridging the gap between theoretical insights and practical implementations, our integrated framework lays the groundwork for enhanced biomarker discovery, more accurate disease diagnosis, and a deeper understanding of biological systems at the molecular level.</span></span></p> <hr /><p><a href="https://uwaterloo.zoom.us/j/98549170661">Attend this PhD defence virtually on Zoom</a>.</p> </div> </div> </div> </div> </section> Tue, 10 Jun 2025 13:55:50 +0000 Joe Petrik 3959 at /computer-science Master’s Thesis Presentation • Algorithms and Complexity • NP-hardness of Testing Equivalence to Sparse and Constant-support Polynomials /computer-science/events/masters-thesis-presentation-algorithms-and-complexity-np-hardness-testing-equivalence-sparse-constant-support-polynomials <span class="field field--name-title field--type-string field--label-hidden">Master’s Thesis Presentation • Algorithms and Complexity • NP-hardness of Testing Equivalence to Sparse and Constant-support Polynomials </span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Mon, 06/02/2025 - 14:47</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This master’s thesis presentation will take place in DC 2310 and online.</span></span></h2> <p><span><span><strong>Omkar Baraskar, Master’s candidate</strong></span></span><br /><em><span><span>David R. 
Cheriton School of Computer Science</span></span></em></p> <p><span><span><strong>Supervisors</strong>: Professors Éric Schost, Rafael Oliveira</span></span></p> <p><span><span>Given an n-variate polynomial f, presented as a list of its monomials, and an integer s, decide whether there exists an invertible transform A such that f(Ax) has fewer than s monomials. This problem is called equivalence testing to sparse polynomials (ETsparse). It was studied over Q in [Grigoriev, Karpinski 93], which gives an algorithm for the problem that runs in time exponential in n^4. The lack of progress on the complexity of the problem over the last three decades raises a question: is ETsparse hard? In this thesis, we answer this question in the affirmative by showing that ETsparse is NP-hard over any field.</span></span></p> <p><span><span>The sparse orbit complexity of a polynomial f is the smallest integer s_0 for which there exists an invertible transform A such that f(Ax) has s_0 monomials. Since ETsparse is NP-hard, computing the sparse orbit complexity is also NP-hard. We also show that approximating the sparse orbit complexity to within a factor of s_f^{1/3-e} is NP-hard for any e in (0,1/3), where s_f is the number of monomials in f. Interestingly, this approximation result is obtained without invoking the celebrated PCP theorem.</span></span></p> <hr /><p>To attend this <span><span>master’s thesis presentation in person, please go to DC 2310. 
You can also <a href="https://uwaterloo.zoom.us/j/97046657695">attend virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Mon, 02 Jun 2025 18:47:09 +0000 Joe Petrik 3954 at /computer-science PhD Defence • Human-Computer Interaction • Using a Capability Sensitive Design Approach to Support Newcomers Well-being /computer-science/events/phd-defence-hci-using-a-capability-sensitive-design-approach-to-support-newcomers-well-being <span class="field field--name-title field--type-string field--label-hidden">PhD Defence • Human-Computer Interaction • Using a Capability Sensitive Design Approach to Support Newcomers Well-being</span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"><span lang="" about="/computer-science/users/jpetrik" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Joe Petrik</span></span> <span class="field field--name-created field--type-created field--label-hidden">Wed, 05/28/2025 - 11:26</span> <section class="uw-contained-width uw-section-spacing--default uw-section-separator--none uw-column-separator--none layout layout--uw-1-col"><div class="layout__region layout__region--first"> <div class="uw-text-align--left block block-layout-builder block-inline-blockuw-cbl-copy-text"> <div class="uw-copy-text"> <div class="uw-copy-text__wrapper "> <h2><span><span>Please note: This PhD defence will take place online.</span></span></h2> <p><span><span><strong>Nabil Bin Hannan, PhD candidate</strong><br /><em>David R. Cheriton School of Computer Science</em></span></span></p> <p><span><span><strong>Supervisor</strong>: Professor Edith Law</span></span></p> <p><span><span>Newcomers transitioning to a new country face many challenges, and their well-being suffers when they must navigate an unfamiliar environment on their own. 
This thesis explores how Capability Sensitive Design (CSD) can be operationalized to guide the end-to-end design and evaluation of technologies that support the well-being of newcomers during life transitions. While the CSD framework has recently been investigated in Human-Computer Interaction (HCI) for its ethical focus on supporting what individuals have reason to value, there remains a gap in how it can be translated into concrete, scalable technology design processes.</span></span></p> <p><span><span>To address this, we present a multi-stage methodology that includes formative interviews, co-design sessions, prototype development, and a longitudinal field study to evaluate the application prototype. We begin by mapping the lived experiences of newcomers, using a capability-oriented interview protocol and a capability board to surface valued goals and challenges. This mapping informed a co-design process using modified capability cards, in which both newcomers and organizational stakeholders ideated design features aligned with the ten central capabilities. Drawing on these insights, we developed the Newcomer App, a multilingual mobile platform offering four core features: goal-oriented planning, capability-aligned suggestions, resource search and browsing, and reflective tracking. We evaluated this platform in an eight-week field study that included in-app activity logging and post-study interviews.</span></span></p> <p><span><span>Our findings show that newcomers were able to identify capability-aligned goals that they found helpful, translate them into intentional plans, and reflect on both their achievements and the conversion factors that influenced outcomes. Importantly, we observed how CSD-informed features fostered self-discovery, increased agency, and facilitated social contribution, particularly in the capabilities of social connection, emotional well-being, and community participation. 
The study also highlighted the importance of contextual and social barriers in determining whether users could turn suggestions into meaningful actions. This thesis contributes an operational model for applying CSD across the full design lifecycle, offering insights for researchers and practitioners. By translating ethical commitments into deployable technologies, our work extends prior research in HCI and design for social justice, demonstrating how technologies can support equitable pathways toward well-being for marginalized groups, such as newcomers navigating complex transitions.</span></span></p> <hr /><p><span><span><a href="https://uwaterloo.zoom.us/j/97988123539">Attend this PhD defence virtually on Zoom</a>.</span></span></p> </div> </div> </div> </div> </section> Wed, 28 May 2025 15:26:52 +0000 Joe Petrik 3951 at /computer-science