Fall 2023 Seminars

Fall 2023

Organizer: Laura Israel

GKAT with Indicator Variables, Decompilation Verification in Nearly Linear Time

Speaker: Cheng Zhang

Cheng Zhang is a final year PhD student at Boston University working with Professor Marco Gaboardi. He is generally interested in the connection between mathematics and computer science. Specifically, he is interested in applying mathematical techniques in algebra to real world problems in program semantics and verification. Currently, most of his work is focused on derivatives of Kleene Algebra; both on the theoretical aspect and their applications in program verification.

Abstract

Decompilation is not only a method to solve practical problems, like binary analysis, binary migration, and binary optimization; it also gives us understanding of theoretical problems like the correspondence between structured and unstructured programs. Although plenty of works have been done in this area, with many automated tools developed; there are not many tools that can automatically verify the correctness of the decompilation result on a per-decompilation basis. In this talk, we will present a scalable algorithm based on techniques developed in Guarded Kleene Algebra with Tests (GKAT). This algorithm will automatically decide the trace-equivalence between a flowchart and its decompiled program in nearly linear time, thus verifying the correctness of the decompilation process.

Completeness Thresholds for Memory Safety: Unbounded Guarantees via Bounded Proofs

Speaker: Tobias Reinhard

Tobias Reinhard is a final-year PhD candidate at KU Leuven (Belgium). His main research interest lies in program verification, specifically in separation logic. Tobias' current work focuses on the intersection between bounded and unbounded proofs. Previously, he developed ghost signals, a separation logic extension to verify liveness properties. Besides his work in academia, Tobias interned at Amazon's Automated Reasoning Group in 2022 and 2021. During this time, he verified FreeRTOS' multicore scheduler and explored strategies to progress from bounded to unbounded verification.

Abstract

We present the first study of completeness thresholds (CTs) for infinite state systems. Any bounded proof reaching its CT is sound. Thereby, CTs allow us to reduce unbounded proofs to bounded ones. Specifically, we focus on memory safety proofs for programs that iterate over arbitrarily large data structures in a memory-layout-preserving way. We present a generic approach to extract CTs from a program's verification condition and show that it scales. Moreover, we demonstrate that we can characterise CTs for interesting classes of array traversing programs.

The Entity-Component-System pattern for PL People

Speaker: Patrick Redmond

I am a PhD student in the Languages, Systems, and Data Lab at the University of California Santa Cruz, advised by Lindsey Kuper. My research explores the design, implementation, and verification of distributed systems and pure functional runtimes. I want to bring software verification to industry, where rapidly changing production systems are difficult to verify with traditional manual proof-writing techniques. I have several years of industry experience conceptualizing, developing, and deploying high-availability distributed-systems and machine prediction at scale for Twitch (Amazon), Prismatic (defunct), and Vlingo (Nuance).

Abstract

The Entity-Component-System software design pattern (ECS) is commonly used in industry, especially video gaming, where it provides modularity and performance benefits; however ECS hasn't much attention in the PL research community. We present a core calculus for ECS ("Core ECS") which represents the essence of the pattern in a formal way. We argue that, given a few additional restrictions, Core ECS is a deterministic concurrency model. Given these benefits we seek to implement a compilation strategy from pure functional programs to ECS programs.

Just Another Quantum Assembly Language (Jaqal)

Speaker: Benjamin C. A. Morrison

Benjamin C. A. Morrison is a Ph.D. student at the University of New Mexico and works in the Center for Computing Research at Sandia National Laboratories. They study quantum computing at all levels of the quantum software stack, from quantum programming languages to compilers, transpilers, error correction, quantum simulation algorithms, and occasionally complexity theory.

Abstract

The Quantum Scientific Computing Open User Testbed (QSCOUT) is a trapped-ion quantum computer testbed realized at Sandia National Laboratories on behalf of the Department of Energy's Office of Science and its Advanced Scientific Computing (ASCR) program. We invented Jaqal, for Just another quantum assembly language, to specify programs executed on QSCOUT. I'll discuss the capabilities of the Jaqal language, our approach in designing it, and the reasons for its creation.

egglog: Bringing e-graphs to Python

Speaker: Saul Shanabrook

Saul Shanabrook has been working in the Python data science ecosystem, contributing to projects like Jupyter, Vega, and Numba, working at Quansight and Linea, for clients such as Heavy AI, GlueViz, Kite, Amazon, and D. E. Shaw, and through grants from the Chan Zuckerberg Initiative, the Sloan Foundation, and NumFOCUS. He is currently applying for Ph.D. programs in computer science to foster greater collaboration between the open-community and programming languages research, for example through the design and implementation of tools for building embedded domain-specific languages in Python.

Abstract

E-graphs are a data structure from the automated theory-proving community. This talk will give an overview of what they are, how they can be used for program analysis and optimization, as well as their potential in supporting the Python scientific computing ecosystem and connecting it to researching programming languages.

Nested Family Polymorphism for Extensible Programming and Verification

Speaker: Anastasiya Kravchuk-Kirilyuk

I am a 4th year PhD candidate in computer science at Harvard University, advised by Nada Amin. My research during this time has been mainly in extensible programming, and I am now moving into modular verification. My interest in this topic comes from my own experience with mechanized metatheory projects in Coq. Even though theorem provers are already quite expressive, I believe we can still do a lot to make them more user-friendly and to reduce verification effort for large proof developments. Outside of work, you can find me spending time with my three cats or taking a slow walk in nature.

Abstract

Many obstacles stand in the way of modular, extensible code. Some language constructs, such as pattern matching, are not easily extensible in a sound way. Inherited code may not be type safe in the presence of extended types. The burden of setting up design patterns can discourage users, and parameter clutter can make the code less readable. Given these challenges, it is no wonder that modularity often gives way to code duplication. Our solution, Persimmon, is a functional system with nested family polymorphism, extensible variant types, and extensible pattern matching. Most constructs in our language are built-in “extensibility hooks,” cutting down on the parameter clutter and user burden associated with extensible code. Persimmon preserves the relationships between nested families upon inheritance and supports composable extensions, enabling extensibility at a large scale. In our current work, we are taking inspiration from Persimmon to implement modular, extensible verification in Dafny. We are building on Dafny’s existing module refinement mechanism, which has a lot of similarities to the way we handle extensibility in Persimmon. We hope that our implementation will lower the barrier to extensible verification by building it directly into the language.

Why programming languages matter

Speaker: Andrew Black

Andrew Black is an Emeritus Professor of Computer Science at Portland State University. His first programming language was Algol 60, an accident for which he will always be grateful. As a graduate student, he at first refused a request to teach Fortran to undergraduates, on the grounds that it was unfair to subject students to such a language; he relented when his supervisor (Tony Hoare) pointed out that if the students were going to learn Fortran from someone, it would be better if they learned it from a teacher who understood its many deficiencies. Later, but after his involvement with the Emerald programming language, he learned, taught and contributed to Smalltalk, an encounter that permanently re-arranged his brain cells. As well as teaching at four universities, Professor Black has worked in the laboratories of four major companies (two of which still exist). He has published over 100 research papers, and been involved in the design, development and implementation of six programming languages. For the past ten years he has been the BDFL of the Grace Educational Programming Language, and the major committer to the minigrace implementation.

Abstract

I've spent most of my professional life working on programming languages: studying them, designing them, defining their semantics, comparing features, and even implementing them (mostly poorly). Why would an otherwise-sane person do such a thing? Because programming languages are the ultimate meta-tool: they give you the ability to create from pure thought. And yet: a programming language is not just a means for programmers to communicate with computers—it is also a means for programmers to communicate with programmers. That is: a good programming language is a social, as well as a technical, enabler. This talk will take a bumpy ride through the challenges and opportunities associated with the six languages that I’ve played a part in developing over the past forty-five years—most of which you won’t have heard of. Why do new languages face an uphill struggle for acceptance? What are the challenges facing the next generation of languages and language designers? Why should you care about any of this? Because those who do not know the past are doomed to repeat it.

Pulse: A DSL for verified imperative programming implemented in F*

Speaker: Guido Martinez

Abstract

Pulse is a new DSL built on top of F* for verified imperative programming based on separation logic. It is implemented as an F* extension, with its own syntax and typechecker, and builds over the Steel separation logic in F*. Pulse's type checker is also verified in F* to only accept correct programs. I will show a bit of the architecture of Pulse and then work through some examples, including some small parallel programs.

HasChor: Functional Choreographic Programming for All

Speaker: Gan Shen

Abstract

Choreographic programming is a programming paradigm where one describes the complete behavior of a distributed system as a single program and then compiles it down to individual programs that run on each node. In this way, the generated programs are guaranteed to be deadlock-free. In this talk, I'll present HasChor, a library for functional choreographic programming in Haskell. HasChor is an embedded domain-specific language where choreographies are expressed as computations in a monad. The embedded approach is lightweight, enabling HasChor to inherit features and libraries from Haskell for free. Moreover, HasChor's implementation utilizes freer monads, which leads to a natural realization of endpoint projection (the compilation process).

Reining in finite types: a new way to prove constructive finiteness that might or might not be useful

Speaker: Katie Casamento

Flexible Security Policy Enforcement with Tagged C

Speaker: Sean Noble Anderson

Abstract

We introduce Tagged C, a novel C variant with built-in tag- based reference monitoring that can be enforced by hardware mecha- nisms such as the PIPE (Processor Interlocks for Policy Enforcement) processor extension. Tagged C expresses security policies at the level of C source code. It is designed to express a variety of dynamic security policies, individually or in combination, and enforce them with compiler and hardware support. Tagged C supports multiple approaches to se- curity and varying levels of strictness. We demonstrate this range by providing examples of memory safety, compartmentalization, and secure information flow policies. We also give a full formalized semantics and a reference interpreter for Tagged C.