Spivak's list of categorical programming and computer science projects

edited April 2018 in Chat

Subtitle: Category Theory to Change the World!

This is such an exciting list, that I am reposting John's comment here on its own discussion.

Note: we are maintaining a policy that the ACT is primarily about math, and not programming. So please keep general programming and cs discussion within this discussion, and in the categories for the working hacker discussion. Thanks.

From John:

I have a lot of catching up to do on this thread, but I just got an email from David Spivak that might interest folks here:

I was talking with Ryan Wisnesky today, and he pointed me to a webpage full of project ideas that your students might be interested in.

A lot of these are coding projects. If anyone is seriously interested in trying them, please let David Spivak and me know!

  • Implement Functorial Data Migration using MapReduce. The iterative and join-based nature of many categorical algorithms make implementation using MapReduce/Hadoop particularly challenging. The goal of this project is to develop efficient algorithms for implementing functorial data migration on MapReduce.
  • Implement a web-based GUI for Functorial Data Migration. The FQL tool is text-based and not particularly well-suited to run in a browser. Having a simple web-based GUI will help demonstrate functorial data migration to a larger class of people than currently possible. The goal of this project is to develop such a GUI.
  • Implement a Categorical Theorem Prover. Modern theorem provers for first-order logic are not particularly well-suited for reasoning about categories, which requires purely equational reasoning, empty sorts, ground completeness, and other niche features. The goal of this project is to implement a theorem prover specifically tailored to categorical needs.
  • Implement a Flexible FQL to SQL compiler. The FQL tool can interact with SQL databases, provided those databases are made up entirely of binary tables. This is an onerous restriction in practice, and theoretically, it should be possible for FQL to interact with any database that can be viewed as a database in categorical normal form. The goal of this project is to extend the class of SQL databases that FQL can interact with by revisiting FQL's core SQL generation algorithms.
  • Develop an FQL Query Optimizer. Although the the equational theory for FQL is well defined (it is related to the internal language of cartesian closed categories), it is unclear how this equational theory can be used to actually optimize FQL queries. The goal of this project is to build an FQL query optimizer.
  • Develop FQL Scenarios. The goal of this project is to use the FQL tool to solve real-world information-integration problems.
  • Work through David Spivak's Book using FQL++. The FQL++ tool provides an extensive set of algorithms to compute with finite categories. The goal of this project is to use the FQL++ tool to solve the exercises in Category Theory for Scientists.
  • Develop Operads for Domain Specific Languages. The goal of this project is to formalize particular domain specific languages as operads, possibly using the OPL tool.
  • Integrate EASIK and FQL. Although the EASIK and FQL tools are based on similar categorical foundations, the tools do not interact particularly well. The goal of this project is to allow the tools to interoperate.
  • Integrate FQL and a File System. A file system file and directory forms an acyclic graph, and aliases can be represented as path equations. The files themselves are given by a set-valued functor. The goal of this project is to express a file system as FQL schemas and instances. More speculatively, this could be the first step in a categorical database management system.
  • Understand Geospatial databases using category theory. Interval-centric data can possibly be implemented using bi-modules.
  • Develop a Theory of Functorial Data Isomorphism. Describe isomorphisms such as Sigma(F o G)(I) = (Sigma(F) o Sigma(G))(I), and develop a method of inserting them automatically into FQL programs, so that users can program up to isomorphism.
  • Interpret Learn Reject. It may be possible for two people with no common ground to nevertheless develop a shared understanding of a domain by repeatedly following a loop that develops a span between schemas.
  • Generalize to categories other than set. Develop a theory of functorial program migration/integration by generalizing results from the FQL project on functorial data migration/integration from the topos of Sets to the effective topos or domain-like things.
  • Matching and inference. Generalize schema matching, entity resolution, and program synthesis techniques to infer schema mappings, instance mappings, and queries (bi-modules)
  • Develop a Categorical Theory of Entity Resolution/Record Linkage.
  • Extend the Functorial Data Model to include Nested Data, Aggregation, and Grouping.
  • Extend the Functorial Data Model to include Richer Constraints, such as Embedded Dependencies/finite-limit theories.
  • Extend the Functorial Data Model to include Updates and View-Updates
  • Extend the Functorial Data Model to include Uncertain Information
  • Consistency and Conservativity in Functorial Data MigrationThe Sigma data migration functor, and other related operations, such as pushouts, do not preserve the consistency of their input instances; in logical terms the resulting equational theories need not be conservative extensions of their schemas. The goal of this project is to develop algorithms to detect what an instance presentation is not consistent, and to give proofs of situations when inconsistency cannot arise.
  • Excel and FQL Interoperability. In principle spreadsheets have much structure relevant to FQL: each sheet is a table with unique row numbers, cells can reference other cells, mappings can be represented as two column tables, and data integrity constraints represented as formulas. The idea of this project is to import this structure into FQL, or, more speculatively, implement functorial data migration directly in Excel as a macro or extension.
Sign In or Register to comment.