The first stage of the project will consist of literature study of the selected area, and tentative identification of the problem you would like to address (what you'd hope to achieve). Over the semester you will refine this goal, state a concrete research result you hope to obtain, and work towards it. Identifying a problem to pursue, making it well-defined, and coming up with a plan towards addressing it, is your responsibility. However, you are allowed and encouraged to discuss your ideas with (and receive feedback from) the instructor, TAs, and fellow students, at all stages (you may also incorporate others' ideas in your project, as long as they are ok with it, you give them proper credit, and the project also reflects appropriate effort by each member of the group). Note that the research problem you choose to work on does not have to be an open problem that is stated in some paper or identified by an expert in the field -- it can be a problem of your own invention. It can also be an extension of a known result to new, unknown settings. The final report does not need to be a publishable result, nor must it conclude in successful resolution. One of the main goals of this course is to invoke interesting research ideas, and give you a taste of the research process. We encourage interesting projects that might end up unsuccessful (as long as all attempts are well documented and make sense overall), over a successful resolution of a trivial problem.
Specific milestones and requirements are as follows:
More details on what is expected for each of these milestones are below. You are encouraged to consult us early, and submit the proposal/progress reports early, and we will do our best to provide early feedback.
Proposal (before 3/6): The proposal should include the following components:
Progress Report (before 4/8): The progress report will include the initial project proposal, and additionally include updates and refinement on points 3 and 5 above. Specifically, discuss what you've learned so far, and what changes you'd like to make to your proposed research problem. At this stage, the open problem you state as your goal should be more concrete (for example, a formal statement of a theorem you would like to prove or disprove). If based on your study so far you think your initial plan is not feasible or not interesting, explain why and set a new goal which you believe is interesting and achievable within the time frame. In any case you should outline your planned approach towards satisfying your goal based on the progress you've made so far.
Final report (before 5/4): Your final report will consist of a paper describing what you have achieved. If you obtained a new research result, state it clearly (typically this would take the form of a formal theorem, but the main result could also be a new formal model / definition, a description of experimental findings, etc), and include background information and motivation. If you haven't solved a new open problem (which often happens, given the short duration of the project), your final report will including a literature survey, open problems, a description of why and where you got stuck, and what you learned from these obstacles, including a suggestion for the next research steps, and what might be obtainable.
Please keep in mind that there are two main goals for a project in this class, and your final report should demonstrate you have progressed on both (although the ratio between the two can vary).
There is no minimum (or maximum) number of papers you have to read or pages that you have to write, though we will try to guide students towards comparable (and reasonable) amount of work to complete your project. The expectation from a group will be calibrated to the group size, but the rule of thumb is: make an honest effort, start early, don't hesitate to request feedback.
Below we provide some resources for information theoretic crypto, and for other topics that were requested by students. If you still don't have an idea where to start, reach out to the teaching staff and provide some background on yourself and your interests -- we will help you. You can also use the piazza board for the class to look for project ideas and collaborators, so that your classmates can help you.
Information Theoretic Cryptography: We have a running document that includes many links, and discusses open problems that relate to what was taught in class:
Information Theoretic Cryptography: Lectures, Readings, Proposed Project Topics (evolving document)
The above document contains (in addition to short summaries of lecture), some general directions and concrete open problems in information-theoretic cryptography. We will update it with more directions, and can also help you identify more, on any of the topics listed there, depending on your interests.
Cryptography and Learning: There are many relationships between cryptography and learning. Below are some example areas, directions and resources (this is not meant to be comprehensive or accurate, just to give you a direction to start exploring, following a student's request).
Computational Learning Theory, Cryptography, and Circuit Lower Bounds. In some intuitive sense, cryptography and learning theory are complementary, where learnable classes are the "easy" ones, while classes that can be used for crypto are "hard". This intuition can be shown to be true in some specific limited ways (depending on details of the notions we use and parameter settings). This brings up the areas of (1) proving hardness of learning from cryptographic assumptions, and (2) building cryptography from hardness of learning. The former is the typical way that hardness of learning results are proved: showing that a class of functions contains a PRF family, and hence it is hard to learn (intuitively, since PRFs are indistinguishable from random functions, and random functions are obviously hard to learn). Conversely, if a class is learnable (under appropriate definitions), the class cannot contain PRFs. For the latter, [BFKL93] is an early paper that used hardness of learning of any class -- under a specific, average case, definition - in order to construct a one way functions. There are many related problems that are still open, for example strengthening the result (to a more standard, worst case definition) using modern techniques. This is related to the natural proof (Razborov-Rudich) barrier, and circuit lower bounds. Here are some resources (each can be viewed as a starting point for a separate thread -- talk to us for more information / updates).
Privacy and Security for Machine Learning Algorithms. With the prominence of ML algorithms in every aspect of our life, one goal is to execute these ML algorithms privately and/or securely. One aspect of this is secure inference: applying secure computation techniques to classification algorithms. For example, applying secure 2PC to the setting where one party holds a model (eg a neural network) and the other party an input, and they want to compute the output without leaking any additional information. There were many recent papers on this topic, using different techniques (homomorphic encryption, garbled circuits, secret-sharing based protocols). See the references in the following paper:
While secure inference deals with privacy during the model evaluation stage, another important aspect is protecting privacy of the data used at the training stage. Since the model must contain information about the (collective) data used at training, a natural notion of privacy to consider here is that of differential privacy, which protects privacy of individual inputs. One challenge is that differentially private protocols require specific types of noise generation, and sampling such noise in a malicious setting is expensive. Even more challenging, is the setting where training and inference are not done in two separate stages, but are integrated into a single process of on-line learning. That is, as examples arrive, they are used both as inputs for evaluation of the current model, as well as helping in training the model to become better and more accurate.
A related setting with many open problems is that of federated learning, where the data comes from a large number of parties.
Cryptographic Modeling of ML Tasks, such as Robust ML, Fairness. Classical papers that started the field of modern cryptography include definitional papers, laying down the right mathematical definitions for cryptographic tasks (e.g., Goldwasser and Micali's definition of semantic security -- security of encryption, followed by many other examples -- signature schemes, zero-knowledge proofs, secure computation, and many more). Putting forth the right definition -- one that captures a meaningful notion of security, yet useful and achievable -- is very important and often tricky. This is still largely missing in the security/privacy/fairness aspects of ML.
One example is robust ML, trying to protect against "adversarial examples" which are misclassified. There are many works trying to construct ML models that are robust to such attacks, but much of this literature is missing a foundational cryptographic approach. First, it often defines robustness in a way designed to avoid existing specific attacks, rather than identifying general security goals and attack models. Second, it often makes implicit unstated assumptions about which misclassified examples should count as adversarial, and which should count as just an error. Some open problems here include coming up with a definitional framework and tradeoffs for robust ML, or using cryptographic assumptions -- finding a way to embed a cryptographic primitive that will render adversarial examples hard to find, at least in some settings where this makes sense. Some relevant papers include the following, and references within (disclaimer: I have not read many of these).
Cryptography and Game Theory: Typically, when defining a cryptographic primitive, adversarial parties can be either "semi-honest" (aka "honest-but-curious") and guaranteed to follow the protocol specs, or malicious and behaving arbitrarily in the worst possible way. There's a branch of "rational cryptography", which models players as rational (trying to optimize some utility function, and would deviate from the protocol if it helps towards that goal). Other areas of intersection include combining privacy and security goals (such as private sharing) with incentives, and connections between the complexity of prolems related to game theory and to cryptography. Here are some (incomplete) references about these (ask if you want us to search for more detailed / complete references on one of these directions). Thanks to Roland Maio (and to Marshall Ball, but he is paid for this) for help with finding these.
Cryptography of Permutation Groups: If you are participating in Periklis' seminar, and would like to use some of his suggested open problems, this is ok (and he has agreed), but please let us know as soon as you decide this.