Project Laboratory

Administrative information

Current topics 2024

In our lab, students can choose Project Laboratory, BSc thesis, and MSc thesis topics in several active research areas. These areas are described below. If one of them interests you, contact the colleague responsible for that area and discuss possible concrete tasks within it. Keep in mind that, within the Project Laboratory course, you can also work on a task in a small team. Our topics are related to the following areas:
Embedded-Systems, Internet-of-Things, Malware, Machine-Learning, Software-Security, Security-Analysis, ICS/SCADA, Attack generation, Privacy, Security, Federated-Learning, Game-Theory, Economics

Federated Learning - Security & Privacy & Contribution Scores

Category: Privacy, Security, Federated-Learning, Game-Theory

Federated learning enables multiple actors to build a common, robust machine learning model without sharing data, thus making it possible to address critical issues such as data privacy, data security, data access rights, and access to heterogeneous data. Its applications span a number of industries, including defense, telecommunications, IoT, and pharmaceutics. Students can work on the following topics:

  • Federated Learning Framework for Medical Data: Federated learning is being adopted in health care, where different organizations want to train a common model for different purposes (tumor/disease classification, prediction of survival time, finding an explainable pattern of Covid on whole slide images of livers, etc.), but individually the organizations lack sufficient training data. The task is to develop a federated learning framework for such tasks; a minimal sketch of the underlying federated averaging scheme is given after this list.
    (Contact: Gergely Ács)
  • Security and Privacy of Federated Learning: Federated learning allows multiple parties to collaborate in order to train a common model by sharing only model updates instead of their training data (e.g., mobile devices train a common model for input text prediction, or hospitals train a better model for tumor classification). Although this architecture seems privacy-preserving at first sight, recent works have demonstrated numerous privacy and security attacks that infer private and sensitive information from the shared updates. The task is to develop privacy and/or security attacks against federated learning (data poisoning, backdoors, reconstruction attacks) and/or to mitigate these attacks; a toy poisoning attack is sketched after this list.
    (Contact: Gergely Ács)
  • Free RIder Detection using Attacks (FRIDA): In Federated Learning, multiple individuals train a single model together in a privacy-friendly way, i.e., their underlying datasets remain hidden from the other participants. As a consequence of this distributed setup, dishonest participants might behave maliciously by free-riding (enjoying the commonly trained model while not contributing to it).
    The student's interdisciplinary task is to read about Membership Inference Attacks and the free-riding problem in Federated Learning, and to propose a framework that connects the two, i.e., to use a Membership Inference Attack to determine whether a participant used actual data or just random noise during training; a toy illustration of this idea is given after this list.
    (Contact: Balázs Pejó)
  • Contribution Score Poisoning: It is well known that the training data can be poisoned to decrease the model's performance in general (un-targeted attack) or for a specific class (targeted attack). Moreover, it is also possible to poison the data such that a desired fairness objective is destroyed or the privacy of the data samples is compromised. Contribution measuring techniques, such as the Shapley value, assign a value to each participant, reflecting their importance or usefulness for the training. The question naturally arises: by injecting malicious participants into the participant pool, is it possible to manipulate the contribution scores of other participants (i.e., to arbitrarily increase or decrease them)?
    The student's task is to get familiar with Contribution Score Computation techniques as well as poisoning attacks within Federated Learning, and to test empirically (i.e., with experiments) whether such control is feasible and to what extent.
    (Contact: Balázs Pejó)
  • Fairness of Shapley Approximations: In any distributed setting with a single common product, such as Federated Learning (where multiple participants train a Machine Learning model together in a privacy-friendly way), the contribution of the individuals is a crucial question. For instance, when several pharmaceutical companies train a model together and it leads to a huge breakthrough, how should they split the corresponding pay-off? Equal distribution is as unfair as one based on dataset sizes, as neither considers data quality. What does fair mean in the first place? Shapley defined four fundamental fairness properties and proved that his reward allocation scheme is the only one that satisfies all of them. On the other hand, the Shapley value is exponentially hard to compute, so it is standard practice to approximate it in real life; a Monte Carlo approximation is sketched after this list.
    The student's task is to study existing approximation methods and verify (theoretically or empirically) to what extent these methods respect the four desired properties.
    (Contact: Balázs Pejó or Gergely Biczók)
  • Own idea: If you have your own project idea related to the security/privacy of federated learning, and we find it interesting, you can work on it under our guidance.
    (Contact: Gergely Ács or Balázs Pejó or Gergely Biczók)
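
To make the federated setup above concrete, here is a minimal sketch of federated averaging (FedAvg) on a toy linear-regression task. It is an illustration only: the data, the model, and all names (make_client, local_update, w_global) are our own assumptions, not part of any particular framework.

    import numpy as np

    # Minimal FedAvg sketch: each client holds private data (X, y) and trains a
    # linear model locally; only the model weights are shared and averaged by
    # the server. Data, dimensions, and names are illustrative assumptions.

    rng = np.random.default_rng(0)
    d = 5                                    # model dimension
    true_w = rng.normal(size=d)              # ground truth used to generate data

    def make_client(n=100):
        X = rng.normal(size=(n, d))
        y = X @ true_w + 0.1 * rng.normal(size=n)
        return X, y

    clients = [make_client() for _ in range(4)]

    def local_update(w, X, y, lr=0.05, epochs=5):
        # A few full-batch gradient steps on the client's private data.
        w = w.copy()
        for _ in range(epochs):
            w -= lr * 2 * X.T @ (X @ w - y) / len(y)   # MSE gradient
        return w

    w_global = np.zeros(d)
    for _ in range(20):                       # communication rounds
        local_ws = [local_update(w_global, X, y) for X, y in clients]
        w_global = np.mean(local_ws, axis=0)  # server-side averaging

    print("distance to true model:", np.linalg.norm(w_global - true_w))

Note that the clients never transmit (X, y), only the locally updated weights; this is exactly the property that the attacks in the topics above try to exploit or to protect.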
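
For the security-attack topic, the sketch below extends the same toy setting with an un-targeted model-poisoning attack: a single malicious client inverts and scales its update before the server averages. The attack strategy and all constants are illustrative assumptions, not a prescribed method.

    import numpy as np

    # Un-targeted model poisoning: one malicious client sends a scaled, inverted
    # update, which degrades the averaged global model. The setup mirrors the
    # FedAvg sketch above; all names and constants are illustrative assumptions.

    rng = np.random.default_rng(2)
    d = 5
    true_w = rng.normal(size=d)
    clients = []
    for _ in range(4):
        X = rng.normal(size=(100, d))
        clients.append((X, X @ true_w + 0.1 * rng.normal(size=100)))

    def local_update(w, X, y, lr=0.05, epochs=5):
        w = w.copy()
        for _ in range(epochs):
            w -= lr * 2 * X.T @ (X @ w - y) / len(y)
        return w

    def train(poison=False):
        w = np.zeros(d)
        for _ in range(20):
            updates = [local_update(w, X, y) for X, y in clients]
            if poison:
                updates[0] = -3 * updates[0]   # malicious client inverts & scales
            w = np.mean(updates, axis=0)
        return np.linalg.norm(w - true_w)      # distance to the true model

    print("clean    error:", round(train(poison=False), 3))
    print("poisoned error:", round(train(poison=True), 3))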
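
For FRIDA, the toy sketch below illustrates the membership-inference intuition: under the trained global model, data that really contributed to training incurs low loss, while the random noise a free-rider "trained" on does not. The model, the data, and the detection threshold are all illustrative assumptions.

    import numpy as np

    # Loss-threshold membership inference as a free-rider detector: member data
    # fits the global model well, a free-rider's random noise does not.

    rng = np.random.default_rng(1)
    d, n = 5, 200
    true_w = rng.normal(size=d)

    # Honest participant: real data underlying the global model.
    X_honest = rng.normal(size=(n, d))
    y_honest = X_honest @ true_w + 0.1 * rng.normal(size=n)

    # Free-rider: claims to have trained, but used random noise instead of data.
    X_free = rng.normal(size=(n, d))
    y_free = rng.normal(size=n)

    # Stand-in for the converged global model (cf. the FedAvg sketch above).
    w_global = true_w + 0.05 * rng.normal(size=d)

    def avg_loss(X, y):
        return np.mean((X @ w_global - y) ** 2)

    # Heuristic threshold; a real attack would calibrate it, e.g., with shadow models.
    threshold = 5 * avg_loss(X_honest, y_honest)

    for name, X, y in [("honest", X_honest, y_honest), ("free-rider", X_free, y_free)]:
        verdict = "suspected free-rider" if avg_loss(X, y) > threshold else "looks like a member"
        print(f"{name}: loss={avg_loss(X, y):.3f} -> {verdict}")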
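
Finally, for the Shapley topics, the sketch below contrasts the exact (exponential-time) Shapley computation with its standard Monte Carlo approximation based on sampling permutations. The coalition value function v() is a made-up stand-in for, e.g., the accuracy of a model trained on the union of the participants' datasets.

    import numpy as np
    from itertools import permutations

    players = [0, 1, 2, 3]
    data_quality = np.array([1.0, 1.0, 0.5, 0.1])   # illustrative assumption

    def v(coalition):
        # Toy utility with diminishing returns; a real instance would be,
        # e.g., model accuracy on the coalition's joint data.
        return np.sqrt(sum(data_quality[i] for i in coalition))

    def add_marginals(perm, phi):
        # Add each player's marginal contribution along one joining order.
        coalition = []
        for p in perm:
            before = v(coalition)
            coalition.append(p)
            phi[p] += v(coalition) - before

    def shapley_exact():
        phi = np.zeros(len(players))
        perms = list(permutations(players))          # n! orders: exponential
        for perm in perms:
            add_marginals(perm, phi)
        return phi / len(perms)

    def shapley_mc(samples=2000, seed=0):
        # Monte Carlo approximation: average marginal contributions over
        # randomly sampled permutations instead of all n! of them.
        rng = np.random.default_rng(seed)
        phi = np.zeros(len(players))
        for _ in range(samples):
            add_marginals(rng.permutation(players), phi)
        return phi / samples

    print("exact:      ", np.round(shapley_exact(), 3))
    print("monte carlo:", np.round(shapley_mc(), 3))

On this toy example, two of Shapley's four axioms can be checked directly on the exact scores: efficiency (the scores sum to the value of the grand coalition) and symmetry (the two participants with identical data quality receive equal scores). The project asks to what extent approximation methods, such as this Monte Carlo one, preserve these properties.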

Required skills: none
Preferred skills: basic programming skills (e.g., Python), machine learning (not required)

Number of students: 6

Contact: Gergely Ács (CrySyS Lab), Balázs Pejó (CrySyS Lab), Gergely Biczók (CrySyS Lab)