Project laboratory

Administrative information

Current topics 2024

Students can choose project laboratory, BSc thesis, and MSc thesis topics in several of the lab's active research areas, which are described below. If one of the areas interests you, contact the colleague responsible for it and discuss possible concrete tasks within that area. Keep in mind that in the project laboratory, a task can also be worked on in a small team. Our topics relate to the following areas:
Embedded-Systems, Internet-of-Things, Malware, Machine-Learning, Software-Security, Security-Analysis, ICS/SCADA, Attack generation, Privacy, Security, Federated-Learning, Game-Theory, Economics

Embedded systems security

Category: Embedded-Systems, Internet-of-Things, Malware, Machine-Learning

Embedded systems consist of special-purpose embedded computers connected by networks, and nowadays they are no longer isolated from the public Internet. Hence, embedded systems are subject to attacks originating from cyberspace. In the CrySyS Lab, we are working on making embedded systems secure and resistant to attacks. A particular domain of interest is malware detection on embedded IoT devices. We proposed a lightweight yet effective malware detection mechanism, called SIMBIoTA, which can be used on resource-constrained embedded devices. We are looking for students interested in improving SIMBIoTA and comparing its capabilities to other malware detection approaches. We also have a large IoT malware dataset that we use for research purposes. Students can work on enhancing this dataset, e.g., by developing visualization tools for it.

Number of students: 3-4

Contact: Levente Buttyán (CrySyS Lab)

Privacy & Anonymization

Category: Privacy, Machine-Learning

The word privacy is derived from the Latin word "privatus", which means set apart from what is public, personal and belonging to oneself, and not to the state. Privacy has multiple aspects, and there are multiple techniques that improve each of them to a varying extent. Students can work on the following topics:

  • (De-)Anonymization of Medical Data: Medical datasets such as ECG (electrocardiogram) and CTG (cardiotocography) recordings and diagnostic images (MRI, X-ray) are very sensitive, as they contain the medical records of individuals. The task is to (de-)anonymize such datasets (or some aggregates computed over such data) for data sharing with strong, preferably provable privacy guarantees that are also GDPR compliant.
    (Contact: Gergely Ács)
  • Poisoning Differential Privacy: Differential Privacy is the de facto privacy model used to anonymize datasets (see, e.g., the US Census data). A small amount of noise is added to the data, which hides the participation of any single individual in the dataset but not the general statistics of the population as a whole. The noise is calibrated to the maximum influence of any single record. However, if the data comes from untrusted sources, an attacker can inject fake records into the dataset in order to increase the added noise, which eventually degrades the utility of the anonymized data. The task is to design and implement such an attack.
    (Contact: Gergely Ács)
  • Differential Privacy Amplification: Nowadays, the standard privacy-preserving mechanism is Differential Privacy. It aims to hide the presence or absence of a data point in the final result by adding noise to the answer of the original query, making the two outcomes (one with and one without a single data point) statistically indistinguishable (up to the privacy parameter). For example, if the average salary of last year's BME graduates is published with added noise, then even an adversary who knows the salaries of all alumni except the target cannot deduce the target's salary with certainty. Besides increasing the size of the added noise, privacy protection can be further strengthened by so-called amplification techniques, such as sampling from the data instead of using all of it. For instance, only half of the alumni are considered for this statistic.
    The student's task is to learn and experiment with these amplification techniques and to find the optimal setting (amplification mechanisms and their parameters) to obtain a desirable trade-off between the provided privacy protection and the obtained accuracy.
    (Contact: Balázs Pejó)
  • Own idea: If you have your own project idea related to data privacy, and we find it interesting, you can work on it under our guidance.
    (Contact: Gergely Ács or Balázs Pejó)
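
To make the differential privacy topics above more concrete, the following is a minimal sketch, not a reference implementation. It assumes a clipped salary sum released with the Laplace mechanism, Poisson subsampling with rate q, and the commonly cited amplification bound ε' = ln(1 + q(e^ε - 1)). All function names, the clipping bound, and the salary values are hypothetical, and the number of alumni is assumed to be public.

```python
import math
import random

def laplace_noise(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_sum(values, upper, epsilon, rng):
    # Clip every value to [0, upper] so one record changes the sum by at most
    # `upper`, then add Laplace noise calibrated to that sensitivity.
    clipped = [min(max(v, 0.0), upper) for v in values]
    return sum(clipped) + laplace_noise(upper / epsilon, rng)

def poisson_sample(values, q, rng):
    # Keep each record independently with probability q.
    return [v for v in values if rng.random() < q]

def amplified_epsilon(epsilon, q):
    # Assumed amplification-by-subsampling bound for Poisson sampling.
    return math.log(1.0 + q * (math.exp(epsilon) - 1.0))

rng = random.Random(0)
n = 1000                      # assumed to be public
upper = 2_000_000.0           # hypothetical clipping bound for a salary
salaries = [rng.uniform(300_000, 1_500_000) for _ in range(n)]

epsilon, q = 1.0, 0.5
sample = poisson_sample(salaries, q, rng)
# Rescale by the expected sample size q * n to estimate the average.
estimate = noisy_sum(sample, upper, epsilon, rng) / (q * n)

print("true average:     ", sum(salaries) / n)
print("noisy estimate:   ", estimate)
print("effective epsilon:", amplified_epsilon(epsilon, q))
```

With these settings the effective privacy parameter drops from 1.0 to roughly 0.62, at the price of extra sampling error in the estimate. Exploring this trade-off for different values of q and ε is the kind of experiment the amplification topic asks for.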

Required skills: none
Preferred skills: basic programming skills (e.g., Python)

Number of students: 6

Contact: Gergely Ács (CrySyS Lab), Balázs Pejó (CrySyS Lab)

Machine Learning & Security & Privacy

Category: Privacy, Security, Machine-Learning

Machine Learning (Artificial Intelligence) has become undisputedly popular in recent years. The number of security critical applications of machine learning has been steadily increasing over the years (self-driving cars, user authentication, decision support, profiling, risk assessment, etc.). However, there are still many open security problems of machine learning. Students can work on the following topics:

  • Security of Machine Learning-Based Malware Detection: Adversarial examples are maliciously modified program code where the modification is hard to detect, yet the model's prediction on the slightly modified code is very different from its prediction on the unmodified code. For example, the malware developer modifies a few bytes in the malware binary, which causes the malware detector to misclassify the malware as benign. A potential task can be to develop solutions to detect adversarial examples, develop robust training algorithms for malware detection, or design backdoor and membership attacks.
    (Contact: Gergely Ács)
  • Robustness of Large Language Models: Large Language Models (LLMs) are a new class of machine learning models that are trained on large text corpora. They are capable of generating text that is hard to distinguish from human-written text. The increasing reliance on LLMs across academia and industry necessitates a comprehensive understanding of their robustness to prompts. The task is to study and test different adversarial prompts against LLMs (such as adversarial attacks, prompt injection, or any other adversarial prompts).
    (Contact: Gergely Ács)
  • Detection and Attribution of Fake Images: Over the last year, there has been a growing interest in text-to-image generation models that create images based on prompt descriptions. While these models exhibit promising performance, there is a growing concern about the potential misuse of the artificially generated images they produce. The task is to develop detection and attribution methods for fake images generated by text-to-image generation models, that is, to detect which images are generated by AI and by which model exactly.
    (Contact: Gergely Ács)
  • Meta Learning: In online media, there is legitimate news as well as fake news. While the former is usually of higher quality, the latter is often associated with low-quality writing. Several machine learning models in the scientific literature focus on telling these two apart. Moreover, the scientific literature itself contains both lower-quality and higher-quality publications.
    The student's task is to get familiar with these models and experiment with their applicability to differentiating non-peer-reviewed scientific papers (e.g., on arXiv) from articles that appeared in well-established venues (such as S&P, CCS, etc.).
    (Contact: Balázs Pejó)
  • Own idea: If you have your own project idea related to the security/privacy of machine learning, and we find it interesting, you can work on it under our guidance.
    (Contact: Gergely Ács or Balázs Pejó)
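
As an illustration of the adversarial example idea from the malware detection topic above, here is a toy sketch. It assumes a hypothetical linear detector over binary features (e.g., whether a given API call appears in the binary) and an attacker who may only add features, so that the malware keeps its functionality. The weights, features, and function names are made up for illustration and do not describe any real detector.

```python
def score(weights, bias, features):
    # Linear detector: a positive score means "malware".
    return bias + sum(w * x for w, x in zip(weights, features))

def is_malware(weights, bias, features):
    return score(weights, bias, features) > 0.0

def greedy_evasion(weights, bias, features, max_flips):
    """Switch on absent features that lower the score most, up to max_flips."""
    adv = list(features)
    # Candidates: features that are absent and have a negative ("benign") weight,
    # ordered from the most negative weight upwards.
    candidates = sorted(
        (i for i, x in enumerate(adv) if x == 0 and weights[i] < 0),
        key=lambda i: weights[i],
    )
    flipped = []
    for i in candidates[:max_flips]:
        if not is_malware(weights, bias, adv):
            break  # already classified as benign, stop modifying
        adv[i] = 1
        flipped.append(i)
    return adv, flipped

# Hypothetical detector and sample.
weights = [2.0, 1.5, -1.0, -2.5, 0.5, -0.8]
bias = -1.0
features = [1, 1, 0, 0, 1, 0]

adv, flipped = greedy_evasion(weights, bias, features, max_flips=2)
print("original flagged:   ", is_malware(weights, bias, features))
print("features added:     ", flipped)
print("adversarial flagged:", is_malware(weights, bias, adv))
```

Real detectors are far more complex than a linear model, but the example shows why a few well-chosen modifications can be enough to flip a prediction, and why detecting such modifications or training robust models is a meaningful task.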

Required skills: none
Preferred skills: basic programming skills (e.g., Python), machine learning (not required)

Number of students: 6

Contact: Gergely Ács (CrySyS Lab), Balázs Pejó (CrySyS Lab)

Economics of (cyber)security and (data)privacy

Category: Economics, Privacy, Security, Game-Theory, Machine-Learning

As evidenced in the last 10-15 years, cybersecurity is not a purely technical discipline. Decision-makers, whether sitting at security providers (IT companies), security demanders (everyone using IT) or the security industry, are mostly driven by economic incentives. Understanding these incentives is vital for designing systems that are secure in real-life scenarios. In parallel, data privacy has shown the same characteristics: proper economic incentives and controls are needed to design systems where sharing data is beneficial to both the data subject and the data controller. An extreme example of a flawed attempt at such a design is the Cambridge Analytica case.
The prospective student will identify a cybersecurity or data privacy economics problem, and use elements of game theory and other domain-specific techniques and software tools to transform the problem into a model and propose a solution. Potential topics include:

  • CPSFlipIt: attacker-defender dynamics in cyber-physical systems
  • Risk management for cyber-physical/OT systems
  • Incentives in secure software development: why should programmers have proper security training?
  • Interdependent privacy: modeling inference with probabilistic graphical models
  • BYOT: Bring Your Own Topic!
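
To give a flavor of the attacker-defender modeling mentioned in the topics above, the sketch below simulates the basic FlipIt game known from the literature, not CPSFlipIt itself, whose details are not described here. It assumes that both players use periodic strategies with a random phase, that the defender owns the resource at time 0, and that a player's benefit is the fraction of time in control minus the move cost per unit time. All names and parameter values are hypothetical.

```python
import random

def flip_times(period, phase, horizon):
    # Moves at phase, phase + period, ... strictly before the horizon.
    times = []
    t = phase
    while t < horizon:
        times.append(t)
        t += period
    return times

def simulate_flipit(def_period, att_period, def_cost, att_cost, horizon, rng):
    # Merge both players' moves into a single timeline.
    moves = [(t, "defender")
             for t in flip_times(def_period, rng.uniform(0, def_period), horizon)]
    moves += [(t, "attacker")
              for t in flip_times(att_period, rng.uniform(0, att_period), horizon)]
    moves.sort()

    owner, last_t = "defender", 0.0          # assumed initial owner
    control = {"defender": 0.0, "attacker": 0.0}
    n_moves = {"defender": 0, "attacker": 0}
    for t, player in moves:
        control[owner] += t - last_t         # credit time to the current owner
        owner, last_t = player, t            # the mover takes control
        n_moves[player] += 1
    control[owner] += horizon - last_t

    def_benefit = (control["defender"] - def_cost * n_moves["defender"]) / horizon
    att_benefit = (control["attacker"] - att_cost * n_moves["attacker"]) / horizon
    return def_benefit, att_benefit, control

rng = random.Random(42)
d, a, control = simulate_flipit(1.0, 10.0, 0.1, 0.1, 1000.0, rng)
print("defender benefit:", d)
print("attacker benefit:", a)
```

Varying the periods and costs shows how frequent moves buy more control but also cost more. A project would replace the fixed periodic strategies with a proper game-theoretic analysis.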

Required skills: model thinking, good command of English
Preferred skills: basic knowledge of game theory, basic programming skills (e.g., Python, MATLAB, NetLogo)

Number of students: 6

Contact: Gergely Biczók (CrySyS Lab), Balázs Pejó (CrySyS Lab)