Research University - Warsaw University of Technology

Cybersecurity and Data Science

Global and local challenges

Digital transformation turns data into the new oil. The increasing availability of big data, in both structured and unstructured datasets, creates new challenges in cybersecurity, efficient data processing, and knowledge extraction. The field of cybersecurity and data science fuels the data-driven economy. Innovations in this field require strong foundations in mathematics, statistics, machine learning and information security.

The unprecedented increase in the availability of data in many fields of science and technology (e.g. genomic data, data from industrial environments, sensor data from smart cities, social network data) calls for new methods and solutions for data processing, information extraction and decision support. This stimulates the development of new methods of data analysis, including those adapted to new data structures and the growing volume of data. Research in Big Data is also of primary importance for the development of the European economy. The Big Data Value Association (BDVA), in its Strategic Research and Innovation Agenda (SRIA), observed that "making more efficient use of Big Data, and understanding data as an economic asset, carries great potential for the EU economy and society".

Scientific excellence

This PRF includes five important subfields:

  • Cybersecurity (CS) with a primary interest in automated safety management systems; non-repudiation systems, including blockchain-based ones; data protection using machine learning; detection of unknown attacks on ICT systems using big data/fast data algorithms; post-quantum cryptography.
  • BioMed Data Science (BMDS) with a primary interest in bioinformatics, biostatistics, and computational medicine. Biomedical applications are rich in big and complex datasets that require scalable bioinformatics tools, in particular distributed algorithms and statistical methods for the analysis of multidimensional genomic and transcriptomic data generated using high-throughput next-generation sequencing methods. These tools are used to solve biomedical problems, such as the discovery of novel genotype-phenotype correlations and biomarkers, and find direct application in molecular diagnostics.
  • Big and Stream Data Science (BSDS) with a primary interest in big data, distributed storage, and batch and stream analytics (smart city, genomics). Big data and stream data applications need algorithms, methods and complex systems that combine the data storage layer with processing layers operating in batch mode and in streaming mode. The methods from this subfield are used to build, among others, intelligent dynamic multi-modal connection planning systems for public transport, which combine stream analytics, machine learning and graph theory methods.
  • Advanced Machine Learning (AML) with a primary interest in statistical learning methods, interpretable and explainable predictive models, clustering, classification, and data fusion. Predictive and personalized modeling needs algorithms and software that explain the decisions made by machine learning algorithms and verify the transparency of predictive models, as well as tools that automate and support data mining and modeling. The developed tools are applied in areas such as personalized medicine and credit risk. This subfield also covers tools for semi-supervised and unsupervised learning.
  • Mathematical Foundations for Data Science (MF4DS) with a primary interest in mathematical foundations of data modeling and analysis, statistics, probability, graphs and networks, and soft computing. This covers data-based modeling of phenomena (in complex networks, operational research, financial processes, decision support, sports analytics) using a wide range of mathematical approaches: stochastic methods, dynamical systems, differential equations, discrete mathematics, graph theory, fuzzy sets, etc. This approach makes it possible not only to predict future observations, but also to gain insight into the nature of the mechanisms that generate these processes.

The research activities at WUT cover basic research in the field of mathematics and basic and applied research in the area of data analysis. The Data Science team cooperates extensively with global scientific and industrial centers (e.g. Big Data: Technion, Fraunhofer Institute, IBM Research, Samsung, Daftcode, Nethone, Applica, Data Juice; Genomics: Baylor College of Medicine (USA), ETH Zurich, Oslo University Hospital, CHU de Nantes, Abertay University; and in Poland: the Warsaw Medical University and the Institute of Mother and Child in Warsaw).

Cybersecurity is a relatively young field integrating many disciplines, such as teleinformatics, management, law, ethics, psychology, and sociology. Cyberspace, as a virtual world combining a spectrum of phenomena, goes beyond the mere transposition of the physical world into the world of machines. It is a construct governed by principles and laws different from those of our physical environment; for example, Newton's laws of motion do not directly apply. At the same time, cyberspace penetrates the real world to an equal degree, and ensuring IT security is therefore the key issue.

This involves securing the operation of computer systems and ICT networks under various types of threats, the scale of which increases with the proliferation of mobile devices. Security is now critical to the functioning of various branches of the economy and administration, of both the Polish state and EU structures, as well as to the safety of users of public networks. The CS team cooperates with many research and industrial centers: NASK, Cryptomage SA, Intel, Cyber Security and Information Systems Analysis Center, US Army Research Laboratory, George Mason University, Virginia Tech, Fraunhofer FKIE (Bonn), TU Darmstadt, KTH Royal Institute of Technology; and in Poland: Orange Poland and the National Institute of Telecommunications.

An important aspect of the current work in this area is the implementation of grants won under national programs (National Science Center for basic research, e.g. MLGenSig or DALEX; National Center for Research and Development for applied research, e.g. the GTS-LOG project - GPU-based stream generation from high volume text logs, Forensics: Advanced digital forensics lab, PDAS: Network anomaly detection platform), national grants supported by the European Union (e.g. preparation and launch of Master's studies in the field of Data Science, co-financed from EU funds under the European Social Fund), international projects funded by the European Union in Horizon 2020 (e.g. the ICT-16 VaVeL project - Variety, Veracity, VaLue: Handling the Multiplicity of Urban Sensors, and the SIMARGL project - Secure Intelligent Methods for Advanced Recognition of Malware and Stegomalware) and industry-led projects.

Research plans

Our plans for the future focus on creating synergies between sub-areas, in particular between cybersecurity and both BSDS and AML. From a cybersecurity perspective, the key issue is the analysis of large data sets in order to correlate phenomena associated with attacks. Two aspects are important: the speed of algorithms (so-called fast data), which should preferably operate in real time or close to it, and the reliability of results (reducing false positives and false negatives). Another important feature is the ability to keep the history of events in order to later analyze possible relationships between security incidents. This requires a deliberate reduction of the data subjected to further analysis.

We are planning to substantially increase the research output of the group by using additional funding expected from the "Excellence Initiative – Research University" programme to:

  • increase the share of major international research projects; this will be done by stimulating more high-quality applications (proposals) for EU/international funding, with particular attention to the intersection of cybersecurity and data analytics,
  • apply IPR management that will include open licensing of some of the software developed by the group, to promote cooperation with international researchers in the field,
  • increase the contribution of the group to major academic and industry-focused international events (presentations, co-organising workshops and tutorials), which is expected to strengthen cooperation with foreign research groups on topics such as high-performance stream mining,
  • initiate the Summer Data Schools for Central Europe program to attract the best young researchers to cooperate with WUT groups.

The group plans to increase the number of top-tier research papers, to improve dissemination methods, and to develop new software. Additional funding made available within priority research areas will also enable the development of international studies in Data Science. Furthermore, research endeavours will be combined with joint R&D projects performed with technology partners and aimed at large-scale deployment of the algorithms and methods developed by the group. Such cooperation with the City of Warsaw (1.8 million inhabitants) and Orange Polska telecom (over 10 million clients) has already been established within the VaVeL project and will be continued to provide the basis for new applied research projects.

Drawing on the experience gained through current research work performed under scientific grants, it is planned to develop research in the area of human genome analysis using parallel algorithms running on GPU processors. The computational cluster constructed for the analysis of genomic data will become a place where the synergy between different fields of science will contribute to achieving significant progress in each of them.

In connection with the development of research in this area, further recruitment of researchers and academic teachers conducting research and teaching activities in cybersecurity and data science is planned. The additional funding foreseen for the priority research area will enable broader participation of employees and doctoral students in the highest-ranked global conferences (including A and A* rank conferences in the CORE ranking), as well as the introduction of a research funding system promoting further achievements of international quality.

Importantly, cybersecurity and data science have recently attracted significant interest in Poland. This is reflected, among others, in the success of the Data Science Summit conference (hosted by the Faculty of Mathematics and Information Science), which attracted over 1,000 professionals from industry and academia in 2018.

Education programs

In the academic year 2017/2018, the ICT and Cybersecurity specialization was launched at the Faculty of Electronics and Information Technology (EIT) as a part of MSc studies in Telecommunication. The Faculty's offer also includes the highly popular postgraduate studies Information security in ICT: design and audit, and Information Systems Security with Biometric Techniques. In 2015, the Cybersecurity Unit was established as a part of the Institute of Telecommunications of WUT. In October 2019, studies in Cybersecurity at BSc level will be launched at EIT. The start of the MSc studies in Cybersecurity is planned for February 2023.

In the academic year 2017/2018, the Faculty of Mathematics and Information Science (FMIS) launched studies in the field of Data Science at BSc level, and in the academic year 2018/2019 also at master's level. The program entails a particularly high share of classes focused on building creative problem-solving skills (e.g. the design thinking method), including project-based classes. Data Science BSc studies have already attracted a record-breaking number of candidates per place (the highest number at WUT), and the candidates were very strong, which was reflected in the highest point threshold at WUT. Individual modules are delivered in cooperation with, among others, global IT leaders (such as Amazon, SAS Institute). FMIS lecturers hold relevant industry certificates (cloud technologies, PMP and PRINCE2 Foundation/Practitioner). Based on the agreement with SAS Institute, FMIS students can obtain the international SAS Data Science Joint Certificate, awarded upon passing SAS-recognised modules offered by FMIS. The Data Science program offered by FMIS is the only Polish master's program listed in Education Hub, a repository of educational programs in Big Data run by BDVA. FMIS also contributes to the MBA Digital Transformation program of WUT Business School.

From the academic year 2020/2021, FMIS is planning to start the MSc Data Science program in English. Judging from over 20 years of experience in running studies in Computer Science in English, attended by groups of international students, high interest from candidates and a high degree of internationalization of the Data Science program are expected.

PhD studies are also offered in the field of cybersecurity and data science. Some PhD candidates work in cooperation with the public and private sector as a part of the implementation doctorate program developed by the Ministry of Science and Higher Education.

Human capital

The cybersecurity and data science field includes over 50 researchers at WUT specialised in computer science, telecommunications, information security, and mathematics.