
Genetic Prediction of Cancer Recurrence: Scientists Verify Reliability of Computer Models

In biomedical research, machine learning algorithms are often used to analyse data, for instance to predict cancer recurrence. However, it is not always clear whether these algorithms are detecting meaningful patterns or merely fitting random noise in the data. Scientists from HSE University, IBCh RAS, and Moscow State University have developed a test that makes it possible to tell the two apart. It could become an important tool for verifying the reliability of algorithms in medicine and biology. The study has been published on arXiv.

Machine learning methods help analyse complex biological data, for example to predict the likelihood of cancer recurrence from gene expression, which reflects the activity levels of specific DNA regions within cells.

A team of scientists from HSE University, IBCh RAS, and Moscow State University has developed a test to assess how reliably a classifier distinguishes between different patient groups. In this case, the two groups were patients who experienced a recurrence of the disease and those who did not. A model performs correctly if it captures biologically meaningful differences. If the algorithm simply separates the data at random, its accuracy may appear deceptively high. The researchers focused on linear classifiers, one of the most widely used ML tools in biomedicine.

'We aimed to test whether randomly generated (synthetic) data could be separated by a linear classifier as effectively as real biological samples. To do this, we calculated an upper bound on the p-value, which indicates the likelihood that the model is merely "guessing." The lower this p-value, the more reliable the classifier,' explains Anton Zhiyanov, Research Fellow at the HSE Laboratory of Molecular Physiology. 

The researchers conducted a series of experiments using synthetic data, allowing them to precisely control the degree of differences between classes. They then applied the new test to real-world medical models that predict the risk of breast cancer recurrence. 
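The idea of comparing a classifier's performance on real labels with its performance on random ones can be illustrated with a short sketch. This is not the researchers' test, which derives an analytical upper bound on the p-value; a simple Monte Carlo permutation approach is swapped in here for illustration. All function names, the nearest-centroid linear rule, and the Gaussian synthetic data are assumptions made for this example.

```python
import random


def fit_centroid_classifier(points, labels):
    """Fit a nearest-centroid rule; its decision boundary is a straight line (hyperplane)."""
    dim = len(points[0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for x, y in zip(points, labels):
        counts[y] += 1
        for j in range(dim):
            sums[y][j] += x[j]
    c0 = [s / counts[0] for s in sums[0]]
    c1 = [s / counts[1] for s in sums[1]]
    w = [a - b for a, b in zip(c1, c0)]          # normal vector of the boundary
    mid = [(a + b) / 2 for a, b in zip(c1, c0)]  # boundary passes through the midpoint
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, bias


def accuracy(points, labels, w, bias):
    """Fraction of points that fall on the side of the boundary matching their label."""
    correct = 0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + bias
        correct += int((score > 0) == (y == 1))
    return correct / len(points)


def permutation_p_value(points, labels, n_permutations=500, seed=0):
    """Estimate how often shuffled labels are separated at least as well as the real ones."""
    rng = random.Random(seed)
    w, b = fit_centroid_classifier(points, labels)
    observed = accuracy(points, labels, w, b)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between features and labels
        w, b = fit_centroid_classifier(points, shuffled)
        if accuracy(points, shuffled, w, b) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


def make_synthetic(n_per_class, shift, seed=0):
    """Two 2-D Gaussian classes; `shift` controls how far apart the class means are."""
    rng = random.Random(seed)
    points, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            points.append([rng.gauss(label * shift, 1.0), rng.gauss(label * shift, 1.0)])
            labels.append(label)
    return points, labels


if __name__ == "__main__":
    for shift in (0.0, 4.0):  # no class difference vs. a strong one
        pts, labs = make_synthetic(30, shift, seed=1)
        acc, p = permutation_p_value(pts, labs, seed=2)
        print(f"shift={shift}: accuracy={acc:.2f}, p-value={p:.3f}")
```

With a large `shift`, shuffled labels should almost never be separated as well as the real ones, so the estimated p-value is small. With `shift=0.0`, the classes differ only by chance and the p-value gives no grounds to trust the classifier.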

The results showed that most classifiers failed to capture any meaningful differences between patients with and without recurrence. Further analysis revealed that 559 out of 570 models produced results consistent with random chance. This suggests that many algorithms may appear accurate, while in reality their predictions are driven by coincidences rather than genuine patterns.

However, the researchers also identified reliable models that reveal biologically meaningful patterns. One such model was a classifier that focused on the activity levels of the ELOVL5 and IGFBP6 genes. This algorithm was further tested on an independent data sample, confirming that differences in the expression of these genes are indeed linked to the risk of cancer recurrence.

Each point on the graph represents a patient, with the expression levels of two genes measured: IGFBP6 on the X-axis and ELOVL5 on the Y-axis. The orange dots represent patients with a recurrence, while the blue dots represent those without. In the first graph, these points (patients) are clearly separated by a straight line, representing a linear classifier. In the second graph, the points are randomly distributed, and the classifier fails to identify any patterns between gene expression and actual recurrence.

'Our test could become an important tool for verifying the reliability of algorithms in biology and medicine. It helps prevent false conclusions and emphasises models that truly identify important patterns, which is crucial for making decisions about patient treatment,' comments Alexander Tonevitsky, Professor at the HSE Faculty of Biology and Biotechnology.

The study was conducted with support from HSE University's Basic Research Programme within the framework of the Centres of Excellence project.
