News - Publication

DNA barcodes to identify groups of cells


A team from Institut Curie and CNRS has developed open-access software and an experimental guide to help researchers analyze cell genealogy. Their study was published in Nature Computational Science on Monday, February 19, 2024. These results could have numerous applications in cancer research, in particular for identifying the groups of cells involved in mechanisms such as resistance to treatment.

DNA barcode

© Wenjie Sun, Institut Curie / CNRS / Sorbonne

The emergence of techniques for tracing cell origins has marked an important turning point in developmental and cancer biology. One such technique, known as DNA cellular barcoding, makes it possible to identify the ancestry of a group of cells.


Cell family names

"This technique amounts to giving a family name to groups of cells, which is then passed down through cellular generations,"

explains Dr. Wenjie Sun, postdoctoral fellow in the Quantitative Immuno-hematology team (CNRS UMR168 / Sorbonne University), headed by Dr. Leïla Perié at Institut Curie.


In practice, researchers integrate unique nucleotide sequences into the cell's genome. Each time the DNA replicates, the artificial sequence (the family name in Dr. Wenjie Sun's analogy) is transmitted to the daughter cells, which makes it possible to trace their origin.

To read the family name, researchers sequence the DNA and search for the artificial sequence. Finding it reveals the genealogy and therefore the family to which the studied cell belongs.
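As a rough illustration of this reading step, here is a minimal Python sketch, not the team's software. It assumes, purely for the example, that each barcode is 6 nucleotides long and sits between two fixed flanking sequences (`ACGT` and `TTAG`). The pattern, the function name `count_barcodes` and the reads are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical layout: a 6-nucleotide barcode between two fixed flanking
# sequences. Real constructs differ; this pattern is an assumption.
BARCODE_PATTERN = re.compile(r"ACGT([ACGT]{6})TTAG")


def count_barcodes(reads):
    """Count how many reads carry each barcode ("family name")."""
    counts = Counter()
    for read in reads:
        match = BARCODE_PATTERN.search(read)
        if match:  # reads without the flanking sequences are ignored
            counts[match.group(1)] += 1
    return counts


# Made-up sequencing reads, for illustration only.
reads = [
    "GGACGTAAACCCTTAGCC",
    "TTACGTAAACCCTTAGAA",
    "CCACGTGGGTTTTTAGGG",
    "CCCCCCCCCCCCCCCC",
]
print(count_barcodes(reads))
# prints Counter({'AAACCC': 2, 'GGGTTT': 1})
```

Reads that share a barcode are counted as members of the same family. The last read carries no recognizable barcode and is skipped.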

The limitations of reading cellular family names

"Many teams use this technique, but there is currently no consensus on how the results should be analyzed. This makes it difficult to compare results between studies and slows down discoveries. The problem lies in reading the family name. DNA sequencing techniques can introduce errors into the artificial sequence, resulting in misspelled cell family names. We need to decide whether a given sequence is a genuine family name or a name with a spelling mistake, like Dupont and Dupond in Tintin, for example,"

explains Dr. Leïla Perié.
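One common way to handle the "Dupont or Dupond" question can be sketched in a few lines of Python. This is an illustrative assumption, not the method implemented in the team's software. A rare sequence that differs by a single letter from a much more abundant one is treated as a misspelling and folded into it. The thresholds `max_distance` and `min_ratio` and the counts are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def merge_misspellings(counts, max_distance=1, min_ratio=10):
    """Fold rare sequences into a near-identical, far more abundant one.

    max_distance and min_ratio are illustrative thresholds, not
    recommended values.
    """
    merged = {}
    # Visit sequences from most to least abundant.
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        parent = None
        for kept in merged:
            if (len(kept) == len(seq)
                    and hamming(kept, seq) <= max_distance
                    and merged[kept] >= min_ratio * n):
                parent = kept
                break
        if parent is None:
            merged[seq] = n        # treated as a genuine family name
        else:
            merged[parent] += n    # treated as a misspelling of `parent`
    return merged


# Made-up read counts, for illustration only.
counts = {"AAACCC": 950, "AAACCG": 12, "GGGTTT": 400, "GGGTTA": 300}
print(merge_misspellings(counts))
# prints {'AAACCC': 962, 'GGGTTT': 400, 'GGGTTA': 300}
```

Here `AAACCG` is absorbed into `AAACCC` because it is rare and one letter away. `GGGTTA` is kept as its own family: it is also one letter away from `GGGTTT`, but too abundant to be explained by sequencing errors alone under these thresholds.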

In this study, published in Nature Computational Science, Leïla Perié's team has developed software to analyze these data, along with a decision tree to guide researchers through the analysis.

"Different parameters need to be considered when analyzing these data, depending, for example, on the nature of the cells studied. If we assign a family name to cancer cells, which proliferate a lot, the large size of the family may affect the reading of the name. On the other hand, if we are studying neurons, which are non-dividing cells, we will need to adapt the analysis strategy. So it is necessary to decide, before the experiment and the data analysis, which approaches will be used. And that is exactly what we are proposing here, with free access for all researchers: an experimental guide for DNA cellular barcoding data analysis,"

concludes Dr. Leïla Perié.


The software tool and its experimental guide are already being used by other teams at Institut Curie and will be used to identify the groups of cells involved in treatment resistance in certain cancers, such as triple-negative breast cancer. Open access to these tools could standardize the analysis of DNA cellular barcoding data, making it easier to compare results between different teams.


Reference: Wenjie Sun, Meghan Perkins, Mathilde Huyghe, Marissa M. Faraldo, Silvia Fre, Leïla Perié, Anne-Marie Lyne. Extracting, filtering and simulating cellular barcodes using CellBarcode tools. Nature Computational Science, 19 February 2024. doi: 10.1038/s43588-024-00595-7