A Tool for Visualization and Analysis of Single-Cell RNA-Seq Data Based on Text Mining

Gambardella, Gennaro and di Bernardo, Diego (2019) A Tool for Visualization and Analysis of Single-Cell RNA-Seq Data Based on Text Mining. Frontiers in Genetics, 10. ISSN 1664-8021

Abstract

Gene expression in individual cells can now be measured for thousands of cells in a single experiment thanks to innovative sample-preparation and sequencing technologies. State-of-the-art computational pipelines for single-cell RNA-sequencing data, however, still employ computational methods that were developed for traditional bulk RNA-sequencing data, thus not accounting for the peculiarities of single-cell data, such as sparseness and zero-inflated counts. Here, we present a ready-to-use pipeline named gf-icf (gene frequency–inverse cell frequency) for normalization of raw counts, feature selection, and dimensionality reduction of scRNA-seq data for their visualization and subsequent analyses. Our work is based on a data transformation model named term frequency–inverse document frequency (TF-IDF), which has been extensively used in the field of text mining where extremely sparse and zero-inflated data are common. Using benchmark scRNA-seq datasets, we show that the gf-icf pipeline outperforms existing state-of-the-art methods in terms of improved visualization and ability to separate and distinguish different cell types.
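
To make the text-mining analogy concrete, the following is a minimal sketch of a TF-IDF-style transformation applied to a cells-by-genes count matrix, treating each cell as a "document" and each gene as a "term". It is not the authors' gf-icf implementation: the function name `gf_icf_sketch`, the smoothed inverse-frequency formula, the per-cell L2 rescaling, and the toy matrix are all assumptions chosen for illustration, and the published pipeline may differ in each of these choices.

```python
import numpy as np

def gf_icf_sketch(counts):
    """Hypothetical TF-IDF-style transform; counts is (n_cells, n_genes)."""
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    # Gene frequency: each cell's counts divided by that cell's total count.
    totals = counts.sum(axis=1, keepdims=True)
    gf = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    # Inverse cell frequency (smoothed variant, an assumption): genes
    # detected in fewer cells receive a larger weight.
    cells_expressing = (counts > 0).sum(axis=0)
    icf = np.log((1.0 + n_cells) / (1.0 + cells_expressing)) + 1.0

    # Weight the frequencies, then rescale each cell to unit L2 norm.
    weighted = gf * icf
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    return np.divide(weighted, norms, out=np.zeros_like(weighted), where=norms > 0)

# Toy data (made up): 3 cells x 3 genes of raw counts.
toy = np.array([[5, 0, 1],
                [3, 0, 0],
                [0, 2, 4]])
scores = gf_icf_sketch(toy)
```

Zero counts stay zero under this transform, so the sparsity of the input is preserved, which is one reason the TF-IDF family suits sparse count data.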

Item Type: Article
Subjects: Afro Asian Library > Medical Science
Divisions: Faculty of Law, Arts and Social Sciences > School of Art
Depositing User: Unnamed user with email support@afroasianlibrary.com
Date Deposited: 09 Feb 2023 08:25
Last Modified: 24 Aug 2024 13:21
URI: http://classical.academiceprints.com/id/eprint/190
