NONCODE is a database of all kinds of noncoding RNAs (except tRNAs and rRNAs). Several features distinguish it from other ncRNA databases:
- NONCODE holds a large volume of data covering almost all traditional ncRNA classes, and it is updated regularly to remain current and comprehensive.
- All sequences are confirmed manually against the cited references, and more than 80% of the data come from experiments.
- Because lncRNAs show alternative splicing patterns similar to those of mRNAs, we introduced the concept of lncRNA genes to support a systematic understanding of lncRNAs.
- We present expression profiles of lncRNA genes as graphs based on public RNA-seq data for human and mouse, and we predict the functions of these lncRNA genes.
- NONCODE also provides a tool that converts RefSeq or Ensembl IDs to NONCODE IDs, and an lncRNA identification service.
Since NONCODE v3.0 was released 2 years ago, high-throughput RNA sequencing (RNA-seq) has accelerated the discovery of novel ncRNAs. In NONCODE v4, we expanded the ncRNA dataset by collecting newly identified ncRNAs from literature published in the last two years and by integrating the latest versions of RefSeq and Ensembl. In particular, the number of long noncoding RNAs (lncRNAs) has increased sharply, from 73,327 to 210,831. Because lncRNAs show alternative splicing patterns similar to those of mRNAs, we put forward the concept of lncRNA genes to support a systematic understanding of lncRNAs: 56,018 and 46,475 lncRNA genes were generated from 95,135 and 67,628 lncRNAs for human and mouse, respectively. Additionally, we present expression profiles of lncRNA genes as graphs based on public RNA-seq data for human and mouse, and we predict the functions of these lncRNA genes. Other improvements include a tool that converts RefSeq or Ensembl IDs to NONCODE IDs and an lncRNA identification service based on RNA-seq.
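As a rough illustration of what the counts above imply, the short Python sketch below derives the growth in lncRNA entries between releases and the average number of lncRNA transcripts per lncRNA gene. The figures are taken directly from the text; the variable and function names are hypothetical and are not part of any NONCODE tool or API.

```python
# Counts quoted in the NONCODE v4 description; names are illustrative only.
LNCRNA_V3 = 73_327
LNCRNA_V4 = 210_831

COUNTS = {
    "human": {"transcripts": 95_135, "genes": 56_018},
    "mouse": {"transcripts": 67_628, "genes": 46_475},
}


def transcripts_per_gene(transcripts: int, genes: int) -> float:
    """Average number of lncRNA transcripts grouped under one lncRNA gene."""
    return transcripts / genes


# Fold increase in lncRNA entries from v3.0 to v4.
growth = LNCRNA_V4 / LNCRNA_V3

# Average transcripts per gene, per species.
per_gene = {
    species: transcripts_per_gene(c["transcripts"], c["genes"])
    for species, c in COUNTS.items()
}

if __name__ == "__main__":
    print(f"lncRNA entries grew {growth:.2f}-fold from v3.0 to v4")
    for species, ratio in per_gene.items():
        print(f"{species}: {ratio:.2f} lncRNA transcripts per lncRNA gene")
```

On these numbers, the lncRNA collection grew almost threefold, and each lncRNA gene groups between one and two transcripts on average in both species.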
If you make use of the data presented here, please cite the following article in addition to the primary data sources: