This FAQ collects frequently asked questions (with answers!) about the NONCODE database.
Question:Where have the data been collected from?
Answer:We collected ncRNA data from three sources: 1) literature mining, 2) GenBank, and 3) specialized databases such as Ensembl, RefSeq, lncRNAdb, and LNCipedia.
Question:How and how often will the database be updated?
Answer:First, we automatically filter GenBank with a keyword list every six months. Second, we run large-scale screens and analyses for new lncRNAs in human and mouse. New data are added to the database gradually.
Question:Will the database offer more services?
Answer:We are working to offer more services to meet users' requirements. So far, NONCODE provides not only basic services such as BLAST and a Genome Browser, but also an ID conversion tool and an online pipeline for identifying lncRNAs. In NONCODEv5, we added lncRNA-disease information, exosome expression profiles of lncRNAs, and lncRNA secondary structures.
Question:What is the relationship between "lncRNA gene (NONHSAG*)" and "lncRNA (NONHSAT*)"? Are they similar to the definitions of "gene" and "transcript (isoform)" for mRNA, where each gene may have several different transcripts due to alternative splicing? If so, how can I find out which "lncRNAs" are generated from a certain "lncRNA gene"?
Answer:LncRNAs have alternative splicing patterns similar to those of mRNAs. Therefore, the concept of lncRNA genes was introduced in this update to support a systematic understanding of lncRNAs. The definitions of "gene" and "transcript (isoform)" used in NONCODE v4 are similar to those for mRNAs.
Question: If I want to convert the BED file to a FASTA file for RNA-seq analysis, such as mapping reads to transcripts, should I consider the difference between the IDs (NONHSAG* and NONHSAT*)?
Answer: In RNA-seq analysis, if you are interested in the expression of lncRNA genes, you may consider both NONHSAG* and NONHSAT*, probably by assembling them into a GTF file, since integrating the expression levels of the transcripts of a specific gene gives the expression profile of that gene. If you treat transcripts independently, you may consider NONHSAT* only.
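As a minimal sketch of the gene-level summary described above, the Python snippet below reads gene_id and transcript_id attributes from GTF-style lines and sums transcript expression values per gene. The GTF lines, the IDs, the expression values, and the function names are all hypothetical and only illustrate the idea; this is not the NONCODE pipeline, and summing is just one simple way to integrate transcript-level values (it assumes additive units such as TPM).

```python
import re
from collections import defaultdict

# Hypothetical GTF lines: IDs and coordinates are made up for illustration.
GTF_LINES = [
    'chr1\tNONCODE\texon\t100\t200\t.\t+\t.\tgene_id "NONHSAG000001.1"; transcript_id "NONHSAT000001.1";',
    'chr1\tNONCODE\texon\t100\t350\t.\t+\t.\tgene_id "NONHSAG000001.1"; transcript_id "NONHSAT000002.1";',
    'chr2\tNONCODE\texon\t500\t900\t.\t-\t.\tgene_id "NONHSAG000002.1"; transcript_id "NONHSAT000003.1";',
]


def transcript_to_gene(gtf_lines):
    """Map each transcript ID (NONHSAT*) to its gene ID (NONHSAG*)."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        # Column 9 holds key "value"; pairs.
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        if "gene_id" in attrs and "transcript_id" in attrs:
            mapping[attrs["transcript_id"]] = attrs["gene_id"]
    return mapping


def gene_expression(transcript_expr, mapping):
    """Sum transcript-level values into gene-level values."""
    totals = defaultdict(float)
    for transcript_id, value in transcript_expr.items():
        gene_id = mapping.get(transcript_id)
        if gene_id is not None:  # ignore transcripts without a known gene
            totals[gene_id] += value
    return dict(totals)


if __name__ == "__main__":
    mapping = transcript_to_gene(GTF_LINES)
    # Hypothetical transcript-level expression values (e.g. TPM).
    expr = {
        "NONHSAT000001.1": 2.0,
        "NONHSAT000002.1": 3.5,
        "NONHSAT000003.1": 1.0,
    }
    print(gene_expression(expr, mapping))
```

The same transcript-to-gene mapping also lists which NONHSAT* transcripts belong to a given NONHSAG* gene, provided the GTF file carries both attributes.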
Question:A new file is available as NONCODEv5_human.func. Which method did you use to predict the function of lncRNAs? The coding-non-coding expression network (ncFANS), or another method mentioned in the papers?
Answer: Functions for lncRNA genes were predicted by lnc-GFP (see PMID: 23132350), a bi-colored network-based global function predictor.
Question:What does the "_R" after the tissue sample type in the attached image mean? For example, there is a Liver sample and a Liver_R sample.
Answer:Our RNA-seq data come from the paper "Integrative Annotation of Human Large Intergenic Non-Coding RNAs Reveals Global Properties and Specific Subclasses".