    Please use this identifier to cite or link to this item: http://ir.nhri.org.tw/handle/3990099045/3739


    Title: Utilization of virtual samples to facilitate cancer identification for DNA microarray data in the early stages of an investigation
    Authors: Li, DC; Fang, YH; Lai, YY; Hu, SC
    Contributors: Division of Biostatistics and Bioinformatics
    Abstract: DNA microarray datasets are generally small in size, high dimensional with many non-discriminative genes, and non-linear with outliers. Their size-to-dimension ratio means that learning from DNA microarray datasets is a small-sample problem. Recently, researchers have developed various gene selection algorithms to discover the genes most relevant to a specific disease and thus to reduce computation. Most gene selection algorithms improve learning performance and efficiency, but they still suffer from insufficient training samples in the datasets. Moreover, in the early stage of diagnosing a new disease, only very limited data can be obtained, so the derived diagnostic model is usually unreliable for identifying the new disease. Consequently, diagnostic performance cannot always be robust, even with gene selection algorithms. To solve the problem of a very limited training dataset with non-linear data or outliers, we propose GVSG (Group Virtual Sample Generation), a non-linear Virtual Sample Generation algorithm. This method detects the characteristics of the very limited data, forms discrete groups for each discriminative gene, and systematically generates virtual samples for each of these to accelerate and stabilize the modeling process. The results show that this method significantly improves learning accuracy on DNA microarray data in the early stages of an investigation.
    Date: 2009-07-20
    Relation: Information Sciences. 2009 Jul 20;179(16):2740-2753.
    Link to: http://dx.doi.org/10.1016/j.ins.2009.04.003
    JIF/Ranking 2023: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=NHRI&SrcApp=NHRI_IR&KeyISSN=0020-0255&DestApp=IC2JCR
    Cited Times(WOS): https://www.webofscience.com/wos/woscc/full-record/WOS:000267562500003
    Cited Times(Scopus): http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=67349233125
    Appears in Collections: [Other] Journal Articles

    Files in This Item:

    File                  Description   Size     Format
    SCP67349233125.pdf                  1041Kb   Adobe PDF


    All items in NHRI are protected by copyright, with all rights reserved.
