Data format

  • Data must be stored in a tab-separated text file with Windows-format (CRLF) line endings.
  • The first row is the header that describes each column.
  • Each row from the second onward corresponds to one gene/probe.
  • Columns are separated by tabs and must appear in the order below. The easiest way to produce this is to select the appropriate columns in your Excel file and save them as tab-delimited text.
    • First column, Probe Set Id
    • Second column, Gene Symbol
    • Third column, UniGene Id
    • Fourth column, Rep. Pub. Id
    • Fifth column, Gene Title and other information

    Note: the first five columns may have missing values. Missing or even inaccurate values in these columns do not affect our analysis. However, they may cause inaccurate results from external resources such as NCBI, GeneMANIA, DAVID, and GCAT.

    • The sixth and all later columns give the expression value of each replicate for a treatment.
    • Replicates for each treatment group must be in consecutive columns. The first group must be the Control group.
    • Expression values must not be log-transformed (non-log values only).
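
    The rules above can be checked before uploading with a short script. The following is a minimal sketch in Python, not part of this tool: the function name check_expression_file and its parameter n_annotation_cols are hypothetical, the file is assumed to be UTF-8 encoded, and treating a negative number as a sign of log-transformed data is only a heuristic. The sample rows are made up for illustration.

```python
import csv
import io


def check_expression_file(raw: bytes, n_annotation_cols: int = 5) -> list[str]:
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper: it only checks the layout rules described above.
    """
    problems = []
    text = raw.decode("utf-8", errors="replace")  # assumption: UTF-8 text

    # Windows format: after removing every CRLF pair, no lone CR or LF may remain.
    leftover = text.replace("\r\n", "")
    if "\n" in leftover or "\r" in leftover:
        problems.append("line endings are not all CRLF")

    # Parse as tab-separated; skip completely empty lines.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    rows = [row for row in reader if row]
    if len(rows) < 2:
        problems.append("need a header row and at least one gene/probe row")
        return problems

    header = rows[0]
    if len(header) <= n_annotation_cols:
        problems.append("no expression columns after the annotation columns")

    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {line_no}: {len(row)} columns, header has {len(header)}"
            )
            continue
        # Annotation columns may be empty; expression columns must be numbers.
        for value in row[n_annotation_cols:]:
            try:
                number = float(value)
            except ValueError:
                problems.append(f"row {line_no}: non-numeric value {value!r}")
                break
            if number < 0:
                # Heuristic only: raw intensities are not expected to be negative.
                problems.append(
                    f"row {line_no}: negative value {value!r}, possibly log data"
                )
                break
    return problems


# Made-up example: two Control replicates followed by two treatment replicates.
HEADER = "Probe Set Id\tGene Symbol\tUniGene Id\tRep. Pub. Id\tGene Title\tC1\tC2\tT1\tT2"
good = (HEADER + "\r\nprobe_1\tGENE1\t\t\t\t120.5\t98.0\t310.2\t287.9\r\n").encode()
bad = (HEADER + "\nprobe_1\tGENE1\t\t\t\t-1.2\t0.4\t2.1\t1.9\n").encode()

print(check_expression_file(good))
print(check_expression_file(bad))
```

    The script cannot tell whether the Control group comes first or whether replicates are grouped in consecutive columns, so those two rules still need to be checked by eye.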

    Here's a snapshot of a sample dataset (shown in Excel for readability; you must convert it to text format before uploading).