From LipidomicsWiki
Preprocessing of Agilent Single Color Arrays
This SPP describes the preprocessing of single-color Agilent array data after the feature extraction step. The following settings are based on suggestions made by the Agilent GeneSpring software for their arrays. It may be necessary to change some of these settings in certain experiments, so this SPP should be understood as a suggestion. The pipeline is implemented as an R script and is available from the Institute for Clinical Chemistry in Regensburg.
To use it, you need to have R installed (available from www.r-project.org).
Input
The R script reads three kinds of files: the output files of one microarray experiment as provided by Feature Extractor, an annotation file as provided by Agilent with a filename ending in “.txt”, and a tab-delimited text file with the extension “.tsv” that describes the groups the samples belong to (e.g. treatment or control). The first column of the grouping file must contain the exact filenames of the samples it refers to. The latter two files are not necessary for the preprocessing described in this SPP, but the script reads them if they are provided. The files should be stored in subfolders of the current working directory: “annotation” for the feature annotation, “groupings” for the group information and “arrays” for the intensity files from the Feature Extractor. The folder names are important because the files are identified by their folder. “annotation” and “groupings” must not contain more than one file each.
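The folder layout described above can be checked before running the pipeline. The following Python sketch is only an illustration and not part of the R script; the function names `list_files` and `find_inputs` are hypothetical, and it assumes that every file in “arrays” is an intensity file and that only “.txt” and “.tsv” files count in the other two folders.

```python
from pathlib import Path


def list_files(folder, pattern="*"):
    """Return the sorted files in folder matching pattern (empty list if folder is missing)."""
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.glob(pattern) if p.is_file())


def find_inputs(workdir="."):
    """Collect the input files from the three subfolders of workdir."""
    root = Path(workdir)
    arrays = list_files(root / "arrays")                   # intensity files (assumed: all files)
    annotation = list_files(root / "annotation", "*.txt")  # optional feature annotation
    groupings = list_files(root / "groupings", "*.tsv")    # optional group information
    if not arrays:
        raise FileNotFoundError("no intensity files found in 'arrays'")
    for name, files in (("annotation", annotation), ("groupings", groupings)):
        if len(files) > 1:
            raise ValueError(f"'{name}' must not contain more than one file")
    return {
        "arrays": arrays,
        "annotation": annotation[0] if annotation else None,
        "groupings": groupings[0] if groupings else None,
    }
```

Calling `find_inputs()` in the working directory returns the paths grouped by role, with `None` for an annotation or grouping file that was not provided.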
Output
Transformed and normalized feature intensities.
The processed data are stored in a list called data.norm. It contains matrices with the normalized and transformed intensities ("expr"), the feature annotation ("feat.anno"), the grouping, e.g. the kind of treatment ("samp.anno"), and matrices of the flag values selected by "flags" (see the list below).
Processing
The R script basically performs the following steps, plus an optional "Per Gene Normalization". Other inputs, such as the feature annotation, the flags and the sample annotation, are only read and stored.
Transformation
For further computation, the feature intensities should be log-transformed with base 2.
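As a minimal illustration of this step (a Python sketch, not the R script itself), the transformation can be written as follows. The function name `log2_transform` is hypothetical, and mapping non-positive intensities to NaN is an assumption, since the source does not say how such values are treated.

```python
import math


def log2_transform(intensities):
    """Return the base-2 logarithm of each raw intensity.

    Non-positive values have no logarithm; they are mapped to NaN here (assumption).
    """
    return [math.log2(x) if x > 0 else math.nan for x in intensities]
```

On this scale a doubling of the raw intensity corresponds to an increase of 1.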
Normalization
Normalization reduces systematic bias in microarray experiments. GeneSpring uses a method called "Shift to 75.0 percentile": the 75th percentile of the log-transformed probe intensities is computed for every sample, and a sample-specific constant is then subtracted from all log values of that sample so that every sample ends up with the same 75th percentile.
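The percentile shift can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the R script or the GeneSpring implementation: each sample's own 75th percentile is subtracted, so the common 75th percentile becomes 0, and the percentile is computed with linear interpolation. The function names are hypothetical and missing values are not handled.

```python
import statistics


def percentile_75(values):
    """75th percentile with linear interpolation (the 'inclusive' quantile method)."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


def shift_to_75th_percentile(log_samples):
    """Subtract each sample's 75th percentile from its log2 intensities.

    log_samples maps a sample name to a list of log2 intensities.
    Afterwards every sample has a 75th percentile of 0 (assumed target value).
    """
    shifted = {}
    for name, values in log_samples.items():
        p75 = percentile_75(values)
        shifted[name] = [v - p75 for v in values]
    return shifted
```

Because only a constant is subtracted per sample, the differences between probes within one sample are unchanged.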
Baseline Transformation
This step emphasizes similar expression of all genes. There are two ways of baseline transformation. Common to both is that a probe-specific value is subtracted from every intensity belonging to that probe. This value is either the median of the probe's intensities across all samples or the median of its intensities in a control group.
Baseline transformation, sometimes also called "Per Gene Normalization", seems to be unusual in microarray analysis and should therefore be considered highly optional.
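Both variants of the baseline transformation can be illustrated with one small Python function. This is a sketch, not the R script; the function name `baseline_transform`, the nested-dictionary data layout and the parameter `control_samples` are assumptions made for the example.

```python
import statistics


def baseline_transform(expr, control_samples=None):
    """Subtract a per-probe median from the normalized log2 intensities.

    expr maps a probe name to a dict of {sample name: log2 intensity}.
    If control_samples is None, the median over all samples is used;
    otherwise the median over the listed control samples.
    """
    result = {}
    for probe, by_sample in expr.items():
        if control_samples is None:
            reference = list(by_sample.values())
        else:
            reference = [by_sample[s] for s in control_samples]
        baseline = statistics.median(reference)
        result[probe] = {s: v - baseline for s, v in by_sample.items()}
    return result
```

With `control_samples` set, the resulting values can be read as log2 differences relative to the median of the control group.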
