Preprocessing of Single-Color Agilent Arrays

From LipidomicsWiki

Preprocessing of Agilent Single Color Arrays

This SPP describes the preprocessing of single-color Agilent array data after the feature extraction step. The settings follow the suggestions made by the Agilent GeneSpring software for Agilent arrays. Some of these settings may need to be changed for particular experiments, so this SPP should be understood as a suggestion. The pipeline is implemented as an R script and is available from the Institute for Clinical Chemistry in Regensburg.

To use it, you need to have R installed (available from www.r-project.org).


Input

The R script reads three kinds of files: the output files for one microarray experiment as provided by Feature Extractor; an annotation file as provided by Agilent, with a filename ending in “.txt”; and a tab-delimited text file describing the groups the samples belong to (e.g. treatment or control). The grouping file must contain the exact filenames of the samples it refers to in its first column and should have the extension “.tsv”. The latter two files are not necessary for the preprocessing described in this SPP, but they are read by the script if provided.

The files should be stored in subfolders of the current working directory: “annotation” for the feature annotation, “groupings” for the group information, and “arrays” for the intensity files from Feature Extractor. The folder names are important because the files are identified by their folder. “annotation” and “groupings” must not contain more than one file each.
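The folder layout described above can be checked before any processing starts. The following is a minimal Python sketch for illustration only, not the R script itself; the function name `find_inputs` and the error messages are hypothetical, and the actual pipeline may handle missing or surplus files differently.

```python
from pathlib import Path


def find_inputs(workdir):
    """Collect input files from the 'arrays', 'annotation' and 'groupings' subfolders.

    Hypothetical helper: returns a dict mapping each folder name to a sorted
    list of the files it contains (an empty list if the folder is missing).
    """
    root = Path(workdir)
    found = {}
    for folder in ("arrays", "annotation", "groupings"):
        directory = root / folder
        if directory.is_dir():
            found[folder] = sorted(p for p in directory.iterdir() if p.is_file())
        else:
            found[folder] = []

    # Intensity files are the only mandatory input.
    if not found["arrays"]:
        raise FileNotFoundError("no intensity files found in 'arrays'")

    # Annotation and grouping are optional, but at most one file each is allowed.
    for folder in ("annotation", "groupings"):
        if len(found[folder]) > 1:
            raise ValueError(f"'{folder}' must not contain more than one file")
    return found
```

Calling `find_inputs(".")` from the working directory would return the file lists per folder, or raise an error if the layout does not match the description.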

Output

Transformed and normalized feature intensities.

The processed data are stored in a list called data.norm, which contains several matrices: the normalized and transformed intensities ("expr"), the feature annotation ("feat.anno"), the grouping, e.g. the kind of treatment ("samp.anno"), and the matrices of the flag values selected by "flags". (See the list below.)


Processing

The R script essentially performs the following steps, optionally followed by a "Per Gene Normalization". Other data, such as annotation, flags and sample annotation, are only read and stored.



Transformation

For further computation, the feature intensities should be log-transformed to base 2.
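As an illustration of this step, here is a minimal Python sketch (the pipeline itself is an R script that is not reproduced here). The function name `log2_transform` is hypothetical, and mapping non-positive intensities to NaN is an assumption of this sketch, not a documented behaviour of the script.

```python
import math


def log2_transform(intensities):
    """Return the base-2 logarithm of each raw intensity.

    Assumption: intensities <= 0 have no logarithm and are returned as NaN.
    """
    return [math.log2(x) if x > 0 else float("nan") for x in intensities]
```

For example, raw intensities of 1, 2 and 8 become 0, 1 and 3 on the log2 scale, so a doubling of intensity always corresponds to a difference of 1.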


Normalization

Normalization reduces systematic bias in microarray experiments. GeneSpring uses a method called "Shift to 75.0 percentile". The 75th percentile of the log-transformed probe intensities is computed for every sample, and a sample-specific value is then subtracted from all log values of that sample so that every sample has the same 75th percentile.
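The percentile shift can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the R script or GeneSpring's implementation: the names `percentile` and `shift_to_percentile` are hypothetical, the percentile is computed with linear interpolation between order statistics, each sample's own 75th percentile is subtracted (so the common 75th percentile becomes 0), and the input is assumed to contain no missing values.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def shift_to_percentile(samples, q=75.0):
    """Subtract each sample's q-th percentile from its log2 intensities.

    samples: dict mapping a sample name to a list of log2 intensities.
    Assumption: the shift target is 0, i.e. the sample's own percentile
    is subtracted, so all samples share a q-th percentile of 0 afterwards.
    """
    shifted = {}
    for name, values in samples.items():
        p = percentile(values, q)
        shifted[name] = [v - p for v in values]
    return shifted
```

Because only a constant is subtracted per sample, the differences between probes within a sample are unchanged; only the overall level of each array is aligned.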


Baseline Transformation


This step emphasizes similar expression of all genes. There are two ways of performing a baseline transformation. Common to both is that a probe-specific value is subtracted from every intensity belonging to that probe. This value is either the median of that probe's intensities across all samples or the median of its intensities in a control group.

Baseline transformation, sometimes also called Per-Gene Normalization, seems to be unusual in microarray analysis and should therefore be considered highly optional.
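Both variants of the baseline transformation described above can be illustrated with one small Python function. This is a sketch under stated assumptions rather than the code of the R script: the name `baseline_transform`, the row-per-probe data layout and the `control_columns` parameter are hypothetical.

```python
from statistics import median


def baseline_transform(expr, control_columns=None):
    """Subtract a per-probe median from every value of that probe.

    expr: list of rows, one per probe; each row holds the normalized
          log2 intensities of that probe, one value per sample.
    control_columns: indices of the control samples. If None, the median
          over all samples of the probe is used as the baseline.
    """
    result = []
    for row in expr:
        if control_columns is None:
            reference = row
        else:
            reference = [row[i] for i in control_columns]
        base = median(reference)
        result.append([v - base for v in row])
    return result
```

With the default call, each probe is centered on its median across all samples; passing the indices of the control samples instead expresses every value relative to the control group's median.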


Proposals and discussion points

Write here any proposals and discussion points you may have.




Back to Microarray Standard Processing Procedures
