Description
Most breast cancer samples used in clinical research contain multiple cell types in addition to epithelial cells. These non-epithelial cell types have a substantial effect on the gene-expression profile, which is used to define molecular subtypes of the tumours. The purpose of this data set is to capture the gene-expression profile of tumour epithelial cells. We collected 9 breast cancer epithelial cell lines and 5 tumour samples from which epithelial cells were sorted and enriched using BerEp4 antibody-coated beads. We profiled the mRNA expression levels of these samples and classified as epithelial genes those probe sets with present calls in at least 50% of the samples. We then derived a 23-gene signature, based only on the epithelial genes, to stratify breast cancer.
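The filtering rule for epithelial genes (present calls in at least 50% of the samples) can be illustrated with a minimal sketch. This is not the pipeline used to build the data set. It assumes that each probe set carries one detection call per sample coded as "P" (present), "M" (marginal) or "A" (absent), and that only "P" counts as a present call. The function name, the probe identifiers and the example calls are hypothetical.

```python
def select_epithelial_probes(calls, min_fraction=0.5):
    """Return probe set IDs whose fraction of present calls is >= min_fraction.

    calls: dict mapping a probe set ID to a list of detection calls,
           one per sample. Assumed coding: "P" present, "M" marginal,
           "A" absent. Only "P" is counted as present (an assumption).
    """
    selected = []
    for probe_id, probe_calls in calls.items():
        if not probe_calls:
            continue  # no samples for this probe set, nothing to evaluate
        n_present = sum(1 for call in probe_calls if call == "P")
        if n_present / len(probe_calls) >= min_fraction:
            selected.append(probe_id)
    return selected


# Hypothetical calls for 14 samples (9 cell lines + 5 sorted tumour samples).
example_calls = {
    "probe_A": ["P"] * 7 + ["A"] * 7,          # 7/14 present, kept
    "probe_B": ["P"] * 6 + ["A"] * 8,          # 6/14 present, dropped
    "probe_C": ["P"] * 14,                     # 14/14 present, kept
    "probe_D": ["P"] * 6 + ["M"] + ["A"] * 7,  # marginal not counted, dropped
}

epithelial_probes = select_epithelial_probes(example_calls)
print(epithelial_probes)  # prints ['probe_A', 'probe_C']
```

With 14 samples, the 50% threshold corresponds to a present call in at least 7 samples. Whether marginal calls should count towards that threshold is not stated in the description, so the sketch excludes them.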