Instructions for reproducing the database:
File Name | Description | Size |
---|---|---|
Adeptus2.zip | All data, including raw expression data | 4,468MB |
Supervised_data_atleast_10000_genes_max_percet_na_10.RData | The final preprocessed supervised database, which can be easily loaded into an R session. Contains: | 2,291MB |
sample2study.RData | A mapping of samples to their studies. | 158KB |
classification_performance_scores_summary.RData | Results of the leave-study-out SVM cross-validation. Contains a list called classifier_scores_matrices, which holds several score matrices. In each matrix, the rows are labels and the first column is the score of the SVM-based classifier. The sublist classifier2selected_diseases contains, as its first entry, the list of 68 well-classified labels (13 tissue controls and 55 disease-related). | 16,652B |
classification_results_40k_samples.RData | An R object with the leave-study-out SVM cross-validation results: the actual predictions. | 242MB |
gene_dataset_p_matrices.RData | A list with an entry for each label. Each entry has a p-value matrix of genes vs labels. Each p-value is a result of comparing the label’s samples to the other samples in that study. | 228MB |
gene_pb_roc_scores.RData | PB-ROC scores: a matrix of genes vs labels. | 17,980KB |
gene_pn_roc_scores.RData | PN-ROC scores: a matrix of genes vs labels. | 14,501KB |
gene_edge_based_son2rocs.RData | Results of the edge-based analysis: a list with an entry for each disease label. Each entry is a matrix of genes vs the parents of the label (typically a single parent). | 14,971KB |
selected_genes_adeptus2.RData | A list with the selected genes for each label. | 20,004KB |
gpl_mappings_to_entrez.RData | A list that maps the probes in each microarray platform into Entrez gene ids. | 9,383KB |
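
The `.RData` files above can be inspected before use. The following is a minimal sketch, assuming a file has been downloaded to the working directory. The helper `load_rdata` is a hypothetical name introduced here for illustration and is not part of the database. It loads a file into a separate environment, so the restored objects do not overwrite anything in the current workspace, and returns them as a named list.

```r
# Hypothetical helper (not part of the database): load an .RData file
# into its own environment and return its objects as a named list.
load_rdata <- function(path) {
  env <- new.env()
  # load() returns the names of the objects it restored
  loaded <- load(path, envir = env)
  mget(loaded, envir = env)
}

# Assumption: the file sits in the current working directory.
path <- "classification_performance_scores_summary.RData"
if (file.exists(path)) {
  objs <- load_rdata(path)
  print(names(objs))                      # names of the stored objects
  str(objs, max.level = 2, list.len = 5)  # shallow overview of their structure
}
```

The same call should work for any of the other `.RData` files in the table. For the larger files, keep in mind that the whole object is read into memory.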