Overview

TCGA publishes several types of data (clinical, exon expression, methylation, miRNA, etc.) for each tumor. We therefore distribute the data for a single tumor across three SPARQL endpoints: one for exon expression, one for methylation, and one for all other data types. The RDFized tumor data listed below is hosted on these endpoints and is available under the original TCGA Data Use Certification Agreement.

| Tumor Type | Endpoint A | Endpoint B | Endpoint C | Graph Name |
|---|---|---|---|---|
| Adrenocortical carcinoma (ACC) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/acc |
| Bladder cancer (BLCA) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/blca |
| Breast cancer (BRCA) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/brca |
| Lymphoid Neoplasm Diffuse Large B-cell (DLBC) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/dlbc |
| Esophageal carcinoma (ESCA) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/esca |
| Cervical (CESC) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/cesc |
| Head and neck squamous cell (HNSC) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/hnsc |
| Papillary Kidney (KIRP) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/kirp |
| Acute Myeloid Leukemia (LAML) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/laml |
| Lower Grade Glioma (LGG) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/lgg |
| Lung squamous carcinoma (LUSC) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/lusc |
| Prostate adenocarcinoma (PRAD) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/prad |
| Rectal adenocarcinoma (READ) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/read |
| Cutaneous melanoma (SKCM) | Exon Expressions | Methylation | All other types | http://tcga.deri.ie/graph/skcm |

The distribution of tumors across the SPARQL endpoints can also be visualized here.
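
As an illustration of how the graph names in the table can be used, the following Python sketch builds a SPARQL query scoped to one tumor graph and encodes it as a GET request following the SPARQL 1.1 Protocol. The endpoint URL `http://example.org/sparql` and the helper names are placeholders assumed for this example; the actual endpoint addresses are not stated on this page. The code only constructs the request and does not contact a server.

```python
from urllib.parse import urlencode

# Graph-name prefix taken from the "Graph Name" column of the table.
GRAPH_BASE = "http://tcga.deri.ie/graph/"

# Hypothetical endpoint URL (assumption): replace with a real endpoint address.
ENDPOINT_URL = "http://example.org/sparql"


def graph_uri(tumor_code: str) -> str:
    """Return the named-graph URI for a tumor code such as 'ACC'."""
    return GRAPH_BASE + tumor_code.strip().lower()


def build_sample_query(tumor_code: str, limit: int = 10) -> str:
    """Build a SPARQL query that lists a few triples from one tumor graph."""
    return (
        "SELECT ?s ?p ?o\n"
        f"WHERE {{ GRAPH <{graph_uri(tumor_code)}> {{ ?s ?p ?o }} }}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query in the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = build_sample_query("ACC")
    print(query)
    print(build_request_url(ENDPOINT_URL, query))
```

Because each tumor is split across three endpoints, the same graph name would need to be queried on the endpoint that holds the data type of interest.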

Schema

Figure: Linked TCGA Schema

For the clinical data (e.g., drug, follow-up, radiation), we include only the most important properties, because the complete list contains around 300 properties.

Dataspace

We publish data for 27 tumors. Converted into RDF, this data requires more than 445 GB of disk space and yields a dataspace of about 20.4 billion triples. RDF dumps of each converted tumor are available here, so that any particular tumor can be downloaded and loaded into a local triple store. Step-by-step instructions for loading an RDF dump into a triple store can be found here. The following table gives an overview of the statistics for the TCGA tumors:

| S.No. | Tumor Type | Original Size (GB) | Refined Size (GB) | RDFized Size (GB) | Triples (Million) |
|---|---|---|---|---|---|
| 1 | Lymphoid Neoplasm Diffuse Large B-cell (DLBC) | 0.37 | 0.20 | 0.83 | 35 |
| 2 | Uterine Carcinosarcoma (UCS) | 1.2 | 0.64 | 2.6 | 113 |
| 3 | Glioblastoma multiforme (GBM) | 2.3 | 0.77 | 2.8 | 132 |
| 4 | Esophageal carcinoma (ESCA) | 1.5 | 0.88 | 3.4 | 149 |
| 5 | Adrenocortical carcinoma (ACC) | 1.6 | 0.90 | 3.6 | 158 |
| 6 | Pancreatic adenocarcinoma (PAAD) | 2.6 | 1.1 | 4.5 | 200 |
| 7 | Kidney Chromophobe (KICH) | 3.7 | 1.4 | 5.3 | 242 |
| 8 | Sarcoma (SARC) | 3.8 | 1.5 | 5.9 | 267 |
| 9 | Cervical (CESC) | 8.75 | 2.44 | 8.86 | 400.19 |
| 10 | Ovarian serous cystadenocarcinoma (OV) | 8.2 | 2.4 | 8.7 | 410 |
| 11 | Rectal adenocarcinoma (READ) | 8.07 | 2.25 | 9.04 | 413.31 |
| 12 | Papillary Kidney (KIRP) | 10.40 | 2.90 | 10.4 | 469.65 |
| 13 | Stomach adenocarcinoma (STAD) | 5.5 | 2.9 | 12 | 529 |
| 14 | Liver hepatocellular carcinoma (LIHC) | 8.2 | 3.1 | 12 | 550 |
| 15 | Bladder cancer (BLCA) | 12.16 | 3.39 | 12.3 | 556.38 |
| 16 | Acute Myeloid Leukemia (LAML) | 14.85 | 4.14 | 15.1 | 684.05 |
| 17 | Lower Grade Glioma (LGG) | 17.08 | 4.76 | 17.1 | 778.82 |
| 18 | Prostate adenocarcinoma (PRAD) | 18.05 | 5.03 | 18.1 | 821.01 |
| 19 | Lung squamous carcinoma (LUSC) | 20.63 | 5.75 | 20.5 | 927.08 |
| 20 | Cutaneous melanoma (SKCM) | 23.22 | 6.47 | 23.2 | 1050.94 |
| 21 | Uterine Corpus Endometrial Carcinoma (UCEC) | 13 | 5.98 | 24.2 | 1070 |
| 22 | Colon adenocarcinoma (COAD) | 18 | 6.64 | 26 | 1175 |
| 23 | Head and neck squamous cell (HNSC) | 27.6 | 7.69 | 27.5 | 1245.37 |
| 24 | Lung adenocarcinoma (LUAD) | 23 | 9.1 | 36 | 1611 |
| 25 | Kidney renal clear cell carcinoma (KIRC) | 24 | 9.4 | 37 | 1658 |
| 26 | Thyroid carcinoma (THCA) | 26 | 10.1 | 40 | 1796 |
| 27 | Breast invasive carcinoma (BRCA) | 45 | 17 | 65 | 2959 |
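
The totals quoted at the start of this section can be cross-checked against the table. The sketch below sums the RDFized size and triple count columns, using values transcribed from the table above; the variable and function names are illustrative only.

```python
# (tumor code, RDFized size in GB, triples in millions), transcribed from the table.
ROWS = [
    ("DLBC", 0.83, 35), ("UCS", 2.6, 113), ("GBM", 2.8, 132),
    ("ESCA", 3.4, 149), ("ACC", 3.6, 158), ("PAAD", 4.5, 200),
    ("KICH", 5.3, 242), ("SARC", 5.9, 267), ("CESC", 8.86, 400.19),
    ("OV", 8.7, 410), ("READ", 9.04, 413.31), ("KIRP", 10.4, 469.65),
    ("STAD", 12, 529), ("LIHC", 12, 550), ("BLCA", 12.3, 556.38),
    ("LAML", 15.1, 684.05), ("LGG", 17.1, 778.82), ("PRAD", 18.1, 821.01),
    ("LUSC", 20.5, 927.08), ("SKCM", 23.2, 1050.94), ("UCEC", 24.2, 1070),
    ("COAD", 26, 1175), ("HNSC", 27.5, 1245.37), ("LUAD", 36, 1611),
    ("KIRC", 37, 1658), ("THCA", 40, 1796), ("BRCA", 65, 2959),
]


def totals(rows):
    """Return (total RDFized size in GB, total triples in millions)."""
    total_gb = sum(size_gb for _, size_gb, _ in rows)
    total_triples_million = sum(triples for _, _, triples in rows)
    return total_gb, total_triples_million


if __name__ == "__main__":
    total_gb, total_triples_million = totals(ROWS)
    print(f"Tumors: {len(ROWS)}")
    print(f"RDFized size: {total_gb:.2f} GB")
    print(f"Triples: {total_triples_million / 1000:.1f} billion")
```

With the figures as listed, the triple counts add up to about 20.4 billion, in line with the total quoted above.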

Achievements

Support & Development

The RDF datasets adhere to TCGA’s copyright policy.

Who is behind this

The Linked TCGA project is developed and maintained by a team of researchers from renowned research labs:

Reporting bugs and ideas

If you find a bug or have a suggestion for an enhancement, please use the issue tracker on Google Code.