To avoid data duplication and save disk space, we provide access to a selection of public datasets frequently used in AI/ML research. The datasets are stored under COMMON_DATASETS=/proj/common-datasets. The Access column in the table below shows whether a dataset is ready to use or must be requested first.
Users are encouraged to contact us to request corrections, updates, or the addition of new datasets.
| Resource | Version | Directory | Access | License / Terms |
|---|---|---|---|---|
| **Protein structure and bioinformatics** | | | | |
| *AlphaFold* | | | | |
| BFD | Not versioned | AlphaFold | Ready-to-use | CC BY 4.0 |
| MGnify | 2022_05, 2018_12 | AlphaFold | Ready-to-use | CC0 |
| PDB70 | from_mmcif_200401 | AlphaFold | Ready-to-use | CC BY 4.0 |
| PDB | Not versioned | AlphaFold | Ready-to-use | CC0 |
| PDB seqres | Not versioned | AlphaFold | Ready-to-use | CC0 |
| UniRef30 | 2021_03 | AlphaFold | Ready-to-use | CC BY-SA 4.0 |
| UniProt | 2022_05 | AlphaFold | Ready-to-use | CC BY 4.0 |
| UniRef90 | 2022_05 | AlphaFold | Ready-to-use | CC BY 4.0 |
| Parameters | 2022-12-06 | AlphaFold | Ready-to-use | Apache 2.0 |
| *AlphaFold 3* | | | | |
| BFD small | Not versioned | AlphaFold3 | Ready-to-use | CC BY 4.0 |
| MGnify | 2022_05 | AlphaFold3 | Ready-to-use | CC0 |
| PDB | 2022_09_28 | AlphaFold3 | Ready-to-use | CC0 |
| PDB seqres | 2022_09_28 | AlphaFold3 | Ready-to-use | CC0 |
| UniProt | 2021_04 | AlphaFold3 | Ready-to-use | CC BY 4.0 |
| UniRef90 | 2022_05 | AlphaFold3 | Ready-to-use | CC BY 4.0 |
| NT | 2023_02_23 | AlphaFold3 | Ready-to-use | Not specified |
| RFam | 14_9 | AlphaFold3 | Ready-to-use | CC |
| RNACentral | Not versioned | AlphaFold3 | Ready-to-use | CC0 |
| Model parameters | Not hosted | Not provided on Berzelius | Obtain separately | Terms |
| *Foldseek* | | | | |
| afdb | 2025-10-08 | Foldseek | Ready-to-use | CC BY 4.0 |
| afdb50 | 2025-10-07 | Foldseek | Ready-to-use | CC BY 4.0 |
| bfvd | 2025-09-12 | Foldseek | Ready-to-use | CC BY 4.0 |
| *OpenFold* | | | | |
| Trained parameters | Not versioned | OpenFold | Ready-to-use | Apache 2.0 |
| SoloSeq trained parameters | Not versioned | OpenFold | Ready-to-use | Apache 2.0 |
| ColabFold's environmental database | 202108 | OpenFold | Ready-to-use | MIT |
| Alignments | Not versioned | OpenFold | Ready-to-use | CC0 |
| Alignment DBs | Not versioned | OpenFold | Ready-to-use | CC0 |
| Data caches | Not versioned | OpenFold | Ready-to-use | Apache 2.0 |
| **Computer vision and image classification** | | | | |
| CIFAR-10/100 | Not versioned | CIFAR | Ready-to-use | Not specified |
| COCO | Not versioned | COCO | Ready-to-use | CC BY 4.0 |
| DomainNet | Not versioned | DomainNet | Ready-to-use | Terms |
| Fashion-MNIST | Not versioned | Fashion-MNIST | Ready-to-use | MIT |
| ImageNet | Not versioned | ImageNet | Request via SUPR | Terms |
| Imagenette | Not versioned | Imagenette | Request via SUPR | Terms |
| MNIST | Not versioned | MNIST | Ready-to-use | CC BY-SA 3.0 |
| Places365 | Not versioned | Places | Request via support | Terms |
| **Autonomous driving and robotics** | | | | |
| *Other autonomous driving datasets* | | | | |
| Argoverse | v1.1, v2.0 | Argoverse | Request via SUPR | Terms |
| KITTI | Not versioned | KITTI | Ready-to-use | Not specified |
| KITTI-360 | Not versioned | KITTI-360 | Ready-to-use | Not specified |
| MAN-TruckScenes | v1.0 | MAN-TruckScenes | Ready-to-use | CC BY-NC-SA 4.0 |
| nuImages | v1.0 | nuImages | Request via SUPR | Terms |
| nuPlan | v1.1 | nuPlan | Request via SUPR | Terms |
| Zenseact Open Dataset | Not versioned | Zenseact-Open-Dataset | Request via SUPR | Terms |
| *nuScenes* | | | | |
| Panoptic | v1.0 | nuScenes | Request via SUPR | Terms |
| Lidarseg | v1.0 | nuScenes | Request via SUPR | Terms |
| CAN bus expansion | v1.0 | nuScenes | Request via SUPR | Terms |
| Map expansion | v1.3 | nuScenes | Request via SUPR | Terms |
| Full dataset | v1.0 | nuScenes | Request via SUPR | Terms |
| *Waymo Open Dataset* | | | | |
| Motion Dataset | 1.2.1, 1.3.0 | Waymo | Request via support | Terms |
| Perception Dataset | 1.4.3, 2.0.1 | Waymo | Request via support | Terms |
| **Marine imaging and plankton datasets** | | | | |
| SMHI IFCB Plankton | version 2 | SMHI-IFCB-Plankton | Ready-to-use | CC BY 4.0 |
| SYKE-plankton_IFCB_2022 | 20220201 | SYKE-plankton_IFCB_2022 | Ready-to-use | CC BY 4.0 |
| SYKE-plankton_IFCB_Utö_2021 | 20220428 | SYKE-plankton_IFCB_Utö_2021 | Ready-to-use | CC BY 4.0 |
| WHOI-Plankton | Not versioned | WHOI-Plankton | Ready-to-use | MIT |
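
As a rough illustration of how the Directory column might be used from a job script, the following Python sketch builds the path to a dataset directory under `/proj/common-datasets`. It rests on two assumptions not confirmed by this page: that each Directory entry is a subdirectory directly under that root, and that `COMMON_DATASETS` may be set as an environment variable (the sketch falls back to the documented path if it is not). The helper name `dataset_dir` is hypothetical.

```python
import os
from pathlib import Path
from typing import Optional

# Root path as stated in the introduction of this page.
DEFAULT_ROOT = "/proj/common-datasets"


def dataset_dir(directory: str, root: Optional[str] = None) -> Path:
    """Return the assumed path of a dataset directory under the common root.

    `directory` is a value from the Directory column, e.g. "MNIST".
    Reading COMMON_DATASETS from the environment is an assumption;
    if it is unset, the documented default root is used.
    """
    base = Path(root or os.environ.get("COMMON_DATASETS", DEFAULT_ROOT))
    return base / directory


if __name__ == "__main__":
    path = dataset_dir("MNIST")
    # Datasets marked "Request via SUPR" or "Request via support" may not
    # be readable until access has been granted.
    status = "found" if path.is_dir() else "not found or not accessible"
    print(f"{path}: {status}")
```

The internal layout of each dataset directory is not described here, so inspect the directory before hard-coding paths to individual files.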