Tutorial: Large-Scale Scientific Data Analysis with the National Science Data Fabric (NSDF)

Tutorial Goals

In this interactive half-day tutorial, participants explore advanced applications of the National Science Data Fabric (NSDF) services and strategies for end-to-end scientific data analysis.

The tutorial targets a broad audience of researchers, students, developers, and scientists who manage and analyze large datasets, with a particular focus on datasets exceeding 100 TB.

Attendees gain hands-on experience constructing modular workflows, leveraging public and private data storage and streaming solutions, and deploying sophisticated visualization and analysis dashboards for scientific discovery.

The tutorial highlights NSDF's role in supporting the VIS conference themes by providing scalable solutions for advances in visualization and visual analytics. It covers topics ranging from an overview of NSDF capabilities and common pain points in large-scale data analysis to hands-on exercises using NSDF services for Earth science datasets.

Advanced modules include handling and visualizing massive datasets in domains requiring high-resolution data management. Participants leave the tutorial with a deeper understanding of how NSDF services integrate into their research workflows to enhance data accessibility, sharing, and collaborative scientific discovery.

This tutorial advances knowledge in data-intensive computing and empowers attendees to harness the full potential of NSDF in their research domains.

Tutorial Modules

The tutorial is organized into four progressive modules that guide participants from environment setup to large-scale scientific data analysis using the National Science Data Fabric (NSDF).

Each module builds on the previous one and introduces increasingly advanced capabilities for data-intensive scientific workflows.

| Module | Duration | Objective |
|--------|----------|-----------|
| I | 30 mins | Overview of the National Science Data Fabric (NSDF) and discussion of common challenges in large-scale scientific data analysis identified through user interviews. |
| II | 1 hour | Hands-on introduction to NSDF services, including visualization and dashboard creation using Earth science datasets. |
| III | 1 hour | Advanced NSDF capabilities for managing and analyzing datasets exceeding 100 TB, including scalable data access and processing workflows. |
| IV | 30 mins | Interactive Q&A session and discussion on how NSDF can support research across multiple scientific domains. |

Quick Start

Participants can run the tutorial using three supported environments:

  • GitHub Codespaces (recommended for quick access)
  • Docker containers
  • ACCESS Jetstream2 cloud resources

Detailed setup instructions are provided in Module I.
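
When switching between the environments listed above, it can help to confirm which one a notebook or script is actually running in. The following is a minimal sketch and not part of the tutorial materials: `detect_environment` is a hypothetical helper, and it relies on two assumptions, namely that GitHub Codespaces sets the `CODESPACES` environment variable to `true` and that Docker creates a `/.dockerenv` file at the container root. Jetstream2 instances have no marker checked here and are reported as `other`.

```python
import os
from pathlib import Path


def detect_environment(environ=None, dockerenv_path="/.dockerenv"):
    """Guess which environment the current process runs in.

    Hypothetical helper (not provided by NSDF). Returns "codespaces",
    "docker", or "other". The checks are heuristics only.
    """
    environ = os.environ if environ is None else environ

    # Assumption: GitHub Codespaces sets CODESPACES=true in every codespace.
    # This is checked first because a codespace is itself container-based.
    if environ.get("CODESPACES", "").lower() == "true":
        return "codespaces"

    # Assumption: Docker creates /.dockerenv at the container root.
    if Path(dockerenv_path).exists():
        return "docker"

    # Jetstream2 instances and local machines are not distinguished here.
    return "other"


if __name__ == "__main__":
    print(f"Detected environment: {detect_environment()}")
```

The `environ` and `dockerenv_path` parameters exist only so the checks can be exercised without changing the real environment; calling `detect_environment()` with no arguments inspects the current process.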

Acknowledgment

This material is based upon work supported by the National Science Foundation (NSF) under Grant No. 2138811.
