Details
The Internal Pipeline & Data Release Tools subsystem is a critical part of the Neuroinformatics Software Development Kit (SDK), embodying the Pipeline/Workflow, Data-Centric, and Data Standards Integration architectural patterns. It automates the journey of raw experimental data through processing, quality control, and final preparation for public dissemination.
Pipeline Orchestration & Execution
This component comprises independent Python scripts that serve as distinct stages or workflows within the data processing pipeline. These modules are designed for standalone execution, often within a LIMS environment, and encapsulate specific computational tasks, from data transformation and analysis to visualization and metadata generation. Together they orchestrate the flow of data through the pipeline's processing steps, as in the sketch below.
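As a rough illustration of this stage pattern, the following sketch shows a standalone script driven by input and output JSON paths, which is how a scheduler such as LIMS would typically invoke it. The run_stage function and its field names are hypothetical, not part of the SDK.

```python
import argparse
import json


def run_stage(input_data):
    # Hypothetical transformation step: summarize the input traces.
    # A real stage would perform analysis, visualization, or metadata generation.
    traces = input_data.get("traces", [])
    return {"num_traces": len(traces)}


def main():
    # Each stage is a standalone script driven by input/output JSON paths,
    # so it can be run independently or chained by a job scheduler.
    parser = argparse.ArgumentParser()
    parser.add_argument("input_json")
    parser.add_argument("output_json")
    args = parser.parse_args()

    with open(args.input_json) as f:
        input_data = json.load(f)

    output_data = run_stage(input_data)

    with open(args.output_json, "w") as f:
        json.dump(output_data, f, indent=2)


if __name__ == "__main__":
    main()
```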
LIMS Data Access & Integration
This layer provides the foundational utilities for interacting with the LIMS (Laboratory Information Management System) database. PipelineModule serves as a base class for pipeline stages, managing input/output data serialization and deserialization and integrating with LIMS for file path resolution and metadata updates. lims_utilities offers direct query capabilities against LIMS.
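A minimal sketch of a stage built on this layer is shown below. The import paths, the input_data/write_output_data methods, the query helper, and the table and column names are assumptions made for illustration rather than confirmed SDK signatures.

```python
# Assumed import locations for the LIMS access utilities described above.
from allensdk.internal.core.lims_pipeline_module import PipelineModule
from allensdk.internal.core import lims_utilities


def main():
    module = PipelineModule()          # parses input/output JSON arguments
    input_data = module.input_data()   # deserialize the stage's input

    # Resolve session storage information directly from LIMS
    # (hypothetical table and columns).
    session_id = input_data["session_id"]
    rows = lims_utilities.query(
        "SELECT id, storage_directory FROM ecephys_sessions "
        "WHERE id = {}".format(session_id)
    )

    output = {"storage_directory": rows[0]["storage_directory"]}
    module.write_output_data(output)   # serialize results for the next stage


if __name__ == "__main__":
    main()
```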
Metadata Generation & Serialization
This component is dedicated to generating and writing comprehensive metadata tables for behavioral and optical physiology (ophys) project data. It aggregates information from various sources (e.g., LIMS) and organizes it into standardized tables, which are then written to disk. It also includes schemas for validating the metadata structure.
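The sketch below shows the general shape of this aggregation and validation step, assuming pandas tables and a simple column schema; the column names, schema form, and output file are placeholders, not the SDK's actual tables.

```python
import pandas as pd

# Hypothetical column schema for a sessions metadata table; the real
# schemas live alongside the metadata writers and are more detailed.
SESSIONS_SCHEMA = {"session_id": int, "genotype": str, "session_type": str}


def build_sessions_table(lims_rows):
    """Aggregate raw LIMS rows into a standardized sessions table."""
    table = pd.DataFrame(lims_rows)

    # Validate that every schema column is present before writing to disk.
    missing = set(SESSIONS_SCHEMA) - set(table.columns)
    if missing:
        raise ValueError("missing metadata columns: {}".format(sorted(missing)))

    return table[list(SESSIONS_SCHEMA)].astype(SESSIONS_SCHEMA)


if __name__ == "__main__":
    rows = [
        {"session_id": 1, "genotype": "wt", "session_type": "ophys"},
        {"session_id": 2, "genotype": "wt", "session_type": "behavior"},
    ]
    build_sessions_table(rows).to_csv("sessions.csv", index=False)
```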
NWB Data Standardization
This module converts processed experimental data into the Neurodata Without Borders (NWB) format. NWB is a widely adopted standardized format for neurophysiology data, making this component crucial for data sharing, interoperability, and reproducibility in neuroinformatics. It builds on a shared NWBWriter base class to produce consistent output.
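As a hedged sketch of what such a conversion ultimately produces, the example below writes a minimal NWB file with pynwb directly; the session identifier, timestamps, and data are placeholders, and the SDK's NWBWriter wraps this kind of logic rather than being shown here.

```python
from datetime import datetime, timezone

from pynwb import NWBFile, NWBHDF5IO, TimeSeries

# Minimal NWB file with a single acquisition series (placeholder data).
nwbfile = NWBFile(
    session_description="example ophys session",
    identifier="session-0001",
    session_start_time=datetime(2020, 1, 1, tzinfo=timezone.utc),
)

nwbfile.add_acquisition(
    TimeSeries(
        name="dff",
        data=[0.0, 0.1, 0.2],
        unit="a.u.",
        rate=30.0,
        starting_time=0.0,
    )
)

with NWBHDF5IO("session_0001.nwb", "w") as io:
    io.write(nwbfile)
```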
Public Data Release & Copying
This component provides essential utilities for preparing and copying processed data and associated files for public release. It ensures that data is correctly moved, organized, and made accessible for external consumption, including checking that expected files exist and managing session uploads. A sketch of the copy pattern follows.
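The helper below is a hypothetical illustration of that release-copy pattern using only the standard library: it verifies sources exist, skips files already present at the destination, and preserves the relative layout under the release root. The function name and arguments are assumptions, not the SDK's API.

```python
import shutil
from pathlib import Path


def copy_for_release(src_root, dest_root, relative_paths):
    """Copy release files from src_root to dest_root, preserving layout."""
    src_root, dest_root = Path(src_root), Path(dest_root)
    copied = []
    for rel in relative_paths:
        src = src_root / rel
        dest = dest_root / rel
        if not src.exists():
            raise FileNotFoundError(f"missing source file: {src}")
        if dest.exists():
            continue  # already released; do not overwrite
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy data and preserve metadata
        copied.append(dest)
    return copied
```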