What is Structured Processing
Purpose
This page explains the basic concepts and processing flow of RDE structured processing: why structured processing is necessary and how it works.
Challenges and Background
Research data management faced the following challenges:
- Data Format Diversity: Different data formats across experimental instruments and research fields
- Inconsistent Metadata: Varying metadata description methods among researchers
- Reproducibility Assurance: Need for standardization and recording of data processing procedures
- Quality Control: Requirement for data consistency and validity verification
RDE structured processing was designed to address these challenges.
Key Concepts
Three Phases
RDE structured processing consists of the following three phases:
flowchart LR
A[Initialization Phase] --> B[Custom Processing Phase]
B --> C[Termination Phase]
A1[Directory Creation<br/>File Validation<br/>Configuration Loading] --> A
B1[User-defined Processing<br/>Data Transformation<br/>Metadata Generation] --> B
C1[Output Validation<br/>Log Finalization<br/>Cleanup] --> C
1. Initialization Phase
Preparation stage automatically executed by the system:
- Directory Structure Creation: Automatic generation of standard output directories
- Input File Validation: Format and consistency checking
- Configuration Loading: Application of processing modes and configuration parameters
- Log System Initialization: Start recording processing history
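The four initialization steps above can be sketched in plain Python. This is a minimal illustration, not RDEToolKit's implementation: the function `initialize`, the directory names in `OUTPUT_DIRS`, the `inputdata` folder, the JSON configuration format, and the default `mode` key are all assumptions made for the example.

```python
import json
import logging
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical directory names, chosen only for this illustration.
OUTPUT_DIRS = ("structured", "meta", "logs", "temp")


def initialize(root: Path, config_path: Optional[Path] = None) -> Tuple[dict, logging.Logger]:
    """Validate inputs, create output directories, load config, start logging."""
    # Input file validation: require at least one file in the input directory.
    input_dir = root / "inputdata"
    if not input_dir.is_dir() or not any(p.is_file() for p in input_dir.iterdir()):
        raise FileNotFoundError(f"no input files found in {input_dir}")

    # Directory structure creation: safe to call repeatedly.
    for name in OUTPUT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)

    # Configuration loading: start from defaults, then apply overrides.
    config = {"mode": "Invoice"}
    if config_path is not None and config_path.is_file():
        config.update(json.loads(config_path.read_text(encoding="utf-8")))

    # Log system initialization: record the processing history in a file.
    logger = logging.getLogger("structured_processing_sketch")
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.FileHandler(root / "logs" / "run.log", encoding="utf-8"))
    logger.info("initialized with mode=%s", config["mode"])
    return config, logger
```

Validating the inputs before creating any directories means a failed run leaves nothing behind to clean up.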
2. Custom Processing Phase
Core processing stage defined by researchers:
- Data Transformation: Processing raw data according to research objectives
- Quality Control: Data validity verification and anomaly detection
- Metadata Generation: Recording information about processing results
- Visualization: Generation of graphs and charts
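As an illustration of what a researcher-defined step might look like, the sketch below combines quality control, data transformation, and metadata generation in one function. The function name `process`, the millivolt readings, and the `max_abs_mv` threshold are hypothetical. Visualization is left out to keep the example free of third-party dependencies.

```python
import statistics


def process(readings_mv: list, max_abs_mv: float = 1000.0) -> dict:
    """Hypothetical custom step: filter readings, convert mV to V, summarize."""
    # Quality control: drop readings outside the plausible range.
    valid = [r for r in readings_mv if abs(r) <= max_abs_mv]
    rejected = len(readings_mv) - len(valid)
    if not valid:
        raise ValueError("no valid readings after quality control")

    # Data transformation: convert millivolts to volts.
    volts = [r / 1000.0 for r in valid]

    # Metadata generation: record facts about the processing result.
    metadata = {
        "count": len(volts),
        "rejected": rejected,
        "mean_v": statistics.fmean(volts),
        "max_v": max(volts),
    }
    return {"data": volts, "metadata": metadata}
```

Recording the number of rejected readings in the metadata keeps the quality-control decision visible to anyone who later reuses the dataset.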
3. Termination Phase
Completion stage automatically executed by the system:
- Output Validation: Consistency verification of generated files
- Log Finalization: Recording processing results and statistical information
- Temporary File Cleanup: Deletion of unnecessary files
- Execution Result Reporting: Success/failure determination and reporting
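The separation between system-run phases and the user-defined phase can be sketched as a small runner that wraps a callback. This is a simplified model under stated assumptions, not the RDEToolKit API: `run_structured`, the three-argument callback signature, and the report dictionary are invented for the example.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_structured(
    custom: Callable[[Path, Path, Path], None],
    input_dir: Path,
    output_dir: Path,
) -> dict:
    """Run initialization, the user-supplied step, and termination in order."""
    # Initialization phase: check inputs and prepare working directories.
    if not input_dir.is_dir():
        raise FileNotFoundError(input_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    temp_dir = Path(tempfile.mkdtemp(prefix="structured_"))

    report = {"success": False, "outputs": [], "error": None}
    try:
        # Custom processing phase: the only part written by the researcher.
        custom(input_dir, output_dir, temp_dir)

        # Termination phase, output validation: at least one file must exist.
        outputs = sorted(p.name for p in output_dir.iterdir() if p.is_file())
        if not outputs:
            raise RuntimeError("custom processing produced no output files")
        report.update(success=True, outputs=outputs)
    except Exception as exc:
        # Execution result reporting: failures are reported, not lost.
        report["error"] = f"{type(exc).__name__}: {exc}"
    finally:
        # Temporary file cleanup runs on success and on failure.
        shutil.rmtree(temp_dir, ignore_errors=True)
    return report
```

Putting cleanup in `finally` means a failing custom step still leaves the workspace tidy, which is the main reason for keeping termination outside user code.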
Standard Directory Structure
Structured processing reads from and writes to a standard directory structure, which the initialization phase creates automatically.
Data Flow
flowchart TD
A[Raw Data] --> B[Input Validation]
B --> C[Custom Processing]
C --> D[Quality Control]
D --> E[Metadata Generation]
E --> F[Output Validation]
F --> G[RDE Dataset]
H[Configuration Files] --> C
I[Schema Definition] --> B
I --> F
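In the flow above, the schema definition feeds both input validation and output validation. The sketch below shows that idea with a deliberately tiny schema format (field name mapped to Python type). The schemas, field names, and the functions `validate` and `to_dataset` are assumptions for illustration; real RDE schemas are richer than this.

```python
# Hypothetical schemas: field name -> required Python type.
INPUT_SCHEMA = {"sample_id": str, "temperature_c": float}
OUTPUT_SCHEMA = {"sample_id": str, "temperature_k": float}


def validate(record: dict, schema: dict) -> list:
    """Return a list of human-readable schema violations (empty if valid)."""
    errors = []
    for key, expected in schema.items():
        if key not in record:
            errors.append(f"missing field: {key}")
        elif not isinstance(record[key], expected):
            actual = type(record[key]).__name__
            errors.append(f"{key}: expected {expected.__name__}, got {actual}")
    return errors


def to_dataset(raw: dict) -> dict:
    """Validate the input, transform it, then validate the output."""
    errors = validate(raw, INPUT_SCHEMA)
    if errors:
        raise ValueError(f"input validation failed: {errors}")

    # Custom processing: convert Celsius to Kelvin.
    result = {
        "sample_id": raw["sample_id"],
        "temperature_k": raw["temperature_c"] + 273.15,
    }

    errors = validate(result, OUTPUT_SCHEMA)
    if errors:
        raise ValueError(f"output validation failed: {errors}")
    return result
```

Checking the output against its own schema catches mistakes introduced by the custom step itself, not only bad input.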
Processing Modes
RDEToolKit provides four processing modes, chosen according to data type and registration method:
| Mode | Purpose | Features |
|---|---|---|
| Invoice | Standard single dataset registration | Most basic mode |
| ExcelInvoice | Batch registration of multiple datasets | Efficient registration via Excel files |
| MultiDataTile | Integrated management of related data | Managing multiple data tiles as one dataset |
| RDEFormat | Migration/mock creation of existing data | Registration of pre-structured data |
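When writing scripts around these modes, it can help to treat the mode names as a closed set rather than free-form strings. The sketch below is one way to do that. The `ProcessingMode` enum and `parse_mode` helper are hypothetical and are not part of RDEToolKit; only the four mode names come from the table above.

```python
from enum import Enum


class ProcessingMode(Enum):
    """The four mode names from the table, as a closed set (illustrative only)."""

    INVOICE = "Invoice"
    EXCEL_INVOICE = "ExcelInvoice"
    MULTI_DATA_TILE = "MultiDataTile"
    RDE_FORMAT = "RDEFormat"


def parse_mode(name: str) -> ProcessingMode:
    """Map a user-supplied string to a mode, ignoring case and outer whitespace."""
    wanted = name.strip().lower()
    for mode in ProcessingMode:
        if mode.value.lower() == wanted:
            return mode
    raise ValueError(f"unknown processing mode: {name!r}")
```

Rejecting unknown names early gives a clear error at configuration time instead of a confusing failure halfway through processing.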
Benefits and Effects
Benefits for Researchers
- Standardization: Unified data formats and metadata
- Reproducibility: Recording and re-execution of processing procedures
- Efficiency: Time savings through automation
- Quality Improvement: Reliability assurance through validation features
Benefits for Organizations
- Data Management: Unified data management system
- Searchability: Advanced search through structured metadata
- Sharing: Efficient sharing of standardized data
- Long-term Preservation: Sustainable data storage formats
Summary
RDE structured processing is a comprehensive framework for ensuring the quality and reproducibility of research data. Key features include:
- Three-stage Processing Flow: Clear separation of initialization, custom processing, and termination
- Standardized Directory Structure: Consistent data organization
- Flexible Processing Modes: Support for various research needs
- Automation and Validation: Fewer human errors and built-in quality assurance
Next Steps
Once you understand the concepts of structured processing, continue with the following practical guides:
- Try structured processing hands-on in Quick Start
- Learn the details of each mode in Processing Modes
- Check how to customize behavior in Configuration Files
- Learn how to troubleshoot failures in Error Handling