Changelog

Version Index

Version | Release Date | Key Changes | Details
v1.6.0 | 2026-02-18 | BREAKING: Python 3.9 support dropped (minimum 3.10+) / direct pytz dependency removed / gen-invoice command added / AI agent guide embedded / Agent Skills added | v1.6.0
v1.5.3 | 2026-02-03 | SmartTable sample.ownerId auto-assignment / Missing rawfile error improvement / invoicefile.py bug fixes | v1.5.3
v1.5.2 | 2026-01-27 | CLI validate exit codes / metadata-def validation fix / Config error messages / Python 3.9 deprecation / PBT infrastructure | v1.5.2
v1.5.1 | 2026-01-21 | SmartTable row data direct access / Variable array feature support in description | v1.5.1
v1.5.0 | 2026-01-09 | Result type / Typer CLI + validate / Timestamped logs / Lazy imports / Python 3.14 | v1.5.0
v1.4.3 | 2025-12-25 | SmartTable data integrity fixes / csv2graph HTML destination + legend/log-scale tweaks | v1.4.3
v1.4.2 | 2025-12-18 | Invoice overwrite validation / Excel invoice consolidation / csv2graph auto single-series / MultiDataTile empty input | v1.4.2
v1.4.1 | 2025-11-05 | SmartTable rowfile accessor / legacy fallback warnings | v1.4.1
v1.4.0 | 2025-10-24 | SmartTable metadata.json auto-generation / LLM-friendly traceback / CSV visualization utility / gen-config | v1.4.0
v1.3.4 | 2025-08-21 | Stable SmartTable validation | v1.3.4
v1.3.3 | 2025-07-29 | Fixed ValidationError handling / Added sampleWhenRestructured schema | v1.3.3
v1.3.2 | 2025-07-22 | Strengthened SmartTable required-field validation | v1.3.2
v1.3.1 | 2025-07-14 | Excel invoice empty-sheet fix / Stricter extended_mode validation | v1.3.1
v1.2.0 | 2025-04-14 | MinIO integration / Archive generation / Report tooling | v1.2.0

Release Details

v1.6.0 (2026-02-18)

Breaking Changes

This release drops Python 3.9 support. Minimum Python version is now 3.10+.

References

Breaking Changes: Python 3.9 Support Dropped

Overview

Python 3.9 support has been removed from rdetoolkit v1.6.0. The minimum supported Python version is now 3.10.

Rationale

  • End of Life: Python 3.9 reached end-of-life on October 31, 2025
  • Security: Continuing support increases maintenance and security risks
  • Modernization: Python 3.10+ provides improved typing features, better performance, and language enhancements

Impact

  • Users on Python 3.9: pip install rdetoolkit will automatically resolve to the last version compatible with Python 3.9 (currently v1.5.2)
  • CI/CD Pipelines: GitHub Actions and tox configurations no longer test Python 3.9
  • PyPI Metadata: Python 3.9 classifier removed from package metadata

Migration Options

Option 1: Upgrade Python (Recommended)

# Install Python 3.10 or higher
# Then reinstall rdetoolkit
pip install --upgrade rdetoolkit

Supported versions: Python 3.10, 3.11, 3.12, 3.13, 3.14

Option 2: Pin to Last Compatible Version

# Stay on the last version compatible with Python 3.9 (currently v1.5.2)
pip install "rdetoolkit<1.6.0"

Warning

Pinning to the last version compatible with Python 3.9 (currently v1.5.2) means you will not receive new features, bug fixes, or security updates from v1.6.0+.

For more information on Python version support lifecycle, see: https://endoflife.date/python


Technical Changes

Package Metadata

  • Updated requires-python to >=3.10 in pyproject.toml
  • Removed Programming Language :: Python :: 3.9 classifier
  • Updated ruff configuration to target Python 3.10 (target-version = "py310")

CI/CD Pipelines

  • Removed Python 3.9 from GitHub Actions workflows (pypi-release.yml, docs-ci.yml)
  • Updated tox configuration to test only Python 3.10-3.14 environments
  • Removed all py39-* test environment sections from tox.ini

Code Cleanup

  • Removed Python 3.9 compatibility code and version branches:
      • sys.version_info checks in __init__.py, result.py, command.py
      • Version-specific dataclass parameter handling (_DATACLASS_KWARGS)
      • Conditional slots=True logic (now always enabled)
  • Optimized typing imports to use Python 3.10+ built-ins:
      • Never, ParamSpec, TypeAlias from typing module (instead of typing_extensions)
  • Simplified type annotations using native union syntax (X | Y)

Documentation

  • Updated installation documentation (English and Japanese) to reflect Python 3.10+ requirement
  • Updated usage documentation (CLI, quickstart, validation, object storage) to specify Python 3.10+
  • Updated development documentation to remove Python 3.9 references
  • Adjusted deprecation notices in README.md to reflect v1.6.0 removal

Dependencies

  • Regenerated lock files (uv.lock, requirements.lock, requirements-dev.lock) with Python 3.10+ constraints
  • Removed 2,986 lines of Python 3.9-specific package wheels and markers
  • Removed direct pytz and types-pytz dependencies from rdetoolkit; migrated all timezone handling to datetime.timezone.utc (#375)
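
As a rough illustration of the pytz-free approach described in the last item, the sketch below builds timezone-aware UTC timestamps with the standard library only. The helper name utc_now_iso is hypothetical and is not part of rdetoolkit.

```python
from datetime import datetime, timezone


def utc_now_iso() -> str:
    """Return the current UTC time as an ISO 8601 string (hypothetical helper)."""
    # datetime.timezone.utc stands in for pytz.UTC; no third-party package is needed.
    return datetime.now(timezone.utc).isoformat()


# Attaching UTC to a naive datetime also works without pytz.
naive = datetime(2026, 2, 18, 12, 0, 0)
aware = naive.replace(tzinfo=timezone.utc)
print(aware.isoformat())  # 2026-02-18T12:00:00+00:00
```

Code that still needs named zones (for example Asia/Tokyo) could use the standard zoneinfo module instead; whether rdetoolkit does so is not stated in this changelog.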

New Features

invoice.json Generation Feature (Issue #371)

Added API and CLI functionality to generate invoice.json directly from invoice.schema.json definitions.

API Usage

from rdetoolkit.invoice_generator import generate_invoice_from_schema

# Generate with all fields and defaults, write to file
invoice_data = generate_invoice_from_schema(
    schema_path="tasksupport/invoice.schema.json",
    output_path="invoice/invoice.json",
    fill_defaults=True,
    required_only=False,
)

# Generate required fields only, return dict without file
invoice_data = generate_invoice_from_schema(
    schema_path="tasksupport/invoice.schema.json",
    fill_defaults=False,
    required_only=True,
)

CLI Usage

# Basic usage - generates invoice.json in current directory
rdetoolkit gen-invoice tasksupport/invoice.schema.json

# Specify output path
rdetoolkit gen-invoice tasksupport/invoice.schema.json -o container/data/invoice/invoice.json

# Generate required fields only
rdetoolkit gen-invoice tasksupport/invoice.schema.json --required-only

# Generate with compact formatting
rdetoolkit gen-invoice tasksupport/invoice.schema.json --format compact

Default Value Priority

  1. Schema default field
  2. First item from schema examples
  3. Type-based defaults: string→"", number→0.0, integer→0, boolean→false
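
The priority order can be sketched as a small resolver. This is only an illustration of the three rules; pick_default is a hypothetical name, and returning None for types outside the documented four is an assumption, not documented behavior.

```python
from typing import Any

# Type-based fallbacks as listed in the priority rules.
_TYPE_DEFAULTS: dict[str, Any] = {
    "string": "",
    "number": 0.0,
    "integer": 0,
    "boolean": False,
}


def pick_default(field_schema: dict[str, Any]) -> Any:
    """Resolve a default for one schema property (hypothetical helper)."""
    if "default" in field_schema:  # 1. explicit schema default
        return field_schema["default"]
    examples = field_schema.get("examples")
    if examples:  # 2. first item from schema examples
        return examples[0]
    # 3. type-based default; None for unlisted types is an assumption
    type_name = field_schema.get("type")
    return _TYPE_DEFAULTS.get(type_name) if isinstance(type_name, str) else None


print(pick_default({"type": "string", "default": "N/A", "examples": ["x"]}))  # N/A
print(pick_default({"type": "integer", "examples": [5, 6]}))  # 5
print(pick_default({"type": "boolean"}))  # False
```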


Result Type Extension: unwrap_or_else Method (Issue #363)

Added unwrap_or_else method to the Result type. On failure, it calls a default value generator function that receives the error as an argument.

from rdetoolkit.result import Success, Failure

# Success: returns the value directly (default_fn is not called)
result = Success(42)
value = result.unwrap_or_else(lambda e: 0)  # 42

# Failure: calls default_fn
result = Failure(ValueError("error"))
value = result.unwrap_or_else(lambda e: -1)  # -1

AI Coding Assistant Agent Guide (Issue #380)

Added embedded guide for AI coding assistants (Claude Code, GitHub Copilot, Cursor, etc.).

Features

  • rdetoolkit.agent_guide(): Function to retrieve guide text
  • rdetoolkit agent-guide: CLI command to display the guide

Usage Examples

import rdetoolkit

# Get the agent guide
guide = rdetoolkit.agent_guide()
print(guide)

# Display guide from CLI
rdetoolkit agent-guide

Agent Skills for AI Coding Assistants

Added Agent Skills (.agents/SKILL.md) to provide contextual guidance for AI coding assistants (Claude Code, etc.) when developing RDE structured programs.

Difference from existing Agent Guide (_agent/)

Aspect | Agent Guide (_agent/) | Agent Skills (.agents/)
Distribution | Bundled in package (available after pip install) | In source repository (auto-detected during development)
Access | rdetoolkit.agent_guide() API / CLI | Auto-discovered and applied by Claude Code
Purpose | General-purpose agent guide | Contextual guidance during development sessions

Structure

.agents/
├── SKILL.md                    # Entry point (activation trigger definitions)
└── references/
    ├── preferred-apis.md       # fileops / csv2graph API details
    ├── cli-workflow.md         # CLI execution order guide
    ├── config.md               # Configuration file spec (YAML/TOML)
    └── modes.md                # 5-mode detailed reference

Key Features

  • Encoding-safe file I/O (rdetoolkit.fileops) mandatory usage guidance
  • Processing mode selection flowchart for 5 modes (Invoice / ExcelInvoice / SmartTableInvoice / MultiDataTile / RDEFormat)
  • Correct execution order guide for CLI template editing and validation
  • rdeconfig.yaml / pyproject.toml configuration file spec reference
  • Common mistakes and fixes troubleshooting table

csv2graph Module Refactoring

Significantly refactored the csv2graph module for improved modularity and testability.

  • Extracted dataclasses (MatplotlibArtifact, NormalizedColumns, RenderCollections) to models.py
  • Moved internal helper functions to appropriate modules (config.py, normalizers.py, etc.)
  • Created strategies/render_coordinator.py as rendering coordination layer
  • Reduced csv2graph.py from 662 to 512 lines (22.7% reduction)
  • Maintained 100% API backward compatibility

Dataclass Memory Efficiency Improvements

With Python 3.9 support dropped, enabled slots=True on all 13 dataclasses in rde2types.py. This reduces memory usage per instance and improves attribute access performance.


Testing

All core tests pass successfully under Python 3.10-3.14:

  • Unit tests: 1,603 passed
  • Quality checks: ruff, mypy, pytest all pass
  • Integration tests: All workflows validated


v1.5.3 (2026-02-03)

References

Highlights

  • Fixed SmartTable sample registration to automatically set sample.ownerId to basic.dataOwnerId. CSV-specified values take precedence when explicitly provided
  • Improved check_exist_rawfiles error message to report all missing raw files in alphabetical order
  • Quality improvements to invoicefile.py: removed stale TODO comment, fixed mapping_rules parameter being ignored, fixed docstring typo

SmartTable sample.ownerId Auto-Assignment (Issue #389)

Bug Fix

  • Problem: When registering samples in SmartTable mode, sample.ownerId retained the sample owner ID from the temporary sample selected in the invoice screen instead of being updated to the data registrant's ID. This caused:
      • The sample owner to be different from the data registrant for new sample registration
      • Registration errors when the sample owner was not a member of the research team

  • Solution: Added automatic sample.ownerId assignment in SmartTableInvoiceInitializer._apply_smarttable_row method

Priority Rules

Case | Behavior
SmartTable CSV specifies sample/ownerId | Use CSV value (respect user intent)
SmartTable CSV does not specify sample/ownerId | Use basic.dataOwnerId
basic.dataOwnerId is missing/empty | Log warning, preserve original value

Implementation Details

  • Added _set_sample_owner_id helper method
  • Added csv_has_sample_owner_id flag to track explicit CSV specification
  • Added new test cases (edge cases and precedence verification)

check_exist_rawfiles Error Message Improvement (Issue #398)

Bug Fix

  • Problem: The check_exist_rawfiles function used .pop() to report only one missing file. This caused:
      • When multiple raw files were missing, users had to run repeatedly to discover all missing file names
      • Error messages varied between runs due to set order dependency, making log comparison and snapshot tests unreliable

  • Solution: Changed from .pop() to sorted() + ', '.join() to report all missing files in alphabetical order

Error Message Example

# Before: Reports only one file (non-deterministic)
ERROR: raw file not found: file_b.dat

# After: Reports all files in alphabetical order
ERROR: raw file not found: file_a.dat, file_b.dat, file_c.dat

invoicefile.py Quality Improvements (Issue #399)

Bug Fixes

  • Stale TODO Comment Removal: Removed the obsolete comment "# [TODO] Correction of type definitions in version 0.1.6", a leftover from the v0.1.6 era
  • mapping_rules Parameter Bug Fix: Fixed bug in get_apply_rules_obj method where mapping_rules parameter was ignored and self.rules was always used instead
  • Docstring Typo Fix: Corrected tructuredError to StructuredError in check_exist_rawfiles function's Raises section

Migration / Compatibility

SmartTable sample.ownerId

  • Backward Compatible: Workflows not specifying sample/ownerId in CSV will automatically use basic.dataOwnerId
  • Explicit Specification: If CSV specified sample/ownerId, that value continues to take precedence
  • Link Registration: sample.ownerId is set but not used for link registration (set as a safe default)

Error Messages

  • Error String Change: The check_exist_rawfiles error message format has changed. Update any string comparison tests accordingly
  • Functional Compatibility: Behavior is backward compatible

invoicefile.py

  • Backward Compatible: Changes are internal quality improvements with no API compatibility impact

Known Issues

  • None reported at this time.

v1.5.2 (2026-01-27)

References

Highlights

  • Standardized CLI validate command exit codes (0/1/2) for CI/CD pipeline integration
  • Fixed metadata-def.json validation to use correct Pydantic model (MetadataDefinitionValidator)
  • Improved configuration error messages with detailed context (file path, line/column info, documentation links)
  • Added Python 3.9 deprecation warning with v2.0 removal timeline
  • Introduced Hypothesis Property-Based Testing (PBT) infrastructure with 75 tests across 5 modules

CLI Validate Exit Code Standardization (Issue #362, #381)

Enhancements

  • Standardized Exit Codes: Implemented consistent exit codes for CI/CD integration:
      • Exit code 0: All validations passed (success)
      • Exit code 1: Validation failures (data/schema issues)
      • Exit code 2: Usage/configuration errors (invalid arguments, missing files)
  • Bug Fix: Fixed typer.Exit being incorrectly caught by generic except Exception handler, which caused spurious "Internal error during validation:" messages
  • CLI Help Update: Added exit code documentation to all validate subcommand docstrings
  • User Documentation: Added CI/CD integration examples for GitHub Actions, GitLab CI, and shell scripts

Usage Examples

# CI/CD pipeline example
rdetoolkit validate --all ./data
status=$?  # capture immediately; each [ ... ] test below would overwrite $?

if [ "$status" -eq 0 ]; then
    echo "Validation passed"
elif [ "$status" -eq 1 ]; then
    echo "Validation failed - check output for details"
    exit 1
elif [ "$status" -eq 2 ]; then
    echo "Command error - check arguments"
    exit 2
fi
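
For pipelines driven from Python rather than a shell, the same exit-code handling might look like the sketch below. The helper names are hypothetical; the command line is copied from the example above, and the messages mirror it.

```python
import subprocess

# Exit-code meanings as documented for the validate command in this release.
EXIT_MEANINGS = {
    0: "Validation passed",
    1: "Validation failed - check output for details",
    2: "Command error - check arguments",
}


def describe_exit_code(code: int) -> str:
    """Translate an exit code into a message (hypothetical helper)."""
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")


def run_validation(data_dir: str) -> int:
    """Run the validate command and report the result (requires rdetoolkit on PATH)."""
    completed = subprocess.run(
        ["rdetoolkit", "validate", "--all", data_dir],
        check=False,  # inspect the return code ourselves instead of raising
    )
    print(describe_exit_code(completed.returncode))
    return completed.returncode
```

A CI step could then end with sys.exit(run_validation("./data")) so that the job status follows the documented codes.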

Metadata Definition Validation Fix (Issue #382)

Enhancements

  • New Pydantic Models: Added MetadataDefEntry and MetadataDefinition models for proper metadata-def.json schema validation
      • Required fields: name.ja, name.en, schema.type
      • Optional fields: unit, description, uri, mode, order, originalName
      • extra="allow" so that undefined fields (e.g., variable) are accepted instead of causing validation errors
  • New Validator: Added MetadataDefinitionValidator class for metadata definition file validation
  • CLI Fix: Updated MetadataDefCommand to use MetadataDefinitionValidator instead of the incorrect MetadataValidator

Configuration Error Message Improvements (Issue #361)

Enhancements

  • ConfigError Exception: New custom exception class with comprehensive error information:
      • File path that failed to load
      • Line/column information for parse errors (when available)
      • Field name and validation reason for schema errors
      • Documentation link for resolution guidance
  • File Not Found: Clear error messages with gen-config command guidance
  • Parse Errors: YAML/TOML syntax errors now include line/column information
  • Validation Errors: Pydantic validation errors show specific field names and valid values

Example Error Messages

# File not found
ConfigError: Configuration file not found: '/path/to/rdeconfig.yaml'.
Create a configuration file or use 'rdetoolkit gen-config' to generate one.
See: https://nims-mdpf.github.io/rdetoolkit/usage/config/config/

# Parse error with line info
ConfigError: Failed to parse '/path/to/rdeconfig.yaml': invalid YAML syntax at line 15.

# Schema validation error
ConfigError: Invalid configuration in '/path/to/rdeconfig.yaml':
'extended_mode' must be one of ['MultiDataTile', 'RDEFormat'].

Python 3.9 Deprecation Warning (Issue #360)

Enhancements

  • DeprecationWarning: Added warning when rdetoolkit is imported under Python 3.9
  • Clear Timeline: Warning message indicates support removal in v2.0
  • Session-Safe: Warning appears only once per session to avoid noise
  • Documentation: Updated README, CHANGELOG, and installation docs (English/Japanese) with deprecation notices

Warning Message

DeprecationWarning: Python 3.9 support is deprecated and will be removed in rdetoolkit v2.0.
Please upgrade to Python 3.10 or later.
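
The once-per-session behavior can be approximated with a module-level flag, as in this sketch. The function name, the flag, and the injectable version argument are hypothetical; only the warning text comes from the message above.

```python
import sys
import warnings

_warned = False  # module-level flag so the warning fires at most once per session


def warn_if_python39(version_info: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Emit the deprecation warning under Python 3.9 (hypothetical helper)."""
    global _warned
    if version_info != (3, 9) or _warned:
        return False
    _warned = True
    warnings.warn(
        "Python 3.9 support is deprecated and will be removed in rdetoolkit v2.0. "
        "Please upgrade to Python 3.10 or later.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
    return True
```

Passing the version as a parameter is a design choice made here so the behavior can be exercised on any interpreter.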

Property-Based Testing Infrastructure (Issue #372)

Enhancements

  • Hypothesis Library: Added hypothesis>=6.102.0 to dev dependencies
  • Test Directory: Created tests/property/ with shared strategies and profile configuration
  • 75 PBT Tests across 5 modules:
      • graph.normalizers (14 tests): Column normalization and validation
      • graph.textutils (20 tests): Filename sanitization, text transformations
      • graph.io.path_validator (13 tests): Path safety validation
      • rde2util.castval (15 tests): Type casting and error handling
      • validation (10 tests): Invoice validation invariants
  • CI Integration: Added HYPOTHESIS_PROFILE: ci environment variable for optimized CI execution
  • Tox Integration: Added passenv = HYPOTHESIS_PROFILE to all tox environments

Running PBT Tests

# Run all tests (example-based + property-based)
tox -e py312-module

# Run only property-based tests
pytest tests/property/ -v -m property

# Run with CI profile (faster, fewer examples)
HYPOTHESIS_PROFILE=ci pytest tests/property/ -v -m property

Workflow Documentation Alignment (Issue #370)

Enhancements

  • Docstring Update: Aligned workflows.run Note section with actual implementation
  • Mode Selection: Documented real mode precedence and allowed extended_mode values
  • New Tests: Added EP/BV tables and unit tests for _process_mode covering priority order and failure handling

Documentation Fixes (Issue #358, #359)

Fixes

  • Badge URLs: Updated README badges (release, license, issue tracking, workflow) from nims-dpfc to nims-mdpf organization
  • Typo Fix: Corrected display_messsage to display_message in README and Japanese documentation

Dependency Fix (Issue #373)

Fixes

  • pytz Dependency: Added pytz>=2024.1 to runtime dependencies to fix CI failures
      • Root cause: pandas 3.0 made pytz an optional dependency, so it is no longer installed transitively, but archive.py and tests still import pytz directly
      • Resolution: Explicit pytz dependency in pyproject.toml and regenerated lock files

Migration / Compatibility

CLI Validate Exit Codes

  • Exit code change: Internal errors now return exit code 2 (was 3) for consistency
  • CI scripts: Update any scripts checking for exit code 3 to check for exit code 2

Metadata Definition Validation

  • Backward Compatible: Existing valid metadata-def.json files will now validate correctly
  • Error messages: More accurate error messages for invalid metadata definitions

Python 3.9

  • Deprecation only: Python 3.9 continues to work but shows deprecation warning
  • Action required: Plan upgrade to Python 3.10+ before rdetoolkit v2.0

Configuration Errors

  • Backward Compatible: ConfigError is a new exception type; existing error handling continues to work
  • Enhanced debugging: More informative error messages for configuration issues

Known Issues

  • None reported at this time.

v1.5.1 (2026-01-21)

References

Highlights

  • Added smarttable_row_data property to RdeDatasetPaths for direct row data access in SmartTable mode, eliminating the need for users to manually read and parse CSV files
  • Feature-flagged items from variable array in metadata.json are now transcribed to description in array format ([A,B,C])
  • Added feature_description configuration flag to enable/disable automatic feature transcription to description

SmartTable Row Data Direct Access (Issue #207)

Enhancements

  • New Attribute: Added smarttable_row_data: dict[str, Any] | None to RdeOutputResourcePath dataclass
  • New Property: Added smarttable_row_data property to RdeDatasetPaths for user callback access
  • Processor Update: Modified SmartTableInvoiceInitializer to parse CSV and store row data in context
  • Type Stubs: Updated .pyi files for IDE autocomplete support
  • Comprehensive Tests: Added unit tests and integration tests covering new and legacy signatures

Usage Examples

Before (existing method still works):

import pandas as pd
from rdetoolkit.models.rde2types import RdeDatasetPaths  # import path assumed

def custom_dataset(paths: RdeDatasetPaths):
    csv_path = paths.smarttable_rowfile
    if csv_path:
        df = pd.read_csv(csv_path)
        sample_name = df.iloc[0]["sample/name"]

After (new improved API):

from rdetoolkit.models.rde2types import RdeDatasetPaths  # import path assumed

def custom_dataset(paths: RdeDatasetPaths):
    row_data = paths.smarttable_row_data  # dict[str, Any] | None
    if row_data:
        sample_name = row_data.get("sample/name")


Variable Array Feature Support in Description (Issue #210)

Enhancements

  • New Helper Function: __collect_values_from_variable collects all values for a specified key from the variable array
  • New Helper Function: __format_description_entry centralizes formatting logic (DRY principle)
  • Extended Function: update_description_with_features now supports variable array lookup
      • constant values take priority over variable values (backward compatible)
      • Multiple values formatted as [A,B,C], single value remains unchanged
  • New Config Flag: Added feature_description boolean to SystemSettings
      • Default: True (backward compatible)
      • Controls automatic feature transcription to description
      • Configurable via rdeconfig.yaml or pyproject.toml

Configuration Examples

rdeconfig.yaml:

system:
  feature_description: false  # Disable auto-transfer to description

pyproject.toml:

[tool.rdetoolkit.system]
feature_description = false


Migration / Compatibility

SmartTable Row Data Access

  • Backward Compatible: Existing smarttable_rowfile path access continues to work
  • Gradual Migration: Both old (file path) and new (dict) approaches can coexist
  • Non-SmartTable Modes: smarttable_row_data returns None in non-SmartTable modes

Variable Array Feature Support

  • Backward Compatible: Existing constant-only feature behavior unchanged
  • Priority Rules: constant values always take precedence over variable values
  • Config Default: feature_description defaults to True, preserving existing behavior
  • Opt-out Available: Set feature_description: false to disable automatic transcription

Known Issues

  • None reported at this time.

v1.5.0 (2026-01-09)

References

Highlights

  • Introduced Result type pattern (Result[T, E]) for explicit, type-safe error handling without exceptions
  • System logs now use timestamped filenames (rdesys_YYYYMMDD_HHMMSS.log) instead of static rdesys.log, enabling per-run log management and preventing log collision in concurrent or successive executions
  • CLI modernized with Typer, adding validate subcommands, rdetoolkit run, and init template path options while preserving python -m rdetoolkit compatibility
  • Lazy imports across core, workflow, CLI, and graph stacks reduce startup overhead and defer heavy dependencies until needed
  • Added optional structured invoice.json export, expanded Magic Variables, and official Python 3.14 support

Result Type Pattern (Issue #334)

Enhancements

  • New Result Module (rdetoolkit.result):
      • Success[T]: Immutable frozen dataclass for successful results with value
      • Failure[E]: Immutable frozen dataclass for failed results with error
      • Result[T, E]: Type alias for Success[T] | Failure[E]
      • try_result decorator: Converts exception-based functions to Result-returning functions
      • Full generic type support with TypeVar and ParamSpec for type safety
      • Functional methods: is_success(), map(), unwrap()
  • Result-based Workflow Functions:
      • check_files_result(): File classification with explicit Result type
      • Returns Result[tuple[RawFiles, Path | None, Path | None], StructuredError]
  • Result-based Mode Processing Functions:
      • invoice_mode_process_result(): Invoice processing with Result type
      • Returns Result[WorkflowExecutionStatus, Exception]
  • Type Stubs: Complete .pyi files for IDE autocomplete and type checking
  • Documentation: Comprehensive API docs in English and Japanese (docs/api/result.en.md, docs/api/result.ja.md)
  • Public API: Result types exported from rdetoolkit.__init__.py for easy import
  • 100% Test Coverage: 40 comprehensive unit tests for Result module

Usage Examples

Result-based error handling:

from rdetoolkit.workflows import check_files_result

result = check_files_result(srcpaths, mode="invoice")
if result.is_success():
    raw_files, excel_path, smarttable_path = result.unwrap()
    # Process files
else:
    error = result.error
    print(f"Error {error.ecode}: {error.emsg}")

Traditional exception-based (still works):

from rdetoolkit.exceptions import StructuredError
from rdetoolkit.workflows import check_files

try:
    raw_files, excel_path, smarttable_path = check_files(srcpaths, mode="invoice")
except StructuredError as e:
    print(f"Error {e.ecode}: {e.emsg}")


Timestamped Log Filenames (Issue #341)

Enhancements

  • Added generate_log_timestamp() utility function to create filesystem-safe timestamp strings
  • Modified workflows.run() to generate unique timestamped log files for each workflow execution
  • Fixed P2 bug: Handler accumulation when run() called multiple times in the same process
      • Root cause: Logger singleton retained old LazyFileHandlers with different filenames
      • Solution: Clear existing LazyFileHandlers before adding new ones
      • Impact: Ensures 1 execution = 1 log file, preventing log cross-contamination
  • Replaced custom LazyFileHandler with standard logging.FileHandler(delay=True) for better maintainability
  • Updated all documentation to reference the new timestamped log filename pattern

Benefits

  • Per-run isolation: Each workflow execution creates a separate log file, preventing log mixing
  • Concurrent execution: No log collision when running multiple workflows simultaneously
  • Easy comparison: Compare logs from different runs without manual separation
  • Simplified auditing: Collect and archive logs per execution for debugging and compliance
  • Better maintainability: Standard library FileHandler is well-tested and widely understood

CLI Modernization and Validation (Issues #247, #262, #337, #338)

Enhancements

  • Migrated CLI to Typer with lazy imports; preserved python -m rdetoolkit invocation and command names (init, version, gen-config, make-excelinvoice, artifact, csv2graph)
  • Added rdetoolkit run <module_or_file::attr> to load a function dynamically; it rejects classes and other non-function callables and verifies that the function accepts two positional arguments
  • Added rdetoolkit validate commands (invoice-schema, invoice, metadata-def, metadata, all) with --format text|json, --quiet, --strict/--no-strict, and CI-friendly exit codes (0/1/2/3)
  • Added init template path options (--entry-point, --modules, --tasksupport, --inputdata, --other) and persist them to pyproject.toml / rdeconfig.yaml

Init Template Path Options Details (Issue #262)

Added template path options to rdetoolkit init command, enabling project initialization from custom templates.

Use Cases:

  • Initialize with commonly used utility files pre-placed in modules/ folder
  • Customize main.py to preferred format
  • Include frequently used config files as templates
  • Specify custom object-oriented script templates

Added Options:

  • --entry-point: Place entry point (.py file) in container/ directory
  • --modules: Place modules in container/modules/ directory (folder specification includes subdirectories)
  • --tasksupport: Place config files in tasksupport/ directory (folder specification includes subdirectories)
  • --inputdata: Place input data in container/data/inputdata/ directory (folder specification includes subdirectories)
  • --other: Place other files in container/ directory (folder specification includes subdirectories)

Config Persistence:

  • CLI-specified paths are automatically saved to pyproject.toml or rdeconfig.yaml(yml)
  • Auto-generates pyproject.toml if no config file exists
  • Overwrites existing settings when present

Safety Measures:

  • Self-copy (same path) detection and skip
  • Invalid path and empty string validation with error reporting

Config File Example (pyproject.toml):

[tool.rdetoolkit.init]
entry_point = "path/to/your/template/main.py"
modules = "path/to/your/template/modules/"
tasksupport = "path/to/your/template/config/"
inputdata = "path/to/your/template/inputdata/"
other = [
    "path/to/your/template/file1.txt",
    "path/to/your/template/dir2/"
]

Config File Example (rdeconfig.yaml):

init:
  entry_point: "path/to/your/template/main.py"
  modules: "path/to/your/template/modules/"
  tasksupport: "path/to/your/template/config/"
  inputdata: "path/to/your/template/inputdata/"
  other:
    - "path/to/your/template/file1.txt"
    - "path/to/your/template/dir2/"


Startup Performance Improvements (Issues #323-330)

Enhancements

  • Implemented lazy exports in rdetoolkit and rdetoolkit.graph to avoid importing heavy submodules until needed
  • Deferred heavy dependencies in invoice/validation/encoding, core utilities, workflows, CLI commands, and graph renderers
  • Updated Ruff per-file ignores to allow intentional PLC0415 in lazy-import modules

Type Safety and Refactors (Issues #333, #335, #336)

Enhancements

  • Replaced models.rde2types aliases with NewType definitions and validated path classes; added FileGroup / ProcessedFileGroup for safer file grouping
  • Broadened read-only inputs to Mapping and mutable inputs to MutableMapping, including Validator.validate() accepting Mapping and normalizing to dict
  • Replaced if/elif chains with dispatch tables for rde2util.castval, invoice sheet processing, and archive format selection, preserving behavior with new tests
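
As an illustration of the dispatch-table refactor, the sketch below maps type names to converter callables instead of branching with if/elif. cast_value and the set of supported types are hypothetical and are not a copy of rde2util.castval.

```python
from typing import Any, Callable


def _to_bool(text: str) -> bool:
    lowered = text.strip().lower()
    if lowered in {"true", "1"}:
        return True
    if lowered in {"false", "0"}:
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Dispatch table: one entry per type name instead of an if/elif chain.
_CASTERS: dict[str, Callable[[str], Any]] = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": _to_bool,
}


def cast_value(text: str, type_name: str) -> Any:
    """Cast text to the named type (hypothetical stand-in)."""
    try:
        caster = _CASTERS[type_name]
    except KeyError:
        raise ValueError(f"unsupported type: {type_name!r}") from None
    return caster(text)


print(cast_value("3.5", "number"))  # 3.5
```

Adding a new type then means adding one table entry, and each converter can be tested in isolation.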

Workflow and Config Enhancements (Issues #3, #301)

Enhancements

  • Added system.save_invoice_to_structured (default false) and StructuredInvoiceSaver to optionally copy invoice.json into the structured directory after thumbnail generation
  • Expanded Magic Variable patterns: ${invoice:basic:*}, ${invoice:custom:*}, ${invoice:sample:names:*}, ${metadata:constant:*}, with warnings on skipped values and strict validation for missing fields

Tooling and Platform Support (Issue #249)

Enhancements

  • Added official Python 3.14 support across classifiers, tox environments, and CI build/test matrices

Migration / Compatibility

Result Type Pattern

  • Backward Compatible: All original exception-based functions remain unchanged
  • Gradual Migration: Both patterns (exception-based and Result-based) can coexist
  • Delegation Pattern: Original functions delegate to *_result() versions internally
  • Type Safety: Use isinstance(result, Failure) for type-safe error checking
  • Error Preservation: All error information (StructuredError attributes, Exception details) preserved in Failure

Timestamped Log Filenames

  • Log file naming change: System logs are now written to data/logs/rdesys_YYYYMMDD_HHMMSS.log instead of data/logs/rdesys.log
  • Finding logs: Use wildcard patterns to find logs: ls -t data/logs/rdesys_*.log | head -1 for the latest log
  • Scripts and tools: Update any scripts or monitoring tools that directly reference rdesys.log to use pattern matching with rdesys_*.log
  • Log collection: Automated log collection systems should be updated to handle multiple timestamped files instead of a single static file
  • Old log files: Existing rdesys.log files from previous versions will remain in place and are not automatically removed
  • No configuration needed: The new behavior is automatic; no configuration changes are required

CLI (Typer Migration and New Commands)

  • Invocation unchanged: python -m rdetoolkit ... continues to work; command names and options are preserved
  • Dependency update: Click is removed in favor of Typer; avoid importing Click-specific objects from rdetoolkit.cli
  • Validation commands: New rdetoolkit validate subcommands return exit codes 0/1/2/3 for CI automation

Init Template Paths

  • Config persistence: Template paths are stored in pyproject.toml / rdeconfig.yaml when provided; existing configs remain valid

Structured Invoice Export

  • Opt-in behavior: system.save_invoice_to_structured defaults to false; enabling it creates structured/invoice.json after thumbnail generation

Magic Variables

  • Expanded patterns: ${invoice:basic:*}, ${invoice:custom:*}, ${invoice:sample:names:*}, ${metadata:constant:*} are now supported
  • Error handling: Missing required fields raise errors; empty segments are skipped with warnings to avoid double underscores
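
The error-handling rule can be pictured with a small sketch. This is not rdetoolkit's implementation: `build_name`, `_lookup`, and the idea of passing the name as a list of segments are assumptions made for illustration, and only the ${invoice:basic:*} pattern is handled.

```python
import re
import warnings
from collections.abc import Mapping
from typing import Optional

# Matches ${invoice:basic:<field>} placeholders only (illustrative subset).
_BASIC = re.compile(r"\$\{invoice:basic:(\w+)\}")


def _lookup(key: str, basic: Mapping[str, Optional[str]]) -> str:
    # A field that does not exist at all is treated as an error.
    if key not in basic:
        raise KeyError(f"invoice.basic has no field {key!r}")
    value = basic[key]
    return "" if value is None else str(value)


def build_name(segments: list[str], basic: Mapping[str, Optional[str]]) -> str:
    """Expand each segment and join with '_', skipping segments that end up empty."""
    parts = []
    for segment in segments:
        expanded = _BASIC.sub(lambda m: _lookup(m.group(1), basic), segment)
        if not expanded:
            # Skipping here is what prevents names such as "EXP1__raw".
            warnings.warn(f"skipping empty segment {segment!r}", UserWarning, stacklevel=2)
            continue
        parts.append(expanded)
    return "_".join(parts)
```

With `{"experimentId": "EXP1", "dataName": ""}`, the segments `${invoice:basic:experimentId}`, `${invoice:basic:dataName}`, and `raw` produce `EXP1_raw` plus one warning, while a segment naming an absent field raises `KeyError`.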

Mapping Type Hints

  • Type-only change: Mapping / MutableMapping annotations widen the accepted input types without changing runtime behavior
  • Validation inputs: Validator.validate(obj=...) now copies mappings into a dict at the boundary
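
A minimal sketch of copying a mapping at the boundary, assuming a hypothetical `validate` function (this is not the signature of rdetoolkit's Validator.validate). It shows why a read-only mapping can be accepted safely once the input is copied into a dict.

```python
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


def validate(obj: Mapping[str, Any]) -> dict[str, Any]:
    """Hypothetical entry point that accepts any Mapping."""
    # Shallow copy at the boundary: later mutation never touches the caller's object.
    data = dict(obj)
    data.setdefault("checked", True)
    return data


# A read-only view is a valid Mapping but not a dict.
read_only = MappingProxyType({"name": "sample-1"})
validated = validate(read_only)
```

Note that `dict(obj)` is a shallow copy, so nested containers are still shared with the caller.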

Python 3.14 Support

  • Compatibility: Python 3.14 is now a supported runtime with CI and packaging updates

Known Issues

  • Only invoice_mode_process has a Result-based version; other mode processors will be migrated in future releases

v1.4.3 (2025-12-25)

References

Highlights

  • SmartTable split processing now preserves sample.ownerId, respects boolean columns, and prevents empty cells from inheriting values from previous rows, restoring per-row data integrity.
  • csv2graph defaults HTML artifacts to the CSV directory (structured) and adds html_output_dir for overrides, while aligning Plotly/Matplotlib legend text and log-scale tick formatting for consistent outputs.

Enhancements

  • Cache the base invoice once in SmartTable processing and pass deep copies per row so divided invoices retain sample.ownerId.
  • Detect empty SmartTable cells and clear mapped basic/custom/sample fields instead of reusing prior-row values.
  • Cast "TRUE" / "FALSE" (case-insensitive) according to schema boolean types so Excel-derived strings convert correctly during SmartTable writes.
  • Added html_output_dir / --html-output-dir to csv2graph, defaulting HTML saves to the CSV directory, and refreshed English/Japanese docs and samples.
  • Standardized Plotly legend labels to series names (trimmed before :) and enforced decade-only log ticks with 10^ notation across Plotly and Matplotlib.
  • Added regression tests designed with equivalence partitioning and boundary-value analysis (EP/BV) for SmartTable ownerId inheritance, empty-cell clearing, boolean casting, csv2graph HTML destinations, and renderer legend/log formatting.
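
The "cache once, deep-copy per row" approach can be sketched as follows. The dictionary layout, the `basic/dataName` column key, and `split_invoices` are simplified assumptions for illustration, not rdetoolkit's actual data model.

```python
import copy

# Hypothetical base invoice, loaded once before the rows are processed.
base_invoice = {
    "basic": {"dataName": ""},
    "sample": {"ownerId": "owner-001", "names": []},
}
rows = [{"basic/dataName": "run-1"}, {"basic/dataName": "run-2"}]


def split_invoices(base: dict, rows: list[dict]) -> list[dict]:
    """Build one invoice per row, each starting from an untouched copy of base."""
    invoices = []
    for row in rows:
        invoice = copy.deepcopy(base)  # nested dicts are copied, not shared
        invoice["basic"]["dataName"] = row["basic/dataName"]
        invoices.append(invoice)
    return invoices


invoices = split_invoices(base_invoice, rows)
```

Because every row receives its own deep copy, a field such as sample.ownerId survives on all rows and an edit made for one row cannot leak into the next.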

Fixes

  • Resolved loss of sample.ownerId on SmartTable rows after the first split.
  • Prevented empty SmartTable cells from carrying forward prior-row values into basic/description and sample/composition/description fields.
  • Fixed boolean casting that treated "FALSE" as truthy; schema-driven conversion now produces correct booleans.
  • Corrected csv2graph HTML placement when output_dir pointed to other_image, defaulting HTML to the CSV/structured directory unless overridden.
  • Normalized Plotly legends that previously showed full headers (e.g., total:intensity) and removed 2/5 minor log ticks while enforcing exponential labels.

Migration / Compatibility

  • csv2graph now saves HTML alongside the CSV by default (typically data/structured). Use html_output_dir (API) or --html-output-dir (CLI) to target a different directory.
  • SmartTable no longer reuses prior-row values for empty cells; provide explicit values on each row where inheritance was previously relied upon.
  • String booleans "TRUE" / "FALSE" are force-cast to real booleans. Review workflows that accidentally depended on non-empty strings always evaluating to True.
  • No other compatibility changes.
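
The boolean pitfall behind this change is easy to reproduce in plain Python. The helper `cast_boolean` below is a hypothetical sketch of a schema-driven conversion, not rdetoolkit's own function.

```python
def cast_boolean(cell: str) -> bool:
    """Convert an Excel-style string to a real boolean, case-insensitively."""
    text = cell.strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"not a boolean string: {cell!r}")


# Built-in truthiness treats any non-empty string as True.
naive = bool("FALSE")
# An explicit conversion yields the intended value.
converted = cast_boolean("FALSE")
```

Here `naive` is `True` while `converted` is `False`, which is the difference workflows should check for when they previously relied on string truthiness.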

Known Issues

  • None reported at this time.

v1.4.2 (2025-12-18)

References

Highlights

  • InvoiceFile.overwrite() now accepts dictionaries, validates them through InvoiceValidator, and can fall back to the existing invoice_path.
  • Excel invoice reading is centralized inside ExcelInvoiceFile, with read_excelinvoice() acting as a warning-backed compatibility wrapper slated for v1.5.0 removal.
  • csv2graph detects when a single series is requested and suppresses per-series plots unless the CLI flag explicitly demands them, keeping CLI and API defaults in sync.
  • MultiDataTile pipelines continue to run, and therefore validate datasets, even when the input directory contains only Excel invoices or is empty.

Enhancements

  • Updated InvoiceFile.overwrite() to accept mapping objects, apply schema validation through InvoiceValidator, and default the destination path to the instance’s invoice_path; refreshed docstrings and docs/rdetoolkit/invoicefile.md to describe the new API.
  • Converted read_excelinvoice() into a wrapper that emits a deprecation warning and delegates to ExcelInvoiceFile.read(), updated src/rdetoolkit/impl/input_controller.py to use the class API directly, and clarified docstrings/type hints so df_general / df_specific may be None.
  • Adjusted Csv2GraphCommand so no_individual is typed as bool | None, added CLI plumbing that inspects ctx.get_parameter_source() to detect explicit user input, and documented the overlay-only default in docs/rdetoolkit/csv2graph.md.
  • Added assert_optional_frame_equal and new regression tests that cover csv2graph CLI/API flows plus MultiFileChecker behaviors for Excel-only, empty, single-file, and multi-file directories.
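
The wrapper approach described for read_excelinvoice() follows a common deprecation pattern, sketched below with hypothetical stand-ins. `ExcelReader` and `read_excel_legacy` are illustrative names, and the return value is a placeholder rather than real Excel parsing.

```python
import warnings


class ExcelReader:
    """Hypothetical stand-in for a class-based reader."""

    def read(self, path: str) -> dict:
        # Placeholder result; a real reader would parse the workbook.
        return {"path": path}


def read_excel_legacy(path: str) -> dict:
    """Legacy entry point kept as a thin, warning-emitting wrapper."""
    warnings.warn(
        "read_excel_legacy() is deprecated; use ExcelReader().read() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return ExcelReader().read(path)
```

Existing callers keep working and get the same result, while the warning points them at the class API before the wrapper is removed.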

Fixes

  • Auto-detecting single-series requests avoids generating empty per-series artifacts and aligns CLI defaults with the Python API.
  • _process_invoice_sheet(), _process_general_term_sheet(), and _process_specific_term_sheet() now correctly return pd.DataFrame objects, avoiding attribute errors in callers that expect frame operations.
  • MultiFileChecker.parse() returns [()] when no payload files are detected so MultiDataTile validation runs even on empty input directories, matching Invoice mode semantics.
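
Why returning [()] rather than an empty list matters can be shown with a toy loop. `run_tiles` is a hypothetical stand-in for a pipeline that performs one validation pass per file group.

```python
def run_tiles(groups: list[tuple]) -> int:
    """Run one pass per group and report how many passes happened."""
    passes = 0
    for _group in groups:
        # A real pipeline would process and validate the group here.
        passes += 1
    return passes


skipped = run_tiles([])    # empty list: the loop body never runs
ran_once = run_tiles([()])  # one empty group: exactly one pass
```

With an empty list no pass runs and validation is silently skipped, whereas a single empty tuple triggers one pass in which missing files can be reported.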

Migration / Compatibility

  • Code calling InvoiceFile.overwrite() can now supply dictionaries directly; omit the destination argument to write to the instance path, and expect schema validation errors when invalid structures are provided.
  • read_excelinvoice() is officially deprecated and scheduled for removal in v1.5.0—migrate to ExcelInvoiceFile().read() or ExcelInvoiceFile.read() helpers.
  • csv2graph now generates only the overlay/summary graph when --no-individual is not specified and there is at most one value column; pass --no-individual=false to force legacy per-series output or --no-individual to always skip per-series plots.
  • MultiDataTile runs on empty directories no longer short-circuit; expect validation failures to surface when required payload files are absent.
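
The three-state behavior of --no-individual described above can be summarized in a short sketch. `should_skip_individual` is a hypothetical helper, with `None` standing for "flag not specified".

```python
from typing import Optional


def should_skip_individual(no_individual: Optional[bool], value_columns: int) -> bool:
    """Decide whether per-series plots are skipped."""
    if no_individual is not None:
        # An explicit choice by the user always wins.
        return no_individual
    # Unspecified: skip per-series plots when there is at most one value column.
    return value_columns <= 1
```

So `(None, 1)` skips per-series plots, `(None, 3)` keeps them, `(False, 1)` forces the legacy per-series output, and `(True, 3)` always skips.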

Known Issues

  • None reported at this time.

v1.4.1 (2025-11-05)

References

Highlights

  • Dedicated SmartTable row CSV accessors replace ad-hoc rawfiles[0] lookups without breaking existing callbacks.
  • MultiDataTile workflows now guarantee a returned status and surface the failing mode instead of producing silent job artifacts.
  • CSV parsing tolerates metadata comments and empty data windows, removing spurious parser exceptions.
  • Graph helpers (csv2graph, plot_from_dataframe) are now exported directly via rdetoolkit.graph for simpler imports.

Enhancements

  • Introduced the smarttable_rowfile field on RdeOutputResourcePath and exposed it via ProcessingContext.smarttable_rowfile and RdeDatasetPaths.
  • SmartTable processors populate the new field automatically; when fallbacks hit rawfiles[0] a FutureWarning is emitted to prompt migration while preserving backward compatibility.
  • Refreshed developer guidance so SmartTable callbacks expect the dedicated row-file accessor.
  • Re-exported csv2graph and plot_from_dataframe from rdetoolkit.graph, aligning documentation and samples with the simplified import path.

Fixes

  • Ensured MultiDataTile mode always returns a WorkflowExecutionStatus and raises a StructuredError that names the failing mode if the pipeline fails to report back.
  • Updated CSVParser._parse_meta_block() and _parse_no_header() to ignore #-prefixed metadata rows and return an empty DataFrame when no data remains, eliminating ParserError / EmptyDataError.
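
The parser behavior can be approximated with the standard library. This sketch uses the csv module instead of pandas, so it returns a list of rows rather than a DataFrame, and `parse_rows` is a hypothetical name.

```python
import csv


def parse_rows(text: str) -> list[list[str]]:
    """Drop blank and '#'-prefixed lines, then parse what is left as CSV."""
    data_lines = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # No remaining lines yields an empty result instead of a parser error.
    return list(csv.reader(data_lines))
```

A file that contains only metadata comments therefore produces an empty result, mirroring the "empty DataFrame instead of an exception" behavior described above.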

Migration / Compatibility

  • Existing callbacks using resource_paths.rawfiles[0] continue to work, but now emit a FutureWarning; migrate to smarttable_rowfile to silence it.
  • The rawfiles tuple itself remains the primary list of user-supplied files—only the assumption that its first entry is always the SmartTable row CSV is being phased out.
  • No configuration changes are required for CSV ingestion; the parser improvements are backward compatible.
  • Prefer from rdetoolkit.graph import csv2graph, plot_from_dataframe; the previous rdetoolkit.graph.api path remains available for now.
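
The fallback-with-warning behavior for the row CSV can be sketched with a trimmed-down stand-in. `ResourcePaths` and `row_csv` below are hypothetical; only the field names rawfiles and smarttable_rowfile come from the notes above.

```python
import warnings
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ResourcePaths:
    """Hypothetical, simplified stand-in for the real path container."""

    rawfiles: tuple[Path, ...]
    smarttable_rowfile: Optional[Path] = None


def row_csv(paths: ResourcePaths) -> Path:
    """Prefer the dedicated accessor; fall back to rawfiles[0] with a warning."""
    if paths.smarttable_rowfile is not None:
        return paths.smarttable_rowfile
    warnings.warn(
        "falling back to rawfiles[0]; use smarttable_rowfile instead",
        FutureWarning,
        stacklevel=2,
    )
    return paths.rawfiles[0]
```

Code that reads the dedicated field stays silent, while code that still depends on the first raw file keeps working and is reminded to migrate.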

Known Issues

  • None reported at this time.

v1.4.0 (2025-10-24)

References

Highlights

  • SmartTableInvoice automatically writes meta/ columns to metadata.json
  • Compact AI/LLM-friendly traceback output (duplex mode)
  • CSV visualization utility csv2graph
  • Configuration scaffold generator gen-config

Enhancements

  • Added the csv2graph API with multi-format CSV support, direction filters, Plotly HTML export, and 220+ tests.
  • Added the gen-config CLI with template presets, bilingual interactive mode, and --overwrite safeguards.
  • SmartTableInvoice now maps meta/ prefixed columns—converted via metadata-def.json—into the constant section of metadata.json, preserving existing values and skipping if definitions are missing.
  • Introduced selectable traceback formats (compact, python, duplex) with sensitive-data masking and local-variable truncation.
  • Consolidated RDE dataset callbacks around a single RdeDatasetPaths argument while emitting deprecation warnings for legacy signatures.

Fixes

  • Resolved a MultiDataTile issue where StructuredError failed to stop execution when ignore_errors=True.
  • Cleaned up SmartTable error handling and annotations for more predictable failure behavior.

Migration / Compatibility

  • Legacy two-argument callbacks continue to work but should migrate to the single-argument RdeDatasetPaths form.
  • Projects using SmartTable meta/ columns should ensure metadata-def.json is present for automatic mapping.
  • Traceback format configuration is optional; defaults remain unchanged.
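
One way to support both callback shapes during a migration is to inspect the callback's arity. The sketch below is an assumption about how such dispatch could look, not rdetoolkit's implementation; `DatasetPaths` and `invoke` are hypothetical names.

```python
import inspect
import warnings
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DatasetPaths:
    """Hypothetical unified argument bundling input and output paths."""

    inputs: str
    outputs: str


def invoke(callback: Callable[..., Any], paths: DatasetPaths) -> Any:
    """Call single-argument callbacks directly; adapt legacy two-argument ones."""
    arity = len(inspect.signature(callback).parameters)
    if arity == 1:
        return callback(paths)
    warnings.warn(
        "two-argument callbacks are deprecated; accept a single paths object",
        DeprecationWarning,
        stacklevel=2,
    )
    return callback(paths.inputs, paths.outputs)


def new_style(paths: DatasetPaths) -> str:
    return f"{paths.inputs}->{paths.outputs}"


def legacy_style(inputs: str, outputs: str) -> str:
    return f"{inputs}->{outputs}"
```

Both `invoke(new_style, ...)` and `invoke(legacy_style, ...)` return the same value, and only the legacy form triggers a deprecation warning.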

Known Issues

  • None reported at this time.

v1.3.4 (2025-08-21)

References

  • Key issue: #217 (SmartTable/Invoice validation reliability)

Highlights

  • Stabilized SmartTable/Invoice validation flow.

Enhancements

  • Reworked validation and initialization to block stray fields and improve exception messaging.

Fixes

  • Addressed SmartTableInvoice validation edge cases causing improper exception propagation or typing mismatches.

Migration / Compatibility

  • No breaking changes.

Known Issues

  • None reported at this time.

v1.3.3 (2025-07-29)

References

Highlights

  • Fixed ValidationError construction and stabilized Invoice processing.
  • Added sampleWhenRestructured schema for copy-restructure workflows.

Enhancements

  • Introduced the sampleWhenRestructured pattern so copy-restructured invoice.json files requiring only sampleId validate correctly.
  • Expanded coverage across all sample-validation patterns to preserve backward compatibility.

Fixes

  • Replaced the faulty ValidationError.__new__() usage with SchemaValidationError during _validate_required_fields_only checks.
  • Clarified optional fields for InvoiceSchemaJson and Properties, fixing CI/CD mypy failures.

Migration / Compatibility

  • No configuration changes required; existing invoice.json files remain compatible.

Known Issues

  • None reported at this time.

v1.3.2 (2025-07-22)

References

Highlights

  • Strengthened required-field validation for SmartTableInvoice.

Enhancements

  • Added schema enforcement to restrict invoice.json to required fields, preventing unnecessary defaults.
  • Ensured validation runs even when pipelines terminate early.

Fixes

  • Tidied exception handling and annotations within SmartTable validation.

Migration / Compatibility

  • Backward compatible, though workflows adding extraneous invoice.json fields should remove them.

Known Issues

  • None reported at this time.

v1.3.1 (2025-07-14)

References

Highlights

  • Fixed empty-sheet exports in Excel invoice templates.
  • Enforced stricter validation for extended_mode.

Enhancements

  • Added serialization_alias to invoice_schema.py, ensuring $schema and $id serialize correctly in invoice.schema.json.
  • Restricted extended_mode in models/config.py to approved values and broadened tests for save_raw / save_nonshared_raw behavior.
  • Introduced save_table_file and SkipRemainingProcessorsError to SmartTable for finer pipeline control.
  • Updated models/rde2types.py typing and suppressed future DataFrame warnings.
  • Refreshed Rust string formatting, build.rs, and CI workflows for reliability.

Fixes

  • Added raw-directory existence checks to prevent copy failures.
  • Ensured generalTerm / specificTerm sheets appear even when attributes are empty and corrected variable naming errors.
  • Specified orient in FixedHeaders to silence future warnings.

Migration / Compatibility

  • Invalid extended_mode values now raise errors; normalize configuration accordingly.
  • Review SmartTable defaults if relying on prior save_table_file behavior.
  • tqdm dependency removal may require adjustments in external tooling.

Known Issues

  • None reported at this time.

v1.2.0 (2025-04-14)

References

Highlights

  • Introduced MinIO storage integration.
  • Delivered artifact archiving and report-generation workflows.

Enhancements

  • Implemented the MinIOStorage class for object storage access.
  • Added commands for archive creation (ZIP / tar.gz) and report generation.
  • Expanded documentation covering object-storage usage and reporting APIs.

Fixes

  • Updated dependencies and modernized CI configurations.

Migration / Compatibility

  • Fully backward compatible; enable optional dependencies when using MinIO.

Known Issues

  • None reported at this time.