Changelog

Version Index

Version | Release Date | Key Changes | Details
v1.5.0 | 2026-01-09 | Result type / Typer CLI + validate / Timestamped logs / Lazy imports / Python 3.14 | v1.5.0
v1.4.3 | 2025-12-25 | SmartTable data integrity fixes / csv2graph HTML destination + legend/log-scale tweaks | v1.4.3
v1.4.2 | 2025-12-18 | Invoice overwrite validation / Excel invoice consolidation / csv2graph auto single-series / MultiDataTile empty input | v1.4.2
v1.4.1 | 2025-11-05 | SmartTable rowfile accessor / legacy fallback warnings | v1.4.1
v1.4.0 | 2025-10-24 | SmartTable metadata.json auto-generation / LLM-friendly traceback / CSV visualization utility / gen-config | v1.4.0
v1.3.4 | 2025-08-21 | Stable SmartTable validation | v1.3.4
v1.3.3 | 2025-07-29 | Fixed ValidationError handling / Added sampleWhenRestructured schema | v1.3.3
v1.3.2 | 2025-07-22 | Strengthened SmartTable required-field validation | v1.3.2
v1.3.1 | 2025-07-14 | Excel invoice empty-sheet fix / Stricter extended_mode validation | v1.3.1
v1.2.0 | 2025-04-14 | MinIO integration / Archive generation / Report tooling | v1.2.0

Release Details

v1.5.0 (2026-01-09)

References

Highlights

  • Introduced Result type pattern (Result[T, E]) for explicit, type-safe error handling without exceptions
  • System logs now use timestamped filenames (rdesys_YYYYMMDD_HHMMSS.log) instead of static rdesys.log, enabling per-run log management and preventing log collision in concurrent or successive executions
  • CLI modernized with Typer, adding validate subcommands, rdetoolkit run, and init template path options while preserving python -m rdetoolkit compatibility
  • Lazy imports across core, workflow, CLI, and graph stacks reduce startup overhead and defer heavy dependencies until needed
  • Added optional structured invoice.json export, expanded Magic Variables, and official Python 3.14 support

Result Type Pattern (Issue #334)

Enhancements

  • New Result Module (rdetoolkit.result):
      • Success[T]: Immutable frozen dataclass for successful results with value
      • Failure[E]: Immutable frozen dataclass for failed results with error
      • Result[T, E]: Type alias for Success[T] | Failure[E]
      • try_result decorator: Converts exception-based functions to Result-returning functions
      • Full generic type support with TypeVar and ParamSpec for type safety
      • Functional methods: is_success(), map(), unwrap()
  • Result-based Workflow Functions:
      • check_files_result(): File classification with explicit Result type
          • Returns Result[tuple[RawFiles, Path | None, Path | None], StructuredError]
  • Result-based Mode Processing Functions:
      • invoice_mode_process_result(): Invoice processing with Result type
          • Returns Result[WorkflowExecutionStatus, Exception]
  • Type Stubs: Complete .pyi files for IDE autocomplete and type checking
  • Documentation: Comprehensive API docs in English and Japanese (docs/api/result.en.md, docs/api/result.ja.md)
  • Public API: Result types exported from rdetoolkit.__init__.py for easy import
  • 100% Test Coverage: 40 comprehensive unit tests for Result module

Usage Examples

Result-based error handling:

from rdetoolkit.workflows import check_files_result

result = check_files_result(srcpaths, mode="invoice")
if result.is_success():
    raw_files, excel_path, smarttable_path = result.unwrap()
    # Process files
else:
    error = result.error
    print(f"Error {error.ecode}: {error.emsg}")

Traditional exception-based (still works):

from rdetoolkit.exceptions import StructuredError  # needed for the except clause
from rdetoolkit.workflows import check_files

try:
    raw_files, excel_path, smarttable_path = check_files(srcpaths, mode="invoice")
except StructuredError as e:
    print(f"Error {e.ecode}: {e.emsg}")


Timestamped Log Filenames (Issue #341)

Enhancements

  • Added generate_log_timestamp() utility function to create filesystem-safe timestamp strings
  • Modified workflows.run() to generate unique timestamped log files for each workflow execution
  • Fixed P2 bug: Handler accumulation when run() is called multiple times in the same process
      • Root cause: Logger singleton retained old LazyFileHandlers with different filenames
      • Solution: Clear existing LazyFileHandlers before adding new ones
      • Impact: Ensures 1 execution = 1 log file, preventing log cross-contamination
  • Replaced custom LazyFileHandler with standard logging.FileHandler(delay=True) for better maintainability
  • Updated all documentation to reference the new timestamped log filename pattern
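
The handler-accumulation fix and the timestamped filename can be sketched with the standard library alone. The function bodies below are assumptions, not the rdetoolkit implementation: only the name generate_log_timestamp, the rdesys_YYYYMMDD_HHMMSS.log pattern, and the use of logging.FileHandler(delay=True) come from the notes above, and attach_run_log is a hypothetical helper.

```python
from __future__ import annotations

import logging
from datetime import datetime
from pathlib import Path


def generate_log_timestamp(now: datetime | None = None) -> str:
    """Return a filesystem-safe timestamp such as 20260109_123456.

    The optional `now` parameter is an assumption added to make the
    sketch testable; the real signature may differ.
    """
    return (now or datetime.now()).strftime("%Y%m%d_%H%M%S")


def attach_run_log(logger: logging.Logger, log_dir: Path) -> Path:
    """Hypothetical helper: give `logger` exactly one file handler per run."""
    # Drop file handlers left over from earlier runs in the same process,
    # otherwise every new run would also write into the older log files.
    for handler in list(logger.handlers):
        if isinstance(handler, logging.FileHandler):
            logger.removeHandler(handler)
            handler.close()

    log_path = log_dir / f"rdesys_{generate_log_timestamp()}.log"
    # delay=True postpones opening the file until the first record is emitted.
    logger.addHandler(logging.FileHandler(log_path, delay=True))
    return log_path
```

Calling attach_run_log() twice on the same logger leaves a single FileHandler attached, which is the "1 execution = 1 log file" property described above.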

Benefits

  • Per-run isolation: Each workflow execution creates a separate log file, preventing log mixing
  • Concurrent execution: No log collision when running multiple workflows simultaneously
  • Easy comparison: Compare logs from different runs without manual separation
  • Simplified auditing: Collect and archive logs per execution for debugging and compliance
  • Better maintainability: Standard library FileHandler is well-tested and widely understood

CLI Modernization and Validation (Issues #247, #262, #337, #338)

Enhancements

  • Migrated CLI to Typer with lazy imports; preserved python -m rdetoolkit invocation and command names (init, version, gen-config, make-excelinvoice, artifact, csv2graph)
  • Added rdetoolkit run <module_or_file::attr> to load a function dynamically, reject classes and other non-function callables, and ensure the function accepts two positional arguments
  • Added rdetoolkit validate commands (invoice-schema, invoice, metadata-def, metadata, all) with --format text|json, --quiet, --strict/--no-strict, and CI-friendly exit codes (0/1/2/3)
  • Added init template path options (--entry-point, --modules, --tasksupport, --inputdata, --other) and persist them to pyproject.toml / rdeconfig.yaml
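
The checks described for rdetoolkit run can be illustrated as follows. This is a rough sketch covering only the module::attr form (not file paths), and load_entry_function is a hypothetical name; the actual command may validate differently.

```python
import importlib
import inspect
from typing import Callable


def load_entry_function(target: str) -> Callable[..., object]:
    """Resolve 'package.module::attr' to a plain function taking two positional args."""
    module_name, sep, attr = target.partition("::")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected '<module>::<attr>', got {target!r}")

    obj = getattr(importlib.import_module(module_name), attr)

    # Classes and callable instances are rejected; only plain functions pass.
    if not inspect.isfunction(obj):
        raise TypeError(f"{target!r} is not a function")

    positional = [
        p
        for p in inspect.signature(obj).parameters.values()
        if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    if len(positional) < 2:
        raise TypeError(f"{target!r} must accept two positional arguments")
    return obj


# Standard-library stand-in: fnmatch.fnmatch(name, pat) is a two-argument function.
matcher = load_entry_function("fnmatch::fnmatch")
```

Checking the signature up front lets a CLI report a clear error before any workflow work starts, rather than failing with a TypeError deep inside the run.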

Init Template Path Options Details (Issue #262)

Added template path options to the rdetoolkit init command, enabling project initialization from custom templates.

Use Cases:

  • Initialize with commonly used utility files pre-placed in modules/ folder
  • Customize main.py to preferred format
  • Include frequently used config files as templates
  • Specify custom object-oriented script templates

Added Options:

  • --entry-point: Place entry point (.py file) in container/ directory
  • --modules: Place modules in container/modules/ directory (folder specification includes subdirectories)
  • --tasksupport: Place config files in tasksupport/ directory (folder specification includes subdirectories)
  • --inputdata: Place input data in container/data/inputdata/ directory (folder specification includes subdirectories)
  • --other: Place other files in container/ directory (folder specification includes subdirectories)

Config Persistence:

  • CLI-specified paths are automatically saved to pyproject.toml or rdeconfig.yaml (or .yml)
  • Auto-generates pyproject.toml if no config file exists
  • Overwrites existing settings when present

Safety Measures:

  • Detects self-copy (source and destination are the same path) and skips it
  • Invalid path and empty string validation with error reporting

Config File Example (pyproject.toml):

[tool.rdetoolkit.init]
entry_point = "path/to/your/template/main.py"
modules = "path/to/your/template/modules/"
tasksupport = "path/to/your/template/config/"
inputdata = "path/to/your/template/inputdata/"
other = [
    "path/to/your/template/file1.txt",
    "path/to/your/template/dir2/"
]

Config File Example (rdeconfig.yaml):

init:
  entry_point: "path/to/your/template/main.py"
  modules: "path/to/your/template/modules/"
  tasksupport: "path/to/your/template/config/"
  inputdata: "path/to/your/template/inputdata/"
  other:
    - "path/to/your/template/file1.txt"
    - "path/to/your/template/dir2/"


Startup Performance Improvements (Issues #323-330)

Enhancements

  • Implemented lazy exports in rdetoolkit and rdetoolkit.graph to avoid importing heavy submodules until needed
  • Deferred heavy dependencies in invoice/validation/encoding, core utilities, workflows, CLI commands, and graph renderers
  • Updated Ruff per-file ignores to allow intentional PLC0415 in lazy-import modules

Type Safety and Refactors (Issues #333, #335, #336)

Enhancements

  • Replaced models.rde2types aliases with NewType definitions and validated path classes; added FileGroup / ProcessedFileGroup for safer file grouping
  • Broadened read-only inputs to Mapping and mutable inputs to MutableMapping, including Validator.validate() accepting Mapping and normalizing to dict
  • Replaced if/elif chains with dispatch tables for rde2util.castval, invoice sheet processing, and archive format selection, preserving behavior with new tests
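
As an illustration of the if/elif-to-dispatch-table refactor, the sketch below maps a format name to a conversion function. The format names and the cast_value function are hypothetical and do not reproduce rde2util.castval.

```python
from typing import Callable

# Hypothetical table: format name -> converter. Adding a format means adding
# one entry instead of another elif branch.
_CASTERS: dict[str, Callable[[str], object]] = {
    "integer": int,
    "number": float,
    "string": str,
}


def cast_value(raw: str, fmt: str) -> object:
    """Convert `raw` according to `fmt`, failing loudly on unknown formats."""
    try:
        caster = _CASTERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    return caster(raw)
```

For example, cast_value("3", "integer") returns the int 3, and an unknown format raises ValueError rather than silently falling through a chain of conditions.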

Workflow and Config Enhancements (Issues #3, #301)

Enhancements

  • Added system.save_invoice_to_structured (default false) and StructuredInvoiceSaver to optionally copy invoice.json into the structured directory after thumbnail generation
  • Expanded Magic Variable patterns: ${invoice:basic:*}, ${invoice:custom:*}, ${invoice:sample:names:*}, ${metadata:constant:*}, with warnings on skipped values and strict validation for missing fields
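
The general idea of expanding such placeholders with strict handling of missing fields can be sketched as below. This is not the rdetoolkit implementation: it handles only the ${invoice:basic:*} and ${invoice:custom:*} forms, omits the warning-and-skip behavior for empty values, and the function name, field names, and invoice layout are assumptions.

```python
import re
from typing import Any, Mapping

# Matches ${invoice:basic:<key>} and ${invoice:custom:<key>} only.
_PATTERN = re.compile(r"\$\{invoice:(basic|custom):([^}]+)\}")


def expand_magic_variables(template: str, invoice: Mapping[str, Mapping[str, Any]]) -> str:
    """Replace supported placeholders with values taken from `invoice`."""

    def replace(match: re.Match) -> str:
        section, key = match.group(1), match.group(2)
        try:
            return str(invoice[section][key])
        except KeyError:
            # Strict mode: a missing field is an error, not an empty string.
            raise KeyError(f"missing field for {match.group(0)}") from None

    return _PATTERN.sub(replace, template)


# Hypothetical invoice content used only for this example.
example_invoice = {"basic": {"experimentId": "EXP-1"}, "custom": {"batch": 7}}
expanded = expand_magic_variables(
    "${invoice:basic:experimentId}_${invoice:custom:batch}.csv", example_invoice
)
```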

Tooling and Platform Support (Issue #249)

Enhancements

  • Added official Python 3.14 support across classifiers, tox environments, and CI build/test matrices

Migration / Compatibility

Result Type Pattern

  • Backward Compatible: All original exception-based functions remain unchanged
  • Gradual Migration: Both patterns (exception-based and Result-based) can coexist
  • Delegation Pattern: Original functions delegate to *_result() versions internally
  • Type Safety: Use isinstance(result, Failure) for type-safe error checking
  • Error Preservation: All error information (StructuredError attributes, Exception details) preserved in Failure

Timestamped Log Filenames

  • Log file naming change: System logs are now written to data/logs/rdesys_YYYYMMDD_HHMMSS.log instead of data/logs/rdesys.log
  • Finding logs: Use a wildcard pattern to locate logs; for example, ls -t data/logs/rdesys_*.log | head -1 prints the most recent one
  • Scripts and tools: Update any scripts or monitoring tools that directly reference rdesys.log to use pattern matching with rdesys_*.log
  • Log collection: Automated log collection systems should be updated to handle multiple timestamped files instead of a single static file
  • Old log files: Existing rdesys.log files from previous versions will remain in place and are not automatically removed
  • No configuration needed: The new behavior is automatic; no configuration changes are required
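
For scripts that previously opened rdesys.log directly, a small helper like the following can locate the newest timestamped file. The helper name is hypothetical. It relies on the YYYYMMDD_HHMMSS pattern sorting chronologically by name, which is a different criterion from the modification-time ordering used by ls -t.

```python
from __future__ import annotations

from pathlib import Path


def latest_system_log(log_dir: str | Path = "data/logs") -> Path | None:
    """Return the newest rdesys_*.log in `log_dir`, or None if there is none."""
    logs = sorted(Path(log_dir).glob("rdesys_*.log"), key=lambda p: p.name)
    # Zero-padded timestamps sort lexicographically in chronological order.
    return logs[-1] if logs else None
```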

CLI (Typer Migration and New Commands)

  • Invocation unchanged: python -m rdetoolkit ... continues to work; command names and options are preserved
  • Dependency update: Click is removed in favor of Typer; avoid importing Click-specific objects from rdetoolkit.cli
  • Validation commands: New rdetoolkit validate subcommands return exit codes 0/1/2/3 for CI automation

Init Template Paths

  • Config persistence: Template paths are stored in pyproject.toml / rdeconfig.yaml when provided; existing configs remain valid

Structured Invoice Export

  • Opt-in behavior: system.save_invoice_to_structured defaults to false; enabling it creates structured/invoice.json after thumbnail generation

Magic Variables

  • Expanded patterns: ${invoice:basic:*}, ${invoice:custom:*}, ${invoice:sample:names:*}, ${metadata:constant:*} are now supported
  • Error handling: Missing required fields raise errors; empty segments are skipped with warnings to avoid double underscores

Mapping Type Hints

  • Type-only change: Mapping / MutableMapping widen input types without changing runtime behavior
  • Validation inputs: Validator.validate(obj=...) now copies mappings into a dict at the boundary

Python 3.14 Support

  • Compatibility: Python 3.14 is now a supported runtime with CI and packaging updates

Known Issues

  • Only invoice_mode_process has a Result-based version; other mode processors will be migrated in future releases

v1.4.3 (2025-12-25)

References

Highlights

  • SmartTable split processing now preserves sample.ownerId, respects boolean columns, and prevents empty cells from inheriting values from previous rows, restoring per-row data integrity.
  • csv2graph defaults HTML artifacts to the CSV directory (structured) and adds html_output_dir for overrides, while aligning Plotly/Matplotlib legend text and log-scale tick formatting for consistent outputs.

Enhancements

  • Cache the base invoice once in SmartTable processing and pass deep copies per row so divided invoices retain sample.ownerId.
  • Detect empty SmartTable cells and clear mapped basic/custom/sample fields instead of reusing prior-row values.
  • Cast "TRUE" / "FALSE" (case-insensitive) according to schema boolean types so Excel-derived strings convert correctly during SmartTable writes.
  • Added html_output_dir / --html-output-dir to csv2graph, defaulting HTML saves to the CSV directory, and refreshed English/Japanese docs and samples.
  • Standardized Plotly legend labels to series names (the header text before the :) and enforced decade-only log ticks with 10^ notation across Plotly and Matplotlib.
  • Added EP/BV-backed regression tests for SmartTable ownerId inheritance, empty-cell clearing, boolean casting, csv2graph HTML destinations, and renderer legend/log formatting.

Fixes

  • Resolved loss of sample.ownerId on SmartTable rows after the first split.
  • Prevented empty SmartTable cells from carrying forward prior-row values into basic/description and sample/composition/description fields.
  • Fixed boolean casting that treated "FALSE" as truthy; schema-driven conversion now produces correct booleans.
  • Corrected csv2graph HTML placement when output_dir pointed to other_image, defaulting HTML to the CSV/structured directory unless overridden.
  • Normalized Plotly legends that previously showed full headers (e.g., total:intensity) and removed 2/5 minor log ticks while enforcing exponential labels.
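
The boolean-casting bug follows from Python's truthiness rules: any non-empty string, including "FALSE", is truthy. The sketch below shows the pitfall and one explicit conversion; cast_bool is a hypothetical helper, not the function used in rdetoolkit.

```python
def cast_bool(value: object) -> bool:
    """Convert 'TRUE' / 'FALSE' strings (any case) to real booleans."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        normalized = value.strip().lower()
        if normalized == "true":
            return True
        if normalized == "false":
            return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")


# The pitfall: bool() on a non-empty string is always True.
naive = bool("FALSE")
# The explicit conversion gives the intended result.
fixed = cast_bool("FALSE")
```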

Migration / Compatibility

  • csv2graph now saves HTML alongside the CSV by default (typically data/structured). Use html_output_dir (API) or --html-output-dir (CLI) to target a different directory.
  • SmartTable no longer reuses prior-row values for empty cells; provide explicit values on each row where inheritance was previously relied upon.
  • String booleans "TRUE" / "FALSE" are force-cast to real booleans. Review workflows that accidentally depended on non-empty strings always evaluating to True.
  • No other compatibility changes.

Known Issues

  • None reported at this time.

v1.4.2 (2025-12-18)

References

Highlights

  • InvoiceFile.overwrite() now accepts dictionaries, validates them through InvoiceValidator, and can fall back to the existing invoice_path.
  • Excel invoice reading is centralized inside ExcelInvoiceFile, with read_excelinvoice() acting as a warning-backed compatibility wrapper slated for v1.5.0 removal.
  • csv2graph detects when a single series is requested and suppresses per-series plots unless the CLI flag explicitly demands them, keeping CLI and API defaults in sync.
  • MultiDataTile pipelines continue to run—and therefore validate datasets—even when the input directory only contains Excel invoices or is empty.

Enhancements

  • Updated InvoiceFile.overwrite() to accept mapping objects, apply schema validation through InvoiceValidator, and default the destination path to the instance’s invoice_path; refreshed docstrings and docs/rdetoolkit/invoicefile.md to describe the new API.
  • Converted read_excelinvoice() into a wrapper that emits a deprecation warning and delegates to ExcelInvoiceFile.read(), updated src/rdetoolkit/impl/input_controller.py to use the class API directly, and clarified docstrings/type hints so df_general / df_specific may be None.
  • Adjusted Csv2GraphCommand so no_individual is typed as bool | None, added CLI plumbing that inspects ctx.get_parameter_source() to detect explicit user input, and documented the overlay-only default in docs/rdetoolkit/csv2graph.md.
  • Added assert_optional_frame_equal and new regression tests that cover csv2graph CLI/API flows plus MultiFileChecker behaviors for Excel-only, empty, single-file, and multi-file directories.

Fixes

  • Auto-detecting single-series requests avoids generating empty per-series artifacts and aligns CLI defaults with the Python API.
  • _process_invoice_sheet(), _process_general_term_sheet(), and _process_specific_term_sheet() now correctly return pd.DataFrame objects, avoiding attribute errors in callers that expect frame operations.
  • MultiFileChecker.parse() returns [()] when no payload files are detected so MultiDataTile validation runs even on empty input directories, matching Invoice mode semantics.
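
Why returning [()] instead of [] matters can be seen in a toy loop: an empty list yields no iterations, while a list containing one empty tuple yields exactly one, so per-group validation still runs. The function below is a hypothetical stand-in for the pipeline, not rdetoolkit code.

```python
def count_validation_runs(groups: list[tuple]) -> int:
    """Count how many times a per-group validation step would execute."""
    runs = 0
    for _group in groups:
        runs += 1  # dataset validation would happen here
    return runs


no_groups = count_validation_runs([])        # loop body never executes
one_empty_group = count_validation_runs([()])  # loop body executes once
```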

Migration / Compatibility

  • Code calling InvoiceFile.overwrite() can now supply dictionaries directly; omit the destination argument to write to the instance path, and expect schema validation errors when invalid structures are provided.
  • read_excelinvoice() is officially deprecated and scheduled for removal in v1.5.0; migrate to ExcelInvoiceFile().read() or ExcelInvoiceFile.read() helpers.
  • csv2graph now generates only the overlay/summary graph when --no-individual is not specified and there is at most one value column; pass --no-individual=false to force legacy per-series output or --no-individual to always skip them.
  • MultiDataTile runs on empty directories no longer short-circuit; expect validation failures to surface when required payload files are absent.
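
The tri-state behavior of no_individual (explicitly true, explicitly false, or unspecified) can be summarized in a few lines. This is a sketch of the decision rule as described above, with a hypothetical function name; it does not show the CLI plumbing that detects explicit user input.

```python
from typing import Optional


def resolve_no_individual(flag: Optional[bool], value_column_count: int) -> bool:
    """Decide whether per-series plots are skipped.

    flag is None when the user did not pass the option at all.
    """
    if flag is not None:
        return flag  # an explicit choice always wins
    # Unspecified: skip per-series plots when there is at most one series.
    return value_column_count <= 1
```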

Known Issues

  • None reported at this time.

v1.4.1 (2025-11-05)

References

Highlights

  • Dedicated SmartTable row CSV accessors replace ad-hoc rawfiles[0] lookups without breaking existing callbacks.
  • MultiDataTile workflows now guarantee a returned status and surface the failing mode instead of producing silent job artifacts.
  • CSV parsing tolerates metadata comments and empty data windows, removing spurious parser exceptions.
  • Graph helpers (csv2graph, plot_from_dataframe) are now exported directly via rdetoolkit.graph for simpler imports.

Enhancements

  • Introduced the smarttable_rowfile field on RdeOutputResourcePath and exposed it via ProcessingContext.smarttable_rowfile and RdeDatasetPaths.
  • SmartTable processors populate the new field automatically; when fallbacks hit rawfiles[0] a FutureWarning is emitted to prompt migration while preserving backward compatibility.
  • Refreshed developer guidance so SmartTable callbacks expect the dedicated row-file accessor.
  • Re-exported csv2graph and plot_from_dataframe from rdetoolkit.graph, aligning documentation and samples with the simplified import path.

Fixes

  • Ensured MultiDataTile mode always returns a WorkflowExecutionStatus and raises a StructuredError that names the failing mode if the pipeline fails to report back.
  • Updated CSVParser._parse_meta_block() and _parse_no_header() to ignore #-prefixed metadata rows and return an empty DataFrame when no data remains, eliminating ParserError / EmptyDataError.

Migration / Compatibility

  • Existing callbacks using resource_paths.rawfiles[0] continue to work, but now emit a FutureWarning; migrate to smarttable_rowfile to silence it.
  • The rawfiles tuple itself remains the primary list of user-supplied files—only the assumption that its first entry is always the SmartTable row CSV is being phased out.
  • No configuration changes are required for CSV ingestion; the parser improvements are backward compatible.
  • Prefer from rdetoolkit.graph import csv2graph, plot_from_dataframe; the previous rdetoolkit.graph.api path remains available for now.
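
The fallback-with-warning behavior can be sketched as follows. ResourcePaths and get_row_csv are simplified, hypothetical stand-ins; only the rawfiles and smarttable_rowfile field names and the FutureWarning category come from the notes above.

```python
import warnings
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ResourcePaths:
    """Simplified stand-in for the real resource-path object."""

    rawfiles: tuple[Path, ...]
    smarttable_rowfile: Optional[Path] = None


def get_row_csv(paths: ResourcePaths) -> Path:
    """Prefer the dedicated accessor; fall back to rawfiles[0] with a warning."""
    if paths.smarttable_rowfile is not None:
        return paths.smarttable_rowfile
    warnings.warn(
        "Falling back to rawfiles[0]; use smarttable_rowfile instead.",
        FutureWarning,
        stacklevel=2,
    )
    return paths.rawfiles[0]
```

Callbacks that read smarttable_rowfile directly never trigger the warning, which is the migration path the first bullet recommends.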

Known Issues

  • None reported at this time.

v1.4.0 (2025-10-24)

References

Highlights

  • SmartTableInvoice automatically writes meta/ columns to metadata.json
  • Compact AI/LLM-friendly traceback output (duplex mode)
  • CSV visualization utility csv2graph
  • Configuration scaffold generator gen-config

Enhancements

  • Added the csv2graph API with multi-format CSV support, direction filters, Plotly HTML export, and 220+ tests.
  • Added the gen-config CLI with template presets, bilingual interactive mode, and --overwrite safeguards.
  • SmartTableInvoice now maps meta/ prefixed columns—converted via metadata-def.json—into the constant section of metadata.json, preserving existing values and skipping if definitions are missing.
  • Introduced selectable traceback formats (compact, python, duplex) with sensitive-data masking and local-variable truncation.
  • Consolidated RDE dataset callbacks around a single RdeDatasetPaths argument while emitting deprecation warnings for legacy signatures.

Fixes

  • Resolved a MultiDataTile issue where StructuredError failed to stop execution when ignore_errors=True.
  • Cleaned up SmartTable error handling and annotations for more predictable failure behavior.

Migration / Compatibility

  • Legacy two-argument callbacks continue to work but should migrate to the single-argument RdeDatasetPaths form.
  • Projects using SmartTable meta/ columns should ensure metadata-def.json is present for automatic mapping.
  • Traceback format configuration is optional; defaults remain unchanged.

Known Issues

  • None reported at this time.

v1.3.4 (2025-08-21)

References

  • Key issue: #217 (SmartTable/Invoice validation reliability)

Highlights

  • Stabilized SmartTable/Invoice validation flow.

Enhancements

  • Reworked validation and initialization to block stray fields and improve exception messaging.

Fixes

  • Addressed SmartTableInvoice validation edge cases causing improper exception propagation or typing mismatches.

Migration / Compatibility

  • No breaking changes.

Known Issues

  • None reported at this time.

v1.3.3 (2025-07-29)

References

Highlights

  • Fixed ValidationError construction and stabilized Invoice processing.
  • Added sampleWhenRestructured schema for copy-restructure workflows.

Enhancements

  • Introduced the sampleWhenRestructured pattern so copy-restructured invoice.json files requiring only sampleId validate correctly.
  • Expanded coverage across all sample-validation patterns to preserve backward compatibility.

Fixes

  • Replaced the faulty ValidationError.__new__() usage with SchemaValidationError during _validate_required_fields_only checks.
  • Clarified optional fields for InvoiceSchemaJson and Properties, fixing CI/CD mypy failures.

Migration / Compatibility

  • No configuration changes required; existing invoice.json files remain compatible.

Known Issues

  • None reported at this time.

v1.3.2 (2025-07-22)

References

Highlights

  • Strengthened required-field validation for SmartTableInvoice.

Enhancements

  • Added schema enforcement to restrict invoice.json to required fields, preventing unnecessary defaults.
  • Ensured validation runs even when pipelines terminate early.

Fixes

  • Tidied exception handling and annotations within SmartTable validation.

Migration / Compatibility

  • Backward compatible, though workflows adding extraneous invoice.json fields should remove them.

Known Issues

  • None reported at this time.

v1.3.1 (2025-07-14)

References

Highlights

  • Fixed empty-sheet exports in Excel invoice templates.
  • Enforced stricter validation for extended_mode.

Enhancements

  • Added serialization_alias to invoice_schema.py, ensuring $schema and $id serialize correctly in invoice.schema.json.
  • Restricted extended_mode in models/config.py to approved values and broadened tests for save_raw / save_nonshared_raw behavior.
  • Introduced save_table_file and SkipRemainingProcessorsError to SmartTable for finer pipeline control.
  • Updated models/rde2types.py typing and suppressed future DataFrame warnings.
  • Refreshed Rust string formatting, build.rs, and CI workflows for reliability.

Fixes

  • Added raw-directory existence checks to prevent copy failures.
  • Ensured generalTerm / specificTerm sheets appear even when attributes are empty and corrected variable naming errors.
  • Specified orient in FixedHeaders to silence future warnings.

Migration / Compatibility

  • Invalid extended_mode values now raise errors; normalize configuration accordingly.
  • Review SmartTable defaults if relying on prior save_table_file behavior.
  • tqdm dependency removal may require adjustments in external tooling.

Known Issues

  • None reported at this time.

v1.2.0 (2025-04-14)

References

Highlights

  • Introduced MinIO storage integration.
  • Delivered artifact archiving and report-generation workflows.

Enhancements

  • Implemented the MinIOStorage class for object storage access.
  • Added commands for archive creation (ZIP / tar.gz) and report generation.
  • Expanded documentation covering object-storage usage and reporting APIs.

Fixes

  • Updated dependencies and modernized CI configurations.

Migration / Compatibility

  • Fully backward compatible; enable optional dependencies when using MinIO.

Known Issues

  • None reported at this time.