
Input Controller API

Purpose

This module defines the file handling for each input mode in RDEToolKit. It provides input mode determination, file validation, and processing control.

Key Features

Input Mode Management

  • Automatic determination of mode from input file patterns
  • Support for Invoice, ExcelInvoice, RDEFormat, and MultiFile modes
  • Input file validation and preprocessing

File Operation Control

  • Acquisition and classification of input files
  • File format validation
  • Control of processing workflow
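The checkers documented on this page share one shape: a constructor that takes a temporary directory, a `checker_type` property, and a `parse(src_dir_input)` method that returns a pair of raw files and an optional path. The following is a minimal sketch of that shape, not RDEToolKit code. `InputFileChecker`, `EchoChecker`, `run`, and the `RawFiles` alias are hypothetical names introduced only for illustration.

```python
from pathlib import Path
from typing import Optional, Protocol

# Assumption: RawFiles is modelled here as a list of tuples of paths,
# matching the return descriptions on this page.
RawFiles = list[tuple[Path, ...]]


class InputFileChecker(Protocol):
    """Hypothetical stand-in for the shape shared by the checkers on this page."""

    out_dir_temp: Path

    @property
    def checker_type(self) -> str: ...

    def parse(self, src_dir_input: Path) -> tuple[RawFiles, Optional[Path]]: ...


class EchoChecker:
    """Toy checker (not part of RDEToolKit) that satisfies the protocol."""

    def __init__(self, unpacked_dir_basename: Path) -> None:
        self.out_dir_temp = unpacked_dir_basename

    @property
    def checker_type(self) -> str:
        return "echo"

    def parse(self, src_dir_input: Path) -> tuple[RawFiles, Optional[Path]]:
        # Put every regular file of the directory into a single group.
        files = sorted(p for p in src_dir_input.glob("*") if p.is_file())
        return [tuple(files)], None


def run(checker: InputFileChecker, src_dir_input: Path) -> int:
    """Call parse() through the shared interface and count the returned files."""
    rawfiles, _ = checker.parse(src_dir_input)
    return sum(len(group) for group in rawfiles)
```

Because callers depend only on this shape, code such as `run` can work with any of the four modes without knowing which checker it was given.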

src.rdetoolkit.impl.input_controller.InvoiceChecker(unpacked_dir_basename)

Bases: IInputFileChecker

A checker class to determine and parse the invoice mode.

This class groups and checks invoice files, specifically identifying zip files, Excel invoice files, and other types of files.

Attributes:

  • out_dir_temp (Path): Temporary directory for the unpacked content.

Note

For the purpose of this checker, notable files are primarily Excel invoices with a specific naming convention.
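The grouping into ZIP files, Excel invoice files, and other files can be pictured with the sketch below. It is an illustration rather than the library's implementation. The function name `group_by_type` is hypothetical, and the `_excel_invoice.xlsx` suffix is an assumption about the naming convention, which this page does not spell out.

```python
from pathlib import Path

# Assumption: Excel invoices are recognised by this file-name suffix.
EXCEL_INVOICE_SUFFIX = "_excel_invoice.xlsx"


def group_by_type(input_files):
    """Split paths into (zip files, Excel invoice files, other files)."""
    zipfiles, excel_invoices, others = [], [], []
    for path in input_files:
        name = path.name.lower()
        if name.endswith(".zip"):
            zipfiles.append(path)
        elif name.endswith(EXCEL_INVOICE_SUFFIX):
            excel_invoices.append(path)
        else:
            others.append(path)
    return zipfiles, excel_invoices, others


files = [Path("raw.zip"), Path("sample_excel_invoice.xlsx"), Path("data.csv")]
zips, invoices, others = group_by_type(files)
print(zips, invoices, others)
```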

checker_type: str property

Return the type identifier for this checker.

out_dir_temp: Path = unpacked_dir_basename instance-attribute

parse(src_dir_input)

Parses the source input directory, grouping files based on their type.

Parameters:

  • src_dir_input (Path, required): Source directory containing the input files.

Returns:

tuple[RawFiles, Path | None]:

  • RawFiles: A list of tuples, each containing file paths grouped as 'other files'.
  • Path | None: Always None for this implementation.

src.rdetoolkit.impl.input_controller.ExcelInvoiceChecker(unpacked_dir_basename)

Bases: IInputFileChecker

A checker class to determine and parse the ExcelInvoice mode.

This class is used to identify, group, and validate the files in ExcelInvoice mode. The primary focus is on determining the presence and validity of ZIP files, Excel Invoice files, and other file types.

Attributes:

  • out_dir_temp (Path): Temporary directory for unpacked content.

Methods:

  • parse(src_dir_input: Path) -> tuple[RawFiles, Optional[Path]]: Parse the source input directory, validate the file groups, and return the raw files and the Excel Invoice file.

checker_type: str property

Return the type identifier for this checker.

out_dir_temp: Path = unpacked_dir_basename instance-attribute

get_index(paths, sort_items)

Retrieves the index of the divided folder.

Parameters:

  • paths (Path, required): Directory path of the raw files.
  • sort_items (Sequence, required): A list of files sorted in the order described in the Excel invoice.

Returns:

  • int: The index number.
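One plausible reading of `get_index` is that it looks up where a raw-file directory falls in the order given by the Excel invoice. The sketch below illustrates that reading only; it is a guess at the behaviour, not the library's implementation. The name `get_index_sketch`, the matching rule (an item equal to one component of the path), and the fallback value are all assumptions.

```python
from pathlib import Path
from typing import Sequence


def get_index_sketch(paths: Path, sort_items: Sequence[str]) -> int:
    """Return the position of the first item that names a component of paths.

    Falls back to len(sort_items) when nothing matches (an assumption
    of this sketch).
    """
    for idx, item in enumerate(sort_items):
        if item in paths.parts:
            return idx
    return len(sort_items)


order = ["sample_b", "sample_a"]
print(get_index_sketch(Path("temp/sample_a/data.csv"), order))  # 1
```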

parse(src_dir_input)

Parse the source input directory, group files by their type, validate the groups, and return the raw files and Excel Invoice file.

Parameters:

  • src_dir_input (Path, required): Source directory containing the input files.

Returns:

tuple[RawFiles, Path | None]:

  • RawFiles: List of tuples containing paths of raw files.
  • Path | None: Path to the Excel Invoice file.

src.rdetoolkit.impl.input_controller.RDEFormatChecker(unpacked_dir_basename)

Bases: IInputFileChecker

A checker class to identify and parse the RDE Format.

This class handles files in the RDE Format. It checks for ZIP files, unpacks them, and retrieves the raw files from the unpacked content.

Attributes:

  • out_dir_temp (Path): Temporary directory for unpacked content.

checker_type: str property

Return the type identifier for this checker.

out_dir_temp: Path = unpacked_dir_basename instance-attribute

parse(src_dir_input)

Parse the source input directory, identify ZIP files, unpack the ZIP file, and return the raw files.

Parameters:

  • src_dir_input (Path, required): Source directory containing the input files.

Returns:

tuple[RawFiles, Path | None]:

  • RawFiles: List of tuples containing paths of raw files.
  • Path | None: Always None for this implementation.
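The unpack-then-collect step described for this checker can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the RDEToolKit implementation. The helper name `unpack_and_collect` and the archive contents are made up for the example.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_and_collect(zip_path: Path, out_dir_temp: Path) -> list[Path]:
    """Extract zip_path into out_dir_temp and return the extracted files, sorted."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir_temp)
    return sorted(p for p in out_dir_temp.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Build a small archive to stand in for an uploaded ZIP file.
    archive = root / "input.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("a/sample.csv", "x,y\n1,2\n")
        zf.writestr("b/notes.txt", "hello")

    out_dir_temp = root / "temp"
    out_dir_temp.mkdir()

    extracted = unpack_and_collect(archive, out_dir_temp)
    names = [p.relative_to(out_dir_temp).as_posix() for p in extracted]
    print(names)
```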

src.rdetoolkit.impl.input_controller.MultiFileChecker(unpacked_dir_basename)

Bases: IInputFileChecker

A checker class to identify and parse the MultiFile mode.

This class handles the MultiFile mode. It checks the files in the source input directory, groups them, and retrieves the raw files.

Attributes:

  • out_dir_temp (Path): Temporary directory used for certain operations.

checker_type: str property

Return the type identifier for this checker.

out_dir_temp: Path = unpacked_dir_basename instance-attribute

parse(src_dir_input)

Parse the source input directory, group ZIP files and other files, and return the raw files.

Parameters:

  • src_dir_input (Path, required): Source directory containing the input files.

Returns:

tuple[RawFiles, Path | None]:

  • RawFiles: List of tuples containing paths of raw files.
  • Path | None: Always None for this implementation.
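To make the "list of tuples" return shape concrete, the sketch below assumes that MultiFile mode treats each input file as its own unit, so every tuple holds exactly one path. That grouping rule is an assumption for illustration, and `one_group_per_file` is a hypothetical helper, not a function of the library.

```python
from pathlib import Path


def one_group_per_file(files):
    """Wrap every path in its own single-element tuple, sorted by path."""
    return [(path,) for path in sorted(files)]


rawfiles = one_group_per_file([Path("b.csv"), Path("a.csv")])
print(rawfiles)
```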

Practical Usage

Invoice Mode Processing

invoice_mode_processing.py
from pathlib import Path

from rdetoolkit.impl.input_controller import InvoiceChecker
from rdetoolkit.models.rde2types import RdeInputDirPaths

# Configure input paths
input_paths = RdeInputDirPaths(
    inputdata=Path("data/input"),
    invoice=Path("data/invoice"),
    tasksupport=Path("data/tasksupport"),
)

# Create an Invoice checker; the constructor takes the temporary
# directory for unpacked content (the path below is an example)
invoice_checker = InvoiceChecker(Path("data/temp"))

try:
    # parse() takes the source input directory and returns
    # (RawFiles, None) in Invoice mode
    rawfiles, _ = invoice_checker.parse(input_paths.inputdata)
    print(f"✓ Invoice parsing successful ({invoice_checker.checker_type})")

    # RawFiles is a list of tuples of file paths
    print(f"Number of file groups: {len(rawfiles)}")

    for i, group in enumerate(rawfiles):
        print(f"Group {i+1}: {group}")

except Exception as e:
    print(f"✗ Invoice processing error: {e}")

ExcelInvoice Mode Processing

excel_invoice_mode_processing.py
from pathlib import Path

from rdetoolkit.impl.input_controller import ExcelInvoiceChecker
from rdetoolkit.models.rde2types import RdeInputDirPaths

# Configure input paths
input_paths = RdeInputDirPaths(
    inputdata=Path("data/input"),
    invoice=Path("data/invoice"),
    tasksupport=Path("data/tasksupport"),
)

# Create an ExcelInvoice checker; the constructor takes the temporary
# directory for unpacked content (the path below is an example)
excel_checker = ExcelInvoiceChecker(Path("data/temp"))

try:
    # parse() groups the input files, validates the groups, and returns
    # the raw files together with the path to the Excel invoice file
    rawfiles, excel_invoice_file = excel_checker.parse(input_paths.inputdata)
    print("✓ ExcelInvoice parsing and validation successful")

    print(f"Excel invoice file: {excel_invoice_file}")
    print(f"Number of raw data file groups: {len(rawfiles)}")

    for i, group in enumerate(rawfiles):
        print(f"Group {i+1}: {group}")

except Exception as e:
    # Any failure raised while parsing or validating is reported here
    print(f"✗ ExcelInvoice processing error: {e}")

RDEFormat Mode Processing

rde_format_mode_processing.py
from pathlib import Path

from rdetoolkit.impl.input_controller import RDEFormatChecker
from rdetoolkit.models.rde2types import RdeInputDirPaths

# Configure input paths
input_paths = RdeInputDirPaths(
    inputdata=Path("data/input"),
    invoice=Path("data/invoice"),
    tasksupport=Path("data/tasksupport"),
)

# Create an RDEFormat checker; the constructor takes the temporary
# directory for unpacked content (the path below is an example)
rde_checker = RDEFormatChecker(Path("data/temp"))

try:
    # parse() identifies the ZIP file, unpacks it, and returns
    # (RawFiles, None)
    rawfiles, _ = rde_checker.parse(input_paths.inputdata)
    print("✓ RDE format parsing successful")

    # Unpacked content is placed in the temporary directory
    print(f"Temporary directory: {rde_checker.out_dir_temp}")
    print(f"Number of raw data file groups: {len(rawfiles)}")

    for group in rawfiles:
        print(f"  - {group}")

except Exception as e:
    print(f"✗ RDEFormat processing error: {e}")

MultiFile Mode Processing

multifile_mode_processing.py
from pathlib import Path

from rdetoolkit.impl.input_controller import MultiFileChecker
from rdetoolkit.models.rde2types import RdeInputDirPaths

# Configure input paths
input_paths = RdeInputDirPaths(
    inputdata=Path("data/input"),
    invoice=Path("data/invoice"),
    tasksupport=Path("data/tasksupport"),
)

# Create a MultiFile checker; the constructor takes the temporary
# directory (the path below is an example)
multifile_checker = MultiFileChecker(Path("data/temp"))

try:
    # parse() groups ZIP files and other files and returns (RawFiles, None)
    rawfiles, _ = multifile_checker.parse(input_paths.inputdata)
    print("✓ MultiFile parsing successful")

    # RawFiles is a list of tuples of file paths
    print(f"Number of file groups: {len(rawfiles)}")

    for i, group in enumerate(rawfiles):
        print(f"Group {i+1}: {len(group)} files")
        for file_path in group:
            print(f"  - {file_path}")

except Exception as e:
    print(f"✗ MultiFile processing error: {e}")

Integrated Input Control System

integrated_input_control.py
from rdetoolkit.impl.input_controller import (
    InvoiceChecker, ExcelInvoiceChecker,
    RDEFormatChecker, MultiFileChecker
)
from rdetoolkit.models.rde2types import RdeInputDirPaths
from pathlib import Path

class InputModeController:
    """Integrated input mode control system"""

    def __init__(self, input_paths: RdeInputDirPaths, temp_dir: Path = Path("data/temp")):
        self.input_paths = input_paths
        # Each checker takes the temporary directory for unpacked content
        # (the default path is an example)
        self.checkers = {
            "Invoice": InvoiceChecker(temp_dir),
            "ExcelInvoice": ExcelInvoiceChecker(temp_dir),
            "RDEFormat": RDEFormatChecker(temp_dir),
            "MultiFile": MultiFileChecker(temp_dir),
        }

    def detect_input_mode(self) -> str:
        """Automatic detection of input mode"""

        # Check for Excel invoice files
        excel_files = list(self.input_paths.invoice.glob("*.xlsx"))
        if excel_files:
            return "ExcelInvoice"

        # Check for JSON invoice files
        json_files = list(self.input_paths.invoice.glob("*.json"))
        if json_files:
            return "Invoice"

        # Check for ZIP files
        zip_files = list(self.input_paths.inputdata.glob("*.zip"))
        if zip_files:
            return "RDEFormat"

        # Default to MultiFile
        return "MultiFile"

    def process_input(self) -> dict:
        """Execute processing based on detected input mode"""

        detected_mode = self.detect_input_mode()
        print(f"Detected input mode: {detected_mode}")

        try:
            checker = self.checkers[detected_mode]

            # Every checker exposes the same parse(src_dir_input) interface
            rawfiles, excel_invoice_file = checker.parse(self.input_paths.inputdata)

            return {
                "mode": detected_mode,
                "status": "success",
                "checker_type": checker.checker_type,
                "rawfiles": rawfiles,
                # Only ExcelInvoice mode returns a path here; the other
                # modes return None
                "excel_invoice_file": excel_invoice_file,
            }

        except Exception as e:
            return {
                "mode": detected_mode,
                "status": "error",
                "error": str(e)
            }

# Example usage
input_paths = RdeInputDirPaths(
    inputdata=Path("data/input"),
    invoice=Path("data/invoice"),
    tasksupport=Path("data/tasksupport")
)

controller = InputModeController(input_paths)
result = controller.process_input()

print(f"Result: {result}")
if result["status"] == "success":
    print(f"✓ Successfully processed in {result['mode']} mode")
else:
    print(f"✗ Error in {result['mode']} mode: {result['error']}")