Changes

Nick Bywell · d6eb0f36
--- a/Configuration-files.md
+++ b/Configuration-files.md
+[[_TOC_]]
+
+## Introduction
+
+This section describes the configuration files, for reference purposes. To understand how the hierarchical csv files relate to the various configuration files, read the [Configuration](https://git.lse.ac.uk/hub/lse-digital-toolkit/-/wikis/Configuration) and [Configuration scenarios](https://git.lse.ac.uk/hub/lse-digital-toolkit/-/wikis/Configuration-scenarios) sections, then experiment by changing the content of the hierarchical csv files and making corresponding changes to the associated configuration files.
+
+## gfs_arkivum_column_header_info.csv
+
+This file matters only if you intend to upload a zip file to Arkivum's Digital Preservation Platform (Perpetua), and even then only if you wish to modify the default set of column headers in the metadata.csv file (a file that forms part of the upload package created by the gfs_create_arkivum_upload.py script).
+
+The functional significance of this configuration file is as follows: when the gfs_create_arkivum_upload.py script is executed, the content of any isadg column in a tranche csv file whose column header also appears in this file is transferred to the metadata.csv file.
+
+The names and sequence of the column headers in this configuration file must match the way that Arkivum have configured the user's instance of Perpetua.
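+
+The column transfer described above can be pictured with a minimal sketch. It is not the code of gfs_create_arkivum_upload.py: the function name, the column names, and the choice to skip a configured header that is absent from the tranche are all assumptions made for illustration.
+
```python
import csv
import io


def select_metadata_columns(tranche_rows, configured_headers):
    """Keep only the tranche columns whose header is listed in the configuration.

    The output follows the sequence given in the configuration, not the
    sequence found in the tranche file.
    """
    rows = list(tranche_rows)
    present = set(rows[0].keys()) if rows else set()
    kept = [h for h in configured_headers if h in present]
    return kept, [{h: row[h] for h in kept} for row in rows]


# Hypothetical tranche content; these column names are illustrative only.
tranche_csv = io.StringIO(
    "isadg.title,isadg.date,gfs.note\n"
    "Minutes,1921,internal\n"
)
# Hypothetical content of the configuration file, in the configured sequence.
configured = ["isadg.date", "isadg.title", "isadg.scopeAndContent"]

headers, metadata_rows = select_metadata_columns(
    csv.DictReader(tranche_csv), configured
)
print(headers)
print(metadata_rows)
```
+
+In this sketch, isadg.scopeAndContent is configured but absent from the tranche, so it is skipped, while gfs.note is present in the tranche but not configured, so it is not transferred.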
+
+## gfs_project_core_column_header_info.csv
+
+This file contains a list of column headers that must be present within a corresponding project csv file. Its function is to allow a super-administrator to devolve responsibility for configuring some areas of the GFS to other administrators while retaining some control over what constitutes a core set of column headers that should always be present within a project csv file.
+
+## gfs_project_validation_info.csv
+
+This file allows a hierarchical project csv file to be validated with regard to the existence and sequence of the column headers it contains. It also enables the administrator to flag, for each column, whether it is mandatory for the cataloguer to enter content in that column of the project csv file.
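+
+As a rough illustration of the two checks described above (header existence and sequence, and mandatory content), the sketch below assumes that each row of the validation file can be reduced to a (header, mandatory) pair. That layout, the function names, and all header names other than gfs.libraryProcessing are hypothetical; the toolkit's own scripts may work differently.
+
```python
def validate_headers(actual_headers, validation_rows):
    """Return a list of error messages; an empty list means the headers are valid.

    validation_rows is a sequence of (header, mandatory) pairs in the
    expected order.
    """
    expected = [header for header, _ in validation_rows]
    errors = []
    missing = [h for h in expected if h not in actual_headers]
    unexpected = [h for h in actual_headers if h not in expected]
    if missing:
        errors.append(f"missing headers: {missing}")
    if unexpected:
        errors.append(f"unexpected headers: {unexpected}")
    # Only report a sequence problem when the set of headers is otherwise right.
    if not errors and list(actual_headers) != expected:
        errors.append("headers are present but in the wrong sequence")
    return errors


def find_empty_mandatory_cells(rows, validation_rows):
    """Return (row_number, header) for each mandatory cell left empty."""
    mandatory = [h for h, flag in validation_rows if flag]
    return [
        (number, header)
        for number, row in enumerate(rows, start=1)
        for header in mandatory
        if not row.get(header, "").strip()
    ]


# Hypothetical validation rows: (column header, is content mandatory?)
validation = [("gfs.identifier", True), ("gfs.libraryProcessing", True), ("gfs.note", False)]

print(validate_headers(["gfs.libraryProcessing", "gfs.identifier", "gfs.note"], validation))
print(find_empty_mandatory_cells(
    [{"gfs.identifier": "T001", "gfs.libraryProcessing": "", "gfs.note": ""}],
    validation,
))
```
+
+The same approach would apply to a tranche csv file, using the corresponding tranche validation file.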
+
+## gfs_tranche_core_column_header_info.csv
+
+This file contains a list of column headers that must be present within a corresponding tranche csv file. Its function is to allow a super-administrator to devolve responsibility for configuring some areas of the GFS to other administrators while retaining some control over what constitutes a core set of column headers that should always be present within a tranche csv file.
+
+## gfs_tranche_library_column_header_info.csv
+
+This file contains a list of "Library Processing"-related column headers that must be present in a tranche csv file when the content of the "gfs.libraryProcessing" column has been set to "y" for the tranche in the corresponding project csv file. It ensures that the gfs_create_arkivum_upload.py script fails in a controlled manner if one of the "Library Processing" column headers is missing from a tranche csv file.
+
+## gfs_tranche_validation_info.csv
+
+This file allows a tranche csv file to be validated with regard to the existence and sequence of the column headers it contains. It also enables the administrator to flag whether it is mandatory for a cataloguer to enter content in a particular column of the tranche csv file.
+
+## gfs_folder_type_info.csv
+
+This file pairs up folder names (that describe a particular file-type or format) with the name of the corresponding file-extension.
+
+The functional significance of this configuration file is that, when a script is run, the file is consulted to check that the folder-type names quoted on the command line are valid. If a user mistypes a folder-type on the command line, it is best that the script terminates straight away.
+
+If there is a requirement for a new file-type/format to be stored within the GFS, a new row should be inserted into this file at the appropriate alphabetical position.
+
+The reason for having both a folder name and a file extension name is to cater for those instances when more than one file with the same file extension is present in one child-item-set. For example, one tif file might give rise to both Alto and Exif derivatives. Both have the ".xml" file extension, so a mechanism is needed to distinguish between them.
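+
+The early check on folder-type names might look something like the sketch below. The mapping stands in for the content of the configuration file; its entries, the function name, and the error wording are assumptions for illustration rather than the toolkit's actual code.
+
```python
import sys

# Hypothetical folder-type to file-extension pairs. Note that two
# folder-types ("alto" and "exif") share the same "xml" extension.
FOLDER_TYPES = {"alto": "xml", "exif": "xml", "jpg": "jpg", "tif": "tif"}


def find_unknown_folder_types(requested, known=FOLDER_TYPES):
    """Return the requested folder-type names that are not configured."""
    return [name for name in requested if name not in known]


def check_arguments(folder_type_args):
    """Return 0 if every folder-type is valid, otherwise report and return 1."""
    unknown = find_unknown_folder_types(folder_type_args)
    if unknown:
        # Report the problem before any processing starts.
        print(f"Unknown folder-type(s): {', '.join(unknown)}", file=sys.stderr)
        return 1
    return 0


print(check_arguments(["alto", "tif"]))
print(check_arguments(["atlo", "tif"]))
```
+
+A real script would typically pass the returned value to sys.exit so that a mistyped folder-type such as "atlo" stops the run immediately.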
+
+## gfs_folder_types_excluded_from_total_check_info.csv
+
+When the gfs_validate_tranche_folder.py script is executed, it checks that the number of files of each derivative type equals the number of master files. For example, it checks that there are equal numbers of alto, jpg, and tif files.
+
+However, when a set of files has been concatenated to form, for example, a pdf file, no equivalence in number is expected. If the folder-type is listed within this configuration file, the number equivalence is not tested for that folder-type, and spurious error messages are avoided.
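+
+The effect of the exclusion list can be sketched as follows. This is an illustration only: the function name, the sample counts, and the assumption that tif is the master folder-type are hypothetical, and gfs_validate_tranche_folder.py may implement the check differently.
+
```python
def find_count_mismatches(file_counts, master_type, excluded_types):
    """Return the folder-types whose file count differs from the master's.

    Folder-types listed in excluded_types are never compared.
    """
    master_count = file_counts[master_type]
    return {
        folder_type: count
        for folder_type, count in file_counts.items()
        if folder_type != master_type
        and folder_type not in excluded_types
        and count != master_count
    }


# Hypothetical file counts per folder-type within one tranche folder.
counts = {"tif": 12, "jpg": 12, "alto": 11, "pdf": 1}

# With "pdf" excluded, only the genuine shortfall in "alto" is reported.
print(find_count_mismatches(counts, "tif", {"pdf"}))

# Without the exclusion, the single concatenated pdf is flagged as well.
print(find_count_mismatches(counts, "tif", set()))
```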
+
+[Return to documentation home page](https://git.lse.ac.uk/hub/lse-digital-toolkit/-/wikis/LSE-Digital-Toolkit)
\ No newline at end of file