R/read.genesys.R
read.genesys.Rd
read.genesys reads PGR data from a Darwin Core - germplasm zip archive
downloaded from the Genesys database and creates a flat file data.frame
from it.
read.genesys(zip.genesys, scrub.names.space = TRUE, readme = TRUE)
zip.genesys
A character vector giving the file path to the zip file downloaded from Genesys.
scrub.names.space
logical. If TRUE, all space characters are
removed from the name field in the names extension (see Details).
readme
logical. If TRUE, the Genesys zip file readme is printed
to the console.
A data.frame with the flat file form of the genesys data.
This function imports into the R environment the PGR data
downloaded from the Genesys database https://www.genesys-pgr.org/ as a
Darwin Core - germplasm (DwC-germplasm) zip archive. The different csv files
in the archive are merged into a single flat file data.frame.
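The merging step can be pictured with a minimal sketch in base R. This is only an illustration under stated assumptions, not the implementation used by read.genesys: the tables core, geo and coll and the key column id are hypothetical stand-ins, and the actual file and column names in a Genesys archive may differ.

```r
# Toy stand-ins for csv files in the archive. The table names and the
# key column "id" are hypothetical, not the actual Genesys schema.
core <- data.frame(id = c(1, 2, 3),
                   acceNumb = c("IC 100", "IC 200", "EC 300"),
                   stringsAsFactors = FALSE)
geo <- data.frame(id = c(1, 3),
                  latitude = c(17.4, 28.6),
                  stringsAsFactors = FALSE)
coll <- data.frame(id = c(2, 3),
                   collNumb = c("AB 12", "CD 34"),
                   stringsAsFactors = FALSE)

# Left-join each extension table onto the core table by the shared key,
# so every accession in the core table keeps one row in the flat result.
flat <- Reduce(function(x, y) merge(x, y, by = "id", all.x = TRUE),
               list(core, geo, coll))
```

Accessions missing from an extension table get NA in the corresponding columns, which keeps the flat data.frame at one row per accession in this toy case.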
All space characters can be removed from the fields corresponding to
accession names, such as acceNumb, collNumb, ACCENAME, COLLNUMB, DONORNUMB and
OTHERNUMB, using the argument scrub.names.space. This facilitates the creation
of a KWIC index with the KWIC function and subsequent
matching operations to identify probable duplicates with the
ProbDup function.
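The effect of removing spaces from name fields can be sketched in base R as follows. This is an assumption-laden illustration rather than the internal code of read.genesys: the data frame acc is made up, and only two of the fields listed above are shown.

```r
# Made-up accession records; only the column names come from the
# field list above, the values are invented for illustration.
acc <- data.frame(acceNumb = c("IC 100", " EC 300 "),
                  collNumb = c("AB 12", "CD  34"),
                  stringsAsFactors = FALSE)

# Fields to scrub (a subset of the name fields, chosen for this sketch).
name_fields <- c("acceNumb", "collNumb")

# Remove every whitespace character from the selected fields.
acc[name_fields] <- lapply(acc[name_fields],
                           function(x) gsub("[[:space:]]+", "", x))
```

With spaces removed, variants such as "IC 100" and "IC100" collapse to the same string, which is what makes later keyword matching more reliable.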
The argument readme can be used to print the readme file in the
archive to the console, if required.
# \dontshow{
threads_dt <- data.table::getDTthreads()
threads_OMP <- Sys.getenv("OMP_THREAD_LIMIT")
data.table::setDTthreads(2)
Sys.setenv(`OMP_THREAD_LIMIT` = 2)
# }
if (FALSE) {
# Import the DwC-Germplasm zip archive "genesys-accessions-filtered.zip"
PGRgenesys <- read.genesys("genesys-accessions-filtered.zip",
                           scrub.names.space = TRUE, readme = TRUE)
}
# \dontshow{
data.table::setDTthreads(threads_dt)
Sys.setenv(`OMP_THREAD_LIMIT` = threads_OMP)
# }