Revision as of 06:01, December 12, 2019 by Tanyai (talk | contribs)

Tanyai6

About Data Export Capability

The Data Export capability, which is enabled in select Cloud deployments, periodically copies the data stored in the Genesys historical database (called the Info Mart database) into local .csv files. The data is then available for import into a data warehouse (the target database) for archiving or custom reporting. Starting with release 8.5.011.22, Genesys Info Mart also supports Data Export in on-premises deployments.

The export job, Job_ExportGIM, exports data from fact and dimension tables that are part of the Genesys Info Mart dimensional model and creates a .zip archive containing individual .csv files, one file per database table. The .csv files are formatted in accordance with RFC 4180 (https://www.ietf.org/rfc/rfc4180.txt).

The output data files are encoded using the UTF8 format by default. On-premises customers can specify a different character encoding for exported files (see Schedule and other export job settings).

File/directory structure

The export is incremental and uses special audit keys to identify changes in data since the last export. At each export, a chunk of exported data is written into a separate folder that is named according to the following naming convention: export_XXX

where XXX consists of:

  • an audit key identifier (audit key high-water mark)
  • the maximum date of data contained in all previous exports and this export, in GMT time zone, written in the YYYY_MM_DD_HH_MI_SS format.

The output folder contains several .zip files, as follows:

  • export_XXX.zip — zip file with exported data. Each table is stored in a separate file with a file name in the format <table-name>.csv—for example, interaction_fact.csv. Within a .csv file, a header line identifies the table column names. Note that, within the exported .csv files, nulls and empty strings are represented as empty fields.
  • export_XXX.zip.sha1 — checksum for export_XXX.zip. You can validate the checksum with the sha1sum program (https://en.wikipedia.org/wiki/Sha1sum) to verify that the .zip file arrived complete on the receiving side.
  • export_XXX.extracted.xml — metadata about export_XXX.zip.
Important
The subfolder .gim is reserved for internal use.

Checksums are also generated for each individual table .csv file. If a table does not have any changes since the last export, nothing is written for that table.
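The checksum validation on the receiving side can be sketched in Python as an alternative to running sha1sum by hand. This is a minimal sketch, not a Genesys-provided tool. It assumes that the .sha1 file uses the usual sha1sum layout, in which the first whitespace-separated token is the hexadecimal digest; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_export_zip(zip_path):
    """Return True if the .zip matches the digest in its sibling .sha1 file."""
    zip_path = Path(zip_path)
    sha_path = zip_path.with_name(zip_path.name + ".sha1")
    # Assumption: the first token in the .sha1 file is the hex digest,
    # as in the output of the sha1sum program.
    tokens = sha_path.read_text(encoding="ascii").split()
    if not tokens:
        return False
    return tokens[0].lower() == sha1_of_file(zip_path)
```

A receiving process might call verify_export_zip on each export_XXX.zip and skip the import of that folder if the function returns False.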

Export metadata file

The export_XXX.extracted.xml metadata file includes information about the export file, as shown in the example below.

Example

<info>
<created-ts>1521091600</created-ts>
<gim-schema-version>8.5.009.15</gim-schema-version>
<gim-version>8.5.009.20</gim-version>
<hwm-from audit-key="13" created-ts="1520919983"/>
<hwm-to audit-key="200074" created-ts="1520995485"/>
<max-data-ts>1521006157</max-data-ts>
</info>

Where:

  • created-ts — The UTC timestamp, in seconds since January 1, 1970, for the execution of the export.
  • gim-schema-version — The version of the Info Mart database schema used to populate the tables; if export views are used, this schema version is not necessarily the same as the schema version reflected by the export views and actually used for the export.
  • gim-version — The version of Genesys Info Mart Server that created the export files.
  • hwm-from — The starting point of the data in the export by audit key and the create time, in UTC seconds, of that audit key.
  • hwm-to — The ending point of the data in the export by audit key and the create time, in UTC seconds, of that audit key.
  • max-data-ts — The maximum time, in UTC seconds, of the data contained in all previous exports and this export.

The hwm-to value of one export must match the hwm-from value of the next export. Use these values to verify that no intermediate export file has been missed on the receiving side. For example, the export that follows the example .xml file above is expected to have hwm-from audit-key = 200074.
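The high-water-mark check between successive exports can be sketched as follows. This is an illustrative sketch that relies only on the element and attribute names shown in the example metadata file; the function names are hypothetical, and the caller is assumed to supply the metadata files in export order.

```python
import xml.etree.ElementTree as ET


def read_hwm(xml_text):
    """Return (hwm-from audit key, hwm-to audit key) from one metadata file."""
    root = ET.fromstring(xml_text)
    hwm_from = int(root.find("hwm-from").get("audit-key"))
    hwm_to = int(root.find("hwm-to").get("audit-key"))
    return hwm_from, hwm_to


def find_gaps(metadata_texts):
    """Check a sequence of metadata files, given in export order.

    Returns a list of (index, expected_hwm_from, actual_hwm_from) tuples,
    one for each export whose hwm-from does not match the previous hwm-to.
    """
    gaps = []
    previous_to = None
    for index, xml_text in enumerate(metadata_texts):
        hwm_from, hwm_to = read_hwm(xml_text)
        if previous_to is not None and hwm_from != previous_to:
            gaps.append((index, previous_to, hwm_from))
        previous_to = hwm_to
    return gaps
```

An empty result means that every export starts where the previous one ended; any entry in the list points to a place where an export folder may be missing.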

The maximum time span of data in any single export file is one day. For example, if historical reporting was not available for two days (because, for instance, the server or database has been down), the export will continue from the last exported high-water mark and move ahead one day in the data. The next export will continue from there, exporting no more than one day at a time, until the export has caught up with the current data.

Target database

Genesys provides an SQL script to assist you in creating a target schema into which to import the exported Info Mart data. (The script is update_target_gim_db.sql, update_target_gim_db_partitioned.sql, update_target_gim_db_multilang.sql, or update_target_gim_db_multilang_partitioned.sql in the sql_scripts folder in your Genesys Info Mart installation package.) Execute the script against your target database to create a schema consistent with the Info Mart schema. Be sure to use an update_target_*.sql script from the Genesys Info Mart installation package that is currently installed or that you are about to deploy.

You can also use the script to migrate your target database if the Info Mart database schema changes after you have set up your target database, and either you are not using export views or your export views have been updated to include the schema changes. The update_target_*.sql script enables you to migrate your target database directly from any Info Mart schema version to any later schema version, by updating the target schema with new tables or columns if they are missing.

When to run the update_target_*.sql script to migrate your target schema following an Info Mart migration depends on your business needs, import processing, and consumption queries, as well as on whether you are using export views.

  • If you are not using export views, you might need to update your target schema and/or modify your import and other consumption queries almost immediately, before you try to import the next batch of exported data.
  • If you are using export views, you can choose whether you want your export to include new data available in the Info Mart database. If you do, you can continue to export data using the existing export views, while you prepare your consumption queries (for example, you can test adjusted queries against the migrated Info Mart database).
    When you are ready, migrate your target schema by executing the update_target_*.sql script from the Genesys Info Mart installation package that is currently installed. Then run the migration job to refresh your export views, as described above.

Custom user-data tables—limitation

While the export job does export custom user-data tables, the update_target_*.sql script does not include custom tables. You must create or migrate custom user-data tables separately in your target schema.

Consumption

The exported table data typically contains a mix of created and updated rows. For this reason, you should merge newly exported data with existing data loaded from prior exports. For example, first load the export files into a temporary table, and then use an SQL merge statement based on the primary key of the table to merge the data into a permanent target table that holds the cumulative data from prior exports.
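The staging-and-merge pattern described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for your actual target database, its INSERT OR REPLACE statement stands in for an SQL MERGE statement, and the demo_fact table and its columns are hypothetical, not part of the Info Mart schema.

```python
import sqlite3


def merge_batch(conn, rows):
    """Load one exported batch into a staging table, then merge by primary key."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS staging "
        "(id INTEGER PRIMARY KEY, status TEXT)"
    )
    conn.execute("DELETE FROM staging")  # start each batch from an empty staging table
    conn.executemany("INSERT INTO staging (id, status) VALUES (?, ?)", rows)
    # Rows with a new primary key are inserted; rows with an existing
    # primary key replace the previously loaded version.
    conn.execute(
        "INSERT OR REPLACE INTO demo_fact (id, status) "
        "SELECT id, status FROM staging"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_fact (id INTEGER PRIMARY KEY, status TEXT)")
merge_batch(conn, [(1, "created"), (2, "created")])   # first export
merge_batch(conn, [(2, "updated"), (3, "created")])   # next export: one update, one new row
rows = conn.execute("SELECT id, status FROM demo_fact ORDER BY id").fetchall()
```

After the second batch, the target table holds one row per primary key, with row 2 reflecting the later export.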

Process the export folders in order by folder name.

If necessary, you can restart the export data stream from the beginning or from a fixed date. Also, you can re-export a time span backwards from the most recent export.

Data decoding

The data is exported into .csv files that are formatted in accordance with RFC 4180 (https://www.ietf.org/rfc/rfc4180.txt). The exported data must be decoded properly before it is imported into the target database for custom reporting or archiving purposes. Customers should perform decoding of the exported .csv files according to the guidelines in RFC 4180. Properly decoded data is expected to fit into the target schema that is created by Genesys-provided scripts without the need to increase field sizes.

Handling Unicode Characters

Special considerations are required for data that includes Unicode characters. By default, Genesys Info Mart encodes the exported data using Eight-bit Unicode Transformation Format (UTF8) character encoding. However, to accommodate Unicode characters in the respective database fields, both the original Info Mart database and the target database must be set up for UTF8 encoding when the database is created:

  • For Microsoft SQL Server, specify the NVARCHAR(N) column type
  • For Oracle, specify the character set AL32UTF8, which creates VARCHAR2(N CHAR) column types
  • For PostgreSQL, choose the default character set, UTF8, which creates VARCHAR(N) column types

Subsequently, a special "multilanguage" version of an SQL script is required to create both the Info Mart schema and a target schema with the fields that store data with Unicode characters.

On-premises customers are advised to keep the default value of utf8 for the gim-export/output-files-encoding configuration option, to ensure that Unicode characters are preserved in exported files (see Schedule and other export job settings).

Finally, when exported data is decoded before being imported into the target database, .csv file decoding must be done using UTF8 encoding.

Following the above guidelines will help to avoid issues such as data corruption caused by improperly decoded data, or data import failures caused by data that is longer than the column size in the target database.
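The decoding guidelines in this section can be sketched in Python, whose standard csv module handles the quoting rules of RFC 4180 (quoted fields, doubled quotes, embedded line breaks). This is a minimal sketch, assuming the default UTF8 output encoding; the function name read_table is hypothetical.

```python
import csv
import io
import zipfile


def read_table(zip_path, table_name, encoding="utf-8"):
    """Read <table_name>.csv from an export .zip into a list of dicts.

    The header line of the .csv file supplies the dictionary keys.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(f"{table_name}.csv") as raw:
            # newline="" lets the csv module handle line breaks itself,
            # including line breaks embedded inside quoted fields.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return list(csv.DictReader(text))
```

Because nulls and empty strings are both exported as empty fields, every such field is returned as an empty string; whether to load it as NULL or as an empty string in the target database is a decision for your import process.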
