What is a dataset?

A gather360 dataset is a core entity that users can customise within the platform. It contains detailed settings and requirements for your specific data need. These settings define three core things about your data:

  1. Data Requirements: What data is needed and how it should be structured

  2. Mapping, Filter and Validation Rules: How data must be governed and quality assured to ensure it meets requirements

  3. Transformation and Enrichment Rules: How data should be formatted to meet internal needs

These settings allow gather360 to process, govern and quality assure data to meet your requirements. gather360 also records all processing activities in an individual audit log for every row of data.

Typically, an organisation will have multiple datasets for the different data needs in its business. Each of these datasets may have different configurations and rules. Once published, a dataset can accept data uploads from contributing suppliers or users. When a data upload is submitted to a dataset, gather360 manages the upload in accordance with your settings.

A dataset has three possible states: live, draft or archived. Users can build new datasets or amend existing datasets.

Data Requirements

"Data requirements" is a collective term for the settings in a dataset.

Data requirements include the data fields required, the structure of the data table, the validation and transformation rules to be applied, and any required mappings.

Data requirements also explain the business purpose of the dataset and/or the data product required. This includes:

  1. A text description of the purpose of the data required and the end data product

  2. A text description of the current challenges faced in getting the data for this data requirement

  3. A list of the fields required and the expected data type, ideally with sample values

Target Field

A target field is a column within a dataset. Target fields have four possible formats:

  • Integer

  • String

  • Date

  • Number

gather360 automatically verifies that fields in data uploads are in the same format as the target field.
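
To illustrate the idea of a format check, here is a minimal Python sketch. It is not gather360's implementation: the `FORMAT_TYPES` table and the `matches_format` function are hypothetical names, and the choice of Python types for each of the four formats is an assumption made for the example.

```python
from datetime import date

# Hypothetical mapping from the four target field formats to Python types.
FORMAT_TYPES = {
    "Integer": int,
    "String": str,
    "Date": date,
    "Number": (int, float),
}

def matches_format(value, target_format):
    """Return True if the value is in the format the target field expects."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        return False
    return isinstance(value, FORMAT_TYPES[target_format])

print(matches_format(42, "Integer"))    # an integer value in an Integer field
print(matches_format("42", "Integer"))  # a string value in an Integer field
```

In this sketch the first call passes the check and the second does not, because the text "42" is a String rather than an Integer.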

Mapping, Validation and Filter Rules

These rules govern the quality and scope of your data. The user configures these rules to ensure data meets the desired target state.

Mapping Rule

A mapping rule defines how source fields relate to target fields.
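
As a rough illustration, a mapping rule can be thought of as a lookup from source field names to target field names. The Python sketch below is an assumption for explanatory purposes only: the field names, `MAPPING_RULE` and `apply_mapping` are hypothetical, and dropping unmapped source fields is a choice made for the example rather than documented gather360 behaviour.

```python
# Hypothetical mapping rule: source field name -> target field name.
MAPPING_RULE = {"Cust Name": "customer_name", "Amt": "amount"}

def apply_mapping(row, mapping):
    """Rename the source fields of one row to their target fields.

    Source fields without a mapping are dropped (an assumption of this sketch).
    """
    return {mapping[src]: value for src, value in row.items() if src in mapping}

source_row = {"Cust Name": "Acme", "Amt": 120, "Notes": "n/a"}
mapped_row = apply_mapping(source_row, MAPPING_RULE)
print(mapped_row)
```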

Filter Rule

A filter rule enables the user to define which rows of data should enter the data store.
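
Conceptually, a filter rule is a condition that each row either meets or does not meet. The following Python sketch shows the idea under stated assumptions: `filter_rows`, the `country` field and the sample rows are all hypothetical and are not taken from gather360.

```python
def filter_rows(rows, condition):
    """Keep only the rows for which the filter condition is true."""
    return [row for row in rows if condition(row)]

rows = [
    {"country": "UK", "amount": 10},
    {"country": "FR", "amount": 5},
    {"country": "UK", "amount": 7},
]

# Hypothetical filter rule: only rows for the UK enter the data store.
uk_rows = filter_rows(rows, lambda row: row["country"] == "UK")
print(len(uk_rows))
```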

Validation Rule

A validation rule specifies logical conditions that data must meet to enter the data store.

Data Error

A data error occurs when uploaded data does not meet the conditions set out in a validation rule. gather360 flags data errors to the supplier or user uploading the data. There are two types of data error: a Warning and a Critical Error.

  • Warning: gather360 flags the data error to the uploader but won't prevent submission. The error can be optionally resolved and re-tested. If the user chooses not to fix this error, the data submission will be flagged and made available for analysis in the data layer.

  • Critical Error: gather360 flags the data error to the uploader and will prevent the data upload from being submitted. The user must resolve the error and repeat the validation test before they can submit data.
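
The difference between the two error types can be sketched in Python as follows. This is an illustrative model only, not gather360's implementation: the `ValidationRule` class, the `validate` function, the rule names and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ValidationRule:
    name: str
    condition: Callable[[dict], bool]  # returns True when the row passes
    severity: str                      # "warning" or "critical"

def validate(rows, rules):
    """Return the list of data errors and whether submission is allowed."""
    errors = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.condition(row):
                errors.append((index, rule.name, rule.severity))
    # Warnings are flagged but do not block submission; critical errors do.
    can_submit = all(severity != "critical" for _, _, severity in errors)
    return errors, can_submit

rules = [
    ValidationRule("amount_present", lambda r: r.get("amount") is not None, "critical"),
    ValidationRule("amount_under_limit",
                   lambda r: r.get("amount") is None or r["amount"] <= 1000,
                   "warning"),
]

errors, can_submit = validate([{"amount": 50}, {"amount": 5000}], rules)
print(errors, can_submit)
```

Here the second row breaks only the warning-level rule, so the error is flagged but the upload could still be submitted. A row with a missing amount would break the critical rule and block submission until it is fixed.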

Transformation and Enrichment Rules

Transformation Rule

Transformation rules change data fields to ensure data is in the correct format. These rules can split or concatenate values and change field formats.

Enrichment Rule

Enrichment rules can pull fields from other datasets to enrich your data output. These can replace a field or be created as an additional field.
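
An enrichment rule behaves much like a lookup against another dataset. The Python sketch below illustrates this under stated assumptions: the `enrich` function, its parameters and the supplier data are hypothetical. Passing an existing field name as `target` replaces that field, while a new name creates an additional field.

```python
def enrich(rows, lookup_rows, key, field, target=None):
    """Add (or replace) a field on each row using values from another dataset.

    key    - field shared by both datasets, used to match rows
    field  - field to pull from the lookup dataset
    target - name of the field to write; defaults to the pulled field's name
    """
    target = target or field
    lookup = {r[key]: r[field] for r in lookup_rows}
    # Rows with no match in the lookup dataset get None (a choice of this sketch).
    return [{**row, target: lookup.get(row[key])} for row in rows]

orders = [{"supplier_id": 1, "amount": 10}, {"supplier_id": 3, "amount": 7}]
suppliers = [{"supplier_id": 1, "region": "North"}, {"supplier_id": 2, "region": "South"}]

enriched = enrich(orders, suppliers, key="supplier_id", field="region")
print(enriched)
```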
