Data Dictionary:

Smart_Insider_Retial_Quantitative_Specification_Sa.pdf

Data assets:

smartinsider_YYYYMMDDHHMM.ttx - data files are delivered in a tab-delimited (.ttx) format. Initially, multiple files were sent per day. However, to help D&B combine the data into a single schema, SmartInsider has opted to provide a single daily refresh only.
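As an illustration of the naming convention above, the timestamp in a file name could be extracted as in the following minimal sketch. The function name parse_delivery_timestamp and the strict 12-digit pattern are assumptions made for illustration. The sketch does not cover time zones, which this page does not specify.

```python
import re
from datetime import datetime

# Matches the documented convention smartinsider_YYYYMMDDHHMM.ttx
FILENAME_RE = re.compile(r"^smartinsider_(\d{12})\.ttx$")


def parse_delivery_timestamp(filename: str) -> datetime:
    """Return the timestamp embedded in a SmartInsider file name.

    Hypothetical helper: raises ValueError when the name does not
    follow the documented pattern.
    """
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M")
```

For example, parse_delivery_timestamp("smartinsider_202108060930.ttx") yields August 6, 2021 at 09:30.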

...

  • Data files contain historical data (from August 6, 2021 to present).

  • All files should be ingested and they all belong to a single schema.

  • The work_stream_name (for source record ID) is smartinsider_altdata

  • Update type: delta/incremental

  • Update frequency: daily

  • Data ingestion approach: TBD

    • Please skip the fields “LastSignalUpdate” and “DeliveryDateTime” if you see them in any raw data files. These fields should not be included in our Snowflake tables.

    • Files should be ingested in the order they appear on the SFTP.

    • As duplicates can still be found in the recent data files, we need to discuss whether to drop duplicate records before appending them and, if so, which fields should define a unique record.
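The two record-level points above (skipping fields and, if agreed, dropping duplicates) could be handled as in the following minimal sketch. The ingestion approach and the unique key are still TBD, so the helper names and the key_fields parameter are hypothetical. When no key is given, the sketch treats only fully identical rows as duplicates.

```python
# Fields that must not reach the Snowflake tables
SKIPPED_FIELDS = {"LastSignalUpdate", "DeliveryDateTime"}


def drop_skipped_fields(row):
    """Return a copy of the row without the skipped fields."""
    return {k: v for k, v in row.items() if k not in SKIPPED_FIELDS}


def dedupe(rows, key_fields=None):
    """Keep the first occurrence of each record, preserving order.

    key_fields is a placeholder for whichever fields are eventually
    agreed to define a unique record; None compares the whole row.
    """
    seen = set()
    unique = []
    for row in rows:
        fields = key_fields if key_fields is not None else sorted(row)
        key = tuple((f, row.get(f)) for f in fields)
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique
```

Keeping the first occurrence is itself an assumption. If later deliveries are meant to supersede earlier ones, the opposite choice may be needed.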

Character encoding:

Files are UTF-8 encoded. Please make sure you read files with this encoding.
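A minimal sketch of reading a tab-delimited file with an explicit UTF-8 encoding, using only the Python standard library. It assumes the first line of each file is a header row and that fields are not quoted. Neither is confirmed on this page, and the function names are illustrative.

```python
import csv


def read_ttx(stream):
    """Parse tab-delimited text from an open text stream into dicts."""
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def read_ttx_file(path):
    """Open a .ttx file explicitly as UTF-8 and parse it."""
    # newline="" lets the csv module handle line endings itself
    with open(path, encoding="utf-8", newline="") as handle:
        return read_ttx(handle)
```

Passing the encoding explicitly avoids relying on the platform default, which is not UTF-8 on every system.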

...

Match Logic:

  • Match approach: TBD. D&B mentions that the new match logic for SmartInsider will involve exclusion criteria (additional input parameters to be sent to the API).

Snowflake Tables to Share:

...