SQL Insert or Update
Maintains up-to-date records in a table
Processing
For every input row the step ensures that the table contains a record corresponding to the given keys and data fields.
If there is no record for the given keys yet, a new record is inserted.
If there is a current record for the given keys, the contents of the table record are compared with the data in flow. If the contents are identical, no changes are made. If data fields differ, the table record is updated to reflect the new values.
Settings
Name | Type | Description |
---|---|---|
Connection |
dict |
The database connection to use. Evaluated for each input row |
Schema |
string |
The database schema of the table. Evaluated for each input row |
Table |
string |
Name of the dimension table in the database. Evaluated for each input row |
Commit Size |
long |
Number of rows after which data is committed.
Evaluated once when step initializes |
Fields | ||
Keys |
Key fields of a record. Evaluated for each input row |
|
Data fields |
The fields and values making up the changing attributes of a record. When a field value in the flow is different from the value in the corresponding table record, the table record is updated. Evaluated for each input row |
|
Insert fields |
When a new record is inserted, these values are evaluated and included in the insert. | |
Update fields |
Whenever records are updated, these values are evaluated and are included as fields in the update. | |
Cache | ||
Use Cache |
boolean |
Use a in-memory cache to hold table rows. Evaluated once when step initializes |
Cache Size |
long |
Number of records cached
Evaluated once when step initializes |
Pre-load Cache |
boolean |
If Evaluated once when step initializes |
Pre-load SQL Filter |
string |
If non-empty, this query text is appended as an AND condition to the cache pre-loading query. It should contain a predicate to append to the where clause of the query, for example: This is useful to selectively pre-warm the cache with a specific subset of records. Evaluated once when step initializes |
Pre-load SQL Parameters |
list |
If the pre-load cache filter contains Evaluated once when step initializes |
Bloom Filter | ||
Use Bloom Filter |
boolean |
If During processing, the bloom filter is then used to determine whether any records with given business keys are present in the table. The filter is most useful during initial loads, when no table records are present for most or all incoming rows. It allows skipping the corresponding lookup query. Evaluated once when step initializes |
Bloom Filter Size |
long |
The expected number of distinct keys the bloom filter should be able to handle. This number is used to ensure the bloom filter operates with an expected false positive rate of 5%. That is 5% of the time the filter might indicate that a set of keys might be present in the table, when in fact it is not. This number should be close to the number of distinct keys in the fully loaded table. Evaluated once when step initializes |
Results
This step provides no results