Close

Reguła automatyzacji Jira podczas scalania pull requestu

Posted by: AJ Welch

MySQL provides a number of useful statements when it is necessary to INSERT rows after determining whether that row is, in fact, new or already exists.

Below we’ll examine the three different methods and explain the pros and cons of each in turn so you have a firm grasp on how to configure your own statements when providing new or potentially existing data for INSERTION.


Nie ograniczanie się do Agile


Using INSERT IGNORE effectively causes MySQL to ignore execution errors while attempting to perform INSERT statements. This means that an INSERT IGNORE statement which contains a duplicate value in a UNIQUE index or PRIMARY KEY field does not produce an error, but will instead simply ignore that particular INSERT command entirely. The obvious purpose is to execute a large number of INSERT statements for a combination of data that is both already existing in the database as well as new data coming into the system.

For example, our books table might contain a few records already:

mysql> SELECT * FROM books LIMIT 3;
+----+-------------------------+---------------------+----------------+
| id | title                   | author              | year_published |
+----+-------------------------+---------------------+----------------+
|  1 | In Search of Lost Time  | Marcel Proust       |           1913 |
|  2 | Ulysses                 | James Joyce         |           1922 |
|  3 | Don Quixote             | Miguel de Cervantes |           1605 |
+----+-------------------------+---------------------+----------------+
3 rows in set (0.00 sec)

If we have a large batch of new and existing data to INSERT and part of that data contains a matching value for the id field (which is a UNIQUE PRIMARY_KEY in the table), using a basic INSERT will produce an expected error:

mysql> INSERT INTO books
    (id, title, author, year_published)
VALUES
    (1, 'Green Eggs and Ham', 'Dr. Seuss', 1960);
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

On the other hand, if we use INSERT IGNORE, the duplication attempt is ignored and no resulting errors occur:

mysql> INSERT IGNORE INTO books
    (id, title, author, year_published)
VALUES
    (1, 'Green Eggs and Ham', 'Dr. Seuss', 1960);
Query OK, 0 rows affected (0.00 sec)

Nie ograniczanie się do Agile


In the event that you wish to actually replace rows where INSERT commands would produce errors due to duplicate UNIQUE or PRIMARY KEY values as outlined above, one option is to opt for the REPLACE statement.

When issuing a REPLACE statement, there are two possible outcomes for each issued command:

  • No existing data row is found with matching values and thus a standard INSERT statement is performed.
  • A matching data row is found, causing that existing row to be deleted with the standard DELETE statement, then a normal INSERT is performed afterward.

For example, we can use REPLACE to swap out our existing record of id = 1 of In Search of Lost Time by Marcel Proust with Green Eggs and Ham by Dr. Seuss:

mysql> REPLACE INTO books
    (id, title, author, year_published)
VALUES
    (1, 'Green Eggs and Ham', 'Dr. Seuss', 1960);
Query OK, 2 rows affected (0.00 sec)

Notice that even though we only altered one row, the result indicates that two rows were affected because we actually DELETED the existing row then INSERTED the new row to replace it.

More information on using REPLACE can be found in the official documentation.

Nie ograniczanie się do Agile


The alternative (and generally preferred) method for INSERTING into rows that may contain duplicate UNIQUE or PRIMARY KEY values is to use the INSERT ... ON DUPLICATE KEY UPDATE statement and clause.

Unlike REPLACE – an inherently destructive command due to the DELETE commands it performs when necessary – using INSERT ... ON DUPLICATE KEY UPDATE is non-destructive, in that it will only ever issue INSERT or UPDATE statements, but never DELETE.

For example, we have decided we wish to replace our id = 1 record of Green Eggs and Ham and revert it back to the original In Search of Lost Time record instead. We can therefore take our original INSERT statement and add the new ON DUPLICATE KEY UPDATE clause:

mysql> SET @id = 1,
    @title = 'In Search of Lost Time',
    @author = 'Marcel Proust',
    @year_published = 1913;
INSERT INTO books
    (id, title, author, year_published)
VALUES
    (@id, @title, @author, @year_published)
ON DUPLICATE KEY UPDATE
    title = @title,
    author = @author,
    year_published = @year_published;

Notice that we’re using normal UPDATE syntax (but excluding the unnecessary table name and SET keyword), and only assigning the non-UNIQUE values. Also, although unnecessary for the ON DUPLICATE KEY UPDATE method to function properly, we’ve also opted to utilize user variables so we don’t need to specify the actual values we want to INSERT or UPDATE more than once.

As a result, our id = 1 record was properly UPDATED as expected:

mysql> SELECT * FROM books LIMIT 1;
+----+------------------------+---------------+----------------+
| id | title                  | author        | year_published |
+----+------------------------+---------------+----------------+
|  1 | In Search of Lost Time | Marcel Proust |           1913 |
+----+------------------------+---------------+----------------+
1 row in set (0.00 sec)

More information can be found in the official documentation.


Następny temat