How to INSERT If Row Does Not Exist (UPSERT) in MySQL
Posted by: AJ Welch
MySQL provides a number of useful statements when it is necessary to INSERT
rows after determining whether that row is, in fact, new or already exists.
Below we’ll examine the three different methods and explain the pros and cons of each in turn so you have a firm grasp on how to configure your own statements when providing new or potentially existing data for INSERTION
.
Using INSERT IGNORE
Using INSERT IGNORE
effectively causes MySQL to ignore execution errors while attempting to perform INSERT
statements. This means that an INSERT IGNORE
statement which contains a duplicate value in a UNIQUE
index or PRIMARY KEY
field does not produce an error, but will instead simply ignore that particular INSERT
command entirely. The obvious purpose is to execute a large number of INSERT
statements for a combination of data that is both already existing in the database as well as new data coming into the system.
For example, our books
table might contain a few records already:
mysql> SELECT * FROM books LIMIT 3;
+----+-------------------------+---------------------+----------------+
| id | title | author | year_published |
+----+-------------------------+---------------------+----------------+
| 1 | In Search of Lost Time | Marcel Proust | 1913 |
| 2 | Ulysses | James Joyce | 1922 |
| 3 | Don Quixote | Miguel de Cervantes | 1605 |
+----+-------------------------+---------------------+----------------+
3 rows in set (0.00 sec)
If we have a large batch of new and existing data to INSERT
and part of that data contains a matching value for the id
field (which is a UNIQUE PRIMARY_KEY
in the table), using a basic INSERT
will produce an expected error:
mysql> INSERT INTO books
(id, title, author, year_published)
VALUES
(1, 'Green Eggs and Ham', 'Dr. Seuss', 1960);
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
On the other hand, if we use INSERT IGNORE
, the duplication attempt is ignored and no resulting errors occur:
mysql> INSERT IGNORE INTO books
(id, title, author, year_published)
VALUES
(1, 'Green Eggs and Ham', 'Dr. Seuss', 1960);
Query OK, 0 rows affected (0.00 sec)
Using REPLACE
In the event that you wish to actually replace rows where INSERT
commands would produce errors due to duplicate UNIQUE
or PRIMARY KEY
values as outlined above, one option is to opt for the REPLACE
statement.
When issuing a REPLACE
statement, there are two possible outcomes for each issued command:
- No existing data row is found with matching values and thus a standard
INSERT
statement is performed. - A matching data row is found, causing that existing row to be deleted with the standard
DELETE
statement, then a normalINSERT
is performed afterward.
For example, we can use REPLACE
to swap out our existing record of id = 1
of In Search of Lost Time by Marcel Proust with Green Eggs and Ham by Dr. Seuss:
mysql> REPLACE INTO books
(id, title, author, year_published)
VALUES
(1, 'Green Eggs and Ham', 'Dr. Seuss', 1960);
Query OK, 2 rows affected (0.00 sec)
Notice that even though we only altered one row, the result indicates that two rows were affected because we actually DELETED
the existing row then INSERTED
the new row to replace it.
More information on using REPLACE
can be found in the official documentation.
Using INSERT ... ON DUPLICATE KEY UPDATE
The alternative (and generally preferred) method for INSERTING
into rows that may contain duplicate UNIQUE
or PRIMARY KEY
values is to use the INSERT ... ON DUPLICATE KEY UPDATE
statement and clause.
Unlike REPLACE
– an inherently destructive command due to the DELETE
commands it performs when necessary – using INSERT ... ON DUPLICATE KEY UPDATE
is non-destructive, in that it will only ever issue INSERT
or UPDATE
statements, but never DELETE
.
For example, we have decided we wish to replace our id = 1
record of Green Eggs and Ham and revert it back to the original In Search of Lost Time record instead. We can therefore take our original INSERT
statement and add the new ON DUPLICATE KEY UPDATE
clause:
mysql> SET @id = 1,
@title = 'In Search of Lost Time',
@author = 'Marcel Proust',
@year_published = 1913;
INSERT INTO books
(id, title, author, year_published)
VALUES
(@id, @title, @author, @year_published)
ON DUPLICATE KEY UPDATE
title = @title,
author = @author,
year_published = @year_published;
Notice that we’re using normal UPDATE
syntax (but excluding the unnecessary table
name and SET
keyword), and only assigning the non-UNIQUE
values. Also, although unnecessary for the ON DUPLICATE KEY UPDATE
method to function properly, we’ve also opted to utilize user variables
so we don’t need to specify the actual values we want to INSERT
or UPDATE
more than once.
As a result, our id = 1
record was properly UPDATED
as expected:
mysql> SELECT * FROM books LIMIT 1;
+----+------------------------+---------------+----------------+
| id | title | author | year_published |
+----+------------------------+---------------+----------------+
| 1 | In Search of Lost Time | Marcel Proust | 1913 |
+----+------------------------+---------------+----------------+
1 row in set (0.00 sec)
More information can be found in the official documentation.