No More Repeats!
Removing duplicate records from a Microsoft Access table.
Introduction
Occasionally, you may have a table in Access that contains duplicate records. Maybe
it was imported from one or more sources or perhaps it was not setup with the necessary
key to ensure unique records. Whatever the reason, you want to make sure that the
records in the table are unique. Fortunately, Access offers a couple of options
that you can use to remove the duplicates from your table quickly and easily.
This article will explain two methods for removing duplicate information from tables.
Please remember that you should always backup your database before
making changes to database objects or deleting large amounts of data.
A sample database with the tables and queries shown here is available for download.
Download sample for Access 97
Download sample for Access 2000 and 2002
Throughout this article, I use links to various screenshots which will be displayed
in a separate window. An index of these screenshots is available at the bottom of
this page.
As the example, I'm using a small table of fictional names, addresses and telephone
numbers. Figure 1 shows a screenshot of the original table which contains
duplicate records as shown by the highlighting. For the purpose of this example,
I've only included thirteen records and it would be simple to pick out the duplicates
manually but if this table contained hundreds or even thousands of records, that
would probably not be an option.

Figure 1 - Original Contacts table with obvious duplications highlighted
Find Duplicates Query Wizard
The first way to remove duplicates from a table is to use the Find Duplicates Query
Wizard which is available by clicking on the New button in the database window or
by selecting "Insert" and "Query" from the menu bar. (Figure 2)

Figure 2 - Accessing the Find Duplicates Query Wizard
After selecting a table or query in which to find duplicate records, the wizard
will prompt you to select a field in which the data should be unique for each record
(Figure 3). In this case, the home phone number field is the best available. While
you could use the address field to narrow the table by household, the phone number
is shorter and is stored in a standard format. It is much less likely to contain
errors or abbreviations that will prevent the wizard from finding a duplicate. You
can also select a combination of fields if there is no one field that will ensure
a unique record.

Figure 3 - Query Wizard: Selecting unique field
The next screen (Figure 4) will enable you to select any other fields you want to
include in your query. In this illustration, all of the fields in the table are
selected but you can select as many or as few as necessary. These fields will not
affect the identification of duplicates but they will make the query results more
understandable. If you do not select any additional fields, the wizard will add
a field of its own that will count the number of duplications for each value it
finds in the field that you selected earlier.

Figure 4 - Query Wizard: Selecting accompanying fields
The next screen will enable you to enter a name for the query to be saved under
and will give you the option of viewing the data or changing the design of the query.
When you run the query, the results will look something like Figure 5.

Figure 5 - Query Wizard: Find Duplicate Query results
You will notice in this example that the query has actually found three sets of
duplicates. In addition to the two duplications which were highlighted in
Figure 1, the table also contains the same phone number for Harrison and Geneva
Raines. As I mentioned earlier, the wizard will allow the use of more than one field
while searching for duplicates. In this case, using the first and last names in
addition to the phone number would have eliminated the two highlighted records.
With a database like this, you may or may not want to allow more than one contact
per household, depending on your use of the data. Either way, the query has found
the duplicated information in the table according to the specified settings.
Depending on the size of the table, this method may be all that you need. If the
query finds no more than a few dozen duplications or if you want to remove both
the original and duplicate record, it's easy to delete the necessary records from
the query. The records are automatically removed from the table on which the query
was based. On the other hand, for especially large tables where you only want to
remove duplications and leave the original record in the table, this may not be
enough.
Using an Append Query
The other method of removing duplicate records is a two-step process that involves
creating a copy of the table and once again selecting a field or combination of
fields to determine if each record is unique.
The first step is to create a copy of the table structure. To do this, select the
table in the database window and then select "Edit" and "Copy" from the menu bar
or right-click the table and select "Copy" from the pop-up menu that appears. Then
paste the table back into the database window, once again using either the menu
bar or the pop-up menu. You will see the dialog box displayed in Figure 6. After
entering a new name for the table, select "Structure Only" from the paste options
shown. This will create a copy of the table structure but will not transfer the
data.

Figure 6 - Copying the table structure
To select the unique field, open the new table in design view and select the appropriate
field from the list shown. Right-click the field and select "Primary Key" from the
pop-up menu or use the same option on the menu or toolbar (Figure 7). A table's
primary key is used to uniquely identify each record. Access will not allow duplicate
information to be entered in a field marked as the primary key. In this example,
I'm still using the phone number as the unique field but a combination of fields
can also be set by holding down the SHIFT or CTRL key, selecting more than one field
and setting it as the primary key using either the pop-up menu, menu or toolbar.
Save your changes to the table structure and then close the table to return to the
database window.

Figure 7 - Choosing the new primary key
The next step is to create an append query that will transfer the data from the
original table to the new table structure. An append query must follow the same
rules set by the primary key as you would if entering data manually. This means
that it will not be able to copy duplicate records to the new table. Only one copy
of each record will be transferred.
To create the append query, create a select query from the original table, including
all records in the table as shown in Figure 8 below. Right-click in the top half
of the query design area and select "Query Type" from the pop-up menu and then select
"Append Query". This will change the query type while retaining the table and fields
that you've inserted.

Figure 8 - Creating the append query
You will be asked for the name of the table to append the data to (Figure 9). Select
the name of the empty table that you just created and click OK.

Figure 9 - Selecting the table for the append
Notice that the rows displayed in the bottom half of the query design grid have
changed. (Figure 10) There is now an "Append To:" row that lists the names of fields
in the new table as they correspond to the fields in the table from which the data
is being transferred. If the new table had different field names, you would have
to select them manually but because the structure is exactly the same, the names
are already filled in. Even so, it is a good idea to double-check these field
names before running the query.

Figure 10 - Completed append query
When you run the query, Access will display the warning shown in Figure 11. Not
all of the records from the Contacts table can be appended to the new table because
of key violations. Again, you can double-check the reasons stated in the message
and then click "Yes" to proceed with the query.

Figure 11 - Key violations message
An append query will not display a datasheet view as a select query will. It will
just perform the append. After the query is run, you can save it if you might want
to use it again later or close it without saving it. When you open the new table,
it should contain all of the information from the old table, minus the duplicate
records (Figure 12).

Figure 12 - Results of append query
Notice that Geneva Raines is still listed in the table but Harrison Raines has been
removed. This is because only the phone number was specified in the primary key
and Geneva Raines' record was listed first according to the table's sort order.
After transferring the data between the tables, you can rename the tables and change
the primary key of the new table as necessary. You may find that it's best to retain
the old table for a period of time until you are absolutely sure that you no longer
need it. This is the basic procedure for using an append query. If you
want, you can also add extra criteria to the append query to determine which records
will be transferred.
For more information on append queries, data management or other operations with
Microsoft Access, you can consult Microsoft's searchable online support section
at
http://support.microsoft.com. If you're comfortable with Usenet,
the newsgroup
comp.databases.ms-access is also an excellent resource. The complete
newsgroup archive is available online at
http://groups.google.com.