Microsoft Excel is a powerful data entry and analysis tool. However, when working with large spreadsheets, especially those integrating data from various sources or shared among teams, duplicates can easily occur. These duplicates can compromise the accuracy of your results and clutter your reports. Fortunately, Excel offers numerous features to efficiently identify and eliminate duplicates.
This guide will introduce you to the most effective methods for removing duplicates in Excel. Whether you’re a beginner or regularly handle large datasets, you’ll learn step-by-step techniques to quickly clean your data.
Before diving into the methods, let’s explore why duplicate values can be problematic:
Regularly cleaning your data by eliminating duplicates ensures consistency, accuracy, and professionalism in your work.
The simplest and most common method for removing duplicates in Excel is through the built-in “Remove Duplicates” tool. Here’s how to use it:
Step 1: Open your Excel spreadsheet and select the range where you suspect duplicate data might exist.
Step 2: Go to the Data tab on the ribbon.
Step 3: In the Data Tools group, click on Remove Duplicates.
Step 4: A dialog box will appear. Choose the columns to check for duplicates. If your data has headers, ensure the “My data has headers” box is checked.
Step 5: Click OK. Excel removes the duplicate rows, keeping the first occurrence of each, and displays how many duplicates were deleted and how many unique values remain.
Suppose you have a list of customer names and email addresses. A customer might appear twice if imported from different platforms. You can select both columns and use the Remove Duplicates tool to clean it up.
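To make the behavior concrete, here is a minimal Python sketch of the same idea: keep the first row for each combination of the selected columns and report how many rows were dropped. The function name `remove_duplicates` and the sample customers are hypothetical, and the sketch compares values exactly, which may not match Excel's own matching rules in every detail.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns.

    Returns the kept rows and the number of rows that were dropped.
    """
    seen = set()
    kept = []
    for row in rows:
        # Build a key from only the columns chosen for comparison.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept, len(rows) - len(kept)


# Hypothetical sample data: one customer imported from two platforms.
customers = [
    {"name": "Ana Ruiz", "email": "ana@example.com"},
    {"name": "Ben Cole", "email": "ben@example.com"},
    {"name": "Ana Ruiz", "email": "ana@example.com"},
]

unique_rows, removed = remove_duplicates(customers, ["name", "email"])
print(f"{removed} duplicate row(s) removed; {len(unique_rows)} unique row(s) remain")
```

Passing only `["email"]` as the key columns would treat two rows as duplicates whenever the email matches, even if the names differ, which mirrors ticking a single column in the dialog.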
If you’re not ready to delete duplicates and prefer to visually identify them first, conditional formatting is an excellent method.
Step 1: Select the range you want to check.
Step 2: Click on the Home tab.
Step 3: Go to Conditional Formatting > Highlight Cells Rules > Duplicate Values.
Step 4: Choose a highlight color.
Step 5: Click OK.
Excel will highlight all duplicated values in the selected range, allowing you to review and manually decide which rows to delete or keep.
This method is especially useful when you want to review duplicates visually before taking action. Keep in mind that it flags matching values only, so a name or ID entered with slight differences, such as an extra space, will not be highlighted.
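The flagging logic can be sketched in a few lines of Python. This is an illustration, not Excel's implementation: it assumes values are compared without regard to letter case and that nothing else is normalized, so a trailing space makes two entries different. The function name `flag_duplicates` and the sample IDs are hypothetical.

```python
from collections import Counter


def flag_duplicates(values):
    """Return one flag per value: True if the value occurs more than once.

    Assumption of this sketch: comparison ignores letter case only.
    """
    normalized = [str(v).casefold() for v in values]
    counts = Counter(normalized)
    return [counts[n] > 1 for n in normalized]


# Hypothetical IDs; note the trailing space in "B-200 ".
ids = ["A-100", "a-100", "B-200", "B-200 ", "C-300"]
flags = flag_duplicates(ids)

for value, is_duplicate in zip(ids, flags):
    print(f"{value!r:10} {'duplicate' if is_duplicate else ''}")
```

Here the two spellings of `A-100` are flagged, while `B-200` and `B-200 ` are not, which is why near-duplicates need cleaning (for example, trimming spaces) before this kind of check can catch them.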
If you want to view only unique entries without permanently removing duplicates, use Excel’s Advanced Filter option.
Step 1: Select your dataset.
Step 2: Click the Data tab.
Step 3: In the Sort & Filter group, click Advanced.
Step 4: In the pop-up window, choose Copy to another location, then specify the destination cell in the Copy to box.
Step 5: Check the box Unique records only.
Step 6: Click OK.
You’ll now see a clean list of unique records in the desired location.
For more control or to flag duplicates instead of deleting them immediately, formulas are a smart choice.
Suppose you have data in columns A, B, and C, and you want to check if the combination of these values is duplicated.
Step 1: Create a helper column (say, column D) and concatenate values using a formula like:
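=A2&"|"&B2&"|"&C2
This combines the values into a single string. The "|" separator keeps different rows from producing the same string; without it, "ab" and "c" would combine to the same text as "a" and "bc".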
Step 2: In column E, use a formula to count how many times each string appears:
=COUNTIF($D$2:D2, D2)
Step 3: Filter column E to show only rows where the count is greater than 1.
You can now delete or review these rows manually.
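The helper-column approach can be sketched in Python as follows. The running count plays the role of the expanding COUNTIF range: the first occurrence of a combination gets 1, and every repeat gets 2 or more. The function name `running_counts`, its `sep` parameter, and the sample rows are hypothetical; the sketch joins the cells with a separator so that different rows cannot collapse into the same string.

```python
from collections import defaultdict


def running_counts(rows, sep="|"):
    """For each row, count how many times its combined key has appeared so far."""
    seen = defaultdict(int)
    counts = []
    for row in rows:
        # Equivalent of the helper column: one string per row.
        key = sep.join(str(cell) for cell in row)
        seen[key] += 1
        counts.append(seen[key])
    return counts


# Hypothetical rows standing in for columns A, B, and C.
rows = [
    ("Ana", "Ruiz", "Madrid"),
    ("Ben", "Cole", "Leeds"),
    ("Ana", "Ruiz", "Madrid"),
    ("Ana", "Ruiz", "Madrid"),
]

counts = running_counts(rows)
repeats = [index for index, count in enumerate(counts) if count > 1]
print(counts)   # [1, 1, 2, 3]
print(repeats)  # [2, 3]

# Why a separator matters: without one, these two different rows collide.
print(running_counts([("ab", "c"), ("a", "bc")], sep=""))  # [1, 2]
print(running_counts([("ab", "c"), ("a", "bc")]))          # [1, 1]
```

Filtering on a count greater than 1 leaves the first occurrence untouched, so deleting the filtered rows keeps exactly one copy of each combination.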
If you’re dealing with large datasets or want to automate the process, Power Query is an excellent option.
Step 1: Select your range and click on Data > From Table/Range (in the “Get & Transform” group).
Step 2: In the Power Query Editor, right-click the column(s) you want to check and select Remove Duplicates.
Step 3: Click Close & Load to return the cleaned data to a new sheet.
You can also remove duplicates across the entire table or only on selected columns. The best part? Power Query allows you to refresh your cleaned data whenever the source data changes.
Want to highlight only the unique values? Here’s how:
Step 1: Select your data range.
Step 2: Click on Home > Conditional Formatting > New Rule.
Step 3: Choose “Use a formula to determine which cells to format.”
Step 4: Enter this formula, adjusting the range to match your data:
=COUNTIF($A$2:$A$100, A2)=1
Step 5: Pick your formatting style and click OK.
Now only the unique entries will be highlighted.
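As a rough illustration of what the rule evaluates, the Python sketch below counts each value across the whole list and flags those that occur exactly once. The name `appears_once` and the sample values are hypothetical, and the comparison here is exact, which is a simplification of how Excel compares text.

```python
from collections import Counter


def appears_once(values):
    """Return one flag per value: True if the value occurs exactly once."""
    counts = Counter(values)
    return [counts[v] == 1 for v in values]


# Hypothetical column of values.
column_a = ["red", "blue", "red", "green"]
highlight = appears_once(column_a)
print(highlight)  # [False, True, False, True]
```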
Cleaning duplicate data in Excel doesn’t have to be challenging. Whether you prefer quick tools like Remove Duplicates, visual aids like Conditional Formatting, or advanced automation with Power Query, Excel provides multiple solutions.
By removing duplicates:
Start with small datasets to test out these methods. Once you’re confident, apply them to your larger projects for smoother workflows.