You can use your favorite social network to register or link an existing account:
Or use your email address to register without a social network:
Sign in with these social networks:
Or enter your username and password
Forgot your password?
Yes, please link my existing account with for quick, secure access.
No, I would like to create a new account with my profile information.
Tips
How-to
News
Videos
Stories
Linda in Washington State wrote to me with juicy problem involving combining data from two different Excel 2007 files that have different columns but a common key. Right off-hand (and even later, on-hand) I don't have the answer to this. I told her I'd forward her issue along to a couple of Excel experts I know (me, not being one of them) but in the meantime, I'm guessing there are some Crabby readers out there who may also have had this issue in the past and have found a graceful solution to share with us.
Can someone please help Linda out (and in turn me and everyone else who reads your comments)?
Linda writes:
"Dear Crabby,
I very much value and enjoy your blog. I do have an Excel problem and would like to know if you have a better way for me to deal with it.
I often need to combine data from two files that have different columns but have a common key. (Typically one file comes from the normal PC world and the other from GIS spatial world.) There are often records for values of the common key in one file but not the other, and vice versa. I need to match up the common records so that all columns for a given key are on the same line. Currently it is a very laborious process. When sorted, it would be very helpful if I could put the cursor in any column and click a feature to highlight the duplicates or another one to highlight the uniques. I don't want to reduce the list to a unique instance of all records (that feature does exist). I would need to be able to sort based on the highlights. This way I would be able to temporarily separate the duplicates and uniques for further processing and then can recombine them when finished. Because of the disparity in the files, I don't know that it would buy anything to use a join function. So far I haven't found a join in Excel to try it out."
So, have a solution for Linda? Use the comment field below to take a crack at it. Although this thorny issue is about Excel 2007 please post any possible solution for any version of Excel—we can figure the rest out.
Comments: (23) Collapse
My instinct would be to use access to help this process along as it is really good at taking two sets of information (tables) and JOINing them with a common key.
Access also has really great tools for importing data in from excel. She could then export her data out of access or just link her spreadsheet back to her db.
This also has the added benefit of being future flexible, such as being able to store this data on a server and having other users access it.
But this is just my approach and I realize it may not be very helpful if her version of Office doesn't have access (shame about that product segmentation but that is a whole different topic...)
My first go at this would be to use a "Lookup()" function. It would look up the matching "key" in one sheet and return the value from the specified column in the other sheet.
vlookup in both files would identify the values that were on one sheet but not the other. Combining the data with vlookup to put all the common row data into one file would also help. You can put in a conditional format too to make the highlights more obvious.
I would do it the same way as sue.
And to filter you can make an extra colum te check if the value is the same.
=IF(C1=D1;1; 0)
Then autofilter on that colum 1= duplicate; 0 unique
i am also think use of access...
Use MATCH to find the index of the key column on both sheets and then use INDEX to pull in the data you want. MATCH returns #VALUE if it does not find a match so you can test for that, and if it's an error, just leave those cells blank.
Using VBA, you can autofilter the combined list by, say, double clicking a cell. Or you can combine MATCH and COUNTIF with OFFSET to pull in data matching a specified value in a specified column, but the list would have to be sorted by the specified column in order to find all of the duplicates.
You may need to combine your key columns first, which will be easier if you just copy paste them into the key column of your combined sheet and then use the remove duplicates feature. It can be done using worksheet functions, but it has the potential to be really slow and involves recursive functions; I'd recommend using VBA to automate that rather than worksheet functions, especially if your keys are not numbers.
Also, I don't feel real confident that I understand what she's trying to do. It would help if I had some sample data.
Perhaps this will help? It's a new post about VLOOKUP by Anneliese Wirth, on the Excel blog. Here's the link: blogs.msdn.com/.../vlookup-uhh-now-what.aspx. Its title is: =VLOOKUP (uhh...now what?).
I agree with Rosewood. However, another solution would be to save the files as .csv files and then import them into Excel. Excel is more an accounting, spreadsheet type of a solution. Access is the way to go for data management especially involving keys for a company of 500 or less. Then you would want to try DB2 (or is it 3 now?) or Oracle.
I agree with Rosewood. Excel is more of a mathematical, accounting type application. Access is the way to go for data management. However, I would recommend converting the files into .csv format and then importing them into excel.
Thank you everyone for your comments. I've never used Access but am eager to get started with it. I'm also going to try all of the Excel suggestions so I see how they work.. Let the fun begin...
I have some vbscript files that do this for our accounting team. The code queries the workbook as if it were a database and performs a join on the two sheets. The results are saved to a second workbook. I've never tried, but maybe the same could be accomplished with MS Query?
Microsoft just released the perfect solution to your problem. PowerPivot for Excel is a free 2010 add-in for Excel that lets you import and integrate data from virtually any data source, including other spreadsheets. Once you integrate your two data sources you can extend them by adding new calculated measure, and can build pivot tables and charts based on the combined sources.
There is also a SharePoint component called PowerPivot for SharePoint 2010 that lets you upload the file and schedule it to refresh from each of your data sources. Users can view and interact with the report simply using their web browser.
Download PowerPivot for Excel 2010 from Microsoft here: http://bit.ly/c4RcoH
@aporter and @LINDA: Yes! I should have thought of PowerPivot! I even wrote a post about it: blogs.office.com/.../powerpivot-for-excel-2010.aspx
In fact, MANY of my blog colleagues wrote about PowerPivot: blogs.office.com/.../SearchResults.aspx
Linda -- give it a try and get back to me?
Everyone: I'll post the solution Linda uses either here or in a post. Stay posted...
Comments: (loading) Collapse