
Pulling RSS data into Excel (or: Using Excel to Search Craigslist) – Part 1


This blog post is brought to you by Dan Battagin, a Lead Program Manager on the Excel team.

OK, so I’m going to talk a bit about a relatively unknown feature in Excel: XML data import. It was introduced in Excel 2003, but we’ve done a pretty good job hiding it since Excel 2007 by putting it on the Developer tab of the Ribbon. I’m going to make the topic even a bit more geeky by using it in conjunction with VBA.

In exchange, I’m going to produce, at the end, a nice little solution that you can use to search Craigslist…across multiple Craigslist sites, all from a single Excel sheet. Oh, and I’ll give you the workbook too, in case that’s all you want. Cool, eh?

Right, so let’s set the stage. Craigslist is an amazingly cool marketplace – everything’s free, and so both buyers and sellers can get a pretty good deal. A problem that I’ve run into though is that the local selection isn’t always great, and so I find myself searching a few different sites for the item I want. Sound familiar?

Let’s build a sheet that looks something like this, one that allows us to search as many (or as few) Craigslist sites as we want for a given item:

[Image: Searching the entirety of Craigslist – nice.]

To do this, we’ll take a few overall steps:

  1. Using XML import (for RSS), set up the results table for the workbook (for a single site).
  2. Add the “Search box” to parameterize the XML import.
  3. Add the ability to search multiple sites in one fell swoop.
  4. Add the “Status indicator” so we know how much longer the search will take.

With that, let’s get ready to rumble!

Using XML Import to Set Up the Results Table

To get started, let’s set up the results table. This is actually quite easy: all we need is the URL to the RSS results for a search. Something like this:

http://seattle.craigslist.org/search/?areaID=2&subAreaID=&query=delta+jointer&catAbb=sss&format=rss
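If you’re curious what that URL is made of, here is a small Python sketch (outside Excel, purely for illustration) that splits it into its parts. The only thing it assumes is the URL above; what each parameter means to Craigslist is not something the sketch tries to explain.

```python
from urllib.parse import urlsplit, parse_qs

# The search URL from the post, split across two lines for readability.
url = ("http://seattle.craigslist.org/search/"
       "?areaID=2&subAreaID=&query=delta+jointer&catAbb=sss&format=rss")

parts = urlsplit(url)
# keep_blank_values=True keeps the empty subAreaID parameter in the result.
params = parse_qs(parts.query, keep_blank_values=True)

site = parts.netloc.split(".")[0]  # first label of the host name
print(site)                 # seattle
print(params["query"][0])   # delta jointer ("+" decodes to a space)
print(params["format"][0])  # rss
```

The two pieces that matter for the rest of this post are the site name at the front of the host and the `query` parameter, since those are what we’ll eventually want to vary.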

To get the data into Excel, follow these steps:

  1. Enable the Developer tab on the Ribbon, if it’s not already enabled, by clicking File | Options | Customize Ribbon and then checking the “Developer” checkbox in the right-hand listbox.
  2. Select cell B5 in your sheet (I’m positioning us here due to future steps in this blog post)
  3. Click Developer | XML > Source, then click XML Maps… | Add… from the task pane that opens
  4. Paste in the URL above and click Open, then OK the XML Schema warning that is shown, then click OK again.
  5. Drag the node labeled “ns3:item” into B5 of your sheet. You should now be looking at something like this:

[Image: XML Mapping in Excel – probably a new feature for you!]
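Conceptually, mapping the repeating item node to a table means “one row per item, one column per child element.” The Python sketch below shows that idea on a tiny made-up feed. The namespace URI, element layout, and item values are all placeholders I invented for the example; they are not Craigslist’s real schema or data, and this is not how Excel implements XML maps internally.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed: namespace and values are placeholders for illustration.
SAMPLE = """<feed xmlns="http://example.com/ns">
  <item>
    <title>Delta jointer, 6 inch</title>
    <link>http://seattle.craigslist.org/tls/123.html</link>
    <description>Works great.</description>
    <date>2010-05-12T10:30:00-07:00</date>
  </item>
  <item>
    <title>Jointer with stand</title>
    <link>http://portland.craigslist.org/tls/456.html</link>
    <description>Needs new knives.</description>
    <date>2010-05-13T08:00:00-07:00</date>
  </item>
</feed>"""

NS = {"f": "http://example.com/ns"}
FIELDS = ["title", "link", "description", "date"]

def items_to_rows(xml_text):
    """Return one dict per <item>, keyed by child element name."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("f:item", NS):
        row = {name: item.findtext(f"f:{name}", default="", namespaces=NS)
               for name in FIELDS}
        rows.append(row)
    return rows

rows = items_to_rows(SAMPLE)
for row in rows:
    print(row["title"], "|", row["link"])
```

Each dict plays the role of one table row, and each key plays the role of one of the columns you get when you drag the item node onto the sheet.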

Whew, we’ve got our “data connection” into Excel now, and we just need to clean up the Table a bit. Again, this isn’t hard, but there are a few steps to take…

  1. We only need a few columns. Rename them as follows by just typing in the table header, and reorder them if you want (I put Date after Linky):

    • ns3:title > Title
    • ns3:link > Linky
    • ns3:description > Description
    • ns3:date > RawPostDate
  2. Delete all the other columns in the table by right-clicking on them and choosing to Delete Table Column
  3. Add a couple of new table columns wherever you want in the table. Call them “Date” and “Location”.
  4. For each of the new columns, enter the following formulas in the first row under the table headers:

(By the way, you’ll see two things. First, the new “@” notation for referring to the current table row, which we added in Excel 2010 to make working with tables easier. Second, you may get some #VALUE! errors in cells – don’t worry about those for now.)

 

    Column      Value
    Date        =VALUE(LEFT([@RawPostDate], SEARCH("T", [@RawPostDate])-1))
    Location    =LEFT(RIGHT([@Linky], LEN([@Linky])-LEN("http://")), (SEARCH(".", [@Linky])-1)-LEN("http://"))

5. OK, now we’re ready for some data. Just right-click on the table and choose Refresh. Ignore any errors; they are meaningless for our purposes, and we’ll take care of them later.

6. Next, let’s get the table looking a bit better by:

  1. Select a single cell in the table | right-click | XML | XML Map Properties… and
    1. uncheck “Adjust column width”
    2. select “Append new data to existing XML tables”
  2. Select the entire table and click Home | Cells | Format | Row Height, and set the height to 15.

7. Lastly, a couple cool formatting “tricks.”

  • Select the Linky column and right-click | Format Cells. Select the Number tab | Custom, and use the following format, exactly as shown, which tells Excel to always display the text “link” in the cells (oooooo, aaaaaahhhhh):
    ;;;"link"
  • Select the Date column and right-click | Format Cells. Select Date, and choose a format to your liking.

And just like that, we’ve got a search results table. Cool beans.

[Image: RSS data in Excel!]
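If the Date and Location formulas from the table cleanup look a bit dense, here is the same logic as a rough Python sketch. It assumes the raw post date looks like 2010-05-12T10:30:00-07:00 and that every link starts with http://, which is what the formulas themselves assume. The function names are mine, not anything from Excel.

```python
from datetime import date

def excel_date_serial(raw_post_date):
    """Mimic =VALUE(LEFT(raw, SEARCH("T", raw) - 1)) for ISO-style dates."""
    # LEFT(..., SEARCH("T", ...) - 1): keep the text before the first "T".
    day_text = raw_post_date.split("T")[0]
    year, month, day = (int(part) for part in day_text.split("-"))
    # VALUE(...) yields Excel's date serial number. In the default 1900 date
    # system, for dates after February 1900, that is days since 1899-12-30.
    return (date(year, month, day) - date(1899, 12, 30)).days

def location_from_link(link):
    """Mimic the Location formula: the text between "http://" and the first "."."""
    prefix = "http://"
    # RIGHT(...) drops the prefix; LEFT(...) keeps what comes before the first ".".
    return link[len(prefix):link.index(".")]

print(excel_date_serial("2010-05-12T10:30:00-07:00"))                      # 40310
print(location_from_link("http://seattle.craigslist.org/tls/123.html"))   # seattle
```

So Date ends up as a plain number that Excel can format however you like, and Location is simply the site name pulled out of the link.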

Adding the Search Box to Parameterize the XML Import

OK, the search results are cool, but I mean really…it’s just a baby step in the right direction. After all, it’s hard-coded to a single search term right now. To change that, we’ll add a search box, and let the user update the results for the table. That’s going to involve a bit of VBA…

  1. Select cells B2:E3 and merge them.
  2. Select the merged cell and click Formulas | Define Name, and call the name rngSearchTerm.
  3. Insert an image or shape or button that will be the “start” search button.
  4. Right-click on the “start” button and choose “Assign Macro”, then name the macro “RunSearch” and click Edit in the dialog that comes up.
  5. Enter the following VBA for the RunSearch method:

  6. Close the VBA editor and search away.
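The heart of any macro like RunSearch is turning whatever the user typed into a feed URL. Here is a minimal Python sketch of just that step; it is not the VBA itself, and the function name is hypothetical. It assumes that only the query parameter changes, with everything else copied from the Seattle example URL earlier in this post.

```python
from urllib.parse import urlencode

# Fixed parts copied from the example URL; treating them as constants
# is an assumption made for this single-site sketch.
BASE_URL = "http://seattle.craigslist.org/search/"

def build_search_url(term):
    """Return the RSS search URL for a free-text search term."""
    params = {
        "areaID": 2,
        "subAreaID": "",
        "query": term,      # urlencode turns spaces into "+" and escapes "&"
        "catAbb": "sss",
        "format": "rss",
    }
    return BASE_URL + "?" + urlencode(params)

print(build_search_url("delta jointer"))
# http://seattle.craigslist.org/search/?areaID=2&subAreaID=&query=delta+jointer&catAbb=sss&format=rss
```

Encoding the term matters: a search for something like “table saw & fence” would otherwise break the URL at the ampersand.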

Now that’s pretty cool – a parameterizable search that uses XML maps to import data from the Craigslist RSS feed. Next time, we’ll look at updating the example so that we pull data from multiple Craigslist sites, and we’ll add a little progress meter.