So this whole example comes about because of this tweet by Kate Gardiner
So in all of Twitter-dom, there is seriously no tool that archives a day’s tweets in a sortable fashion?
which I seem to see a lot on Twitter. Being pretty tight with the Twitter API I always forget that others might not know how easy it is to get data off Twitter to use yourself. So I threw together this example real quick and figured I’d share for others with similar problems. Feel free to copy this Google Doc I’m sharing and use how you please. If you think of anything cool that I might have missed please share in the comments.
So I asked Kate if a spreadsheet was cool for the data which she was so I figured okay lets just grab the data from her account as an XML and load into a spreadsheet. At first was going to use Openoffice but decided instead to use Google Docs because then I could easily share the results (and I think it actually turned out easier.) Here is the example Google Doc Spreadsheet I created that you can look out while reading the rest.
First we needed to get the data. By looking at the Twitter API for your timeline we can see all of the flags we can use to get all the tweets we want and in the way we want. There isn’t a good way to limit by date (they want a tweet id that you want to start from instead) I just decided to grab 200 tweets (the max it will allow) and for 1 page (you can get a total of 3200 tweets by increasing the page if you like.) For my own account the url for the timeline looks like this –
I went with the XML version of the output because it was the easiest to work with in Google Docs. You can look at that link as a reference later for the XPath portion used later. Now that we have the data we can import it into the spreadsheet using the importXML() function. You can find more info about it and it’s variants in it’s documentation.
Now to make it really simple for others to just use the spreadsheet without having to understand what to change in the URL I created the side portion where you could easily change a couple values and get the results you need. They just go to make the URL in the end used by the function. You can add and change this per the above documentation if you are looking for something else.
The other part I created was the ability to generate each of the different columns based on which fields of the tweet you want. By looking at the above example XML you should be able to figure out what the other XPaths you would need to get your desired results. If someone wants me to go into more detail on that I can but I think it’s pretty self explanatory if you look it over.
Of note, this will only work for parts of the API that don’t need authentication to access as there isn’t an easy way to get that information into the spreadsheet without having log in elsewhere to get it. So you are pretty limited in grabbing some data like your home feeds and such. Another that isn’t protected is the Search API which also allows you to grab data easily. I created an example of how to do that on the second sheet which you can also use. This would let you create something along the lines of a poor man’s RowFeeder. There are some slight differences to that sheet that I’ll explain in a later post but you can see it now if you want to play.
Hope this helps some people and again if you have any questions feel free to ask below.Comments: 2 Tags: