Twitter (tweets) backup

Today someone came to this blog searching for a twitter backup facility. I never did post such an application/script so I figured I’d share my way of backing up my tweets.

I actually never back up my tweets (nothing of value would be lost) and never intend to, but for the sake of posting something I said I’d give it a go.

I’m gonna perform the backup from the command line (like a true magician) without using the twitter API, so only public tweets will be backed up.

Ok, first we need to know the number of tweets we are going to back up (in my case 630) and divide that number by 20 (the number of tweets per page) rounding up the result. In simple math:

630 / 20 = 31.5 ~ 32

Now we know my tweets are distributed on 32 pages.
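
The same rounding up can be done with integer arithmetic in the shell. This is only a minimal sketch; page_count is a made-up helper name, not part of the commands in this post:

```shell
# page_count TOTAL PER_PAGE
# Prints how many pages are needed to hold TOTAL tweets, rounding up.
# (page_count is a hypothetical helper name used only for this example.)
page_count() {
    total="$1"
    per_page="$2"
    # Adding per_page - 1 before the integer division rounds the result up.
    echo $(( (total + per_page - 1) / per_page ))
}

page_count 630 20   # prints 32
```
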
Next we retrieve the 32 different pages via wget. First variant is from a Windows terminal:

for /L %i in (1,1,32) do @wget http://twitter.com/username?page=%i

The for trick is something I learned from the command line kung fu blog. The three numbers in (1,1,32) are start, step and end, so it will iterate from 1 to 32 and store the value in the %i variable (inside a batch file you would have to write %%i instead). Oh, and before you hit enter don’t forget to replace the username with your actual twitter username.

The Linux version is just a transcription of the above command:

for ((i=1; i<=32; i++)); do wget "http://twitter.com/username?page=$i"; done

Hopefully I nailed it; didn't actually test the Linux command but it should work. If not, just leave me a comment with the correction.
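
For what it's worth, here is a slightly more defensive sketch of the download loop. It assumes a POSIX-style shell with wget installed; the function names page_url and fetch_pages and the page-N.html file names are made up for this example. Pages are numbered from 1, as in the Windows command, and each one is saved under a predictable name through wget's -O option.

```shell
# page_url USER PAGE: print the URL of one page of USER's public timeline.
# (Hypothetical helper; the URL pattern is the one used in this post.)
page_url() {
    printf 'http://twitter.com/%s?page=%s\n' "$1" "$2"
}

# fetch_pages USER PAGES: download pages 1..PAGES into page-1.html, page-2.html, ...
fetch_pages() {
    user="$1"
    pages="$2"
    i=1
    while [ "$i" -le "$pages" ]; do
        # Quote the URL so the shell never treats "?" as a wildcard.
        wget -O "page-$i.html" "$(page_url "$user" "$i")"
        i=$((i + 1))
    done
}

# Usage (replace username with your own):
#   fetch_pages username 32
```

Nothing is downloaded until fetch_pages is actually called, so the block can be pasted into a terminal first and run afterwards.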

As for the rest, it's just simple regular expression matching and replacing (to format the end result a bit).

grep -h -o -P "<span class=\"entry-content\">(.*?)</span>" * > brute.html
sed "s/<span class=\"entry-content\">//g" brute.html | sed "s/<\/span>/<br><br>/g" > final.html

After executing the last two commands you should have your tweets stored in the final.html file.
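
The two steps can also be chained into one pipeline so that no intermediate file is needed. This is a rough sketch, assuming GNU grep (for -P); extract_tweets is a made-up function name. It reads the HTML on standard input, which also keeps grep from prefixing every match with a file name.

```shell
# extract_tweets: read saved page HTML on stdin and print each tweet's text
# followed by <br><br>. Assumes tweets sit in <span class="entry-content">,
# as in the commands from this post. (Hypothetical helper name.)
extract_tweets() {
    grep -o -P '<span class="entry-content">(.*?)</span>' |
        sed -e 's/<span class="entry-content">//g' \
            -e 's#</span>#<br><br>#g'
}

# Usage, assuming the downloaded pages live in a directory named pages
# (a hypothetical name):
#   cat pages/* | extract_tweets > final.html
```

Keeping the downloaded pages in their own directory means the output file never ends up among the input files when the command is run a second time.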




5 Responses to “Twitter (tweets) backup”


  1. David says:

    Interesting. Thanks for the share.

  2. dblackshell says:

    @sanitybit that uses the twitter API.

  3. Darknet says:

    Actually you can just use Twitter Tools plugin if you are using wordpress, mine is automatically archived every Saturday morning into a hidden category.

  4. Mingyi says:

    The Linux version command should be:
    for ((i=0; i<32; i++)) ; do `wget http://twitter.com/username?page=$i` ; done

