Auto Tweeting User’s Actions on Your Site (Part 1)

This entry is part 2 of 4 in the series Auto Tweeting

Automated tweets are the worst, right?

What about intelligently automated tweets with a splash of randomness, some spinning as well as name and link dropping? You with me now? Let’s take a ride through a bit of back story.

Looking for Code?

If you are just looking for code you are in the wrong place. We are talking concepts first; then comes the code.

Back Story (& Panda Rant)

I have a project site with a ~$6 CPM that took a Panda hit but bounced back with the update. Before the update I’d almost lost hope. I got rightfully hit by Panda: 98% of my content was aggregated from other sources, with a scant 2% unique content wrapping it. Why did I bounce back, you ask? Because while my content was aggregated, it wasn’t scraped, it wasn’t farmed, it wasn’t shady or black hat; it was what it was: authorized aggregate content from a 3rd party presented in a (semi-)unique manner. I think it also helped that I was not (and am still not) running ANY AdSense units on the site.

After the Panda update and the bounce back in traffic I started spending more time on the site and brainstorming ideas. I’ve since introduced multiple features, Facebook integration and more SEO optimizations. The bounce rate is ~35% because of the rich interaction available with the 3rd party content.

I also cut out a few things that might have looked a bit…fishy to the Google, including some sitemap.xml entries that were a bit grey, if you know what I mean.

So how could I get Google to acknowledge those pages, help its crawlers find them, and also drive traffic from other potential sources? 90%+ of the traffic to that site is from Google, so I needed to diversify or appease the Google gods.

I looked to Twitter. Google is all over Twitter. So I took another look at my own site and tried to figure out how I could tweet things that wouldn’t end up being the same 12 tweets over and over again like the last Twitter bot I had built for the site.

There are a lot of actions on the site for a visitor to take, so I split the concept into 3 buckets: surfacing products, surfacing categories and surfacing user actions. For example, one action on the site is searching for products by keyword. We’ll talk about that specific concept and how I worked it into my auto-tweeter.

Logging Search Actions

Whenever a user performs a search on the site it is logged into a database. This was initially so I could get some visibility into what people were searching for. The structure was pretty basic:

Field       Type       Desc
id          int        PRIMARY KEY
query       varchar    Search query
datetime    timestamp  When the search was performed
user_agent  varchar    So I could later weed out robots or detect trends by browser/OS
referrer    varchar    So I knew where they came from

I had to add a couple of fields to it to work with the auto-tweeter:

Field      Type              Desc
tweeted    enum('yes','no')  Has it already been tweeted?
is_search  enum('yes','no')  Do we REALLY think this was a user search, or a bot crawl?
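
For reference, here is a rough sketch of that table as it might look in code. It uses Python with SQLite purely as a stand-in (the queries in this post use MySQL functions like DATE_SUB), so the enum columns become TEXT with a CHECK constraint. The helper names create_search_log and log_search are made up for illustration.

```python
import sqlite3

# SQLite stand-in for the search log described above. MySQL's
# enum('yes','no') is modeled as TEXT plus a CHECK constraint (an assumption).
SCHEMA = """
CREATE TABLE IF NOT EXISTS search_log (
    id         INTEGER PRIMARY KEY,
    query      TEXT NOT NULL,
    datetime   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    user_agent TEXT,
    referrer   TEXT,
    tweeted    TEXT NOT NULL DEFAULT 'no' CHECK (tweeted IN ('yes', 'no')),
    is_search  TEXT NOT NULL DEFAULT 'no' CHECK (is_search IN ('yes', 'no'))
)
"""


def create_search_log(conn):
    """Create the log table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def log_search(conn, query, user_agent, referrer, is_search):
    """Insert one search; new rows always start out with tweeted = 'no'."""
    cur = conn.execute(
        "INSERT INTO search_log (query, user_agent, referrer, is_search) "
        "VALUES (?, ?, ?, ?)",
        (query, user_agent, referrer, "yes" if is_search else "no"),
    )
    conn.commit()
    return cur.lastrowid
```

Every search gets logged, even the ones flagged as bot traffic, so the table still works for its original purpose of seeing what people search for.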

Now, could I have just launched a tweet every time a person performed a search? Yup. But I’d rather queue up the tweets and mix them in with other types. Remember this was the first step in a 1 week project of tweeting 8 types of tweets.

I set is_search to yes only if the search being logged passed two challenges, and to no otherwise:

  1. Is the referrer from the same website? (Remember, now that the search URLs are going to be tweeted, you will get A LOT of no’s on that.)
  2. Is there a user agent, and does it look legit (aka, not like a robot)?
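
Those two challenges could be sketched roughly like this. The host name example.com and the list of bot markers are placeholders I picked for illustration, not what the site actually used; a real bot check would need a much longer list.

```python
from urllib.parse import urlparse

SITE_HOST = "example.com"  # hypothetical: replace with your own domain
# Illustrative substrings only; real-world bot detection needs more than this.
BOT_MARKERS = ("bot", "crawl", "spider", "slurp", "curl", "wget")


def same_site(referrer, site_host=SITE_HOST):
    """Challenge 1: did the visitor come from a page on our own site?"""
    host = (urlparse(referrer or "").hostname or "").lower()
    return host == site_host or host.endswith("." + site_host)


def looks_human(user_agent):
    """Challenge 2: is there a user agent, and does it not look like a robot?"""
    ua = (user_agent or "").strip().lower()
    return bool(ua) and not any(marker in ua for marker in BOT_MARKERS)


def classify_search(referrer, user_agent):
    """Return the value to store in is_search: 'yes' or 'no'."""
    return "yes" if same_site(referrer) and looks_human(user_agent) else "no"
```

A visitor who clicks through from a tweet arrives with a Twitter referrer, so classify_search marks that hit as no and the tweeted search does not feed itself back into the queue.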

Now that the searches were being logged I started building my tweet machine!

Queue Concepts

The query to pull the next search to tweet was pretty simple:

    SELECT * FROM search_log WHERE is_search = 'yes' AND tweeted = 'no' AND datetime >= DATE_SUB(NOW(), INTERVAL 2 DAY) ORDER BY datetime DESC

Then, after the tweet is posted, I did a blanket update to all matching queries in the table within the time range I query for, to make sure it didn’t get re-tweeted:

    UPDATE search_log SET tweeted = 'yes' WHERE query = $QUERY AND datetime >= DATE_SUB(NOW(), INTERVAL 2 DAY)

The real trick was to make sure that the next tweet hadn’t recently been tweeted! For example, if a user searches for ‘Kittens’ at 9am and ‘Kittens’ gets tweeted at 10am, everything is okay. But if another user comes and searches for ‘Kittens’ at 12pm, that new row was logged after the blanket update, so it still has tweeted set to no and ‘Kittens’ would get tweeted all over again.

So now the tweeting machine needs an update. Let’s pull all the queries that have been tweeted in the time range and exclude them from our query with a sub-query. Also, since we’ll be using this sub-query, we don’t need to blanket update the log after the tweet; the sub-query picks up on the queries that have already been tweeted. So those two queries get updated to:

    SELECT * FROM search_log WHERE is_search = 'yes' AND tweeted = 'no' AND datetime >= DATE_SUB(NOW(), INTERVAL 2 DAY) AND query NOT IN (SELECT query FROM search_log WHERE tweeted = 'yes' AND datetime >= DATE_SUB(NOW(), INTERVAL 2 DAY)) ORDER BY datetime DESC

and, after the tweet is posted:

    UPDATE search_log SET tweeted = 'yes' WHERE id = $ID
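
Here is a minimal sketch of that queue logic, again using Python and SQLite as a stand-in for MySQL. It assumes a search_log table with the columns described earlier, swaps DATE_SUB(NOW(), INTERVAL 2 DAY) for SQLite's datetime('now', '-2 days'), and adds a LIMIT 1 that is not in the original query since only one row is needed per tweet. The function names are hypothetical.

```python
import sqlite3

# Newest untweeted real search from the last 2 days whose query text
# has not already been tweeted within that same window.
NEXT_SEARCH_SQL = """
SELECT id, query FROM search_log
WHERE is_search = 'yes'
  AND tweeted = 'no'
  AND datetime >= datetime('now', '-2 days')
  AND query NOT IN (
      SELECT query FROM search_log
      WHERE tweeted = 'yes'
        AND datetime >= datetime('now', '-2 days')
  )
ORDER BY datetime DESC
LIMIT 1
"""


def next_search(conn):
    """Return (id, query) for the next search to tweet, or None if the queue is empty."""
    return conn.execute(NEXT_SEARCH_SQL).fetchone()


def mark_tweeted(conn, row_id):
    """Flag only the row that was tweeted; the sub-query covers its duplicates."""
    conn.execute("UPDATE search_log SET tweeted = 'yes' WHERE id = ?", (row_id,))
    conn.commit()
```

In the ‘Kittens’ scenario, the 12pm row stays at tweeted = 'no' forever, but next_search skips it because the 9am row with the same query text is already marked yes inside the window.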

Normalization

There is also a case to be made for adding a column to the table that stores a normalized variation of the query: lowercased, with trailing s’s and other pluralization stripped, punctuation stripped, spaces removed, etc. I didn’t do this for my auto-tweeter, but now that I think of it I probably should. In that case you’d use your sub-query to verify against the normalized values but still use the original query when posting the tweet.
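
A normalizer along those lines might look like the sketch below. The plural handling is deliberately naive (it just drops a trailing "s" from longer words and leaves "ss" endings alone), and the function name normalize_query is my own; treat it as a starting point rather than real stemming.

```python
import re


def _singular(word):
    """Naive plural stripping: drop one trailing 's' but keep 'ss' endings."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize_query(query):
    """Lowercase, strip punctuation, strip simple plurals, remove spaces."""
    text = query.lower()
    text = re.sub(r"[^\w\s]", "", text)  # drop punctuation
    words = [_singular(word) for word in text.split()]
    return "".join(words)
```

With this, ‘Kittens!’ and ‘kitten’ both normalize to the same key, so the second one would be caught by the already-tweeted check while the tweet itself still shows whatever the user typed.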

Search Actions as Tweets

Now we are at the good part: you have the query that has been searched for recently (this is also a great way to hop on trends without wasting time following all the trends!). Let’s construct the tweet!

Here are 3 templates your Twitter machine could rotate or randomize through to post the search:

  • One of our visitors just checked out $QUERY products, here’s what they found: $LINK
  • A user just searched for $QUERY on our site, they found some neat items! $LINK
  • Someone wants more $QUERY products, maybe you should go see too: $LINK

Now you’ll want to make sure those strings are worded so they read as sensible English with the queries you tend to see. There will always be a few that “Don’t sound good English” but it’s a bot, it’s trying its best; you just need to train it better when you see it mess up!
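
To make the template idea concrete, here is a small sketch that picks one of the three templates at random and fills in $QUERY and $LINK. The search URL pattern on example.com is a made-up placeholder, and build_tweet is a hypothetical name; actually posting the tweet is left out.

```python
import random
from string import Template
from urllib.parse import quote_plus

# Hypothetical search URL pattern; swap in your site's real one.
SEARCH_URL = "https://example.com/search?q={q}"

TEMPLATES = [
    Template("One of our visitors just checked out $QUERY products, here's what they found: $LINK"),
    Template("A user just searched for $QUERY on our site, they found some neat items! $LINK"),
    Template("Someone wants more $QUERY products, maybe you should go see too: $LINK"),
]


def build_tweet(query, rng=random):
    """Pick a random template and fill in the search query and its link."""
    link = SEARCH_URL.format(q=quote_plus(query))
    return rng.choice(TEMPLATES).substitute(QUERY=query, LINK=link)


if __name__ == "__main__":
    print(build_tweet("Kittens"))
```

Passing in your own random.Random instance as rng makes the choice repeatable, which helps when you are eyeballing how each template reads with the queries you tend to see.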


If you made it to the end of this post, bless your heart. I bet you were looking for straight code and all you found were my ramblings. Well, you can get the code anywhere, but now you have some concepts to adapt to your specific use-case. I continued using VERY similar concepts to also tweet categories, products and a set of predefined tweets that I’ll talk about in the next post. Then I’ll drop some code on you, promise ;)
