This post is a continuation of a previous post: My Splogging Experiment (Part 1). The objective of this experiment is to find out whether splogging is a worthwhile online revenue generation method. If you haven’t read Part 1 of this experiment, I recommend that you do so before reading the rest of this post.
Day 4: Tuning the internal SEO of the splog
Seeing that my experimental splog will have no chance of getting genuine incoming links from other web sites, the next best thing that I could do is to fine-tune its internal linking.
WordPress already does a wonderful job of maintaining interconnected links for posts, pages and archives. However, this can be improved significantly by deploying two powerful WordPress plugins: Related Entries and XML Sitemap Generator.
Running these two plugins pretty much covers most of what can be done in the SEO department for a splog. Heck, it might even be bordering on futile. However, what I discovered towards the end of this experiment is that it does help in the indexing process. But that will be discussed later.
I also realized that my only method of getting incoming links was from PPS. It is dormant, yet it still has a considerable amount of real human traffic, which makes it an ideal gateway for me to get some incoming traffic.
However, as I’ve mentioned earlier, having to manually ping it is a waste of time. I decided that I should start building a script that would automatically ping PPS and have incoming links generated for the posts in my experimental splog.
Day 5: Dissecting the PPS ping form
For the benefit of those unfamiliar with PPS, let me just do a quick overview here. PPS is an acronym for Project Petaling Street. It was started about five years ago and was the de facto Malaysian blog aggregator (or blogtal [short for blog portal] in the founders’ own words).
It operates using the trackback protocol and publishes all received trackbacks immediately onto its front page. There was only minimal moderation of content publishable on PPS. It essentially was a trackback-powered free-for-all web site. An alternative manual ping submission form is also available.
Over the past two years or so, PPS has essentially run on auto-pilot. The already tiny amount of moderation it had has become non-existent. The trackback URL is no longer working. The quality of posts being published has also declined significantly. Nevertheless, PPS still remains popular (an Alexa ranking below 75k) and even has a healthy PR5 front page.
One thing that still works, though, is the manual ping submission form. It is this form that I would like to exploit to automatically submit the entries in my experimental splog.
The ping form is located on an HTTP password-protected page. This would be the least of my problems, as the username and password for the page can simply be supplied in the http://user:password@example.com/ format. The challenging part would be how to automate or emulate the form submission process programmatically.
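To illustrate what that URL format does, here is a minimal sketch of how credentials embedded in a URL map onto the HTTP Basic `Authorization` header that a client sends. The URL and credentials are placeholders, and `basic_auth_header` is a hypothetical helper name, not part of any library.

```python
import base64
from urllib.parse import urlsplit


def basic_auth_header(url: str) -> str:
    """Build the Basic Authorization header value implied by a user:password@host URL."""
    parts = urlsplit(url)
    if parts.username is None or parts.password is None:
        raise ValueError("URL does not carry both a username and a password")
    # Basic auth is simply base64("username:password")
    credentials = f"{parts.username}:{parts.password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Placeholder URL and credentials, for illustration only
header = basic_auth_header("http://user:password@example.com/")
print("Authorization:", header)
```

Note that Basic authentication only encodes the credentials, it does not encrypt them, so anyone who can read the request can recover the password.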
To do this, one has to understand the workings behind HTML forms. The tool I use to do just this is a Firefox addon called Live HTTP Headers. With it, you can record, save and even replay the entire HTTP request and response exchange in your browser.
I spent most of my project time on that day examining the form submission process and studying the tools that could help me to do this automatically. After I had shortlisted the tools I could use, I left it at that and decided to continue the next day.
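As a rough illustration of what a header-recording tool reveals, a typical HTML form POST carries its fields in the request body as URL-encoded `name=value` pairs. The sketch below builds such a body with Python's standard library. The field names and values are hypothetical, since the actual fields of the PPS form are not listed in this post.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical field names and values; the real form's fields are not given here.
form_fields = {
    "title": "Hello & welcome",
    "excerpt": "A short summary of the post.",
    "url": "http://example.com/?p=1",
}

# This is the body a browser would send with
# Content-Type: application/x-www-form-urlencoded
body = urlencode(form_fields)
print(body)

# Decoding the body recovers the original fields
decoded = {name: values[0] for name, values in parse_qs(body).items()}
```

Special characters such as `&` and `=` inside a value are percent-encoded, which is why the raw body in a header recorder can look quite different from what was typed into the form.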
Day 6: Building the auto-pinger
From what I’ve gathered yesterday, I figured that the best method for me to automatically ping PPS via its submission form would be by using curl. So what is curl? Taken from its web site:
curl is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, LDAP, LDAPS and FILE. curl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, kerberos…), file transfer resume, proxy tunneling and a busload of other useful tricks.
Sounds like a marketing brochure for a Microsoft product? Sure does… but to be honest, curl does deliver. It can, in fact, do all of that!
Just from that description I already know that curl would be the perfect tool for me to automate my ping submission to the PPS form. The form used by PPS is a typical HTTP POST form which requires authentication. Nothing good old curl couldn’t handle.
So now that I know my experimental splog will be using curl to perform the automated form submission to PPS, the next question would be “Which curl?”
You see, most production PHP builds include curl features via libcurl. Then there’s the good old command line curl. Essentially, the functions that I would be employing for my experiment are available in either form of curl. I chose to go with the command line version of curl because I felt that its documentation is more detailed.
Now that we’ve got our form manipulation tool, let’s plan how we could perform the automatic pinging. Below is my implementation plan:
- Choose a random post that hasn’t been sent to PPS
- Grab its title, excerpt and URL to be used in the PPS ping form
- Submit the data gathered above to the form while automatically supplying the login information
- Mark the post as pinged so that it won’t be used in the future
Wow, as simple as four steps! But I know that it would not be so easy to convert concept to code.
Now some of you may wonder why I bother to mark posts that have already been submitted. Surely we can get more traffic just by submitting random posts ad infinitum. If I were to really be a splogger then of course I would do this! However, please remember that the main purpose of this exercise was as an experiment. Therefore, I’m obliged to respect the PPS policy which states:
Anyone found abusing this facility (e.g. multiple pings, repeatedly putting in false information, etc.) will be permanently banned. IP addresses, blog URLS and blog names can be used to identify and ban offending websites. Final discretion on the definition of “abuse” lies with the PPS Administrators.
Multiple pings (of the same post) are against the PPS policies, so that is a no-no and I won’t do it for my experimental splog. That is why I must make absolutely sure that no duplicate pings are sent.
WordPress makes this very easy to do because there’s a field called pinged in the wp_posts table. This table holds all the necessary information that we would need to perform the auto-pinging.
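The bookkeeping idea (pick one random item that hasn’t been submitted, then mark it so it is never picked again) can be sketched against a small in-memory SQLite table. This is only an illustration under stated assumptions: the `posts` table, its columns and the helper names are simplified stand-ins, not the real WordPress schema and not the script used in this experiment, and nothing is actually submitted anywhere.

```python
import sqlite3


def pick_unpinged(conn):
    """Return (id, title, excerpt, url) of one random unpinged post, or None."""
    return conn.execute(
        "SELECT id, title, excerpt, url FROM posts "
        "WHERE pinged = '' ORDER BY RANDOM() LIMIT 1"
    ).fetchone()


def mark_pinged(conn, post_id, target):
    """Record where the post was submitted so it is never selected again."""
    conn.execute("UPDATE posts SET pinged = ? WHERE id = ?", (target, post_id))
    conn.commit()


# Stand-in table; an empty 'pinged' value means "not submitted yet"
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, excerpt TEXT, "
    "url TEXT, pinged TEXT NOT NULL DEFAULT '')"
)
conn.executemany(
    "INSERT INTO posts (title, excerpt, url) VALUES (?, ?, ?)",
    [(f"Post {n}", f"Excerpt {n}", f"http://example.com/?p={n}") for n in (1, 2, 3)],
)

seen = []
while (post := pick_unpinged(conn)) is not None:
    seen.append(post[0])
    mark_pinged(conn, post[0], "http://example.com/ping")  # placeholder target

print(sorted(seen))
```

Because each post is marked right after it is selected, the loop visits every post exactly once and then stops, which is the property that rules out duplicate pings.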
Once I’ve worked out the SQL statement that would grab the post title, excerpt and URL, it’s time to pass this information to curl for it to do its magic. I decided to use PHP as the main language for this purpose as I would like the freedom to run it from the command line and browser.
This exercise took me almost the whole day because of my unfamiliarity with the curl command line options, the need to split my time between the project and my day-to-day duties, and other normal daily distractions. I’m proud though that at the end of the day, I finally got it somewhat working.
I won’t publicly disclose the source code of my auto-pinging script because:
- It’s too easy to write, and anyone with a decent grasp of PHP can do it in less than a day
- I’ve already proven that it can be done and have nothing more to prove
- It can cause more unnecessary grief to the PPS folks
Day 7: Testing and deploying the auto-pinger
By today, the number of posts in the experimental splog had risen to about 400. I have more than enough data to experiment with my auto-pinging script.
Testing and fine-tuning the script taught me a lot about what curl is capable of. It’s an understatement to say that it is a good URL manipulation tool. It can do virtually anything that the HTTP/HTTPS protocol supports, which makes it more of a web protocol Swiss army knife!
I found that I can fake a referring URL and UserAgent data among other things. Although my intention in learning more about curl was to automate an experiment, the experience has shown me that curl would definitely be a good tool to study for future projects.
After 20 or so rounds of trial and error testing, the script finally worked as it should. I set up a cron job to run the script every 15 minutes (PPS policies state that only one ping is allowed within a 7 minute period).
I was done with the traffic-getting portion of this experiment and could now just let the whole thing run on auto-pilot. After about two hours, it looked like the script had been performing well. So I decided to just let it do its job for the rest of the day.
Day 8: Obvious problem with the scheduling
I woke up at 5:00am and decided to check out how my experimental splog’s pings had been performing. Lo and behold, half the page was covered with pings from my splog! This is unacceptable because it’s an obvious sign that the pings were automated.
It was entirely my fault though. Even as I exercised due care not to violate the PPS ping policy, I failed to take into account the average “waking hours” of Malaysians. I have been sending out four pings every hour… even in ungodly hours of the day!
So it’s back to the crontab. I changed the schedule so that my script runs only from Monday to Saturday, at xx:18 and xx:38, where xx is from 9am to 12 midnight.
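To make the effect of the new schedule concrete, here is a small sketch that models it and counts the runs per day. It assumes that “9am to 12 midnight” means the last runs of the day happen at 23:18 and 23:38. Under that assumption the schedule corresponds to a crontab time specification of `18,38 9-23 * * 1-6`. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed reading of the schedule: 18,38 9-23 * * 1-6
RUN_MINUTES = {18, 38}
RUN_HOURS = range(9, 24)     # 09:xx through 23:xx
RUN_WEEKDAYS = range(0, 6)   # datetime.weekday(): Monday=0 ... Saturday=5


def should_run(moment: datetime) -> bool:
    """True if the schedule would fire at this minute."""
    return (
        moment.weekday() in RUN_WEEKDAYS
        and moment.hour in RUN_HOURS
        and moment.minute in RUN_MINUTES
    )


def runs_on(day: datetime) -> int:
    """Count how many times the schedule fires during the given calendar day."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return sum(should_run(start + timedelta(minutes=m)) for m in range(24 * 60))


monday = datetime(2024, 1, 1)  # a Monday
sunday = datetime(2024, 1, 7)  # a Sunday
print(runs_on(monday), runs_on(sunday))
```

Compared with the old every-15-minutes schedule (four pings an hour around the clock, or 96 a day), this works out to 30 pings on each working day and none on Sundays, with gaps of at least 20 minutes between pings.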
After 60 days: The conclusion
After running the experimental splog for 60 days from birth, I’m sure you’d be wondering if it had made me a couple of million dollars. I’m happy to tell you that… it didn’t!
I made less than $10 (USD) from my experiment. I can make that much within days from proper web sites, so I’m pretty much disappointed with the results. I’m not disappointed that I didn’t make much money from it… I’m really disappointed that people would resort to splogging just for a few dollars.
Maybe I didn’t do the experiment “right”. Maybe my chosen “niche” was not a well-paying one… But from what I’ve seen, splogging was definitely not worth it.
Another surprising thing I discovered was that among the top three search engines, Google indexed the most of my splog’s pages (600 plus pages). Yahoo! was second (indexed 20+ pages), and Live Search only indexed two pages. I’m very surprised by this result, especially after all the media coverage Google has been receiving on its war against web spam.
So in the end, all I got was less than $10 and people cursing me for spoiling PPS. Well, I’m sorry that I ruined your PPS experience; but if there’s another conclusion of my experiment, it would be that PPS definitely needs more tender loving care.