I thought that it would be useful to post a couple SQL statements that could easily be used to find records in your PADDATA table that are copies of others...
Finding duplications when title or company is changed
The hijackers sometimes just submit their new version of the PAD under a completely new company name or title - thus, getting around the addpad.php code. This creates duplicate listings on your site. You can find the duplicated listings by running an exhaustive query on the descriptions. I am not sure if there is a better way or better SQL to maximize the efficiency of the query, but here goes...
Be warned that this may take quite a while to execute on a large database - and that this caused my SQL server to be unresponsive to other queries!!! You may want to do this with an offline copy of your database rather than a live website database.
- Code: Select all
select descs.description, descs.padfile, copies.description, copies.padfile
from (select description, padfile from paddata) as descs
inner join (select description, padfile from paddata) as copies
on (descs.description = copies.description) and (descs.padfile <> copies.padfile)
Finding records with carriage returns ("/n") in themSome hijackers change the title so that the code in addpad.php doesn't think it is the same company; since the "/n" character is not displayed, the title looks normal to the end-user.
- Code: Select all
select *
from paddata
where title like '%\n%'
Once you've found some hijacked records, you can either delete them or you can disable them. To disable them requires code changes as well as adding a field to the paddata table. If your site is indexed, it may be better to disable the records and display a custom 404 page for each. We chose to do this at BetterWindowsSoftware; if you land on one of the hijacked PAD pages here, you will get our special 404 page that enables the user to find the
original content that was duplicated.
Dealing with overwritten dataThere is no known way of dealing with this that doesn't involve additional code and some kind of addition to the database. PADKit webmasters have wrestled with this issue since the first hijacker overwrote somebody's data.
A new table could be created named PADDATA_MIRROR to mirror simply the "title/company" + "padfile" values of initial submissions. With this new table, it would be possible to scan for any records that don't match your PADDATA_MIRROR data.
- Code: Select all
CREATE TABLE `paddata_mirror` (
`title` VARCHAR( 50 ) NOT NULL ,
`company` VARCHAR( 50 ) NOT NULL ,
`padfile` VARCHAR( 128 ) NOT NULL ,
`submittime` TIMESTAMP NOT NULL ,
PRIMARY KEY ( `padfile` )
);
Open addpad.php - insert this code after:- Code: Select all
if (!mysql_query("replace into paddata ($flist) values($vlist)", $link_id)) die("Error updating paddata table");
- Code: Select all
if (!($replace)) {
$flist = "company, title, padfile, submittime";
$vlist = "'$company','$title','$padfile',now()";
if (!mysql_query("insert into paddata_mirror ($flist) values($vlist)", $link_id)) die("Error updating paddata_mirror table");
}
We aren't doing anything special at BetterWindowsSoftware with respect to these overwritten records. If we find any, we manually import the original PAD. Furthermore, we aren't planning to write a MOD to do this, but we may incorporate the PADDATA_MIRROR table to make things a bit easier when it comes to restoring the initial data.