Imagine that one of your web pages has somehow been deleted from the Internet. It may have happened accidentally, or someone may have done it deliberately. Either way, it is frustrating to see all your diligence go in vain. If you are a victim of this kind of accident, then this post is for you!
Whenever you try to open a deleted web page, the browser shows a “404 error”. It means the page is missing from the server or has been moved to another location. Basically, there are two ways to retrieve your website's pages: recovering them by hand from saved snapshots, or letting a tool collect the snapshots for you.
All the major search engines keep a snapshot of each web page they index in their cache. Whenever you search for a keyword, millions of results come up. At the bottom of each result, beside the URL, you may have noticed a link labeled “Cached”. Clicking that link opens a snapshot of the page, taken on a particular date and at a particular time.
So, to retrieve the lost page, search on any of the major search engines (Google, Yahoo, Windows Live Search, etc.) with the proper keywords and click the “Cached” link to get your web page back.
If none of the search engines can provide a snapshot, you can try the Wayback Machine. It is a very large online Internet archive that holds snapshots of more than 10 billion web pages. So just go to the Wayback Machine, type the URL of your lost web page, and you will be presented with a list of the snapshots archived there.
Note that the Wayback Machine makes snapshots available only 6 to 18 months after the pages were archived or modified. So, if your lost site is quite recent, the search engine cache will be a better option.
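If you have many URLs to check, the Wayback Machine lookup can also be scripted. Here is a minimal sketch in Python, assuming the Internet Archive's public availability endpoint at archive.org/wayback/available and its JSON reply with an "archived_snapshots" entry. The function names are my own, and example.com stands in for your own site.

```python
import json
import urllib.parse
import urllib.request

# Public availability endpoint of the Internet Archive (assumed unchanged).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_lookup_url(page_url, timestamp=None):
    """Build the query URL for a lost page; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the snapshot URL from a parsed JSON reply, or None if nothing is archived."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def find_snapshot(page_url, timestamp=None):
    """Ask the Wayback Machine for the snapshot closest to the given date (needs network)."""
    lookup_url = build_lookup_url(page_url, timestamp)
    with urllib.request.urlopen(lookup_url, timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling find_snapshot("example.com/lost-page.html") returns the address of an archived copy if one exists, or None otherwise. Passing a timestamp asks for the copy closest to that date instead of the latest one.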
If your site is quite large, the process described above may be very time-consuming and a lot of hard work. So a process that automatically recovers all the snapshots of your site would obviously be more convenient.
Warrick is a free utility that helps you rebuild your site after it has been accidentally deleted. You just go to Warrick and fill in the required information. After some time, it will mail all of the snapshots of your site to you, and you will be able to set up your site once again. You can recover an entire site or just a single page.
Frank McCown developed this tool at Harding University. Warrick's algorithm works quite well. It scans the best-known repositories, such as the Internet Archive, Google, Bing and Yahoo, collects all the missing resources from them, and mails you the most recent snapshots taken.
One thing I must tell you: start recovering your web pages as early as possible after the site is deleted. The reason is that search engines crawl and re-crawl a web page at regular intervals. If a page is deleted and the search engines re-crawl it in the meantime, you risk losing that page permanently, because the empty page is then taken as the snapshot. So make sure you do not wait too long before recovering your pages.
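To illustrate the idea of taking the most recent snapshot from several repositories, here is a small sketch in Python. It is not Warrick's actual code. The Snapshot class and the most_recent_snapshot function are hypothetical names made up for this example, and treating a blank page body as "empty" is my own simplification of the re-crawl problem described above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class Snapshot:
    repository: str        # e.g. "Internet Archive" or "Google"
    captured_at: datetime  # when the repository saved the page
    content: str           # the saved page body


def most_recent_snapshot(snapshots: Iterable[Snapshot]) -> Optional[Snapshot]:
    """Pick the newest snapshot that still has content, or None if all are empty."""
    # Skip empty captures, which may have been taken after the page was deleted.
    usable = [s for s in snapshots if s.content.strip()]
    if not usable:
        return None
    return max(usable, key=lambda s: s.captured_at)


candidates = [
    Snapshot("Internet Archive", datetime(2008, 1, 10), "<html>old copy</html>"),
    Snapshot("Yahoo", datetime(2008, 3, 2), "<html>newer copy</html>"),
    Snapshot("Google", datetime(2008, 4, 15), ""),  # re-crawled after deletion
]
best = most_recent_snapshot(candidates)
```

In this made-up data the latest capture is empty because it was taken after the deletion, so the sketch falls back to the newest copy that still has content. This is exactly why waiting too long is risky: once every repository has re-crawled the deleted page, nothing usable is left.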
For some large sites, Warrick might take a fairly long time to gather all the missing resources, but it undoubtedly saves you labor as well as time. It is not only an online tool; it can also be run locally on your PC. For this, you have to download and install some Perl source files.
So, don’t be shattered after losing your web pages. Take a deep breath and start recovering them. And most important of all, be careful this time!