Wednesday, November 4, 2009

Open Source Link Checking Tools

It is important to make sure that there are no broken links on your site. A link may work when you first add it, but as the days, months, and years go by, it may end up broken.

Below are some open source tools that can help you find those broken links.

  • Bugkilla – J2EE Functional Test Suite: Bugkilla will be a set of Java tools for functional testing of J2EE Web applications. Specification and execution of tests will be automated for the Web front end and the business logic layer.

  • DLC (Dead Link Check): Can generate HTML output for easy checking of the results, and can process a link cache file to speed up repeated requests (the lifetime of cached links is enforced by time stamp). Written in Perl.

  • ht://Check: Outputs a simple report. It can retrieve information through HTTP/1.1 and store it in a MySQL database. Most of the information is presented through the PHP interface that comes with the package, which queries the database built by the htcheck program. Requirement: a Web server, and PHP with the MySQL connectivity module.

  • InSite: A site management tool written in Perl. Requirement: Linux. Requires libwww and MIME::Lite (available at any CPAN mirror).

  • Jenu: A multithreaded, Java 1.3 (Swing) based Web site link checker. It’s a copy of a nice multithreaded link checker for the PC called Xenu. Requirement: Java 2 (1.3) runtime.

  • JSpider: A Web spider engine. It is a robot that generates Web traffic, just as you do when you browse the Internet. You can control and configure the robot’s behaviour to adapt it to your needs. On its way through sites, it gathers all kinds of information you might be interested in. You can use a Web spider for different purposes: searching for dead links (404 errors) on your website, testing your site’s performance under heavy load, copying an entire site to your hard disk, etc. Requirement: Linux, Solaris, Windows, or another Java-enabled operating system.

  • LinkChecker: A link management solution integrated into Plone. Requirement: Plone 2.0.5, 2.1, or 2.5 beta.

  • Linklint: A Perl program that can check both local files and HTTP sites. Creates a report of which URLs have changed since the last check.

  • Link Page Generator: Automatic link management program with a -check option for marking or eliminating bad links (for use in a cron job). Written in Perl.

  • LinkVerify: Checks whether all references to external resources in a set of hypertext files are valid. In HTML this applies mostly to hyperlinks and embedded images. Style sheets are checked too. Requirement: Java 1.1.

  • SourceForge – LinkChecker: With LinkChecker you can check your HTML documents for broken links. Requirement: Python 2.2.1. For HTTPS support you need to compile Python with the SSL _socket module.

  • W3C Link Checker: Checks that all the links in your HTML document are valid. There is a command-line interface and an online version. The Link Checker can easily be installed on your own server.

  • Xenu’s Link Sleuth: Checks Web sites for broken links. Link verification is done on “normal” links, images, frames, plug-ins, backgrounds, local image maps, style sheets, scripts and Java applets. NOTE: This one is free but NOT Open Source.
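
For readers curious about what these tools do at their core, here is a minimal sketch in Python of the basic idea: pull the link targets out of a page, resolve them against the page's address, and flag any that come back with an error status. It is not taken from any of the tools listed above. The names LinkExtractor, extract_links, http_status and find_broken_links are made up for this illustration, and it assumes that looking at href and src attributes and treating any status of 400 or above as "broken" is good enough.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collects the values of href and src attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs found in html, in order, without duplicates."""
    parser = LinkExtractor()
    parser.feed(html)
    found = []
    for link in parser.links:
        absolute = urljoin(base_url, link)
        # Skip mailto:, javascript: and other non-HTTP schemes.
        if absolute.startswith(("http://", "https://")) and absolute not in found:
            found.append(absolute)
    return found


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was received."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # the server answered, but with an error status such as 404
    except (URLError, OSError, ValueError):
        return None  # DNS failure, refused connection, timeout, malformed URL


def find_broken_links(html, base_url, fetch=http_status):
    """Return (url, status) pairs for links with no response or a status >= 400."""
    broken = []
    for url in extract_links(html, base_url):
        status = fetch(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Calling find_broken_links(page_html, page_url) on a saved page returns the list of links that failed. The fetch parameter is there so the network call can be swapped out, for example with a dictionary lookup while testing. The tools above go well beyond this sketch: they crawl whole sites, cache results, and produce reports.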

Please comment below with your thoughts on any of the tools listed above, or if you know of any other tools that should be added.