Crawljax

Crawling Ajax-based Web Applications


Release 3.0 Is Out

Release 3.0 is out. This major release contains several bug fixes and many enhancements: the code base has been split up into modules, the API has changed a little, and the crawl overview plugin has been completely renovated.

New overview plugin

Most notable in the new release is the new overview plugin. The plugin shows an interactive state graph of the crawl and some statistics. Make sure you check out a demo or try one of our examples yourself.

[Screenshot: the new overview plugin]

New command line interface

This release adds more support for command-line configuration. Once you have downloaded the zip, you can run Crawljax with the command:

java -jar crawljax-cli-version.jar http://your.site.com outputfolder

Crawljax will crawl that site with the new crawl overview plugin enabled. Run java -jar crawljax-cli-version.jar without arguments to see a list of the available options for the crawl.

The zip is downloadable from the central Maven repository.
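
For scripted or repeated crawls, the same command can also be started from Java. The following is only a minimal sketch, not part of Crawljax: the class name CrawljaxCliLauncher and its methods are hypothetical, and the path to the CLI jar is passed in by the caller because the exact file name depends on the version you downloaded.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper (not part of Crawljax): wraps the documented
// "java -jar <cli-jar> <url> <outputfolder>" command in a child process.
public class CrawljaxCliLauncher {

    /** Builds the argument list for: java -jar cliJar url outputFolder */
    public static List<String> buildCommand(String cliJar, String url, String outputFolder) {
        return List.of("java", "-jar", cliJar, url, outputFolder);
    }

    /** Starts the crawl as a child process and returns its exit code. */
    public static int run(String cliJar, String url, String outputFolder)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(cliJar, url, outputFolder))
                .inheritIO() // forward the crawler's console output to this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: CrawljaxCliLauncher <cli-jar> <url> <output-folder>");
            System.exit(2);
        }
        System.exit(run(args[0], args[1], args[2]));
    }
}
```

The command is built as a list of separate arguments rather than one string, so URLs or folder names containing spaces are passed through unchanged.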

Other important updates

  • Crawljax is now configured using a configuration builder. You start your configuration using CrawljaxConfiguration.builderFor("http://your.website.com");.
  • The project has been split up into three modules: core, cli and examples. The cli module contains the command line interface. The core module can be included in any project as a jar to run Crawljax programmatically. The examples module is the easiest way to try out several configurations of Crawljax in your favorite IDE. Check out our updated documentation for more details.
  • You can configure the crawler to crawl all found href attributes, even if the elements are not visible because they only show up when the crawler hovers over another element.
  • You can now configure the crawler so that it does not click any children of a certain element using a short syntax like dontClickChildrenOf("LI").withId("dontClickMe");.
  • Major performance and stability improvements.
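
Put together, the builder-related items above might look roughly like the configuration fragment below. Treat it as a sketch under assumptions rather than the definitive API: only builderFor(...) and dontClickChildrenOf("LI").withId("dontClickMe") are taken from this announcement. The package names, the crawlRules() accessor and the crawlHiddenAnchors(true) switch are assumptions that should be checked against the documentation of the version you use. The class name CrawlConfigSketch is hypothetical, and the fragment needs the core module on the classpath to compile.

```java
// Assumed package layout of the core module; verify against your version.
import com.crawljax.core.configuration.CrawljaxConfiguration;
import com.crawljax.core.configuration.CrawljaxConfiguration.CrawljaxConfigurationBuilder;

// Hypothetical class name, used for illustration only.
public class CrawlConfigSketch {

    public static void main(String[] args) {
        // Entry point named in the release notes.
        CrawljaxConfigurationBuilder builder =
                CrawljaxConfiguration.builderFor("http://your.website.com");

        // Short syntax from the release notes: never click anything nested
        // inside <li id="dontClickMe">.
        // Assumption: click rules are reached through builder.crawlRules().
        builder.crawlRules().dontClickChildrenOf("LI").withId("dontClickMe");

        // Assumption: this is the switch that makes the crawler follow href
        // attributes of anchors that are not currently visible.
        builder.crawlRules().crawlHiddenAnchors(true);

        // Produces the configuration object; starting the crawl is omitted
        // here because the runner class is not named in this announcement.
        CrawljaxConfiguration config = builder.build();
    }
}
```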

You can view all closed issues or the full diff on GitHub.

Release 2.1 Is Out

Release 2.1 contains bug fixes and browser updates. You can find the solved issues here or look at the full diff here.

We’re already working on release 2.2. If you need anything fixed in Crawljax, make sure you file your issue!

We’ve Moved to GitHub!

Crawljax is alive and kicking again! We’ve started off by moving the Crawljax project to GitHub. This makes contributing to the project much easier, thanks to Git’s decentralized approach and GitHub’s pull requests.

All issues from Google Code have been imported into GitHub’s issue tracker, and the wiki has moved as well. With this new website we will keep everyone informed about Crawljax’s development.

Feel free to file an issue or fork our repo and generate a pull request!

The Crawljax Team