any23
Apache Any23 Project
Apache Anything To Triples (Any23) is a library and web service that extracts
structured data in RDF format from a variety of Web documents.
Any23 documentation can be found on the website
Online Documentation
For details on the command line tool and web interface, see here
For a guide to using Any23 as a library in your Java applications, see here
Javadocs are available here
Community
You can reach us and connect with our community on our mailing lists.
Build Any23 from Source Code
The canonical Any23 source code lives at
https://github.com/apache/any23.git
Make sure Apache Maven 3.x or later is installed and included in your $PATH.
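One way to check this prerequisite before building is sketched below. The helper name require_tool is an illustrative assumption and not part of Any23; it only tests whether a command can be resolved on $PATH.

```shell
# Hypothetical helper (not part of Any23): succeeds only if the named
# command can be found on $PATH.
require_tool() {
  command -v "$1" >/dev/null 2>&1
}

if require_tool mvn; then
  # Prints the installed Maven version; confirm it reports 3.x or later.
  mvn -version
else
  echo "Maven not found on PATH; install Apache Maven 3.x or later" >&2
fi
```

The sketch does not compare version numbers itself; it leaves that check to the reader.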
Clone the source:
git clone https://github.com/apache/any23.git
Navigate and build:
cd any23
mvn clean install
The build installs the Any23 artifacts and their dependencies in your
local Maven repository.
From now on, the any23 directory above is referred to as $ANY23_HOME.
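Since later commands refer to $ANY23_HOME, it can be convenient to define it as an actual shell variable. A minimal sketch, assuming the current shell is still in the top-level any23 checkout directory:

```shell
# Assumption: this is run from the top-level any23 checkout directory.
# Records that directory so later commands can expand $ANY23_HOME.
ANY23_HOME="$(pwd)"
export ANY23_HOME
echo "ANY23_HOME=$ANY23_HOME"
```

The variable only lasts for the current shell session unless it is also added to your shell profile.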
You can then extract the compiled code and use the command line interface.
Note that you must replace the version placeholder with the version of the tar or zip archive you are extracting.
tar -zxvf $ANY23_HOME/cli/target/apache-any23-cli-${version-SNAPSHOT}.tar.gz
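If you would rather not type the version by hand, the archive can be located with a glob. The following is a sketch only: find_cli_archive is a hypothetical helper (not part of Any23), and it assumes the archive name follows the apache-any23-cli-VERSION.tar.gz pattern shown above and that $ANY23_HOME is set as a shell variable.

```shell
# Hypothetical helper (not part of Any23): prints the first CLI tarball
# found in the given directory, or returns non-zero if there is none.
find_cli_archive() {
  for archive in "$1"/apache-any23-cli-*.tar.gz; do
    if [ -f "$archive" ]; then
      printf '%s\n' "$archive"
      return 0
    fi
  done
  return 1
}

# Usage: falls back to the current directory if ANY23_HOME is unset.
if archive=$(find_cli_archive "${ANY23_HOME:-.}/cli/target"); then
  tar -zxvf "$archive"
else
  echo "No CLI archive found; run 'mvn clean install' first" >&2
fi
```

If several versions have been built, the helper returns whichever one the glob lists first, so cleaning the target directory beforehand avoids ambiguity.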
Run the Any23 Command Line Tools
Any23 comes with some command line tools. Within the directory you just extracted, you can invoke:
Linux
$ANY23_HOME/cli/target/apache-any23-cli-${version-SNAPSHOT}/bin/any23
# Provides the main Any23 use case: metadata extraction on a file or URL source.
Windows
$ANY23_HOME/cli/target/apache-any23-cli-${version-SNAPSHOT}/bin/any23.bat
# Provides the main Any23 use case: metadata extraction on a file or URL source.
The complete documentation about these tools can be found here
The bin scripts are generated dynamically during the package phase.
To make sure they are generated, run the following from the top-level directory:
mvn package
You can avoid extracting the archive files by going to the bin folder generated for the cli module:
cd $ANY23_HOME/cli/target/appassembler/bin/
and finally invoke the script for your OS (UNIX or Windows):
bin$ ./any23
Usage instructions will be printed.
Generate the Documentation
To generate the project site locally, execute the following commands:
cd $ANY23_HOME
MAVEN_OPTS='-Xmx1024m' mvn [-o] clean site:site
You can speed up site generation by specifying the offline option [-o],
but this works only if all the required plugin dependencies have already been downloaded
to the local Maven repository.
If you’re interested in generating the Javadoc enriched with navigable UML graphs, you can activate
the umlgraphdoc profile. This profile relies on Graphviz, which must be
installed on your system.
cd $ANY23_HOME
MAVEN_OPTS='-Xmx1024m' mvn -P umlgraphdoc clean site:site
Munging of Any23 code to ASF
When it was decided to bring the Any23 code into the Apache Incubator, the existing code was migrated to the ASF infrastructure. The migration was documented and managed via a number of Jira tickets, e.g., INFRA-3978, INFRA-4146, and ANY23-29.