sparkwarc: Load WARC Files into Apache Spark

Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This makes it possible to read files from the Common Crawl project.
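
For readers unfamiliar with the format, the sketch below illustrates what a WARC record looks like: a version line, a block of named headers, a blank line, and then a payload whose size is given by Content-Length. It uses only the Python standard library and is not how sparkwarc itself is implemented. The helper names (iter_warc_records, make_record) and the two sample records are hypothetical, made up for this illustration.

```python
import gzip
import io


def iter_warc_records(stream):
    """Yield (headers, payload) pairs from a binary WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # skip blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)

        # Header block: "Name: value" lines, terminated by a blank line.
        headers = {}
        while True:
            line = stream.readline()
            if not line.strip():
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()

        # The payload is exactly Content-Length bytes long.
        length = int(headers.get("Content-Length", 0))
        payload = stream.read(length)
        yield headers, payload


def make_record(uri, body):
    """Build one minimal, hypothetical WARC response record (assumption)."""
    head = (
        "WARC/1.0\r\n"
        "WARC-Type: response\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + body + b"\r\n\r\n"


# Two made-up records, gzip-compressed the way .warc.gz files are.
sample = gzip.compress(
    make_record("http://example.com/a", b"<html>A</html>")
    + make_record("http://example.com/b", b"plain text")
)

with gzip.open(io.BytesIO(sample), "rb") as stream:
    records = list(iter_warc_records(stream))

# Count the records whose payload looks like HTML.
html_count = sum(1 for _, payload in records if b"<html" in payload)
```

Real Common Crawl archives are far larger and carry many more headers per record, which is why a distributed engine such as Spark is a natural fit for processing them.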

Version: 0.1.6
Imports: DBI, sparklyr, Rcpp
LinkingTo: Rcpp
Published: 2022-01-11
DOI: 10.32614/CRAN.package.sparkwarc
Author: Javier Luraschi [aut], Yitao Li [aut], Edgar Ruiz [aut, cre]
Maintainer: Edgar Ruiz <edgar at>
License: Apache License 2.0
NeedsCompilation: yes
SystemRequirements: C++11
Materials: README
CRAN checks: sparkwarc results


Reference manual: sparkwarc.pdf


Package source: sparkwarc_0.1.6.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release (arm64): sparkwarc_0.1.6.tgz, r-oldrel (arm64): sparkwarc_0.1.6.tgz, r-release (x86_64): sparkwarc_0.1.6.tgz, r-oldrel (x86_64): sparkwarc_0.1.6.tgz
Old sources: sparkwarc archive


Please use the canonical form to link to this page.