The World
Dataset CC-MAIN-2014-10
Analyzed data: 35.6 TB
PDF files found: 2 346 781
Total size of PDF files: 471 GB

1. The results

  • Processed 55 700 warc.gz files, approx. 685 MB each
  • Found 2 346 781 PDF files
  • The total size of PDF files is 471 gigabytes
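
As a rough cross-check of the figures above, the sketch below estimates the processed volume from the file count and approximate file size, and derives the average PDF size. It assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), which the source does not specify, and the variable names are purely illustrative.

```python
# Figures taken from the results above; decimal units are an assumption.
WARC_FILES = 55_700
WARC_SIZE_MB = 685          # approximate size of one warc.gz file
PDF_FILES = 2_346_781
PDF_TOTAL_GB = 471

# Estimated volume of the processed archives, in terabytes.
processed_tb = WARC_FILES * WARC_SIZE_MB / 1_000_000

# Average size of one PDF file, in kilobytes.
avg_pdf_kb = PDF_TOTAL_GB * 1_000_000 / PDF_FILES

# Average number of PDF files found per warc.gz file.
pdfs_per_warc = PDF_FILES / WARC_FILES

print(f"processed: ~{processed_tb:.1f} TB")
print(f"average PDF size: ~{avg_pdf_kb:.0f} kB")
print(f"PDFs per warc.gz: ~{pdfs_per_warc:.0f}")
```

Because the per-file size is only approximate, the estimated volume should land in the same range as the analyzed data volume given in the summary rather than match it exactly.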

The following reports were created in the form of charts:

  • Number of PDF files by producer software
  • Number of PDF files by creator (generating program)
  • Number of PDF documents by specification version
  • Number of PDF documents by page count
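
The specification version of a PDF document is declared in its file header, which starts with `%PDF-` followed by the version number (for example `%PDF-1.4`). The source does not describe how the versions were extracted, so the following is only a minimal sketch of one possible approach; `pdf_version` and the sample byte strings are hypothetical.

```python
import re
from collections import Counter
from typing import Optional

# A PDF header looks like "%PDF-1.4"; search near the start of the file
# because some files carry a few stray bytes before the header.
_HEADER = re.compile(rb"%PDF-(\d\.\d)")

def pdf_version(data: bytes) -> Optional[str]:
    """Return the version declared in the PDF header, or None if absent."""
    match = _HEADER.search(data[:1024])
    return match.group(1).decode("ascii") if match else None

# Hypothetical sample payloads standing in for downloaded files.
samples = [
    b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n",
    b"%PDF-1.7\n",
    b"%PDF-1.4\n",
    b"<html>",  # not a PDF, so it is skipped
]

# Tally how many sample files declare each version.
versions = Counter(v for v in map(pdf_version, samples) if v is not None)
print(versions.most_common())
```

Reading only the header keeps the check cheap, but a document catalog may carry a `/Version` entry that overrides the header value, so a stricter count would need a full PDF parser.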

The producer and creator charts include only entries with more than 10 000 files.
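
The 10 000-file cutoff can be applied with a simple count-and-filter step. The sketch below assumes the producer (or creator) strings have already been extracted from the files; the function name, the names in the table, and the counts are made-up placeholders, not results from the dataset.

```python
from collections import Counter

THRESHOLD = 10_000  # cutoff stated in the text

def entries_above(counts: Counter, threshold: int = THRESHOLD) -> list[tuple[str, int]]:
    """Return (name, count) pairs with count above threshold, largest first."""
    return [(name, n) for name, n in counts.most_common() if n > threshold]

# Placeholder counts for illustration only.
producer_counts = Counter({
    "Producer A": 125_000,
    "Producer B": 10_000,   # exactly at the cutoff, so it is excluded
    "Producer C": 42_500,
    "Producer D": 310,
})

for name, n in entries_above(producer_counts):
    print(f"{name}: {n}")
```

The comparison is strict (`>`), matching the wording "above 10 000"; an entry with exactly 10 000 files would not appear in the chart under this reading.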
