The World
CC-MAIN-2014-10 - Urls
  • Unique PDF Urls: 26 250 788
  • Number of downloaded files: 16 857 835
  • Size in gigabytes: 23 143

1. The results

  • The total number of analyzed files is 16 857 835
  • The total size of downloaded PDF files is 23 143 gigabytes

The following reports were created in the form of charts:

  • Number of PDF files by producer (producing software)
  • Number of PDF files by creator (generating program)
  • Number of PDF documents by specification version
  • Number of PDF documents by page count

The graphs present information for producer/creator where the number of files is above 50,000.
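
The cutoff described above could be applied roughly as in the following sketch. The function name `filter_for_chart` and the producer names and counts are hypothetical placeholders for illustration only; they are not results from the analysis.

```python
from collections import Counter

# Cutoff stated in the text: only entries above 50,000 files are charted.
THRESHOLD = 50_000


def filter_for_chart(counts, threshold=THRESHOLD):
    """Return (name, count) pairs strictly above the threshold, largest first."""
    return [(name, n) for name, n in counts.most_common() if n > threshold]


# Made-up example data (NOT taken from the report).
producer_counts = Counter({
    "Producer A": 120_000,
    "Producer B": 50_000,   # exactly at the cutoff, so it is excluded
    "Producer C": 75_500,
    "Producer D": 1_200,
})

chart_rows = filter_for_chart(producer_counts)
print(chart_rows)
```

The comparison is strict (`>`), matching the wording "above 50,000"; the same function would work for the creator counts.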


2. Conclusion

Comparing the numbers, it is easy to see a loss of roughly 9.4 million files (26 250 788 unique PDF URLs versus 16 857 835 downloaded files). The loss is caused by:

  • HTTP errors such as 404 (Not Found), 500 (Server Error), etc.
  • multiple URLs pointing to the same PDF file
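
The size of the loss can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 26 250 788 is the number of unique PDF URLs and 16 857 835 is the number of downloaded files, as the summary figures suggest; the variable names are illustrative.

```python
# Figures from the summary; their pairing with labels is an assumption.
unique_pdf_urls = 26_250_788
downloaded_files = 16_857_835

# URLs that did not result in a downloaded file.
lost = unique_pdf_urls - downloaded_files
lost_share = lost / unique_pdf_urls

print(f"URLs without a downloaded file: {lost:,}")
print(f"Share of unique PDF URLs lost: {lost_share:.1%}")
```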