
ArchiveBox, a solution to create our own custom miniature Archive.org



Preserving online content is one of the great challenges of today's digital society. We generate more information than at any previous time in history, but much of it can be lost just as quickly as we produce it.



That is why initiatives such as Archive.org have emerged, seeking to create a content repository that reflects how the content of the web changes over time. But why settle for a single centralized repository when any user could create a reduced version to safeguard the websites most relevant to them?



ArchiveBox was born as an answer to this question. It is a self-hosted archiving solution (that is, we have to install it on our own computer or server) developed in Python (specifically, it requires Python 3.7 or higher), and it can be used on a range of operating systems.













According to its instructions for use, which cover everything needed to install and use it, "It only takes about 5 minutes to get the ArchiveBox up and running".
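As a rough illustration of the Python 3.7 requirement mentioned earlier, the following sketch checks the installed interpreter before any install is attempted. The install steps in the comments are an assumption (the article does not say which install method it has in mind), and the folder name "archive" is only a placeholder.

```shell
#!/bin/sh
# Check the Python 3.7+ prerequisite before attempting an install.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 7) else 1)' 2>/dev/null; then
    PY_OK=yes
else
    PY_OK=no
fi
echo "Python 3.7 or higher available: $PY_OK"

# If the check passes, one possible route (an assumption, not taken from
# the article) would be:
#   pip3 install archivebox
#   mkdir archive && cd archive    # "archive" is a placeholder folder name
#   archivebox init                # turns the folder into an ArchiveBox collection
```

The project's own instructions remain the reference for the supported install methods.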



Once installed, its operation is simple: we give it the URLs of the pages we want to archive, and the software itself takes care of saving each one in the appropriate formats, depending on the content of the site and on its own configuration.



So, for each URL added, ArchiveBox saves several kinds of HTML snapshots plus PNG and PDF screenshots to disk, as well as any relevant files in each case (compressed archives, multimedia, text). It is also capable of cloning complete Git repositories.







In addition, we can configure it to extract URLs automatically from other sources, such as our browser bookmarks, a list of RSS feeds, services like Pocket or Instapaper, saved Reddit posts, etc.
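A minimal sketch of importing from a feed might look like this. The feed address is a placeholder rather than a source named in the article, and the script assumes it is run from inside a folder already initialized as an ArchiveBox collection.

```shell
#!/bin/sh
# Placeholder feed address, used only for illustration.
FEED_URL="https://example.com/feed.rss"

# Only call ArchiveBox if it is actually installed.
if command -v archivebox >/dev/null 2>&1; then
    # --depth=1 asks ArchiveBox to archive the links found inside the feed,
    # not just the feed document itself.
    archivebox add --depth=1 "$FEED_URL"
else
    echo "archivebox not found; skipping import of $FEED_URL"
fi
```

For recurring imports, ArchiveBox also documents an archivebox schedule command (for example with --every=day); check its help output for the options available in your version.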



Once downloaded, the archived content can be viewed in the browser and navigated through the folders of the file system. We can also manage it (update, delete, etc.) from the command line. In fact, once ArchiveBox is installed, adding a URL to the program only requires running a command like:



$ archivebox add 'https://www.youtube.com/watch?v=M41k0SSfqa8'
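When there are several pages to archive, the same command can read its URLs from standard input. The sketch below assumes a hypothetical list file called urls.txt and a current folder already initialized as an ArchiveBox collection; the second URL is a placeholder.

```shell
#!/bin/sh
# Hypothetical list file; the name urls.txt is only an illustration.
URL_LIST="urls.txt"

# One URL per line. The first is the article's own example,
# the second is a placeholder.
cat > "$URL_LIST" <<'EOF'
https://www.youtube.com/watch?v=M41k0SSfqa8
https://example.com/
EOF

# Only call ArchiveBox if it is actually installed.
if command -v archivebox >/dev/null 2>&1; then
    # archivebox add accepts a list of URLs on standard input.
    archivebox add < "$URL_LIST"
else
    echo "archivebox not found; would have added $(grep -c . "$URL_LIST") URLs"
fi
```

Keeping the list in a file makes it easy to re-run the import or to review what was queued.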





