Showing posts from June, 2021

Nutch crawler and integration with Solr

Before moving ahead with this article, I assume you have Solr installed and running. If you would like to install Solr on Windows, macOS or via Docker, please read Setup a Solr instance. There are several ways to install Nutch, which you can read about in the Nutch tutorial; however, I have written this article for those who would like to install Nutch using Docker. I searched Google but could not find any help for installing Nutch with Docker, and I spent a good amount of time fixing issues specific to it. I have therefore written this article to save other developers that time.

Install Nutch using Docker:

1. Pull the Nutch Docker image using the command below,
     > docker pull apache/nutch
2. Once the image is pulled, run the container,
     > docker run -t -i -d --name nutchcontainer apache/nutch /bin/bash
3. You should now be able to enter the container and see the bash prompt. Because the container was started detached (-d), you may first need to open a shell in it, for example with docker exec -it nutchcontainer /bin/bash,
     > bash-5.1#

Let's set up a few important settings now:

1. Go to the bin folder,
     > bash-5.