Solr – Indexing:

In this tutorial, we will learn about Indexing in Solr.

Indexing in Solr simply means adding content to a Solr index so that the same content becomes searchable through Solr. A Solr index can take in this data in several ways: from XML and CSV files, directly from tables in a database, and from rich document formats such as Microsoft Word and PDF.

There are four ways of indexing the data into Solr:

Rich documents such as Microsoft Word and PDF files, which are binary formats, can be indexed using Solr Cell, which is built on Apache Tika.
XML, JSON, and CSV files can be indexed by sending HTTP requests to the Solr server, or through the Solr admin user interface, where we can select the document type and submit the content of the file for indexing.
We can also use the Java client API (SolrJ) to build a Java application that ingests the data.
Table data from a database can be indexed by configuring the Data Import Handler in solrconfig.xml, together with the schema.
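
As an illustration of the HTTP route, the sketch below posts a file to a collection's update handler with curl. It assumes Solr is listening on localhost:8983 and that a core named filminfo exists; the helper names solr_update_url and solr_post_file are hypothetical, not part of Solr.

```shell
# Assumed defaults: a local Solr on the standard port. Override via the environment.
SOLR_HOST="${SOLR_HOST:-localhost}"
SOLR_PORT="${SOLR_PORT:-8983}"

# Hypothetical helper: builds the update URL for a collection or core.
# commit=true asks Solr to commit so the documents become searchable right away.
solr_update_url() {
  printf 'http://%s:%s/solr/%s/update?commit=true\n' "$SOLR_HOST" "$SOLR_PORT" "$1"
}

# Hypothetical helper: sends one file to the update handler.
# Typical content types: application/json, application/xml, application/csv.
solr_post_file() {
  collection="$1"
  content_type="$2"
  file="$3"
  curl -sS -H "Content-Type: $content_type" \
       --data-binary "@$file" \
       "$(solr_update_url "$collection")"
}

# Print the URL that would be used. To actually index a file against a running Solr:
#   solr_post_file filminfo application/json example/films/films.json
solr_update_url filminfo
```

The URL is built separately from the request so that it can be inspected before any data is sent.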

Post Tool:

Post Tool is a command-line utility shipped with Solr that can be used to post different types of content. It can be used in both Linux and Windows environments.

On Windows systems, we can run the post tool through its .jar file from the Solr installation directory, as below:

java -jar example/exampledocs/post.jar -h
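
When post.jar is used directly, the target collection is passed as a Java system property rather than with -c. A minimal sketch, assuming the command is run from the Solr installation directory and that a core named filminfo (the name used in the examples on this page) exists. The helper post_jar_cmd is hypothetical and only prints the command, so it can be reviewed before running it:

```shell
# Hypothetical helper: prints the post.jar command for a collection and a file.
# Assumes the working directory is the Solr installation directory.
post_jar_cmd() {
  collection="$1"
  file="$2"
  # -Dc names the target core or collection; system properties must come before -jar.
  printf 'java -Dc=%s -jar example/exampledocs/post.jar %s\n' "$collection" "$file"
}

post_jar_cmd filminfo example/films/films.json
```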

On UNIX systems, run the tool from the terminal as below:

bin/post -c filminfo example/films/films.json

In the above command, the -c option naming the collection or core is mandatory. You can check the correct usage with the command below:

bin/post -help

or

bin/post -h

Use the command below to index only the files with the .xml extension into the collection or core named filminfo, through port 8983.

bin/post -c filminfo -p 8983 *.xml

If you want to index only CSV files, use the below command.

bin/post -c filminfo -p 8983 *.csv

In the same way, if you want to index only JSON files, use the below command.

bin/post -c filminfo -p 8983 *.json

Use the example below to index rich documents such as Microsoft Word, PDF, or HTML files.

bin/post -c filminfo filming.pdf

It can even index all the documents within a directory, as shown below.

bin/post -c filminfo filmfolder/

If you want to index only ppt files from that mentioned directory, use the below command.

bin/post -c filminfo -filetypes ppt filmfolder/
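
The -filetypes option also accepts a comma-separated list, so several extensions can be picked up from a directory in one run. The sketch below assembles such a command; post_types_cmd is a hypothetical helper that only prints the command, and filminfo and filmfolder/ are the example names used above.

```shell
# Hypothetical helper: prints a bin/post command restricted to the given file types.
# Usage: post_types_cmd <collection> <directory> <type> [<type> ...]
post_types_cmd() {
  collection="$1"
  dir="$2"
  shift 2
  # Join the remaining arguments with commas, the form -filetypes expects.
  types=$(IFS=,; printf '%s' "$*")
  printf 'bin/post -c %s -filetypes %s %s\n' "$collection" "$types" "$dir"
}

post_types_cmd filminfo filmfolder/ ppt pptx pdf
```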
