Solr – Architecture:

From the previous Solr introduction tutorial, we know that Solr is a powerful full-text search engine. In this Solr architecture tutorial, we will learn how Solr communicates using a browser-style request and response model. Solr is easy to integrate with web applications, rich-client or browserless applications, and mobile devices.

Solr resides and runs alongside the server application, as shown in the figure below. Let’s take an e-commerce web application as an example. Whenever the application receives a request, the data is processed through the data source, which communicates with the database. The database might be updated with the list of available products through a store admin page. All of that information can be indexed in Solr, and the search engine can be configured on the application side, where Solr serves auto-suggestions and faceted searches. Solr can process and index rich documents as well as the regularly supported XML, JSON, and CSV documents.
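
As a rough illustration of the faceted search mentioned above, the sketch below builds the URL an application might send to Solr. The host, the `products` collection, and the `name` and `category` fields are assumptions made up for this example; `facet`, `facet.field`, `rows`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, collection, text, facet_field):
    """Return a Solr select URL that asks for matches plus facet counts."""
    params = [
        ("q", text),                  # the user's search expression
        ("facet", "true"),            # switch faceting on
        ("facet.field", facet_field), # field to count values for
        ("rows", "10"),               # number of documents to return
        ("wt", "json"),               # response format
    ]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical local Solr instance and collection name.
facet_url = build_facet_query(
    "http://localhost:8983/solr", "products", "name:laptop", "category"
)
print(facet_url)
```

The application would issue this URL as an HTTP GET and render the returned facet counts as filters (for example, product categories) next to the search results.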

How Does the Solr Search Data Model Work?

  1. Define the schema, which tells Solr which documents and fields need to be indexed. We will discuss Documents, Fields, and Schema Design in more detail in the upcoming tutorials.
  2. Deploy Solr on the server.
  3. Ingest into Solr the documents that users will search.
  4. Implement the search functionality in the front-end application.
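
Steps 1 and 3 above can be sketched as plain HTTP requests. The example below only builds the request URLs and JSON bodies; it does not contact a server. The host, the `products` collection, the field names, and the `text_general` and `pfloat` field types are assumptions for illustration, so adjust them to match your own setup.

```python
import json

# Hypothetical Solr location and collection name.
SOLR = "http://localhost:8983/solr"
COLLECTION = "products"


def schema_request(field_name, field_type):
    """Step 1: build a Schema API request that adds one field."""
    url = f"{SOLR}/{COLLECTION}/schema"
    body = json.dumps(
        {"add-field": {"name": field_name, "type": field_type, "stored": True}}
    )
    return url, body


def index_request(docs):
    """Step 3: build an update request that ingests a list of documents."""
    url = f"{SOLR}/{COLLECTION}/update?commit=true"
    return url, json.dumps(docs)


schema_url, schema_body = schema_request("name", "text_general")
update_url, update_body = index_request(
    [
        {"id": "p1", "name": "Laptop", "price": 799.0},
        {"id": "p2", "name": "Laptop sleeve", "price": 19.5},
    ]
)
print(schema_url, schema_body)
print(update_url, update_body)
```

Both requests would be sent as HTTP POST with the `Content-Type: application/json` header. Once the documents are committed, the front end from step 4 can query them.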

As shown in the figure above, we can understand how data flows between the end user and Solr. Solr receives browser-based requests, which are simple HTTP requests made through a URL. Solr processes each request and sends back the response as a structured document, which may be JSON, CSV, XML, or another supported format.
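
To make the request and response cycle concrete, here is a minimal sketch that builds a query URL and reads a JSON response. The host, collection, and documents are made up for the example, and `SAMPLE_RESPONSE` is a hand-written stand-in that follows the usual shape of a Solr JSON response rather than output captured from a real server.

```python
import json
from urllib.parse import urlencode


def select_url(base, collection, query, wt="json"):
    """Build the URL for a simple search request; wt picks the format."""
    return f"{base}/{collection}/select?" + urlencode({"q": query, "wt": wt})


# Hand-written example of what a JSON response typically looks like.
SAMPLE_RESPONSE = """{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {"numFound": 2, "start": 0, "docs": [
    {"id": "p1", "name": "Laptop"},
    {"id": "p2", "name": "Laptop sleeve"}
  ]}
}"""


def extract_docs(raw):
    """Return the total hit count and the list of matching documents."""
    data = json.loads(raw)
    return data["response"]["numFound"], data["response"]["docs"]


search_url = select_url("http://localhost:8983/solr", "products", "name:laptop")
num_found, docs = extract_docs(SAMPLE_RESPONSE)
print(search_url)
print(num_found, [d["name"] for d in docs])
```

Changing the `wt` argument to `xml` or `csv` requests the same results in a different format.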

Scaling Solr is quite flexible. When you have a large amount of data that receives a high volume of requests, a single Solr server cannot handle the entire workload. With SolrCloud, we can scale easily: the data is distributed across many servers and the requests are split among them. A standalone Solr server has a Core, whereas SolrCloud has a Collection, which is split into multiple logical pieces called Shards. We can also implement replication by adding servers that host copies of the same shards. Spreading the requests across multiple servers makes query handling easier. Sharding and replication are not mutually exclusive, and using them together makes Solr powerful and scalable.
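
The idea behind sharding and replication can be sketched in a few lines. This is a simplified illustration and not SolrCloud's actual routing: it uses a CRC32 hash as a stand-in to pick a shard for each document and round-robin to pick a replica for each query. The node names and the two-shard, two-replica layout are assumptions.

```python
import zlib
from itertools import cycle


def shard_for(doc_id, num_shards):
    """Map a document ID to a shard number with a stable hash."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


class ShardReplicas:
    """Holds the servers hosting copies of one shard; rotates through them."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._cycle = cycle(self.replicas)

    def next_replica(self):
        return next(self._cycle)


# Hypothetical cluster: one collection, two shards, two replicas per shard.
cluster = {
    0: ShardReplicas(["node1:8983", "node3:8983"]),
    1: ShardReplicas(["node2:8983", "node4:8983"]),
}


def route_update(doc_id):
    """Sharding: each document is stored in exactly one shard."""
    return shard_for(doc_id, len(cluster))


def route_query():
    """Replication: a query visits one replica of every shard."""
    return [cluster[shard].next_replica() for shard in sorted(cluster)]


print(route_update("p1"), route_update("p2"))
print(route_query())
print(route_query())
```

Each call to `route_query` picks a different replica per shard, which shows how sharding divides the data while replication spreads the query load.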