Site developer (i2tutorials.com)

Important files required for each web site


There are several standard files that every Web site should provide, yet many sites neglect them. Most of these files are a matter of convention rather than technical necessity, but a site that omits them feels incomplete. Users often try to find such resources simply by guessing URLs, and beyond a few conventional addresses, guessing rarely works. This article walks through these standard files one by one.

How a given resource is served depends on the web server and web application layers in use. On a "traditional", mostly static server such as Apache, these resources are probably literal files on disk. In other configurations they may be rows in a database, entries in a configuration file, classes in a server process, and so on. This article focuses on what the user ultimately sees, not on how it is produced.


404.html

When users browse your Web site, they will inevitably request resources that do not exist. Misspelled URLs are the most common cause, but stale links, misconfigured back ends, and URLs copied from elsewhere contribute more than you might think. When a resource is missing, it is a good idea to serve a helpful landing page that guides users toward other useful pages. A bare "not found" tells users the resource is unavailable, but it does nothing to answer the question "what do I do next?"

Warning: when creating a custom 404.html (or whatever mechanism your web server uses to publish a custom "not found" message), beware of the "soft 404" misconfiguration that afflicts too many Web sites. Such sites send a page with a normal "200 OK" status whose body merely says "not available" somewhere, and perhaps (but not always) mentions a "404 error". Avoid this: users, their browsers, and other tools depend on receiving the exact status code.
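To make the distinction concrete, here is a small heuristic check, a hypothetical helper (the function name and marker strings are illustrative, not from any library): a "soft 404" is a 200 response whose body merely *claims* the page is missing, whereas a proper 404 carries the status code itself.

```python
def is_soft_404(status: int, body: str) -> bool:
    """Flag a 'soft 404': a 200 OK response whose body only *says* the
    page is missing instead of sending a real 404 status code."""
    markers = ("not found", "404 error", "page does not exist")
    text = body.lower()
    return status == 200 and any(m in text for m in markers)

# A well-behaved server pairs the 'not found' body with a 404 status:
print(is_soft_404(200, "<h1>Sorry, page not found</h1>"))  # True  -> misconfigured
print(is_soft_404(404, "<h1>Not Found</h1>"))              # False -> correct
```

Tools such as search-engine crawlers apply similar heuristics to detect soft 404s, which is another reason to send the correct status code yourself.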


About.html

So why did you create the Web site? True, the home page ought to answer that question. More likely, though, the home page does not: it just lets users log in, highlights the site's "selling points", shows some flashy content, and so on. You may need a separate "About" page; if so, make sure it can be reached from the home page and is available at http://mysite.example.com/about.html, because some users habitually look for this kind of information at exactly that address.

A good about.html page provides an overview of what the site does, why it was created, and why users should care, along with a few links that lead back to the site's core functionality. The page is not mandatory, and it usually should not be fancy. Keep it pragmatic and accurate so that users can take advantage of everything the site has to offer.


Contact.html

So how can users contact you? As with about.html, users may be able to dig this information out through several clicks from the home page. Don't make them work for it: place it at http://mysite.example.com/contact.html, serve the same page at contacts.html, and support the old .htm extension too. These names are easy to guess. Of course, you can still present this information at the end of a series of navigation screens; redundant paths to the same resource are a good thing.


Copyright

Who owns the copyright on the Web site? The content presumably belongs to you, but who are you? An individual? A company? A partnership? A government agency? If the content is in the public domain or under a free-content license, you may want to tell users that too. These days nearly all content carries a default copyright; if yours is governed by different terms, say so. Not enough Web sites currently provide this information, but why not add it to yours? There will always be users who look for it.

Obviously, different pages or resources may carry different copyright terms. Use this page to tell users how to determine those individual differences. If trademarks are involved, cover them here as well.

Index.html (and index.htm)

Not every web server actually uses a file called index.html for its home page. Depending on the setup, there may be URL rewriting, dynamic generation keyed on the path name, and so on. But users don't care about those details. Just make http://mysite.example.com/index.html lead to the home page, even if that requires a simple HTML redirect.

Likewise, let the old .htm extension keep working. If you want to go further, do the same for index.cgi.
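If the real home page is generated dynamically, a static index.html can still exist purely as a redirect. A minimal sketch of such a page, using a plain-HTML meta refresh (the target URL `/` is a placeholder for wherever your home page actually lives):

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Redirect visitors of /index.html to the real home page immediately -->
    <meta http-equiv="refresh" content="0; url=/">
    <title>Redirecting</title>
  </head>
  <body>
    <!-- Fallback link for browsers that do not follow meta refresh -->
    <p>This page has moved. <a href="/">Continue to the home page.</a></p>
  </body>
</html>
```

A server-side redirect with a 301 status is cleaner when your server configuration allows it, but the HTML version works everywhere.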


Index.rss

A great deal of web content is available through RSS. This does not apply to every Web site, but it works for most. It is entirely reasonable for the RSS feed to be independent of user-specific configuration options, logins, or paid content; an RSS feed need not be all-encompassing.

Even so, if something can be offered as RSS, do so. Perhaps index.rss contains nothing more than "advertising" content, a boilerplate note on how to take advantage of RSS feeds, or even an explanation of why RSS is not relevant to your Web site.
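For reference, a minimal RSS 2.0 feed that index.rss could serve, even if it only advertises the site (the titles, links, and descriptions below reuse this article's placeholder site and are purely illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>My Site</title>
    <link>http://mysite.example.com/</link>
    <description>Recent updates from My Site</description>
    <item>
      <title>Welcome</title>
      <link>http://mysite.example.com/about.html</link>
      <description>An overview of what this site offers.</description>
    </item>
  </channel>
</rss>
```

Each new `<item>` entry becomes one entry in the user's feed reader, so even a feed updated rarely is better than none.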


Privacy policy

As soon as you collect user information (even just usernames or traffic logs), tell users what you plan to do with it. The legal issues surrounding the rights and responsibilities of a Web site's creators and users are complex; I am not a lawyer and cannot solve your legal problems. But if you show consideration for users' personal privacy, they will notice. This may also be a good time to discuss with a lawyer how you handle user data.


Robots.txt

If you do not want all of the resources on your Web site indexed by automated tools, say so in the robots.txt file; if you do want your content indexed, say that instead. The Robots Exclusion Standard is advisory, not enforced: if you don't want something to be seen, don't put it on the site, or put adequate access controls in front of it. All major legitimate web crawlers, however, honor the rules in robots.txt, so state your intentions as clearly as possible.
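You can check how crawlers will interpret your robots.txt with Python's standard-library `urllib.robotparser`. The sample rules below (blocking a hypothetical /private/ directory) are illustrative:

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt: allow everything except /private/ (paths are illustrative).
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# can_fetch() answers: may this user agent retrieve this URL?
print(rp.can_fetch("*", "http://mysite.example.com/about.html"))    # True
print(rp.can_fetch("*", "http://mysite.example.com/private/data"))  # False
```

Testing your rules this way before deploying them avoids accidentally blocking (or exposing) more than you intended.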


Security.html

Providing security.html is not mandatory. But if the site has security concerns (for example, it collects any sensitive information from users), it is a good idea to document your security practices, at least in rough outline. Provide contact information on this page in case users have questions or suggestions for improvement. This information should be findable through the site's normal navigation, but it is worth publishing it at this URL as well.

Site map:

There is no fully standardized way to present a map of an entire Web site. Providing some kind of site map is always useful, but how detailed it should be depends on how large (and how dynamic) the site is, and what you show users depends on the site's purpose. For example, if a user has no access to resource X, it may be inappropriate to reveal that resource X exists at all. Use your own judgment and provide what fits your circumstances.

For many sites, a site map exists mainly as a friendly aid to automated tools such as search engines. Google has published a Sitemaps convention modeled loosely on robots.txt: in short, you create an XML file listing the resources the site provides. It acts as a kind of "include list" complementing the "exclude list" of robots.txt.
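Such a sitemap file can be generated with a few lines of Python using the standard library's `xml.etree.ElementTree`; the page URLs below are this article's placeholders, not a real site:

```python
import xml.etree.ElementTree as ET

# The sitemaps.org protocol namespace used by Google's Sitemaps convention.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
pages = [
    "http://mysite.example.com/",
    "http://mysite.example.com/about.html",
    "http://mysite.example.com/contact.html",
]

# Build <urlset><url><loc>...</loc></url>...</urlset>
urlset = ET.Element("urlset", xmlns=NS)
for page in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page

sitemap_xml = ET.tostring(urlset, encoding="unicode")
print(sitemap_xml)
```

The protocol also defines optional per-URL fields such as `<lastmod>` and `<changefreq>`; the `<loc>` element shown here is the only required one.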
