Chapter 15 Apache Server - Web Server

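This chapter covers the following topics:

I. Basic concepts
II. Downloading the apache package
III. Installing apache
IV. Basic apache operations
V. Configuring httpd.conf and advanced operations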
 


I. Basic concepts

As soon as Linux is installed, a WWW server is essentially installed along with it. In RedHat, the default WWW server is the well-known Apache.


II. Downloading the apache package

You can download each package from its development site:

 

III. Installing apache

The installation procedure is as follows:


IV. Basic apache operations

 

Main Apache directories
     Different versions of apache and Linux arrange the file directories differently. The following uses apache 2.0 as an example:

 

Basic settings:


V. Configuring httpd.conf and advanced operations


The httpd.conf file can be edited with vi.

        [root@free conf]# vi httpd.conf

The important items in httpd.conf are as follows:

1. Listen

The Listen directive tells the server to accept incoming requests only on the specified port or address-and-port combinations.

If only a port number is specified in the Listen directive, the server listens to the given port on all interfaces.

If an IP address is given as well as a port, the server will listen on the given port and interface.

Multiple Listen directives may be used to specify a number of addresses and ports to listen on. The server will respond to requests from any of the listed addresses and ports.

For example, to make the server accept connections on both port 80 and port 8000, use:

Listen 80
Listen 8000

To make the server accept connections on two specified interfaces and port numbers, use:

Listen 192.170.2.1:80
Listen 192.170.2.5:8000
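
The two forms of the Listen directive described above can be illustrated with a small Python sketch. It is not part of Apache; the function name parse_listen is hypothetical, and the sketch assumes IPv4 addresses only (bracketed IPv6 literals are not handled).

```python
def parse_listen(line):
    """Split one Listen directive into (address, port).

    address is None when only a port is given, which the text above
    describes as listening on all interfaces.
    """
    keyword, _, value = line.strip().partition(" ")
    if keyword.lower() != "listen":
        raise ValueError("not a Listen directive: " + line)
    value = value.strip()
    if ":" in value:
        # address-and-port form, e.g. 192.170.2.1:80
        address, _, port = value.rpartition(":")
        return address, int(port)
    # port-only form, e.g. 80
    return None, int(value)


config = ["Listen 80", "Listen 8000", "Listen 192.170.2.5:8000"]
for line in config:
    print(parse_listen(line))
```

Several Listen lines simply produce several (address, port) pairs, which mirrors the statement that multiple Listen directives may be combined.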

2. UserDir

On systems with multiple users, each user can be permitted to have a web site in their home directory using the UserDir directive. Visitors to a URL http://example.com/~username/ will get content out of the home directory of the user "username", out of the subdirectory specified by the UserDir directive.

The UserDir directive specifies a directory out of which per-user content is loaded. This directive may take several different forms.

If a path is given which does not start with a leading slash, it is assumed to be a directory path relative to the home directory of the specified user. Given this configuration:

UserDir public_html

the URL http://example.com/~rbowen/file.html will be translated to the file path /home/rbowen/public_html/file.html

If a path is given starting with a slash, a directory path will be constructed using that path, plus the username specified. Given this configuration:

UserDir /var/html

the URL http://example.com/~rbowen/file.html will be translated to the file path /var/html/rbowen/file.html

If a path is provided which contains an asterisk (*), a path is used in which the asterisk is replaced with the username. Given this configuration:

UserDir /var/www/*/docs

the URL http://example.com/~rbowen/file.html will be translated to the file path /var/www/rbowen/docs/file.html
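
The three path forms above can be summarized in a short Python sketch. This is only an illustration of the translation rules as described in this section, not Apache's code; the function name translate_userdir and the home_root parameter (assumed here to be /home) are hypothetical.

```python
def translate_userdir(url_path, userdir, home_root="/home"):
    """Map a /~user/... URL path to a file path for one UserDir value.

    home_root is an assumption of this sketch; a real server looks up
    each user's home directory in the system account database.
    """
    if not url_path.startswith("/~"):
        raise ValueError("not a per-user URL path: " + url_path)
    user, _, rest = url_path[2:].partition("/")
    if "*" in userdir:
        # asterisk form: the asterisk is replaced with the username
        base = userdir.replace("*", user)
    elif userdir.startswith("/"):
        # leading slash: the given path plus the username
        base = userdir.rstrip("/") + "/" + user
    else:
        # relative path: a directory inside the user's home directory
        base = home_root + "/" + user + "/" + userdir
    return base + "/" + rest


for value in ["public_html", "/var/html", "/var/www/*/docs"]:
    print(value, "->", translate_userdir("/~rbowen/file.html", value))
```

The asterisk form is checked first because such a path also begins with a slash.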

Using the syntax shown in the UserDir documentation, you can restrict what users are permitted to use this functionality:

UserDir enabled
UserDir disabled root jro fish

The configuration above will enable the feature for all users except for those listed in the disabled statement.

You can disable the feature for all but a few users by using a configuration like the following:

UserDir disabled
UserDir enabled rbowen krietz

See UserDir documentation for additional examples.
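
The effect of combining the enabled and disabled keywords can be sketched as a small decision function. This is a simplified model of the two configurations shown above, with hypothetical names (userdir_active, global_disabled); it assumes that a user named in a disabled list is never served, and that a bare "UserDir disabled" turns the feature off for everyone not explicitly enabled.

```python
def userdir_active(user, global_disabled, enabled=(), disabled=()):
    """Decide whether per-user directories are served for one user."""
    if user in disabled:
        # explicitly disabled users are never served
        return False
    if global_disabled:
        # bare "UserDir disabled": only explicitly enabled users remain
        return user in enabled
    # otherwise the feature is on for everyone else
    return True


# UserDir enabled / UserDir disabled root jro fish
print(userdir_active("rbowen", False, disabled={"root", "jro", "fish"}))
print(userdir_active("root", False, disabled={"root", "jro", "fish"}))

# UserDir disabled / UserDir enabled rbowen krietz
print(userdir_active("rbowen", True, enabled={"rbowen", "krietz"}))
print(userdir_active("fish", True, enabled={"rbowen", "krietz"}))
```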

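Note: Remember to open up the permissions on the directory so that other people can get in.

Note: To write your first personal home page, taking the account s9054401 as an example, proceed as follows:

In your directory, that is, /home/s9054401/public_html, create an HTML file named index.html. Then, for example, when you type http://your-site-name/~s9054401/ into the IE address bar, apache directs the request to the /home/s9054401/public_html directory and searches it for a file named index.html, index.htm, or index.php (the list of names is set by the DirectoryIndex directive). In other words, index.html is the first file name apache looks for, which makes it your home page.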

 

3. AddDefaultCharset: changing the default character set of web pages to Big5

This directive specifies the name of the character set that will be added to any response that does not have any parameter on the content type in the HTTP headers. This will override any character set specified in the body of the document.

A setting of AddDefaultCharset Off disables this functionality.

AddDefaultCharset On enables Apache's internal default charset of iso-8859-1 as required by the directive.

You can also specify an alternate charset to be used. For example:  AddDefaultCharset Big5

 

4. Allow and Deny

Allow Directive:

The Allow directive affects which hosts can access an area of the server. Access can be controlled by hostname, IP Address, IP Address range, or by other characteristics of the client request captured in environment variables.

The first argument to this directive is always from.

The subsequent arguments can take three different forms. If Allow from all is specified, then all hosts are allowed access. To allow only particular hosts or groups of hosts to access the server, the host can be specified in any of the following formats:

  4.1. A (partial) domain-name

Example:

Allow from apache.org

Hosts whose names match, or end in, this string are allowed access. Only complete components are matched, so the above example will match foo.apache.org but it will not match fooapache.org. This configuration will cause Apache to perform a double reverse DNS lookup on the client IP address. It will do a reverse DNS lookup on the IP address to find the associated hostname, and then do a forward lookup on the hostname to assure that it matches the original IP address. Only if the forward and reverse DNS are consistent and the hostname matches will access be allowed.

  4.2. A full IP address

Example:

Allow from 10.1.2.3

An IP address of a host allowed access

  4.3. A partial IP address

Example:

Allow from 10.1

The first 1 to 3 bytes of an IP address, for subnet restriction.

  4.4. A network/netmask pair

Example:

Allow from 10.1.0.0/255.255.0.0

A network a.b.c.d, and a netmask w.x.y.z. For more fine-grained subnet restriction.

  4.5. A network/nnn CIDR specification

Example:

Allow from 10.1.0.0/16

Similar to the previous case.
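
The five host formats above can be compared side by side with a Python sketch built on the standard ipaddress module. It is an approximation for illustration only: the function name host_matches is hypothetical, only IPv4 is handled, and the double reverse DNS lookup is not performed, so client_host is assumed to be a hostname that has already been verified.

```python
import ipaddress


def host_matches(spec, client_ip, client_host=None):
    """Return True if one Allow/Deny host specification matches a client."""
    if spec.lower() == "all":
        return True
    if "/" in spec:
        # network/netmask pair or network/nnn CIDR specification
        network = ipaddress.ip_network(spec, strict=False)
        return ipaddress.ip_address(client_ip) in network
    parts = spec.split(".")
    if all(part.isdigit() for part in parts):
        # full or partial IP address: compare whole bytes from the left
        return client_ip.split(".")[:len(parts)] == parts
    # (partial) domain name: only complete components are matched
    if client_host is None:
        return False
    host = client_host.lower()
    name = spec.lower().lstrip(".")
    return host == name or host.endswith("." + name)


print(host_matches("apache.org", "192.0.2.10", "foo.apache.org"))
print(host_matches("apache.org", "192.0.2.10", "fooapache.org"))
print(host_matches("10.1", "10.1.200.7"))
print(host_matches("10.1.0.0/255.255.0.0", "10.1.5.5"))
print(host_matches("10.1.0.0/16", "10.2.0.1"))
```

Comparing whole bytes, rather than string prefixes, is what keeps "Allow from 10.1" from matching an address such as 10.10.0.1.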

 

Deny Directive:

This directive allows access to the server to be restricted based on hostname, IP address, or environment variables. The arguments for the Deny directive are identical to the arguments for the Allow directive.

 

5. Order Directive

The Order directive controls the default access state and the order in which Allow and Deny directives are evaluated. Ordering is one of

Deny,Allow
The Deny directives are evaluated before the Allow directives. Access is allowed by default. Any client which does not match a Deny directive or does match an Allow directive will be allowed access to the server.
Allow,Deny
The Allow directives are evaluated before the Deny directives. Access is denied by default. Any client which does not match an Allow directive or does match a Deny directive will be denied access to the server.
Mutual-failure
Only those hosts which appear on the Allow list and do not appear on the Deny list are granted access. This ordering has the same effect as Order Allow,Deny and is deprecated in favor of that configuration.

Keywords may only be separated by a comma; no whitespace is allowed between them. Note that in all cases every Allow and Deny statement is evaluated.

Example1: all hosts in the apache.org domain are allowed access; all other hosts are denied access.

Order Deny,Allow
Deny from all
Allow from apache.org

Example2:  all hosts in the apache.org domain are allowed access, except for the hosts which are in the foo.apache.org subdomain, who are denied access. All hosts not in the apache.org domain are denied access because the default state is to deny access to the server.

Order Allow,Deny
Allow from apache.org
Deny from foo.apache.org

On the other hand, if the Order in the last example is changed to Deny,Allow, all hosts will be allowed access. This happens because, regardless of the actual ordering of the directives in the configuration file, the Allow from apache.org will be evaluated last and will override the Deny from foo.apache.org. All hosts not in the apache.org domain will also be allowed access because the default state will change to allow.

Example3: The presence of an Order directive can affect access to a part of the server even in the absence of Allow and Deny directives because of its effect on the default access state.

<Directory /www>
Order Allow,Deny
</Directory>

will deny all access to the /www directory because the default access state will be set to deny.
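
The three orderings reduce to a small truth table, sketched below in Python. The function name access_allowed is hypothetical; allow_match and deny_match stand for "the client matched at least one Allow directive" and "the client matched at least one Deny directive", which the sketch assumes have already been computed.

```python
def access_allowed(order, allow_match, deny_match):
    """Combine Allow/Deny match results according to the Order value."""
    order = order.lower()
    if order == "deny,allow":
        # allowed by default; an Allow match overrides a Deny match
        return allow_match or not deny_match
    if order in ("allow,deny", "mutual-failure"):
        # denied by default; a Deny match overrides an Allow match
        return allow_match and not deny_match
    raise ValueError("unknown Order value: " + order)


# Example 1: Order Deny,Allow / Deny from all / Allow from apache.org
print(access_allowed("Deny,Allow", allow_match=True, deny_match=True))
print(access_allowed("Deny,Allow", allow_match=False, deny_match=True))

# Example 2: a host in foo.apache.org matches both Allow and Deny
print(access_allowed("Allow,Deny", allow_match=True, deny_match=True))

# Example 3: Order Allow,Deny with no Allow or Deny directives at all
print(access_allowed("Allow,Deny", allow_match=False, deny_match=False))
```

Because the result depends only on the two match flags and the Order value, the position of the Allow and Deny lines in the configuration file does not matter, as noted above.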

 

6. <Directory> and </Directory>

<Directory> and </Directory> are used to enclose a group of directives that will apply only to the named directory and sub-directories of that directory. Any directive that is allowed in a directory context may be used. Directory-path is either the full path to a directory, or a wild-card string using Unix shell-style matching. In a wild-card string, ? matches any single character, and * matches any sequence of characters. You may also use [] character ranges. None of the wildcards match a '/' character, so <Directory /*/public_html> will not match /home/user/public_html, but <Directory /home/*/public_html> will match.
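
The rule that wildcards never match a '/' can be demonstrated with a Python sketch that converts a wild-card string into a regular expression. This only illustrates the matching rule; the function name wildcard_to_regex is hypothetical, [] character ranges are left out for brevity, and the sketch tests one path against the pattern without modelling how a section also applies to sub-directories.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wild-card directory path into a compiled regex in
    which neither ? nor * can match a '/' character."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append("[^/]*")   # any sequence of characters except '/'
        elif ch == "?":
            pieces.append("[^/]")    # any single character except '/'
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces) + r"\Z")


path = "/home/user/public_html"
print(bool(wildcard_to_regex("/*/public_html").match(path)))
print(bool(wildcard_to_regex("/home/*/public_html").match(path)))
```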

Example1:

<Directory "/var/www/html/class">
Order allow,deny
Allow from all
Deny from 163.17.1.1 140.113.1.1 giga.net.tw cn
</Directory>

Be careful with the directory-path arguments: They have to literally match the filesystem path which Apache uses to access the files.

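Order specifies the sequence in which allow and deny are applied. In this example the allow settings are read first and the deny settings second; where the two conflict, the later setting (deny) is used. This example therefore denies access from the four listed addresses.

For the domain-name entries, Apache queries DNS for the domain name that corresponds to the source IP address. If that IP address has no registered domain name, the domain-name entries have no effect for that client.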

Example2:

Extended regular expressions can also be used, with the addition of the ~ character.

<Directory ~ "^/www/.*/[0-9]{3}">

would match directories in /www/ that consisted of three numbers.
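
To see which paths the expression above accepts, it can be tried out with Python's re module. This is only an approximation for experimenting, since Python's regular expression engine is not the one Apache uses, although the syntax in this particular pattern is common to both. The sample paths are made up for illustration.

```python
import re

# Same expression as in the <Directory ~ ...> example above
pattern = re.compile(r"^/www/.*/[0-9]{3}")

samples = ["/www/site/123", "/www/site/abc", "/var/www/site/123"]
for path in samples:
    print(path, bool(pattern.search(path)))
```

The ^ anchor is what rejects /var/www/site/123: the path has to begin with /www/.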

Example3:

Note that the default Apache access for <Directory /> is Allow from All. This means that Apache will serve any file mapped from a URL. It is recommended that you change this with a block such as

<Directory />
Order Deny,Allow
Deny from All
</Directory>

and then override this for directories you want accessible. See the Security Tips page for more details.

7. error log - ErrorLog directive and LogLevel directive

The server error log, whose name and location is set by the ErrorLog directive, is the most important log file. This is the place where Apache httpd will send diagnostic information and record any errors that it encounters in processing requests. It is the first place to look when a problem occurs with starting the server or with the operation of the server, since it will often contain details of what went wrong and how to fix it.

The error log is usually written to a file (typically error_log on unix systems and error.log on Windows and OS/2).

The format of the error log is relatively free-form and descriptive. But there is certain information that is contained in most error log entries.

Example:

[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test

The first item in the log entry is the date and time of the message.

The second entry lists the severity of the error being reported. The LogLevel directive is used to control the types of errors that are sent to the error log by restricting the severity level.

The third entry gives the IP address of the client that generated the error.

The fourth entry indicates that the server has been configured to deny the client access. The server reports the file-system path (as opposed to the web path) of the requested document.
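
The four parts described above can be pulled out of the sample entry with a regular expression, as in the Python sketch below. The pattern is an assumption based only on the example line shown here; error log entries that lack a [client ...] field will not match it, and the names ERROR_LINE and parse_error_line are hypothetical.

```python
import re

# Pattern derived from the sample entry above, not from a formal grammar
ERROR_LINE = re.compile(
    r"\[(?P<time>[^\]]+)\] "
    r"\[(?P<level>[^\]]+)\] "
    r"\[client (?P<client>[^\]]+)\] "
    r"(?P<message>.*)"
)


def parse_error_line(line):
    """Return the fields of one error log entry, or None if it does not
    follow the date / severity / client / message layout."""
    match = ERROR_LINE.match(line)
    return match.groupdict() if match else None


sample = ("[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] "
          "client denied by server configuration: "
          "/export/home/live/ap/htdocs/test")
for name, value in parse_error_line(sample).items():
    print(name, "=", value)
```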

 

During testing, it is often useful to continuously monitor the error log for any problems. On unix systems, you can accomplish this using:

tail -f error_log

8. access log - CustomLog directive and LogFormat directive

The server access log records all requests processed by the server. The location and content of the access log are controlled by the CustomLog directive. The LogFormat directive can be used to simplify the selection of the contents of the logs.

Of course, storing the information in the access log is only the start of log management. The next step is to analyze this information to produce useful statistics. Log analysis in general is beyond the scope of this document. For more information about this topic, and for applications which perform log analysis, check the Open Directory or Yahoo.

The format of the access log is highly configurable. The format is specified using a format string that looks much like a C-style printf(1) format string. Some examples are presented in the next sections. For a complete list of the possible contents of the format string, see the mod_log_config format strings.

Common Log Format

1. Configuring the directives

A typical configuration for the access log might look as follows.

LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

This defines the nickname common and associates it with a particular log format string. The format string consists of percent directives, each of which tell the server to log a particular piece of information. Literal characters may also be placed in the format string and will be copied directly into the log output. The quote character (") must be escaped by placing a back-slash before it to prevent it from being interpreted as the end of the format string. The format string may also contain the special control characters "\n" for new-line and "\t" for tab.

The CustomLog directive sets up a new log file using the defined nickname. The filename for the access log is relative to the ServerRoot unless it begins with a slash.

2. Viewing the log contents

The above configuration will write log entries in a format known as the Common Log Format (CLF). This standard format can be produced by many different web servers and read by many log analysis programs. The log file entries produced in CLF will look something like this:

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

Each part of this log entry is described below.

127.0.0.1 (%h)
This is the IP address of the client (remote host) which made the request to the server. The IP address reported here is not necessarily the address of the machine at which the user is sitting. If a proxy server exists between the user and the server, this address will be the address of the proxy, rather than the originating machine.
- (%l)
The "hyphen" in the output indicates that the requested piece of information is not available. In this case, the information is the RFC 1413 identity of the client determined by identd on the client's machine. This information is highly unreliable and should almost never be used except on tightly controlled internal networks. Apache httpd will not even attempt to determine this information unless IdentityCheck is set to On.
frank (%u)
This is the userid of the person requesting the document as determined by HTTP authentication. The same value is typically provided to CGI scripts in the REMOTE_USER environment variable. If the status code for the request (see below) is 401, then this value should not be trusted because the user is not yet authenticated. If the document is not password protected, this entry will be "-" just like the previous one.
[10/Oct/2000:13:55:36 -0700] (%t)
The time that the server finished processing the request. The format is:

[day/month/year:hour:minute:second zone]
day = 2*digit
month = 3*letter
year = 4*digit
hour = 2*digit
minute = 2*digit
second = 2*digit
zone = (`+' | `-') 4*digit

It is possible to have the time displayed in another format by specifying %{format}t in the log format string, where format is as in strftime(3) from the C standard library.

"GET /apache_pb.gif HTTP/1.0" (\"%r\")
The request line from the client is given in double quotes. The request line contains a great deal of useful information. First, the method used by the client is GET. Second, the client requested the resource /apache_pb.gif, and third, the client used the protocol HTTP/1.0. It is also possible to log one or more parts of the request line independently. For example, the format string "%m %U%q %H" will log the method, path, query-string, and protocol, resulting in exactly the same output as "%r".
200 (%>s)
This is the status code that the server sends back to the client. This information is very valuable, because it reveals whether the request resulted in a successful response (codes beginning in 2), a redirection (codes beginning in 3), an error caused by the client (codes beginning in 4), or an error in the server (codes beginning in 5). The full list of possible status codes can be found in the HTTP specification (RFC2616 section 10).
2326 (%b)
The last entry indicates the size of the object returned to the client, not including the response headers. If no content was returned to the client, this value will be "-". To log "0" for no content, use %B instead.
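
As a starting point for log analysis, the fields described above can be extracted from a Common Log Format entry with a short Python sketch. The regular expression is an assumption that covers entries shaped like the sample line, the names CLF_LINE and parse_clf are hypothetical, and the sketch relies on English month abbreviations when parsing the timestamp.

```python
import re
from datetime import datetime

# One group per field of: %h %l %u %t "%r" %>s %b
CLF_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Parse one Common Log Format entry into a dict, or return None."""
    match = CLF_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # [day/month/year:hour:minute:second zone]
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # "-" means no content was returned; treat it as 0 bytes here
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    parts = entry["request"].split()
    if len(parts) == 3:
        entry["method"], entry["path"], entry["protocol"] = parts
    return entry


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
entry = parse_clf(sample)
print(entry["host"], entry["user"], entry["status"], entry["size"])
```

Keeping the status code and size as integers makes later steps, such as counting 4xx responses or summing transferred bytes, straightforward.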

 

Further reading:

1. Usage of the other httpd.conf directives:  http://httpd.apache.org/docs-2.0/mod/quickreference.html

2. Other Apache documentation: http://httpd.apache.org/docs-2.0/

3. Webalizer is a handy Web server log analysis tool: http://www.mrunix.net/webalizer (built into newer Linux releases)