Re: [AccessD] A2003:Test Voracity of URLS

Stuart McLachlan stuart at lexacorp.com.pg
Thu Nov 18 07:29:27 CST 2004


On 18 Nov 2004 at 11:26, Foote, Chris wrote:

> 
> Gustav
> 
> I'm not a guru but the "home page" for a website could be any of the
> following list:
> 
> "index", "default"
> 
> with the extensions htm, html, shtm, shtml, cfm, asp, php and so on
> 
> I'm sure there are others!
> 

And if there is no recognisable "home page", the server will often show a 
directory listing instead (or return an error such as 403 if listings are 
disabled).

Basically, the solution is to issue an HTTP GET request (or preferably a 
HEAD, so that you don't pull the full contents back) and check the status 
code of the response:

RFC 2068:
<quote>
6 Response

   After receiving and interpreting a request message, a server responds
   with an HTTP response message.

       Response      = Status-Line               ; Section 6.1
                       *( general-header         ; Section 4.5
                        | response-header        ; Section 6.2
                        | entity-header )        ; Section 7.1
                       CRLF
                       [ message-body ]          ; Section 7.2

6.1 Status-Line

   The first line of a Response message is the Status-Line, consisting
   of the protocol version followed by a numeric status code and its
   associated textual phrase, with each element separated by SP
   characters.  No CR or LF is allowed except in the final CRLF
   sequence.



       Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

6.1.1 Status Code and Reason Phrase

   The Status-Code element is a 3-digit integer result code of the
   attempt to understand and satisfy the request. These codes are fully
   defined in section 10. The Reason-Phrase is intended to give a short
   textual description of the Status-Code. The Status-Code is intended
   for use by automata and the Reason-Phrase is intended for the human
   user. The client is not required to examine or display the Reason-
   Phrase.

   The first digit of the Status-Code defines the class of response. The
   last two digits do not have any categorization role. There are 5
   values for the first digit:

     o  1xx: Informational - Request received, continuing process

     o  2xx: Success - The action was successfully received, understood,
        and accepted

     o  3xx: Redirection - Further action must be taken in order to
        complete the request

     o  4xx: Client Error - The request contains bad syntax or cannot be
        fulfilled

     o  5xx: Server Error - The server failed to fulfill an apparently
        valid request

</quote>
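
As a rough illustration of the approach above, here is a minimal sketch in 
Python (not VBA, so it would need translating for use inside Access). It 
uses only the standard library. The function names parse_status_line, 
status_class and check_url are made up for this example, and the 10 second 
timeout is an arbitrary assumption.

```python
import urllib.error
import urllib.request

# Response classes keyed by the first digit of the status code,
# as listed in the RFC 2068 excerpt above.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF' into parts."""
    parts = line.rstrip("\r\n").split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


def status_class(code):
    """Return the class name for a 3-digit status code."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


def check_url(url, timeout=10):
    """Send a HEAD request and return the status code, or None on failure.

    None means no HTTP response was obtained at all (bad URL, DNS
    failure, refused connection, timeout and so on).
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions by urllib,
        # but they still carry a status code.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        return None
```

Calling check_url on an address and passing the result to status_class 
gives a quick verdict: a 2xx code means the URL is good, 4xx or 5xx means 
it is not, and None means the server could not be reached. Note that 
urllib follows redirects by default, so the code reported is that of the 
final response rather than the 3xx itself.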




-- 
Stuart




