[AccessD] HTML code stripper

Bruce Bruen bbruen at unwired.com.au
Thu Nov 29 07:04:13 CST 2007


.innerText will only return the complete HTML of the tag it is invoked on.
This includes all tags, scripting and booofle contained therein.

I have never yet been successful at HTML "scraping".  At best you can locate 
and extract *well constructed* and, for want of a better phrase, 
*"well formed"* information.

However, the problems are:
1) the HTML page syntax can change, almost daily.
2) the paucious (specious?) HTML specification means that tag (mis)matching 
breaks the syntax parsing continually.
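
To illustrate point 2, here is a rough sketch in Python (not the VBA this
list normally deals in, and not anyone's production code). It assumes a
strict XML parser, here the standard library's xml.etree.ElementTree, is
pointed at a fragment with the sort of unclosed tags that browsers happily
accept. The helper name parses_strictly and both sample fragments are made
up for the example.

```python
import xml.etree.ElementTree as ET


def parses_strictly(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised on mismatched or unclosed tags, among other things.
        return False


# Hypothetical fragments: the first closes every tag, the second leaves
# <br> and <p> open, which a browser would still render without complaint.
well_formed = "<div><p>price: 42</p></div>"
typical_html = "<div><p>price: 42<br></div>"

print(parses_strictly(well_formed))
print(parses_strictly(typical_html))
```

The second fragment is rejected because the closing </div> arrives while
<br> and <p> are still open, which is exactly the kind of mismatch that
turns up in real pages.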

Much better to investigate whether there is an XML feed equivalent.
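
As a minimal sketch of why a feed is easier, assuming the site offers
something RSS-like: the sample feed, its URLs and the function name
feed_items below are all invented for illustration, and a real feed would
be fetched over HTTP rather than held in a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First item</title><link>http://example.com/1</link></item>
    <item><title>Second item</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def feed_items(xml_text):
    """Return (title, link) pairs for every <item> element in the feed."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in feed_items(SAMPLE_FEED):
    print(title, link)
```

Because the feed is well-formed XML with named elements, the extraction
depends only on the element names and not on page layout, so a site
redesign is much less likely to break it.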

regards
bruce

 


