The Ada Resource Association
Information about the AdaIC search engine
The AdaIC search engine provides a way to search many Ada-related web sites in a single search. Since only Ada-related sites are included, you won't get piles of unrelated pages, and you won't have to limit your search so much that you can't find the information you need.
Tips for using the AdaIC search engine
Matching on the AdaIC search engine
About the AdaIC Site Search
About the AdaIC search engine
Search Ada sites on the Web
  • Usually, you will want to put your search terms into the All of the words box, as this returns the most relevant pages.
  • If you want to look for an exact phrase, put it in double quotes ("). "access discriminants" will only return pages containing that phrase, while access discriminants in the All of the words box will return pages containing both words.
  • If your search term includes non-alphanumeric characters, it must be quoted; regular searches ignore most non-alphanumeric characters.
  • Be sure to include the archive sites (by checking them) if older information would be relevant. (All of the information on the archive sites is older than 1998.)
  • Restrict the sites searched only when necessary. The categories of sites are not exact, and many sites would fit into multiple categories, but appear in only one category.
  • The AdaIC search engine searches for words. However, very common words (the, in, is, and so on) are not indexed. If your search includes only these words, it may fail or take a very long time to complete.
  • To the AdaIC search engine, a word consists of letters, numbers (including embedded dots ('.')), and embedded ' and _ characters. The latter make it possible to search for possessives, contractions, and Ada identifiers without false matches. Other punctuation characters are ignored unless they are given in quoted text.
  • Searches on the AdaIC search engine include related words (plurals, possessives, etc.). Thus, it isn't necessary to include all forms of a word. A search for discriminant will also find discriminants and discriminant's. If you don't want related words included, quote the word or phrase.
  • Case and white space do not matter for matching.
  • Quoted text ("like this"), also known as a phrase, is matched exactly with the following exceptions:
    • White space is ignored, except that there must be some white space between words ("some thing" does not match "something", but does match "some  thing");
    • Case is ignored ("some thing" matches "Some THING").
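The word and phrase rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the actual AdaIC code (which is written in Ada 95); the names `WORD_RE`, `tokenize`, and `phrase_matches` are hypothetical, and stop-word removal and related-word matching are left out.

```python
import re

# Assumed reading of the word rule: runs of letters and digits,
# optionally joined by a single embedded '.', "'" or '_' character.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:[._'][A-Za-z0-9]+)*")


def tokenize(text):
    """Split text into lower-cased words; other punctuation is ignored."""
    return [word.lower() for word in WORD_RE.findall(text)]


def phrase_matches(phrase, text):
    """Match a quoted phrase: case is ignored, and any run of white space
    between words matches any other run, but some white space is required."""
    parts = [re.escape(part) for part in phrase.split()]
    if not parts:
        return False
    # Lookarounds keep the phrase from matching inside a longer word.
    pattern = r"(?<!\w)" + r"\s+".join(parts) + r"(?!\w)"
    return re.search(pattern, text, re.IGNORECASE) is not None
```

With these definitions, `phrase_matches("some thing", "Some THING")` succeeds, while `phrase_matches("some thing", "something")` does not, mirroring the two exceptions listed above.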
The Ada Site Search is based on search indexes created for all of the relevant Ada-related sites that we know about (these are the sites listed in Links). Redundant sites have been eliminated, as well as sites that use character sets very different from Latin-1. (If you know of a relevant site not included in Links, please send us the URL of the site so we can include it in the future.)
As of this writing, about 25,000 pages are included in all of the indexes. Text and HTML pages are indexed; other file types are not indexed. Generally we trust the site's web server to tell our indexer the type of a page (as do many web browsers). Occasionally that means we misidentify a page, so some HTML markup appears in the results.
The robots.txt file is the standard way for webmasters to tell search engines which pages not to index. A few sites ask that most of the contents of the site not be indexed. The Ada Site Search obeys these directions, so some sites have little or no material included.
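As an illustration of how an indexer can honor these directions, here is a minimal sketch using Python's standard `urllib.robotparser` module. It is not the AdaIC indexer itself; the user agent name `ExampleIndexer`, the sample rules, and the `example.com` URLs are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a site might serve it.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def may_index(rules_text, url, user_agent="ExampleIndexer"):
    """Return True if the given robots.txt rules allow indexing the URL."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)
```

A crawler would call `may_index` before fetching each page and skip any URL for which it returns `False`.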
With the exception of a few major sites, sites have been categorized into Vendor sites, Organization sites, Source Code Library sites, and Other sites. Many sites could fit into multiple categories. However, each site appears in only one category. Thus, the categories should be used only as a broad guide. For instance, source code appears on many vendor and organization sites, as well as source code library sites. So we recommend searching all of the sites unless too many results are returned.
Since many vendors sell products for many programming languages, vendor sites are pruned to pages mentioning Ada or known Ada-specific products. Other types of sites are not pruned unless they contain substantial non-Ada material. For large vendors, pruning eliminates a large amount of irrelevant material, but also might lose some valuable material. Our experiments show that more than 90% of the relevant prose pages contain a form of the word Ada; most relevant pages that don't contain the word Ada are Ada source code.
Pages matching the criteria given are primarily scored based on the number of matches for each word. Bonus scores are given to pages which are relatively new. Bonus scores also are given based on the site from which the page comes. Sites which are very trusted (such as AdaIC and AdaPower) are given the largest bonuses, while archival sites are given the smallest bonuses. ARA member sites are given larger bonuses than other vendors (thus giving ARA members better positioning in search results), but results from all vendors are returned.
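The scoring scheme described above can be sketched roughly as follows. The actual weights are not published, so every number here is a made-up placeholder, and the names `Page`, `SITE_BONUS`, `RECENT_BONUS`, and `score` are hypothetical; only the ordering of the bonuses follows the description.

```python
from dataclasses import dataclass

# Placeholder bonuses: trusted sites highest, archive sites lowest,
# ARA member sites above other vendors. Values are assumptions.
SITE_BONUS = {"trusted": 30, "ara_member": 20, "other": 10, "archive": 0}
RECENT_BONUS = 15  # assumed bonus for relatively new pages


@dataclass
class Page:
    url: str
    words: list      # lower-cased words of the page
    site_kind: str   # a key of SITE_BONUS
    is_recent: bool


def score(page, terms):
    """Count the matches for each search term, then add the bonuses."""
    matches = sum(page.words.count(term) for term in terms)
    if matches == 0:
        return 0  # a non-matching page gets no bonus
    recency = RECENT_BONUS if page.is_recent else 0
    return matches + SITE_BONUS[page.site_kind] + recency
```

Results would then be sorted by descending score, so that a recent page on a trusted site outranks an archived page with the same number of matches.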
Once a set of matching pages is determined, redundant pages are removed from the set. This is done at lookup (rather than from the individual indexes) because many separate sites may have the same pages posted. For instance, many sites have the Ada Reference Manual and the Ada 95 Rationale posted. By removing redundant copies of these pages, more relevant results can be shown. Redundant page removal is done by comparing word counts of the pages. It's possible, but very unlikely, for two significantly different pages to have the same word counts, so there is a very small chance of a non-redundant page being removed by this check.
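A minimal sketch of this kind of redundancy check follows. It assumes that "word counts" means the number of occurrences of each distinct word on a page, which is one plausible reading and not confirmed by the description; the function name `remove_redundant` is hypothetical.

```python
from collections import Counter


def remove_redundant(pages):
    """Keep the first page for each distinct word-count signature.

    `pages` is a list of (url, words) pairs, assumed to be ordered
    best result first, so the best-scoring copy is the one kept.
    """
    seen = set()
    kept = []
    for url, words in pages:
        # Two pages with identical per-word counts are treated as copies.
        signature = frozenset(Counter(words).items())
        if signature not in seen:
            seen.add(signature)
            kept.append(url)
    return kept
```

Because the signature ignores word order, two different pages that happen to use exactly the same words the same number of times would be treated as copies, which is the small risk of a false match mentioned above.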
The links on the results page are encoded versions of the real links; clicking on one will take you immediately to the correct page. This allows us to record which pages you found most relevant for particular search words. In the future, we'll use that information to improve the scoring of our result pages. If the indexes have been updated since the search was performed, these links may not work properly, so saving results pages is not recommended. However, the actual URL is always given in the results; you can paste it directly into your browser if necessary.
The AdaIC search engine was created between December 2002 and March 2003 by Tom Moran and Randy Brukardt for use on the AdaIC web server. The programs are all written in Ada 95, and were primarily created out of existing programs and components. The indexer was based on the link checker web crawler Finder. Most of the new code was devoted to the Words-Files index, the page storage and abstracting, and page scoring.
The lookup program is called directly from the Ada Server, and streams its output directly to that server's HTTP file transfer stream. Both Finder and Ada Server are built on top of Claw Sockets (and thus are Windows programs), although neither does much that is Windows-specific. When Claw Sockets is ported to other operating systems, both programs should port easily.
Ada Server is a medium-performance, reliable web server. It is written in Ada, and interfaces only to the operating system (there is no foreign language code). Because of this, programming errors generally cause exceptions, which are logged, and the task in question is reset. Only the request with the problem fails; all others continue normally. Reliability and security are enhanced further by the exclusion of foreign code of unknown reliability. Since the server is created as a single program, attacks which try to trick the server into running another program must fail (it never runs another program, so it cannot be fooled into running the wrong program). Of course, the server could be compromised by an attack on the underlying operating system or another server running on the computer. However, the primary aim of computer security is to make the system secure enough so that potential attackers look for easier systems to attack -- essentially, you just have to be more secure than your neighbor.
Sponsored by the following ARA member companies:
AdaCore, Praxis Critical Systems, SofCheck
 