Methods and technologies of information retrieval. Effective Search Techniques. Features and indexing procedures

The technology of searching for information on the Internet.

Every year the volume of the Internet grows many times over, so the probability of finding the necessary information increases dramatically. The Internet unites millions of computers and many different networks, and the number of users keeps growing.

To find the information you need, you first need to find its address. Specialized search tools exist for this purpose: index robots (search engines), thematic Internet directories, meta-search systems, people search services, and so on. This master class presents the basic technologies for finding information on the Internet, describes the general features of search tools, and examines the structure of search queries for the most popular Russian-language and English-language search engines.

Search tools are special software whose main goal is to provide Internet users with the most efficient and highest-quality information search. Search tools are hosted on dedicated web servers.

The methods of working with different search tools are almost the same. Before discussing them, consider the following concepts. The interface of a search tool is a page with hyperlinks, a query string (search bar), and query activation tools. The search engine index is an information base that contains the results of analyzing web pages and is compiled according to certain rules. A query is a keyword or phrase that the user enters into the search bar. Special characters ("", ~) and mathematical symbols (*, +, ?) are used to form various queries.

The information retrieval scheme is simple. The user types a key phrase and starts the search, and receives a selection of documents matching the formulated query. This list is ranked according to certain criteria so that the documents that most closely match the user's query appear at the top. Each search tool uses its own criteria for ranking documents, both when analyzing search results and when building the index (filling the index database of web pages).

Thus, if you enter the same query in the search bar of each search tool, you may get different search results. What matters most to the user is which documents appear among the first two or three dozen results and how well they meet the user's expectations.

Decision

Laboratory work No. 1

INFORMATION SEARCH TECHNOLOGY

1. Purpose of work
   Practical mastery of the technology of effective information retrieval.

2. General information
   2.1. Information Search Tools
   You can find almost any information you need on the Internet. The information resources of the Internet are characterized by the immense amount of material accumulated over the decades that computer systems have existed. They include text files, programs, pictures, music, and films; they are constantly updated and grow like an avalanche. Internet resources are widely used in almost all spheres of human activity and play an ever-increasing role in learning.
   Professional information retrieval skills are essential for a specialist in information technology, a field that is incredibly vast and dynamic. Professional search not only minimizes the chance of missing the information you need, but also significantly reduces the time and financial cost of searching for it.
   The following tools are used to search for information on the Internet: search engines, meta-search tools, indexed catalogs, online encyclopedias, and reference books. Modern search portals not only combine these search tools but also provide additional services, such as free email addresses and space for hosting home web pages. For an effective Internet search, you need to know how search tools work and be able to form a search query correctly.
   Search engines constantly scan available Internet sites, download the pages they find, and build a special database that stores indexed information about the downloaded pages (see, for example, the operating principles of the Rambler search engine). When a query arrives, the search engine uses the indexed information to return a list of documents ranked by the position of the query keywords in the text, their frequency, and other parameters. Although search engines share a similar operating principle, they differ in their algorithms and search principles, which are also constantly being improved; therefore, different engines return different results.
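   The idea of ranking by keyword frequency and position can be illustrated with a toy scorer. This is only a sketch: the function names score and rank, the scoring formula, and the sample documents are assumptions made for illustration, not the actual algorithm of Rambler or any other engine.

```python
def score(document: str, query: str) -> float:
    """Toy relevance score: term frequency plus a bonus for an early first occurrence."""
    words = document.lower().split()
    total = 0.0
    for term in query.lower().split():
        positions = [i for i, w in enumerate(words) if w == term]
        if not positions:
            continue  # the term does not occur in this document
        frequency = len(positions) / len(words)   # share of the text taken by the term
        earliness = 1.0 / (1 + positions[0])      # earlier first occurrence scores higher
        total += frequency + earliness
    return total


def rank(documents: list[str], query: str) -> list[str]:
    """Order documents from most to least relevant under the toy score."""
    return sorted(documents, key=lambda d: score(d, query), reverse=True)


# Hypothetical documents, invented for the example.
docs = [
    "weather news and sport",
    "information technology and information systems",
    "a short note that mentions information once",
]
ranked = rank(docs, "information technology")
```

   With these sample documents, the text that mentions both query words several times and near its beginning comes first, and the text that contains neither word comes last.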
   Currently there are a large number of search tools. Among several hundred search engines, the most popular are the search tools of the following portals.
   Google (http://www.google.com/) is the world leader in the number of indexed documents and in query processing speed. Google is the most popular search system; it searches regardless of source language, is easy to use, and has a good query language and a simple interface.
   Yandex (http://www.yandex.ru/) is the largest Russian portal; it offers users numerous services, including search and information services (12 services). The Yandex search robot constantly scans the Internet and automatically tracks changes; search results are ordered according to established relevance criteria (the degree to which what is found corresponds to what is sought). The search system provides advanced search capabilities that let you refine many search parameters.
   Rambler (http://www.rambler.ru/) is a Russian search portal. Rambler has existed as a professional search engine since 1996. It provides various types of search, including an advanced search by a combination of parameters that takes into account the morphology of the Russian language.
   Aport! (http://aport.ru/) is a Russian search portal with advanced capabilities for formulating queries; it also searches for images and multimedia files.
   AltaVista (http://www.altavista.com/). The AltaVista search engine appeared at the end of 1995 and, before the triumph of Google, was the world leader in search (its index already contained more than 20 million pages). It is designed to search for information on the Internet regardless of region; it searches for graphics, audio, and video files, offers translation into the main European languages, and was one of the first to support search in Russian.
   Yahoo! (http://www.yahoo.com/). As a search tool, it is the most authoritative directory of Internet resources. It returns the largest number of foreign sites for a search topic and supports search in Russian.
   Meta-search systems are also used to search for information. A meta-search system searches several search engines at once (up to several dozen). However, meta-search makes sense mainly when you are looking for a specific document or searching on a very narrow topic.
   2.2. Search Query Language
   A search query can generally consist of one or more words, logical operators, and punctuation marks. Simple queries do not require knowledge of the query language: if you type several words into the search bar without punctuation marks or logical operators, documents containing all of these words, at any distance from each other, will be found. Knowing the query language of a particular search engine and applying it correctly makes the search fast and efficient.
   Query Language Operators
   The AND operator (logical AND; abbreviated "&") forms a complex query that finds only documents containing both operator arguments. For example, the query "information AND technology" finds only documents that contain both the word "information" and the word "technology". The query "information & technology" gives the same result.
   Note. The AND operator is used by default, so the query "information technology" gives the same result as the query "information AND technology".
   The OR operator (logical OR; abbreviated "|") forms a query that finds all documents satisfying at least one of the operator arguments. The query "information OR technology" finds documents that contain the word "information", the word "technology", or both.
   The NOT operator (AND NOT; abbreviated "&!") forms a query that finds documents satisfying the left side of the query and not satisfying the right side. The query "information NOT technology" finds documents that contain the word "information" and do not contain the word "technology".
   Note. If a complex query includes several operators, it is executed according to the traditional priorities of these operators. You can change the order of execution by using parentheses.
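   The three operators map naturally onto set operations over the documents that contain each word. The following minimal sketch uses a hypothetical three-document collection (DOCS) and a hypothetical helper (containing); it illustrates the logic only and does not reproduce any engine's implementation.

```python
# Hypothetical mini-collection: document id -> text.
DOCS = {
    1: "information technology in education",
    2: "information theory",
    3: "technology of metal processing",
}


def containing(word: str) -> set[int]:
    """Ids of the documents that contain the given word."""
    return {doc_id for doc_id, text in DOCS.items() if word in text.split()}


info = containing("information")
tech = containing("technology")

both = info & tech        # information AND technology
either = info | tech      # information OR technology
only_info = info - tech   # information NOT technology
```

   Here AND is a set intersection, OR is a union, and NOT (AND NOT) is a set difference.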
  Quotes
   You can use double quotation marks to search for quotations or for words in a given form. Words in the part of the query enclosed in double quotation marks are searched for in documents exactly as they are written inside the quotation marks. For example, in the query information technology "apply", only the word "apply" is quoted, so a document satisfies the query only if it contains that word in exactly this form, while the other two words may appear in any form. For the query "information technology is applied", only documents containing exactly this sequence of words in a row are returned, and the number of selected documents drops sharply.
   Note. Double quotes are interpreted this way by the search engines of all the portals named above except Aport, where double (or single) quotes find the quoted phrase or one close to it, so that Aport does not distinguish between different forms of the word "apply" in a quoted query.
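   Exact-phrase matching, as described for quoted queries, can be sketched as a search for a run of consecutive identical words. The function name contains_phrase is an assumption, and the sketch ignores punctuation and morphology, which real engines handle in their own ways.

```python
def contains_phrase(text: str, phrase: str) -> bool:
    """True if the words of `phrase` occur in `text` consecutively and in the same form."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words over the text and compare it with the phrase.
    return n > 0 and any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

   For instance, contains_phrase("information technology is applied widely", "information technology is applied") holds, while the same phrase is not found in "information technology will be applied", because the word forms and their order differ.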
  Parentheses
   Parentheses can be used to build nested queries, change the scope of operators, and change the default priorities of logical operators. The part of the query enclosed in parentheses is interpreted as a query in its own right, so the rules of the query language apply to it.
   For example, for the query "data AND information OR signal", the search engine finds documents containing either both "data" and "information", or "signal". For the query "data AND (information OR signal)", it finds documents containing "data" and at least one of the words "information" or "signal".
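   The effect of parentheses can be checked on a small example. In Python, the set operator & binds more tightly than |, just as AND takes priority over OR in the query above, so the two queries translate directly. The collection DOCS and the helper has are hypothetical.

```python
# Hypothetical mini-collection: document id -> text.
DOCS = {
    1: "data and information",
    2: "signal only",
    3: "data with signal",
    4: "data alone",
}


def has(word: str) -> set[int]:
    """Ids of the documents that contain the given word."""
    return {doc_id for doc_id, text in DOCS.items() if word in text.split()}


data, information, signal = has("data"), has("information"), has("signal")

without_parens = data & information | signal    # data AND information OR signal
with_parens = data & (information | signal)     # data AND (information OR signal)
```

   Without parentheses, document 2 (which contains only "signal") is returned; with parentheses it is excluded, because every result must also contain "data".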
   Distance limit
   With a simple query, documents are found that contain all the query words within the default distance limit (in Rambler, for example, it is 40 words). Thus, for the query "information data", a document is returned only if the words "information" and "data" occur within a span of no more than 40 words, counting these two words.
   The distance limit can be changed, for example, with the Rambler construct (n, query), where n is a positive number and query is a valid search query. For example, the query "(2, information data)" returns only documents in which the words "information" and "data" appear next to each other at least once.
   Other search engines usually use different distance restriction operators. Some also let you set this parameter in the advanced search menu (see, for example, the Yandex advanced search help).
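   The distance limit can be sketched as a check that two words fit into a window of n words, counting the two words themselves. The function name within_distance and the exact way the distance is counted are assumptions based on the description above, not Rambler's actual code.

```python
def within_distance(text: str, first: str, second: str, n: int) -> bool:
    """True if `first` and `second` occur within a span of n words, both words included."""
    words = text.lower().split()
    first_pos = [i for i, w in enumerate(words) if w == first]
    second_pos = [i for i, w in enumerate(words) if w == second]
    # The span covering positions i and j contains abs(i - j) + 1 words.
    return any(abs(i - j) + 1 <= n for i in first_pos for j in second_pos)
```

   With n = 2 the words must be adjacent, which corresponds to the "(2, information data)" example; with n = 3 one other word may stand between them.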
   The query language elements described above are supported, as a rule, by all search engines. Search engines may also use other constructs, including metacharacters and special operators, so for details you should refer to the help system of the particular search engine.

3. Procedure

3.1. Read the material in paragraph 2 of this work.
   3.2. Turn on the computer assigned to you and receive an individual assignment from the teacher.
   3.3. Get acquainted, using hyperlinks, with the capabilities of the search engines of the portals specified in clause 2.1.
   3.4. Design your search query options in accordance with your assignment.
   3.5. Carry out the search for the required documents in accordance with the search query options.
   3.6. Analyze the results.
   3.7. Prepare a report and defend the laboratory work.
   3.8. Turn off the computer and tidy up the workplace.

5. Review questions
   1. How will the Yandex search engine interpret the query "information technology"?
   2. For the search query "information technology", will Aport return documents that contain these words in other grammatical forms but not in the exact form given in the query?
   3. How can I change the scope of logical operators in a search query?
   4. List the basic logical operators of the query language.
   5. What is the difference between metasearch engines and search engines?

Technology for finding information on the Internet. Types of Search Tools

1. Introduction

2. Search technology

2.1 Search tools

2.2 Search Engines

2.3 Directories

2.4 Link Collections

2.5 Databases of addresses (addresses database)

2.6 Search the Gopher archives (Gopher archives)

2.7 FTP File Search System (FTP Search)

2.8 Usenet News Conferencing Search Engine

2.9 Meta-search systems

2.10 People Search Systems

3. Conclusion

Appendix. Brief information about search engines

1. Introduction

Every year the volume of the Internet grows many times over, so the probability of finding the necessary information increases dramatically.

The Internet unites millions of computers and many different networks, and the number of users increases by 15-80% annually. Nevertheless, more and more often the main problem when using the Internet is not a lack of the required information but the ability to find it. As a rule, an ordinary person, for various reasons, cannot or does not want to spend more than 15-20 minutes searching for the answer. It is therefore especially important to learn properly what seems a simple thing: where and how to look in order to get the desired answers.

To find the information you need, you first need to find its address. Specialized search tools exist for this purpose: index robots (search engines), thematic Internet directories, meta-search systems, people search services, and so on.

The main technologies for searching for information on the Internet are described, the general features of search tools are given, and the structure of search queries for the most popular Russian and English search engines is considered.

2. Search technologies

The World Wide Web (WWW) is a special technology for preparing and publishing documents on the Internet. The WWW includes web pages, electronic libraries, catalogs, and even virtual museums! With such an abundance of information, a pressing question arises: "How can one navigate such a huge information space?" Search tools come to the rescue.

2.1 Search Tools

Search tools are special software whose main purpose is to provide Internet users with the most efficient and highest-quality information search. Search tools are hosted on special web servers, each of which performs a specific function:

1. Analysis of web pages and recording the results of the analysis at one or another level of the search server database.

2. Search for information at the request of the user.

3. Providing a convenient interface for searching information and viewing the search result by the user.

The methods of working with different search tools are almost the same.

First, consider the following concepts:

1. The interface of a search tool is a page with hyperlinks, a query string (search bar), and query activation tools.

2. The search engine index is an information base that contains the results of analyzing web pages and is compiled according to certain rules.

3. A query is a keyword or phrase that the user enters into the search bar. Special characters ("", ~) and mathematical symbols (*, +, ?) are used to form various queries.

The information retrieval scheme is simple. The user types a key phrase and starts the search, and receives a selection of documents matching the formulated query. This list is ordered according to certain criteria so that the documents that most closely match the user's query appear at the top. Each search tool uses its own criteria for ranking documents, both when analyzing search results and when building the index (filling the index database of web pages).

Thus, if you enter the same query in the search bar of each search tool, you may get different search results. What matters most to the user is which documents appear among the first two or three dozen results and how well they meet the user's expectations.

Most search tools offer two search methods - simple search and advanced search, with or without a special request form. Let us consider both types using an English-language search engine as an example.

For example, AltaVista is convenient for free-form queries such as "Something about online degrees in information technology", while the Yahoo search tool lets you get world news, exchange rates, or weather forecasts.

Mastering query refinement criteria and advanced search techniques increases search efficiency and lets you find the necessary information quickly. First of all, you can increase search efficiency by using the logical operators OR, AND, NEAR, and NOT, as well as mathematical and special characters. Using operators and/or symbols, the user links keywords in the desired sequence to get the most relevant search result. The forms of English-language requests are given in Table 1.

Table 1

Simple request:
   merchant account
   internet merchant account
   "merchant account"
   "internet merchant account"

Extended request:
   internet merchant account and
   internet merchant near gov*
   internet merchant near education

Advanced request using mathematical characters:
   internet + merchant + account
   internet ~ merchant ~ gov*
   internet ~ merchant ~ governor
   internet ~ merchant ~ (governor

A simple request returns a certain number of links, since the list includes documents that contain any one of the words entered, or a simple phrase (see Table 1). The "and" operator specifies that all keywords must be present in the document. However, the number of documents may still be large, and viewing them takes considerable time. In some cases it is therefore much more convenient to use the context operator "near", which specifies that the words must be located sufficiently close to each other in the document. Using "near" greatly reduces the number of documents found. The "*" symbol in the query string means that the word is searched for by its mask. For example, writing "gov*" in the query string returns a list of documents containing words that start with "gov", such as government or governor.
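Searching by mask can be illustrated with shell-style wildcard matching from the Python standard library (fnmatch.fnmatchcase). The helper name match_mask and the word list are assumptions made for the example; a real engine applies masks to its index rather than to a plain list.

```python
from fnmatch import fnmatchcase


def match_mask(words: list[str], mask: str) -> list[str]:
    """Return the words that match a mask such as 'gov*', ignoring letter case."""
    return [w for w in words if fnmatchcase(w.lower(), mask.lower())]


# Hypothetical vocabulary, invented for the example.
vocabulary = ["government", "governor", "Internet", "gov", "merchant"]
found = match_mask(vocabulary, "gov*")
```

In this sketch "*" stands for any sequence of characters, including an empty one, so the bare word "gov" also matches the mask.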

The most developed service for searching for Russian-language information is provided by the Yandex search server.

In Yandex, you can simply write a phrase in Russian that describes what you want to find, and the system will analyze and process your request, and then try to find everything that relates to a given topic.

Using special operators, you can also compose a query string that tells the search engine what the information you are interested in should satisfy. Some of the Yandex query language operators are described here: http://help.yandex.ru/search/?id=481939

The equally popular search engine Rambler keeps statistics on visits to links from its own database and supports the same logical operators AND, OR, NOT, the metacharacter * (similar to the * word-extension character in AltaVista), and the coefficient symbols + and -, which increase or decrease the significance of the words entered in the query.

Let's look at the most popular technologies for finding information on the Internet.

2.2 Search engines

Web search engines are servers with a huge database of URLs. They automatically access the WWW pages at all of these addresses, examine their contents, extract keywords from the pages, and write them to their database (that is, they index the pages).

Moreover, search engine robots follow the links found on the pages and index those pages in turn. Since almost any WWW page has many links to other pages, a search engine working this way can theoretically crawl all sites on the Internet.
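How a robot reaches new pages by following links can be sketched as a breadth-first traversal. To stay self-contained, the sketch walks a hypothetical in-memory link graph (WEB) instead of downloading real pages; the addresses and the function name crawl are invented for illustration.

```python
from collections import deque

# Hypothetical link graph: page address -> addresses it links to.
WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example", "d.example"],
    "d.example": [],
    "isolated.example": [],
}


def crawl(start: str) -> list[str]:
    """Visit every page reachable from `start` by following links, each page once."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in WEB.get(page, []):
            if link not in seen:      # do not revisit pages already queued
                seen.add(link)
                visited.append(link)
                queue.append(link)
    return visited


order = crawl("a.example")
```

The page "isolated.example" is never visited because no other page links to it, which is one reason the coverage of the whole Internet is only theoretical.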

This type of search tool is the best known and most popular among Internet users. Everyone has heard the names of well-known web search engines: Yandex, Rambler, Aport.

To use this type of search tool, open it and type your keyword in the search bar.

To make your search more effective, pay attention to the following points in advance:

Determine the subject of the request. What exactly do you want to find in the end?

Pay attention to language, grammar, the use of various non-letter characters, and morphology. It is also important to formulate and enter keywords correctly. Each search engine has its own query syntax: the principle is the same, but the symbols or operators used may vary. The required query forms also vary depending on the complexity of the search engine software and the services it provides. In any case, each search engine has a "Help" section where all syntax rules, recommendations, and search tips are clearly explained.

Use the capabilities of different search engines. If you did not find something on Yandex, try Google. Use advanced search services.

To exclude documents containing certain terms, put the "-" sign before each such word. For example, if you need information about Shakespeare's works other than Hamlet, enter the query as: Shakespeare -Hamlet. To make sure certain terms are included in the search results, use the "+" symbol: for links specifically about car sales, use the query "sale +car".

Each link in the list of search results contains a snippet - several lines from the found document that include your keywords. Before clicking a link, judge how well the snippet matches the subject of your query. After following a link to a site, look carefully at its main page. As a rule, the first page is enough to understand whether you have come to the right place. If so, continue searching for the necessary information on that site (in its sections); if not, return to the search results and try the next link.

Remember that search engines do not produce information of their own (except for explanations about themselves). A search engine is only an intermediary between the owner of the information (the site) and you. Databases are constantly updated and new addresses are added, but there is always a lag behind the information that actually exists in the world, simply because search engines do not work at the speed of light.
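The "-" and "+" signs from the tips above can be sketched as a small filter. The names parse_query and matches are hypothetical, and the treatment of unmarked words (at least one of them must occur) is an assumption; each real engine defines this behaviour in its own help section.

```python
def parse_query(query: str) -> tuple[list[str], list[str], list[str]]:
    """Split a query into required (+word), excluded (-word) and plain words."""
    required, excluded, plain = [], [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        elif token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        else:
            plain.append(token)
    return required, excluded, plain


def matches(text: str, query: str) -> bool:
    """True if the text satisfies the query under the assumptions stated above."""
    words = set(text.lower().split())
    required, excluded, plain = parse_query(query)
    if any(w in words for w in excluded):
        return False                      # an excluded word is present
    if any(w not in words for w in required):
        return False                      # a required word is missing
    # Assumption: at least one unmarked word must occur, if there are any.
    return not plain or any(w in words for w in plain)
```

For example, matches("Shakespeare wrote Macbeth", "Shakespeare -Hamlet") holds, while a text that mentions Hamlet is rejected by the same query.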

The best-known web search engines include Google, Yahoo, AltaVista, Excite, HotBot, and Lycos. Among Russian-language engines, Yandex, Rambler, and Aport stand out.

Search engines are the largest and most valuable, but far from the only sources of information on the Web.

Internet Search Technologies

Every year the volume of the Internet grows many times over, so the likelihood of finding the necessary information increases dramatically. The Internet unites millions of computers and many different networks, and the number of users increases by 15-80% annually. Nevertheless, more and more often the main problem when using the Internet is not a lack of the information sought but the ability to find it. As a rule, an ordinary person, for various reasons, cannot or does not want to spend more than 15-20 minutes searching for the answer. It is therefore especially important to learn properly what seems a simple thing: where and how to look in order to get the desired answer. To find the information you need, you first need to find its address. Specialized search tools exist for this purpose: index robots (search engines), thematic Internet directories, meta-search systems, people search services, and so on. This master class presents the basic technologies for searching for information on the Internet, describes the general features of search tools, and examines the structure of search queries for the most popular Russian and English search engines.

The World Wide Web (WWW) is a special technology for preparing and publishing documents on the Internet. The WWW includes web pages, electronic libraries, catalogs, and even virtual museums! With such an abundance of information, a pressing question arises: "How can one navigate such a huge information space?" Search tools come to the rescue. Search tools are special software whose main purpose is to provide Internet users with the most efficient and highest-quality information search. Search tools are hosted on special web servers, each of which performs a specific function:

1. Analysis of web pages and recording the results of the analysis at one or another level of the search server database.

2. Search for information at the request of the user.

3. Providing a convenient interface for searching for information and viewing the search results.

The methods of working with different search tools are almost the same. Before discussing them, consider the following concepts:

1. The search tool interface is a page with hyperlinks, a query string (search bar), and query activation tools.

2. A search engine index is an information base that contains the results of analyzing web pages and is compiled according to certain rules.

3. A query is a keyword or phrase that the user types in the search bar. Special characters ("", ~) and mathematical symbols (*, +, ?) are used to form various queries.

The information retrieval scheme is simple. The user types a key phrase and starts the search, and receives a selection of documents matching the formulated query. This list is ranked according to certain criteria so that the documents that most closely match the user's query appear at the top. Each search tool uses its own criteria for ranking documents, both when analyzing search results and when building the index (filling the index database of web pages). Thus, if you enter the same query in the search bar of each search tool, you may get different search results. What matters most to the user is which documents appear among the first two or three dozen results and how well they meet the user's expectations. Most search tools offer two search methods - simple search and advanced search, with or without a special request form. Let us consider both types using an English-language search engine as an example. For example, AltaVista is convenient for free-form queries such as "Something about online degrees in information technology", while the Yahoo search tool lets you get world news, exchange rates, or weather forecasts.

Mastering query refinement criteria and advanced search techniques increases search efficiency and lets you find the necessary information quickly. First of all, you can increase search efficiency by using the logical operators OR, AND, NEAR, and NOT, as well as mathematical and special characters in queries. Using operators and/or symbols, the user links keywords in the desired sequence to get the most relevant search result. [9]

There are more and more resources on the global network, and finding the necessary information becomes more difficult every day. As a result, participants in the search engine market increasingly feel that today's search technologies are outdated and that the very concept of search needs to change. At present, Google is still the undisputed leader in search: 47% of all Internet users choose this service, followed by Yahoo! and MSN with 21% and 13% respectively, so that in total more than 80% of users worldwide prefer these three search engines. However, none of the three main search servers can boast a high degree of loyalty among its regular users: almost 71% of those who searched on Yahoo! also sometimes visit one of the other two services, Google or MSN Search, and 70% of those who searched on MSN have also tried their luck with a competing search engine. Seeing such dissatisfaction with search results, the creators of search engines try to improve them and apply new search technologies. For example, Google launched a so-called self-constructor on its portal, where users can customize the search page as they wish. A user interested in the weather can add a weather widget for his or her city; the display of news, stock quotes, and many other useful things can likewise be adjusted to the user's interests. Naturally, all such settings are available only while the user is logged in to the search engine's site. With the advent of this technology, Google's site in many ways moved ahead of its competitors, the oldest Internet portals Yahoo! and MSN.

Search is a process in which what is sought is compared, in one sequence or another, with each object stored in an array.

In computing, "information search" is a set of logical and technical operations whose ultimate goal is to find facts, data, and documents relevant to the consumer's request.

A relevant document is a document containing the required information.

Search tools

  1. Search engines (search engines);
  2. Thematic catalogs (categories);
  3. Specialized catalogs (online encyclopedias and reference books);
  4. Metasearch systems.

Thematic catalogs

Thematic catalogs are a systematic collection (selection) of links to other Internet resources. The links are organized as a thematic rubricator - a hierarchical structure through which you can navigate to find the information you need.

Specialized Catalogs

Specialized catalogs or directories are created for specific industries and topics, for news, for cities, for email addresses, etc.

Metasearch Tools

With meta-search tools, a query is executed simultaneously by several search engines. The results are combined into a single list ordered by degree of relevance.
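One simple way to combine several ranked lists into a single one is to add up reciprocal ranks, so that a document placed high by several engines ends up near the top. The sketch below assumes each engine has already returned an ordered list of addresses; the scoring rule, the function name merge_results, and the sample lists are illustrative assumptions, not the method of any particular meta-search system.

```python
def merge_results(result_lists: list[list[str]]) -> list[str]:
    """Merge ranked result lists: each appearance adds 1/rank to the document's score."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical results returned by three engines for the same query.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "x.example"]
engine_c = ["y.example", "w.example"]

merged = merge_results([engine_a, engine_b, engine_c])
```

Duplicates are collapsed automatically, because every address appears only once as a key in the score table.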

Search engines

Search engines (the most advanced search tool on the Internet) are automatic systems that poll servers connected to the global network and store information about the data available on those servers in their own database.

Search engines consist of three parts: a robot, an index, and a query processing program.

A robot (Spider, Robot or Bot) is a program that visits web pages and reads (in whole or in part) their contents.

The index is a data store that holds copies of all pages visited by the robots.

The query processing program is a program that, in response to a user's query, scans the index for the required information and returns links to the documents found.

Search engines work in four stages:

1. Web space scan

Using robots, a search engine scans the available Web space around the clock and copies every page it encounters.

2. Resource indexing

Pages discovered by the search robots are processed by the search engine's software, and a special database called an index is compiled from them. The purpose of indexing is to obtain an index file that allows a client's request to be processed almost instantly.

3. Search by request

The search engine receives a request from the user in the form of keywords and consults its own database rather than the Network. The number of pages found can be very large, so the results are ranked before they are sent to the client.

4. Formation of the results page

The system dynamically generates a web page that presents the formatted search results.
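
The four stages can be illustrated with a toy Python sketch. Everything in it is an assumption made for the example: the PAGES dictionary stands in for the pages a robot has copied, the index is a simple word-to-page mapping, and ranking by the number of keyword occurrences is a deliberately naive criterion, not the one real search engines use.

```python
from collections import defaultdict

# Stage 1 (stand-in): pages a robot might have copied. Invented data.
PAGES = {
    "https://example.org/1": "open education technology for open courses",
    "https://example.org/2": "legal system reference",
    "https://example.org/3": "education news",
}

def build_index(pages):
    """Stage 2: map each word to {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Stage 3: look up keywords in the index only, then rank the matches."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Rank by total keyword occurrences; ties are ordered by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

def render(urls):
    """Stage 4: format the ranked list as a plain-text results page."""
    return "\n".join(f"{rank}. {url}" for rank, url in enumerate(urls, 1))

INDEX = build_index(PAGES)
print(render(search(INDEX, "open education")))
```

Note that search never touches PAGES: once the index is built, a query is answered from the index alone, which is why the answer can be produced almost instantly.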

Today, a fairly large number of search engines are known.

The largest and most popular search engine processes 42 billion queries per month, indexes more than 25 billion web pages, and can find information in 195 languages. It supports searching documents in PDF, RTF, PostScript, Microsoft Word, Microsoft Excel, Microsoft PowerPoint, and other formats.

The fastest and most reliable way to find information on the Internet is to go directly to its URL (Uniform Resource Locator).

For quick access to a resource, it is enough to launch a browser and type the known URL in the address bar.

For example, typing bolohovomt.ru in the address bar takes you to the site of the Bolokhov Engineering College.

One of the most common types of search is keyword search. Consider this type of search using the Google search engine as an example (see the video "Information Search.mp4").

To search by keywords, enter the word or words to be found in the search box and click the Find button. The search engine will look them up in its database and display the documents containing these words.

The speed of obtaining the result depends on the characteristics of the communication channels, the features of the organization of the search engine, and on the "quality" of building the request.

Although the user cannot directly influence how search engines operate, the quality of the search query is entirely under the user's control.

Simple Search Techniques

1. Word Group Search

A search for the single word "open" or "education" returns a large number of diverse links on completely different topics, few of them related to "open education". It is therefore recommended to add one or two keywords related to the topic you are looking for, for example, "open education" or "open education technology". The scope of the question should also be narrowed. If you need information about the legal system of the Guarantor, the query "legal system of the Guarantor" will produce more suitable documents than just "legal system". The number of words in a group is not limited.

2. Word Form Search

In most cases, a search engine by default searches for all word forms of each query word. However, you can tell the search engine not to go through all the word forms of the query words. Many systems use an exclamation mark for this. For example, the query "!Computer" finds pages containing exactly this word, without taking its other word forms into account.

3. The role of capital letters

If the user enters a keyword that begins with a capital letter, the search engine will not find pages in which this word begins with a lowercase letter. Capital letters are therefore recommended in queries only for proper names, for example, "the city of Moscow" or "Marcus Tullius Cicero".
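
The rule can be expressed as a small Python check. This is a sketch under an assumption the text does not state: a query word written in lowercase is taken to match any capitalization. The function name matches_case_rule is invented for the example.

```python
def matches_case_rule(query_word, page_word):
    """Return True if page_word satisfies query_word under the rule above."""
    if query_word[:1].isupper():
        # Capitalized query word: the page word must be written identically.
        return query_word == page_word
    # Assumption: a lowercase query word ignores case entirely.
    return query_word.lower() == page_word.lower()

print(matches_case_rule("Moscow", "moscow"))
print(matches_case_rule("moscow", "Moscow"))
```

Under this rule, typing a common word with a capital letter can only shrink the set of results, which is why capitals are best reserved for proper names.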

4. Wildcard Meaning

When it is not certain that the search engine handles word forms correctly (for example, with proper names or words of foreign origin), search engines allow the use of wildcards. Most often this is the symbol "*", which stands for any number of any characters up to the end of the word. For example, if the user wants to find pages containing the words "Republic of Tatarstan" but would also accept "Tatar Republic", the request "Republic of Tatar*" should be submitted.
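
One way to picture what the "*" wildcard does is to translate it into a regular expression. The Python sketch below is an illustration only; the function names are invented, and a real search engine matches wildcards against its index rather than against raw text.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a pattern containing '*' into a compiled regular expression."""
    # Escape everything, then let each '*' stand for any run of word characters.
    body = re.escape(pattern).replace(r"\*", r"\w*")
    return re.compile(rf"\b{body}\b", re.IGNORECASE)

def find_matches(pattern, text):
    """Return every word in the text that the wildcard pattern covers."""
    return wildcard_to_regex(pattern).findall(text)

text = "The Republic of Tatarstan is also called the Tatar Republic."
print(find_matches("Tatar*", text))
```

Here the single pattern "Tatar*" covers both "Tatarstan" and "Tatar", so neither spelling has to be listed in the query.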

5. Accounting for reserved words

Reserved words (stop words) are words that are ignored in the search. They usually include all short words of fewer than four letters (prepositions, conjunctions, etc.). For example, the query "we are in Italy" will find documents that contain the word "Italy" or one of its word forms.
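
The length-based rule can be sketched in Python as follows. The four-letter threshold comes from the description above; the function name is invented, and real search engines typically rely on curated stop-word lists rather than on word length alone.

```python
MIN_LENGTH = 4  # words shorter than this are treated as stop words

def strip_stop_words(query):
    """Keep only the query words that are long enough to be searched for."""
    return [word for word in query.lower().split() if len(word) >= MIN_LENGTH]

print(strip_stop_words("we are in Italy"))
```

Only the word "italy" survives the filter, which matches the example: the search is effectively carried out for "Italy" alone.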

6. Contextual Search Tools

If the keywords are enclosed in quotation marks, the search engine should find documents in which the given phrase appears verbatim (quotation search).
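
The difference between a phrase search and an ordinary keyword search is that the words must appear next to each other and in the same order. A minimal Python sketch, assuming that letter case and punctuation are ignored (the text above does not specify this) and using invented function names:

```python
import re

def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def contains_phrase(text, phrase):
    """Return True if the words of the phrase occur consecutively in the text."""
    words, target = tokenize(text), tokenize(phrase)
    n = len(target)
    return n > 0 and any(words[i:i + n] == target for i in range(len(words) - n + 1))

print(contains_phrase("Open education is growing", "open education"))
print(contains_phrase("Education that is open", "open education"))
```

The second text contains both words but not the phrase itself, so only the first text would be returned for the quoted query.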

Advanced Search Techniques

For faster and more successful searches, various logical operators are used in search engines together with keywords. They make it possible to construct a query that finds not merely sites on a topic of interest but specific pages and even individual documents. The rules for composing complex queries may differ from one search engine to another, but in any case the following basic operators are used:

1. AND operator

This operator combines two or more words so that all of them must be present in the document found. The symbols & or + are often used instead of AND. Example: the query lawyer AND program finds documents containing both words.

2. OR operator

This operator searches for any of the words in the group. Example: the query education OR training finds documents containing the word education or the word training.

3. Logical brackets

Brackets are used when the order in which logical operators are applied needs to be controlled. Example: the query Lomonosov OR (Mikhail AND Vasilievich) finds documents containing either the word Lomonosov or both of the words Mikhail and Vasilievich.

4. NOT operator

This operator is used when a keyword needs to be excluded from the search results. For example, the query lawyers NOT attorneys finds information about lawyers who are not attorneys.
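
The logical operators correspond directly to set operations on the documents that contain each word: AND is an intersection, OR is a union, NOT is a difference, and brackets group the operations. The Python sketch below uses an invented four-document collection and a hypothetical docs_with helper; it illustrates the logic and is not the query syntax of any particular search engine.

```python
# Invented sample: document id -> set of words it contains.
DOCS = {
    1: {"lomonosov", "university"},
    2: {"mikhail", "vasilievich", "biography"},
    3: {"mikhail", "education"},
    4: {"training", "courses"},
}

def docs_with(word):
    """Set of document ids containing the word."""
    return {doc_id for doc_id, words in DOCS.items() if word in words}

# AND -> intersection: both words must be present.
and_result = docs_with("mikhail") & docs_with("vasilievich")
# OR -> union: either word is enough.
or_result = docs_with("education") | docs_with("training")
# Brackets: the AND inside the brackets is evaluated before the OR.
grouped = docs_with("lomonosov") | (docs_with("mikhail") & docs_with("vasilievich"))
# NOT -> difference: drop documents containing the excluded word.
not_result = docs_with("mikhail") - docs_with("education")

print(and_result, or_result, grouped, not_result)
```

Without the brackets, the same words combined in a different order could select a different set of documents, which is why grouping matters in complex queries.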
