Search through text files. WinGrep for searching text in files

To say that in our time of information technology and the infinite growth of the amount of data available both to an individual person and to society, there are many problems with the processing of information and its search - this is blasphemy. Who only does not raise this topic. And in order not to load you with subjective and, partly, objective judgments gleaned from various information sources regarding the problem, I will go directly to its solution. Today let's talk about the search. That is, about programs and serious information systems that search for the documents and data we need.

Direct Search Upgrade

Not so long ago, when the trees were large, and there was not much information even in the local network of the enterprise, any search was carried out by banal search of a handful of available files and sequentially checking their names and contents. Such a search is called direct, and programs (utilities) that use direct search technology are traditionally present in all operating systems and tool packages. But, even the power of modern computers is not enough for a quick and adequate search in gigantic amounts of data with a direct search. Searching for a couple of hundreds of documents on disk and searching in a huge library and several dozen mailboxes are two different things. Therefore, direct search programs today clearly go by the wayside - when it comes to universal tools.

Of course, in the corporate sector this type of search has not been in demand for a long time. Volumes are not the same. And, therefore, for already a year, and recently definitely, technologies capable of quickly and accurately searching for documents of various formats and from various sources are more than relevant. Not so long ago, Microsoft's “dad” Bill Gates, envying, apparently, the phenomenal success of Google’s Internet search engine, at one of the press conferences announced the desire of the software (and not only) in every possible way to contribute, develop and deepen the creation of search engines and technologies. But it’s too early to create any phenomenally working program from Microsoft or a competitive Internet server (MSN still doesn’t reach Google). Therefore, we turn to existing developments. Index, query, relevance

At the heart of modern technology are two fundamental processes. Firstly, it is indexing the available information and processing the request with the subsequent output of the results. As for the first, then any program (whether it is a desktop search engine, corporate information system or the Internet search engine) creates its own search area. That is, it processes documents and forms an index of these documents (an organized structure that contains information about the processed data). In the future, it is the created index that is used to work - to quickly obtain a list of the necessary documents as requested. Further, though by no means simple in terms of technology, it is quite understandable to the average user. The program processes the request (for a keyword phrase) and displays a list of documents that contain this keyword phrase. Since the information is contained in a structured index, the query processing is much (tens and hundreds of times!) Faster than in the case of direct search (documents are selected not by enumerating files, but by analyzing textual information in the index).

The program displays the found documents in the resulting list according to relevance - the document matches the query text. In various technologies, of course, there are various methods for searching and determining the relevance of a document (the number of "occurrences" of a word and its frequency of mention in the document, the ratio of these parameters to the total number of words in the document, the distance between the words of the query phrase in the desired files, and so on). Based on these parameters, the "weight" of the document is determined and, depending on it, this or that file appears in the list of results at a certain position. In the case of Internet search, the situation is even more complicated. Indeed, in this case, one must take into account many other factors (Page Rank Google is an example of this). But this is a topic for a separate article, so we won’t touch the Internet. Search engine review

This article discusses the capabilities of several popular search programs, which boast both decent speeds and good functionality. But to show off in brochures is one thing, but to withstand the gaze of an expert is quite another. And there were no more experts, no less than a complete office for fans of picking software for its usability. A set of programs was installed on the experimental computer (Athlon 2.2 MHz, with 1 GB RAM, 160 gigabyte ID Seagate 7200 rpm hard drive and Windows XP system): dtSearch Desktop, Sniffer Prof Deluxe, Google Desktop Search, SearchInform , Copernic Desktop Search, ISYS Desktop. For the tests, a text database of documents in the doc, txt and html formats was compiled with a total size of neither more nor less, but 20 gigabytes. A group of comrades, led by your humble servant, tested, compared and shared their subjective impressions for each software. Read the summary of the data below. dtSearch Desktop

The program, claiming, according to developers, the fastest, most convenient and best search engine. Like, in general, everyone else from this review. The dtSearch interface is quite simple, but some windows or tabs are somewhat overloaded with elements, which creates the impression of complexity of use. But in fact, there are no special difficulties. The only really unpleasant moment is the lack of support for the Russian language software (despite the fact that the program can search for documents in several languages, its interface is exclusively English).

But dtSearch is one of the few programs that can index web pages to a user-specified "depth" (though, given the "additional purchase" in the dtSearch Spider add-on kit). This is in addition to supporting files on the disk of various text formats and emails from the Outlook mailbox. At the same time, the program does not know how to work with databases, which are such a tidbit for search engines because of the large amounts of information contained in them and the wide distribution in companies, and therefore in corporate networks. The speed of indexing dtSearch documents was up to standard. Looking ahead, I will say that this program coped with indexing a given amount of information at a level with another competitor - iSYS - and shared with it the second place in the list of fastest systems. Test 20 gigabytes of information dtSearch indexed in 6 hours 13 minutes, creating an index of 7.9 GB for the needs of a subsequent search.

As for the search capabilities, here they are up to standard. Firstly, dtSearch has a morphological search (word search in all its morphological forms). Using this opportunity, you free yourself from, say, such thoughts as "in what case was a word used in the document I needed?". The use of morphological search is almost always justified, therefore it should be present in any professional search engine.

Sound search is a non-standard feature even for professional search engines. Its essence is that the program will look for words that sound the same as the word you entered. And best of all, this feature also works for the Russian language! For example, typing the word “ear” in a search query, you will see not only the words “ear”, but also “ear” as a result.

Error correction search is a very important function. It is used to search for words containing syntax errors - these can be either typos or errors in documents received using character recognition systems, for example. A simple example is you are looking for the word keyboard. Some document contains the word "keyboard", it is obvious that in fact this word is "keyboard", it’s just that a person was typed up when typing. So, search with error correction, it will detect and include a document with the word "keyboard" in the result. DtSearch also has a setting that allows you to determine the degree of possible erroneous characters.

Search using synonyms. This feature uses a list of synonyms for various words. So, for example, entering the word “fast”, the program will also find the words “fast” and others, which are synonyms for the word “fast”, if any, of course, are present in the list of synonyms. A ready-made list of synonyms is not supplied with the dtSearch program, however, it is possible to use lists on the Internet (accordingly, a connection is required, which is not always convenient), or you can create your own list of synonyms.

In addition to these features, dtSearch can perform searches using phrases consisting of words connected by logical operations. Each word in the query can be set to its "weight", that is, significance. A useful option is the use of a dictionary consisting of insignificant words in order not to take them into account in the search, however this dictionary is also empty and it will have to be completed independently.

Next, consider the features of the program when working on the network. In fact, dtSearch does not offer any specific features for working with the network. However, using it on a network is entirely possible. Alternatively, you can create some index and put it in a public (shared) folder. The program itself can be installed for each user on a computer, or put it also on a folder open for public access, and in a special way create shortcuts for each user separately using command line parameters, the purpose of which is described in the help file that comes with the program. Also, it is possible to automatically install the program on the network using an MSI file. In this case, the settings for each connected user will be taken into account.

In general - a good program from the category of professional search engines. It can claim a good mark, but gaining trust and respect from users can be difficult for dtSearch due to some factors (not everything is smooth with the interface, Russian users are deprived, there are no bright features for working with the network). As for the direct search for documents, the program did not have overlays with Russian text. As there was none of them with the declared morphology, nor with a fuzzy search. The system quite adequately found the necessary documents both by a simple request in one word and by using as a key phrase a pair of paragraphs or a document.

Official site:
Distribution size: 23 Mb

Based on the name, one can guess that there is support for the Russian language in this program. This is already nice. As for the interface, in general, it is somewhat unusual, but it looks very attractive. Convenience is another matter. A very controversial criterion, but still, probably, a multi-window solution is not the most successful option (the query is entered in one window, the result is displayed in another and the like).

The snooper uses all the same indexes to perform a quick search, but indexing is much slower than other programs. This is very strange, especially considering that her search query processing capabilities are very weak, which means that the structure of the index is not complicated. Most likely, the point here is in non-optimized algorithms. This program turned out to be a clear outsider of indexing and search speeds: the time taken to create the index is six times that of the same dtSearch and iSYS. Indexing 20 gigabyte texts for a bloodhound resulted in 38 hours and 46 minutes of work. And the created "search area" on the hard drive took the same size as the original data with a slight minus - 19 gigabytes.

A snoop can be presented as an alternative to the standard search in Windows, it is hardly capable of more. The fact that the primary task of the Bloodhound is the simplest file search indicates not only a small number of functions for analyzing the text of search queries and an advanced search on file attributes, but even a results window that gives direct links to found files, as well as folders containing these files. The results window is not very informative in the sense that you can read the entire found file only by running it, that is, it does not have a built-in file viewer. But an excerpt is issued from the file where the search word was found, in general, this display scheme is very similar to Internet search engines.

Speaking about specific possibilities for processing search queries, it is worth noting that there is no such thing as “search for text,” the maximum that you can search for is a phrase, if only because there is no multi-line text input field. Nevertheless, it is possible to analyze the entered phrase and Snoop offers us a standard search set here: logical operations, mask search and citation search ... not a lot. The program contains some rudiments of morphological search, but probably so crude that it rather interferes with correct operation (during the tests, many overlays with incorrect use of morphology were noticed).

But the program allows you to specify file attributes when searching (document date, file name, folder name), and you can also use the same search set in these queries. Also, you can search for letters by specifying the parameters (From, Subject .... etc.).

So, we figured out the search itself, what else is the program interesting for, for which it received so many awards, according to information from the official website? It’s hard to say that it’s so special in it that most likely the Ischek’s interface has itself (especially externally, not to mention usability).

Operations with indexes are very standard, a pleasant moment is the ability to update indexes on a schedule. In addition, indexes can also be used on the network. From this moment it is necessary in more detail.

Despite the primitiveness of search queries, the program can be used to search for files, so its use can be justified in networks. Although it’s a stretch, as in a large network, the priority is to quickly search for data using complex search queries due to the huge amount of information - but the speed of the search and the program is clearly a problem. It must be said that the work with the network at the Bloodhound is thought out as it should. Especially for this, a separate application is designed - Snoop Server. It works in the same way as just Snoop (they have one search engine), only for documents hosted on a central server or on shared resources on a corporate network. The Server Snoop creates new indexes on shared resources, or uses previously created ones. Any user of the corporate network can connect to the Server Snooper and use it to access any document (located in the current index) using an Internet browser. You must admit that such a scheme is extremely convenient: it turns out that files on your own network can be searched in the same way as information on the Internet through, for example, Google.

Assessing all the advantages and disadvantages of this program, the conclusion suggests itself that for corporate networks its capabilities are most likely not enough (despite the Doge’s good organization of working with the network), but for a home computer or even for a home network may come up. Although neither the speed of the work nor the search capabilities inspire optimism ...

Official website in Russian:
Distribution size: 6 MbGoogle Desktop Search + GDS Enterprise

Of course, we could not ignore such a famous developer. The name Google already says a lot. The people who have used the powerful Internet search engine for years will probably decide without any doubt to install this particular search engine on the computer. Think about it: Google on your home computer! However, not succumbing to provocations with a widely promoted brand, we will try soberly, and most importantly objectively, to consider the possibilities of a "desktop" search engine from Google.

The first thing that catches your eye is the lack of its own shell for the program. Google Desktop Search is still in the browser window, respectively, the entire interface of the desktop version went to the software from an older Internet brother. Whether this is good or bad is a moot point: someone likes minimalism in the design of this search engine, and someone wants to see a full-fledged application filled with all kinds of buttons and so on.

What catches your eye immediately after design? And the fact that this very Google Desktop Search begins to index everything on the computer in a row, without any demand! And most interestingly, it’s impossible to select indexing paths using Google Desktop Search. You will have to download a separate program (TweakGDS), which will allow you to slightly expand the settings of Google Desktop, including specifying the necessary places for indexing. Although, while you’ll figure it all out, it’s already indexing the standard hard drive, so this setting is needed more when working with large data arrays, which is very important when used in corporate networks (Enterprise version). However, it is not a fact that after downloading TweakGDS, your problems will be solved. After all, she needs the Microsoft .NET Framework and Microsoft Scripting Runtime to work. Yeah ... the installation, as well as access to the settings, could be made easier, although developers can probably understand: why write something new when there is a ready-made search engine, port it to the local computer and let the user "enjoy" , and a famous name will make another masterpiece of "this". C'mon, let's end this digression and move on to the search.

As for the analysis of search queries and the issuance of results, here everything is absolutely identical to Google on the Internet: the same system for displaying results, the same standard set of logical operations for search queries. In general, Google Desktop Search, like the previous program, is intended exclusively for searching files - of course, there is no internal viewer of these files in it. The number of file formats supported by Google Desktop Search is quite sufficient, and it’s also nice that it searches the visited Internet pages, taking data from the cache. Search and index speeds are quite acceptable. True, for home use. With an impressive 20 gigabytes of texts, Google Desktop Search managed in 8 hours 17 minutes. Spending several days on processing information from the corporate network of a large enterprise does not smile at any system administrator. From the pluses: the size of the created index was at the level of (4.5 GB) with another search engine tested in this review - SearchInform.

The big advantage (or the omission is up to you) of Google Desktop Search is that it supports plugins that can change a lot for the better. Another thing is that connecting plug-ins and configuring them complicates the task of installing a search engine so much that you start to wonder if all this is necessary when you can install a normal, full-fledged program in which everything will already be present. After all, to use every opportunity you will have to install a new plugin. Even in order for the program to fully work with archives, you need a separate lotion. Fascinating and alluring free of all these additional modules. However, if you do not take into account the desktop version of the search engine, then the competent configuration of GDS Enterprise may not be possible for you - it is not for nothing that Google experts offer their services for setting up their own software for your network for only $ 10,000.

If you nevertheless master the setup and installation procedure (or pay $ 10,000 to the quick response team from the Google office), then you will understand that the complexity of the installation is more than offset by very flexible settings when used in corporate networks. An important point of Google Desktop on the corporate network is the use of group policies, which makes it possible to set preferences for each user.

To summarize, it should be said that the most reasonable application for this program is a home or work computer. Indeed, for an ordinary computer, it is enough to simply install the program - it will do the rest itself (it won’t even ask you anything).

Nevertheless, Google Desktop Search Enterprise will be acceptable in cases of urgent need for flexible network policy settings for using the search engine, while the ability to process search queries will be in second place in importance, and the time (or money) spent on setting up the program will be in the first location.

Official site:
Distribution size with TweakGDS: 1.2 MbCopernic Desktop Search

Click image to enlarge

The program interface causes extremely positive emotions - everything is done in accordance with generally accepted standards, nothing more, in a word a nice design. It’s very easy for a beginner to understand the interface of Copernic Desktop Search. Although it is somewhat confusing that the designers clearly created the program interface, given that the program will work in the standard theme of Windows XP. When using the classic theme, the program does not look so pretty anymore. But this is rather a matter of taste.

At the first start, the program offers to create indexes for the search. It seemed somewhat unusual that after selecting folders for indexing, the program does not offer to click any button, such as "Start Indexing", while indexing does not start automatically, only then it was noticed that Copernic was trying to start indexing while the computer was idle. You’ll have to rummage around in the options of the program to configure everything properly. It should be noted that there are quite wide possibilities for setting up automatic index creation: a built-in scheduler, the ability to index while a computer is idle, in the background, with low priority. Indexing was not too fast - 10 hours 51 minutes - it is slower than in other search engines (except Snoop, Copernic is still an order of magnitude faster than developing iSleuthHound Technologies.

Now about the structure of the index. In general, there is nothing special about it. It is possible to select file types, both in a generalized form and in a detailed one. That is, initially you can choose what you want to index - Documents, Images, Videos, Music. On the other tab of the options window, it will be possible to select specifically file types by extension. Additionally, you can configure the index so that, for example, pictures are not indexed, less than 16x16 in size, or sound files less than 10 seconds in length are not indexed. In addition to indexing files from folders, Copernic can work with emails and contacts from the address book of Microsoft Outlook and Microsoft Outlook Express, indexing of Favorites and History from Internet Explorer is possible.

As for the search capabilities, here they are very weak. During the tests, it was even revealed that the program does not search for documents in txt and html formats in Russian, allowing you to find them only by headings, but not by content. The only thing the program provides to increase the search efficiency is the use of a standard set of logical operations, and even then, this feature was discovered experimentally, since it was not documented. By the way, the program’s help is also not all right - it is available only through the Internet, which, you see, is very inconvenient, and there is not too much help information on the network. Apparently, the developers decided that the simple interface of the program does not imply the presence of normal help. Continuing the conversation about the search capabilities, it should be noted that, despite the weak analysis of queries, the program provides an interesting search system - the user can select the type of files (images, video, music, etc.), enter a search query and select attributes that are specific to selected file type. For example, for sound files, these can be values \u200b\u200bfrom mp3 tags (artist, album, date, etc.), for images, for example, you can choose their size (by resolution), in general, each type has its own settings. After searching for a specific type of file, the program will display a very informative list in the results window, and if other types of files come under your request, you can open them by clicking on a specific link.

We should also mention the results display window. The contents of these files are displayed below the list of found files (a similar scheme is often used in mail clients). True, viewing the text can only be done in the native format, and there is no plain text display mode, which is not always convenient, since opening a document in this case takes longer. But, given that Copernic can search for images and music, there is the ability to view these multimedia files.

The basic principles of this program are described, now let's see what Copernic Desktop Search can offer us to work with the network ... In principle, you can watch for a very long time, but you can hardly see anything. In other words, this program was not conceived as a network. Copernic Desktop Search is an exclusively home search engine.

Obviously, the only (most logical) application of this program is a home computer. Here, she will completely cope with all the simple searches of users, consisting of one two words, find the necessary information, and dividing the search by file type and supporting multimedia files, together with background indexing in low priority mode, coupled with a nice interface, only give the program strength to gain confidence among inexperienced users.

Official site
Distribution size: 2.6 MbISYS Desktop

Click image to enlarge

Very powerful program. In terms of equipment with all kinds of functions, it is located somewhere next to the next SearchInform search engine in the list. At the same time, the size of the installation file is more than 40Mb! It’s hard to say what could be thrust into such dimensions, because the same SearchInform, with similar functionality, takes up 15Mb.

The installation process here is also not too pleasant, or rather not even the installation process. Even before downloading the program, you will be asked to register, otherwise, no way. Next, the interface. It is made very nicely, nothing superfluous is evident, however - these are the impressions of a person who is already somewhat accustomed to it. Understanding where and what is located, where to click and where to finally search for a beginner will not be easy. It is highly recommended to read the help before starting work - save a lot of nerves and time. To everything else, a complete lack of support for the Russian language in the program is also added. Not good. In addition, the windows here are not overloaded with controls, but they had to pay for it with multi-modularity and the use of additional windows. For example, search queries are entered by running one program, and index management is done using another program. Search queries are also entered here in separate, appearing windows. Which is better - an overloaded interface or widespread multi-windowing - is hard to say, rather, it is a matter of taste.

As for creating indexes, the program provides opportunities to simplify the process of setting options for a new index. These features include several ready-made templates for creating indexes for the My Documents, Mail, Mail and Documents folder, a Specific Folder, a Folder with a choice of file types, etc. Such templates make it easy to create indexes on the first stage. The utility for working with indexes does not have a very good interface, which scares off some complexity (this is a very subjective assessment, in truth), however, if you look, it provides many useful options and in general its use does not cause much difficulty. ISYS Desktop can index data from various data sources, and also provides many flexible settings for such indexing. Among the additional features for indexing: support for SQL, FTP, TRIM Context, WORLDOX 2002, scripts. When creating an index, if you selected the "Folder with a choice of file types" option, you have the option to select the file types for indexing manually (by extension). I must say that there are just a huge number of supported file types, but you won’t be able to add your type (extension) to the existing list. You can also note the existence of an indexing scheduler. ISYS Desktop took 6 hours 13 minutes to create the index and process 20 gigabytes of information, ultimately showing a good time and the size of the created file - 7.9 GB.

The search capabilities of this program are not bad. What is used in ISYS is much more powerful than the usual support for logical operations. Of the advanced search capabilities, the program offers the use of synonyms, a sorting filter (by path, name and date of file creation). The set of logical operators is slightly wider than the standard set. In addition to logical operations, the program allows you to work with many other operators, which in principle are able to replace some types of search, for example, search with parsing can be completely replaced using special operators. It was very surprising that the program does not have a search using morphology. This is a serious omission, as the search efficiency is greatly enhanced when using morphological analysis. In addition, there is no list of meaningful words, but there is an extensive list of meaningless words. Search functions such as "approximate search" and "heuristic analysis" are also declared.

ISYS provides a choice of several types of search queries, namely, types - visual. This was done using different types of windows for entering search queries, however, in fact, no windows allow using technologies other than those listed above.

The search results are very informative, displayed as a list of documents sorted by relevance. A preview of the selected document is displayed below. Unlike Copernic Desktop Search, the preview here is available only in plain text, it was not possible to achieve displaying documents in the native format, whether it be Word, Html or PDF, although this is not very critical in principle. The program allows you to divide the documents found into groups according to certain criteria (by default they are divided by relevance). You can also view already found documents by selecting individual folders (this is convenient when the result produces a very large number of documents).

Using the program on a corporate network is also very justified, since it provides good opportunities for organizing network searches. The search system is based on the creation of a public index that contains indexed data from public network resources.

In fact, the program from ISYS is worthy of attention, at least familiarization with it. This program is a mature project with a huge number of functions (not always and not all, of course, they are needed, but still). The chances that the program will include some improvements from the processing of search queries are not known, but at the moment it can be recommended for almost universal use. And considering that it is still too heavy for home systems, the main places of its installation are corporate networks.

Official site:
Distribution Size: 40 MbSearchInform

Click image to enlarge

You should probably not start right away with a description of the SearchInform interface. First, you should describe the installation process, or rather one of its details: you cannot install the program without connecting to the Internet. The fact is that before the first launch, the program requires user registration (free) and sends all the entered data to the server. Apparently, the developers had to take such measures to combat piracy, but this did not have a positive effect on the ease of installation.

The program interface is made in compliance with all generally accepted rules, however, at first glance, it is somewhat cumbersome. Using the program for the first time, it seems that it is too complicated, sometimes it’s not easy to remember which menu or tab contains the desired option, however, with longer use, the interface does not seem so terribly complicated. The main thing is to read the help beforehand.

Having a little understanding of the interface, we can begin to create an index. The process itself is very simple and the indexing speed even by eye is significantly higher than all the other search engines from the review. Clear test numbers show that SearchInform is twice as fast as dtSearch and iSYS in indexing speed! The program indexed the provided data in the amount of 20 gigabytes in record time - 3 hours 17 minutes. And the size of the index created was the smallest 4.4 GB - 100 megabytes less than Google Desktop Search.

The program supports, in addition to ordinary files and folders, also indexing emails, connecting and indexing databases (!) And other external sources (DMS, CRM), immediately when indexing, you can specify a dictionary for morphological searches, and all attributes can be indexed files. After creating the index, when you try to conduct the first test search of documents, you can be confused: "there are two types of search, but which one do I need?". As mentioned earlier - the main thing is to read the help, then everything will become clear. The program really knows how to carry out two types of searches - a phrase search and a search for documents similar in content to the query text.

A description of all the basic functions for analyzing a search query was given above, so now we only list the search capabilities provided by this program. Let's start with a phrasal search: of course, morphological search, citation search, logical operations, search with parsing of a word (search at the beginning of a word, at the end, in the middle part, or complete match), mixed citation search (when all words from the query must be present in the document, but not necessarily in the entered order), search with error correction, use of synonyms, "almost quote search" (search for an entered phrase as a quote, but other words may be between the entered words), etc. Some of the listed options have their own specific settings. In addition, there is the possibility of using a dictionary of insignificant words, and the program already has a ready-made list of these words, you can also use a dictionary of priority words to search (of course, you will have to fill it out yourself).

Here, in principle, briefly ran through all the basic features of the phrase search.

Let's move on to the features of this program - the search for similar documents. The developers claim that this is by no means a simple search for text, it is precisely a “search for similar ones” - that is how they are described everywhere, but oh well, you can call it whatever you like - the main point. Short searches on the Internet can quickly provide information that the so-called “search for similar ones” is a new development in the field of text analysis. This system allows you to find texts that are similar precisely in semantic content. The most pleasant thing was that after conducting test searches, it turned out that the theory is completely consistent with practice! The program really searches for documents similar in content and displays them in a list, sorting by percentage of similarity.

Next, consider what SearchInform (in particular, its corporate version of SearchInform Corporate) offers to work on the corporate network. There are two types of applications: the server part and the user one. The server side independently processes the specified indexes, and users can use them to search, depending on the access rights assigned to them. Users can be configured automatically using Windows accounts (in professional language, SearchInform uses NTFS authentication of Windows) or manually (users will have to be added separately). Each user can be allowed or denied access to certain indexes, and users can also be grouped. In general, the settings for working on the network at SearchInform are ahead of the flexibility of Google, and the convenience and simplicity of the Snoop Server.

Official site:
Distribution size: 14.7 Mb

Search system	Indexing time	Index size
Snoop Prof Deluxe 4.5	38 hours 46 minutes	19 GB
Isys Desktop 7.0	6 hours 13 minutes	7.9 GB
DtSearch 7.0	6 hours 3 minutes	8.6 GB
Google Desktop Search Enterprise	8 hours 17 minutes	4.5 GB
Copernic Desktop Search *	10 hours 51 minutes	7 GB
SearchInform 1.5.02	3 hours 17 minutes	4.4 GB

* Most of the .html and .txt documents containing the Russian text, although they were indexed, but besides the names, it was impossible to find them.

All programs are worthy of attention.

Based on tests and a careful examination of each program presented in the review, certain conclusions can be drawn. So, Google Desktop Search Copernic Desktop Search is quite suitable for an inexperienced user as a home information search system. They do a good job with simple queries, do not load the user much with the settings, and, moreover, are completely free. Google’s attempt to enter the market of corporate search engines is not yet very justified: for full-fledged work, the program needs to be weighted with additional modules, and it’s far from simple to configure. Therefore, the spelling names of Desktop Search, that Copernic, that Google lays behind them the niche of "desktop" search engines.

True, more powerful solutions - dtSearch, iSYS and SearchInform are also not shaky and offer users their "desktop" versions. But at a reasonable price, unlike free software from Google and Copernic. Of course, you have to pay for power, speed and functionality. But the developers of dtSearch, iSYS and SearchInform make the main aim, of course, on the corporate sector. Networking, functionality, indexing and search speed - this is what distinguishes these products from their "competitors". According to the test results, the favorite was identified - SearchInform. The program provides the ability to search for similar documents, has the highest indexing and search speed, has a good set of functions.

The task of searching the contents of files is, in principle, not new - from time to time I have to search for texts or pieces of code in several files. Those who use and understand Linux are easier, because there is a special grep function for this solution. Under 7 I met some articles about expanding the capabilities of basic search by indexing the contents of files, but decided to still find a suitable program. Although, in principle, I used to do everything manually, assuming that it would take more time to learn the appropriate software.

Download WinGrep is completely free, takes only 730Kb. Almost all versions of Windows are supported: 98, 2000, XP, Vista and Windows 7. Unfortunately, I don’t know anything about the latter, because I have a "seven".

The process of finding text in files

Let us consider in more detail the process of finding text inside files. Immediately after starting WinGrep, an assistant window will appear, which will help us solve our problem in a few steps.

At the first step, you will need to determine which text you will search for and indicate the type of search: using regular expressions, fast, similar to your phrase. You can also mark case-sensitive options or search only the entire word.

It is possible to note several at once + activate a search on the contents of files inside subdirectories. The interface, of course, is not the most modern :)

In the next step, specify the file extensions that will be processed.

To speed up the work, you can mark only certain types of files that you need. If you want to include everything in the list, choose the universal value "*. *". You can add your own extensions.

Here you will see some statistics on the work done. In the toolbar with icons, you can restart the procedure, replace text in files, save and other options.

By the way, if you consider yourself an advanced user and are familiar with the Grep team, then in the Options menu you can enable Expert Mode. After that, the search settings dialog box will look a little different.

In addition, several additional options will appear in it. Beginners should not do this, and if they did, you can switch back in the same menu item Options.

Features of the search program inside WinGrep files

In addition to implementing the grep function in Windows, the program for searching for text in files has the following features:

Available for both beginners and advanced users. The former work with a step-by-step assistant, for the latter there is an extended Expert Mode.
Support for simple text files (including UNIX-style): program sources, HTML, RTF, batch files, etc.
It functions with binary files such as Word documents, spreadsheets, databases, DLLs and even EXE-shniki.
Replacing text. Right after you find matches, you can replace them with another text line (in one or all files at once). Fast and safe.
Saving and printing search results for file contents.
You can use the command line interface.
You can save your search criteria for future reference.
Multitasking is supported, you can minimize the application to tray.
Processing ZIP archives.
WinGrep integration in Windows Explorer allows you to run the utility using the context menu from any directory.
Easy installation.

All in all, WinGrep is a great solution! As I said above, you can run in Windows 7 a search for the contents of files from a regular Search, but working with the program is much simpler. Installing and understanding the interface is a matter of minutes. Distributed for free, looking fast enough. Now I will use only it to search for text in files. The only thing is not clear how the software functions on the latest version of Windows. Perhaps someone has already tried? - write in the comments.

The search for text in documents of the doc, xls, pdf format is something that I wanted to mention for a long time. But it’s not a search within a single document, it’s quite simple - Ctrl + F everyone knows that, but a search for a Russian word, for example, in 10 or\u003e documents. To open each and search manually is real, but long. And if there are hundreds of documents / files, but you just need to find everything, for example, Vasil Petrovich ... I wanted to talk about such a search in more detail.

Search for text in files (in English)

Text search (in English) in files such as * .txt, * .html can be done using, for example, Total Commander 6.53. At all Total commander - an indispensable file manager, if you do not use it yet - it costs download and start using! It provides very good navigation on your hard drive, and the two-window structure allows you to do several operations with any files at the same time. And so, you can search for a word / several words by pressing Alt + F7, and check the box “search with text” and ok! But searching in several files like * .doc, * .xls is beyond his power. You need to use another program.

Search for Russian words

The tests that I conducted with various programs for finding Russian words in files showed that FindFiles3 is worthy of attention. It is specifically designed to search for files by name and / or content. Where the standard search does not show the found text fragment - FindFiles will easily find everything.

The program searches for the desired fragment in several encodings at once. Found text shows in a separate field and highlights, etc. The program call is embedded in the Explorer context menu "Find files, contents ...".

The program interface is quite simple. You can figure it out without difficulty. In the upper left corner you set the search parameters. In the "In the folder" field, you specify the path or paths to search, i.e. in which folders to search. To specify a search mask, use the “*” symbol. For example * .doc, all doc files will be found using this mask. You can specify multiple search masks separated by commas or semicolons.

If you are looking for files, in which the specified text fragment occurs, then you need to specify this fragment in the "Text" field. Files opened by other applications may be blocked. In this case, select the "Show blocked" checkbox. These files will be marked red flat in the general list of found files. There is an opportunity to continue the search among those already found. To do this, check the corresponding box. If you are looking in the found, then the previously found files are tinted.

You can set the search term by file date, etc. After the search, you can sort the found files in any of the columns. To do this, just click on the column name. You can also continue the search by changing the search criteria.

Finding Text in Files - Practice

Download and install the program.

Now you need to specify the folder with the files in which we will search. And the file format, respectively.

For clarity, I indicated a folder with 134 files. And only one has the search word. Click "Find"

And after a few seconds, the program found the file in which this word occurs. And also a piece of text, which is very convenient!

That's all! Now you can search for Russian text in many files at once!

If you have already worked with the program, and know how best to search for the Russian text, please share your experience by writing a review. He can help someone!

There are different situations when you need to find a file among thousands of others, and only a part of the text (or code) is known. For example, when programming a site, having inspected the source code, you need to find in which file the processing and output take place. In what way search for a file by text contained inside? For search files with specific text I recommend using Total commander, since it searches for text files quickly, accurately, and has several useful search options. Let's consider in more detail how to accomplish this.

To get started, download from the official website of Total Commander by clicking on the link to download Total Commander and install it. (the official version is fully Russified and free).

Then run Total Commander. At startup, he will ask you to press one of the three digits, because the program is shareware, but has no limitations in functionality (not found).

Before us appeared two windows in which you can travel through folders. In any of the windows, select the folder in which we will look for a file with specific text. In my case, I need to find a file with the text "pagination_previous". We press the button the binoculars located in the upper panel of the program. Next, put a mark next to the inscription "With text", enter test to search in files, put marks next to the ANSI and UTF-8 encodings and click "Start Search".

After Total Commander searches, the list of files in which it found the search text appears below.

Now you need to find the text directly in the file. How to find text in a file? The most convenient way to use a notebook Notepad ++ to search for text in files. Download the latest version of Notepad ++ from the official site.

We install this wonderful notepad and open the found file through it. Press CTRL + F (two buttons at the same time). Will open file search box. In the “Find” field, enter the search text and press “Enter” on the keyboard. Notepad ++ will quickly find the text in the file and highlight it in green. If you press “Enter” again, Notepad will continue to search for text in the file further. If the same text is repeated, it will move to it and also highlight in green.