Web Crawler with Email Extractor and Image Extractor

May 31, 2014


 Web Crawler:

The WebCrawler indexes both document titles and document content using a vector space model. Users can issue queries either to the pre-computed index or to a search program that explores new documents in real time. The database the WebCrawler builds is available through a search page on the Web.
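To make the vector space model concrete, here is a minimal sketch of how titles and content could be indexed and queried. It assumes TF-IDF term weights and cosine similarity, which is one common choice and not necessarily the weighting the WebCrawler uses. The names `build_index`, `search`, and the `(title, content)` tuple layout are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weigh(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (term -> weight)."""
    counts = Counter(tokens)
    return {term: n * idf[term] for term, n in counts.items() if term in idf}


def build_index(docs):
    """docs maps a document id to a (title, content) pair (assumed layout)."""
    tokenized = {
        doc_id: tokenize(title + " " + content)
        for doc_id, (title, content) in docs.items()
    }
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    # Smoothed inverse document frequency; an assumption, not the source's formula.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {doc_id: weigh(tokens, idf) for doc_id, tokens in tokenized.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, idf, vectors):
    """Return ids of documents sharing terms with the query, best match first."""
    q = weigh(tokenize(query), idf)
    scored = [(cosine(q, vec), doc_id) for doc_id, vec in vectors.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0.0]
```

Calling `build_index` once and then `search` repeatedly corresponds to querying the pre-computed index; the real-time mode described above would fetch and weigh new documents at query time instead.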
 Email Extractor:
Email extraction is the process of collecting lists of email addresses, by various methods, for use in bulk email or other purposes. You may need to harvest email addresses when you are conducting a marketing campaign, when you want to find something out, or when you want to send an email to a large but targeted audience. This program is a spider that detects email addresses on web sites, through search engines, or in a file saved on your computer. One of the main reasons people use an email extractor is to improve the success of their email campaigns, which can require thousands of addresses at a time. Finding these addresses manually isn't just a chore; it can be downright impossible.
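The detection step can be sketched with a regular expression applied to text that has already been fetched from a page or read from a file. This is only an illustration: the pattern is a deliberately simple approximation of address syntax, and `EMAIL_RE` and `extract_emails` are hypothetical names, not part of the project.

```python
import re

# Simplified pattern: local part, "@", dot-separated domain labels, alphabetic TLD.
# It is an approximation and does not cover every valid address form.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def extract_emails(text):
    """Return the unique addresses found in text, lowercased, in first-seen order."""
    seen = set()
    found = []
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        if address not in seen:
            seen.add(address)
            found.append(address)
    return found
```

The same function works on page source and on the contents of a local file, since both are just strings once loaded.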
 Image Extractor:
Interest in the potential of digital images has increased enormously over the last few years, fuelled at least in part by the rapid growth of imaging on the World-Wide Web. Users in many professional fields are exploiting the opportunities offered by the ability to access and manipulate remotely-stored images in all kinds of new and exciting ways. However, they are also discovering that the process of locating a desired image in a large and varied collection can be a source of considerable frustration. The problems of image retrieval are becoming widely recognized, and the search for solutions an increasingly active area for research and development.
This project implements several new methods for effective image retrieval. These methods can increase computational complexity as well as accuracy, and the project tries to solve the problems found in earlier systems.
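The retrieval methods themselves are not described here, so the following sketch covers only an assumed first step: collecting image URLs from a crawled page so that they can be passed to a retrieval method. It uses Python's standard `html.parser`; `ImageCollector` and `extract_images` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the absolute URL of every <img> tag that has a src attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the URL of the page being parsed.
            self.images.append(urljoin(self.base_url, src))


def extract_images(html, base_url):
    """Return image URLs in document order for one page of HTML."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.images
```

Images without a `src` attribute are skipped, and relative paths are resolved against the page URL so the results can be downloaded directly.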

Team Member Details:

Abhinav Gupta, Nitish Parikh, Rishabh Singh

B.Tech, CSE, 4th year students

Jaypee Institute of Information Technology