PHP Classes

PHP Google Crawler: Perform Google searches and get the result URLs

Version: php-google-crawler 1.0.0
License: GNU General Publi...
PHP version: 5
Categories: PHP 5, Searching, Web services
Description 

This package can perform Google searches and get the result URLs.

It can send HTTP requests to the Google search Web servers to perform searches for given keywords.

The package can parse the results and extract the URLs of the search result links.
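
As a rough illustration of the parsing step described above, the sketch below pulls link targets out of an HTML string with DOMDocument. It is not the package's own code: the function name `extractResultUrls` and the sample markup are made up for this example, and keeping only absolute http(s) links is an assumption about what counts as a result URL. No HTTP request is made, so the snippet runs offline.

```php
<?php
// Hypothetical helper (not part of php-google-crawler): collect the unique
// absolute http(s) link targets found in an HTML string.
function extractResultUrls($html)
{
    if (trim($html) === '') {
        return [];
    }

    // Collect parser warnings internally instead of printing them,
    // then restore the previous libxml setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        // Assumption: only absolute http(s) links are of interest.
        if (preg_match('#^https?://#i', $href)) {
            $urls[$href] = $href; // keyed by URL to drop duplicates
        }
    }
    return array_values($urls);
}

// Made-up sample markup standing in for a fetched page.
$sample = '<html><body>'
    . '<a href="https://example.com/">A</a>'
    . '<a href="/local">B</a>'
    . '<a href="https://example.com/">A again</a>'
    . '<a href="http://example.org/page">C</a>'
    . '</body></html>';

print_r(extractResultUrls($sample));
```

Keying the intermediate array by URL is a cheap way to remove duplicates while preserving the order in which links first appear.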

Author: Juraj Puchký (Czech Republic)

Example

<?php

require_once 'implementation/Google.php';

use App\ContactCrawler\Google;

// Fetch a page and return the addresses found in its mailto: links.
function collectEmail($url) {
    $s = new \App\ContactCrawler\Search();
    $data = $s->getData($url);

    // Collect HTML parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $emails = [];
    $dom = new DOMDocument();
    @$dom->loadHTML($data);
    libxml_clear_errors();

    $results = $dom->getElementsByTagName('a');
    foreach ($results as $r) {
        if (strstr($r->getAttribute('href'), 'mailto:')) {
            $emails[] = str_replace("mailto:", "", $r->getAttribute('href'));
        }
    }
    return $emails;
}

// Fetch a page and return the href of every link on it (subpages).
function collectUrls($url) {
    $s = new \App\ContactCrawler\Search();
    $data = $s->getData($url);

    libxml_use_internal_errors(true);
    $urls = [];
    $dom = new DOMDocument();
    @$dom->loadHTML($data);
    libxml_clear_errors();

    $results = $dom->getElementsByTagName('a');
    foreach ($results as $r) {
        // Keyed by href so that duplicate links are stored only once.
        $urls[$r->getAttribute('href')] = $r->getAttribute('href');
    }
    return $urls;
}

if (isset($argv[1])) {
    $google = new Google();
    $urls = $google->search($argv[1], "cs", 10000);

    $emails = [];
    foreach ($urls as $url) {
        $pages = collectUrls($url);
        foreach ($pages as $page) {
            $ems = collectEmail($page);
            foreach ($ems as $email) {
                // Print each address as soon as it is found.
                echo $email . "\n";
                $emails[$email] = $email;
            }
        }
    }

    // Print the de-duplicated list once crawling has finished.
    foreach ($emails as $email) {
        echo "$email\n";
    }
} else {
    echo "Help: <keyword>\n";
}


Details

contact-crawler

A simple crawler that collects email addresses by searching Google for keywords.

Usage: php crawler.php "keyword"
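
The example script stores the `href` of each `mailto:` link almost verbatim, so values such as `mailto:Info@Example.com?subject=Hi` would end up in the output with the query string attached. The sketch below shows one possible way to normalize such values before storing them. The function name `emailFromMailto` is hypothetical and not part of the package, and lowercasing the address is a design choice of this sketch, not something the package does.

```php
<?php
// Hypothetical helper (not part of the package): turn a mailto: href into a
// clean email address, or return null when the value is not a usable address.
function emailFromMailto($href)
{
    $href = trim($href);

    // Accept the scheme only at the start of the value, in any letter case.
    if (stripos($href, 'mailto:') !== 0) {
        return null;
    }
    $address = substr($href, strlen('mailto:'));

    // Drop an optional query part such as ?subject=...
    $queryPos = strpos($address, '?');
    if ($queryPos !== false) {
        $address = substr($address, 0, $queryPos);
    }

    // Decode percent-escapes (e.g. %40 for @) and normalize the case.
    $address = strtolower(trim(rawurldecode($address)));

    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false ? $address : null;
}

var_dump(emailFromMailto('mailto:Info@Example.com?subject=Hi'));
var_dump(emailFromMailto('https://example.com/contact'));
```

Links that list several recipients in one `mailto:` value fail validation here and are returned as null; splitting on commas first would be a natural extension.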


Files

File              Role      Description
implementation/   (2 files)
interfaces/       (1 file)
crawler.php       Example   Example script
README.md         Doc.      Documentation

Files / implementation

File         Role    Description
Google.php   Class   Class source
Search.php   Class   Class source

Files / interfaces

File          Role    Description
ISearch.php   Class   Class source
